Commit Graph

5755 Commits

Author SHA1 Message Date
597115c305 [feature] add SHOW TABLET STORAGE FORMAT stmt (#9037)
Use this statement to show the storage format of the tablets on each BE. If verbose is set,
    it shows the storage format of each individual tablet.
    e.g.
    ```
    MySQL [(none)]> admin show tablet storage format;
    +-----------+---------+---------+
    | BackendId | V1Count | V2Count |
    +-----------+---------+---------+
    | 10002     | 0       | 2867    |
    +-----------+---------+---------+
    1 row in set (0.003 sec)
    MySQL [test_query_qa]> admin show tablet storage format verbose;
    +-----------+----------+---------------+
    | BackendId | TabletId | StorageFormat |
    +-----------+----------+---------------+
    | 10002     | 39227    | V2            |
    | 10002     | 39221    | V2            |
    | 10002     | 39215    | V2            |
    | 10002     | 39199    | V2            |
    +-----------+----------+---------------+
    4 rows in set (0.034 sec)
    ```
    Also add storage format information to the show full tables statement.
    ```
    MySQL [test_query_qa]> show full tables;
    +-------------------------+------------+---------------+
    | Tables_in_test_query_qa | Table_type | StorageFormat |
    +-------------------------+------------+---------------+
    | bigtable                | BASE TABLE | V2            |
    | test_dup                | BASE TABLE | V2            |
    | test                    | BASE TABLE | V2            |
    | baseall                 | BASE TABLE | V2            |
    | test_string             | BASE TABLE | V2            |
    +-------------------------+------------+---------------+
    5 rows in set (0.002 sec)
    ```
2022-04-27 10:53:43 +08:00
b406684486 Modify incorrect comments in ShowExecutor (#9232)
Fixed some incorrect comments in ShowExecutor
2022-04-26 19:10:49 +08:00
a20cf1e03e [typo](annotation): fix typo in ldap.conf (#9200) 2022-04-26 10:25:07 +08:00
7cfebd05fd [fix](hierarchical-storage) Fix bug that storage medium property change back to SSD (#9158)
1. fix bug described in #9159
2. fix a `fill_tuple` bug introduced from #9173
2022-04-26 10:15:19 +08:00
62b38d7a75 [fix](spark load) fix getHashValue of string type is always zero in spark load. (#9136)
Buffer flip is used incorrectly.
When the hash key is a string type, the hash value is always zero.
The reason is that the buffer for a string value is obtained by wrap, so it does not need to be flipped.
If it is flipped anyway, the buffer's read limit becomes zero.
2022-04-26 10:14:21 +08:00
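The buffer behaviour described in 62b38d7a75 can be reproduced with plain `java.nio`. The following is a minimal sketch, not the Doris code: the class and method names are hypothetical, and it only shows that a buffer returned by `ByteBuffer.wrap` is already readable, while calling `flip()` on it leaves nothing to read.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Doris.
final class WrapFlipDemo {
    // A wrapped buffer starts with position = 0 and limit = array length,
    // so all bytes are readable immediately.
    static int readableAfterWrap(String key) {
        ByteBuffer buf = ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
        return buf.remaining();
    }

    // flip() sets limit = current position (0 here) and position = 0,
    // so a hash function reading from this buffer sees no bytes at all.
    static int readableAfterWrapAndFlip(String key) {
        ByteBuffer buf = ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
        buf.flip();
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println(readableAfterWrap("doris"));        // prints 5
        System.out.println(readableAfterWrapAndFlip("doris")); // prints 0
    }
}
```

`flip()` is only needed after writing into a buffer with `put`, where the position has advanced past the written bytes.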
cdd1b6d6dd [fix](function) fix lag/lead function return invalid data (#9076) 2022-04-26 09:34:46 +08:00
bdf915abd4 [Enhancement] (image) check image validity as soon as generated (#9011)
* load newly generated image file as soon as generated to check if it is valid.

* delete the latest invalid image file

* fix

* fix

* get filePath from saveImage() to ensure deleting the correct file while exception happens

* fix

Co-authored-by: wuhangze <wuhangze@jd.com>
2022-04-25 19:35:41 +08:00
687421b43f keep at least one validated image file (#9192)
* rename ImageSeq to LatestImageSeq in Storage

* keep at least one validated image file
2022-04-25 19:32:43 +08:00
3bdfcde8e8 [Improvement] not print logs to fe.out when fe is running under daemon mode (#9195)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-04-25 18:29:29 +08:00
7226089116 FIX: getChannel -> getChannel() (#9217)
Co-authored-by: Rongqian Li <rongqian_li@idgcapital.com>
2022-04-25 17:46:00 +08:00
5b9a1a2a5d avoiding a corrupt image file when there is image.ckpt with non-zero … (#9180)
* avoiding a corrupt image file when there is image.ckpt with non-zero size

Currently, saveImage writes data to image.ckpt via an append-mode FileOutputStream.
If a non-zero-size file named image.ckpt already exists, the result is a corrupt
image file. Even worse, FE keeps only the latest image file and removes the
others.

BTW, image file should be synced to disk.

It is dangerous to keep only the latest image file, because an image file is
validated only when the next image file is generated. We would then keep a
non-validated image file while removing validated ones. So I will open a PR that
keeps at least 2 image files.

* append other data after MetaHeader

* use channel.force instead of sync
2022-04-25 17:01:01 +08:00
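The failure mode described in 5b9a1a2a5d (stale bytes in a leftover image.ckpt, plus data not forced to disk) can be illustrated with a small sketch. This is not Doris's `saveImage()`: the class, the method names, and the `image.<seq>` naming are assumptions made for the example. It opens the checkpoint file with truncation instead of append, calls `FileChannel.force`, and only then renames the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch; names and file layout are illustrative only.
final class CheckpointWriteSketch {
    // Writes payload to image.ckpt, forces it to disk, then renames it to image.<seq>.
    static Path writeImage(Path dir, long seq, byte[] payload) throws IOException {
        Path ckpt = dir.resolve("image.ckpt");
        // TRUNCATE_EXISTING discards bytes left over from an earlier failed checkpoint;
        // an append-mode stream would have kept them in front of the new data.
        try (FileChannel ch = FileChannel.open(ckpt,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(payload);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush file content and metadata before the rename
        }
        return Files.move(ckpt, dir.resolve("image." + seq), StandardCopyOption.REPLACE_EXISTING);
    }

    // Demo: a leftover 6-byte image.ckpt must not leak into the new 3-byte image.
    static boolean staleBytesDiscarded() {
        try {
            Path dir = Files.createTempDirectory("image-sketch");
            Files.write(dir.resolve("image.ckpt"), new byte[] {9, 9, 9, 9, 9, 9});
            byte[] payload = {1, 2, 3};
            Path image = writeImage(dir, 7L, payload);
            return Arrays.equals(Files.readAllBytes(image), payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Deleting any existing image.ckpt before writing would be an equally valid way to get the same guarantee; truncation is used here only because it keeps the sketch short.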
af2295f971 MOD: remove <scope>provided</scope> (#9177) 2022-04-25 10:00:57 +08:00
a608c3d5dc [Fixbug]assure transaction num in image file is right (#9181)
Currently, dbTransactionManager::getTransactionNum is only used by
checkpoint to get the transaction count to put into an image file. However,
the transactions written into an image file do not come from the same
data structure as the count does. Thus, we would have to take great care to
keep the two data structures consistent in size. In practice, it is
very difficult to do so.

This patch simply makes getTransactionNum take the count from the same data
structure as the write method.

The change was introduced by b93e841688.
2022-04-25 09:59:18 +08:00
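The idea behind a608c3d5dc, deriving the count from the same structure that is serialized, can be sketched as follows. The class, field names, and byte layout are hypothetical and much simpler than the real transaction manager; the point is only that the count header and the written entries cannot disagree when both come from one collection.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the Doris transaction manager.
final class TxnImageSketch {
    // The only collection that write() serializes.
    private final List<Long> persistedTxnIds = new ArrayList<>();

    void add(long txnId) {
        persistedTxnIds.add(txnId);
    }

    void remove(long txnId) {
        persistedTxnIds.remove(Long.valueOf(txnId));
    }

    // The count is taken from the collection that write() iterates,
    // so the header can never disagree with the body.
    int getTransactionNum() {
        return persistedTxnIds.size();
    }

    // Illustrative layout: a 4-byte count followed by one 8-byte id per transaction.
    byte[] write() {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + Long.BYTES * persistedTxnIds.size());
        buf.putInt(getTransactionNum());
        for (long id : persistedTxnIds) {
            buf.putLong(id);
        }
        return buf.array();
    }
}
```

A reader of such an image can trust the header and read exactly that many entries, which is not safe when the count is maintained separately.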
Pxl
2d83167e50 [Feature] [Lateral-View] support outer combinator of table function (#9147) 2022-04-24 12:09:40 +08:00
ae25633d50 [fix](cache) Generate md5 value using utf8 encoding for sqlkey string (#9121) 2022-04-23 21:37:34 +08:00
89d37d920e [fix](transaction) Fix running transaction num always be zero when execute show proc '/transactions' stmt (#9106) 2022-04-23 21:37:18 +08:00
4a10b37ca2 [feature](image tool) support image load tool (#8982) 2022-04-23 21:36:58 +08:00
e880dde7a5 [feature-wip](statistics) step1: create the statistics job (#8858)
This is the first PR for statistics collection and includes part of the statistics implementation (#6370). It does not affect any existing code, and users cannot yet create statistics jobs.
It mainly implements the semantic checking module for statistics collection jobs and the job creation module.
The syntax is:

ANALYZE [[ db_name.tb_name ] [( column_name [, ...] )], ...] [ PROPERTIES(...) ]

e.g. 
ANALYZE;
ANALYZE tbl1;
ANALYZE tbl1(col1, col2) PROPERTIES("cbo_statistics_task_timeout" = "10");
Two configurations have been added:

- max_cbo_statistics_task_timeout_sec: timeout of a single task
- cbo_max_statistics_job_num: maximum number of running jobs the system can accept

Co-authored-by: weizhengte <1141550741@qq.com>
Co-authored-by: weizhengte <weizhengte@foxmail.com>
Co-authored-by: EmmyMiao87 <522274284@qq.com>
Co-authored-by: frankywei <frankywei@tencent.com>
2022-04-22 18:24:54 +08:00
81ff49f8e3 [revert] "[Fix bug] fix non-equal out join is not supported (#8857)" (#9150)
This PR causes FE unit tests to fail:

InferFiltersRuleTest
testOn3Tables1stInner2ndRightJoinEqLiteralAt2nd
testOn3Tables1stInner2ndRightJoinEqLiteralAt3rd
2022-04-21 18:20:19 +08:00
ae680b4248 [UDF] support RPC udaf part 1: support create RPC udaf in fe (#8510) 2022-04-21 17:38:58 +08:00
2c2e06a5fe [Fix bug] fix non-equal out join is not supported (#8857) 2022-04-21 12:44:20 +08:00
7af684ad0f [fix] Fix a compatibility problem caused by using a non-existent database when connecting via mysql client (#9127) 2022-04-21 12:43:39 +08:00
7b3865b524 [fix](ut)(vectorized) fix a potential stack overflow bug and some unit test (#9140) 2022-04-21 12:17:03 +08:00
fa5b5fc6d1 [fix](dynamic_partition) fix dynamic partition scheduler not work for olap table with random hash info (#9108) 2022-04-21 12:16:27 +08:00
40362dfaca [fix](partition) Fix wrong partition distribution key info for random hash olap table (#9104) 2022-04-20 17:08:42 +08:00
f253e260c8 [fix] Modify fe jetty configuration parameters (#9075) 2022-04-20 14:51:25 +08:00
39c0fec680 [fix] fix bug when partition_id exceeds integer range in spark load (#9073) 2022-04-20 14:50:55 +08:00
1b4cd76847 [feature](vectorized)(function) Support min_by/max_by function. (#8623)
Support min_by/max_by on vectorized engine.
2022-04-20 14:46:19 +08:00
869fdff2f0 [refactor] add reference path for source file from impala (#9115)
According to the requirements of the APLv2, referenced code must be marked with the path of its original source file.
2022-04-20 12:29:57 +08:00
0f86fed547 [improvement](insert) Support verbose keyword in insert query stmt (#9047) 2022-04-18 19:36:40 +08:00
afce993ca7 [feature](load)(csv) CSV import and export support header (#8765)
- Add two new types to stream load and broker load: **csv_with_names** and **csv_with_names_and_types**
- Add two new types to export: **csv_with_names** and **csv_with_names_and_types**
2022-04-18 15:29:18 +08:00
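The two header formats added in afce993ca7 can be pictured with a short sketch. It assumes that `csv_with_names` puts a line of column names first and that `csv_with_names_and_types` adds a second line of column types; the class name, the comma separator, and the sample column types are illustrative choices, not taken from the commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the assumed header layout; not the Doris exporter.
final class CsvHeaderSketch {
    static List<String> render(String format,
                               List<String> names,
                               List<String> types,
                               List<List<String>> rows) {
        List<String> lines = new ArrayList<>();
        boolean withTypes = format.equals("csv_with_names_and_types");
        if (withTypes || format.equals("csv_with_names")) {
            lines.add(String.join(",", names)); // assumed: first line holds column names
        }
        if (withTypes) {
            lines.add(String.join(",", types)); // assumed: second line holds column types
        }
        for (List<String> row : rows) {
            lines.add(String.join(",", row));
        }
        return lines;
    }
}
```

On import, the loader would skip one or two leading lines accordingly; that side is not shown here.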
Pxl
44d37acbff Change date/datetime result type to bigint (#8975) 2022-04-18 09:56:28 +08:00
04287cabb2 [Forbidden](Vec) Switch to non-vec engine when outer join + not null column (#8979)
* [Forbidden](Vec) Switch to non-vec engine when outer join + not null column

Vectorized code will core dump in the case of ```outer join + not null column```, such as issue #7901.
So we need to fall back from vectorized mode to non-vectorized mode when we encounter this situation.

If the null-side column of the outer join is a column that must return non-null, like count(*),
then there is no way to force the column to be nullable.
Vectorization cannot currently support this situation,
so it is necessary to fall back to the non-vectorized engine for processing.
For example:
  Query: set enable_vectorized_engine=true
  Query: select * from t1 left join (select k1, count(k2) as count_k2 from t2 group by k1) tmp on t1.k1=tmp.k1
  Result: the query runs on the non-vectorized engine
2022-04-18 09:55:33 +08:00
0f8a7ff985 [Refactor](ReportHandler) Remove some unused schema_hash code in fe (#9005) 2022-04-17 10:01:34 +08:00
c7a098c1b0 [fix](sql_block_rule) optimization of alter sql_block_rule stmt (#8971)
Optimization of alter sql_block_rule stmt.
2022-04-16 11:05:31 +08:00
67c16f3a03 [fix](show-function) fix bug for show function (#9025)
The result of show full function has an error:
INIT_FN and UPDATE_FN are wrong
2022-04-15 15:18:20 +08:00
7634e55513 [fix] fix p0 test failed because of char type cannot convert to datetime (#8996)
fix p0 test failed because of char type cannot convert to datetime
2022-04-15 15:16:00 +08:00
0fa917703e [Bug] Fix some node in vectorized not have V title (#9028)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-04-15 11:25:52 +08:00
579aee110a [fix](ut)(compile) Fix BE compile bug and FE unit test (#9027)
1. The compile bug is introduced from #8855
2. FE ut bug is introduced from #8848 and #8770
2022-04-14 17:37:41 +08:00
9ac6d23a44 [Feature]support stddev/variance agg functions to window function (#8962) 2022-04-14 12:07:26 +08:00
48c288af94 [refactor](fe) modify warning message of drop backends (#9006)
Modify warning message of drop backends
2022-04-14 11:46:29 +08:00
91200cc7a6 [fix] fix NPE when initialize GlobalState (#8990)
Introduced from #8695
The context object may be null for StreamLoadPlanner
2022-04-14 11:44:41 +08:00
18daefff80 [refactor](fe): remove unused code (#8986) 2022-04-14 11:44:21 +08:00
a1982c4391 [improvement] Use System.currentTimeMillis() to get the current millisecond (#8828) 2022-04-14 10:03:37 +08:00
bca121333e [feature](cold-hot) support s3 resource (#8808)
Add cold hot support in FE meta, support alter resource DDL in FE
2022-04-13 09:52:03 +08:00
7e08d3e320 Modify the maximum and minimum number of threads in jetty (#8960)
Co-authored-by: smallhibiscus <844981280>
2022-04-13 09:50:46 +08:00
d79e8a7b5a [fix](load) start transaction before we need it (#8819) (#8908) 2022-04-13 09:50:26 +08:00
b33ab960a8 [fix] move new add enum OFS of StorageType to last (#8983)
* [fix] move new add enum OFS of StorageType to last

* modify enum in gensrc/thrift/Types.thrift
2022-04-12 20:21:15 +08:00
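Commit b33ab960a8 does not state why OFS has to be last. A likely reason, offered here as an assumption, is that enum values are position-dependent, so inserting a constant in the middle shifts every value after it. The sketch below shows that effect with Java ordinals; the enum names and constants are hypothetical stand-ins, not the actual StorageType definition.

```java
// Hypothetical sketch; constants are illustrative, not the real StorageType.
final class EnumOrderSketch {
    enum Before { BROKER, S3, HDFS, LOCAL }

    // New constant inserted in the middle: HDFS and LOCAL get new ordinals.
    enum InsertedInMiddle { BROKER, S3, OFS, HDFS, LOCAL }

    // New constant appended last: every existing ordinal is unchanged.
    enum AppendedLast { BROKER, S3, HDFS, LOCAL, OFS }

    // True if every constant of Before keeps its ordinal in the candidate enum.
    static <E extends Enum<E>> boolean preservesOrdinals(Class<E> candidate) {
        for (Before b : Before.values()) {
            if (Enum.valueOf(candidate, b.name()).ordinal() != b.ordinal()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(preservesOrdinals(InsertedInMiddle.class)); // prints false
        System.out.println(preservesOrdinals(AppendedLast.class));     // prints true
    }
}
```

Anything persisted or sent over the wire by numeric value would be misread after a mid-list insertion, which is why appending is the safer change.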
6af1c52e13 [Feature] add support for tencent chdfs (#8963)
Co-authored-by: chengwu <chengwu@tencent.com>
2022-04-12 16:02:42 +08:00
51269efbb7 [improvement]Disable mini load (#8955)
Disable miniload by default
2022-04-12 16:01:03 +08:00