5755 Commits

Author SHA1 Message Date
c0e25a1c37 [fix](Nereids) disable unstable test in graph simplifier (#14630) 2022-11-28 14:07:14 +08:00
b9270dace3 [fix](nereids) after injection, min/max value in columnStats for date/dateV2 type is wrong (#14605) 2022-11-28 14:05:33 +08:00
b6605b99aa [enhancement](nereids) eliminate project in the post process phase (#14490)
Remove projects that are used only for column pruning and do no expression calculation, so that we avoid redundant data copies in do_projection on the BE side.
2022-11-28 00:39:36 +08:00
280f8be4bd [test](regression) adjust nereids related regression cases under datev2 (#14578)
1. revert #14439, recover dup & unique test cases
2. adjust nereids-related cases
2022-11-27 23:57:51 +08:00
230ede9085 [opt](nereids) avoid broadcast join if hash table is big (#14240)
1. When choosing broadcast join, we only considered transferring less data. This may lead to OOM if the hash table is big enough.
2. Fix a bug in `Stats.computeSize()`. ColumnStats.dataSize is the total size of the column, but we need the byte size of one cell.
2022-11-27 23:22:43 +08:00
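The two points in #14240 can be illustrated with a minimal sketch. This is not the Doris implementation: the function names and the `max_hash_table_bytes` threshold are hypothetical, and the estimate assumes the hash table size is roughly build rows times row width.

```python
def bytes_per_cell(total_column_bytes: float, row_count: int) -> float:
    """Average size of one cell, given the total size of the whole column."""
    if row_count <= 0:
        return 0.0
    return total_column_bytes / row_count


def should_broadcast(build_rows: int, row_width_bytes: float,
                     max_hash_table_bytes: float) -> bool:
    """Allow broadcast join only if the estimated hash table fits the budget.

    max_hash_table_bytes is a hypothetical limit, used for illustration only.
    """
    estimated_hash_table_bytes = build_rows * row_width_bytes
    return estimated_hash_table_bytes <= max_hash_table_bytes
```

Using the total column size in place of the per-cell size would overestimate the row width by a factor of the row count, which is the kind of error the second point describes.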
948ee41632 [opt](planner) let cardinality in explain result more readable (#14330)
1. add commas for big ints in explain. For example, "1500000" will be printed as "1,500,000"
2. Scan node cardinality was missing
2022-11-27 23:12:41 +08:00
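The thousands-separator formatting described in #14330 can be sketched as below. The function name `format_cardinality` is hypothetical and the snippet only illustrates the output format, not the planner code.

```python
def format_cardinality(value: int) -> str:
    """Render an integer with a comma between each group of three digits."""
    return f"{value:,}"
```

For example, `format_cardinality(1500000)` yields the string `1,500,000` from the commit message.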
b3859e1e1a [enhancement](fe) Remove unnecessary kill in AutoCloseConnectContext (#14606)
The invocation of ConnectContext.kill in AutoCloseConnectContext is redundant and caused too many useless logs.
2022-11-26 23:54:33 +08:00
36419fae48 [fix](JdbcExecutor) fix that JdbcExecutor did not load the class jar (#14598)
JdbcExecutor did not load the JDBC driver jar, so add a classloader to load it.
2022-11-26 23:53:05 +08:00
064b8d2aa6 [fix](multi-catalog) fix coredump when querying partitioned hive table with text format (#14604)
BE crashes when querying a partitioned hive table in text format
with a partition column placed first in the select items.

1. FE should use file slots to set the column mapping index of csv file.
2. BE should use `get_by_name` of block to get right column in a block in csv reader.
2022-11-26 11:42:40 +08:00
52c6ba051e [feature](jsonb type)refactor JSONB type using column and add testcase (#13778)
1. Refactor JSONB type using ColumnString instead of making a copy.
2. Add regression testcase for JSONB load and functions.
2022-11-26 10:06:15 +08:00
2ae7dae925 [feature](nereids) Support row policy (#13879)
This pr did two things:
1. 【new logical plan】add **LogicalCheckPolicy** before UnboundRelation in LogicalPlanBuilder.
2. 【new rule】turn **LogicalCheckPolicy** into LogicalFilter if a row policy exists, otherwise remove it.
2022-11-25 22:57:56 +08:00
494f35c26b [fuzzy](test) disable some fuzzy variables since it has bugs (#14583)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-25 21:15:10 +08:00
45fa2fc56b [fix](multi catalog)Use -1 as external es table column id instead of uniq id (#14557)
External table columns are now stored in a cache, so the uniq id for external columns is no longer persisted.
So use -1 as the column id for ES external tables.
This avoids a non-master FE trying to get a uniq id, which would cause the non-master FE to fail to write bdbje.
2022-11-25 16:13:16 +08:00
9630257704 [fix](Nereids): fix bugs in random construct join plan (#14575) 2022-11-25 16:05:29 +08:00
4728e75079 [feature](bitmap) Support in bitmap syntax and bitmap runtime filter (#14340)
1. Support in bitmap syntax, like 'where k1 in (select bitmap_column from tbl)';
2. Support bitmap runtime filter. Generate a bitmap filter using the right table bitmap and push it down to the left table storage layer for filtering.
2022-11-25 15:22:44 +08:00
5efdcb9ed0 [improvement](storage) For debugging problem: add session variable (#14576) 2022-11-25 14:16:00 +08:00
d5d356b17f [vectorized](function) support order by field function (#14528)
* [vectorized](function) support order by field function

* update

* update test
2022-11-25 14:00:46 +08:00
deef491e01 [fix](Nereids) refactor CTE and EliminateAliasNode and fix the bug that CTE reuse relationId (#14534)
This PR contributes:
- support explain CTE;
- refine CTE and fix the bug where the same analyzed plan was reused, so LogicalOlapScan nodes had the same relationId;
- change EliminateAliasNode to LogicalSubQueryAliasToLogicalProject and move it to the top of the rewrite stage, so we can simply observe the analyzed plan by the LogicalSubQueryAlias with alias;
- jobs traverse the left child first, so ExprId grows from the left child to the right child.
2022-11-25 10:54:53 +08:00
5ccc875824 [fix](recycle) refactor the logic of erase meta with same name (#14551)
In #14482, we implemented a feature to keep a specific number of meta with the same name in the catalog recycle bin.
But it causes a meta replay bug.
Every time we drop a db/table/partition, it tries to erase a certain number of meta with the same name.
When replaying a "drop" edit log, it does the same thing. But the number of meta to erase is based on the current config value,
which is not persisted in the edit log, so "drop" and "replay drop" can be inconsistent.

In this PR, I move the "erase meta with same name" logic to the daemon thread of catalog recycle bin.
2022-11-25 09:47:24 +08:00
d12112b930 [fix](fe) Fix mem leaks (#14570)
1. Fix memory leaks in StmtExecutor::executeInternalQuery
2. Limit the number of concurrent running load task for statistics cache
2022-11-25 09:16:54 +08:00
9103ded1dd [improvement](join)optimize sharing hash table for broadcast join (#14371)
This PR is to make sharing hash table for broadcast more robust:

- Add a session variable to enable/disable this function.
- Do not block the hash join node's close function.
- Use shared pointer to share hash table and runtime filter in broadcast join nodes.
- The hash join node that doesn't need to build the hash table will close the right child without reading any data (the child will close the corresponding sender).
2022-11-24 21:06:44 +08:00
59b31a03c4 [Improvement](agg function) support group_bit_and/group_bit_or/group_bit_xor functions (#14386) 2022-11-24 16:46:42 +08:00
a04e1b49ec [feature](Nereids) Implement group by grouping sets, cube and rollup (#14496)
Issue Number: close #13615

The main work:

- implement grouping sets, cube, and rollup.
- fix the infinite loop problem in the if function.
- Support for isNull transitions to legacy optimizers.
2022-11-24 16:34:31 +08:00
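For readers unfamiliar with the constructs named in #14496, the standard SQL expansion of ROLLUP and CUBE into grouping sets can be sketched as follows. The helper names `rollup` and `cube` are hypothetical and this is not the Nereids implementation, only the textbook semantics.

```python
from itertools import combinations


def rollup(columns):
    """ROLLUP(a, b, c) expands to the prefixes (a, b, c), (a, b), (a), ()."""
    return [tuple(columns[:i]) for i in range(len(columns), -1, -1)]


def cube(columns):
    """CUBE(a, b) expands to every subset of the columns, largest first."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets
```

A ROLLUP over n columns therefore produces n + 1 grouping sets, while a CUBE produces 2^n.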
0680b3b4d5 [opt](nereids) adjust nereids related regression test cases (#14439)
1. in dateV2, we adjust the dir structure to avoid creating a tpch-1G database
2. use `drop table XXX`  to replace `delete * from XXX where key>0`
3. remove explain cases, because 
- the explain string itself is variable, and the case is hard to maintain
- it is original planner explain, not nereids
2022-11-24 16:02:52 +08:00
fde474609e [feature](Nereids) Add dphyp job (#14485) 2022-11-24 15:50:05 +08:00
8afe298a0f [Fix](function) fix function retention lost ARRAY's element type … (#14538) 2022-11-24 15:19:50 +08:00
6c7f758ef7 [improvement](hashjoin) support partitioned hash table in hash join (#14480) 2022-11-24 14:16:47 +08:00
e656dae3f0 [fix](fe) fix leaks of connect context (#14529)
Remove the ConnectContext built for internal statistics from the thread-local to avoid memory leaks.
2022-11-24 13:26:59 +08:00
ae4f4b9bf1 [fix](agg)having clause should use column name first then alias (#14408)
* [fix](agg)having clause should use column name first then alias

* fix fe ut
2022-11-24 10:31:58 +08:00
6ccdaf0aaf [fix](storage-policy) use Long instead of Date to persist cooldowntime in storage policy (#14532)
Previously, we use "Date" type for cooldownTime in StoragePolicy.
But the serialization method of Date type in Gson is different in java8 and java11, which may cause inconsistent meta error.

This PR uses Long to save cooldownTime.
And notice that in FE, the cooldownTime is saved in milliseconds, and in BE, it is saved in seconds.
2022-11-24 08:32:21 +08:00
496a92b668 [JavaUDF](loader) Fix compatible problem for JAVA 11 (#14519) 2022-11-23 23:36:39 +08:00
404cac42f9 [fix](multi catalog)Fix external table partition name and type inconsistent bug. (#14522)
The original code used a Set to store hms external table partition columns,
which doesn't guarantee the order of the columns.
This could cause the column name and column type to mismatch.
Use a List instead of a Set to fix the problem.
2022-11-23 21:40:44 +08:00
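The ordering issue behind #14522 can be illustrated with a small sketch. The function name and the sample column names are hypothetical; the point is only that pairing names with types by position is safe when both are kept in ordered sequences, and unsafe when one side comes from an unordered set.

```python
def pair_partition_columns(names, types):
    """Pair each partition column name with its type by position.

    Both arguments must be ordered sequences (lists or tuples). An
    unordered set gives no guarantee that position i still refers to
    the same column.
    """
    if len(names) != len(types):
        raise ValueError("names and types must have the same length")
    return list(zip(names, types))
```

With `["year", "month", "day"]` and `["int", "int", "string"]`, each name stays attached to the type at the same index.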
8d5eabb64f [enhancement](Nereids) reduce CostAndEnforcerJob call times (#14442)
Record the pruned plan's cost to avoid optimizing the same GroupExpression more than once.
2022-11-23 16:57:41 +08:00
45975dd321 [enhancement](Nereids): Change circle detector for better performance (#14438) 2022-11-23 14:31:14 +08:00
7a7e714fce [fix](nereids) width and penalty not derive when do stats derive (#14474)
A previous PR (#13883) refactored the stats derive code but missed width and penalty.
2022-11-23 14:26:51 +08:00
fb385dcf23 [opt](nereids) make fragment id in explain get inline with profile (#14421)
Nereids assigns fragment IDs in its own way, so the fragment ID in explain differs from the fragment ID in the profile.
This difference makes the profile hard to understand.

This pr aims to print fragment id in explain the same as that in profile.
2022-11-23 14:14:20 +08:00
7955e52b3e [fix](version) fix recover bug for lower version (#14457) 2022-11-23 14:05:17 +08:00
79688c34a1 [feature](catalog) add max num of same name meta information in catalog recycle bin (#14482) 2022-11-23 14:04:14 +08:00
d36b561520 [fix](in)fix in predicate datatype mismatch after union (#14497) 2022-11-23 09:57:03 +08:00
2eca51f3ba [enhancement](broker) broker load support tencent cos (#12801) 2022-11-22 21:51:15 +08:00
6eeebd47a9 [improvement](doc) add missing documents (#14460) 2022-11-22 21:42:00 +08:00
3360bdf124 [feature-wip](statistics) update cache when analysis job finished (#14370)
1. Update cache when analysis job finished
2. Rename `StatisticsStorageInitializer` to `InternalSchemaInitializer`
2022-11-22 21:33:10 +08:00
89c676e597 [Bug] fix bug for grouping set query which where condition is false (#14401) 2022-11-22 16:03:43 +08:00
663f7dddcc [improvement](planner) eliminating useless sort node (#14377) 2022-11-22 15:13:25 +08:00
730cd1a0c1 [Feature](Nereids) Simplify range of predicate (#14113)
Simplify range of predicate

for example:
1. `a > 1 or a > 2` => `a > 1`
2. `a in (1,2,3) or a in (3,4,5)` => `a in (1,2,3,4,5)`
2022-11-21 20:24:03 +08:00
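The two rewrites listed in #14113 rest on simple equivalences, sketched below. The function names are hypothetical and the code works on plain values, not on Nereids expression trees.

```python
def simplify_or_greater_than(bounds):
    """`a > x1 or a > x2 or ...` is equivalent to `a > min(x1, x2, ...)`."""
    return min(bounds)


def simplify_or_in_lists(*in_lists):
    """`a in L1 or a in L2` is equivalent to `a in` the union of L1 and L2."""
    merged = set()
    for values in in_lists:
        merged.update(values)
    return sorted(merged)
```

The disjunction of lower bounds keeps the weakest bound, and the disjunction of in-lists keeps each distinct value once.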
91bd76a902 [enhancement](FE) use forEach() to replace stream().forEach() (#14039) 2022-11-21 15:40:43 +08:00
a91fe11b4d [feature](Nereids) Add random test framework (#14388) 2022-11-21 15:16:03 +08:00
ce489cf723 [Feature](JDBC)support clickhouse jdbc external table (#14244) 2022-11-21 10:33:53 +08:00
a9a6fdd8c3 [fix](insert) fix insert into table which contains column name prefix mv_ (#14361) 2022-11-21 10:31:01 +08:00
4976021bf7 [Enhancement] Doris broker support aliyun-oss #13665 (#14305) 2022-11-21 10:29:14 +08:00