13721 Commits

Author SHA1 Message Date
Pxl
c033c6239f [Bug](table-function) fix wrong result when separator of explode_split size more than one (#18824)
fix wrong result when the separator of explode_split is longer than one character
2023-04-21 11:00:47 +08:00
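To illustrate the kind of behavior the explode_split fix in c033c6239f concerns, here is a minimal Python sketch of splitting on a separator longer than one character. It is not the Doris implementation: the function name `explode_split_sketch` is hypothetical, and the handling of an empty separator is an assumption made only for this example.

```python
from typing import List


def explode_split_sketch(text: str, separator: str) -> List[str]:
    """Split text on the whole separator string, returning one piece per output row."""
    if separator == "":
        # Assumption for this sketch only: an empty separator returns the input unchanged.
        return [text]
    pieces = []
    start = 0
    while True:
        idx = text.find(separator, start)
        if idx == -1:
            # No further separator: the remainder is the last piece.
            pieces.append(text[start:])
            break
        pieces.append(text[start:idx])
        # Advance by the full separator length, not by a single character.
        start = idx + len(separator)
    return pieces
```

The point of the sketch is the last step: the scan position moves forward by `len(separator)`, so a single `|` inside the input is not mistaken for the two-character separator `||`.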
60640bdef0 [chore](cold_heat) fix cold heat case to use correct http api (#18870) 2023-04-21 10:43:52 +08:00
63a76ed115 [refactor](exceptionsafe) disallow call new method explicitly (#18830)
Disallow calling new explicitly.
Force the use of create_shared or create_unique to obtain smart pointers.
Placement new is still allowed.
Following https://abseil.io/tips/42, add a factory method to every class.
I think we should follow this guide because if an exception is thrown in the new method, the program will terminate.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-21 09:13:24 +08:00
5a0c1cb1b9 [typo](doc)Update materialized-view.md (#18826) 2023-04-21 09:12:42 +08:00
eb93afc614 [MemLeak](pipeline) fix mem leak by exchange node in pipeline (#18864) 2023-04-21 09:06:55 +08:00
b26e2d5d50 [bugfix](memoryleak) close expr after it is pushdown to storage layer (#18849) (#18852)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-21 05:21:16 +08:00
Pxl
9e64951721 [Chore](asan) set decrementOutputRecursionDepth to suppressions and remove some unu… (#18845)
18845
2023-04-20 23:33:25 +08:00
c6b1b9de80 [Improvement](broker) support broker load from tencent Goose File System (#18745)
Including below functions:
1. broker load
2. export
3. select into outfile
4. create repo and backup to gfs
After configuring the environment, GFS can be used like any other HDFS system.
2023-04-20 23:12:17 +08:00
2c79f630fb change function version (#18811) 2023-04-20 23:10:39 +08:00
097dcf2119 [fix](outfile) unify broker and hdfs path in outfile (#18809)
unify broker and hdfs path in outfile
fix fe ut and add outfile case
2023-04-20 21:01:39 +08:00
94509e51af [fix](editLog) add sufficient replay logic and edit log for altering light schema change (#18746) 2023-04-20 19:20:03 +08:00
c4e469c82c [feature](agg) Support spill to disk in aggregation (#18051) 2023-04-20 18:59:08 +08:00
668c681fbc [Fix](Nereids) Check bound status in analyze straight after bounding (#18581)
Problem:
An infinite loop occurs because analyze tasks keep being pushed onto the job stack. When the analyze process generates new operators, the same analyze rule is pushed again, which causes the loop. The analyze process generates new operators when it tries to bind order-by keys and aggregate functions.

Solution:
We need to throw an exception before the complex analyze and rewrite process, so the check that all expressions are bound is done twice: once after binding all expressions, and once after the whole analyze process, in case new expressions or new operators were generated.

Example:
Cases were put in file: regression-test/suites/nereids_p0/except/test_bound_exception.groovy
2023-04-20 18:50:13 +08:00
8e2146f48c [Enhancement](Export) support export with outfile syntax (#18325)
`Export` syntax provides an asynchronous export function, but `Export` is not vectorized.
`Outfile` syntax provides a synchronous export function.
So we can reimplement the export syntax on top of the outfile syntax.
2023-04-20 17:27:04 +08:00
ea795b9909 [fix](nereids)disable SelectMaterializedIndexWithAggregate rule (#18380)
* [fix](nereids)disable SelectMaterializedIndexWithAggregate rule

* rebase code

* disable related test cases

* remove failed test cases for now
2023-04-20 17:02:36 +08:00
918a244068 [chore](pom) update apache pom to 29 (#18843) 2023-04-20 16:57:05 +08:00
c659e0bfc7 [Improvement](bloom filter) adjust bloom filter size (#18846) 2023-04-20 16:50:22 +08:00
b2757c4d02 [test](fix) Fix invalid decimal type in regression test cases (#18844) 2023-04-20 15:49:58 +08:00
3644dfa9fd [fix](Nereids) stddev functions not support decimalv3 type arg (#18840) 2023-04-20 14:54:12 +08:00
52d32cccad [enhance](Nereids): check cycle by getParentGroupExpressions(). (#18687) 2023-04-20 11:51:58 +08:00
3328a65b75 [Fix](multi-catalog) Use decimal v3 type to fix decimal loss issue in multi-catalog module. (#18835)
Fix decimal v3 precision loss issues in the multi-catalog module.
Now it will use decimal v3 to represent decimal type in the multi-catalog module.
Regression Test: `test_load_with_decimal.groovy`
2023-04-20 11:02:53 +08:00
ab9500bfa6 [optimize](string) optimize instr and locate function for constant arguments (#18692)
Optimize the instr and locate functions for constant arguments.

    With constant arguments, the instr and locate functions show a 58%~200% performance improvement.
    Refactor locate(substr, str, pos) to use standardized argument processing.
2023-04-20 10:40:19 +08:00
7c099c5747 [bugfix](be) Fix segmentation fault if the PID_DIR wasn't set (#18789)
Doris BE would crash if the PID_DIR wasn't set
2023-04-20 10:39:54 +08:00
33d4c60570 [RegressionTest](fuzzy) enable set global enable_pipeline_engine (#18832)
enable set global enable_pipeline_engine
2023-04-20 10:38:11 +08:00
293e115536 [Improvement](bloom filter) initialize bloom filter with adaptive size (#18785) 2023-04-20 10:06:40 +08:00
Pxl
908fbf92cf [Chore](build) ignore compile warning on orc && fix invalid command curdate on conf (#18810)
ignore compile warning on orc && fix invalid command curdate on conf
2023-04-20 10:03:40 +08:00
fa2a50f4c1 Update materialized-view.md (#18827) 2023-04-19 23:31:23 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
Pxl
c40860aba4 [Chore](thrift) generate thrift java code to make code analysis work well (#18793)
generate thrift java code to make code analysis work well
2023-04-19 19:33:17 +08:00
fb377a9da9 [Improvement](functions)Optimized some datetime function's return value (#18369) 2023-04-19 15:51:11 +08:00
1f5f5a12b6 [fix](Nereids): need update parentExpression after replace child. (#18771) 2023-04-19 15:13:42 +08:00
f280b04736 [regression-test](iceberg)add iceberg in regression case (#18792)
add iceberg 'in' clause regression case
for #18226
2023-04-19 15:09:20 +08:00
93b35bbfbf [feature](multi-catalog) add catalog comment and create time info (#18778)
add catalog comment and create time info
```
create catalog hms_ctl
comment 'your comment' 
properties (
'type'='hms',
'hive.metastore.uris' = 'thrift://xx:1234' );
```
The create time is generated when the catalog is created.

Use show catalogs and show create catalog to view this information.
2023-04-19 15:08:42 +08:00
1a25f110ec [Fix](planner)Fix TupleDescriptor include not materialized slot bug (#18783)
The setOutputSmap function in ScanNode may add non-materialized slots to outputTupleDesc. This PR fixes that.
2023-04-19 14:08:09 +08:00
446db3def6 [opt](nereids) estimate broadcast cost by a new formula (#18744)
estimate broadcast cost with an empirical formula: beNumber^0.5 * rowCount
1. the sender and receiver counts are not available at the RBO stage yet, so we use beNumber
2. senders and receivers work in parallel, which is why we use the square root of beNumber
2023-04-19 12:14:55 +08:00
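The broadcast cost formula in 446db3def6 (beNumber^0.5 * rowCount) can be worked through with a small Python sketch. This is only an illustration of the arithmetic, not the Nereids cost model code: the function name `broadcast_cost_sketch`, its parameter names, and the input validation are assumptions.

```python
import math


def broadcast_cost_sketch(be_number: int, row_count: float) -> float:
    """Estimate broadcast cost as sqrt(be_number) * row_count."""
    if be_number < 1:
        # Assumption for this sketch: at least one backend must exist.
        raise ValueError("be_number must be at least 1")
    if row_count < 0:
        raise ValueError("row_count must not be negative")
    # The exponent 0.5 in the formula is a square root.
    return math.sqrt(be_number) * row_count
```

For example, with 4 backends and 1000 rows the sketch returns 2000.0, where a cost that scaled linearly with the backend count would give 4000.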
7d6b1a115a [feature](nereids)Tpc-h 1T plan shape check #18717
add regression test to check tpc-h 1T plan shape
2023-04-19 12:00:54 +08:00
15529afed8 [minor](decimal) forbid to create table with decimal type exceeds 18 (#18763)
* [minor](decimal) forbid to create table with decimal type exceeds 18

* update
2023-04-19 11:34:27 +08:00
0b379de602 [refactor](scan) optimize the agg function of count(1) (#18739) 2023-04-19 09:10:51 +08:00
19a4c7fae6 [improvement](doc) add --cap-add SYS_PTRACE command for docker-dev.md (#18790) 2023-04-19 09:07:35 +08:00
431ba5341a [Doc] Fix error test example (#18764) 2023-04-19 09:06:27 +08:00
a68af93d30 [fix](compile) Fix block.cpp compilation failure (#18797) 2023-04-19 08:49:23 +08:00
f520e5a2c6 [typo](docs) fix insert load doc error (#18773) 2023-04-18 21:13:23 +08:00
d24a8a524e [refactor](fe): Remove resource group which is useless (#18249) 2023-04-18 21:04:30 +08:00
0165ffbcae [Fix](vertical compaction) Preserve _segment_num_rows during final segment flush (#18779) 2023-04-18 20:58:23 +08:00
5c076b738b [improvement](resource-group) add test for resource group (#18575)
Co-authored-by: wangbo <youseebiggirl_t_t@qq.com>
2023-04-18 20:20:50 +08:00
c323bc44ff [feature](docker)add be init script option (#16909) 2023-04-18 20:03:18 +08:00
380bc16595 fix doc (#18772) 2023-04-18 19:54:28 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
4a16eff16d [fix](merge-on-write) enable_unique_key_merge_on_write property should only be used for unique table (#18734) 2023-04-18 18:40:01 +08:00
031d35d4a1 [fix](stats) Stats still in cache after user dropped it (#18720)
1. Evict the dropped stats from cache
2. Remove codes for the partition level stats collection
3. Disable analyze whole database directly
4. Fix the potential infinite loop in the stats cleaner
5. Sleep thread in each loop when scanning stats table to avoid excessive IO usage by this task.
2023-04-18 16:41:10 +08:00