5359 Commits

Author SHA1 Message Date
fd2c374426 [fix]Empty string key in aggregation was output as NULL (#11011) 2022-07-19 23:25:28 +08:00
Pxl
95366de7f6 cast array element to same type (#10980)
Fix a problem that occurs when an array contains elements of different types.
2022-07-19 21:47:10 +08:00
371c7be235 [feature-wip](unique-key-merge-on-write) add segment lookup interface implementation, DSIP-018 (#10922) 2022-07-19 21:14:32 +08:00
d7770db5e2 Revert "[regressiontest] add tpcds_sf1 test (#10852)" (#11008)
This reverts commit d2bee602514e8238dd8ef3d3b9b34fb6171bd26f.
2022-07-19 18:41:53 +08:00
2d90f4b87c [feature-wip](statistics) step4: collect statistics by implementing statistics tasks (#8861)
This pull request includes some implementations of statistics (https://github.com/apache/incubator-doris/issues/6370). It will not affect any existing code, and users will not be able to create statistics jobs.

Currently, only MetaStatisticsTask, which collects statistics directly by reading FE meta, is implemented. SQLStatisticsTask, which needs to query BE through FE, is still being implemented.

This PR implements the following:
1. Support statistics collection for partitioned and non-partitioned tables. For partitioned tables, the collection of statistics for the specified partition is implemented.
2. Tasks are divided according to whether the table is partitioned or non-partitioned. The finest granularity is the tablet level. A meta task collects as many statistics as possible.
3. Add partition statistics (Table -> Partition -> Column). For example, the size of the table, the number of rows, the size of the partition, the number of rows, the maximum and minimum values of the columns, etc.
4. Display and modify partition-level statistics.
 …
2022-07-19 16:22:25 +08:00
ac4ce4d874 Revert "[regression] Add ssb sf1 test under unique table with zstd (#10957)" (#10992)
This reverts commit 216a55c12c0be5c4090523195b2aff9d96c64f65.
2022-07-19 15:44:32 +08:00
d5fa66d9a3 [Enhancement] [Memory] Limit memory usage use process actual physical memory (#10924) 2022-07-19 11:08:39 +08:00
b70274e2af [docs] Changing the symbol of dataX doriswriter table creation statement (#10632)
* Update datax.md
2022-07-19 10:15:27 +08:00
f6cb7a838b [Optimize] Improve performance like/not like filter through pushdown function to storage engine (#10355)
* support like/not like conjuncts push down to storage engine
* vectorized engine support like/not like conjuncts push down to storage engine
* support both evaluate and evaluate_vec method in like predicate
* reuse remove_pushed_conjuncts and prevent logic error during move function conjuncts
* change #ifndef to pragma once as per comments
* change enable_function_pushdown default to false
Co-authored-by: heguangnan <heguangnan@bytedance.com>
2022-07-19 08:33:04 +08:00
d2bee60251 [regressiontest] add tpcds_sf1 test (#10852)
Co-authored-by: smallhibiscus <844981280>
Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-19 08:30:53 +08:00
2acd5efcd8 [improvement-log]print a log when got a lower image version (#10910) 2022-07-19 08:29:58 +08:00
842ff2b1e2 [refactor] Refactor time LUT (#10982) 2022-07-19 08:23:29 +08:00
68b9a2936a [improvement](doe) Step1: Fe generates the DSL and is used to explain (#9895)
For the first step, I will only change FE and then change BE once I make sure the DSL is ok.
2022-07-18 23:20:58 +08:00
e769597fd2 [Improvement] (datetime) support microsecond for date literal (#10917)
* [Improvement] (datetime) support microsecond for date literal

* remove joda dependency
2022-07-18 21:39:39 +08:00
8a366c9ba2 [feature](multi-catalog) read parquet file by start/offset (#10843)
To avoid reading the same row group more than once, offsets should be aligned.
2022-07-18 20:51:08 +08:00
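A minimal sketch of one way to keep splits from reading the same row group twice, as commit 8a366c9ba2 above aims to do. The ownership rule used here (a split reads a row group only if the row group's starting byte offset falls inside the split's range) is an assumption for illustration, not something the commit message states, and `row_groups_for_split` is a hypothetical helper, not a Doris function.

```python
def row_groups_for_split(row_group_offsets, split_start, split_size):
    """Return indexes of the row groups owned by one split.

    Assumed rule (not taken from the commit): a row group belongs to the
    split whose byte range [split_start, split_start + split_size)
    contains the row group's starting offset.
    """
    split_end = split_start + split_size
    return [
        index
        for index, offset in enumerate(row_group_offsets)
        if split_start <= offset < split_end
    ]


# Hypothetical file with three row groups starting at these byte offsets.
offsets = [4, 1000, 2500]
# Three equal byte-range splits as (start, size) pairs.
splits = [(0, 1024), (1024, 1024), (2048, 1024)]
assignment = [row_groups_for_split(offsets, start, size) for start, size in splits]
```

Under this rule every row group is read by exactly one split, even when a row group's bytes straddle a split boundary.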
60dd322aba [feature-wip](multi-catalog) Optimize threads and thrift interface of FileScanNode (#10942)
FileScanNode in BE launches as many threads as there are splits.
The Thrift interface of FileScanNode is excessively redundant.
2022-07-18 20:50:34 +08:00
a849f5be71 [feature](Nereids): hashCode(), equals() and UT. (#10870)
Add hashCode(), equals() for operator.

Add basic UTs for them (more detailed tests are still needed).

**future ticket**: add hashCode(), equals() and UT for `Expression`
2022-07-18 20:33:10 +08:00
4c161b7e2c [regression-test] add tpch_sf1 test (#10846)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-18 20:00:02 +08:00
b185545243 [refactor](Nereids)remove generic type from Rule and Job (#10897) 2022-07-18 19:35:16 +08:00
Pxl
afc1d0c05c [Chore][Compile] fix compile fail on clang (#10837)
Fix a compile failure on clang caused by outputting int128.
2022-07-18 19:21:01 +08:00
899acb6564 [improvement][agg]import sub hashmap (#10937) 2022-07-18 18:36:45 +08:00
b037aca4fd [improvement](dynamic-partition) add replication allocation check for dynamic partition when creating table(#10892) 2022-07-18 18:02:33 +08:00
a2ed4b5c78 [improvement] improvement for light weight schema change (#10860)
* improvement for dynamic schema
Do not use the schema as the LRU cache key any more.
Load segments using the rowset's original schema, not the current read schema.
Generate column readers and column iterators using the original schema; use the read schema for new columns.
Use the column unique id as the key instead of the column ordinal.
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-07-18 17:53:31 +08:00
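The switch from column ordinals to column unique ids in commit a2ed4b5c78 above can be illustrated with a small sketch. The schemas, the `(unique_id, name)` tuple layout, and the helper `map_read_columns` are hypothetical and only show why unique ids stay stable across a schema change while ordinals shift.

```python
def map_read_columns(read_schema, rowset_schema):
    """Map each column of the read schema to its position in the rowset schema.

    Both schemas are lists of (unique_id, name) tuples (an assumed layout).
    A result of None means the column does not exist in the rowset, so it
    is a new column and must be resolved from the read schema instead.
    """
    position_by_uid = {uid: pos for pos, (uid, _name) in enumerate(rowset_schema)}
    return [(name, position_by_uid.get(uid)) for uid, name in read_schema]


# Rowset written before the schema change.
rowset_schema = [(0, "id"), (1, "name"), (2, "age")]
# Current read schema: "name" was dropped and "city" was added.
read_schema = [(0, "id"), (2, "age"), (3, "city")]

mapping = map_read_columns(read_schema, rowset_schema)
```

Here `age` is ordinal 1 in the read schema but position 2 in the rowset, so matching by ordinal would read the wrong column; matching by unique id finds the right one, and `city` is reported as new.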
ba04c983ae [regression-test]Add order by for qt_select1 in test_aggregate_all_functions (#10951) 2022-07-18 17:44:23 +08:00
890fd70620 [improvement] dynamically calculate max rows to read in a batch to avoid oom (#10972) 2022-07-18 17:43:53 +08:00
6736e06679 [feature](udf) Vectorization support remote udaf #10683 (#10685) 2022-07-18 17:15:34 +08:00
9adbd8abbd [feature](resource-tag) support multi tag for a single Backend (#10901) 2022-07-18 16:50:45 +08:00
091e17ecab fix(fe): add , with json_root property in stmt show create routine load for xx_job (#10929)
Fix issue: https://github.com/apache/doris/issues/10928
2022-07-18 16:44:40 +08:00
216a55c12c [regression] Add ssb sf1 test under unique table with zstd (#10957)
* Add ssb sf1 test under unique table with zstd

Co-authored-by: smallhibiscus <844981280>
2022-07-18 16:35:14 +08:00
d9095922d9 [Enhancement] [Memory] add strict memory usage compile option STRICT_MEMORY_USE (#10936)
In the strict memory usage mode (STRICT_MEMORY_USE=ON), when the capacity of the vectorized Hash Table is greater than 2G, the table starts to grow only once 75% of its capacity is filled, and the memory usage of the vectorized Join becomes 50% of the previous value.

`STRICT_MEMORY_USE=ON` expects BE to use less memory, and gives priority to ensuring stability when the cluster memory is limited.
2022-07-18 16:16:43 +08:00
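A rough sketch of the growth rule described in commit d9095922d9 above. Only the 75% threshold and the 2G boundary come from the commit message. The 50% threshold used for the non-strict case, the reading of "2G" as bytes, and the function name `should_grow` are assumptions made for illustration.

```python
TWO_GIB = 2 * 1024 ** 3  # assumes "2G" in the commit message means bytes

# Assumed fill ratio at which the table grows outside strict mode.
DEFAULT_GROW_THRESHOLD = 0.5
# Fill ratio named in the commit for large tables in strict mode.
STRICT_GROW_THRESHOLD = 0.75


def should_grow(fill_ratio, capacity_bytes, strict_memory_use):
    """Decide whether a hash table should grow at the given fill ratio."""
    if strict_memory_use and capacity_bytes > TWO_GIB:
        threshold = STRICT_GROW_THRESHOLD
    else:
        threshold = DEFAULT_GROW_THRESHOLD
    return fill_ratio >= threshold


four_gib = 4 * 1024 ** 3
grows_in_strict_mode = should_grow(0.6, four_gib, strict_memory_use=True)
grows_in_default_mode = should_grow(0.6, four_gib, strict_memory_use=False)
```

With these assumed values, a 4 GiB table that is 60% full grows in the default mode but not in strict mode, which is how delaying growth trades a fuller table for lower memory use.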
d199283df0 [Docs] add doc of tablet local debug (#10944)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-07-18 16:02:29 +08:00
006d7c9225 [fix]The spring boot startup banner is lost, and the maven package does not package the pictures in the resources directory (#10955) 2022-07-18 16:00:14 +08:00
234e822b36 [Regression](Array) add more array test (#10770) 2022-07-18 15:27:13 +08:00
16866e3c55 [doc] Add compression properties to the create table document (#10829)
Co-authored-by: smallhibiscus <844981280>
2022-07-18 15:25:53 +08:00
108e6207b1 [doc] fix sequence_column_manual.md (#10907)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-07-18 15:24:15 +08:00
cc7c31b080 [Bug](be) fix be coredump when receive singal(#10903). (#10953) 2022-07-18 15:23:51 +08:00
dc01ea7ad9 [fix](nereids) Fix the substring compilation error caused by merge (#10965)
Compilation error after merging due to Literal refactoring.
Compilation failure:
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/Substring.java:[40,38] org.apache.doris.nereids.trees.expressions.Literal is abstract; cannot be instantiated
2022-07-18 15:20:25 +08:00
238395e282 [Bug] fix decimal arithmetic calculations (#10963) 2022-07-18 14:35:07 +08:00
8c544b6e13 fix show storage policy null pointer and redundant log (#10906)
* fix show storage policy null pointer and redundant log
2022-07-18 14:08:54 +08:00
77ef19dbcd [BugFix](Array)Fix using Array aggregate function caused be coredump (#10649) 2022-07-18 13:47:17 +08:00
0b177669d9 [feature](nereids) support substring (#10847)
support substring, for example:
select substr(a, 2), substring(b ,3 ,4) from test1;
2022-07-18 12:38:56 +08:00
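The two call forms in commit 0b177669d9 above, `substr(a, 2)` and `substring(b, 3, 4)`, can be mimicked with a short sketch. It assumes the common SQL convention of 1-based positions and handles only positive start positions; `sql_substring` is a hypothetical helper, and edge cases such as zero or negative positions are left out because the commit does not describe them.

```python
def sql_substring(text, pos, length=None):
    """Return a substring using a 1-based start position.

    Assumes pos >= 1. Without a length, the result runs to the end of
    the string; with a length, at most that many characters are returned.
    """
    if pos < 1:
        raise ValueError("this sketch only handles pos >= 1")
    start = pos - 1  # convert the 1-based SQL position to a 0-based index
    if length is None:
        return text[start:]
    return text[start:start + max(length, 0)]


from_second = sql_substring("abcdef", 2)
third_plus_four = sql_substring("abcdef", 3, 4)
```

For the string `abcdef`, the two-argument form starting at position 2 yields `bcdef`, and the three-argument form starting at position 3 with length 4 yields `cdef`.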
bf95440c13 [Refactor](nereids)Refactor Literal to an inheritance hierarchy (#10771)
Use an inheritance hierarchy instead of composition to make the framework clearer.
2022-07-18 12:01:30 +08:00
d0dc93654e [doc] update proxy_connect_timeout and proxy_timeout from 30s to 300s (#10753) 2022-07-18 10:55:41 +08:00
2d5aca18fb [feature-wip](array) add the array_sort function (#10598)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-18 10:52:42 +08:00
e3c19ded44 [enhancement](thirdparty) Support building thirdparty on macOS (#10677) 2022-07-18 10:50:30 +08:00
5c88a74792 [Enhancement] generate runtime filter only for tuples with conjunct (#8745)
Remove useless runtime filters in some primary-key/foreign-key join scenarios in the TPC-H case.
2022-07-18 09:37:45 +08:00
2b6cdcf599 [improvement] add an option to let regression stop when a failure happen (#10939)
For the community pipeline, it is a waste of resources to keep running tests after a failure.
2022-07-18 08:53:17 +08:00
ec5996f1f8 [improvement]do not acquire mutex in metric hook (#10941) 2022-07-18 08:52:24 +08:00
2e94674cb5 [fix](alter) fix bug that fe crash because npe on rollupBatchTask (#10943)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-07-18 08:47:25 +08:00
523d395527 [refactor] Remove alpha rowset meta (#10933)
* remove alpha_rowset_meta
* remove alpha rowset related codes in compaction
* remove alpha rowset related codes in RowsetMeta
* fix BE UTs because some UTs use alpha rowset meta
2022-07-18 08:45:46 +08:00