doris

Author	SHA1	Message	Date
AlexYue	a607c30ad4	[docs] Fe build idea doc (#10996 ) * [doc](fe): enhance the fe-idea-dev * [doc](fe)add solution for m1 mac compile error Co-authored-by: jackwener <jakevingoo@gmail.com>	2022-07-20 19:03:29 +08:00
Hong Liu	b62e3e7aa0	[regression test]Add ssb sf1 test under unique table with zstd (#11004 ) Co-authored-by: smallhibiscus <844981280>	2022-07-20 18:59:46 +08:00
camby	0a8ae6aeec	Refractor COLLECT_LIST and COLLECT_SET register logic (#10956 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-07-20 18:02:39 +08:00
Dongyang Li	1ca00e0107	[tools] add clickbench tools (#11009 ) * [tools] add clickbench tools Co-authored-by: stephen <hello-stephen@qq.com>	2022-07-20 17:59:04 +08:00
Adonis Ling	e5663f9872	[Bug](array-type) Fix the core dump caused by unaligned __int128 (#11020 ) Fix the core dump caused by unaligned __int128 and change DEFAULT_ALIGNMENT	2022-07-20 16:37:27 +08:00
Lightman	a71822a74d	[refactor]remove col_unique_id (#11025 )	2022-07-20 16:35:14 +08:00
Mingyu Chen	7bdce8f572	[refactor](policy) refactor some policy create and check logic (#11007 ) * [refactor](policy) refactor some policy create and check logic	2022-07-20 16:20:59 +08:00
morrySnow	658a9f7531	[fix](planner)unnecessary cast will be added on children in InPredicate (#11033 )	2022-07-20 16:00:26 +08:00
minghong	6233b5200e	[refactor] (Nereids) rename GroupExpression.getParent() to getOwnerGroup() (#11027 ) GroupExpression.getParent() returns the group which contains this expr. This name is missleading especially in tree structures. So we change the name to getOwnerGroup.	2022-07-20 15:57:59 +08:00
zhannngchen	a1c1cfce47	Add some comments for the feature mow (#11028 )	2022-07-20 15:35:41 +08:00
zhannngchen	ec5471f048	[feature-wip](unique-key-merge-on-write) Implement tablet lookup interface, using rowset-tree, DSIP-018[3/5] (#10938 )	2022-07-20 14:52:14 +08:00
Shuo Wang	9b91f86c38	[Feature](Nereids) Reorder join to eliminate cross join. (#10890 ) Try to eliminate cross join via finding join conditions in filters and changing the join orders. For example: -- input: SELECT * FROM t1, t2, t3 WHERE t1.id=t3.id AND t2.id=t3.id -- output: SELECT * FROM t1 JOIN t3 ON t1.id=t3.id JOIN t2 ON t2.id=t3.id This feature is controlled by session variable enable_nereids_reorder_to_eliminate_cross_join with true by default. Simplify usage of Memo and rewrite rule application. Before this PR, if we want to apply a rewrite rule to a plan, the code is like the below: Memo memo = new Memo(); memo.initialize(root); PlannerContext plannerContext = new PlannerContext(memo, new ConnectContext()); JobContext jobContext = new JobContext(plannerContext, new PhysicalProperties(), 0); RewriteTopDownJob rewriteTopDownJob = new RewriteTopDownJob(memo.getRoot(), ImmutableList.of(new AggregateDisassemble().build()), jobContext); plannerContext.pushJob(rewriteTopDownJob); plannerContext.getJobScheduler().executeJobPool(plannerContext); Plan after = memo.copyOut(); After this PR, we could use chain style calling: new Memo(plan) .newPlannerContext(connectContext) .setDefaultJobContext() .topDownRewrite(new AggregateDisassemble()) .getMemo() .copyOut(); Rename the session variable enable_nereids to enable_nereids_planner to make it more meaningful.	2022-07-20 13:53:54 +08:00
Mingyu Chen	56e036e68b	[feature-wip](multi-catalog) Support runtime filter for file scan node (#11000 ) * [feature-wip](multi-catalog) Support runtime filter for file scan node Co-authored-by: morningman <morningman@apache.org>	2022-07-20 12:36:57 +08:00
Kikyou1997	a5a50726bf	[Ehancement](planner) Rewrite implicit cast to the predicates (#10920 ) During the analysis of BinaryPredicate, it will generate a CastExpr if the slot implicitly in the below case: SELECT * FROM t1 WHERE t1.col1 = '1'; col1 is integer column. This will prevent the binary predicate from pushing down to OlapScan which would impact the performance.	2022-07-20 12:28:29 +08:00
yixiutt	dc2b709f6f	[Bug](compaction) fix uniq key compaction bug that does not count merged rows right(#10971 ) When a rowset includes multiple segments, segments rows will be merged in generic_iterator but merged_rows is not maintained. Compaction will failed in check_correctness. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-07-20 12:07:45 +08:00
plat1ko	989e6d1cf9	[chore]fix clang compile error (#11021 )	2022-07-20 08:28:47 +08:00
Mingyu Chen	ba9c7e50aa	[doc] missing sidebar for cloudcanal (#10998 )	2022-07-19 23:51:12 +08:00
Jerry Hu	fd2c374426	[fix]Empty string key in aggregation was output as NULL (#11011 )	2022-07-19 23:25:28 +08:00
Pxl	95366de7f6	cast array element to same type (#10980 ) Fix problem when there are element of different types in an array.	2022-07-19 21:47:10 +08:00
Xin Liao	371c7be235	[feature-wip](unique-key-merge-on-write) add segment lookup interface implementation, DSIP-018 (#10922 )	2022-07-19 21:14:32 +08:00
Dongyang Li	d7770db5e2	Revert "[regressiontest] add tpcds_sf1 test (#10852 )" (#11008 ) This reverts commit d2bee602514e8238dd8ef3d3b9b34fb6171bd26f.	2022-07-19 18:41:53 +08:00
ElvinWei	2d90f4b87c	[feature-wip](statistics) step4: collect statistics by implementing statistics tasks (#8861 ) This pull request includes some implementations of the statistics(https://github.com/apache/incubator-doris/issues/6370), it will not affect any existing code and users will not be able to create statistics job. Now only MetaStatisticsTask that directly collects statistics by reading FE meta is implemented. SQLStatisticsTask is still being implemented, it needs to query BE through FE. The following is the function implemented by this pr: 1. Support statistics collection for partitioned and non-partitioned tables. For partitioned tables, the collection of statistics for the specified partition is implemented. 2. When the task is divided, it is divided according to the partition table and the non-partition table. The most fine-grained is to the tablet level. A matetask collects as many statistics as possible. 3. Add partition statistics (Table -> Partition -> Column). For example, the size of the table, the number of rows, the size of the partition, the number of rows, the maximum and minimum values of the columns, etc. 4. Display and modify partition-level statistics. …	2022-07-19 16:22:25 +08:00
Hong Liu	ac4ce4d874	Revert "[regression] Add ssb sf1 test under unique table with zstd (#10957 )" (#10992 ) This reverts commit 216a55c12c0be5c4090523195b2aff9d96c64f65.	2022-07-19 15:44:32 +08:00
Xinyi Zou	d5fa66d9a3	[Enhancement] [Memory] Limit memory usage use process actual physical memory (#10924 )	2022-07-19 11:08:39 +08:00
yuanyuan8983	b70274e2af	[docs] Changing the symbol of dataX doriswriter table creation statement (#10632 ) * Update datax.md	2022-07-19 10:15:27 +08:00
Jet He	f6cb7a838b	[Optimize] Improve performance like/not like filter through pushdown function to storage engine (#10355 ) * support like/not like conjuncts push down to storage engine * vectorized engine support like/not like conjuncts push down to storage engine * support both evaluate and evaluate_vec method in like predicate * reuse remove_pushed_conjuncts and prevent logic error during move function conjuncts * change #ifndef to pragma once as per comments * change enable_function_pushdown default to false Co-authored-by: heguangnan <heguangnan@bytedance.com>	2022-07-19 08:33:04 +08:00
Dongyang Li	d2bee60251	[regressiontest] add tpcds_sf1 test (#10852 ) Co-authored-by: smallhibiscus <844981280> Co-authored-by: stephen <hello-stephen@qq.com>	2022-07-19 08:30:53 +08:00
Yongqiang YANG	2acd5efcd8	[improvement-log]print a log when got a lower image version (#10910 )	2022-07-19 08:29:58 +08:00
Gabriel	842ff2b1e2	[refactor] Refactor time LUT (#10982 )	2022-07-19 08:23:29 +08:00
Stalary	68b9a2936a	[improvement](doe) Step1: Fe generates the DSL and is used to explain (#9895 ) For the first step, I will only change FE and then change BE once I make sure the DSL is ok.	2022-07-18 23:20:58 +08:00
Gabriel	e769597fd2	[Improvement] (datetime) support microsecond for date literal (#10917 ) * [Improvement] (datetime) support microsecond for date literal * remove joda dependency	2022-07-18 21:39:39 +08:00
slothever	8a366c9ba2	[feature](multi-catalog) read parquet file by start/offset (#10843 ) To avoid reading the repeat row group, we should align offsets	2022-07-18 20:51:08 +08:00
Ashin Gau	60dd322aba	[feature-wip](multi-catalog) Optimize threads and thrift interface of FileScanNode (#10942 ) FileScanNode in be will launch as many threads as the number of splits. The thrift interface of FileScanNode is excessive redundant.	2022-07-18 20:50:34 +08:00
jakevin	a849f5be71	[feature](Nereids): hashCode(), equals() and UT. (#10870 ) Add hashCode(), equals() for operator. Add basic UT for them(need more detail test). future ticket: add hashCode(), equals() and UT for `Expression`	2022-07-18 20:33:10 +08:00
Dongyang Li	4c161b7e2c	[regression-test] add tpch_sf1 test (#10846 ) Co-authored-by: stephen <hello-stephen@qq.com>	2022-07-18 20:00:02 +08:00
morrySnow	b185545243	[refactor](Nereids)remove generic type from Rule and Job (#10897 )	2022-07-18 19:35:16 +08:00
Pxl	afc1d0c05c	[Chore][Compile] fix compile fail on clang (#10837 ) fix compile fail on clang because of output int128	2022-07-18 19:21:01 +08:00
Jerry Hu	899acb6564	[improvement][agg]import sub hashmap (#10937 )	2022-07-18 18:36:45 +08:00
Zhengguo Yang	b037aca4fd	[improvement](dynamic-partition) add replication allocation check for dynamic partition when creating table(#10892 )	2022-07-18 18:02:33 +08:00
yiguolei	a2ed4b5c78	[improvement] improvement for light weight schema change (#10860 ) * improvement for dynamic schema not use schema as lru cache key any more. load segment just use the rowset's original schema not the current read schema. generate column reader and column iterator using the original schema, using the read schema if it is a new column. using column unique id as key instead of column ordinals. Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-07-18 17:53:31 +08:00
Jerry Hu	ba04c983ae	[regression-test]Add order by for qt_select1 in test_aggregate_all_functions (#10951 )	2022-07-18 17:44:23 +08:00
TengJianPing	890fd70620	[improvement] dynamically calculate max rows to read in a batch to avoid oom (#10972 )	2022-07-18 17:43:53 +08:00
chenlinzhong	6736e06679	[feature](udf) Vectorization support remote udaf #10683 (#10685 )	2022-07-18 17:15:34 +08:00
Mingyu Chen	9adbd8abbd	[feature](resource-tag) support multi tag for a single Backend (#10901 )	2022-07-18 16:50:45 +08:00
吕飞	091e17ecab	fix(fe): add `,` with json_root property in stmt `show create routine load for xx_job` (#10929 ) Fix issue: https://github.com/apache/doris/issues/10928	2022-07-18 16:44:40 +08:00
Hong Liu	216a55c12c	[regression] Add ssb sf1 test under unique table with zstd (#10957 ) * Add ssb sf1 test under unique table with zstd Co-authored-by: smallhibiscus <844981280>	2022-07-18 16:35:14 +08:00
Xinyi Zou	d9095922d9	[Enhancement] [Memory] add strict memory usage compile option STRICT_MEMORY_USE (#10936 ) In the strict memory usage mode of STRICT_MEMORY_USE=ON, when the capacity of the vectorized Hash Table is greater than 2G, it starts to grow when 75% of the capacity is satisfied, the memory usage of the vectorized Join becomes 50% of the previous value. STRICT_MEMORY_USE=ON` expects BE to use less memory, and gives priority to ensuring stability when the cluster memory is limited.	2022-07-18 16:16:43 +08:00
HappenLee	d199283df0	[Docs] add doc of tablet local debug (#10944 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-07-18 16:02:29 +08:00
jiafeng.zhang	006d7c9225	[fix]The spring boot startup banner is lost, and the maven package does not package the pictures in the resources directory (#10955 )	2022-07-18 16:00:14 +08:00
lihangyu	234e822b36	[Regression](Array) add more array test (#10770 )	2022-07-18 15:27:13 +08:00

... 57 58 59 60 61 ...

8276 Commits