doris

Author	SHA1	Message	Date
yixiutt	2a64571bef	[enhancement](generic_iterator) fix num check and add some notes (#12434 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-08 12:09:02 +08:00
morrySnow	a6880ca573	[fix](Nereids) throw IndexOutOfBoundsException in DistributionSpecHash#equalsSatisfy (#12446 ) In the earlier PR #11976, we changed DistributionSpecHash#equalsSatisfy but forgot to check whether both sides have the same length. When the required spec has more shuffle slots than the current one, an IndexOutOfBoundsException is thrown.	2022-09-08 11:41:48 +08:00
Ashin Gau	dd2f834c79	[feature-wip](parquet-reader) bug fix, create compress codec before parsing dictionary (#12422 ) Fix five bugs: 1. Parquet dictionary data may be compressed, but `ColumnChunkReader` tries to parse dictionary data before creating the compression codec, causing unexpected data errors. 2. `FE` doesn't resolve array type. 3. `ParquetFileHdfsScanner` doesn't fill partition values when the table is partitioned. 4. `ParquetFileHdfsScanner` sets `_scanner_eof = true` when a scan range is empty, which ends the scanner early and results in data loss. 5. Typographical error in `PageReader`.	2022-09-08 09:54:25 +08:00
Luwei	d40a9d0555	[fix](memtracker) Fix memtracker did not subtract the memory released by load channel cancel (#12405 ) When the load channel is canceled, the memtracker does not subtract the memory released by the load channel. This will cause the memory usage counted by the memtracker of the load channel mgr to be larger than the actual memory usage.	2022-09-08 09:22:11 +08:00
Gabriel	41bc6b857d	[refactor](shuffle) remove unused code (#12442 )	2022-09-08 09:15:25 +08:00
yixiutt	018b4b7e1e	[bugfix](report) fix continuous version miss check (#12415 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-08 08:39:22 +08:00
yixiutt	e7aa131506	[enhancement](tcmalloc) add aggressive_memory_decommit conf and make it disable (#12436 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-08 08:37:16 +08:00
Gabriel	a536030979	[FOLLOWUP](load) fix nullable and add regression (#12375 )	2022-09-08 00:05:04 +08:00
Gabriel	86e347f3bb	[Bug](doe) fix closing scanner twice (#12408 )	2022-09-07 22:45:30 +08:00
zhengyu	569ab30556	[bug](NodeChannel) fix OOM caused by pending queue in sink send (#12359 ) (#12362 ) Each NodeChannel has its own queue, sized up to 1/20 of exec_mem_limit, so setting exec_mem_limit high can lead to OOM. This commit uses a fixed number to cap the total memory used by NodeChannels. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2022-09-07 20:49:08 +08:00
Kikyou1997	bdbce77227	[fix](nereids) cast left child of TimestampArithmetic to wrong type in BindFunction (#12423 )	2022-09-07 20:32:47 +08:00
wudi	c2808de867	[Doc](balance)add replica balance speed param (#12406 ) * update balance param	2022-09-07 19:41:45 +08:00
camby	184be8d13c	[fix](array-type) ARRAY is not supported in bloomfilter index (#12353 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-07 18:00:01 +08:00
chenlinzhong	941bda5a20	[enhancement](spark-load)support dynamic set env (#12276 ) * [enhancement](spark-load)support dynamic set env and display spark appid	2022-09-07 16:24:29 +08:00
jakevin	40f481049a	[fix](Nereids)lowest cost plan map do not be merged when do group merge (#12396 )	2022-09-07 16:13:11 +08:00
yongjinhou	09b45f2b71	[Function](ELT)Add elt function (#12321 )	2022-09-07 15:21:08 +08:00
Shuo Wang	f2923f9180	[Refactor](Nereids) Simplify get input and output slots for plan/expression. (#12356 ) Simplify the code for getting input/output slots from an `Expression` or a `Plan`. New interfaces: `Expression`: - `getInputSlots`: get all the input slots of the expression. `Plan`: - `getOutputSet`: get the output slot set of the plan. - `getInputSlots`: get the input slot set of the plan. Changed interface: `TreeNode`: - `collect`: returns a `set` instead of a `list`.	2022-09-07 14:05:37 +08:00
Kikyou1997	0bb06a1fa7	[feature](Nereids) let nullable of Year, WeekOfYear and Divide be the same as implementation in BE (#12374 ) These functions/expressions should always be nullable, so the overriding method just returns true. - Year - WeekOfYear - Divide	2022-09-07 13:09:08 +08:00
morrySnow	46776af2a3	[fix](Nereids)plan translator lost other conjuncts on hash join node (#12391 ) In the earlier PR #11812, we split the join condition into two parts: hash join conjuncts and other conditions. But we forgot to translate the other conditions into other conjuncts in the HashJoinNode of the legacy planner, so a query with other conditions on a join node returns wrong results. For example: SELECT * FROM lineorder INNER JOIN part ON lo_partkey = p_partkey WHERE lo_orderkey > p_size;	2022-09-07 11:32:05 +08:00
Gabriel	449d0c219f	[Improvement](sort) Accumulate blocks to do partial sort (#12336 )	2022-09-07 10:34:28 +08:00
zhangstar333	42bdde8750	[Feature](Vectorized) support jdbc scan node (#12010 )	2022-09-07 10:29:41 +08:00
HappenLee	54d1630c42	[Opt](vectorized) speed up hash function compute in hash partition (#12334 ) After optimizing the hash function, the siphash computation for HASH_PARTITION in vdata_stream_sender takes: Before: 1s800ms After: 800ms	2022-09-07 10:11:40 +08:00
zxealous	e4b894a318	[Bug](remote) Fix BE crash because of call the future's get method twice (#12357 ) Call the future's get method once and save the result.	2022-09-07 10:11:27 +08:00
zhengyu	445f0882d1	[Enhancement](log) improve error msg for delta writer fail (#12121 ) (#12360 ) Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2022-09-07 10:10:51 +08:00
Yongqiang YANG	de9b9b3e8e	[chore](ut) enable asan core dump when running be ut (#12371 )	2022-09-07 10:09:18 +08:00
Jerry Hu	3485dfa927	[chore](profile) add some counters in aggregatation & sender (#12385 )	2022-09-07 10:09:05 +08:00
Henry2SS	d410797200	[fix](regression p0) fix regression p0 test qt_window_hang2 always failed because of timeout #12388 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-07 10:08:12 +08:00
spaces-x	232d17efea	[Enhancement](sparkload) cast the src slot types of bitmap columns to `bitmap` when FE push tasks in spark load (#12394 ) In the current spark load implementation, the types of the source data that BE reads from the Broker are all set to varchar. However, varchar and bitmap have not been compatible since version 1.1.0, which causes spark load to fail. An example of the spark load error message: detailMessage = type not match, originType=VARCHAR(*), targeType=BITMAP This change sets the src type of the bitmap columns from varchar to bitmap when FE pushes tasks.	2022-09-07 10:07:38 +08:00
Tiewei Fang	9ccc39c164	[Enhancement](regression) add regression tests for executeSQL http rest api #12265	2022-09-07 10:02:37 +08:00
Adonis Ling	a465549f5e	[feature](Nereids)support parse and analyze having clause (#12129 ) Implement the having clause for Nereids Planner. NOTE: This PR only aims at making Nereids Planner generate the correct logical plan and physical plan. Runtime correctness is not a goal of this PR because GROUP BY is not ready in Nereids Planner.	2022-09-07 09:47:03 +08:00
Gabriel	922b04fdc1	[Improvement](vectorized) change `static_cast` to `assert_cast` for reference (#12379 )	2022-09-07 09:27:13 +08:00
Yongqiang YANG	772e5907f2	[enhancement](test) add some p0 cases (#12240 )	2022-09-07 09:10:42 +08:00
Stalary	5f255af065	[Enhancement](docker): Add elasticsearch docker file (#12377 )	2022-09-07 08:47:10 +08:00
Mingyu Chen	893567628e	[fix](exec-node) fix nullptr of runtime state (#12395 ) Remove default nullptr runtime state, which is very error-prone	2022-09-07 08:46:42 +08:00
morrySnow	55fb90d6ae	[feature](Nereids)add colocate, shuffle and bucket shuffle join algorithm to Nereids (#11976 ) This PR: 1. adds support to Nereids for the following join algorithms already supported by the legacy planner: - colocate join - bucket shuffle join - shuffle join - broadcast join 2. updates all cost enforce derive utils: - ChildOutputPropertyDeriver - EnforceMissingPropertiesHelper - RequestPropertyDeriver 3. adds a local quick sort plan used in enforce 4. sets PhysicalProperties on PhysicalPlan when choosing the best plan from memo 5. renames Job#pushTask to Job#pushJob	2022-09-07 00:31:21 +08:00
minghong	4c36e3dfa6	[fix](Nereids)LogicalAggregate's equals and hashCode missing two attributes (#12393 ) After applying NormalizeAggregate rule, owner groups of all aggregate children are removed. The root cause is the new aggregate node is regarded as the old aggregate node, because LogicalAggregate.equals() does not take some attributes ("normalized", "disassembled") into account.	2022-09-07 00:07:26 +08:00
morrySnow	3a0aae1b82	[enhancement](explain)add projections and output id in explain string (#12358 ) In the earlier PR #11842, we added the ability to project on each ExecNode. But the projection expr list is not shown in explain, which makes debugging inconvenient. This PR adds the projections to the explain string if they exist.	2022-09-06 21:03:02 +08:00
camby	b8cc576cba	[fix](array-type) add data valid check for ARRAY type while insert or load (#12283 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-06 20:48:58 +08:00
jiafeng.zhang	c975d71fd4	[typo](docs)Sql blacklist documentation fix (#12376 )	2022-09-06 19:34:05 +08:00
Gabriel	b398fd60fc	[DOCS](function) Add docs for new time functions (#12382 )	2022-09-06 19:33:41 +08:00
chenlinzhong	7c2da89518	[docs](spark-load) set hadoop env (#12342 )	2022-09-06 16:41:38 +08:00
zhengshiJ	4e95b3afaf	[test](nereids) add subquery regression Testing (#12372 ) Added regression tests for subqueries. Currently only correlated subqueries are covered. Uncorrelated subqueries will be added after project revision.	2022-09-06 16:37:17 +08:00
morrySnow	f1507f93ee	[enhancement](chore)add single empty line rule to fe check style for Nereids (#12365 )	2022-09-06 14:19:59 +08:00
slothever	4a55b504c0	[feature-wip](parquet-reader) bug fix, get the correct group reader (#12294 ) Fix the failure to read the TPCH lineitem table and a memory allocation error. Co-authored-by: jinzhe <jinzhe@selectdb.com>	2022-09-06 13:59:35 +08:00
zhengshiJ	d7dedfadad	[fix](nereids) fix dead loop in unnesting subquery rule (#12345 )	2022-09-06 11:50:30 +08:00
camby	cf5d194fe1	[enhancement](array-type) Split Array Offsets and String Offsets (#12341 ) In old Doris versions string offsets are 32-bit, which is not enough for the Array type. If we changed string offsets from 32-bit to 64-bit, upgrading BEs one by one would be a problem, because strings with 32-bit offsets and strings with 64-bit offsets would exist at the same time. As a result, we separate the code for Array offsets. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-06 11:18:27 +08:00
xueweizhang	53b79d5a8c	[Enhancement](restore) new add the property of reserve_replica to restore statement (#11942 ) Add a new property called 'reserve_replica', which lets the restored table keep the same replication num for its partitions as before the backup. Co-authored-by: Stalary <stalary@163.com> Co-authored-by: camby <104178625@qq.com>	2022-09-06 10:32:21 +08:00
Dongyang Li	2019cf9406	[regression](test) add tpcds sf1 unique test (#12268 )	2022-09-06 10:12:00 +08:00
starocean999	86fa0e38e2	[fix](join) hash join should use children's output tuple ids not output tableref ids (#12261 )	2022-09-06 09:53:45 +08:00
Stalary	f2aa87d797	Add ctas support config key type ut and doc. (#12327 )	2022-09-06 09:16:02 +08:00

6198 Commits