doris

Author	SHA1	Message	Date
Xinyi Zou	a73b28789d	Fix memory leak by calling in mem hook (#12708 ) After the consume mem tracker exceeds the mem limit in the mem hook, the boost stacktrace will be printed. A query/load will only be printed once, and the process tracker will only be printed once per second. After the process memory reaches the upper limit, the boost stacktrace will be printed every second. The observed phenomena are as follows: After query/load is canceled, the memory increases instantly; tcmalloc profile total physical memory is less than perf process memory; The process mem tracker is smaller than the perf process memory;	2022-09-18 10:04:15 +08:00
morrySnow	2e41976b07	update tpch regression test (#12687 ) Turn on all TPC-H sf1 test cases except Q2. Q2 causes a dead loop in join reorder and will be turned on once that is fixed.	2022-09-17 17:06:39 +08:00
Xin Liao	bac58a4774	[feature-wip](unique-key-merge-on-write) fix calculate delete bitmap when flush memtable (#12668 )	2022-09-17 17:04:03 +08:00
HappenLee	35b97a5af0	[Opt](hash) Speed up insert from dict data map and not datetime (#12670 ) Speed up dict data read and not datetime. same target #12636	2022-09-17 17:02:43 +08:00
luozenglin	3030a3606a	[fix](load) fix stream load fail when setting strict mode (#12684 )	2022-09-17 17:02:11 +08:00
Xinyi Zou	3bb042e45c	[fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688 ) The tcmalloc/jemalloc allocator cache no longer counts toward the process physical memory in the mem check, because new/malloc triggers the mem hook when served from the allocator cache but may not actually allocate physical memory, so failing in the mem hook is not expected in that case. In addition: the size of the tcmalloc/jemalloc allocator cache is tracked by a mem tracker whose parent is the process mem tracker, and it is updated every 1s. The process default mem_limit is changed to 90%, with the expectation that the mem tracker effectively limits the memory usage of the process.	2022-09-17 11:31:01 +08:00
Lightman	e01986b8b9	[feature](light-schema-change) fix light-schema-change and add more cases (#12160 ) Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load. Tablet schema cache support recycle when schema sptr use count equals 1. Add a http interface for flink-connector to sync ddl. Improve tablet->tablet_schema() by max_version_schema.	2022-09-17 11:29:36 +08:00
Xinyi Zou	942b31038f	[fix](memory) Fix BE OOM when load -238 fail (#12666 ) When the flush is triggered when the load channel exceeds the mem limit, if the flush fails, an error message is returned and the load is terminated. Usually flush failure is -238 error code. Because the memtable is frequently flushed after the load channel exceeds the mem limit, the number of segments exceeds the max value.	2022-09-17 00:17:53 +08:00
Xinyi Zou	42b6532131	remove gc and fix print (#12682 )	2022-09-17 00:16:15 +08:00
924060929	0a95ebf602	[feature](Nereids) Add scalar function code generator and some function trait (#12671 ) This PR does the following: 1. Change the nullable mode of 'from_unixtime' and 'parse_url' from DEPEND_ON_ARGUMENT to ALWAYS_NULLABLE; this nullable configuration was previously missing. 2. Add some new interfaces for the original NullableMode. This change is inspired by Scala's mix-in traits: it lets us quickly understand the traits of a function without reading lengthy procedural code, and it saves writing template code, like `class Substring extends ScalarFunction implements ImplicitCastInputTypes, PropagateNullable`. The interfaces are: - PropagateNullable: equals NullableMode.DEPEND_ON_ARGUMENT - AlwaysNullable: equals NullableMode.ALWAYS_NULLABLE - AlwaysNotNullable: equals NullableMode.ALWAYS_NOT_NULLABLE - others ComputeNullable: equals NullableMode.CUSTOM 3. Add `GenerateScalarFunction` to generate nereids-style function code from legacy functions. It does not actually generate any new function class yet, because the function traits are not ready for use; traits for the legacy functions' CompareMode and NonDeterministic still need to be added, following the same idea as ComputeNullable.	2022-09-16 21:27:30 +08:00
Zhengguo Yang	b733a23cf7	[Bugfix](stack_over_flow) fix be may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large (#12658 )	2022-09-16 20:57:22 +08:00
wudi	a3fee5afbb	[doc](variables) fix forward_to_master doc bug #12659 Co-authored-by: wudi <>	2022-09-16 20:56:55 +08:00
yongjinhou	6fc74def02	[fix](Broker load): fix bug for broker label has already been used (#12630 )	2022-09-16 20:46:01 +08:00
morrySnow	378acfa28f	[enhancement](Nereids) eliminate all unessential cross join in TPC-H benchmark (#12651 ) To eliminate all unessential cross joins in the TPC-H benchmark, this PR: 1. pushes down all predicates that can be pushed through joins before applying the ReorderJoin rule. All cross joins that can be eliminated are then eliminated in the ReorderJoin rule, since this rule needs to match a LogicalFilter as its root pattern. (Q2, Q15, Q16, Q17, Q18) 2. enables the expression optimization rule that extracts common expressions. (Q19) 3. fixes a cast translation failure. (Q19)	2022-09-16 19:09:58 +08:00
Yongqiang YANG	a4a5dae7dc	[enhancement](test) add tpcds_sf100 to p2 cases (#12296 )	2022-09-16 17:38:23 +08:00
minghong	21319e6db4	[fix](nereids) generate invalid slot when translate predicates in filter on hash join (#12475 ) test sql: TPC-H q21 ``` select count(*) from lineitem l3 right anti join lineitem l1 on l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey; ``` If there are other join conjuncts, all slots from the left and right children must be put into `slotReferenceMap` instead of `hashjoin.getOutput()`. After splitting the intermediate tuple and the output tuple, several issues appeared in the regression tests, so this PR makes the following changes: 1. since translating a project replaces the underlying hash-join node's output tuple, add PhysicalHashJoin.shouldTranslateOutput 2. because PhysicalPlanTranslator merges filter and hashJoin, add PhysicalHashJoin.filterConjuncts and translate filter conjuncts in physicalHashJoin 3. set HashJoinNode.hashOutputSlotIds properly when using the nereids planner 4. to be compatible with BE, nullable() returns true in the substring function	2022-09-16 16:51:04 +08:00
HappenLee	9d6c199553	[Bug](vec) Fix avg overflow in clickbench (#12621 )	2022-09-16 14:43:40 +08:00
Shuo Wang	131f2a42d2	[Improvement](Nereids) Restrict the condition to apply MergeConsecutiveLimits rule (#12624 ) This PR added a condition check for MergeConsecutiveLimits rule: the input upper limit should not have valid offset info.	2022-09-16 13:05:39 +08:00
jakevin	0f6dbb5769	[fix](Nereids): split INNER and OUTER into different rules. (#12646 )	2022-09-16 10:34:42 +08:00
TengJianPing	8364165e30	[regression_test](testcase) add regression test case from session variable skip_storage_engine_merge, skip_delete_predicate and show_hidden_columns (#12617 ) also add this function to new olap scan node.	2022-09-16 10:33:12 +08:00
BenjaminWenqiYu	97ff14482f	[enhancement](doc) When we use flink doris connector with bounded source, we should using the BATCH mode. (#12576 )	2022-09-16 10:31:17 +08:00
Henry2SS	d4f8e0c754	[Bug](spark load) fix spark load clearSparkLauncherLog NPE #12619 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-16 10:30:57 +08:00
wxy	20de8ac29d	[fix](auditloader plugin): fix bug for AuditLoaderPlugin that stmt appears truncated when stmt contains '\n'. (#12627 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2022-09-16 10:28:10 +08:00
lsy3993	380e3695f8	[test](window-function) add cte test in regression of window function #12635	2022-09-16 10:27:50 +08:00
BenjaminWenqiYu	f1811e41bc	[fix](config)Update user_define_tables.sh #12542	2022-09-16 10:27:28 +08:00
Pxl	d44ec74988	[Enhancement](column) optimize for ColumnString::insert_many_dict_data (#12636 )	2022-09-16 10:23:04 +08:00
Gabriel	c05d736331	[Improvement](sort) fallback to partial sort small block if topN is small (#12604 )	2022-09-16 10:20:17 +08:00
yinzhijian	2a063355ad	[fix](vstream load) Fix the default value insertion problem when importing json (#12601 ) * [fix](vstream load) Fix the default value insertion problem when importing json * update	2022-09-16 09:54:45 +08:00
yinzhijian	a97f63141e	[fix](cast) Add validity check for date conversion for non-vectorization (#12608 ) actual result select cast("0.0000031417" as date); +------------------------------+ \| CAST('0.0000031417' AS DATE) \| +------------------------------+ \| 2000-00-00 \| +------------------------------+ expect result select cast("0.0000031417" as date); +------------------------------+ \| CAST('0.0000031417' AS DATE) \| +------------------------------+ \| NULL \| +------------------------------+	2022-09-16 09:08:53 +08:00
yixiutt	d906e97f1b	[bugfix](compression) fix lock bug in concurrent acquire context (#12638 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-16 09:05:29 +08:00
minghong	98dad6158b	[fix](Nereids) type coercion on case-when is not correct (#12650 ) When we do type coercion on a CaseWhen expression, as in SQL like this: ``` CASE WHEN n_nationkey > 1 THEN n_regionkey ELSE 0 END ``` the ELSE part 0 needs type coercion to CAST (0 AS INT), which was missed in PR #11802.	2022-09-16 02:26:11 +08:00
mch_ucchi	a63cdc8a7c	[feature](Nereids) support basic runtime filter (#12182 ) This PR adds runtime filters to the Nereids planner. Currently they can only be pushed through join nodes and scan nodes. TODO: 1. inner join, cross join, and right outer join are supported now; other join types will be supported in the future. 2. translate left outer join to inner join if there are inner join ancestors. 3. some complex situations cannot be handled yet; see the test case testPushDownThroughJoin for details. 4. support the case where the src key is an aggregate group key.	2022-09-16 02:21:01 +08:00
Adonis Ling	0daa25d9a9	[fix](nereids) UT failed when test cases in package (#12622 ) NamedExpressionUtil::clear should reset the nextId rather than create a new IdGenerator<ExprId>, because the old one may be referenced by other objects, which can cause some cases to start in a dirty environment when we run the test cases in a package.	2022-09-15 22:25:40 +08:00
yixiutt	3072e17b39	[Bugfix](primary-key) fix calc delete bitmap bug in concurrent memtable flush (#12605 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-15 21:50:24 +08:00
jakevin	db8bc80c36	[feature](Nereids): semi join transpose (#12590 ) * [feature](Nereids): semi join transpose and enable ZIG_ZAG join reorder.	2022-09-15 21:32:50 +08:00
Zhengguo Yang	c6c84a2784	[chore](build) add build param to version string (#12591 )	2022-09-15 17:09:22 +08:00
morrySnow	858e8234d7	[feature](Nereids) add predicates push down on all join type (#12571 )	2022-09-15 15:18:42 +08:00
yinzhijian	5b6d48ed5b	[feature](nereids) support distinct count (#12159 ) support distinct count with group by clause. for example: SELECT count(distinct c_custkey + 1) FROM customer group by c_nation; TODO: support distinct count without group by clause.	2022-09-15 13:01:47 +08:00
Shuo Wang	b11791b9a8	[Feature](Nereids) Limit pushdown. (#12518 ) This PR adds rewrite rules to push the limit down. Following two cases would be handled: ``` limit -> join limit -> project -> join ```	2022-09-15 12:12:10 +08:00
Shuo Wang	d2d5c19d51	[Improvement](Nereids) Avoid unsafe cast. (#12603 ) This PR changed some interfaces to avoid unsafe cast. - Modify `Plan.getExpressions()`'s return type from `List<Expression>` to `List<? extends Expression>` Return projects (type is a list of named expression) in `getExpressions` can avoid unsafe cast. See `LogicalProject.getExpression()` as an example. - Modify `EmptyRelation.getProjects()`'s return type from `List<NamedExpression>` to `List<? extends NamedExpression>` Creating empty relation with a list of slots can avoid unsafe cast. See the `EliminateLimit` rule for example.	2022-09-15 12:02:35 +08:00
mch_ucchi	5e0dc11f87	[feature](Nereids)add RelationId as a unique identifier of relations (#12461 ) In Nereids, we could not distinguish two relations from the same table in one PlanTree. This led to some tricky code to handle them during planning, such as a separate branch for equals in GroupExpression. This PR adds RelationId to LogicalRelation and PhysicalRelation. All relations' equals functions now compare RelationId, which helps us distinguish two relations from the same table. TODO: add relation id to UnboundRelation, UnboundOneRowRelation, LogicalOneRowRelation, PhysicalOneRowRelation.	2022-09-15 11:56:56 +08:00
Gabriel	fc4298e85e	[feature](outfile) support parquet writer (#12492 )	2022-09-15 11:09:12 +08:00
zhangstar333	22a8d35999	[Feature](vectorized) support jdbc sink for insert into data to table (#12534 )	2022-09-15 11:08:41 +08:00
carlvinhust2012	33f5a86e69	[fix](array-type) forbid to create materialized view for array column (#12543 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-09-15 11:08:23 +08:00
HappenLee	e413a2b8e9	[Opt](vectorized) Use new way to do hash shffle to speed up query (#12586 )	2022-09-15 11:08:04 +08:00
Mingyu Chen	353bb6fdfb	[doc] update docs (#12615 )	2022-09-15 11:07:34 +08:00
zhannngchen	1080095f46	[typo](doc) fix some typos (#12611 )	2022-09-15 11:07:19 +08:00
starocean999	8e4374b7ec	[enhancement](agg)remove unnessasery mem alloc and dealloc in agg node (#12535 )	2022-09-15 11:07:06 +08:00
Henry2SS	2ac790bf31	[enhancement](statistic) the calculation of routine load statistics are not accurate (#12594 ) Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-15 11:00:57 +08:00
yixiutt	b136d80e1a	[enhancement](compress) reuse compression ctx and buffer (#12573 ) Reuse compression ctx and buffer. Use a global instance for every compression algorithm, and use a thread-safe buffer pool to reuse compression buffers. The pool size equals the max number of parallel compression threads, so it will not be too large. Tests show a 5% improvement in data import and compaction. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-15 10:59:46 +08:00

6341 Commits