* [hotfix](dev-1.0.1) fix colocate join bug in vec engine after introducing output tuple (#10651)
To support vectorized outer join, we introduced an output tuple for the hash join node,
but this broke the colocate-join check.
To solve this problem, we map the output slot IDs of the hash join node back to its children's slot IDs,
so the colocate-join condition can be checked correctly.
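As a hedged sketch of the idea (the names here are illustrative, not the actual Doris FE code): the join's equi-join slots are mapped from the output tuple back to the child slots they alias before the colocate check runs.

```python
# Hypothetical sketch: the colocate check must compare the join keys
# against the *children's* distribution slots, so output-tuple slot ids
# are first mapped back to the child slot ids they came from.

def map_output_to_child_slots(join_slots, output_to_child):
    """Replace output-tuple slot ids with the child slot ids they alias."""
    return [output_to_child.get(slot, slot) for slot in join_slots]

def is_colocate_join(left_dist_slots, right_dist_slots,
                     join_left_slots, join_right_slots, output_to_child):
    # Map output slots back to child slots before checking.
    left = map_output_to_child_slots(join_left_slots, output_to_child)
    right = map_output_to_child_slots(join_right_slots, output_to_child)
    # Colocate join requires the join keys to cover both children's
    # distribution columns in the same order.
    return left == left_dist_slots and right == right_dist_slots

# The output tuple remapped child slot 0 -> output slot 10, 1 -> 11;
# without the mapping the check below would wrongly fail.
mapping = {10: 0, 11: 1}
assert is_colocate_join([0], [1], [10], [11], mapping)
```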
* fix colocate join bug
* fix non vec colocate join issue
Co-authored-by: lichi <lichi@rateup.com.cn>
* add test cases
When executing `show databases/tables/table status where xxx`, the statement is rewritten into a SELECT against
information_schema. Scanning the schema table needs catalog info; otherwise the result may contain
database or table info from multiple catalogs.
For example:
mysql> show databases where schema_name='test';
+----------+
| Database |
+----------+
| test     |
| test     |
+----------+
MySQL [internal.test]> show tables from test where table_name='test_dc';
+----------------+
| Tables_in_test |
+----------------+
| test_dc        |
| test_dc        |
+----------------+
Fix three bugs:
1. DataTypeFactory::create_data_type is missing the conversion for the binary type, so OrcReader fails.
2. ScalarType#createType is missing the conversion for the binary type, so ExternalFileTableValuedFunction fails.
3. fmt::format cannot generate the right format string and fails.
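The first two bugs share the same pattern: a type-name mapping without an entry for `binary`. A minimal Python sketch of that pattern (the table contents and the string-like target type are assumptions for illustration, not the actual Doris mappings):

```python
# Hypothetical sketch of the bug pattern: a type-conversion table that
# lacks an entry for "binary", so any lookup for an ORC binary column
# raises instead of returning a usable type.

_TYPE_MAP = {
    "tinyint": "TYPE_TINYINT",
    "int": "TYPE_INT",
    "string": "TYPE_STRING",
    "binary": "TYPE_STRING",  # the fix: add the missing binary entry
}

def create_data_type(name):
    """Convert an external type name to an internal type tag."""
    try:
        return _TYPE_MAP[name]
    except KeyError:
        raise ValueError(f"unsupported type: {name}")

assert create_data_type("binary") == "TYPE_STRING"
```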
The result of a load should be failure when all tablet delta writers fail to close on a single node,
but the result returned to the client is success.
The reason is that both the committed tablets and the error tablets are empty, so the publish succeeds.
We should add a tablet to the error tablets when its delta writer fails to close, so that the transaction fails.
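A minimal sketch of the fix (hypothetical names, not the actual BE code): a failed close must land the tablet in the error list, otherwise both lists stay empty and publish wrongly treats the load as successful.

```python
# Hypothetical sketch: record tablets whose delta writer failed to
# close in error_tablet_ids, so the transaction is failed instead of
# being published with both lists empty.

def close_writers(writers):
    """writers: list of (tablet_id, close_ok). Returns (committed, errors)."""
    committed, errors = [], []
    for tablet_id, close_ok in writers:
        if close_ok:
            committed.append(tablet_id)
        else:
            errors.append(tablet_id)  # the fix: record failed closes
    return committed, errors

def publish_succeeds(errors):
    # Publish treats an empty error list as success, which is why a
    # load whose failures were never recorded appeared to succeed.
    return not errors

# All writers fail to close: the load must not be reported as success.
committed, errors = close_writers([(1, False), (2, False)])
assert errors == [1, 2] and not publish_succeeds(errors)
```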
Add a metric to count the number of published transactions per database. Users can estimate the relative transaction-processing speed per database by combining this metric with doris_fe_txn_num.
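As a hedged illustration (the function name and sampling scheme are assumptions, not a Doris API), the relative speed can be derived as the ratio of the two counters' deltas over the same interval:

```python
# Hypothetical illustration: fraction of transactions that reached
# publish in an interval, from deltas of the two counters.

def publish_ratio(published_delta, total_delta):
    """Relative publish speed: published txns / total txns in an interval."""
    if total_delta == 0:
        return 0.0
    return published_delta / total_delta

assert publish_ratio(80, 100) == 0.8
```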
Currently, a newly created segment can be chosen as a compaction
candidate, which is prone to bugs and segment-file open failures. We
should skip the last (possibly still active) segment when doing segcompaction.
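A one-line sketch of the candidate selection described above (illustrative names, not the actual BE code): simply exclude the newest segment, which may still be receiving writes.

```python
# Hypothetical sketch: never pick the last (possibly still-active)
# segment as a segcompaction candidate.

def pick_segcompaction_candidates(segment_ids):
    # segment_ids are ordered oldest to newest; drop the newest one.
    return segment_ids[:-1]

assert pick_segcompaction_candidates([0, 1, 2, 3]) == [0, 1, 2]
assert pick_segcompaction_candidates([0]) == []
```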
# Proposed changes
## refactor
- add AggregateExpression to hide the difference between an AggregateFunction before and after disassembly
- request `GATHER` physical properties for the query; because a query always gathers its result at the coordinator, requesting `GATHER` may let the optimizer select a better plan
- refactor `NormalizeAggregate`
- remove some physical fields from `LogicalAggregate`, such as `AggPhase` and `isDisassemble`
- remove `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate the various PhysicalHashAggregate alternatives (e.g. `two phases aggregate`, `three phases aggregate`), so that cascades can automatically select the lowest-cost alternative
- move `PushAggregateToOlapScan` to `AggregateStrategies`
- separate the traverse and visit methods in FoldConstantRuleOnFE
- if an expression does not implement the visit method, the traverse method handles it and rewrites its children by default
- if an expression implements visit, the user-defined traverse (which invokes the accept/visit method) returns quickly, because the default visit method does not forward to the children, while the pre-processing in the traverse method is still performed
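A hedged sketch of the traverse/visit split described above (hypothetical class and method names, not the actual FoldConstantRuleOnFE code): the default traverse rewrites children bottom-up and then calls visit, while visit itself never recurses.

```python
# Hypothetical sketch: traverse does pre-processing and rewrites
# children by default; visit handles only the expressions a subclass
# cares about and does not forward to the children again.

class Expr:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

class Rewriter:
    def traverse(self, expr):
        # Pre-processing would happen here; children are rewritten by
        # default before the expression-specific visit runs.
        expr.children = [self.traverse(c) for c in expr.children]
        return self.visit(expr)

    def visit(self, expr):
        # Default visit returns quickly without recursing; subclasses
        # override it for the expressions they can fold.
        return expr

class Folder(Rewriter):
    def visit(self, expr):
        if expr.name == "1+1":
            return Expr("2")  # fold a constant expression
        return expr

root = Folder().traverse(Expr("f", [Expr("1+1")]))
assert root.children[0].name == "2"
```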
## new feature
- support `disable_nereids_rules` to skip some rules.
example:
1. create 1 bucket table `n`
```sql
CREATE TABLE `n` (
`id` bigint(20) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);
```
2. insert some rows into `n`
```sql
insert into n select * from numbers('number'='20000000');
```
3. query table `n`
```sql
SET enable_nereids_planner=true;
SET enable_vectorized_engine=true;
SET enable_fallback_to_original_planner=false;
explain plan select id from n group by id;
```
The result shows that the one-stage aggregate is used:
```
| PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional.empty, requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```
4. disable the one-stage aggregate
```sql
explain plan select
/*+SET_VAR(disable_nereids_rules=DISASSEMBLE_ONE_PHASE_AGGREGATE_WITHOUT_DISTINCT)*/
id
from n
group by id
```
The result is a two-stage aggregate:
```
| PhysicalHashAggregate ( aggPhase=GLOBAL, aggMode=BUFFER_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_BUFFER, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[ANY], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```
Apache Doris made the branch branch-1.2-lts protected. As a result, all pull requests targeting this branch must pass checks before being merged.
However, the BE UT workflows don't support branch checks and fail on pull requests for branch-1.2-lts, because they download the wrong pre-built third-party libraries when checking those pull requests.
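A small sketch of the fix's logic (the bundle names and lookup function are assumptions, not the actual workflow configuration): pick the pre-built third-party bundle that matches the target branch instead of always using the default one.

```python
# Hypothetical sketch: select the pre-built third-party bundle by the
# pull request's target branch, falling back to the default bundle.

def thirdparty_bundle(target_branch, available):
    """Return the bundle key matching the branch, or the master default."""
    return target_branch if target_branch in available else "master"

bundles = {"master", "branch-1.2-lts"}
assert thirdparty_bundle("branch-1.2-lts", bundles) == "branch-1.2-lts"
assert thirdparty_bundle("branch-9.9", bundles) == "master"
```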
In the following case, data inconsistency can happen between multiple replicas:
1. The delta writer writes only a few rows of data, i.e. the write() method is called only once.
2. The writer fails in init() (which is called the first time write() is called), and the current tablet is recorded in _broken_tablets.
3. When the delta writer is closed, close() finds it was never initialized and treats this as an empty load, so it tries to init again, which creates an empty rowset.
4. The tablet sink receives the error report in the RPC response and marks the replica as failed, but since a quorum of replicas succeeded, the following load commit operation succeeds.
5. FE sends a publish version task to each BE, and the BE with the empty rowset publishes the version successfully.
6. We end up with two replicas containing data and one empty replica.
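The steps above can be sketched as follows (hypothetical class and flags, not the actual DeltaWriter code): close() must distinguish a writer that already failed in init() from a genuinely empty load, and must not re-init and produce an empty rowset for the former.

```python
# Hypothetical sketch of the fix: close() on a writer whose init()
# already failed must not create an empty rowset, or that replica will
# later publish the version successfully despite holding no data.

class DeltaWriter:
    def __init__(self):
        self.inited = False
        self.failed = False

    def write(self, rows):
        if not self.inited:
            # init() runs on the first write; simulate it failing and
            # the tablet being recorded as broken.
            self.failed = True
            raise RuntimeError("init failed")

    def close(self):
        if self.failed:
            return None  # the fix: do NOT re-init into an empty rowset
        if not self.inited:
            self.inited = True  # genuine empty load: build an empty rowset
        return "rowset"

w = DeltaWriter()
try:
    w.write([1])
except RuntimeError:
    pass
assert w.close() is None  # broken writer yields no rowset to publish
```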