doris

Author	SHA1	Message	Date
airborne12	9d2f879bd2	[Enhancement](inverted index) make InvertedIndexReader shared_from_this (#21381 ) This PR proposes several changes to improve code safety and readability by replacing raw pointers with smart pointers in several places. Use enable_factory_creator in InvertedIndexIterator and InvertedIndexReader, and remove the explicit new constructor. Make InvertedIndexReader shared_from_this, because it may be destructed while an InvertedIndexIterator is still using it.	2023-07-06 11:52:59 +08:00
Yongqiang YANG	fb14950887	[refactor](load) split flush_segment_writer into two parts (#21372 )	2023-07-06 11:13:34 +08:00
AlexYue	80be2bb220	[bugfix](RowsetIterator) use valid stats when creating segment iterator (#21512 )	2023-07-06 10:35:16 +08:00
Siyang Tang	b1be59c799	[enhancement](query) enable strong consistency by syncing max journal id from master (#21205 ) Add a session var & config enable_strong_consistency_read to solve the problem that a loading result may be briefly invisible to followers, to meet users' requirements in strong consistency read scenarios. Followers will sync the max journal id from master and wait for replaying.	2023-07-06 10:25:38 +08:00
HHoflittlefish777	6a0a21d8b0	[regression-test](load) add streamload default value test (#21536 )	2023-07-06 10:14:13 +08:00
Kaijie Chen	688a1bc059	[refactor](load) expand OlapTableValidator to VOlapTableBlockConvertor (#21476 )	2023-07-06 10:11:53 +08:00
YueW	a2e679f767	[fix](status) Return the correct error code when clucene error occured (#21511 )	2023-07-06 09:08:11 +08:00
Mingyu Chen	c1e82ce817	[fix](backup) fix show snapshot cauing mysql connection lost (#21520 ) If there is no `info file` in the repository, the MySQL connection may be lost when the user executes `show snapshot on repo`: ``` 2023-07-05 09:22:48,689 WARN (mysql-nio-pool-0\|199) [ReadListener.lambda$handleEvent$0():60] Exception happened in one session(org.apache.doris.qe.ConnectContext@730797c1). java.io.IOException: Error happened when receiving packet. at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:691) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_322] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_322] at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_322] ``` This is because some fields are missing in the returned result set.	2023-07-05 22:44:57 +08:00
Xiangyu Wang	b6a5afa87d	[Feature](multi-catalog) support query hive-view for nereids planner. (#21419 ) Relevant pr #18815, support query hive views for nereids planner.	2023-07-05 21:58:03 +08:00
jakevin	b3db904847	[fix](Nereids): when child is Aggregate, don't infer Distinct for it (#21519 )	2023-07-05 19:39:41 +08:00
airborne12	5d2739b5c5	[Fix](submodule) revert clucene version wrong rollback (#21523 )	2023-07-05 19:10:15 +08:00
Xiangyu Wang	f868aa9d4a	[Enhancement](multi-catalog) Add some checks for ShowPartitionsStmt. (#21446 ) 1. Add some validations for ShowPartitionsStmt with hive tables 2. Make the behavior consistent with hive	2023-07-05 16:28:05 +08:00
Xiangyu Wang	0da1bc7acd	[Fix](multi-catalog) Fallback to refresh catalog when hms events are missing (#21333 ) Fix #20227, the implementation has some problems and can not catch event-missing-exception.	2023-07-05 16:27:01 +08:00
Mingyu Chen	242a35fa80	[fix](s3) fix s3 fs benchmark tool (#21401 ) 1. Fix a concurrency bug in the s3 fs benchmark tool, to avoid crashes with multiple threads. 2. Add a `prefetch_read` operation to test the prefetch reader. 3. Add the `AWS_EC2_METADATA_DISABLED` env in `start_be.sh` to avoid calling ec2 metadata when creating the s3 client. 4. Add the `AWS_MAX_ATTEMPTS` env in `start_be.sh` to avoid warning logs from the s3 sdk.	2023-07-05 16:20:58 +08:00
HappenLee	39590f95b0	[pipeline](load) return error status in pipeline load (#21303 )	2023-07-05 16:13:32 +08:00
Jibing-Li	37a52789bd	[improvement](statistics, multi catalog)Estimate hive table row count based on file size. (#21207 ) Support estimate table row count based on file size. With sample size=3000 (total partition number is 87491), load cache time is 45s. With sample size=100000 (more than total partition number 87505), load cache time is 388s.	2023-07-05 16:07:12 +08:00
jakevin	1121e7d0c3	[feature](Nereids): pushdown distinct through join. (#21437 )	2023-07-05 15:55:21 +08:00
morrySnow	4d414c649a	[fix](Nereids) set operation physical properties derive is wrong (#21496 )	2023-07-05 15:44:40 +08:00
abmdocrt	d8a549fe61	[Fix](Comment) Comment should be in English (#20964 )	2023-07-05 15:41:34 +08:00
abmdocrt	48bfb8e9cf	[Enhancement](regression-test)Add regression test for MoW backup and restore (#21223 )	2023-07-05 15:16:04 +08:00
Xinyi Zou	38c8657e5e	[improve](memory) more grace logging for memory exceed limit (#21311 ) More graceful logging for Allocator and MemTracker when memory exceeds the limit. Fix bthread graceful exit.	2023-07-05 14:59:06 +08:00
xzj7019	f9bc433917	[fix](nereids) fix runtime filter expr order (#21480 ) Currently, when pushing a runtime filter down into a cte, we construct the runtime filter expr_order with an incrementing number, which is not correct. For rf pushdown into a cte, the join node will always be different, so the expr_order should be fixed at 0 without incrementing; otherwise, the check on expr_order and probe_expr_size becomes illegal or the query result is wrong. This pr will revert 2827bc1 temporarily; it will break the cte rf pushing down plan pattern.	2023-07-05 14:27:35 +08:00
Pxl	f02bec8ad1	[Chore](runtime filter) change runtime filter dcheck to error status or exception (#21475 ) change runtime filter dcheck to error status or exception	2023-07-05 14:03:55 +08:00
catpineapple	d3eeb233c8	[fix](dbt) dbt getconfig array or string (#21345 ) {{ config(unique_key='id') }} {{ config(unique_key=['id','name']) }} Following the dbt convention, use a string for a single column name and an array for multiple columns.	2023-07-05 11:42:38 +08:00
catpineapple	e510e6b0a6	[fix](dbt) dbt-doris match dbt-core==1.5 (#21392 ) dbt-doris==0.2 matches dbt-core==1.3 or older; subsequent dbt-doris versions match dbt-core==1.4, 1.5	2023-07-05 11:42:19 +08:00
catpineapple	c9c183e498	[fix](dbt) dbt seed config read (#21492 )	2023-07-05 11:41:59 +08:00
Ashin Gau	0084b9fd9a	[fix](hudi) scala can't call Properties.putAll in jdk11 (#21494 )	2023-07-05 10:53:09 +08:00
starocean999	de5cfe34bf	[fix](feut)should not create a DeriveStatsJob in fe ut (#21498 )	2023-07-05 10:38:09 +08:00
DeadlineFen	15ec191a77	[Fix](CCR) Use tableId as the credential for CCR syncer instead of tableName (#21466 )	2023-07-05 10:16:09 +08:00
DeadlineFen	93795442a4	[Fix](CCR) Binlog config is missed when create replica task (#21397 )	2023-07-05 10:15:13 +08:00
DeadlineFen	0469c02202	[Test](regression) Temporarily disable quickTest for SHOW CREATE TABLE to adapt to enable_feature_binlog=true (#21247 )	2023-07-05 10:12:02 +08:00
zhangstar333	122f5f6c2d	[enchanment](udf) add more info when download jar package failed (#21440 ) When downloading a jar package, the checksum sometimes does not match, but the root cause is unknown. Now some error messages are added when the download fails.	2023-07-04 20:35:35 +08:00
Xinyi Zou	3b73604f74	[fix](memory) fix jemalloc purge arena dirty pages core dump (#21486 ) Issue Number: close #xxx jemalloc/jemalloc#2470 Occasional core dump during stress test.	2023-07-04 20:35:13 +08:00
Mryange	81ee4d7402	[performance](group_concat) avoid extra copy in group_concat (#21432 ) avoid extra copy in group_concat	2023-07-04 20:21:44 +08:00
Luzhijing	8c2963961f	[docs](releasenote) 2.0 beta release note (#21457 )	2023-07-04 19:02:18 +08:00
zy-kkk	f498beed07	[improvement](jdbc)Support for automatically obtaining the precision of the trino/presto timestamp type (#21386 )	2023-07-04 18:59:42 +08:00
zy-kkk	aec5bac498	[improvement](jdbc)Support for automatically obtaining the precision of the hana timestamp type (#21380 )	2023-07-04 18:59:21 +08:00
zy-kkk	b27fa70558	[fix](jdbc) fix presto jdbc catalog pushDown and nameFormat (#21447 )	2023-07-04 18:58:33 +08:00
zy-kkk	be406a1696	[typo](docs) fix presto jdbc catalog docs (#21445 )	2023-07-04 18:24:58 +08:00
YueW	899f7fbfeb	[fix](regression case) fix variable scope bug in some inverted index regression cases (#21194 ) fix variable scope bug in some inverted index regression cases	2023-07-04 18:05:46 +08:00
AKIRA	9d997b9349	[revert](nereids) Revert data size agg (#21216 ) To make stats derivation more precise	2023-07-04 18:02:15 +08:00
jakevin	1b86e658fd	[fix](Nereids): decrease the memo GroupExpression of limits (#21354 )	2023-07-04 17:15:41 +08:00
Mingyu Chen	13fb69550a	[improvement](kerberos) disable hdfs fs handle cache to renew kerberos ticket at fix interval (#21265 ) Add a new BE config `kerberos_ticket_lifetime_seconds`, default is 86400. It is best to set it to the same value as `ticket_lifetime` in `krb5.conf`. If an HDFS fs handle in the cache has lived longer than HALF of this time, it will be marked invalid and recreated, and the kerberos ticket will be renewed.	2023-07-04 17:13:34 +08:00
Mingyu Chen	c2b483529c	[fix](heartbeat) need to set backend status base on edit log (#21410 ) A non-master FE must set a Backend's status based on the content of the edit log. There is a bug: if the fe config `max_backend_heartbeat_failure_tolerance_count` is set larger than one, the non-master FE will not set the Backend as dead until it receives enough heartbeat edit logs, which is wrong. This causes the Backend to be dead on the Master FE but alive on the non-master FE.	2023-07-04 17:12:53 +08:00
Ashin Gau	9adbca685a	[opt](hudi) use spark bundle to read hudi data (#21260 ) Use spark-bundle to read hudi data instead of using hive-bundle to read hudi data. Advantages of using spark-bundle to read hudi data: 1. The performance of spark-bundle is more than twice that of hive-bundle 2. spark-bundle uses `UnsafeRow`, which can reduce data copying and GC time of the jvm 3. spark-bundle supports `Time Travel`, `Incremental Read`, and `Schema Change`; these functions can be quickly ported to Doris Disadvantages of using spark-bundle to read hudi data: 1. More dependencies make hudi-dependency.jar very cumbersome (from 138M -> 300M) 2. spark-bundle only provides the `RDD` interface and cannot be used directly	2023-07-04 17:04:49 +08:00
morrySnow	90dd8716ed	[refactor](multicast) change the way multicast do filter, project and shuffle (#21412 ) Co-authored-by: Jerry Hu <mrhhsg@gmail.com> 1. Filtering is done at the sending end rather than the receiving end 2. Projection is done at the sending end rather than the receiving end 3. Each sender can use different shuffle policies to send data	2023-07-04 16:51:07 +08:00
hqx871	09f414e0f4	fix lru cache handle field order (#21435 ) For LRUHandle, all fields should be put ahead of key_data. The LRUHandle is allocated using malloc, and the memory starting from field key_data holds the key data.	2023-07-04 16:10:05 +08:00
jakevin	9e8501f191	[Performance](Nereids): speedup analyze by removing sort()/addAll() in OptimizeGroupExpressionJob to (#21452 ) Calling sort() and addAll() on all rules costs much time and is a useless action; remove them to speed up. explain tpcds q72: 1.72s -> 1.46s	2023-07-04 16:01:54 +08:00
Huang Haijun	890e55b604	[typo](docs)Delete unsupported sql statements in GROUP_CONCAT() (#21455 ) Delete unsupported sql statements in GROUP_CONCAT()	2023-07-04 14:46:49 +08:00
Pxl	65cb91e60e	[Chore](agg-state) add sessionvariable enable_agg_state (#21373 ) add sessionvariable enable_agg_state	2023-07-04 14:25:21 +08:00
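
The handle-expiry rule described in commit 13fb69550a (#21265) can be illustrated with a minimal sketch. This is not the Doris BE implementation (which is C++); only the 86400-second default and the "older than HALF of the lifetime" rule come from the commit message. The class `CachedFsHandle` and the function `is_handle_invalid` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Default value quoted in the commit message for `kerberos_ticket_lifetime_seconds`.
KERBEROS_TICKET_LIFETIME_SECONDS = 86400


@dataclass
class CachedFsHandle:
    """Hypothetical stand-in for a cached HDFS fs handle."""
    created_at: float  # creation time in seconds (any monotonic or epoch clock)


def is_handle_invalid(handle, now,
                      ticket_lifetime_seconds=KERBEROS_TICKET_LIFETIME_SECONDS):
    """Return True once the handle has lived longer than half the ticket lifetime.

    An invalid handle would be dropped and recreated, which (per the commit
    message) also renews the Kerberos ticket.
    """
    age = now - handle.created_at
    return age > ticket_lifetime_seconds / 2
```

With the default lifetime, a handle created at time 0 is still usable at 43200 seconds and is treated as invalid from any later moment, so a cached handle never outlives half of the configured ticket lifetime.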


11687 Commits