doris

Author	SHA1	Message	Date
Gabriel	a038fdaec6	[Bug](pipeline) Fix bug in non-local exchange on pipeline engine (#16463 ) Currently, for broadcast shuffle, we serialize a block once and then send it by RPC through multiple channels. After this, we serialize the next block into the same memory so that the memory can be reused. However, since the RPC is asynchronous, the next block may be serialized before the previous block has been sent. So, in this PR, I use a ref count to identify whether the serialized block can be reused in broadcast shuffle.	2023-02-09 19:22:40 +08:00
Ashin Gau	539fd684e9	[improvement](filecache) use dynamic segment size to cache remote file block (#16485 ) `CachedRemoteFileReader` has used a fixed segment size (file_cache_max_file_segment_size=4M) to cache remote file blocks. However, the column size in a row group/stripe may be smaller than 10K if a parquet/orc file has many columns, resulting in particularly serious read amplification. For example: Q1 in clickbench: select count() from hits ``` - FileCache: 0ns - IOHitCacheNum: 552 - IOTotalNum: 835 - ReadFromFileCacheBytes: 19.98 MB - ReadFromWriteCacheBytes: 0.00 - ReadTotalBytes: 29.52 MB - SkipCacheBytes: 0.00 - WriteInFileCacheBytes: 915.77 MB - WriteInFileCacheNum: 283 ``` Only 30MB of data is needed, but 900MB+ of data is read from hdfs. The query time of Q1(single scan thread) increased from 5.17s* to 24.45s when the file cache is enabled. Therefore, this PR introduces a dynamic segment size which is based on the `read_size` of the data. In order to prevent too small or too large IO, the segment size is limited to [4096, file_cache_max_file_segment_size]. Q1 in clickbench takes 5.66s when the file cache is enabled. The performance is almost the same as if the cache is disabled, and the data size read from hdfs is reduced to 45MB. ``` - FileCache: 0ns - IOHitCacheNum: 297 - IOTotalNum: 835 - ReadFromFileCacheBytes: 8.73 MB - ReadFromWriteCacheBytes: 0.00 - ReadTotalBytes: 29.52 MB - SkipCacheBytes: 0.00 - WriteInFileCacheBytes: 45.66 MB - WriteInFileCacheNum: 544 ``` ## Remaining Problems Small queries may result in a large number of small files (4KB at least), and the `BE` saves too much meta information of cached segments. ## Fix bug `FileCachePolicy` in `FileReaderOptions` is a constant reference, but the parameter passed in `FileFactory::create_file_reader` is a temporary variable, resulting in a segmentation fault.	2023-02-09 16:39:10 +08:00
Xinyi Zou	9090c5e4e5	[fix](docs) Fix memory & rowset count metrics (#16550 )	2023-02-09 15:55:35 +08:00
chunping	851a3575ae	[fix](regression case) exclude test_broker_load suite, reopen after bug fix (#16554 ) There is something wrong with the `test_broker_load` suite(s3 auth problem). So I ignore this case temporarily. cc @wsjz , please help to solve it and add it back	2023-02-09 15:51:32 +08:00
slothever	ab4c718478	[fix](iceberg) remove s3 default temporary credentials #16543 remove TemporaryAWSCredentialsProvider in global s3 source Co-authored-by: jinzhe <jinzhe@selectdb.com>	2023-02-09 15:36:35 +08:00
plat1ko	ba4b6aa0c0	[hot-fix](cooldown) Fix unknown module cooldownJob when load fe image #16545	2023-02-09 15:36:01 +08:00
Gabriel	e48a033338	[Bug](pipeline) Support projection in UnionSourceOperator (#16525 )	2023-02-09 14:43:44 +08:00
lihangyu	4b093d1ef6	[Bug](point query) when prepared statement used lazyEvaluateRangeLocations should clear bucketSeq2locations to avoid memleak (#16531 ) When a JDBC client enables server-side prepared statements, the OlapScanNode is cached and reused for performance, but each call to `addScanRangeLocations` adds new items to `bucketSeq2locations`, so `bucketSeq2locations` leaks memory while the OlapScanNode stays cached in memory.	2023-02-09 14:41:07 +08:00
HappenLee	7d035486ad	[Opt](vec) opt the fast execute logic to remove useless function call (#16532 )	2023-02-09 14:12:40 +08:00
yiguolei	646ba2cc88	[bugfix](scannode) 1. make rows_read correct 2. use single scanner if has limit clause (#16473 ) Make rows_read correct so that the scheduler can use it correctly. Use a single scanner if the query has a limit clause. Move it from fragment context to scannode. --------- Co-authored-by: yiguolei <yiguolei@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-02-09 14:12:18 +08:00
superche	21cdbec982	[fix](docs) fix some errors in docs (#16546 ) Co-authored-by: hechao <hechao@selectdb.com>	2023-02-09 13:50:42 +08:00
Yun Tang	338277b748	[doc](flink-connector) Update the flink connector docs to the latest (#14856 )	2023-02-09 12:48:59 +08:00
Jenson97	d52fab6316	[typo](docs)modified some text errors (#16544 ) Co-authored-by: wangtao <wangtao01@tianyancha.com>	2023-02-09 11:59:49 +08:00
Xiaocc	0142ef8b95	[improvement](scanner) Supports bthread scanner (#16031 )	2023-02-09 10:24:56 +08:00
Drogon	531616b8ee	[Fix](bucket)fix partition with no history data && AutoBucketUtilsTest (#16516 ) fix partition with no history data && AutoBucketUtilsTest (#16515)	2023-02-09 10:17:25 +08:00
yixiutt	9f8753ffd2	[bugfix](vertical_compaction) fix base_compaction delete_sign handler (#16469 ) In vertical base compaction, identical rows are filtered in vertical_merge_iterator, and we should skip these filtered rows when setting the agg flag of the delete sign. For example, the schema is a,b,delete_sign, the data is 1,1,1 1,1,0 1,1,0 2,2,1 2,2 and the Block we get in VerticalBlockReader is 1,1,1 2,2,1, so we should set the agg flag at index 0,4 to true when handling the delete sign. We therefore add a function continuous_agg_count to skip the identical rows filtered in VerticalMergeIterator.	2023-02-09 10:13:41 +08:00
plat1ko	e1f1386395	[fix](cooldown) Rewrite update cooldown conf (#16488 ) Remove error-prone CooldownJob, and use CooldownConfHandler to update Tablet's cooldown conf. Some bug fix about cooldown.	2023-02-09 09:12:55 +08:00
zhengyu	e6b0d94459	[enhancement][docs] add docs for newly added two compaction method (#16529 ) (#16530 ) Co-authored-by: yixiutt <102007456+yixiutt@users.noreply.github.com> Co-authored-by: zhengyu <freeman.zhang1992@gmail.com> Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-02-09 09:07:33 +08:00
Hong Liu	2d7a9c9c11	add the batch interval time of sink in spark connector doc (#16501 )	2023-02-09 08:39:30 +08:00
Gabriel	d1c6b81140	[Bug](log) add some log to find out bug (#16518 )	2023-02-08 21:23:02 +08:00
starocean999	f0b0eedbc5	[fix](planner)group_concat lost order by info in second phase merge agg (#16479 )	2023-02-08 20:48:52 +08:00
morrySnow	a512469537	[fix](planner) cannot process more than one subquery in disjunct (#16506 ) Before this PR, Doris could not process SQL like this: ```sql CREATE TABLE `test_sq_dj1` ( `c1` int(11) NULL, `c2` int(11) NULL, `c3` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`c1`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); CREATE TABLE `test_sq_dj2` ( `c1` int(11) NULL, `c2` int(11) NULL, `c3` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`c1`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); insert into test_sq_dj1 values(1, 2, 3), (10, 20, 30), (100, 200, 300); insert into test_sq_dj2 values(10, 20, 30); -- core SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 < 10; -- invalid slot SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c2 FROM test_sq_dj2) OR c1 < 10; ``` There are two problems: 1. we should remove redundant sub-queries in one conjunct to avoid generating useless join nodes 2. when we have more than one sub-query in one disjunct, we should put the conjunct that contains the disjunct at the top node of the set of mark join nodes, and pop up the mark slot to the top node.	2023-02-08 18:46:06 +08:00
Henry2SS	bb334de00f	[enhancement](load) Change transaction limit from global level to db level (#15830 ) Add transaction size quota for database Co-authored-by: wuhangze <wuhangze@jd.com>	2023-02-08 18:04:26 +08:00
HappenLee	f71fc3291f	[Bug](fix) right anti join error result when batch size is low (#16510 )	2023-02-08 17:26:19 +08:00
Jibing-Li	666f7096f2	[Fix](multi catalog)(planner) Fix external table statistic collection bug (#16486 ) Add index id to column statistic id. Refresh statistic cache after analyze.	2023-02-08 16:51:30 +08:00
Mingyu Chen	b06e6b25c9	[improvement](fuzzy) print fuzzy session variable in FE audit log (#16493 ) * [improvement](fuzzy) print fuzzy session variable in FE audit log	2023-02-08 16:38:04 +08:00
lihangyu	d956cb13af	[Bug](point query) Reusable in PointQueryExecutor should call init before add to LookupCache (#16489 ) Otherwise, under highly concurrent queries, _block_pool may be used before Reusable::init has finished in other threads	2023-02-08 16:05:59 +08:00
minghong	e11437d1fe	[fix](planner) npe in RewriteBinaryPredicatesRule (#16401 ) RewriteBinaryPredicatesRule rewrites expressions like `cast(A decimal) > decimal` to `A > some_other_bigint` in order to: 1. push down the rewritten predicate 2. avoid converting column A to decimal. We get the datatype of `A` by `expr0.getSrcSlotRef().getColumn().getType()`. However, when A is the result of a function from a sub-query, this rule is not applicable. For example: ``` select * from ( select TIMESTAMPDIFF(MINUTE,startTime,endTime) AS timediff from CNC_SliceSate) T where timediff > 5.0; ``` we cannot push the predicate down to OlapScan(CNC_SliceSate) to save effort.	2023-02-08 15:57:35 +08:00
TengJianPing	f6a20f844b	[fix](hashjoin) join produce blocks with rows larger than batch size: handle join with other conjuncts (#16402 )	2023-02-08 14:26:35 +08:00
slothever	2883f67042	[fix](iceberg) update iceberg docs and add credential properties (#16429 ) Update iceberg docs Add new s3 credential and properties	2023-02-08 13:53:01 +08:00
abmdocrt	41947c73eb	[Feature](array-function) Support array functions for nested type datev2 and datetimev2 (#16382 )	2023-02-08 12:51:07 +08:00
jakevin	98c741d664	[fix](Nereids): `FilterOrSelf` shouldn't `And` all predicates.. (#16491 )	2023-02-08 12:42:22 +08:00
UnicornLee	2fd7833a12	[fix](doc): fix typo of tpch.md (#16229 )	2023-02-08 12:01:21 +08:00
Gabriel	583001bd92	[Bug](share hash table) Support shared hash table on Nereids (#16474 )	2023-02-08 11:51:27 +08:00
yagagagaga	713c11b42b	[typo](docs) Fix some errors in the description (#16452 )	2023-02-08 11:47:39 +08:00
caoliang-web	2350ef1a64	Modify thrift_rpc_timeout_ms default value documentation (#16464 )	2023-02-08 11:35:38 +08:00
Tiewei Fang	afdaf2d70e	[Doc](Jdbc Catlog) JDBC Catalog support `Insert` operation (#16454 )	2023-02-08 10:59:20 +08:00
minghong	254790c564	[fix](nereids) FE nereids use DateV2Literal instead of 'cast datev2' (#16386 ) BE already supports DateV2Literal, so remove the code in FE which converts DateV2Literal to Cast datev2	2023-02-08 10:51:35 +08:00
morrySnow	81dbed70c2	[fix](Nereids) back off on tpch p1 (#16478 ) adjust nullable on empty set should apply after unnested sub-query; some function should propagate nullable when args are datev2 or datetimev2; add back tpch sf0.1 nereids regression test	2023-02-08 10:43:13 +08:00
luozenglin	289a4b2ea4	[fix](func) fix truncate float type result error (#16468 ) When the argument of truncate function is float type, it can match both truncate(DECIMALV3) and truncate(DOUBLE), if the match is truncate(DECIMALV3), the precision is lost when converting float to DECIMALV3(38, 0). Here I modify it to match truncate(DOUBLE) for now, maybe we still need to solve the problem of losing precision when converting float to DECIMALV3.	2023-02-08 08:57:43 +08:00
Kang	cf18de14b5	[fix](writer) add _is_closed state to DeltaWriter and avoid write/close core after close (#16453 )	2023-02-07 22:40:26 +08:00
Jerry Hu	91325e5ca3	[fix](pipeline) incorrect result when disabling sharing hash table (#16476 )	2023-02-07 21:25:32 +08:00
mch_ucchi	a4c28e6efa	[Fix](Nereids) runtime filter cannot generate when expression is cast. (#16120 )	2023-02-07 20:28:07 +08:00
yixiutt	f90d844a53	[improvement](compaction) enable compaction in TABLET_NOTREADY (#16470 ) If an alter task is in the queue, compaction is not enabled, which may cause too many versions. Keep the last 10 versions in the new tablet so that the base tablet's max version will not be merged, and then we can copy data from the base tablet to the new tablet.	2023-02-07 19:58:23 +08:00
lihangyu	1d0fdff98a	[Bug](sort) disable 2phase read for sort by expressions exclude slotref (#16460 ) ``` create table tbl1 (k1 varchar(100), k2 string) distributed by hash(k1) buckets 1 properties("replication_num" = "1"); insert into tbl1 values(1, "alice"); select cast(k1 as INT) as id from tbl1 order by id limit 2; ``` The above query could pass `checkEnableTwoPhaseRead` since the order by element is a SlotRef, but it is actually a function call expr	2023-02-07 19:42:54 +08:00
HappenLee	9114896178	[DecimalV3](opt) opt the function of decimalv3 to_string logic (#16427 )	2023-02-07 13:28:07 +08:00
minghong	796d51ae2e	[enhance](fuzzy)set rewriteOrToInPredicateThreshold=2/10000 in fuzzy mode (#16456 ) * set rewriteOrToInPredicateThreshold=2/10000 in fuzzy mod * fmt	2023-02-07 12:45:27 +08:00
yiguolei	d390e63a03	[enhancement](stream receiver) make stream receiver exception safe (#16412 ) make stream receiver exception safe; change get_block(block*) to get_block(block , bool* eos); unify stream semantic	2023-02-07 12:44:20 +08:00
yiguolei	6fdd35a6f2	[enhancement](mpp process) remove unused method and make report process more clear (#16441 ) both update status and open_vectorized_internal will call send_report and stop report thread. move update_status code to open method and remove unnecessary send_report and stop_report_thread. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-02-07 12:28:55 +08:00
Shuo Wang	bed1ab7c19	[Feature](Nereids) Add hint to enable pre-aggregation when scan OLAP table. (#15614 ) This PR adds support for the pre-aggregation hint. Users can use /*+PREAGGOPEN*/ to enable pre-aggregation for an OLAP table. For example: Let's say we have an aggregate-keys table t (k1 int, k2 int, v1 int sum, v2 int sum). Pre-aggregation can be enabled by querying with a hint: select k1, v1 from t /*+PREAGGOPEN*/.	2023-02-07 11:59:10 +08:00


8603 Commits
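
The segment-size rule described in commit 539fd684e9 (segment size derived from `read_size` and limited to [4096, file_cache_max_file_segment_size]) can be illustrated with a minimal sketch. This is not the Doris implementation, which lives in the C++ backend. The function name `choose_segment_size` and its parameter names are hypothetical, and the sketch assumes the rule is a plain clamp with no extra alignment.

```python
# Illustrative sketch only; names are hypothetical, not Doris APIs.
MIN_SEGMENT_SIZE = 4096                      # lower bound stated in the commit message
DEFAULT_MAX_SEGMENT_SIZE = 4 * 1024 * 1024   # file_cache_max_file_segment_size=4M


def choose_segment_size(read_size: int,
                        max_segment_size: int = DEFAULT_MAX_SEGMENT_SIZE) -> int:
    """Clamp the requested read size into [MIN_SEGMENT_SIZE, max_segment_size]."""
    if read_size < 0:
        raise ValueError("read_size must be non-negative")
    # Small reads are rounded up to the minimum to avoid tiny IO;
    # large reads are capped so that one segment never exceeds the maximum.
    return max(MIN_SEGMENT_SIZE, min(read_size, max_segment_size))


if __name__ == "__main__":
    for size in (100, 10_000, 64 * 1024 * 1024):
        print(size, "->", choose_segment_size(size))
```

With a fixed 4M segment, a 10K column read would cache a full 4M segment. Under the clamp above the same read caches a 10K segment, which is the read-amplification reduction the commit message reports.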