Commit Graph

18263 Commits

Author SHA1 Message Date
Pxl
266bb971a6 [Enhancement](function) display elements number on check_chars_length #16570 2023-02-10 08:52:41 +08:00
c1a1275870 [fix](memory) Fix parquet load stack overflow (#16537) 2023-02-10 08:48:12 +08:00
48780dcea0 [BugFix](cooldown) push correct cooldownttl to be (#16553)
There were both cooldownttl and cooldownttlms in StoragePolicy, which was error-prone because they served nearly the same purpose.
For example, the init function would only assign the ttl timestamp to cooldownttl, which would end up pushing cooldownttl 0 to BE.
2023-02-10 08:45:04 +08:00
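The error-prone duplication described in the entry above can be sketched as follows (a minimal illustration, not Doris's actual StoragePolicy code; all names here are made up):

```cpp
#include <cassert>
#include <cstdint>

// Bug pattern: two near-duplicate fields serving the same purpose,
// where init() fills only one of them, so the other is pushed to BE as 0.
struct StoragePolicyBuggy {
    int64_t cooldown_ttl = 0;
    int64_t cooldown_ttl_ms = 0;  // near-duplicate of cooldown_ttl

    void init(int64_t ttl) {
        cooldown_ttl = ttl;  // cooldown_ttl_ms is forgotten and stays 0
    }
    int64_t value_pushed_to_be() const { return cooldown_ttl_ms; }
};

// The fix keeps a single source of truth for the TTL.
struct StoragePolicyFixed {
    int64_t cooldown_ttl = 0;

    void init(int64_t ttl) { cooldown_ttl = ttl; }
    int64_t value_pushed_to_be() const { return cooldown_ttl; }
};
```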
4fcd6cd236 [refactor](remove unused code) remove load stream mgr (#16580)
remove old stream load pipe
remove old stream load manager

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-02-10 07:46:18 +08:00
438daaaf1c [enhancement](mv) forbid creating useless mv in fe (#16286)
forbid creating useless materialized views in FE
2023-02-09 23:00:09 +08:00
ab34f418c3 [bugfix](information schema) sometimes fe throw thrift_rpc_error (#16555)
```
mysql> SELECT TABLE_NAME, CHECK_OPTION, IS_UPDATABLE, SECURITY_TYPE, DEFINER FROM INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA = 'test' ORDER BY TABLE_NAME ASC;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 0
Current database: *** NONE ***

ERROR 1105 (HY000): RpcException, msg: org.apache.doris.rpc.RpcException: failed to call frontend service
 @ 0x563a2b11b6ea doris::Status::ConstructErrorStatus()
 @ 0x563a2bcd638f doris::ThriftRpcHelper::rpc<>()
 @ 0x563a2b78b777 doris::SchemaHelper::list_table_status()
 @ 0x563a2b7a0972 doris::SchemaViewsScanner::get_new_table()
 @ 0x563a2b7a0b00 doris::SchemaViewsScanner::get_next_row()
 @ 0x563a2ccd0c93 doris::vectorized::VSchemaScanNode::get_next()
 @ 0x563a2b7450d6
```
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-02-09 21:49:33 +08:00
05ed1f751b [fix](planner)(Nereids) add date and datev2 signature to greatest and least function (#16565) 2023-02-09 21:36:53 +08:00
9f0fab8823 [typo](docs)Change the lowercase letters of the disk type example to uppercase (#16557) 2023-02-09 20:33:20 +08:00
77b7b84c34 fix (#16322)
Co-authored-by: wudi <>
2023-02-09 19:55:12 +08:00
130b3599bc [Improvement](writer) make DeltaWriter close idempotent to be more robust (#16558)
Return `Status::OK()` instead of `Status::Error<ALREADY_CLOSED>()` for `close()` in `DeltaWriter` if it is already closed.
2023-02-09 19:48:23 +08:00
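A minimal sketch of the idempotent-close idea (illustrative only; not the real DeltaWriter, whose Status type and close logic are far richer):

```cpp
#include <cassert>

// Illustrative stand-in for doris::Status.
enum class Status { OK, AlreadyClosed };

class DeltaWriter {
public:
    Status close() {
        if (_closed) {
            return Status::OK;  // idempotent: a repeated close is a no-op success
        }
        _closed = true;
        // ... flush buffered rows here ...
        return Status::OK;
    }

private:
    bool _closed = false;
};
```

Callers can then close defensively without special-casing the "already closed" error.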
a038fdaec6 [Bug](pipeline) Fix bug in non-local exchange on pipeline engine (#16463)
Currently, for broadcast shuffle, we serialize a block once and then send it by RPC through multiple channels. After this, we serialize the next block into the same memory for the sake of memory reuse. However, since the RPC is asynchronous, the next block's serialization may happen before the previous block has been sent.

So, in this PR, I use a ref count to identify whether the serialized block can be reused in broadcast shuffle.
2023-02-09 19:22:40 +08:00
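The ref-count idea can be sketched like this (names are illustrative; Doris's actual pipeline code differs):

```cpp
#include <atomic>
#include <cassert>

// The serialized block tracks how many RPCs are still in flight; the
// sender may only reuse the buffer for the next block once every
// channel has released it.
struct BroadcastBuffer {
    std::atomic<int> in_flight{0};

    void acquire() { in_flight.fetch_add(1); }  // per channel, before sending the RPC
    void release() { in_flight.fetch_sub(1); }  // in the RPC completion callback
    bool reusable() const { return in_flight.load() == 0; }
};
```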
539fd684e9 [improvement](filecache) use dynamic segment size to cache remote file block (#16485)
`CachedRemoteFileReader` used a fixed segment size (file_cache_max_file_segment_size = 4M) to cache remote file blocks. However, the column size in a row group/stripe may be smaller than 10K if a parquet/orc file has many columns, resulting in particularly serious read amplification. For example:
Q1 in clickbench: select count(*) from hits
```
-  FileCache:  0ns
  -  IOHitCacheNum:  552
  -  IOTotalNum:  835
  -  ReadFromFileCacheBytes:  19.98  MB
  -  ReadFromWriteCacheBytes:  0.00  
  -  ReadTotalBytes:  29.52  MB
  -  SkipCacheBytes:  0.00  
  -  WriteInFileCacheBytes:  915.77  MB
  -  WriteInFileCacheNum:  283 
```
Only 30MB of data is needed, but 900MB+ of data is read from HDFS. The query time of Q1 (single scan thread) increased from **5.17s** to **24.45s** when the file cache is enabled.

Therefore, this PR introduces a dynamic segment size based on the `read_size` of the data. To prevent too-small or too-large IO, the segment size is limited to the range [4096, file_cache_max_file_segment_size].

Q1 in clickbench takes **5.66s** with the file cache enabled. The performance is almost the same as with the cache disabled, and the data size read from HDFS is reduced to 45MB.
```
-  FileCache:  0ns
    -  IOHitCacheNum:  297
    -  IOTotalNum:  835
    -  ReadFromFileCacheBytes:  8.73  MB
    -  ReadFromWriteCacheBytes:  0.00  
    -  ReadTotalBytes:  29.52  MB
    -  SkipCacheBytes:  0.00  
    -  WriteInFileCacheBytes:  45.66  MB
    -  WriteInFileCacheNum:  544
```
## Remaining Problems
Small queries may produce a large number of small files (4KB at least), and the `BE` then stores too much meta information for the cached segments.

## Fix bug
`FileCachePolicy` in `FileReaderOptions` is a constant reference, but the argument passed in `FileFactory::create_file_reader` is a temporary variable, resulting in a segmentation fault.
2023-02-09 16:39:10 +08:00
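The clamping described above can be sketched as follows (illustrative names; not the actual Doris function):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Lower bound on the cache segment size, to avoid tiny IO.
constexpr int64_t kMinSegmentSize = 4096;

// Pick the segment size from the actual read size, bounded to
// [kMinSegmentSize, max_segment_size] (the latter standing in for
// file_cache_max_file_segment_size).
int64_t dynamic_segment_size(int64_t read_size, int64_t max_segment_size) {
    return std::clamp(read_size, kMinSegmentSize, max_segment_size);
}
```

A 10K column chunk then gets a 10K segment instead of a fixed 4M one, which is where the read-amplification savings come from.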
9090c5e4e5 [fix](docs) Fix memory & rowset count metrics (#16550) 2023-02-09 15:55:35 +08:00
851a3575ae [fix](regression case) exclude test_broker_load suite, reopen after bug fix (#16554)
There is something wrong with the `test_broker_load` suite (an S3 auth problem),
so I ignore this case temporarily.
cc @wsjz, please help to solve it and add it back.
2023-02-09 15:51:32 +08:00
ab4c718478 [fix](iceberg) remove s3 default temporary credentials #16543
remove TemporaryAWSCredentialsProvider in global s3 source

Co-authored-by: jinzhe <jinzhe@selectdb.com>
2023-02-09 15:36:35 +08:00
ba4b6aa0c0 [hot-fix](cooldown) Fix unknown module cooldownJob when load fe image #16545 2023-02-09 15:36:01 +08:00
e48a033338 [Bug](pipeline) Support projection in UnionSourceOperator (#16525) 2023-02-09 14:43:44 +08:00
4b093d1ef6 [Bug](point query) when prepared statement used lazyEvaluateRangeLocations should clear bucketSeq2locations to avoid memleak (#16531)
When the JDBC client enables server-side prepared statements, it caches the OlapScanNode and reuses it for performance, but each
call to `addScanRangeLocations` adds a new item to `bucketSeq2locations`, so `bucketSeq2locations` leads to a memory leak if the OlapScanNode
is cached in memory.
2023-02-09 14:41:07 +08:00
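The fix pattern can be sketched as follows (an illustrative C++ analogue; the actual code lives in the FE's Java OlapScanNode, and all names here are made up):

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// When a cached scan node is reused, clear the bucket-to-locations map
// before repopulating it, so repeated prepared-statement executions
// don't keep appending entries and leaking memory.
struct ScanNode {
    std::multimap<int, int> bucket_seq_to_locations;

    void add_scan_range_locations(const std::vector<std::pair<int, int>>& ranges) {
        bucket_seq_to_locations.clear();  // the fix: reset before refilling
        for (const auto& r : ranges) {
            bucket_seq_to_locations.insert(r);
        }
    }
};
```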
7d035486ad [Opt](vec) opt the fast execute logic to remove useless function call (#16532) 2023-02-09 14:12:40 +08:00
646ba2cc88 [bugfix](scannode) 1. make rows_read correct 2. use single scanner if has limit clause (#16473)
Make rows_read correct so that the scheduler can use it correctly.
Use a single scanner if there is a limit clause. Move this from the fragment context to the scan node.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-02-09 14:12:18 +08:00
21cdbec982 [fix](docs) fix some errors in docs (#16546)
Co-authored-by: hechao <hechao@selectdb.com>
2023-02-09 13:50:42 +08:00
338277b748 [doc](flink-connector) Update the flink connector docs to the latest (#14856) 2023-02-09 12:48:59 +08:00
d52fab6316 [typo](docs)modified some text errors (#16544)
Co-authored-by: wangtao <wangtao01@tianyancha.com>
2023-02-09 11:59:49 +08:00
0142ef8b95 [improvement](scanner) Supports bthread scanner (#16031) 2023-02-09 10:24:56 +08:00
531616b8ee [Fix](bucket)fix partition with no history data && AutoBucketUtilsTest (#16516)
fix partition with no history data && AutoBucketUtilsTest (#16515)
2023-02-09 10:17:25 +08:00
9f8753ffd2 [bugfix](vertical_compaction) fix base_compaction delete_sign handler (#16469)
In vertical base compaction, duplicate rows are filtered in vertical_merge_iterator;
we should skip these filtered rows when setting the agg flag of the delete sign.
For example, the schema is a,b,delete_sign, and the data is
```
1,1,1
1,1,0
1,1,0
2,2,1
2,2
```
and the Block we get in VerticalBlockReader is
```
1,1,1
2,2,1
```
and we should set the agg flag at index 0,4 to true when handling the delete sign, so
we add a function continuous_agg_count to skip the duplicate rows filtered in
VerticalMergeIterator.
2023-02-09 10:13:41 +08:00
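A sketch of such a continuous_agg_count helper (illustrative only, not the actual Doris implementation; keys are simplified to a single int per row):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many consecutive rows starting at `pos` share the same key,
// i.e. the size of the run of duplicates that the merge iterator filtered
// down to one output row. The caller can advance by this count to map each
// output row back to its position in the unfiltered input.
size_t continuous_agg_count(const std::vector<int>& keys, size_t pos) {
    size_t n = 1;
    while (pos + n < keys.size() && keys[pos + n] == keys[pos]) {
        ++n;
    }
    return n;
}
```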
e1f1386395 [fix](cooldown) Rewrite update cooldown conf (#16488)
Remove the error-prone CooldownJob and use CooldownConfHandler to update Tablet's cooldown conf.
Some bug fixes about cooldown.
2023-02-09 09:12:55 +08:00
e6b0d94459 [enhancement][docs] add docs for newly added two compaction method (#16529) (#16530)
Co-authored-by: yixiutt <102007456+yixiutt@users.noreply.github.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-02-09 09:07:33 +08:00
2d7a9c9c11 add the batch interval time of sink in spark connector doc (#16501) 2023-02-09 08:39:30 +08:00
d1c6b81140 [Bug](log) add some log to find out bug (#16518) 2023-02-08 21:23:02 +08:00
f0b0eedbc5 [fix](planner)group_concat lost order by info in second phase merge agg (#16479) 2023-02-08 20:48:52 +08:00
a512469537 [fix](planner) cannot process more than one subquery in disjunct (#16506)
Before this PR, Doris could not process SQL like the following:
```sql
CREATE TABLE `test_sq_dj1` (
    `c1` int(11) NULL,
    `c2` int(11) NULL,
    `c3` int(11) NULL
) ENGINE=OLAP
DUPLICATE KEY(`c1`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`c1`) BUCKETS 3
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1",
    "in_memory" = "false",
    "storage_format" = "V2",
    "disable_auto_compaction" = "false"
);

CREATE TABLE `test_sq_dj2` (
    `c1` int(11) NULL,
    `c2` int(11) NULL,
    `c3` int(11) NULL
) ENGINE=OLAP
DUPLICATE KEY(`c1`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`c1`) BUCKETS 3
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1",
    "in_memory" = "false",
    "storage_format" = "V2",
    "disable_auto_compaction" = "false"
);

insert into test_sq_dj1 values(1, 2, 3), (10, 20, 30), (100, 200, 300);
insert into test_sq_dj2 values(10, 20, 30);

-- core
SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 < 10;

-- invalid slot
SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c2 FROM test_sq_dj2) OR c1 < 10;
```

There are two problems:
1. we should remove redundant sub-queries in one conjunct to avoid generating useless join nodes
2. when we have more than one sub-query in one disjunct, we should put the conjunct containing the disjunct at the top node of the set of mark join nodes, and pop the mark slot up to the top node.
2023-02-08 18:46:06 +08:00
bb334de00f [enhancement](load) Change transaction limit from global level to db level (#15830)
Add transaction size quota for database

Co-authored-by: wuhangze <wuhangze@jd.com>
2023-02-08 18:04:26 +08:00
f71fc3291f [Bug](fix) right anti join error result when batch size is low (#16510) 2023-02-08 17:26:19 +08:00
666f7096f2 [Fix](multi catalog)(planner) Fix external table statistic collection bug (#16486)
Add index id to column statistic id. Refresh statistic cache after analyze.
2023-02-08 16:51:30 +08:00
b06e6b25c9 [improvement](fuzzy) print fuzzy session variable in FE audit log (#16493)
2023-02-08 16:38:04 +08:00
d956cb13af [Bug](point query) Reusable in PointQueryExecutor should call init before add to LookupCache (#16489)
Otherwise, under highly concurrent queries, `_block_pool` may be used before `Reusable::init` is done in other threads.
2023-02-08 16:05:59 +08:00
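The init-before-publish ordering can be sketched as follows (illustrative; not the actual PointQueryExecutor code, and all names are made up):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Fully initialize the reusable object *before* publishing it to the
// shared cache, so a concurrent lookup can never observe a
// half-initialized entry.
struct Reusable {
    bool inited = false;
    void init() { /* allocate _block_pool, prepare descriptors, ... */ inited = true; }
};

std::map<std::string, std::shared_ptr<Reusable>> lookup_cache;

void add_to_cache(const std::string& key) {
    auto r = std::make_shared<Reusable>();
    r->init();              // the fix: init first ...
    lookup_cache[key] = r;  // ... then publish to the cache
}
```

(A real concurrent cache would also need a lock or atomic publication around the map itself.)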
e11437d1fe [fix](planner) npe in RewriteBinaryPredicatesRule (#16401)
RewriteBinaryPredicatesRule rewrites expressions like
`cast(A as decimal) > decimal` to `A > some_other_bigint`
in order to:
1. push down the rewritten predicate
2. avoid converting column A to decimal

We get the data type of `A` via `expr0.getSrcSlotRef().getColumn().getType()`.
However, when A is the result of a function from a sub-query, this rule is not applicable.
For example:
```
select * 
from (
       select TIMESTAMPDIFF(MINUTE,startTime,endTime) AS timediff 
        from CNC_SliceSate) T  
where timediff > 5.0;
```
we cannot push the predicate down to OlapScan(CNC_SliceSate) to save effort.
2023-02-08 15:57:35 +08:00
f6a20f844b [fix](hashjoin) join produce blocks with rows larger than batch size: handle join with other conjuncts (#16402) 2023-02-08 14:26:35 +08:00
2883f67042 [fix](iceberg) update iceberg docs and add credential properties (#16429)
Update iceberg docs
Add new s3 credential and properties
2023-02-08 13:53:01 +08:00
41947c73eb [Feature](array-function) Support array functions for nested type datev2 and datetimev2 (#16382) 2023-02-08 12:51:07 +08:00
98c741d664 [fix](Nereids): FilterOrSelf shouldn't And all predicates.. (#16491) 2023-02-08 12:42:22 +08:00
2fd7833a12 [fix](doc): fix typo of tpch.md (#16229) 2023-02-08 12:01:21 +08:00
583001bd92 [Bug](share hash table) Support shared hash table on Nereids (#16474) 2023-02-08 11:51:27 +08:00
713c11b42b [typo](docs) Fix some errors in the description (#16452) 2023-02-08 11:47:39 +08:00
2350ef1a64 Modify thrift_rpc_timeout_ms default value documentation (#16464) 2023-02-08 11:35:38 +08:00
afdaf2d70e [Doc](Jdbc Catlog) JDBC Catalog support Insert operation (#16454) 2023-02-08 10:59:20 +08:00
254790c564 [fix](nereids) FE nereids use DateV2Literal instead of 'cast datev2' (#16386)
BE already supports DateV2Literal; hence, remove the code in FE which converts DateV2Literal to a cast to datev2.
2023-02-08 10:51:35 +08:00
81dbed70c2 [fix](Nereids) back off on tpch p1 (#16478)
Adjust nullable on empty set: it should apply after un-nesting the sub-query.
Some functions should propagate nullable when args are datev2 or datetimev2.
Add back the tpch sf0.1 nereids regression test.
2023-02-08 10:43:13 +08:00
289a4b2ea4 [fix](func) fix truncate float type result error (#16468)
When the argument of the truncate function is a float type, it can match both `truncate(DECIMALV3)` and `truncate(DOUBLE)`. If `truncate(DECIMALV3)` is matched, precision is lost when converting the float to DECIMALV3(38, 0).

Here I modify it to match `truncate(DOUBLE)` for now; we may still need to solve the problem of losing precision when converting float to DECIMALV3.
2023-02-08 08:57:43 +08:00
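The precision difference between the two overload matches can be sketched like this (illustrative; `std::trunc` stands in for the engine's truncate logic, and the DECIMALV3 path is simplified to the scale-0 coercion it implies):

```cpp
#include <cassert>
#include <cmath>

// Matching truncate(DECIMALV3): coercing the float argument to
// DECIMALV3(38, 0) drops the fraction before truncate() even runs.
double truncate_via_decimal_v3_scale0(float v) {
    return std::trunc(v);  // the scale-0 coercion already lost the fraction
}

// Matching truncate(DOUBLE): the fraction survives, so truncating to a
// requested scale works as expected.
double truncate_via_double(double v, int scale) {
    double factor = std::pow(10.0, scale);
    return std::trunc(v * factor) / factor;
}
```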