Commit Graph

11394 Commits

Author SHA1 Message Date
4d84cd8ca1 Revert "Revert "[Test](regression) CCR syncer thrift interface regression test (#20935)" (#20990)" (#21022)
This reverts commit 2a294801f1324a999570158eea3224239eefbb29.
2023-06-21 15:20:21 +08:00
b65b821813 [enhancement](pk) add bvar stating cached io (#20977) 2023-06-21 15:02:10 +08:00
c5560b8f93 [fix](load) segcompaction does not signal waiters when an error happens (#21043)
This leads to a deadlock.
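The failure mode can be illustrated with a minimal sketch, assuming a worker that reports completion through a condition variable. The class and function names below are hypothetical and are not the actual segcompaction code; the point is only that waiters must be notified on the error path as well as on success.

```python
import threading


class CompactionTask:
    """Hypothetical stand-in for a background compaction job."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self.error = None

    def run(self, work):
        try:
            work()
        except Exception as exc:
            # Record the failure so the waiter can inspect it later.
            self.error = exc
        finally:
            # Signal waiters on success AND on error. Skipping this on the
            # error path would leave wait() blocked forever.
            with self._cond:
                self._done = True
                self._cond.notify_all()

    def wait(self, timeout=None):
        # Returns True once the task has finished, False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._done, timeout=timeout)


def failing_work():
    raise RuntimeError("simulated compaction error")


task = CompactionTask()
worker = threading.Thread(target=task.run, args=(failing_work,))
worker.start()
finished = task.wait(timeout=5)
worker.join()
```

Putting the notification in `finally` is one way to guarantee that every exit path wakes the waiter.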
2023-06-21 14:56:34 +08:00
bad22dd4e2 [Fix](orc-reader) Fix orc dict filter null value issue in _convert_dict_cols_to_string_cols which caused incorrect result. (#21047)
The query below filters on `commit_id != ""`, so its results should not contain empty values, yet the output shows several:
```
use regresssion.multi_catalog;
select commit_id from github_events_orc WHERE (event_type = 'CommitCommentEvent') AND commit_id != "" limit 10;
```
```
+------------------------------------------+
| commit_id                                |
+------------------------------------------+
| 685c1fd8dbbdc10c042932f9a9f88be00ff96c75 |
| 685c1fd8dbbdc10c042932f9a9f88be00ff96c75 |
| 4e3ab2ff2d2474f5d51334b9b0fdf17e9845a166 |
|                                          |
|                                          |
|                                          |
|                                          |
|                                          |
|                                          |
| 7191c20cb49da07a7fc16aa32dc0de4faff528b2 |
+------------------------------------------+
10 rows in set (0.54 sec) 
```
2023-06-21 14:54:01 +08:00
564b3533cf [enhancement](merge-on-write) update publish/streamload/compaction co… (#21040) 2023-06-21 14:49:51 +08:00
62fb0e642e [chore](dynamic schema) deprecated create dynamic schema table (#21058) 2023-06-21 14:44:57 +08:00
6f20cac1da [bugfix](cooldown) Fix potential deadlock while calling handleCooldownConf (#20975) 2023-06-21 14:44:01 +08:00
81abdeffbc [Improvement](pipeline) Improve shared scan performance (#20785) 2023-06-21 14:36:05 +08:00
Pxl
5f0bb49d46 [Feature](materialized-view) support create mv contain aggstate column (#20812)
Support creating a materialized view that contains an aggstate column.
2023-06-21 13:06:52 +08:00
fcd778fb4f [Fix](mysql proto) avoid send duplicated OK packet (#21032)
1. The MySQL Go driver stops reading when it receives an EOF (end-of-file) and expects no data to remain in the buffer. However, the frontend (FE) mistakenly returned an additional OK packet, which caused an exception to be thrown when the driver read the buffer.

2. Refactor some logic to support placeholders anywhere in a prepared statement, not just in the WHERE clause, for example:
```
select ?, ? from tbl
```
2023-06-21 12:00:22 +08:00
18beb822a3 [FIX](array-type) fix array string output with fe const expr (#21042)
The FE foldconstRule evaluates an array() function expression with constant literals on the FE and does not pass the resulting array literal to the BE. The FE array string output format should therefore be the same as the BE array string output format.
2023-06-21 11:52:02 +08:00
5f760a8939 [fix](runtime_filter) remove incorrect DCHECK (#21050) 2023-06-21 11:27:53 +08:00
ef17289925 [feature](jni) add jni metrics and attach to BE profile automatically (#21004)
Add JNI metrics, for example:
```
-  HudiJniScanner:  0ns
  -  FillBlockTime:  31.29ms
  -  GetRecordReaderTime:  1m5s
  -  JavaScanTime:  35s991ms
  -  OpenScannerTime:  1m6s
```
Add three common performance metrics for JNI scanner:
1. `OpenScannerTime`: Time to init and open JNI scanner
2. `JavaScanTime`: Time to scan data and insert it into the vector table on the Java side
3. `FillBlockTime`: Time to convert the Java vector table to a C++ block

User-defined metrics on the Java side are also supported. For example, `OpenScannerTime` covers the whole open process, which can take a long time; to determine which sub-step takes too much time, we add `GetRecordReaderTime` on the Java side.
User-defined metrics on the Java side are attached to the BE profile automatically.
2023-06-21 11:19:02 +08:00
Pxl
b4773e1195 [Chore](materialized-view) enable nereids planner on regression test mv_p0 (#21023)
Enable the Nereids planner for the mv_p0 regression tests.
2023-06-21 10:01:27 +08:00
0cf9de8cef [fix](decimalv3) fix result error when cast a round decimalv3 to double (#20678) 2023-06-21 00:02:48 +08:00
ca6f51fcd5 [Performance] disable mmap alloc for doris performance (#21034)
disable mmap alloc for some benchmark
2023-06-20 23:27:49 +08:00
6d579d924d [fix](profile) delete useless profile add_child #20989 2023-06-20 23:21:52 +08:00
d7cc05502a [typo](doc) To access a Kafka cluster with PLAIN authentication enabled (#21019) 2023-06-20 23:19:59 +08:00
b70a14d9c9 [fix](merge-on-write) fix that delete bitmap is not calculated correctly when has sequence column (#20955) 2023-06-20 21:36:47 +08:00
2c11ce0a02 [bugfix](topn) fix key topn merge block conflict with index predicate result columns (#20820) 2023-06-20 21:23:00 +08:00
7a58a69aa9 [Fix](inverted index) skip index compaction when src rs did not have inverted index (#21010) 2023-06-20 21:22:25 +08:00
ce1b39e79d [fix](profile) avoid unnecessary refresh profile of TabletsChannel
Previously, the TabletsChannel profile was refreshed in the LoadChannelMgr thread that refreshes memory statistics.

As a result, the profile was refreshed even with enable_profile=false, which caused a performance loss in stress tests.
2023-06-20 21:09:43 +08:00
622ef63c69 [fix](memory) fix bthread_setspecific error in rpc done.run() (#20999) 2023-06-20 21:00:45 +08:00
f10258577b [Fix](Planner) Fix group concat with multi distinct and segs (#20912)
Problem:
A query such as select group_concat(distinct a, 'seg1'), group_concat(distinct b, 'seg2') ... raised an error.
Reason:
The group_concat function also treated 'seg' as an argument, so a multi-distinct-column error was raised.
Solution:
The multi-distinct group_concat function now takes only its first argument as the real argument.
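The intended semantics can be sketched as follows. This is an illustrative model only, not the planner code: the helper name is hypothetical, NULL values are assumed to be skipped, and output order follows first occurrence here, whereas SQL does not guarantee an order.

```python
def group_concat_distinct(values, separator=","):
    # Only `values` takes part in deduplication; `separator` is just a
    # formatting argument, not a second DISTINCT column.
    seen = dict.fromkeys(v for v in values if v is not None)
    return separator.join(str(v) for v in seen)


# Hypothetical rows of (a, b) for illustration.
rows = [("x", "p"), ("y", "p"), ("x", "q"), (None, "q")]
concat_a = group_concat_distinct((a for a, _ in rows), "seg1")
concat_b = group_concat_distinct((b for _, b in rows), "seg2")
```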
2023-06-20 21:00:18 +08:00
55a6649da9 [fix](testcase) fix test case failure of insert null value into not null column (#20963) 2023-06-20 20:46:07 +08:00
190debaac9 [Improvement](load) single partition load optimize (#20876)
1. When there is only a single partition, the partition and tablet are not looked up for each row of data.
2. This applies only to DISTRIBUTED BY random.
2023-06-20 20:29:39 +08:00
493f9f563c [chore](third-party) temporary rollback brpc to 1.4 (#21011) 2023-06-20 20:16:51 +08:00
9eade148dd [enhancement](merge-on-write) add primary key data page size config (#20961) 2023-06-20 19:51:02 +08:00
19dd35f908 [doc](fix) cold hot separation cache doc (#20994) 2023-06-20 18:18:22 +08:00
ca8f51602b [Improvement](multi catalog, statistics)Support two level external statistics cache loader (#20906)
The current column statistics cache loader loads data from the column_statistics olap table.
This PR changes the cache loader logic to first load from the column_statistics olap table and, if no data was loaded, then load from the table metadata. This is mainly to support fetching statistics for external catalogs through the HMS or Iceberg API.
This is the first PR; the next PR will implement the fetch logic for the different external catalogs.
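The fallback order described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary-backed sources, and the shape of the statistics are all hypothetical and do not reflect the FE's actual cache loader classes.

```python
from typing import Callable, Optional


def make_two_level_loader(
    load_from_stats_table: Callable[[str], Optional[dict]],
    load_from_table_metadata: Callable[[str], Optional[dict]],
) -> Callable[[str], Optional[dict]]:
    """Build a loader that tries the statistics table first, then metadata."""

    def load(column: str) -> Optional[dict]:
        stats = load_from_stats_table(column)
        if stats is not None:
            return stats
        # Nothing was loaded from the first level, so fall back to metadata.
        return load_from_table_metadata(column)

    return load


# Hypothetical data sources for illustration only.
stats_table = {"id": {"row_count": 100, "ndv": 100}}
external_metadata = {"name": {"row_count": 100}}

loader = make_two_level_loader(stats_table.get, external_metadata.get)
from_first_level = loader("id")
from_second_level = loader("name")
missing = loader("unknown")
```

Passing the two sources as callables keeps the fallback logic independent of where each level actually reads from.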
2023-06-20 16:43:18 +08:00
cb89af49e7 [improvement](replica) do not care about last failed version in publish (#21001)
We care about just two things:
1. Whether the replica acks correctly
2. Whether the replica catches up
2023-06-20 15:57:54 +08:00
0b1bbe4045 [Bugfix](CCR) BinlogTombstone tableId is null when db disable binlog (#20995) 2023-06-20 15:48:47 +08:00
0d80456869 [enhancement](backup) teach fe to acquire a consistent backup between be and fe (#21014) 2023-06-20 15:37:41 +08:00
ccba11d7ea [Fix](inverted index) remove IndexReader::indexExists, use fs interface (#20970) 2023-06-20 15:22:25 +08:00
f4d3f4ae19 [Fix](Nereids) failed to fold date_format() to constant (#20976) 2023-06-20 15:11:25 +08:00
ec34f72204 [enhancement](nereids) log for exception stack of sync analyze (#21013) 2023-06-20 15:11:03 +08:00
6b4a9edbbd [fix](nereids) Fix explain graph with CTE #20997
Add support of MultiCastDataSink
2023-06-20 14:55:21 +08:00
7da3fde89c [Fix](Nereids)cast to datev2 default for Nereids if enable_date_conversion (#20973) 2023-06-20 14:53:20 +08:00
012813b3f7 [fix](load) add missing flush context for BetaRowsetWriter::_add_block() (#20884) 2023-06-20 14:27:39 +08:00
53b2fe5db6 [improvement](jdbc) Set the JDBC connection timeout to be conf (#21000) 2023-06-20 14:23:48 +08:00
c85271d2ae [Fix](orc-reader) Fix filter size mismatch in orc reader. (#20998)
Fix filter size mismatch in orc reader introduced by #20806
2023-06-20 12:27:16 +08:00
d05614ef51 [Fix](invert index)all directories use NoLock (#20962) 2023-06-20 12:12:16 +08:00
74a09fc6e5 [Dependency](fe)Use the release version of hive-catalog (#20921)
Used hive-catalog-shade 1.0.1
2023-06-20 11:53:59 +08:00
1eb4e5bd06 [Fix](Routineload)routine load does not support lowercase data source names (#21005) 2023-06-20 11:44:02 +08:00
923f7edad0 [opt](hudi) using native reader to read the base file with no log file (#20988)
Two optimizations:
1. Insert string bytes directly to remove the decoding and encoding step.
2. Use native reader to read the hudi base file if it has no log file. Use `explain` to show how many splits are read natively.
2023-06-20 11:20:21 +08:00
7e01f074e2 [improvement](jdbc mysql) support auto calculate the precision of timestamp/datetime (#20788) 2023-06-20 10:39:34 +08:00
87258a13c4 [enhancement](nereids) Remove useless config option #20905
1. Remove useless config option
2. Fix timeout cancellation. Before this PR, an OlapAnalysisTask would continue running even after it had timed out.
2023-06-20 10:37:46 +08:00
824bc02603 [Function] Support date function: microsecond() (#20044) 2023-06-20 10:32:54 +08:00
0287cc15f2 [fix](meta) 'clean label from db' does not work (#20625)
When a label is used to load data, that label cannot be used a second time. After executing the SQL 'CLEAN LABEL [label] FROM db;', however, the same label should be usable again.
The SQL above did not work. This PR fixes the problem.
2023-06-20 10:25:31 +08:00
d02ecef406 [fix](Nereids): revert push down alias into union (#20991)
Revert #20543 to temporarily avoid a problem.
2023-06-20 09:32:26 +08:00