Commit Graph

9222 Commits

Author SHA1 Message Date
48a2fe68ad [typo](docs) Fix some display errors (#17663)
* [fix](docs) fix some errors in docs
2023-03-11 09:10:48 +08:00
3745e6c18a [fix](Nereids): order of project's logical properties is different with that of project expression (#17648) 2023-03-11 00:26:54 +08:00
051ab7a9c6 [refactor](Nereids): refactor Join-Dependent Predicate Duplication. (#17653) 2023-03-10 22:19:45 +08:00
566d133610 [enhancement](Nereids) Refactor EliminateLimitTest and EliminateFilterTest by match-pattern (#17631) 2023-03-10 21:24:36 +08:00
6dcd791b74 [feature](struct-type) support CAST AS Struct type (#17553)
1. add support `CAST AS Struct` from Struct type;
2. fix crash while `CAST('{}' AS Struct)`;
3. `CAST('' AS complex_type)` should return NULL instead of empty object;

---------

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2023-03-10 21:21:16 +08:00
2739a44eaf [fix](segcompaction) heap overflow when doing segcompaction for cancelling load(#17529)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-03-10 20:52:05 +08:00
47e9217c1e [improvement](pipeline)Avoid duplicate trigger teamcity build (#17647)
* add clickbench and arm pipeline trigger;test merge check


* set compile required; add clickbench and arm pipeline trigger to buildall;

* avoid duplicate builds

* simplify auto trigger code and avoid repeated triggers 0310
2023-03-10 19:56:14 +08:00
948654ad38 [fix](ui)format the text file of profile #17645 2023-03-10 19:54:28 +08:00
9cfa61b402 [Enhancement](HttpServer) Provide authentication interface for BE (#17073)
Add an authentication interface in FE for BE
2023-03-10 16:34:47 +08:00
9ae5ec4dc5 [fix](nereids) PushdownExpressionsInHashCondition contains duplicate column and WindowExpression miss column stats (#17624)
tpcds: q47 and q57
1. PushdownExpressionsInHashCondition: the project contains duplicate columns
2. WindowExpression stats calculation: column stats are missing
2023-03-10 16:08:43 +08:00
365c8eed7e [fix](function) width_bucket should get min and max from each tuple (#17466) 2023-03-10 13:14:12 +08:00
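The `width_bucket` fix (#17466) can be illustrated with a small sketch. It assumes the common SQL convention for `width_bucket` (bucket 0 below the range, bucket `num_buckets + 1` at or above it); the helper names are hypothetical and this is not the Doris implementation. The point is only that `min` and `max` are read per row rather than once for the whole column.

```python
def width_bucket(value, min_value, max_value, num_buckets):
    """Bucket number of `value` in [min_value, max_value) split into equal-width buckets."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    if value < min_value:
        return 0
    if value >= max_value:
        return num_buckets + 1
    # value >= min_value here, so int() truncation equals floor.
    return int((value - min_value) * num_buckets / (max_value - min_value)) + 1


def width_bucket_column(values, mins, maxs, num_buckets):
    """Evaluate row by row, taking min and max from the same row as the value."""
    return [
        width_bucket(v, lo, hi, num_buckets)
        for v, lo, hi in zip(values, mins, maxs)
    ]
```

With identical values but different per-row bounds, the rows land in different buckets, which a single column-wide min/max would not reproduce.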
739e043c8d [fix](publish) add retry publish when succeed replica num less than quorum and transaction not VISIBLE (#17453)
For some reasons, the number of replicas on which a transaction's publish
succeeded can be less than the quorum, so the transaction's status cannot become
VISIBLE. When the last publish failed, the publish task for this replica of this
tablet on this backend needs to be retried until it succeeds, so that the
transaction becomes VISIBLE.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-10 12:02:15 +08:00
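The retry logic described in the publish fix (#17453) above could be sketched as follows. This is a minimal illustration, assuming a majority quorum and a hypothetical `publish(replica)` callback that returns `True` on success; the function names, the `max_rounds` parameter and the `COMMITTED` fallback state are assumptions, not the FE code.

```python
def quorum(replica_num):
    """Majority quorum: more than half of the replicas."""
    return replica_num // 2 + 1


def publish_with_retry(replicas, publish, max_rounds=3):
    """Retry publish on replicas that have not succeeded yet.

    Returns the resulting transaction status and the set of replicas
    on which publish succeeded.
    """
    succeeded = set()
    for _ in range(max_rounds):
        for replica in replicas:
            if replica not in succeeded and publish(replica):
                succeeded.add(replica)
        if len(succeeded) >= quorum(len(replicas)):
            return "VISIBLE", succeeded
    # Quorum never reached: the transaction stays in its previous state.
    return "COMMITTED", succeeded
```

Only the replicas whose last publish failed are retried, so a replica that already succeeded is never published twice.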
a79b8ede88 [Bug](ColumnArray) Fix array column replicate replicate_offsets not matched (#17616)
the input replicate_offsets should be the same size as ColumnArray's offset.
```
IColumn::Offsets replicate_offsets(get_offsets().size(), 0);
// |---------------------|-------------------------|-------------------------|
// [0, begin)             [begin, begin + count_sz)  [begin + count_sz, size())
//  do not need to copy    copy counts[n] times       do not need to copy
```

we should
2023-03-10 11:52:22 +08:00
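The size requirement stated in the ColumnArray fix (#17616) above can be shown with a small model of an array column stored as flat data plus cumulative offsets. This is a sketch under that assumed layout, with hypothetical names, and `replicate_offsets` is treated as cumulative output row counts; it is not the BE implementation.

```python
def replicate_array_column(data, offsets, replicate_offsets):
    """Replicate each array row according to cumulative replicate_offsets.

    Row i spans data[offsets[i-1]:offsets[i]] and is copied
    replicate_offsets[i] - replicate_offsets[i-1] times.
    """
    if len(replicate_offsets) != len(offsets):
        raise ValueError("replicate_offsets must match the size of the array offsets")
    out_data, out_offsets = [], []
    prev_off = prev_rep = 0
    for off, rep in zip(offsets, replicate_offsets):
        row = data[prev_off:off]
        for _ in range(rep - prev_rep):
            out_data.extend(row)
            out_offsets.append(len(out_data))
        prev_off, prev_rep = off, rep
    return out_data, out_offsets
```

For example, rows `[1, 2]` and `[3]` with `replicate_offsets = [2, 3]` yield the first row twice and the second row once.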
Pxl
1a549edac2 [Chore](third-party) upgrade thrift from 0.13 to 0.16 (#17202)
upgrade thrift from 0.13 to 0.16
Thrift's release notes: https://github.com/apache/thrift/blob/master/CHANGES.md
2023-03-10 11:33:16 +08:00
fcd25b53bf [Optimize](Random distribution) Improve the performance of tablet sin… (#17389)
The current distribution model for Doris is as follows:

OlapTableSink separates the original Block into several sub-blocks, one per node (BE), according to the tablet distribution, and distributes the sub-blocks to the storage engines of the backends. The storage engine then separates each sub-block across multiple tablet channels, and each delta writer handles part of the block.

This model causes blocks to be split according to tablets, and the splitting process can be a relatively heavy operation. After splitting, the blocks are distributed to different DeltaWriters (Memtables) through RPCs to TabletChannels. The distribution operation on TabletChannels is also a relatively heavy operation. If the distribution property of the table is RANDOM distribution, then we have the opportunity to distribute the blocks according to the complete block during distribution. The advantage of doing so is to reduce memory copying and improve write locality, similar to appending the entire block to the memtable.

This optimization could save 10% ~ 20% of the CPU cost of loading into a RANDOM distribution table when load_to_single_tablet is enabled.
2023-03-10 10:52:40 +08:00
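The difference between the two distribution paths described in the random-distribution commit (#17389) above can be sketched as below. The function names, the integer-key routing and the round-robin tablet choice are assumptions made for illustration only; the real sink works on Blocks and RPCs rather than Python lists.

```python
from collections import defaultdict


def split_by_tablet(block, num_tablets, key=lambda row: row[0]):
    """Row-by-row routing: every row is copied into a per-tablet sub-block."""
    parts = defaultdict(list)
    for row in block:
        parts[key(row) % num_tablets].append(row)
    return dict(parts)


def send_whole_block(block, num_tablets, batch_index):
    """RANDOM distribution: hand the complete block to one tablet, no per-row copy."""
    return {batch_index % num_tablets: block}
```

In the second path the block object itself is forwarded, which is where the reduced memory copying and better write locality come from.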
f84b8b7c8b [fix](priv) fix extract real user name when do privilege check (#17488)
fix extract real user name of root/admin
2023-03-10 10:22:13 +08:00
fe6361f4b5 [regression-test](p0) fix some unstable p0 cases (#17518)
drop database before create
remove some large, unused debug log
2023-03-10 10:21:39 +08:00
e1bf9411de [feature](array function) add support for array_enumerate_uniq (#17541)
add support for array_enumerate_uniq()
2023-03-10 10:20:49 +08:00
c7aa3f9717 [fix](backup) backup throw NPE when no partition in table (#17546)
If a table has no partition, backup reports an error:

2023-03-06 17:35:32,971 ERROR (backupHandler|24) [Daemon.run():118] daemon thread got exception. name: backupHandler
java.util.NoSuchElementException: No value present
        at java.util.Optional.get(Optional.java:135) ~[?:1.8.0_152]
        at org.apache.doris.catalog.OlapTable.selectiveCopy(OlapTable.java:1259) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.backup.BackupJob.prepareBackupMeta(BackupJob.java:505) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.backup.BackupJob.prepareAndSendSnapshotTask(BackupJob.java:398) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.backup.BackupJob.run(BackupJob.java:301) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.backup.BackupHandler.runAfterCatalogReady(BackupHandler.java:188) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.0-SNAPSHOT]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.0-SNAPSHOT]
2023-03-10 10:19:37 +08:00
4ba93efc98 [Enhance](DOE)Support parse default es iso datetime string (#17412)
* support parse default es iso datetime string
2023-03-10 09:59:20 +08:00
006f7a91ac [fix](planner) should not turn on push agg op when olapscan has conjuncts on it (#17598)
We should not set PushAggOp to any type if the olap scan already has conjuncts on it.
2023-03-10 09:33:08 +08:00
a745ab1703 [fix](schema scanner) fix query some schema table report invalid parameter (#17626)
Example:

SELECT ROUTINE_SCHEMA AS PROCEDURE_CAT, NULL AS PROCEDURE_SCHEM,ROUTINE_NAME AS PROCEDURE_NAME,NULL AS NUM_INPUT_PARAMS,NULL AS NUM_OUTPUT_PARAMS,NULL AS NUM_RESULT_SETS,ROUTINE_COMMENT AS REMARKS,IF(ROUTINE_TYPE = 'FUNCTION', 2,IF(ROUTINE_TYPE= 'PROCEDURE', 1, 0)) AS PROCEDURE_TYPE FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = DATABASE();
ERROR 1105 (HY000): errCode = 2, detailMessage = invalid parameter

This is wrong, and some BI tools cannot work correctly as a result.
2023-03-10 08:52:09 +08:00
08f0170895 [fix](olap) The 'scan key' generated by the 'is null' expression causes incorrect query results (#17569) 2023-03-10 08:51:06 +08:00
c3c7bc4340 [fix](profile) fix profile sort child list exception (#17613) 2023-03-10 08:44:32 +08:00
f9baf9c556 [improvement](scan) Support pushdown execute expr ctx (#15917)
In the past, only simple predicates (slot=const), `and`, `like`, and `or` (bitmap index only) could be pushed down to the storage layer. The scan process was:

1. Read part of the columns first, and compute the row ids with the simple pushed-down predicates.
2. Use the row ids to read the remaining columns and pass them to the scanner, which filters by the remaining predicates.

This PR also pushes the remaining predicates (functions, nested predicates...) from the scanner down to the storage layer for filtering. The scan process becomes:

1. Read part of the columns first, and compute the row ids with the simple pushed-down predicates (same as above).
2. Use the row ids to read the columns needed by the remaining predicates, and apply the pushed-down remaining predicates to reduce the set of row ids again.
3. Use the row ids to read the remaining columns and pass them to the scanner.
2023-03-10 08:35:32 +08:00
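The scan process after the pushdown change (#15917) above can be sketched on a toy columnar table. This is a minimal illustration, assuming columns are plain Python lists, the simple predicate is an equality on one column, and the remaining predicate is an arbitrary function over named columns; all names are hypothetical.

```python
def scan(columns, simple_pred, remaining_pred, output_cols):
    """Three-stage scan with both predicate kinds evaluated in the storage layer.

    columns:        dict of column name -> list of values
    simple_pred:    (column_name, constant) meaning column == constant
    remaining_pred: (list of column names, function over those column values)
    """
    # Stage 1: read only the simple-predicate column and compute row ids.
    col, const = simple_pred
    row_ids = [i for i, v in enumerate(columns[col]) if v == const]

    # Stage 2: read the columns the remaining predicate needs, shrink row ids again.
    pred_cols, fn = remaining_pred
    row_ids = [i for i in row_ids if fn(*(columns[c][i] for c in pred_cols))]

    # Stage 3: read the output columns only for the surviving row ids.
    return [tuple(columns[c][i] for c in output_cols) for i in row_ids]
```

The output columns are materialized only for rows that pass both predicates, instead of for every row that passes the simple one.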
0334cde2b1 [fix](merge-on-write) when if publish and be down, need recalc delete bitmap for MoW (#17617)
* (merge-on-write) when if publish and be down, need recalc delete bitmap for MoW

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

* fix code

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

---------

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-10 07:55:00 +08:00
e80ae0367a [improvement](be) add a name for be jvm (#17595) 2023-03-09 23:27:15 +08:00
4ddd303cfc [Feature-wip](MySQL Load)Support cancel query for mysql load (#17233)
Notice some changes:
1. Support cancel query for mysql load
2. Change the thread pool for mysql load manager.
3. Fix secure path check logic
4. Fix some doc errors
2023-03-09 22:08:26 +08:00
0c48bb4d66 [typo](docs) Fix some misspelled words (#17605) 2023-03-09 21:55:58 +08:00
849b5b7b8f [fix](sequence) fix that the result is wrong when load multiple duplicate keys (#17575) 2023-03-09 20:59:23 +08:00
53bf1271ec [doc](multi-catalog) column type mapping for map&struct types (#17591) 2023-03-09 19:47:11 +08:00
0432ba8b33 [refactor](status) refactor status judgement (#17592) 2023-03-09 17:40:25 +08:00
5d26a12312 [fix](inverted index) fix missing several numeric types for inverted index query (#17359) 2023-03-09 16:34:06 +08:00
4a0361914b [fix](alter inverted index) add or drop inverted index also need change table state to SCHEMA_CHANGE (#17471)
Before this PR, adding or dropping an inverted index did not change the table state, so multiple alter jobs could run at the same time, which may lead to unexpected problems.
2023-03-09 16:33:46 +08:00
310bdb60f4 [chore](maven) Prefer protoc in thirdparty to the one in maven artifacts (#17596)
The prebuilt protoc-gen-grpc-java binary uses glibc on Linux, and the version of glibc that CentOS 6 uses is too old.
2023-03-09 16:21:38 +08:00
49c54e59db [typo](docs) Fix some misspelled words (#17593) 2023-03-09 15:24:41 +08:00
62a03ec24c [feature](regression) add http test action (#17567) 2023-03-09 15:13:04 +08:00
e182e2426f [fix](regression) close p0 fe regression pipline config for avoiding flink load fail (get tableList write lock timeout) (#17573)
This pull request addresses the following problem:
when the FE config sets sys_log_verbose_modules = org.apache.doris, the FE holds the write lock longer. With this config, a stream load fails with this message: ([ANALYSIS_ERROR]errCode = 2, detailMessage = get tableList write lock timeout, tableList=(Table [id=86135, name=flink_connector, type=OLAP]))
2023-03-09 14:18:38 +08:00
6c894be007 [enhancement](Nereids) support decimalv3 and precision derive (#17393) 2023-03-09 14:12:10 +08:00
aaedcf34cf [enhancement](Nereids) refactor costModel framework (#17339)
Refactor the cost-model framework:
1. Use the `Cost` class to encapsulate the double cost
2. Use the `addChildCost` function to calculate the cost with children rather than adding directly

Note: we use the `Cost` class because we want to customize how a child cost is added. Therefore we use `Cost` only where a cost adds a child cost or is added to its parent's cost; elsewhere, such as for `upperbound`, we use a plain double.
2023-03-09 13:58:44 +08:00
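The idea behind the cost-model refactor (#17339) above, wrapping the raw double so that adding a child cost can be customized, might look like the sketch below. The `weight` parameter is a hypothetical customization point added for illustration and is not taken from the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """Wraps a plain float so that child-cost addition has a single entry point."""
    value: float

    def add_child_cost(self, child, weight=1.0):
        """Combine this cost with a child's cost; `weight` is an assumed hook."""
        return Cost(self.value + weight * child.value)


def within_upper_bound(cost, upper_bound):
    """Bounds stay plain floats, since they never take part in child addition."""
    return cost.value <= upper_bound
```

Callers combine costs only through `add_child_cost`, so a change to the combination rule touches one method instead of every `+` in the optimizer.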
e1ea2e1f2c [fix](Nereids) store offset of Limit in exchangeNode (#17548)
When the limit has offset, we should add an exchangeNode and store the offset in it
2023-03-09 13:43:12 +08:00
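Why the offset in the Limit fix (#17548) above belongs in the exchange node can be seen in a small sketch. It assumes each upstream fragment keeps at most `limit + offset` rows locally and that the exchange skips `offset` rows after gathering; this is an illustration of the general technique, not the Nereids plan translation.

```python
def distributed_limit(fragments, limit, offset):
    """Apply LIMIT limit OFFSET offset over rows produced by several fragments."""
    gathered = []
    for fragment in fragments:
        # Each fragment may contribute up to limit + offset rows,
        # because any of them could fall inside the final window.
        gathered.extend(fragment[: limit + offset])
    # The offset is applied once, after the exchange has merged all inputs.
    return gathered[offset : offset + limit]
```

Applying the offset inside every fragment instead would drop rows that belong in the final result.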
2d027282f3 [fix](profile) modify load profile some bugs and docs (#17533)
1. An 'insert into' profile has the 'insert' type, so it cannot be queried by the 'load' type.
2. An 'insert into' profile has no job_id, so it cannot be queried by job_id; all profiles are therefore keyed by query_id.
3. A 'broker load' profile lacks some information, which causes an NPE.
2023-03-09 11:58:40 +08:00
4ef46159ae [vectorized](udaf) support array type for java-udaf (#17351) 2023-03-09 11:30:07 +08:00
06dee69174 [Refactor](map) remove using column array in map to reduce offset column (#17330)
1. remove column array in map
2. add offsets column in map
The aim is to reduce the duplicate offsets of the key array and value array on disk.
2023-03-09 11:22:26 +08:00
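The layout described in the map refactor (#17330) above, one offsets column shared by keys and values, can be modeled as follows. This is a sketch with hypothetical names that uses flat Python lists in place of nested columns.

```python
def map_rows_from_shared_offsets(keys, values, offsets):
    """Rebuild map rows from flat keys/values that share one cumulative offsets list."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    rows, start = [], 0
    for end in offsets:
        rows.append(dict(zip(keys[start:end], values[start:end])))
        start = end
    return rows
```

Because row `i` of the keys and row `i` of the values always have the same length, a second offsets list would only repeat the first.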
368e6a4f9c [Bug](array filter) Fix bug due to ColumnArray::filter_generic invalid inplace size_at after set_end_ptr (#17554)
We should make a new PodArray to add items to instead of doing it in place
2023-03-09 10:59:29 +08:00
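The approach named in the array filter fix (#17554) above, building a fresh output array rather than overwriting the input while it is still being read, is sketched below on a flat-data-plus-offsets model. The names are hypothetical and the real code operates on PodArray buffers.

```python
def filter_array_column(data, offsets, keep):
    """Keep the array rows whose flag in `keep` is true, writing into new lists."""
    new_data, new_offsets = [], []
    start = 0
    for end, flag in zip(offsets, keep):
        if flag:
            # The input is only read; results go to the new output lists.
            new_data.extend(data[start:end])
            new_offsets.append(len(new_data))
        start = end
    return new_data, new_offsets
```

The input lists are left untouched, so sizes read from them stay valid for the whole pass.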
00727e8c11 [fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant (#17570) 2023-03-09 10:59:05 +08:00
Pxl
65b8dfc7ff [Enchancement](function) Inline some aggregate function && remove nullable combinator (#17328)
1. Inline some aggregate function
2. remove nullable combinator
2023-03-09 10:39:04 +08:00
6923bf8d7b [fix](file cache)fix block file cache can't be configured (#17511) 2023-03-09 10:12:08 +08:00
397cc011c4 [fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420)
With the ECB algorithm, block_encryption_mode did not take effect; it only took effect when an init vector was provided.
Solved: 192/256 now support calculation without an init vector

For other algorithms, an error should be reported when there is no init vector

Initialization Vector. The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector.

Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt

Note: This fix does not support smooth upgrades. During the upgrade process, queries may report the error: function not found
2023-03-09 09:51:41 +08:00
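The rule stated in the encryption fix (#17420) above, that ECB needs no initialization vector while the other modes must reject a missing one, can be sketched as a small validation step. The function name and the mode-string parsing are assumptions; no actual encryption is performed here.

```python
MODES_WITHOUT_IV = {"ecb"}


def resolve_init_vector(block_encryption_mode, init_vector=None):
    """Return the IV to use for the given mode, or None when the mode takes no IV.

    Accepts names such as "aes-128-ecb" or "AES_256_CBC" (assumed spellings).
    """
    mode = block_encryption_mode.lower().replace("_", "-").rsplit("-", 1)[-1]
    if mode in MODES_WITHOUT_IV:
        return None  # ECB does not use an initialization vector
    if init_vector is None:
        raise ValueError(f"{mode.upper()} mode requires an initialization vector")
    return init_vector
```

Running the check before cipher setup makes a missing vector fail with a clear error rather than silently changing the algorithm.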
8a6a4b82aa [typo](docs) Add a hyperlink to facilitate user redirect. (#17563) 2023-03-09 09:47:10 +08:00