12441 Commits

Author SHA1 Message Date
7890e464ee [fix](case) 1. disable unstable case window_function 2. add sync after stream load (#22677)
* Update test_dup_tab_auto_inc_with_null.groovy, add sync after stream load

* Update test_unique_table_auto_inc.groovy, add sync after stream load

* [fix](case) disable unstable case window_function
2023-08-09 11:03:31 +08:00
4608dcb2d9 [fix](agg) fix coredump caused by push down count aggregation (#22699)
fix coredump caused by push down count aggregation
2023-08-09 10:21:20 +08:00
a778027569 [typo](docs)Modified description of JSON /String size (#21694) 2023-08-09 10:00:25 +08:00
508cbe030b [Chore](binlog) Refactor TABLET_MISSING in ingest_binlog && set_tstatus (#22727)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-09 09:58:54 +08:00
d3baac2952 [improvement](resource-tag) Add Backend tag location check (#22670)
Add Backend tag location check.
This prevents users from setting an invalid backend tag, which causes table creation and dynamic partition creation to fail.
For example, the default tag for all backends is default. When setting the replication_allocation of a table, a user can run the following command: ALTER TABLE example_db.mysql_table SET ("replication_allocation" = "tag.location.tag1: 1");. The command succeeds even though tag1 does not exist, so dynamic partitions can no longer be created.
2023-08-09 00:08:34 +08:00
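The check described in commit d3baac2952 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Doris FE implementation: the helper names `parse_replication_allocation` and `find_missing_tags` are hypothetical, and the parsing covers only the `tag.location.<tag>: <n>` form shown in the commit message.

```python
from __future__ import annotations


def parse_replication_allocation(value: str) -> dict[str, int]:
    """Parse 'tag.location.<tag>: <n>, ...' into {tag: replica_count}.

    Hypothetical helper for illustration only.
    """
    prefix = "tag.location."
    result: dict[str, int] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        key, _, count = part.partition(":")
        key = key.strip()
        if not key.startswith(prefix):
            raise ValueError(f"unexpected key: {key!r}")
        result[key[len(prefix):]] = int(count.strip())
    return result


def find_missing_tags(allocation: str, backend_tags: set[str]) -> list[str]:
    """Return the tags requested by the allocation that no backend carries."""
    wanted = parse_replication_allocation(allocation)
    return sorted(tag for tag in wanted if tag not in backend_tags)


# All backends carry only the tag "default", so "tag1" is reported as missing
# and the ALTER TABLE could be rejected up front.
missing = find_missing_tags("tag.location.tag1: 1", {"default"})
```

Rejecting the statement when `missing` is non-empty surfaces the mistake at ALTER TABLE time instead of later, when dynamic partition creation fails.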
4359089b9c [fix](delete-pred) fix special char in delete sub condition #22667
Some users' delete conditions contain special characters such as '$', which caused parsing of the delete condition to fail.
2023-08-09 00:04:26 +08:00
7bfcee6e71 [improvement](variable) add annotations for variables (#22292) 2023-08-08 22:16:42 +08:00
124c1b16cf [performance](load) remove unnecessary lock in TabletsChannel::add_batch (#22703)
This lock was introduced by lazy open in #18874.
It's unnecessary and costly to hold a lock while writing data to DeltaWriter in the first place.

However, since lazy open is reverted in #21821, we can completely omit this lock.
_tablet_writers is not supposed to be changed once we've reached TabletsChannel::add_batch.
2023-08-08 22:08:21 +08:00
9581d2b4eb [refactor](load) split memtable writer out of delta writer (#21892) 2023-08-08 22:02:42 +08:00
19f8264076 Refactor be CMakeLists BUILD_INDEX_TOOL && BUILD_META_TOOL, MAKE_TEST (#22730)
option

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-08 20:51:10 +08:00
30ceb7aea7 [fix](chore] need to remove reference in assert_cast (#22706) 2023-08-08 20:36:05 +08:00
897db3cca5 [Chore](inverted index) refine log in DorisCompoundDirectory::FSIndexOutput (#22716) 2023-08-08 20:32:31 +08:00
b5d7e6e7d8 [improvement](stats) Add lifecycle hooks to AnalysisTask to make codes more clear (#22658) 2023-08-08 19:06:47 +08:00
a04e30d087 [Fix](Job)Fix Job schedule calculation start time (#22707)
Because the schedule is computed with integer division, the first execution time can deviate from the expected time when no start time is specified.

For example, suppose it is now the 7th minute and the job runs every two minutes.
The first execution is then computed as the 8th minute, because 7/2=3 and
3+1=4, so the computed slot is 4*2=8.
Ideally, the first execution should be at the 9th minute.
2023-08-08 18:30:38 +08:00
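The arithmetic in commit a04e30d087 can be reproduced with a short sketch. Both function names are hypothetical, and `first_run_from_now` is only one assumed way to obtain the expected 9th minute; the commit message does not say how the actual fix computes it.

```python
def first_run_by_division(now: int, interval: int) -> int:
    """Division-based calculation: round down to a cycle index, then take the next cycle."""
    return (now // interval + 1) * interval


def first_run_from_now(now: int, interval: int) -> int:
    """Assumed alternative: count one full interval from the current time."""
    return now + interval


# The example from the commit message: minute 7, cycle of 2 minutes.
by_division = first_run_by_division(7, 2)  # (7 // 2 + 1) * 2
from_now = first_run_from_now(7, 2)        # 7 + 2
```

The division-based version snaps the first run to a multiple of the interval, which is why it lands one minute after "now" rather than a full cycle later.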
50dd318183 [style](jdbc catalog) Tidy the jdbc catalog java file directory (#22691) 2023-08-08 18:21:21 +08:00
f2dca848db [chore](Nereids): optimize to handle enforcer in MergeGroup() (#22709) 2023-08-08 16:56:34 +08:00
edd36fe86b [Chore](tablet) Remove unused BaseTablet::is_memory (#22688)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-08 16:42:59 +08:00
f2731185c9 [fix](memory) fix cache clean thread (#22472)
fix page cache update last visit time.
fix cache clean thread
2023-08-08 15:38:29 +08:00
0f15d86c43 [fix](nereids) decimalv2 and float like type's common type should be consistant with old planner in arithmetic expr (#22654)
When an arithmetic expression contains both a decimalv2 type and a float-like type, the common type depends on the roundPreciseDecimalV2Value session variable. If it is true, the common type is DecimalV2Type.SYSTEM_DEFAULT; otherwise it is double.
2023-08-08 15:22:04 +08:00
8ef38637ae [docs](docs) Rename Title and URL of Date Functions (#22686) 2023-08-08 14:44:05 +08:00
0c972288ef [docs](docs)Rename Title and URL of Array Functions for SEO (#22669) 2023-08-08 14:32:05 +08:00
1d2046de64 [docs](docs)Rename Title of zh-CN Docs (#22662) 2023-08-08 14:31:28 +08:00
66784cef71 [Enhancement](Load) Stream Load using SQL (#22509)
This PR was originally #16940 by @Cai-Yao, which has not been updated for a long time. For now, we merge part of that code into master first.

Thanks @Cai-Yao @yiguolei
2023-08-08 13:49:04 +08:00
c4def9db5c [feature](Nereids): add enforcers in Group (#22660) 2023-08-08 13:39:55 +08:00
1617368ee1 [fix](planner) fix bug of push constant conjuncts through set operation node (#22695)
When pushing a constant conjunct down into a set operation node, we should assign the conjunct to the agg node if there is one. This is consistent with pushing a constant conjunct into an inline view.
2023-08-08 12:25:42 +08:00
d77b77a33f [feature](Nereids) eliminate sort that is not directly below result sink (#22550)
eliminate sort that is not directly below result sink.
TODO:
handle select c1 + c2 from (select c1, c2 from t order by c1) v;
2023-08-08 11:19:10 +08:00
91b15183e7 [enhance][external]enhance and fix external cases 0807 (#22689)
enhance and fix external cases 0807
2023-08-08 10:53:08 +08:00
e578e1e6a2 [opt](Nereids) turnoff pipeline when dml temporary (#22693)
Pipeline does not yet work well for DML.
2023-08-08 10:26:40 +08:00
22cbf43b14 [Improvement](binlog) Add full/incr engine clone with binlog (#22678)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-08 10:03:11 +08:00
36cea89c22 [Fix](planner)support delete conditions contain non-key columns and add check in analyze phase for delete. (#22673) 2023-08-07 21:49:53 +08:00
c9dc715c5d [fix](broker-load) fix error when using multi data description for same table in load stmt (#22666)
For a load request, there are two tuples on the scan node: the input tuple and the output tuple.
The input tuple is used for reading the file, and it is converted to the output tuple based on the user-specified column mappings.

Broker load supports different column mappings in different data descriptions for the same table (or partition).
So across scanners, the output tuples are the same but the input tuples can differ.

The previous implementation saved the input tuple at the scan node level, so different scanners used the same input tuple,
which is incorrect.
This PR removes the input tuple from the scan node and saves it in each scanner.
2023-08-07 20:03:03 +08:00
f074909d3c [opt](Nereids) disable strict consistency dml by default temporary (#22672)
TODO:
1. optimize exchange performance
2. let table sink do merge on one replica
2023-08-07 19:38:35 +08:00
d1a2473944 [Feature](broker)Support GCS (#20904) 2023-08-07 19:37:18 +08:00
4b20f62f79 [community](github) remove Latest-Master-Code-Check (#22668) 2023-08-07 19:25:20 +08:00
77e772e103 [enhancement](config) add some pre-process and pre-check for BE storage config attentions in docs (#22486) 2023-08-07 18:16:57 +08:00
bc697ca9d6 [fix](time) fix error in time_to_sec 2023-08-07 17:33:24 +08:00
d9c93aaa1c [fix](regression) fix failed test delete_p0 in branch-2.0 #22652 2023-08-07 16:42:19 +08:00
f036cdfde6 [feature](compaction) support delete in cumulative compaction (#19609) 2023-08-07 15:22:21 +08:00
c82b9bd76b [test](pipline) exclude case test_doris_jdbc_catalog (#22664) 2023-08-07 15:13:34 +08:00
Pxl
591aee528d [Bug](exchange) change BlockSerializer from unique_ptr to object (#22653)
change BlockSerializer from unique_ptr to object
2023-08-07 14:47:21 +08:00
9c91e80b0c [feature](Nereids): pushdown COUNT(*) through join (#22545) 2023-08-07 12:53:27 +08:00
97adbaadb9 fix full auto analyze (#22650) 2023-08-07 11:41:38 +08:00
c31226b144 [refractor](regression-test) sort out test cases of external tables (#22640)
Sort out the test cases of external tables.
After this change, there are two directories:

1. `external_table_p0`: all p0 cases of external tables: hive, es, jdbc and tvf
2. `external_table_p2`: all p2 cases of external tables: hive, es, mysql, pg, iceberg and tvf

This lets us run them with a one-line command such as:

```
sh run-regression-test.sh --run -d external_table_p0,external_table_p2
```
2023-08-07 11:12:30 +08:00
0ca0c162b1 [fix][load] fix memtable reset cause nullptr (#22577) 2023-08-07 10:45:09 +08:00
023815a4b4 [fix](planner)runtime filter shouldn't be pushed through window function node (#22501) 2023-08-07 09:57:12 +08:00
af8774c2e6 [Test](function) not unpack when else column is const null in IF function (#22419) 2023-08-07 09:34:48 +08:00
1847e440b2 [fix](memory) enable Jemalloc arena dirty pages (#22639)
A core dump here may mask the real stack. If the stack trace indicates heap corruption
(which led to invalid jemalloc metadata), such as a double free or use-after-free in the application,
try sanitizers such as ASAN, or build jemalloc with --enable-debug to investigate further.
2023-08-06 19:18:44 +08:00
1a8a1e5b16 [Feature](count_by_enum) support count_by_enum function (#22071)
count_by_enum(expr1, expr2, ... , exprN);

Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
2023-08-06 16:05:14 +08:00
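The counting described for count_by_enum in commit 1a8a1e5b16 can be illustrated roughly as below. This is a sketch of the semantics only: the function name `count_by_enum_sketch` and the result keys (`counts`, `non_null`, `null`) are assumptions for this example and are not claimed to match the function's real output format.

```python
from collections import Counter


def count_by_enum_sketch(*columns):
    """For each column, count occurrences of each distinct non-null value,
    plus the number of non-null and null entries."""
    results = []
    for values in columns:
        non_null = [v for v in values if v is not None]
        results.append({
            "counts": dict(Counter(non_null)),    # occurrences per distinct value
            "non_null": len(non_null),            # entries that are not None
            "null": len(values) - len(non_null),  # entries that are None
        })
    return results


# One column with three non-null values and one null.
summary = count_by_enum_sketch(["M", "F", "M", None])
```

Passing several columns yields one summary per column, mirroring the `expr1, expr2, ... , exprN` argument list.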
c2c01825c1 [opt](stacktrace) Optimize stacktrace output #22467 2023-08-06 15:53:53 +08:00
d628baba0a [improvement](hdfs) support hedged read (#22634)
In some cases, high load on HDFS can make reads from HDFS take a long time,
which slows down the overall query. The HDFS client provides Hedged Read:
when a read request has not returned after a certain threshold, it starts another
read thread for the same data, and the result of whichever read returns first is used.

For example:

create catalog regression properties (
    'type'='hms',
    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
    'dfs.client.hedged.read.threadpool.size' = '128',
    'dfs.client.hedged.read.threshold.millis' = "500"
);
2023-08-06 14:51:48 +08:00
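The hedged read idea from commit d628baba0a can be sketched at application level as follows. This is only an illustration of the technique: the HDFS client implements hedged reads internally and is configured through the `dfs.client.hedged.read.*` properties shown in the catalog example. The function `hedged_read` and its parameters are hypothetical.

```python
import concurrent.futures as cf
import time


def hedged_read(read_fn, threshold_s, pool):
    """Start one read; if it has not finished within threshold_s seconds,
    start a second identical read and return whichever finishes first.

    An exception raised by the read that finishes first is propagated.
    """
    first = pool.submit(read_fn)
    try:
        return first.result(timeout=threshold_s)
    except cf.TimeoutError:
        pass  # first read is slow: hedge with a second attempt
    second = pool.submit(read_fn)
    done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
    return done.pop().result()
```

A usage example would pass a thread pool sized like `dfs.client.hedged.read.threadpool.size` and a threshold like `dfs.client.hedged.read.threshold.millis` converted to seconds. The slower read is not cancelled in this sketch; it simply finishes in the background and its result is discarded.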