Commit Graph

10704 Commits

Author SHA1 Message Date
4aad88abc4 [test](Nereids) fix tpcds shape out file #20002 2023-05-24 17:40:13 +08:00
2b3db8f2a8 [Bug](functions) Fix functions for array type with nested decimalv3 (#19993) 2023-05-24 16:51:34 +08:00
ff54b45775 [fix](partial-update) should hold tablet meta lock before calling lookup_row_key() (#19964) 2023-05-24 16:37:27 +08:00
Pxl
3ba7c2336b [Chore](build) change CMAKE_CXX_STANDARD from 17 to 20 #19987 2023-05-24 16:16:42 +08:00
e5eed53b89 [improvement](bitmap) Use shared_ptr in BitmapValue to avoid deep copying (#19101)
Currently, BitmapValue objects are deep-copied between columns, which costs a lot of memory. Use a shared_ptr inside BitmapValue to avoid copying the data.
2023-05-24 16:13:01 +08:00
c730033595 [improvement](exchange) data stream sender stop sending data to receiver if it returns eos early (#19847)
For a broadcast join, only one build fragment instance builds the hash table; the other fragment instances just receive the build-side data and throw it away, which wastes memory and CPU.

This PR improves the situation: the data stream receiver tells the sender that it does not need any more data, and the sender stops sending data to it.
2023-05-24 15:11:32 +08:00
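The early-EOS handshake described in the commit above can be sketched roughly as follows. This is only a minimal Python illustration under stated assumptions, not the Doris implementation: `Receiver`, `Sender`, `add_block`, and `broadcast` are hypothetical names, and the real sender and receiver communicate over RPC rather than direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver; needs_data is False for instances that do not build the hash table."""
    needs_data: bool
    blocks: list = field(default_factory=list)

    def add_block(self, block) -> bool:
        """Accept a block; return True if the receiver is at EOS and wants no more data."""
        if not self.needs_data:
            return True  # report EOS early and discard the block
        self.blocks.append(block)
        return False


class Sender:
    """Hypothetical sender that stops sending to receivers that reported EOS."""

    def __init__(self, receivers):
        self.receivers = receivers
        self.finished = set()  # indexes of receivers that returned EOS
        self.sent = 0          # number of blocks actually transmitted

    def broadcast(self, block):
        for i, receiver in enumerate(self.receivers):
            if i in self.finished:
                continue  # this receiver no longer needs data
            self.sent += 1
            if receiver.add_block(block):
                self.finished.add(i)
```

In this sketch a receiver that does not need the build-side data costs one transmission instead of one per block.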
d0a3cdfe1a [enhancement](error message) print query id when query timeout (#19972)
In regression tests there are many query timeouts, but we do not know the query id, and it is too hard to find the query id in the audit log from the SQL text. So this change prints the query id when a query times out.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-24 14:40:33 +08:00
14b4c7abf9 [fix](hashtable) Check query cancel status during build hash table #19970
The hash table build stage should stop early if the query has been cancelled.
2023-05-24 14:24:03 +08:00
4603a60650 [opt](Nereids) give an easy-to-understand error message when a window function is missing parameters (#19957)
With the new optimizer, if a window function is missing a parameter, no understandable error message is given. This change adds one.
2023-05-24 14:18:22 +08:00
c84fd79051 [regression](nereids) fix tpcds plan shape #19985
skip tpcds 88/16/28/61/85/17/9/50/25/39/29/13/48/64
2023-05-24 14:04:28 +08:00
70f2e8ff80 [fix](nereids)enable decimalv3 by default for nereids (#19906) 2023-05-24 13:36:24 +08:00
f14e6189a9 [feature](load-refactor) Unify mysql load to use InsertStmt (#19571) 2023-05-24 12:09:16 +08:00
b4669eaeba [Improve](complex-type)add switch for array/struct/map nesting complex type (#19928)
Array/map/struct types nested inside each other are not yet supported by many operations in BE. If we do not prohibit this in FE, we will hit a lot of undefined behavior in BE, so this change adds a switch that prohibits nesting complex types. Once nesting is fully supported, the switch can be turned on.
Issue Number: close #xxx
2023-05-24 11:39:53 +08:00
cf7a74f6ec [fix](memory) query check cancel while waiting for memory in Allocator, and optimize log (#19967)
When a query finds in the Allocator that process memory exceeds the limit, it waits up to 5s.
Previously, the Allocator did not check whether the query had been cancelled while waiting for memory, so a cancelled query could not end quickly.
2023-05-24 11:08:48 +08:00
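The wait loop described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Doris Allocator code: `wait_for_memory`, its callbacks, and the polling interval are hypothetical; only the 5s upper bound comes from the commit message.

```python
import time


def wait_for_memory(has_memory, is_cancelled, max_wait_s=5.0, poll_s=0.1,
                    sleep=time.sleep, clock=time.monotonic):
    """Wait until memory is available, the query is cancelled, or the wait times out.

    Returns "ok", "cancelled", or "timeout". The sleep and clock parameters are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + max_wait_s
    while True:
        # Check cancellation first so a cancelled query ends quickly.
        if is_cancelled():
            return "cancelled"
        if has_memory():
            return "ok"
        if clock() >= deadline:
            return "timeout"
        sleep(poll_s)
```

Checking the cancel flag on every iteration is what lets the caller return well before the full wait has elapsed.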
08ec5e2eb5 [fix](function) fix result column is nullable type when fast execute (#19889) 2023-05-24 10:27:50 +08:00
384a0c7aa7 [fix](testcases) Fix some unstable testcases. (#19956)
The case test_string_concat_extremely_long_string exceeds our test limit. Move it to p2 so that it is tested only in the SelectDB test environment.
To stay consistent with MySQL and avoid overflow, q67 must keep its current behavior. Once Nereids and decimalV3 are fully applied, it will be fixed automatically.
In the parallel test, although all query stats are cleaned, cases running in parallel still affect the result, so query_stats_test needs to use a unique table.
test_query_sys_tables did not handle some unstable situations; this is fixed.
Temporarily disable the unstable analyze_test case for p0.
2023-05-24 09:52:02 +08:00
20f63a363d [docs](resource-group) add relevant documents to the resource group (#19941) 2023-05-23 23:36:46 +08:00
2b47282be1 [doc](merge-on-write) add some notes for using MoW (#19968) 2023-05-23 23:33:31 +08:00
a2d660a86c [typo](doc) fix some typo (#19414) 2023-05-23 23:32:39 +08:00
5247e2866f [fix](status) fix function code_as_string to private (#19960)
* rebase

* update format

---------

Co-authored-by: ZI-MA <chime316@qq.com>
2023-05-23 23:27:55 +08:00
e3929820d9 [performance](load) use vector instead of skiplist when insert agg keys (#19099) 2023-05-23 20:11:50 +08:00
a6674bb7b1 [regression](nereids) tpcds sf100 plan shape regression cases (#19913) 2023-05-23 18:48:00 +08:00
35f8fc22f2 [testcase](test) Fix query stats test that may fail (#19958) 2023-05-23 18:33:07 +08:00
a434a49f71 [Bug](decimal) fix mod function (#19925)
Bug:
select id, kdcml * ktint, kdcml / ktint, kdcml % ktint from expr_test order by id;
+------+-------------------+-------------------+-----------------------+
| id   | kdcml * ktint     | kdcml / ktint     | kdcml % ktint         |
+------+-------------------+-------------------+-----------------------+
| NULL |              NULL |              NULL |                  NULL |
|    1 |            24.395 |            24.395 |  -4702111234474983.74 |
|    2 |            68.968 |            17.242 |  -4702111234474983.74 |
|    3 |           146.268 |            16.252 |  -4702111234474983.74 |
|    4 |           275.772 |            17.235 |  -4702111234474983.74 |
|    5 |           487.470 |            19.498 |  -4702111234474983.74 |
|    6 |           827.244 |            22.979 |  -4702111234474983.74 |
|    7 |          1364.860 |            27.854 |  -4702111234474983.74 |
|    8 |          2205.928 |            34.467 |  -4702111234474983.74 |
|    9 |          3509.595 |            43.328 |  -4702111234474983.74 |
|   10 |          5514.790 |            55.147 |  -4702111234474983.74 |
|   11 |          8578.988 |            70.900 |  -4702111234474983.74 |
|   12 |         13235.484 |            91.913 |  -4702111234474983.74 |
|   13 |            24.395 |            24.395 |  -4702111234474983.74 |
|   14 |            68.968 |            17.242 |  -4702111234474983.74 |
|   15 |           146.268 |            16.252 |  -4702111234474983.74 |
|   16 |           275.772 |            17.235 |  -4702111234474983.74 |
|   17 |           487.470 |            19.498 |  -4702111234474983.74 |
|   18 |           827.244 |            22.979 |  -4702111234474983.74 |
|   19 |          1364.860 |            27.854 |  -4702111234474983.74 |
|   20 |          2205.928 |            34.467 |  -4702111234474983.74 |
|   21 |          3509.595 |            43.328 |  -4702111234474983.74 |
|   22 |          5514.790 |            55.147 |  -4702111234474983.74 |
|   23 |          8578.988 |            70.900 |  -4702111234474983.74 |
|   24 |         13235.484 |            91.913 |  -4702111234474983.74 |
+------+-------------------+-------------------+-----------------------+
2023-05-23 18:24:31 +08:00
2596d68424 [fix](schema change) Change table state to NORMAL by SchemaChangeJob instead of SchemaChangeHandler (#19838)
Problem fixed:
Suppose there is an unfinished schema change job (job-2), and another schema change job (job-1) on the same table finished before it.
When FE restarts, it replays the edit log (pending log and waiting_txn log) for job-2, and the table's state is set to SCHEMA_CHANGE. However, loadAlterJob, which runs after replayJournal, adds job-1 to the schema change handler, and running job-1 sets the table to NORMAL because job-1 is done. At this point job-2 is in runWaitingTxnJob, which checks the table's state and throws an exception when the check fails, leaving the job's state unchanged; the job cannot be cancelled either, because the table is not under schema change.
2023-05-23 18:23:12 +08:00
c246c22b23 [doc](multi-catalog)Supplementary FAQ (#19911)
* catalog doc

* catalog doc
2023-05-23 18:22:10 +08:00
8b184cc5ef [bug](compile) fix fe compile error #19946
Fix a version conflict for the grpc-core package in the FE Maven build.
2023-05-23 18:20:48 +08:00
bfafa5e19d [Chore](docs)Remove empty pages to avoid ambiguity (#19953)
* [Chore](docs)Remove empty pages to avoid ambiguity

* update sidebar
2023-05-23 18:20:19 +08:00
c88ba85e10 [Bug](schema-change) fix varchar can not change to datev2 #19952 2023-05-23 18:18:55 +08:00
6efe6ef6e8 [Enhancement](scanner) allocate blocks in scanner_context on demand and free them on close (#19389)
First, to reduce memory usage, blocks are no longer pre-allocated; a block is allocated lazily when the upper layer calls get_free_block. When the upper layer calls return_free_block, the block is added to a queue for reuse, and the queued blocks are freed when the scanner_context is closed rather than when it is destructed.
Second, to limit the scanner's memory usage, a variable _free_blocks_capacity is introduced to indicate the current number of free blocks available to the scanners. The number of scanners that can be scheduled is calculated from this value.

ssb flat test
previous
lineorder 1.2G:
load time: 3s, query time: 0.355s
lineorder 5.8G:
load time: 330s, query time: 0.970s
load time: 349s, query time: 0.949s
load time: 349s, query time: 0.955s
load time: 360s, query time: 0.889s (pipeline enabled)
after
lineorder 1.2G:
load time: 3s, query time: 0.349s
lineorder 5.8G:
load time: 342s, query time: 0.929s
load time: 337s, query time: 0.913s
load time: 345s, query time: 0.946s
load time: 346s, query time: 0.865s (pipeline enabled)
2023-05-23 18:17:21 +08:00
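The lazy allocation and reuse scheme described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Doris scanner_context. The method names get_free_block and return_free_block and the variable _free_blocks_capacity come from the commit message; everything else is hypothetical. In particular, the commit does not say how the number of schedulable scanners is derived, so the `blocks_per_scanner` division below is purely an assumption.

```python
from collections import deque


class ScannerContext:
    """Hypothetical stand-in for scanner_context; a block is modeled as a plain list."""

    def __init__(self, free_blocks_capacity, blocks_per_scanner=1):
        self._free_blocks = deque()                    # returned blocks kept for reuse
        self._free_blocks_capacity = free_blocks_capacity
        self._blocks_per_scanner = blocks_per_scanner  # assumption, not from the commit
        self.allocated = 0                             # number of blocks ever allocated

    def get_free_block(self):
        """Reuse a queued block if one exists, otherwise allocate lazily."""
        if self._free_blocks:
            return self._free_blocks.popleft()
        self.allocated += 1
        return []

    def return_free_block(self, block):
        """Clear the block and queue it for reuse instead of freeing it."""
        block.clear()
        self._free_blocks.append(block)

    def schedulable_scanners(self):
        """Assumed formula: capacity divided by blocks needed per scanner."""
        return self._free_blocks_capacity // self._blocks_per_scanner

    def close(self):
        """Free all queued blocks on close rather than on destruction."""
        self._free_blocks.clear()
```

Nothing is allocated until the first get_free_block call, and a returned block is handed out again before any new allocation happens.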
6f511ac859 [fix](s3)fix s3 resource check (#19933)
fix s3 resource check:

ERROR 1105 (HY000): Unexpected exception: org.apache.doris.common.DdlException: errCode = 2, detailMessage = Missing [AWS_ACCESS_KEY] in properties.

We should use the new properties to check S3 availability.
2023-05-23 16:20:07 +08:00
7247ac9b75 [fix](Nereids) join reorder lead to circle in memo (#19935)
If a join is the root node, then after some join reordering, the root Group in the Memo will contain a GroupExpression whose plan is a LogicalProject and whose child is its own ownerGroup.
This PR adds a rewrite rule that ensures there is a Project on top of the topmost Join of the plan, to avoid a cycle in the Memo.
2023-05-23 15:22:32 +08:00
14de2a5c0e [test](Nereids) enable all query in tpcds empty table test (#19945) 2023-05-23 14:40:21 +08:00
ebe3f6ec42 [refactor](routineload)Refactored routineload to improve scalability (#19834)
- Sink the data source parameters into the specific data source classes
- Simplify some code logic to reduce code complexity
- Provide a data source factory class to extract common logic
- Remove test-only code from production code. Production code should not include code written for testing purposes.
2023-05-23 14:05:47 +08:00
cae8161dd8 [test](lazy open) add lazy open test (#19896) 2023-05-23 14:03:15 +08:00
c0ad588801 [enhancement](page cache) use separate pk index cache (#19864) 2023-05-23 14:02:12 +08:00
da66a64e09 [fix](merge-on-write) return error st if check_pk_in_pre_segments failed (#19736) 2023-05-23 11:04:15 +08:00
8d5269542b [fix](testcase) fix wrong use of same table in different case 2023-05-23 10:24:33 +08:00
4398b91576 [Fix](multi catalog)Change all partition names to lower case (#19816)
An Iceberg table partition name may contain upper-case characters, for example: City=xxx, Nation=xxx.
But in Doris, all column names are lower case. Here we convert the partition names to lower case to keep them consistent with the column names.
2023-05-23 09:31:31 +08:00
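The normalization described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Doris FE code: `normalize_partition_names` is a hypothetical helper, and the `name=value` segments joined by `/` are an assumed path layout. Only the names are lower-cased; the values are left untouched.

```python
def normalize_partition_names(partition: str) -> str:
    """Lower-case the name part of each name=value segment, keeping values as-is."""
    parts = []
    for segment in partition.split("/"):
        # partition() splits on the first "=", so values containing "=" stay intact.
        name, sep, value = segment.partition("=")
        parts.append(name.lower() + sep + value)
    return "/".join(parts)
```

For example, `normalize_partition_names("City=Beijing/Nation=CN")` gives `"city=Beijing/nation=CN"`.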
633989c78e [fix](Nereids): commute non-inner join for DPHyp (#19929) 2023-05-23 09:30:50 +08:00
d2e3fb097b [docs](struct-map-type) update user docs for struct and map type (#19939)
We have already supported decimalV3.
2023-05-23 09:30:26 +08:00
bd74890cf7 [fix](multi-catalog) JDBC Catalog Unknown UNSIGNED type of mysql, type: [DOUBLE] (#19912) 2023-05-23 09:29:57 +08:00
fe111207a9 [Fix](lazy_open) Fix lazy open null point (#19829) 2023-05-23 09:17:46 +08:00
30417e06d4 [enhance](fs) use bvar to monitor s3 file reader& writer (#19607)
remove useless prefix
2023-05-22 23:01:21 +08:00
3dcdadcea6 [Improvement](function) support decimalv3 for function least and greatest (#19931) 2023-05-22 22:48:44 +08:00
53ba46e404 [Fix][Refactor] Fix 'not member call on null pointer of type 'doris::TextConverter' error in ubsan env and refactor text converter. (#19849)
Fix 'not member call on null pointer of type doris::TextConverter' error in ubsan env and refactor text converter.
2023-05-22 21:00:19 +08:00
6762af3c9b [Improve](struct)improve struct support into outfile (#19894)
support select into outfile for struct type
2023-05-22 18:45:56 +08:00
Pxl
9945067e3c [Bug](function) make VcompoundPred optimization work well (#19870)
make VcompoundPred optimization work well
PR #19818 tried to enable the VcompoundPred optimization but produced wrong results on tpcds q28.
The reason is that some nullable logic in MySQL needs special handling.

mysql [regression_test_tpcds_sf1_p1]>select null and false;
+----------------+
| NULL AND FALSE |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null and true;
+---------------+
| NULL AND TRUE |
+---------------+
| NULL          |
+---------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null or false;
+---------------+
| NULL OR FALSE |
+---------------+
| NULL          |
+---------------+
1 row in set (0.00 sec)

mysql [regression_test_tpcds_sf1_p1]>select null or true;
+--------------+
| NULL OR TRUE |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)
2023-05-22 18:32:17 +08:00
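The four MySQL results in the commit above follow SQL three-valued logic, which can be sketched as follows. This is a minimal Python illustration, not the VcompoundPred code: `sql_and` and `sql_or` are hypothetical helpers, with `None` standing in for NULL and `True`/`False` for 1/0.

```python
def sql_and(a, b):
    """Three-valued AND: FALSE dominates, then NULL, otherwise TRUE."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def sql_or(a, b):
    """Three-valued OR: TRUE dominates, then NULL, otherwise FALSE."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

So a short-circuit optimization can stop at a FALSE operand for AND or a TRUE operand for OR, but it cannot treat a NULL operand as if it decided the result.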
7ae3724acd [typo](doc)fix typo in readme #19895
miliseconds -> milliseconds
2023-05-22 18:31:19 +08:00
750a3ea1b4 [doc](fqdn)broker fqdn doc #19910 2023-05-22 18:30:57 +08:00