10815 Commits

Author SHA1 Message Date
91dae8a5b6 [FIX](mysql_writer) fix mysql output binary object works (#20154)
* fix struct_export out data

* fix mysql writer output with binary true
2023-05-29 16:53:33 +08:00
cc47ee480c [feat](stats) delete data size stat and Made task timeout configurable (#20090)
1. Delete the stats for data size, since collecting them costs too much time and they are of little use
2. Make the task timeout configurable, since it is common to analyze a very large table for which the default 10 min is not suitable
2023-05-29 16:40:59 +08:00
55ccddb62c [Conf](decimalv3) enable decimalv3 by default 2023-05-29 15:38:31 +08:00
Bin
d7e0a52bde [typo](doc)correct the misspelled word and the improper word (#20149) 2023-05-29 15:07:30 +08:00
500995c442 [Fix](multi catalog)Fix Iceberg table missing column unique id bug (#20152)
This PR fixes the bug introduced by PR #19909.
That bug failed to set the column unique id for Iceberg tables, which caused query results for Iceberg tables to be all NULL.

```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1   | k2   | k3   | city    |
+------+------+------+---------+
| NULL | NULL | NULL | Beijing |
+------+------+------+---------+
1 row in set (0.60 sec)
```
After fix:
```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1   | k2   | k3   | city    |
+------+------+------+---------+
|    1 | k2_1 | k3_1 | Beijing |
+------+------+------+---------+
1 row in set (0.35 sec)
```
2023-05-29 15:04:12 +08:00
Pxl
8376e5eefb [Chore](build) add non-virtual-dtor, remove no-embedded-directive/no-zero-length-array (#20118)
add non-virtual-dtor, remove no-embedded-directive/no-zero-length-array
2023-05-29 14:42:47 +08:00
Pxl
bbb3af6ce6 [Feature](agg_state) support agg_state combinators (#19969)
support agg_state combinators state/merge/union
2023-05-29 13:07:29 +08:00
f217e052d3 [fix](dynamic_partition) fix dynamic partition not work when drop and recover olap table (#19031)
When dynamic partitioning is enabled on an OLAP table and the table is dropped and then recovered, the table should be added to DynamicPartitionScheduler again.

---------

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2023-05-29 13:02:10 +08:00
8378ab5e41 [Fix](inverted index) fix memory leak when inverted index writer does not finish correctly (#20028)
* [Fix](inverted index) fix memory leak when inverted index writer does not finish correctly

* [Update](inverted index) use smart pointer to avoid memory leak

* [Chore](format) code format

---------

Co-authored-by: airborne12 <airborne12@gmail.com>
2023-05-29 12:18:14 +08:00
a86134cb39 [fix](executor) Fixed an error with cast as time. #20144
Before:

mysql [(none)]>select cast("10:10:10" as time);
+-------------------------------+
| CAST('10:10:10' AS TIMEV2(0)) |
+-------------------------------+
| 00:00:00                      |
+-------------------------------+
After:

mysql [(none)]>select cast("10:10:10" as time);
+-------------------------------+
| CAST('10:10:10' AS TIMEV2(0)) |
+-------------------------------+
| 10:10:10                      |
+-------------------------------+
In the past, we supported this syntax.

mysql [(none)]>select cast("2023:05:01 13:14:15" as time);
+------------------------------------------+
| CAST('2023:05:01 13:14:15' AS TIMEV2(0)) |
+------------------------------------------+
| 13:14:15                                 |
+------------------------------------------+
However, "10:10:10" is also a valid datetime.

mysql [(none)]>select cast("10:10:10" as datetime);
+-----------------------------------+
| CAST('10:10:10' AS DATETIMEV2(0)) |
+-----------------------------------+
| 2010-10-10 00:00:00               |
+-----------------------------------+
So here, the order of parsing has been adjusted.
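
The effect of the parsing order can be illustrated with a minimal Python sketch. It is not Doris's implementation: the function name `cast_as_time`, the `time_first` flag, and the format strings are assumptions chosen only to reproduce the results shown in this commit message.

```python
from datetime import datetime

# Hypothetical formats, chosen only to mirror the examples in this commit message.
TIME_FORMAT = "%H:%M:%S"
DATETIME_FORMATS = ("%Y:%m:%d %H:%M:%S", "%y:%m:%d")


def _try_parse(text, formats):
    """Return the first successful parse of text, or None if no format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def cast_as_time(text, time_first=True):
    """Extract a time of day from text.

    time_first=True tries the plain time format before the datetime formats
    (the adjusted order); time_first=False tries the datetime formats first.
    """
    order = [(TIME_FORMAT,), DATETIME_FORMATS]
    if not time_first:
        order.reverse()
    for formats in order:
        parsed = _try_parse(text, formats)
        if parsed is not None:
            return parsed.strftime("%H:%M:%S")
    return None
```

In this sketch, `cast_as_time("10:10:10")` returns `10:10:10`, while `cast_as_time("10:10:10", time_first=False)` reads the input as the date 2010-10-10 and returns `00:00:00`. The input `"2023:05:01 13:14:15"` returns `13:14:15` in either order.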
2023-05-29 12:17:21 +08:00
9f8de89659 [refactor](exec) replace the single pointer with an array of 'conjuncts' in ExecNode (#19758)
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.

By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.

This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
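
The idea can be sketched in a few lines of Python. This is only an illustration of keeping conjuncts in an array, not the actual ExecNode code: the class `ExecNodeSketch` and its methods are hypothetical names, and each conjunct is modelled as a plain predicate function over a row.

```python
class ExecNodeSketch:
    """Hypothetical stand-in for an execution node that filters rows."""

    def __init__(self, conjuncts=None):
        # Each conjunct is an independent predicate; a row passes only if
        # every predicate in the list returns True (logical AND).
        self.conjuncts = list(conjuncts or [])

    def add_runtime_filter(self, predicate):
        # Adding a runtime filter is a simple append; no expression tree
        # has to be rebuilt or merged.
        self.conjuncts.append(predicate)

    def filter_rows(self, rows):
        return [row for row in rows if all(c(row) for c in self.conjuncts)]


node = ExecNodeSketch([lambda row: row["k1"] > 0])
node.add_runtime_filter(lambda row: row["k2"] in {"a", "b"})
rows = [{"k1": 1, "k2": "a"}, {"k1": 1, "k2": "z"}, {"k1": -1, "k2": "b"}]
kept = node.filter_rows(rows)
```

Here `kept` contains only the first row, since it is the only one that satisfies both the original conjunct and the appended runtime filter.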
2023-05-29 11:47:31 +08:00
970efdc1cb [Feature](Nereids) support advanced materialized view (#19650)
Extend the functionality of advanced materialized views.

This feature is already supported by the legacy planner with PR #19650

This PR implements it in Nereids, with the following features:
1. Support multiple columns in aggregate function.  eg: select sum(c1 + c2) from t1;
2. Supports complex expressions.  eg: select abs(c1), sum(abc(c1+1) + 1) from t1;

TODO:
1. Support adding a WHERE clause in materialized views
2023-05-29 10:37:44 +08:00
859b03dfdf [Improvement](topn) prevent memory usage of key topn increasing unlimited (#19978) 2023-05-29 10:16:15 +08:00
e0d9f7f955 [enhancement](load) add some profile items for load (#20141) 2023-05-29 09:54:03 +08:00
344ca112af [fix] (clone) fix drop biggest version replica during rebalance step (#20107)
* add a check for the rebalancer choosing a deleted replica

* improve a comparison
2023-05-29 09:00:51 +08:00
42239d635a [fix](tablet_manager_lock) fix create tablet timeout #20067 (#20069) 2023-05-28 23:05:13 +08:00
a5d73d47b6 [security] Don't print password in BaseController (#18862) 2023-05-28 22:49:18 +08:00
4573ee9a49 [enhance](PrefetchReader) abort load task when data size returned by S3 is smaller than requested (#19947)
We encountered a confusing situation in which the buffered reader was trapped in an endless loop when calling readat. We then found that the cause was that the returned data size was smaller than requested.
In the observed case, the actual data size was about 2 MB, but the call to readat retrieved only about 1 MB.
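
The check described above can be sketched as follows. This is a simplified Python illustration, not the PrefetchReader code: `read_range`, `ShortReadError`, and the `read_at(offset, size)` callback are hypothetical names introduced for the example.

```python
class ShortReadError(IOError):
    """Raised when the backend returns fewer bytes than were requested."""


def read_range(read_at, offset, size):
    """Read exactly size bytes at offset, or fail instead of retrying forever.

    read_at is a hypothetical callback with the shape read_at(offset, size)
    that returns a bytes object, possibly shorter than size.
    """
    data = read_at(offset, size)
    if len(data) < size:
        # Abort rather than loop: a short read would otherwise never
        # satisfy the caller's request.
        raise ShortReadError(
            f"requested {size} bytes at offset {offset}, got {len(data)}"
        )
    return data


def truncated_backend(offset, size):
    # Simulates an object store that returns only half of the requested range.
    return b"x" * (size // 2)


try:
    read_range(truncated_backend, 0, 2048)
    aborted = False
except ShortReadError:
    aborted = True
```

With the simulated backend, `aborted` ends up `True`: the caller sees an error immediately instead of spinning on a request that can never be fulfilled.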
2023-05-28 21:48:17 +08:00
5f9c6e076f [Fix](load)Make insert timeout accurate in show load statistics (#20068) 2023-05-28 21:19:06 +08:00
13c80bdb10 [chore](toolchain) change doris default toolchain to clang (#20146)
GCC is very slow during build and link. Change to clang as we discussed many times.


Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-28 21:05:23 +08:00
9d44918036 [Improve](data-type) Clean datatype useless code (#20145)
* fix struct_export out data

* delete useless code with data type
2023-05-28 20:48:29 +08:00
c45da40ed7 [refactor-WIP](TaskWorkerPool) add specific classes for ALTER_TABLE, CLONE, STORAGE_MEDIUM_MIGRATE task (#20140) 2023-05-28 19:27:08 +08:00
Bin
142f884753 [typo](docs)Best usage document correction. #20142 2023-05-28 18:56:17 +08:00
ae352997b4 [Enhancement](alter inverted index) Improve alter inverted index performance with light weight add or drop inverted index (#19063) 2023-05-28 11:23:07 +08:00
da17c45c0b [enhance](FileWriter)enhance s3 file writer bvar to avoid adding abort bytes (#20138)
* don't add each time upload or it would add aborted bytes

* alloca memory
2023-05-28 10:52:37 +08:00
f21bf11cf5 [fix](ldap) fix ldap related errors (#19959)
1. fix ldap user show grants return null pointer exception;
2. fix ldap user show databases return no authority db;
3. ldap authentication supports catalog level;
2023-05-27 23:51:32 +08:00
0434c6a738 [refactor-WIP](TaskWorkerPool) add specific classes for PUSH, PUBLIC_VERION, CLEAR_TRANSACTION tasks (#19822) 2023-05-27 22:47:45 +08:00
875e72b5ea [typo](doc)spark load add task timeout parameter #20115 2023-05-27 22:44:20 +08:00
8c00012e8f [improvement](community) simplify the pr template and modify pr labeler (#20127) 2023-05-27 22:43:51 +08:00
637e083343 [regression](test) fix test case failed in pipeline mode (#20139) 2023-05-27 22:42:25 +08:00
509689491f [improvement](exec) Refactor the partition sort node to send data in pipeline mode (#20128)
before: the node will wait to retrieve all data from child, then send data to parent.
now: for data from child that does not require sorting, it can be sent to parent immediately.
2023-05-27 22:42:10 +08:00
4cbb6ece10 [fix](fe)ordering exprs should be substituted in the same way as select part (#20091) 2023-05-27 21:00:57 +08:00
ac8599fedb [Fix](single replica load) fix indices_size key not found core (#20047) 2023-05-27 13:28:07 +08:00
f54a068d82 [feature](function) add json->operator convert to json_extract (#19899) 2023-05-27 12:45:45 +08:00
f3d8af330a [Bug](point query) check point query before check two phase read (#20055)
* [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead

1. checkEnableTwoPhaseRead relies on the short circuit flag
2. add more metrics to display the lookup profile

* fix rebase
2023-05-27 12:38:58 +08:00
Bin
b12250f9e8 [typo](docs)Data partition document correction. (#20103)
* correct the wrongly conveyed meaning.

* delete the item which should not be there anymore.
2023-05-27 12:37:50 +08:00
16c46974c5 [chore](build) Fix compilation errors reported by GCC-13 (#20114) 2023-05-27 08:25:52 +08:00
9539bbf8ae Revert "[test](executor)add crud regression test for resource group (#19659)" (#20121)
This reverts commit 8b9813663d87afa7b359b31782f3864dc54881df.
2023-05-27 08:25:00 +08:00
51ca645c3f [fix](mtmv)Fix tablet not found when restart fe (#20095)
The replayCreateTable restriction must be olapTable. If mv is used, nothing will be done, resulting in no call to invertedIndex.addReplica
2023-05-27 08:20:06 +08:00
23c95d15da [regression-test](sort) Fix unstable sorting (#20125) 2023-05-26 23:42:05 +08:00
93933308e6 [Feature-WIP](CCR): Add ccr doris interface (WIP) (#17881) 2023-05-26 23:40:49 +08:00
e5b0d7a5cd [CTE](eof) Support cte reuse reduce counter by eof status and pipeline task mem can release (#20056) 2023-05-26 22:03:29 +08:00
3c6227a900 [fix](filesystem) Fix core caused by using moved variable in batch_delete_impl #20033 2023-05-26 21:39:27 +08:00
860e28a3a3 [Fix](multi-catalog) Fix db name is not lower case when jdbc catalog configuration lower_case_table_names is true. (#20021)
Fix db name is not lower case when jdbc catalog configuration lower_case_table_names is true.
Fix regression-test test_oracle_jdbc_catalog.
2023-05-26 21:35:38 +08:00
23ad72e734 [Bug](runtime filter) Fix min/max filter for decimalv3 (#20005) 2023-05-26 21:35:21 +08:00
cb4a57f44f [Opt](orc-reader) Support merge small IO facility in orc reader. (#20092)
#18976 introduced the merge small IO facility to optimize performance, and it is used by the parquet reader.
This PR supports this facility in the orc reader. The current ORC reader implementation needs to reposition the parent present stream when reading lazy columns in the lazy materialization facility, so this PR makes it work by removing `DCHECK_GE(offset, cached_data.end_offset)`.
2023-05-26 21:06:12 +08:00
346c51faa2 [fix](expr) Make VExprContext exit gracefully (#19984) 2023-05-26 20:21:53 +08:00
dcdc81844f [fix](nereids)use same decimalv3 type for params and return types (#20101) 2023-05-26 20:15:51 +08:00
9458a24cd7 [fix](multi-catalog) values in sqlserver should be enclosed by single quotes (#19971)
Fix errors when inserting string/date/datetime values into SQLServer:

ERROR 1105 (HY000): errCode = 2, detailMessage = (172.21.0.101)[INTERNAL_ERROR]UdfRuntimeException: JDBC executor sql has error:
CAUSED BY: SQLServerException: Invalid column name '2021-10-30'.
When string values are enclosed in double quotes, they are parsed as column names, so string values should be enclosed in single quotes.
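
A minimal Python sketch of the quoting rule follows. The helper `quote_sql_string` is a hypothetical name and not part of the JDBC executor; it only shows the standard SQL convention of wrapping a string literal in single quotes and doubling any embedded single quote.

```python
def quote_sql_string(value):
    """Return value as a single-quoted SQL string literal.

    Embedded single quotes are doubled, which is the standard SQL escape.
    """
    return "'" + value.replace("'", "''") + "'"


# Hypothetical table and column names, used only for illustration.
statement = "INSERT INTO t (dt) VALUES ({})".format(quote_sql_string("2021-10-30"))
```

The resulting `statement` is `INSERT INTO t (dt) VALUES ('2021-10-30')`, so the value is treated as a literal rather than as a column name.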
2023-05-26 20:04:45 +08:00
ce45d6119d [FIX](regress-test) fix struct_export out data (#20111)
fix struct_export out data
2023-05-26 19:57:51 +08:00