Commit Graph

5755 Commits

Author SHA1 Message Date
509d865760 [feature](Nereids): convert CaseWhen to If (#23040)
Add a rule to optimize CASE WHEN expression.
Rewrite rule to convert CASE WHEN to IF.

For example:
CASE WHEN a > 1 THEN 1 ELSE 0 END -> IF(a > 1, 1, 0)
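The rewrite can be illustrated with a minimal Python sketch. The `CaseWhen` and `If` classes and the `case_when_to_if` function are hypothetical stand-ins, not the Nereids classes, and the sketch assumes the rule only applies when there is a single WHEN branch, as in the example above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class If:
    condition: str
    then_value: object
    else_value: object


@dataclass(frozen=True)
class CaseWhen:
    # Each branch is a (condition, result) pair; default is the ELSE value.
    branches: Tuple[Tuple[str, object], ...]
    default: Optional[object] = None


def case_when_to_if(expr: CaseWhen) -> Union[CaseWhen, If]:
    """Rewrite a single-branch CASE WHEN into IF; leave other shapes alone."""
    if len(expr.branches) != 1:
        return expr
    condition, result = expr.branches[0]
    # A missing ELSE yields NULL, modelled here as None.
    return If(condition, result, expr.default)


# CASE WHEN a > 1 THEN 1 ELSE 0 END
print(case_when_to_if(CaseWhen((("a > 1", 1),), 0)))
```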
2023-08-30 15:47:29 +08:00
3a0a79b4a0 [Improvement][SparkLoad] Use system env configs when users don't set env configs. (#21837) 2023-08-30 15:14:40 +08:00
aef162ad4c [test](log) add some log in udf function when thrown exception (#23651)
2023-08-30 14:16:05 +08:00
4fec0826f8 [fix](Nereids): avoid Exception to cause analyze time too long (#23627)
AnyDataType causes toCatalogDataType to throw an Exception, which costs a lot of time.

Avoid throwing Exceptions in the Analyzer.
2023-08-30 12:25:31 +08:00
d326cb0c99 [fix](planner) array constructor do type coercion with decimal in wrong way (#23630)
An array constructor with decimal-type and integer-type parameters should return array<decimal>,
but the legacy planner returns array<double>.
2023-08-30 11:18:31 +08:00
f786689044 [refactor](TableRowCountAction) Fine-tune sql execution code (#23541) 2023-08-30 10:11:30 +08:00
ca55bd88ad [Fix](Job)Fix the window time is not updated when no job is registered (#23628)
Fix the inconsistent grammar definition of resume job
Add execution results to Show Job task
Allow a JOB to define update operations
2023-08-30 09:48:21 +08:00
e02747e976 [feature](Nereids) support struct type (#23597)
1. support struct data type
2. add array / map / struct literal syntax
3. fix array union / intersect / except type coercion
4. fix explicit cast data type check for array
5. fix bound function type coercion
2023-08-29 20:41:24 +08:00
4f7e7040ad [bugfix] (dynamic partition) dynamic partition job is removed when tbl is sync (#23404) 2023-08-29 20:35:56 +08:00
1ac0ff0ea9 [feature](delete-predicate) support delete sub predicate v2 (#22442)
New structure for delete sub predicate.
The delete sub predicate is currently stored temporarily in a string-typed condition_str, and its fields are extracted from it using std::regex, which may cause a stack overflow when matching an extremely large string (a bug of libc).

Now we attempt to use a new PB structure to hold the delete sub predicate, to avoid that problem.

message DeleteSubPredicatePB {
    optional int32 column_unique_id = 1;
    optional string column_name = 2;
    optional string op = 3;
    optional string cond_value = 4;
}
Currently, both versions of the sub predicate are filled. Queries use v2, while compaction still uses v1. When old rowset meta whose delete predicates carry sub predicate v1 is read from PB, we attempt to convert it to v2. Moreover, efforts will be made to rewrite this meta with the new delete sub predicate.

Make preparation to use column unique id to specify a column globally.
Using the column unique id rather than the column name to identify a column is vital for flexible schema change. The rewritten delete predicate will attach column unique id.
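
As a rough illustration of the v1 to v2 conversion, the Python sketch below splits a condition string into the four fields of `DeleteSubPredicatePB` without using a regex. The v1 string layout (`column`, operator, value, as in `k1>=5`), the operator set, and the `unique_ids` lookup table are all assumptions made for this example, not the actual Doris format.

```python
from dataclasses import dataclass
from typing import Dict

# Longer operators come first so that ">=" wins over ">" at the same position.
OPERATORS = ("<=", ">=", "!=", "=", "<", ">")


@dataclass
class DeleteSubPredicate:
    # Mirrors the four fields of DeleteSubPredicatePB.
    column_unique_id: int
    column_name: str
    op: str
    cond_value: str


def to_v2(condition_str: str, unique_ids: Dict[str, int]) -> DeleteSubPredicate:
    """Split a v1 condition string at its first operator, without a regex."""
    best_pos, best_op = -1, ""
    for op in OPERATORS:
        pos = condition_str.find(op)
        if pos != -1 and (best_pos == -1 or pos < best_pos):
            best_pos, best_op = pos, op
    # -1 means no operator was found; 0 means the column name is empty.
    if best_pos <= 0:
        raise ValueError(f"cannot parse delete condition: {condition_str!r}")
    name = condition_str[:best_pos]
    value = condition_str[best_pos + len(best_op):]
    return DeleteSubPredicate(unique_ids[name], name, best_op, value)


print(to_v2("k1>=5", {"k1": 7}))
```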
2023-08-29 19:37:23 +08:00
103fa4eb55 [feature](Export) support export with nereids (#23319) 2023-08-29 19:36:19 +08:00
cc1509ba11 [fix](view) The parameter positions of timestamp diff function to sql are reversed (#23601) 2023-08-29 18:30:16 +08:00
8932a6fae7 [feature](Nereids) support Literal collate syntax (#23600)
Support the following SQL grammar, just for compatibility:

```sql
select table_name
from information_schema.tables
where table_schema collate utf8_general_ci = 'information_schema'
  and table_name collate utf8_general_ci = 'parameters';
```
2023-08-29 17:01:13 +08:00
598dc6960a [fix](Nereids) make agg output unchanged after normalized (#23499)
The normalizedAgg rule can change the output of agg.

For example:
```
select c1 as c, c1 from t having c1 > 0
```
The normalizedAgg rule produces a plan whose output is c, which can cause an error in the having filter.

Therefore, the output exprId should remain unchanged after normalization.
2023-08-29 15:01:26 +08:00
4c00b1760b [feature](partial update) Support partial update for broker load (#22970) 2023-08-29 14:41:01 +08:00
5b641ebd40 [feature-wip](catalog) support deltalake catalog step1-metadata (#22493) 2023-08-29 10:31:37 +08:00
d8f159728b [fix](planner) only forbid substitute literal expr in function call expr (#23532)
This is a follow-up PR of #23438. It is not correct to forbid substituting all literal exprs; we only need to prevent substituting literal exprs in a function's parameter list.
2023-08-29 10:22:39 +08:00
6f3e2a30e6 [Feat](Nereids) Add leading and ordered hint (#22057)
Add leading hint and ordered hint. Usage:
select /*+ ordered */ * from a join b on xxx; which will limit join order to original order
select /*+ leading ({b a}) */ * from a join b on xxx; which will change join order to b join a.
2023-08-28 21:04:40 +08:00
0ff191cdf3 [feature](Nereids) add expr depth limit and expr children limit in Nereids (#23569)
#### `expr_depth_limit`

Default:3000

IsMutable:true

Limit on the depth of an expr tree. Exceeding this limit may cause long analysis time while holding the db read lock. Do not set this unless you know what you are doing.

#### `expr_children_limit`

Default:10000

IsMutable:true

Limit on the number of expr children of an expr tree. Exceeding this limit may cause long analysis time while holding the database read lock.
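
A minimal Python sketch of how two such limits could be enforced on an expression tree. The `Expr` class and `check_expr_limits` function are hypothetical, and the sketch assumes that the children limit applies to the direct children of each node; the actual Nereids check may differ.

```python
from dataclasses import dataclass, field
from typing import List

EXPR_DEPTH_LIMIT = 3000       # default of expr_depth_limit
EXPR_CHILDREN_LIMIT = 10000   # default of expr_children_limit


@dataclass
class Expr:
    name: str
    children: List["Expr"] = field(default_factory=list)


def check_expr_limits(root: Expr,
                      depth_limit: int = EXPR_DEPTH_LIMIT,
                      children_limit: int = EXPR_CHILDREN_LIMIT) -> None:
    """Raise ValueError if the tree is too deep or any node is too wide."""
    # Use an explicit stack instead of recursion, so a very deep tree
    # cannot overflow the Python call stack before the depth check fires.
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if depth > depth_limit:
            raise ValueError(f"expr depth exceeds limit {depth_limit}")
        if len(node.children) > children_limit:
            raise ValueError(f"expr children exceed limit {children_limit}")
        stack.extend((child, depth + 1) for child in node.children)


# A small tree well within both limits passes silently.
check_expr_limits(Expr("add", [Expr("a"), Expr("b")]))
```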
2023-08-28 19:03:43 +08:00
a70ebe87c5 [optimize](Nereids): speedup analyze (#23549)
- avoid some `withRowCount`
- ArrayList with size
- checkPrimitiveInputDataTypesWithExpectType avoids checking AnyDataType
2023-08-28 18:17:55 +08:00
Pxl
8e4c0d1e81 [Bug](materialized-view) fix divide double can not match mv (#23504)
* fix divide double can not match mv

* fix

* fix
2023-08-28 18:01:08 +08:00
10792ca0f7 [fix](nereids) Mistaken stats when analyzing table incrementally and partition number less than 512 #23507
Fix a bug where stats are mistaken when analyzing a table incrementally and the partition number is less than 512
Fix a bug where the cron expression is lost during analysis
Mark a system job as running after it is registered to AnalysisManager, to avoid submitting the same job again if the previous one takes a long time
2023-08-28 17:31:36 +08:00
1cc6474487 [fix](planner)fix bug of pushing conjunct through agg node (#23483) 2023-08-28 17:10:13 +08:00
2dff89a77a [Log](Alter) Print table's state when checkNormalStateForAlter() failed (#23358) 2023-08-28 17:04:01 +08:00
6ac694aede [Configuration](multi-catalog) Modify default external cache expire time to 10 mins. (#23490)
Modify the default external cache expire time to 10 minutes.
2023-08-28 16:16:43 +08:00
f70638e895 [Fix](autobucket) Fix autobucket partition size by using getAllDataSize including cooldown size (#23557) 2023-08-28 15:24:48 +08:00
Pxl
6e82178847 [Bug](materialized-view) fix loaddb analyze failed on MaterializedIndexMeta (#23442)
* fix loaddb analyze failed on MaterializedIndexMeta

* update

* update
2023-08-28 15:18:18 +08:00
4c8fc06e40 [Feature](fe) Add admin set partition version statement (#23086)
This commit adds a statement to modify the partition visible version.
2023-08-28 14:31:54 +08:00
f7d2c1faf6 [feature](Nereids) support select key encryptKey (#23257)
Add select key

```
- CREATE ENCRYPTKEY key_name AS "key_string"
- select key my_key
+-----------------------------+
| encryptKeyRef('', 'my_key') |
+-----------------------------+
| ABCD123456789               |
+-----------------------------+
```
2023-08-28 14:07:26 +08:00
ef2fc44e5c [Improve](Job)Allows modify the configuration of the Job queue and the number of consumer threads (#23547) 2023-08-28 12:01:49 +08:00
e84989fb6d [feature](Nereids) support map type (#23493) 2023-08-28 11:31:44 +08:00
b181a9f099 [feature](Nereids) support array type in fold constant framework (#23373)
1. use legacy planner way to process constant folding result from be
2. support signature with complex type for constant folding on fe
2023-08-28 10:47:43 +08:00
d19dcd6bc1 [improve](jdbc catalog) support sqlserver uniqueidentifier data type (#23297) 2023-08-28 10:30:10 +08:00
eadffedb33 [Feature](fe) Add admin set table status statement (#23139)
Due to certain bugs, jobs can get stuck in FE because of the table state. For example, there is a bug that leaves a table in the ROLLUP state after adding a rollup job; later alter jobs then fail because the table state is always ROLLUP rather than NORMAL.

This commit adds a statement which is used to set the state of the specified table.
2023-08-28 10:22:09 +08:00
92bdf75836 [fix](Nereids): LogicalRepeat equals lack @Override (#23408) 2023-08-28 10:07:39 +08:00
153e8f0f72 [improvement](table property) support for alter table property: skip write index, single compaction (#23475) 2023-08-26 23:52:09 +08:00
93918253ba [fix](metric) fix issue that the counter of rejected transactions does not cover some failed situations for load (#23363)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2023-08-26 20:06:42 +08:00
30658ebeda [Fix](planner) Fix query queue can not limit maxConcurrency #23418
2. Fix the issue that concurrency cannot be limited
2023-08-26 17:31:44 +08:00
40be6a0b05 [fix](hive) do not split compress data file and support lz4/snappy block codec (#23245)
1. Do not split compressed data files
Some data files in Hive are compressed with gzip, deflate, etc.
Files of these kinds cannot be split.

2. Support lz4 block codec
for hive scan node, use lz4 block codec instead of lz4 frame codec

4. Support snappy block codec
For hadoop snappy

5. Optimize the `count(*)` query of csv file
For a query like `select count(*) from tbl`, we only need to split the lines; there is no need to split the columns.
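
The first point can be sketched in Python as follows. This is only an illustration: the `plan_splits` function is hypothetical, and detecting compression by file suffix (`.gz`, `.deflate`) is an assumption of this example rather than the way the scan node necessarily decides.

```python
from typing import List, Tuple

# Assumed suffixes for stream-compressed files that cannot be read
# starting from an arbitrary offset.
NON_SPLITTABLE_SUFFIXES = (".gz", ".deflate")


def plan_splits(path: str, file_size: int, split_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) ranges to scan for one data file."""
    if file_size == 0:
        return []
    if path.lower().endswith(NON_SPLITTABLE_SUFFIXES):
        # A compressed file is always read as one whole range.
        return [(0, file_size)]
    return [(offset, min(split_size, file_size - offset))
            for offset in range(0, file_size, split_size)]


print(plan_splits("part-0.csv", 250, 100))
print(plan_splits("part-0.csv.gz", 250, 100))
```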

Need to pick to branch-2.0 after this PR: #22304
2023-08-26 12:59:05 +08:00
36b7fcf055 [tmp](hive) support hive partition 00 (#23224)
In some cases, a Hive table with an int partition column may have partition values such as:
hour=00, hour=01
We need to support this.
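One way to handle such values, sketched in Python: parse the value as a base-10 integer for typed comparison, but keep the raw string, since the zero-padded form is what appears in the partition path. The `parse_int_partition` helper is hypothetical and is not taken from the Doris code.

```python
from typing import Tuple


def parse_int_partition(segment: str) -> Tuple[str, int, str]:
    """Split 'name=value' and return (name, integer value, raw value)."""
    name, _, raw = segment.partition("=")
    # int(raw, 10) accepts leading zeros, so "00" becomes 0 and "01" becomes 1.
    return name, int(raw, 10), raw


for segment in ("hour=00", "hour=01"):
    print(parse_int_partition(segment))
```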
2023-08-26 12:58:31 +08:00
db8d18eb40 [Enhance](auth)row policy support role (#23022)
```
CREATE ROW POLICY test_row_policy_1 ON test.table1 
AS {RESTRICTIVE|PERMISSIVE} [TO  user] [TO ROLE role] USING (id in (1, 2)); // add `to role`

DROP [ROW] POLICY [IF EXISTS] test_row_policy;//delete `for user` and `on table`

SHOW ROW POLICY [FOR user][FOR ROLE role] // add `for role`
```
2023-08-26 10:24:59 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes the partitions of a Hive table may be on different storage systems, e.g., some on HDFS and others on object storage (cos, etc.).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc
    This is because, when accessing a file, the BE gets an hdfs client from the hdfs client cache, and different files in one query
request may have different fs names, e.g., some are `hdfs://` and some are `cosn://`, so we need to specify the fs name
for each file; otherwise, it may return an error:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`
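
The idea of carrying an fs name with every file range can be sketched in Python. The `FileRange` class, the `fs_name_of` helper, and the string placeholders used as clients are hypothetical; only the `fs_name` field name comes from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import urlparse


@dataclass
class FileRange:
    path: str
    fs_name: str  # e.g. "hdfs://host:port" or "cosn://bucket"


def fs_name_of(path: str) -> str:
    """Derive the filesystem name (scheme plus authority) from a file URI."""
    parsed = urlparse(path)
    return f"{parsed.scheme}://{parsed.netloc}"


def make_ranges(paths: List[str]) -> List[FileRange]:
    return [FileRange(p, fs_name_of(p)) for p in paths]


def clients_for(ranges: List[FileRange], cache: Dict[str, str]) -> Dict[str, str]:
    """Look up (or create) one client per distinct fs name."""
    for r in ranges:
        # Keying the cache by fs name keeps hdfs:// and cosn:// files apart.
        cache.setdefault(r.fs_name, f"client<{r.fs_name}>")
    return cache


ranges = make_ranges(["hdfs://nn:4007/t/p=1/a", "cosn://bucket-1/t/p=2/b"])
print(clients_for(ranges, {}))
```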
2023-08-26 00:16:00 +08:00
2b6d876280 [feature](move-memtable)[6/7] add options to enable memtable on sink node (#23470)
Co-authored-by: Siyang Tang <82279870+TangSiyang2001@users.noreply.github.com>
2023-08-25 22:32:22 +08:00
da21b1cb24 [Feature](Job)Allow Job to perform all insert operations, and limit permissions to allow Admin operations (#23492) 2023-08-25 21:58:53 +08:00
006c88827f [fix](stats) Fix auto analyze (#20426)
We only reanalyze those partitions whose lastVisibleTime is later than the job's updatetime, so we shouldn't set this field when creating system jobs.
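The selection rule can be sketched in Python as below. The function name and the integer timestamps are made up for illustration; the second call shows why stamping the update time at job creation would leave nothing to reanalyze.

```python
from typing import Dict, List


def partitions_to_reanalyze(last_visible_times: Dict[str, int],
                            job_update_time: int) -> List[str]:
    """Pick partitions whose last visible time is later than the job's update time."""
    return sorted(name for name, visible in last_visible_times.items()
                  if visible > job_update_time)


partitions = {"p1": 100, "p2": 250}
# The job was last updated at 200, so only p2 changed afterwards.
print(partitions_to_reanalyze(partitions, 200))
# If the update time were set when the job is created (say 300),
# no partition would qualify.
print(partitions_to_reanalyze(partitions, 300))
```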
2023-08-25 21:30:59 +08:00
e3db0fddc1 [fix](iceberg) fix iceberg count(*) short circuit read bug (#23402) 2023-08-25 21:30:30 +08:00
468dfc97db [fix](meta) set broadcast_right_table_scale_factor when upgrading from 1.2 to 2.x (#23423)
When upgrading from 1.2 to 2.x (a future version higher than 2.0), the default value of the parameter broadcast_right_table_scale_factor may not be upgraded from the old default 10.0 to the new default 0.0, which makes the broadcast join behave unexpectedly and may have a big performance impact. This PR forces a reset of the value to the new default 0.0 to ensure correct behavior.
2023-08-25 21:26:19 +08:00
00826185c1 [fix](tvf view)Support Table valued function view for nereids (#23317)
Nereids doesn't support views based on table valued functions, because a tvf view doesn't contain the proper qualifier (catalog, db and table name). This PR adds that support.

Also, fix a bug where the output exprs in the explain output of a Nereids table valued function are incorrect.
2023-08-25 21:23:16 +08:00
8be0202b94 [improvement](old planner)Prune extra slots with old planner for sql like select count(1) from view (#23393)
A sql like
Select count(1) from view
would contain all the columns in the old planner's execution plan, which is slow because BE needs to read all the columns in the data files. This PR improves the plan to contain only one column.
2023-08-25 21:22:03 +08:00
1312c12236 Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)" (#23462)
* Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)"

This reverts commit 55a6649da962fb170ddb40fea8ef26bdc552a51a.

Manual Revert "fix in strict mode, return error for insert if datatype convert fails (#20378)"

This manually reverts commit 1b94b6368f5e871c9a0fe53dd7c64409079a4c9d

* fix case failure
2023-08-25 16:47:14 +08:00