
18429 Commits

Author SHA1 Message Date
af8832389f [feature](Nereids) add 4 array functions (#25488)
- array_concat
- array_pushback
- array_pushfront
- array_zip
2023-10-17 04:45:15 -05:00
06ff59bc03 [Performance](sink) SIMD the tablet sink validate data function (#25480) 2023-10-17 16:21:08 +08:00
f38f5f50eb [fix](ipv6)fix can not resolve host and port (#25254)
For IPv6, the address should be [ip]:port instead of ip:port.
2023-10-17 15:46:45 +08:00
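The bracket rule in the commit above can be illustrated with a short sketch. This is not the Doris implementation; the function names `format_host_port` and `split_host_port` are hypothetical, and only Python's standard `ipaddress` module is used.

```python
import ipaddress
from typing import Tuple


def format_host_port(host: str, port: int) -> str:
    """Join host and port, wrapping IPv6 literals in square brackets."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal (for example a hostname): no brackets needed.
        is_ipv6 = False
    return f"[{host}]:{port}" if is_ipv6 else f"{host}:{port}"


def split_host_port(address: str) -> Tuple[str, int]:
    """Split 'host:port' or '[ipv6]:port' back into host and port."""
    if address.startswith("["):
        # Bracketed form: everything up to "]:" is the IPv6 literal.
        host, _, port = address[1:].partition("]:")
    else:
        # Unbracketed form: the last colon separates host from port.
        host, _, port = address.rpartition(":")
    return host, int(port)
```

Without the brackets, an address such as `::1:9030` is ambiguous because the colons of the IPv6 literal cannot be told apart from the port separator.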
31a5e072e7 [refactor](pipelineX) Simplify set operation (#25502) 2023-10-17 15:11:46 +08:00
652d6c57c0 [fix](jdbc catalog) fix handle oracle date format (#25487) 2023-10-17 02:10:28 -05:00
8c5af5a088 [fix](case) Fix test_analyze case (#25476)
Before this PR, the case had the following problems:
- it used count(*) to check whether all columns were analyzed
- it returned directly when the FE count > 1
Co-authored-by: AKIHA <cyborgz1999@example.com>
2023-10-17 15:06:01 +08:00
1514f78b87 [refactor](partial-update) Split partial update infos from tablet schema (#25147) 2023-10-17 14:21:40 +08:00
4d12d8885e [feature](Nereids): graphSimplifier should compare edge1BeforeEdge2 and edge2BeforeEdge1 (#25416) 2023-10-17 14:10:21 +08:00
c4cc6cefda [fix](regression-test) fix http stream 2pc case (#25507) 2023-10-17 13:21:03 +08:00
0ee06f30b0 [feature](nereids) Ignore some nodes in 'explain shape plan' command (#25485)
If ignore_shape_nodes='PhysicalDistribute, PhysicalProject' is set,
explain shape plan will not print project and distribute nodes.
2023-10-17 11:57:36 +08:00
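A rough sketch of the idea behind the commit above, under stated assumptions: the `PlanNode` class, the `--` indentation, and the node names `PhysicalResultSink` and `PhysicalOlapScan` are illustrative only, and the real output format of `explain shape plan` is not reproduced here. The sketch only shows how a comma-separated ignore list might filter nodes out of a printed tree.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PlanNode:
    # Hypothetical stand-in for a physical plan node.
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def parse_ignore_list(value: str) -> Set[str]:
    """Turn 'PhysicalDistribute, PhysicalProject' into a set of names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def shape(node: PlanNode, ignored: Set[str], depth: int = 0) -> List[str]:
    """Render the tree one node per line, skipping ignored node types.

    Children of a skipped node are printed at the skipped node's depth.
    """
    if node.name in ignored:
        lines: List[str] = []
        child_depth = depth
    else:
        lines = ["--" * depth + node.name]
        child_depth = depth + 1
    for child in node.children:
        lines.extend(shape(child, ignored, child_depth))
    return lines


plan = PlanNode("PhysicalResultSink", [
    PlanNode("PhysicalDistribute", [
        PlanNode("PhysicalProject", [PlanNode("PhysicalOlapScan")]),
    ]),
])
ignored = parse_ignore_list("PhysicalDistribute, PhysicalProject")
print("\n".join(shape(plan, ignored)))
```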
c2fe34dec7 [refine](pipelineX) refactor local state (#25448) 2023-10-17 11:23:29 +08:00
410441b516 [enhancement](Nereids): remove LAsscom in Bushy Tree RuleSet (#25465)
- Bushy Tree RuleSet doesn't need LAsscom
- fix bug: rule patterns shouldn't use the same name
2023-10-17 11:22:52 +08:00
384fddb2ff [test](case)add some debug log in mv case (#25458)
* [test](case)change the insert stmt in mv case
2023-10-17 11:04:45 +08:00
5f844486e3 [enhancement](invert index) read columns by index reduce seek time (#24735) 2023-10-17 10:34:33 +08:00
1130317b91 [Improvement](statistics)Collect stats for hive partition column using metadata (#24853)
Stats for Hive partition columns can be calculated from Hive metastore data, so there is no need to execute SQL to get them.
This PR uses Hive partition metadata to collect partition column stats.
2023-10-17 10:31:57 +08:00
a383a2bc83 [cases](regresstest)add json format regress test for nested types (#25397) 2023-10-17 10:16:52 +08:00
a364a24ac2 [Enhance](regression) add hive out file check (#25475)
add hive out file check
fix hive sql state with " ; "
2023-10-17 10:11:57 +08:00
ef7d8aa99a [fix](be) fix bug of converting outer join probe block to nullable (#25492)
_do_evaluate adds a temporary result column to the original table block, so to convert only the correct columns to nullable, convert_block_to_null must be called before _do_evaluate.
2023-10-17 10:10:56 +08:00
85b8497624 [fix](Tvf) return empty set when tvf queries an empty file or an error uri (#25280)
### Before:
return errors when tvf queries an empty file or an error uri:
1. get parsed schema failed, empty csv file
2. Can not get first file, please check uri.

### Now:
An empty set is returned when tvf queries an empty file or an error uri.
```sql
mysql> select * from s3( 
"uri" = "https://error_uri/exp_1.csv", 
"s3.access_key"= "xx", 
"s3.secret_key" = "yy", 
"format" = "csv") limit 10;

Empty set (1.29 sec)
```
2023-10-17 09:52:53 +08:00
cda8fb6b8b [fix](load) return Status when error in RowsetWriter::build (#25381) 2023-10-17 09:40:23 +08:00
a194a15442 [improvement](tablet schedule) colocate balance between all groups (#23543) 2023-10-17 09:33:52 +08:00
f1a5e393c7 [feature](insert) Support group commit insert use new syntax like insert into table_id(xxx) (#25484) 2023-10-17 09:23:09 +08:00
f75ee49cb4 [chore](fmt) Remove stringstream by fmt (#25474)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-10-16 21:31:54 +08:00
59ebbb351e [feature](merge-cloud) Enable write into cache when uploading file to s3 using s3 file writer (#24364) 2023-10-16 21:31:02 +08:00
fe1980d7f2 [docs](docs) Add release note 2.0.2 (#25375) 2023-10-16 20:38:45 +08:00
f9a80ecdab [improvement](sync version) fe sync version with be (#25236) 2023-10-16 20:34:25 +08:00
e9157a3dba [fix](path gc) fix data dir path gc (#25420) 2023-10-16 20:25:20 +08:00
eaf5febc97 [enhancement](cooldown) Improve cooldown logs (#25432) 2023-10-16 20:17:00 +08:00
bfc602f343 [compile](fix) fix ubsan compile error (#25473) 2023-10-16 20:11:17 +08:00
Pxl
72920fbd1d [Improvement](materialized-view) set job failed when toAgentTaskRequest meet error (#25358)
set job failed when toAgentTaskRequest meet error
2023-10-16 20:10:52 +08:00
f9df3bae61 [Enhancement](functions) change some nullable mode and clear some smooth upgrade (#25334) 2023-10-16 19:50:17 +08:00
7fd876f3a2 [fix](planner) should call SlotRef's materializeSrcExpr() method if the slotRef is materialized (#25467) 2023-10-16 19:42:12 +08:00
dd1c4f4218 [fix](regression) fix group commit stream load regression test (#25469) 2023-10-16 19:41:51 +08:00
e3d0e55794 [feature-wip] (Nereids) Support transforming trino dialect SQL to logical plan (#21855)
Support transforming trino dialect SQL to logical plan (#21854)

## Proposed changes

Issue Number: #21854 
Use io.trino.sql.tree.AstVisitor as the visitor: visit each corresponding Trino node and transform it into a Doris logical plan.

## Further comments

Here are some examples of function transformation:
The **ascii('a')** function in Doris and the **codepoint('a')** function in Trino have the same behavior and the same method signature, so [TrinoFnCallTransformer](3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/TrinoFnCallTransformer.java) can handle them.

Another example, for ComplexTransformer:
The **date_diff('second', TIMESTAMP '2020-12-25 22:00:00', TIMESTAMP '2020-12-25 21:00:00')** function in Trino
and the **seconds_diff(2020-12-25 22:00:00, 2020-12-25 21:00:00)** function in Doris have different method signatures. TrinoFnCallTransformer cannot handle them directly, so they are handled by an individual complex transformer, [DateDiffFnCallTransformer](3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/DateDiffFnCallTransformer.java).
2023-10-16 05:10:55 -05:00
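The two kinds of transformer described in the commit above can be sketched as a lookup table plus a per-function handler. This is only an illustration in Python, not the Java classes from the PR: `SIMPLE_RENAMES`, `COMPLEX_TRANSFORMERS`, `transform_date_diff`, and `transform_call` are hypothetical names, and the argument order is copied from the commit message's example without being checked against either engine's semantics.

```python
from typing import Callable, Dict, List, Tuple

# A call is modelled as (function name, argument expressions as text).
Call = Tuple[str, List[str]]

# Simple case: same signature on both sides, only the name differs.
SIMPLE_RENAMES: Dict[str, str] = {"codepoint": "ascii"}


def transform_date_diff(args: List[str]) -> Call:
    """Complex case: the unit argument selects the target function."""
    unit, first, second = args
    if unit.strip("'").lower() != "second":
        raise ValueError(f"unit not covered by this sketch: {unit}")
    # Assumption: argument order is kept exactly as in the commit example.
    return ("seconds_diff", [first, second])


COMPLEX_TRANSFORMERS: Dict[str, Callable[[List[str]], Call]] = {
    "date_diff": transform_date_diff,
}


def transform_call(name: str, args: List[str]) -> Call:
    """Rewrite one function call, preferring a complex transformer."""
    key = name.lower()
    if key in COMPLEX_TRANSFORMERS:
        return COMPLEX_TRANSFORMERS[key](args)
    # Fall back to a plain rename, or pass the call through unchanged.
    return (SIMPLE_RENAMES.get(key, name), list(args))
```

Keeping the simple renames in a table means a new same-signature function only needs one new entry, while functions whose arguments change shape get their own handler.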
cf073ec8ce [runtimefilter](Nereids) support non-equal runtime filter for nested loop join #25193 2023-10-16 17:49:47 +08:00
Pxl
d41e839ea0 [Chore](sink) add index number check for table sink (#25461)
add index number check for table sink
2023-10-16 17:03:26 +08:00
9deda929b9 [refactor](stats) Use id instead name in analysis info (#25213) 2023-10-16 03:49:53 -05:00
4c42f3b783 [Improvement](hive-udf)(doc) minimize hive-udf and add some docs. (#24786) 2023-10-16 16:47:21 +08:00
5eff36417a [typo](docs) Fix some ambiguous descriptions (#23912) 2023-10-16 16:44:11 +08:00
b2e3ecb81d [opt](load)change load_to_single_tablet tablet search algorithm from random to round-robin (#25256)
At present, the `load_to_single_tablet` import picks a tablet with a simple random-number remainder, which cannot spread load truly evenly. This leads to uneven disk IO and uneven use of cluster resources. To solve this problem, this change uses round-robin over the tablets of each partition on every import, so that load is spread evenly across tablets.

When generating the load query plan, the index of the tablet currently being imported into is passed to BE.
Add a daemon task in FE to regularly clean up the `loadTabletRecordMap`. When `getCurrentLoadTabletIndex` is called, the map gets the bucket_number of the partition and updates the `load_tablet_index`.
2023-10-16 16:43:25 +08:00
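A minimal sketch of the round-robin selection described in the commit above, assuming one counter per partition. The class `RoundRobinTabletPicker` and its methods are hypothetical; the internal dictionary only stands in for the `loadTabletRecordMap` named in the message, and the real FE code may differ.

```python
from typing import Dict, Set


class RoundRobinTabletPicker:
    """Hands out tablet indexes per partition in round-robin order."""

    def __init__(self) -> None:
        # partition id -> tablet index to use for the next load
        self._next_index: Dict[int, int] = {}

    def current_load_tablet_index(self, partition_id: int, bucket_number: int) -> int:
        """Return the index for this load and advance the counter."""
        index = self._next_index.get(partition_id, 0) % bucket_number
        self._next_index[partition_id] = (index + 1) % bucket_number
        return index

    def clean_up(self, live_partition_ids: Set[int]) -> None:
        """Drop records for partitions that no longer exist."""
        for partition_id in list(self._next_index):
            if partition_id not in live_partition_ids:
                del self._next_index[partition_id]


picker = RoundRobinTabletPicker()
sequence = [picker.current_load_tablet_index(1, 3) for _ in range(7)]
```

Unlike a random remainder, consecutive loads into the same partition visit every tablet once before any tablet is reused.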
b83e412623 [fix](hive-udf) delete Logger to avoid Kryo serialize problem. (#25312) 2023-10-16 16:10:06 +08:00
e8431e1a97 [fix](planner)should not add TupleIsNullPredicate for inlineview plan (#25338) 2023-10-16 15:24:13 +08:00
8e9e1b1bfd [fix](planner) Disable infer expr column name when query on old optimizer (#25317)
Disable infer expr column name when query on old optimizer.
This bug was introduced in #24990.

If your query is
select id, name, sum(target) FROM db_test.table_test2 group by id, name;
the result column names are:
|id|name |sum(cast(target as DOUBLE))|

When you create a view as follows:
CREATE VIEW v1 as select id, name, sum(target) FROM db_test.table_test2 group by id, name;
and then query the view v1, the result column names are:
|id|name |__sum_2|
2023-10-16 02:08:52 -05:00
1a27ac8d56 [opt] use correct column label when execute query in FE (#25372)
SET @a = '4';
SELECT @a;

previous:
+-----+
| '4' |
+-----+
| 4   |
+-----+

current:
+----+
| @a |
+----+
| 4  |
+----+
2023-10-16 02:03:33 -05:00
Pxl
292ccaeda8 insert default when json array parse failed (#25447)
insert default when json array parse failed
2023-10-16 14:51:26 +08:00
04e5fb3809 [fix](regression test) fix mysql tuple convert test result not ordered #25455 2023-10-16 14:32:51 +08:00
0585beee02 [typo](docs) Modify parameter description (#23782) 2023-10-16 01:29:00 -05:00
0aa50fb256 [fix](nereids)fix regression case: eliminate_outer_join (#25208) 2023-10-16 14:08:36 +08:00
f698f205d5 [Fix](merge-on-write) throw exception when the user doesn't specify the insert columns in insert statement for partial update (#25437) 2023-10-16 14:05:06 +08:00
e94fbe169e [Enhance](regression) add hms catalog broker scan case (#25453) 2023-10-16 12:35:46 +08:00