Commit Graph

1882 Commits

Author SHA1 Message Date
47689fd452 [refactor](jni) unified jni framework for java udf (#25302)
Use the unified jni framework to refactor java udf.
The unified jni framework takes VectorTable as the container to transform data between c++ and java, and hide the details of data format conversion.
In addition, the unified framework supports complex and nested types.
The performance of basic types remains consistent, with a 30% improvement in string types and an order of magnitude improvement in complex types.
2023-10-18 09:27:54 +08:00
18c2a13e09 [fix](multi-catalog)fix maxcompute partition filter and session creation (#24911)
add maxcompute partition support
fix maxcompute partition filter
modify maxcompute session create method
2023-10-17 22:36:10 +08:00
ce18f1148a [improvement](catalog)compatible with paimon 0.5 (#24985)
compatible with paimon 0.5
add p0 for paimon,need set enablePaimonTest=true
2023-10-17 22:07:13 +08:00
b74836050a [chore](config) turnoff fuzzy for enable_simdjson_reader (#25521) 2023-10-17 18:42:11 +08:00
9d6b2dceb2 [fix](Nereids) non-slot filter should not be push through aggregate (#25525) 2023-10-17 05:02:26 -05:00
af8832389f [feature](Nereids) add 4 array functions (#25488)
- array_concat
- array_pushback
- array_pushfront
- array_zip
2023-10-17 04:45:15 -05:00
652d6c57c0 [fix](jdbc catalog) fix handle oracle date format (#25487) 2023-10-17 02:10:28 -05:00
0ee06f30b0 [feature](nereids)Ignore some node in 'explain shape plan' command (#25485)
if set ignore_shape_nodes='PhysicalDistribute, PhysicalProject'
then
explain shape plan will not print project and distribute node
2023-10-17 11:57:36 +08:00
410441b516 [enhancement](Nereids): remove LAsscom in Bushy Tree RuleSet (#25465)
- Bushy Tree RuleSet don't need LAsscom
- fix bug: rule pattern shouldn't use same name
2023-10-17 11:22:52 +08:00
384fddb2ff [test](case)add some debug log in mv case (#25458)
* [test](case)change the insert stmt in mv case
2023-10-17 11:04:45 +08:00
a383a2bc83 [cases](regresstest)add json format regress test for nested types (#25397) 2023-10-17 10:16:52 +08:00
a364a24ac2 [Enhance](regression) add hive out file check (#25475)
add hive out file check
fix hive sql state with " ; "
2023-10-17 10:11:57 +08:00
ef7d8aa99a [fix](be)confix bug of converting outer join probe block to nullable (#25492)
_do_evaluate will add temp result column into original table block, so in order to only convert correct columns to be nullable, need call convert_block_to_null before _do_evaluate
2023-10-17 10:10:56 +08:00
85b8497624 [fix](Tvf) return empty set when tvf queries an empty file or an error uri (#25280)
### Before:
return errors when tvf queries an empty file or an error uri:
1. get parsed schema failed, empty csv file
2. Can not get first file, please check uri.

### Now:
we just return empty set when tvf queries an empty file or an error uri.
```sql
mysql> select * from s3( 
"uri" = "https://error_uri/exp_1.csv", 
"s3.access_key"= "xx", 
"s3.secret_key" = "yy", 
"format" = "csv") limit 10;

Empty set (1.29 sec)
```
2023-10-17 09:52:53 +08:00
Pxl
72920fbd1d [Improvement](materialized-view) set job failed when toAgentTaskRequest meet error (#25358)
set job failed when toAgentTaskRequest meet error
2023-10-16 20:10:52 +08:00
b2e3ecb81d [opt](load)change load_to_single_tablet tablet search algorithm from random to round-robin (#25256)
At present, `load_to_singlt_tablet` import implementation refers to simple random number remainder, which cannot achieve true averaging. This will lead to uneven disk IO and uneven use of cluster resources. To solve this problem, we are preparing to implement round-robin for each partition tablet imported each time, in order to achieve average load to each tablet.

When generating the load query plan, the tablet index record currently imported is passed to BE.
Add a deamon task in FE to regularly clean up the `loadTabletRecordMap`. The map will get the bucket_number of the partition and update the `load_tablet_index` when `getCurrentLoadTabletIndex`.
2023-10-16 16:43:25 +08:00
e8431e1a97 [fix](planner)should not add TupleIsNullPredicate for inlineview plan (#25338) 2023-10-16 15:24:13 +08:00
Pxl
292ccaeda8 insert default when json array parse failed (#25447)
insert default when json array parse failed
2023-10-16 14:51:26 +08:00
0aa50fb256 [fix](nereids)fix regression case: eliminate_outer_join (#25208) 2023-10-16 14:08:36 +08:00
e94fbe169e [Enhance](regression) add hms catalog broker scan case (#25453) 2023-10-16 12:35:46 +08:00
29d4e8ee90 [Fix](Nereids) fix test leading change disable join reorder parameter (#23657)
Problem:
when running pipeline, we get randomly failed of test_leading
Reason:
physical distribute was generated and choosed to be the best plan because we can not get any statistic information of empty table. So we would get some unexpect result because we can not expect the order in memo
Solved:
Add statistic of columns used in test_leading, try repeatly in pipeline
2023-10-15 22:59:45 -05:00
c482c22a74 [case](regresscases) add regress cases for nested type nested type with csv format (#25355)
this pr
1.  fix use podarray push_back() with back() will make heap_use_after_free when podarray is reach capacity which would may make heap free 
2. add cases for csv format for nested types. and csv file has two define which are without quote or just like json text
2023-10-16 11:13:44 +08:00
4c57c31c5c [fix](Nereids) count should not accept complex and json type (#25354) 2023-10-15 22:08:35 -05:00
dfc7d04626 [fix](functions) add quantile_state_empty function signature (#25306) 2023-10-16 11:05:48 +08:00
9649e09aaa [feature](function) support bitmap type in min/max_by agg function (#25430)
support bitmap type in min/max_by agg function
2023-10-16 11:05:32 +08:00
e5ef0aa6d4 [refactor](mysql result format) use new serde framework to tuple convert (#25006) 2023-10-14 19:46:42 +08:00
b946521a56 [enhancement](regression-test) add single stream multi table case (#25360) 2023-10-14 10:59:50 +08:00
283bd59eba [improvement](scanner) Remove the predicate that is always true for the segment (#25366)
By utilizing the zonemap index of the segment, we can ascertain if a predicate is always true. For example, if the segment’s maximum value is 100 and the predicate is col < 101, then this predicate is always true for this segment.
2023-10-13 15:25:38 +08:00
6f9a084d99 [Fix](Outfile) Use data_type_serde to export data to parquet file format (#24998) 2023-10-13 13:58:34 +08:00
509a79988e [FIX](regresstest) fix cases for test_nested_types_insert_into_with_s3 (#25228) 2023-10-13 11:39:29 +08:00
c6824ce1ae [test](fix) unstable case test_jdbc_query_mysql (#25279) 2023-10-12 03:56:38 -05:00
42f8b253aa [function](nereids) support array_apply/array_repeat/group_uniq_array/ipv4numtostring (#25249)
nereids support functions: array_apply/array_repeat/group_uniq_array/ipv4numtostring
2023-10-12 11:08:42 +08:00
Pxl
a0d2b1ec56 [Bug](materialized-view) fix not match mv when some alias on agg (#25321)
fix not match mv when some alias on agg
2023-10-12 11:02:55 +08:00
7ca63665b4 [fix](agg) garbled characters in result of map_agg (#25318) 2023-10-12 10:10:55 +08:00
73c3e3ab55 [Feature](x-load) support config min replica num for loading data (#21118) 2023-10-11 21:07:35 +08:00
ba87f7d3a3 [fix](pipelineX) add table sink and some fix in pipelineX (#25314) 2023-10-11 20:18:08 +08:00
f680a2141d [enhancement](regression-test) add routine load json case (#25253) 2023-10-11 19:43:08 +08:00
c6b1c903e4 [fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test (#25115) 2023-10-11 18:30:26 +08:00
e514d52232 [fix](point-query) Support mow table with sequence column (#25308) 2023-10-11 18:22:16 +08:00
2d19f2fbfe [fix](planner)need call materializeSrcExpr for materialized slots in join node (#25204) 2023-10-11 16:34:53 +08:00
e9554e36a8 [fix](nereids)disable parallel scan in some case (#25089) 2023-10-11 16:32:09 +08:00
6d999f5b95 [enhancement](nereids)add eliminate filter on one row relation rule (#24980)
1.simplify PushdownFilterThroughSetOperation rule
2.add eliminate filter on one row relation rule
2023-10-11 16:12:24 +08:00
47578c0fc9 [fix](Nereids) fix toSql of date literal (#25243)
toSql should return '2023-2-1 ' for DateLiteral 2023-2-1
2023-10-11 13:04:05 +08:00
b91bce8a62 [feature](Nereids) add array distance functions (#25196)
- l1_distance
- l2_distance
- cosine_distance
- inner_product
2023-10-10 21:35:06 -05:00
5be29f859a [enhancement](node) add filter in partition sort node in BE #25188
add filter in partition sort node in BE
2023-10-11 10:30:15 +08:00
1fa8720164 [regression-test](merge-on-write) Fix partial update concurrency conflict case (#25212) 2023-10-11 10:17:01 +08:00
b7ac95a970 [enhancement](regression-test) open routine load regression test by default and add data check (#25122) 2023-10-11 10:03:16 +08:00
5f95e97c56 [fix](function) array distance should return null when result is nan (#25214) 2023-10-10 04:41:51 -05:00
181c58c691 [fix](Nereids) count_by_enum signature is wrong (#25167) 2023-10-10 13:05:20 +08:00
59dee6b235 [fix](Nereids) support string cast to complex type (#25154) 2023-10-10 10:26:33 +08:00