14149 Commits

Author SHA1 Message Date
d2400d1d7b [feature](profile) profilev2 distinguish Sink and Operator in pipelineX (#25491)
* update

* update
2023-10-18 13:12:29 +08:00
6cb947f72b [refactor](unused code) delete unused method from field.h (#25554)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-18 13:11:14 +08:00
64aeeb971b [Fix](partial-update) Correct the alignment process when the table has sequence column and add cases (#25346)
This PR fixes the alignment process during the publish phase when a conflict occurs between concurrent partial updates. If we encounter a row with the same key and a larger value in the sequence column, it means that another load, which introduced a row with the same key and a larger sequence column value, was published successfully after the commit phase of the current load. We should act as follows:

- If the columns we update include the sequence column, we should delete the current row because the partial update on the current row has been overwritten by the previous one with the larger sequence column value.
- Otherwise, we should combine the values of the missing columns from the previous row and the values of the included columns from the current row into a new row.
2023-10-18 11:32:51 +08:00
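The two cases described in #25346 above can be illustrated with a minimal sketch. It is not the Doris implementation: rows are modelled as plain dicts, and the function name `align_conflicting_row` and its parameters are hypothetical. It assumes the conflict (same key, larger sequence value in the previously published row) has already been detected.

```python
from typing import Any, Dict, Optional, Set

Row = Dict[str, Any]


def align_conflicting_row(
    current: Row, previous: Row, updated_cols: Set[str], seq_col: str
) -> Optional[Row]:
    """Resolve one conflicting row during the publish phase (illustrative only).

    `previous` is the already published row with the same key and a larger
    sequence value; `current` is the row written by this load. Returns the
    row to keep, or None when the current row should be deleted.
    """
    if seq_col in updated_cols:
        # The update carried its own (smaller) sequence value, so it has
        # been overwritten by the previous row: drop the current row.
        return None
    # Otherwise keep the missing columns from the previous row and take
    # the updated columns from the current row.
    merged = dict(previous)
    for col in updated_cols:
        merged[col] = current[col]
    return merged
```

In this sketch the merged row keeps the previous row's sequence value, since the sequence column is one of the missing columns in the second case.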
ef9cbc4c64 [enhancement](priv) Clarify ccr related FrontendServiceImpl call privs (#25530)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-10-18 10:51:55 +08:00
6f6264693f [fix](Nereids) can't choosing best plan for join that could only broadcast (#25511)
We need to ensure there is at least one request property.
2023-10-17 21:40:05 -05:00
b0e0a0569a [Fix](row store) Real default value should be used instead of default… (#25230)
Before this PR the default value was not correct; we should use the default value from the Frontend schema.
2023-10-18 10:13:44 +08:00
5503d04be2 [fix](test) create table should with distribution info (#25544)
create table should with distribution info
2023-10-18 10:03:35 +08:00
3225495233 [regression-test](export) Add some tests that use hive external table to read orc/parquet file exported by doris (#25431)
Add some regression tests:

1. Export Doris data to orc/parquet files on HDFS with Doris.
2. Create an external table in Hive to read the orc/parquet files.
2023-10-18 09:59:15 +08:00
7cfb1d9b0e [Regression case](statistics) Add regression test case for fetching HMSExternalTable through hms. (#25548)
Regression case for fetching HMSExternalTable statistics through HMS when the table is not analyzed.
2023-10-18 09:57:58 +08:00
47689fd452 [refactor](jni) unified jni framework for java udf (#25302)
Use the unified JNI framework to refactor Java UDF.
The unified JNI framework uses VectorTable as the container for passing data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
The performance of basic types remains consistent, with a 30% improvement in string types and an order of magnitude improvement in complex types.
2023-10-18 09:27:54 +08:00
26e332c608 [fix](multi-catalog)add exception for unsupported hive input format (#25490)
add exception for unsupported hive input format
2023-10-17 22:53:53 +08:00
b76e23fb34 [improvement](meta) allow to ignore unknown image module (#25450)
Add new FE config `ignore_unknown_metadata_module`. Default is false.
If set to true, unknown modules encountered while reading a metadata image file will be ignored and skipped.
This is mainly used in downgrade operations, so that an old version can be compatible with an image file written by a new version.
2023-10-17 22:53:31 +08:00
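The behaviour of `ignore_unknown_metadata_module` described in #25450 above can be sketched roughly as follows. This is a simplified model, not the FE code: the image is represented as a list of `(module_name, payload)` pairs, and `load_image_modules`, `known_loaders`, and `UnknownModuleError` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


class UnknownModuleError(Exception):
    """Raised when an image contains a module this version cannot read."""


def load_image_modules(
    modules: Iterable[Tuple[str, bytes]],
    known_loaders: Dict[str, Callable[[bytes], Any]],
    ignore_unknown_metadata_module: bool = False,  # default matches the config
) -> Tuple[Dict[str, Any], List[str]]:
    """Load every known module; skip or reject unknown ones per the flag."""
    loaded: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, payload in modules:
        loader = known_loaders.get(name)
        if loader is None:
            if ignore_unknown_metadata_module:
                skipped.append(name)  # e.g. a module written by a newer version
                continue
            raise UnknownModuleError(name)
        loaded[name] = loader(payload)
    return loaded, skipped
```

With the flag left at its default of false, the first unknown module aborts the load, which is the safer behaviour outside of a downgrade.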
18c2a13e09 [fix](multi-catalog)fix maxcompute partition filter and session creation (#24911)
add maxcompute partition support
fix maxcompute partition filter
modify maxcompute session create method
2023-10-17 22:36:10 +08:00
ce18f1148a [improvement](catalog)compatible with paimon 0.5 (#24985)
Make the catalog compatible with Paimon 0.5.
Add p0 cases for Paimon; enablePaimonTest=true needs to be set.
2023-10-17 22:07:13 +08:00
f6f1e3b646 [chore](build) Bump the version of hyperscan (#25464)
The latest version fixed the previous issue (https://github.com/intel/hyperscan/issues/292).
2023-10-17 08:45:25 -05:00
d287f53d77 [fix](nereids)in physical plan, print join class simple name not full name #25515 2023-10-17 20:25:14 +08:00
9b1cdd3230 [fix](planner) mark join slot should always be nullable (#25433) 2023-10-17 06:14:13 -05:00
b74836050a [chore](config) turnoff fuzzy for enable_simdjson_reader (#25521) 2023-10-17 18:42:11 +08:00
8eff1486bd [feature](nereids)print query id with memo and physical tree (#25501)
Print the query id with the memo and physical tree when dump_nereids_memo is switched on. This is used for regression tests.
2023-10-17 05:06:11 -05:00
9d6b2dceb2 [fix](Nereids) non-slot filter should not be push through aggregate (#25525) 2023-10-17 05:02:26 -05:00
af8832389f [feature](Nereids) add 4 array functions (#25488)
- array_concat
- array_pushback
- array_pushfront
- array_zip
2023-10-17 04:45:15 -05:00
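For readers unfamiliar with the four functions named in #25488 above, their intended semantics can be approximated in plain Python. This is only a sketch under assumptions: the argument order (array first, then element) and the equal-length requirement for `array_zip` are assumptions, not taken from the commit, and Python tuples stand in for struct values.

```python
from typing import Any, List, Sequence, Tuple


def array_concat(*arrays: Sequence[Any]) -> List[Any]:
    """Concatenate all input arrays in order."""
    result: List[Any] = []
    for arr in arrays:
        result.extend(arr)
    return result


def array_pushback(arr: Sequence[Any], value: Any) -> List[Any]:
    """Return a new array with `value` appended at the end."""
    return list(arr) + [value]


def array_pushfront(arr: Sequence[Any], value: Any) -> List[Any]:
    """Return a new array with `value` inserted at the front."""
    return [value] + list(arr)


def array_zip(*arrays: Sequence[Any]) -> List[Tuple[Any, ...]]:
    """Combine the i-th elements of every array into one tuple per position.

    Assumption: all inputs must have the same length.
    """
    if len({len(arr) for arr in arrays}) > 1:
        raise ValueError("array_zip expects arrays of equal length")
    return list(zip(*arrays))
```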
06ff59bc03 [Performance](sink) SIMD the tablet sink valied data function (#25480) 2023-10-17 16:21:08 +08:00
f38f5f50eb [fix](ipv6)fix can not resolve host and port (#25254)
For IPv6, the address should be [ip]:port instead of ip:port.
2023-10-17 15:46:45 +08:00
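The bracket rule from #25254 above exists because an IPv6 literal itself contains colons, so `ip:port` is ambiguous. A minimal sketch of formatting and splitting such addresses, using Python's standard `ipaddress` module; the helper names `format_host_port` and `split_host_port` are hypothetical and not part of Doris:

```python
import ipaddress
from typing import Tuple


def format_host_port(host: str, port: int) -> str:
    """Join host and port, wrapping IPv6 literals in square brackets."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, e.g. a hostname
    return f"[{host}]:{port}" if is_ipv6 else f"{host}:{port}"


def split_host_port(address: str) -> Tuple[str, int]:
    """Split 'host:port' or '[ipv6]:port' back into its two parts."""
    if address.startswith("["):
        host, sep, port = address[1:].partition("]:")
    else:
        host, sep, port = address.rpartition(":")
    if not sep or not host:
        raise ValueError(f"cannot resolve host and port from {address!r}")
    return host, int(port)
```

For example, `format_host_port("::1", 9030)` yields `[::1]:9030`, which `split_host_port` turns back into `("::1", 9030)`.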
31a5e072e7 [refactor](pipelineX) Simplify set operation (#25502) 2023-10-17 15:11:46 +08:00
652d6c57c0 [fix](jdbc catalog) fix handle oracle date format (#25487) 2023-10-17 02:10:28 -05:00
8c5af5a088 [fix](case) Fix test_analyze case (#25476)
The case had the following problems before this PR:
- It used count(*) to check whether all columns were analyzed.
- It returned directly when the FE count > 1.
Co-authored-by: AKIHA <cyborgz1999@example.com>
2023-10-17 15:06:01 +08:00
1514f78b87 [refactor](partial-update) Split partial update infos from tablet schema (#25147) 2023-10-17 14:21:40 +08:00
4d12d8885e [feature](Nereids): graphSimplifier should compare edge1BeforeEdge2 and edge2BeforeEdge1 (#25416) 2023-10-17 14:10:21 +08:00
c4cc6cefda [fix](regression-test) fix http stream 2pc case(#25507) 2023-10-17 13:21:03 +08:00
0ee06f30b0 [feature](nereids)Ignore some node in 'explain shape plan' command (#25485)
If ignore_shape_nodes='PhysicalDistribute, PhysicalProject' is set, then explain shape plan will not print project and distribute nodes.
2023-10-17 11:57:36 +08:00
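The effect of `ignore_shape_nodes` from #25485 above can be pictured with a small tree printer. This is a sketch only: the plan is modelled as nested `(name, children)` tuples, the `--` indentation is an assumed output format rather than the exact one used by `explain shape plan`, and both function names are hypothetical.

```python
from typing import List, Set, Tuple

PlanNode = Tuple[str, list]  # (node name, list of child PlanNode tuples)


def parse_ignore_shape_nodes(value: str) -> Set[str]:
    """Split the comma-separated setting into a set of node names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def shape_lines(node: PlanNode, ignored: Set[str], depth: int = 0) -> List[str]:
    """Render the plan shape, omitting ignored nodes but keeping their children."""
    name, children = node
    lines: List[str] = []
    if name in ignored:
        child_depth = depth  # children take the place of the hidden node
    else:
        lines.append("--" * depth + name)
        child_depth = depth + 1
    for child in children:
        lines.extend(shape_lines(child, ignored, child_depth))
    return lines
```

Hiding a node while still printing its children keeps the rest of the shape comparable across runs, which is useful when the ignored nodes are the ones that vary.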
c2fe34dec7 [refine](pipelineX) refactor local state (#25448) 2023-10-17 11:23:29 +08:00
410441b516 [enhancement](Nereids): remove LAsscom in Bushy Tree RuleSet (#25465)
- Bushy Tree RuleSet doesn't need LAsscom
- fix bug: rule pattern shouldn't use same name
2023-10-17 11:22:52 +08:00
384fddb2ff [test](case)add some debug log in mv case (#25458)
* [test](case)change the insert stmt in mv case
2023-10-17 11:04:45 +08:00
5f844486e3 [enhancement](invert index) read columns by index reduce seek time (#24735) 2023-10-17 10:34:33 +08:00
1130317b91 [Improvement](statistics)Collect stats for hive partition column using metadata (#24853)
Stats for Hive partition columns can be calculated from Hive metastore data, so there is no need to execute SQL to get them.
This PR uses Hive partition metadata to collect partition column stats.
2023-10-17 10:31:57 +08:00
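To illustrate why no SQL is needed for the stats in #24853 above: every distinct value of a partition column already appears in the partition list, so basic stats fall out of the metadata. The sketch below assumes each partition is available as a `(value, row_count)` pair and that null values land in Hive's `__HIVE_DEFAULT_PARTITION__`; the function name and the returned field names are hypothetical and do not mirror Doris internals.

```python
from typing import Dict, Iterable, Optional, Tuple, Union

HIVE_DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__"  # Hive's name for the null partition

StatValue = Union[int, Optional[str]]


def partition_column_stats(
    partitions: Iterable[Tuple[str, int]]
) -> Dict[str, StatValue]:
    """Derive column stats for one partition column from partition metadata."""
    values = set()
    row_count = 0
    num_nulls = 0
    for value, rows in partitions:
        row_count += rows
        if value == HIVE_DEFAULT_PARTITION:
            num_nulls += rows
        else:
            values.add(value)
    return {
        "row_count": row_count,
        "ndv": len(values),  # number of distinct non-null partition values
        "num_nulls": num_nulls,
        # min/max compare the raw strings; a real collector would
        # compare by the column's declared type instead
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }
```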
a383a2bc83 [cases](regresstest)add json format regress test for nested types (#25397) 2023-10-17 10:16:52 +08:00
a364a24ac2 [Enhance](regression) add hive out file check (#25475)
add hive out file check
fix hive sql state with " ; "
2023-10-17 10:11:57 +08:00
ef7d8aa99a [fix](be)confix bug of converting outer join probe block to nullable (#25492)
_do_evaluate adds a temporary result column to the original table block, so in order to convert only the correct columns to nullable, convert_block_to_null needs to be called before _do_evaluate.
2023-10-17 10:10:56 +08:00
85b8497624 [fix](Tvf) return empty set when tvf queries an empty file or an error uri (#25280)
### Before:
Errors were returned when a tvf queried an empty file or an error uri:
1. get parsed schema failed, empty csv file
2. Can not get first file, please check uri.

### Now:
We just return an empty set when a tvf queries an empty file or an error uri.
```sql
mysql> select * from s3( 
"uri" = "https://error_uri/exp_1.csv", 
"s3.access_key"= "xx", 
"s3.secret_key" = "yy", 
"format" = "csv") limit 10;

Empty set (1.29 sec)
```
2023-10-17 09:52:53 +08:00
cda8fb6b8b [fix](load) return Status when error in RowsetWriter::build (#25381) 2023-10-17 09:40:23 +08:00
a194a15442 [improvement](tablet schedule) colocate balance between all groups (#23543) 2023-10-17 09:33:52 +08:00
f1a5e393c7 [feature](insert) Support group commit insert use new syntax like insert into table_id(xxx) (#25484) 2023-10-17 09:23:09 +08:00
f75ee49cb4 [chore](fmt) Remove stringstream by fmt (#25474)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-10-16 21:31:54 +08:00
59ebbb351e [feature](merge-cloud) Enable write into cache when uploading file to s3 using s3 file writer (#24364) 2023-10-16 21:31:02 +08:00
fe1980d7f2 [docs](docs) Add release note 2.0.2 (#25375) 2023-10-16 20:38:45 +08:00
f9a80ecdab [improvement](sync version) fe sync version with be (#25236) 2023-10-16 20:34:25 +08:00
e9157a3dba [fix](path gc) fix data dir path gc (#25420) 2023-10-16 20:25:20 +08:00
eaf5febc97 [enhancement](cooldown) Improve cooldown logs (#25432) 2023-10-16 20:17:00 +08:00
bfc602f343 [compile](fix) fix ubsan compile error (#25473) 2023-10-16 20:11:17 +08:00
Pxl
72920fbd1d [Improvement](materialized-view) set job failed when toAgentTaskRequest meet error (#25358)
set job failed when toAgentTaskRequest meet error
2023-10-16 20:10:52 +08:00