Commit Graph

3297 Commits

Author SHA1 Message Date
7f1b558011 [fix](stats) truncate min/max if too long (#27955)
For some string value the max/min might be a very long string
which might take too many memory of FE,
so we truncate to 1024 chars if it's too long
2023-12-05 20:40:38 +08:00
05adbfdb3d [feature](inverted index) match_phrase_prefix feature added (#27404)
select count() from test_index_match_phrase_prefix where request match_phrase_prefix 'xxx';
2023-12-05 20:15:13 +08:00
e79422addc [refactor](Nereids) compatible with all ability legacy planner (#27947)
refactor:
1. split InsertIntoTableCommand into three sub command
- InsertIntoTableCommand
- InsertOverwriteTableCommand
- BatchInsertIntoTableCommand

feature:
1. support DEFAULT keywords in values list
2. support empty values list
3. support temporary partition
4. support insert into values in txn model

fix:
1. should start transaction before release read lock on target table
2023-12-05 19:10:55 +08:00
59c0db4080 Fix workload group unstable (#28003) 2023-12-05 17:55:43 +08:00
8dc67d3be6 [case](regression-test) add backup recovery for inverted index and bl… (#27616) 2023-12-05 17:11:02 +08:00
6074cddcf8 [feature](mtmv)add Job and task tvf (#27967)
add:
select * from jobs("type"="mv");
select * from tasks("type"="mv");
select * from jobs("type"="insert");
select * from tasks("type"="insert");

add check priv for mv_infos("database"="xxx");

change JobType MTMV==>MV
2023-12-05 15:12:36 +08:00
02512cd0e2 [fix](stats)Drop stats or update updated rows after truncate table (#27931)
1. Also clear follower's stats cache when doing drop stats.
2. Drop stats when truncate a table.
2023-12-05 14:53:35 +08:00
Pxl
8a761dff84 [Bug](materialized-view) fix create mv failed on unique table (#27971)
fix create mv failed on unique table
2023-12-05 14:53:09 +08:00
c98b80ae6a [Feature](functions) support ignore and nullable functions (#27848)
support ignore and nullable functions
2023-12-05 14:09:32 +08:00
79f6f85cf1 [FIX](serde)fix datetimev2 serde parse from string with scale (#27965) 2023-12-05 13:58:32 +08:00
aa75f81af5 [fix](case) add "nonConcurrent" to prevent effect other case (#27941)
same as #27940
2023-12-05 13:04:25 +08:00
17016b9797 [improvement](decimal) use new way for decimal arithmetic precision promotion (#27787)
* [DNM](decimal) use new way for decimal arithmetic precision promotion

* [improvement](decimal) [DNM](decimal) use new way for decimal arithmetic precision promotion
1. [DNM](decimal) use new way for decimal arithmetic precision promotion
2. throw exception if it overflows for decimal arithmetics
3. throw exception if it overflows when casting among number types

* fix compile error of gcc

* improvement

---------

Co-authored-by: morrySnow <morrysnow@126.com>
2023-12-05 12:54:40 +08:00
ca6949ee3e [Bug](partition) fix auto list partition erros of incorrect partition name (#27974)
the partition name need limit it's length and can't have negative "-"
2023-12-05 12:54:06 +08:00
2f63999066 [fix](Nereids): Preserve "" in single quote strings and '' in double quote strings. (#27959) 2023-12-05 12:30:03 +08:00
358d73a0ae [FIX](complextype) fix empty quote with complex type (#27942) 2023-12-05 12:25:26 +08:00
3412a022f4 [fix](restore) fix Restore from __keep_on_local__ throws null pointer… (#26943)
Co-authored-by: walter <patricknicholas@foxmail.com>
Co-authored-by: hugoluo <hugoluo@tencent.com>
Co-authored-by: walter <patricknicholas@foxmail.com>
2023-12-05 10:55:28 +08:00
50ad40a7a8 [test](Nereids): add infer-predicates regression test (#27850) 2023-12-05 10:16:01 +08:00
4934f7ed8d [enhancement](Nereids) add test for some push down filter rule (#27757) 2023-12-04 20:57:57 +08:00
4c9bf98dcd [config](p0) Pipeline p0 open arrow_flight_sql_port #27945 2023-12-04 19:24:15 +08:00
b096062680 [feature-wip](arrow-flight)(step6) Support regression test (#27847)
Design Documentation Linked to #25514

Regression test add a new group: arrow_flight_sql,

./run-regression-test.sh -g arrow_flight_sql to run regression-test, can use jdbc:arrow-flight-sql to run all Suites whose group contains arrow_flight_sql.
./run-regression-test.sh -g p0,arrow_flight_sql to run regression-test, can use jdbc:arrow-flight-sql to run all Suites whose group contains arrow_flight_sql, and use jdbc:mysql to run other Suites whose group contains p0 but does not contain arrow_flight_sql.
Requires attention, the formats of jdbc:arrow-flight-sql and jdbc:mysql and mysql client query results are different, for example:

Datatime field type: jdbc:mysql returns 2010-01-02T05:09:06, mysql client returns 2010-01-02 05:09:06, jdbc:arrow-flight-sql also returns 2010-01-02 05:09 :06.
Array and Map field types: jdbc:mysql returns ["ab", "efg", null], {"f1": 1, "f2": "a"}, jdbc:arrow-flight-sql returns ["ab ","efg",null], {"f1":1,"f2":"a"}, which is missing spaces.
Float field type: jdbc:mysql and mysql client returns 6.333, jdbc:arrow-flight-sql returns 6.333000183105469, in query_p0/subquery/test_subquery.groovy.
If the query result is empty, jdbc:arrow-flight-sql returns empty and jdbc:mysql returns \N.
use database; and query should be divided into two SQL executions as much as possible. otherwise the results may not be as expected. For example: USE information_schema; select cast ("0.0101031417" as datetime) The result is 2000-01-01 03:14:1 (constant fold), select cast ("0.0101031417" as datetime) The result is null (no constant fold),
In addition, doris jdbc:arrow-flight-sql still has unfinished parts, such as:

Unsupported data type: Decimal256. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E3] write_column_to_arrow with type Decimal256
Unsupported null value of map key. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E33] Can not write null value of map key to arrow.
Unsupported data type: ARRAY<MAP<TEXT,TEXT>>
jdbc:arrow-flight-sql not support connecting to specify DB name, such asjdbc:arrow-flight-sql://127.0.0.1:9090/{db_name}", In order to be compatible with regression-test, use db_nameis added before all SQLs whenjdbc:arrow-flight-sql` runs regression test.
select timediff("2010-01-01 01:00:00", "2010-01-02 01:00:00");, error java.lang.NumberFormatException: For input string: "-24:00:00"
2023-12-04 19:23:56 +08:00
283e1ea0b7 test operate txn 2pc exception handling (#27924)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-12-04 17:30:05 +08:00
a7d1e92fc2 [Fix](variant) handle StorageReadOptions to avoid crash in new_column_iterator_with_path (#27936)
In partial update, read variant without `opt` will lead to crash
2023-12-04 17:02:35 +08:00
c80807a5c7 [fix](test) disable join reorder in bucket shuffle join test (#27930) 2023-12-04 16:15:00 +08:00
2eb8e0d66a [minor](testcase) Add auto partition test cases (#27921)
Add auto partition test cases
2023-12-04 15:10:42 +08:00
48935c14e2 [Improvement](variant) limit the column size on tablet schema (#27399) (#27785)
1. limit the column count to default 2048
2. fix get_inverted_index return nullptr when variant's unique id is -1, using it's parent unique id instead
3. avoid add same path subcolumn duplicately in tablet schema
4. make extracted column unique id -1
2023-12-04 14:47:36 +08:00
Pxl
e196a4fd8f [Chore](case) log out show alter table info on no_await (#27926)
log out show alter table info on no_await
2023-12-04 14:30:16 +08:00
e19af1b2ed [regression](Nereids) add rule test for push down limit + sort test (#26642) 2023-12-04 14:18:55 +08:00
e62d19d90d [improve](partition) support auto list partition with more columns (#27817)
before the partition by column only have one column.
now remove those limit, could have more columns.
2023-12-04 11:33:18 +08:00
f2cfc87aca [fix](nereids) temporary partition is selected only if user manually specified (#27893)
q1: "select * from ut_p temporary partitions(tp1) where val > 0"
in q1, temporary partition tp1 is scaned

q2: "select * from ut_p where val > 0"
in q2, temporary partition tp1 is not scaned.
2023-12-04 09:44:27 +08:00
97d36b4f38 [fix](csv_reader) fix trim_double_quotes behavior change (#27882) 2023-12-03 22:57:55 +08:00
3ddc8211d1 [FIX](array )fix array<null> literal in fe (#27750) 2023-12-03 13:19:22 +08:00
43f2966889 [case](regression) using load_parallelism when load csv and json from s3 (#27525)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-12-03 09:56:47 +08:00
80d2c7ab41 [feature](parquet)support read parquet lzo compress. (#27706) 2023-12-03 09:55:52 +08:00
66cfcc67cb [Fix](exectuor)Fix Follower Fe query queue may not work when exec alter #27831 2023-12-02 20:19:50 +08:00
2e1ce758f1 [feature](function) support ip function ipv6numtostring(alias inet6_ntoa) (#27342) 2023-12-02 11:48:19 +08:00
b74388c3b1 [case](regression) Add backup restore test with specified partition (#27694) 2023-12-01 22:31:59 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. max compute partition prune,
we just support filter mc partitions by '=',it can filter just one partition
to support multiple partition filter and range operator('>','<', '>='..), the partition prune should be supported.

2. add max compute row count cache and partitionValues cache

3. add max compute regression case
2023-12-01 22:28:26 +08:00
f4afcae452 [case](regression) Stream load 2pc exceptions (#27804)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-12-01 22:27:40 +08:00
8749e5208f [fix](jdbc catalog) fix insert into jdbc table column order (#27855) 2023-12-01 20:46:48 +08:00
3f20cf1456 [fix](nereids)set operation's result type is wrong if decimal overflows (#27870) 2023-12-01 18:40:06 +08:00
007506ce42 [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode (#27842) 2023-12-01 17:32:46 +08:00
34c85c962f [opt](Nereids) improve semi/anti join estimation when column stats are unavailable #27793
this change improves performance of tpch q20. on sf500, improved from 6.3sec to 1.1 sec
this change has no impaction on tpcds

when column stats is unknown,
the basic algorithm to estimate left semi join output row count is its left child output row count.
q1: "A left semi join B on A.x=B.x"
the output row is estimated as A.rowCount.

But the basic algorithm is not good to following pattern:
q2: "A left semi join filter(B) on A.x=B.x"
Because there is a filter on B, usually this left semi join also reduce the row count of A, and we estimate
the output of q2 as A.rowCount * Filter.rowCount/B.rowCount
2023-12-01 15:48:33 +08:00
Pxl
64fad89eb1 [Chore](case) add case of join with big hashtable (#27825)
add case of join with big hashtable
2023-12-01 15:32:23 +08:00
776f0205f3 [Fix](test) Fix an auto partition conflict and add many testcases (#27730)
Fix an auto partition conflict and add many testcases
2023-12-01 09:58:44 +08:00
2afbece0b8 [Fix](type) fix wrong type transform for unix_timestamp (#27728)
fix wrong type transform for unix_timestamp
2023-12-01 09:58:20 +08:00
60bc3be8a2 [Opt](Compression) Opt zstd block decompression by ZSTD_decompressDCtx(). (#27534)
Opt zstd block decompression by `ZSTD_decompressDCtx()` to replace streaming decompression.
It will improve performance but consume more memory. 

Test result: 
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
2023-12-01 09:10:32 +08:00
6a614c3e7b [regression](nereids) add regression case for transposeSemiJoinAgg/transposeSemiJoinAggProject rules (#27664)
add case for transposeSemiJoinAgg/transposeSemiJoinAggProject rules
2023-12-01 08:19:16 +08:00
2b2c2dd772 [fix](sequence column) insert into should require sequence column in all scenario (#27780) 2023-11-30 23:27:58 +08:00
6c4ec3cb82 [FIX](complextype)fix array/map/struct impl hashcode and equals (#27717) 2023-11-30 22:08:15 +08:00
97105e9a16 [regression](compaction) Add case to test single replica compaction (#27199) 2023-11-30 21:27:13 +08:00