Commit Graph

1782 Commits

Author SHA1 Message Date
0dc6d3a568 [fix](nereids) avg size of column stats always be 0 (#20341)
It takes a lot of effort to compute the avgSizeByte for column stats,
so we use schema information to avoid computing the actual average size.
2023-06-05 13:01:58 +08:00
cd0379df4e [fix](nereids) select with specified partition name is not work as expected (#20269)
This PR fixes the issue with selecting from a specified partition; some code related to this feature had been accidentally deleted.
2023-06-05 12:48:54 +08:00
d03bb4ba7b [Optimize](function) Optimize locate function by compare across strings (#20290)
Optimize the locate function by comparing across strings. About a 90% speedup when tested with sum().
2023-06-05 12:43:14 +08:00
3c28a71378 [fix](dynamic partition) partition create failed after alter distributed column (#20239)
This PR fixes the following two problems:

Problem 1: Altering a column comment makes adding a dynamic partition fail (see issue #10811).

Steps to reproduce:
1. create a table with a dynamic partition policy;
2. restart FE;
3. alter the distribution column comment;
4. alter dynamic_partition.end to trigger the dynamic partition scheduler to add a new partition.

Then we get the following error log, and creating the new partition fails:
dynamic add partition failed: errCode = 2, detailMessage =      Cannot assign hash distribution with different distribution cols. default is: [id int(11) NULL COMMENT 'new_comment_of_id'], db: default_cluster:example_db, table: test_2

Problem 2: Renaming a distribution column makes inserts into old partitions fail (see #20405).

The key step in reproducing both problems is restarting FE.

It seems all versions are affected, including master, lts-1.1, and so on.
2023-06-05 12:20:50 +08:00
a6d8115cbc [Improvement](planner) expand sql-block-rule to make it can be used on all kinds of sql stmt (#19540)
Currently, sql-block-rule can only be used for query statements, although it is also useful for other statements such as insert / delete / alter / drop. This change removes the limitation and expands its usage scenarios.
2023-06-05 11:01:43 +08:00
c6387847aa [fix](nereids) change defaultConcreteType function's return value for decimal (#20380)
1. add default decimalv2 and decimalv3 for NullType
2. change defaultConcreteType of decimalv3 to this
2023-06-05 10:50:07 +08:00
59a0f80233 [Improve](array-function)Improve array function intersect (#20085)
Currently the array function only supports 2 arrays, but the intersect operator can support more than 2 arrays.
2023-06-05 10:38:48 +08:00
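As a rough illustration of what commit 59a0f80233 describes, intersecting more than two arrays can be sketched as a fold over pairwise intersections. This is a Python sketch only, not the Doris implementation; the function name `array_intersect` and the choice to keep first-array order and drop duplicates are assumptions made for the example.

```python
from functools import reduce


def array_intersect(*arrays):
    """Intersect any number of arrays.

    Assumption for this sketch: the result keeps the order of the first
    array and contains each common element once.
    """
    if not arrays:
        return []
    # Fold pairwise set intersection over all remaining arrays.
    common = reduce(lambda acc, arr: acc & set(arr), arrays[1:], set(arrays[0]))
    seen = set()
    result = []
    for item in arrays[0]:
        if item in common and item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(array_intersect([1, 2, 3, 4], [2, 3, 5], [3, 2, 9]))  # [2, 3]
```

The two-array case is just the fold with a single step, so the same code path covers both.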
653dc715f6 [fix](pipeline) Comment unstable p1 tests #20407
Some stats-collection-related tests are unstable; comment them out temporarily.
2023-06-05 09:53:34 +08:00
e90b78d783 [chore](regression) add case in test_delete (#20372)
Add some cases of deletion conditions with numeric values.
2023-06-05 09:38:29 +08:00
50ce237a24 [fix](regression) exclude test_analyze_stats_p1 suite (#20366)
test_analyze_stats_p1 is failing constantly in regression test, @morrySnow suggests ignoring it first.

http://43.132.222.7:8111/test/-5693062769677098407?currentProjectId=Doris_DorisRegression&expandTestHistoryChartSection=true&expandedTest=build%3A%28id%3A155592%29%2Cid%3A9944
2023-06-05 08:21:46 +08:00
34a1b7599f [Fix](lazy_open) fix lazy open commit info lose (#20404) 2023-06-04 19:08:36 +08:00
ffadaa4935 [improvement](inverted index) skip write index on load and generate index on compaction (#20325) 2023-06-03 16:03:21 +08:00
4e4972d311 [fix](regression) test_partial_update_with_row_column (#20279) 2023-06-02 21:51:33 +08:00
Pxl
c2e96c7fa6 [Bug](schema-change) make test_dup_mv_schema_change more stable #20379
make test_dup_mv_schema_change more stable
2023-06-02 21:25:27 +08:00
Pxl
90d710e83d [Enchancement](function) optimize for padding function && add string length check on string op (#20363) 2023-06-02 21:24:41 +08:00
b62c5a70c7 [fix](match query) fix array column match query failed without inverted index (#20344) 2023-06-02 21:10:12 +08:00
adc3acb283 [fix](match) fix match query with compound predicates return -6003 (#20361) 2023-06-02 18:25:37 +08:00
a20a6d2bea [refactor](jdbc catalog) Refactor the JdbcClient code (#20109)
This PR does the following:

1. This PR is a substantial refactor of the JDBC client architecture. The previous monolithic JDBC client has been refactored into an abstract base class `JdbcClient` and a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`, etc.), and the configuration required by JdbcClient has been abstracted into an object. This allows for improved modularity, easier addition of support for new databases, and cleaner, more maintainable code. This change is backward-compatible and does not affect existing functionality.
2. As a result of the client refactoring, OceanBaseClient can automatically recognize whether the mode of operation is MySQL or Oracle, so we remove the oceanbase_mode property from the Jdbc Catalog. Because the property is removed, when creating a single OceanBase Jdbc Table, the table type needs to be filled in as oceanbase (MySQL mode) or oceanbase_oracle (Oracle mode). Please note that this is a change in usage behavior.
3. For the PostgreSQL Jdbc Catalog, I did two things:

      1.   The adaptation to MATERIALIZED VIEW and FOREIGN TABLE is added
      2.   Fixed reading jsonb, which had been incorrectly changed to json in a previous PR

4. fix some jdbc catalog test case
5. modify oceanbase jdbc doc

Thanks to @wolfboys for the guidance.
2023-06-02 17:58:10 +08:00
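The structure described in commit a20a6d2bea (an abstract base class, database-specific subclasses, and a separate config object) might look roughly like the sketch below. It is written in Python purely for illustration; the real code lives in the Doris FE and is not reproduced here. `JdbcClient` and `JdbcMySQLClient` are names from the commit message, while `JdbcClientConfig`, `JdbcPostgreSQLClient`, `table_types`, `create_client`, and the MySQL table-type list are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class JdbcClientConfig:
    """Hypothetical config object holding what a client needs to connect."""
    jdbc_url: str
    user: str
    password: str = ""


class JdbcClient(ABC):
    """Abstract base class: shared behavior lives here."""

    def __init__(self, config: JdbcClientConfig):
        self.config = config

    @abstractmethod
    def table_types(self):
        """Table types this database exposes (database-specific)."""

    def describe(self):
        return f"{type(self).__name__} -> {self.config.jdbc_url}"


class JdbcMySQLClient(JdbcClient):
    def table_types(self):
        # Illustrative list only.
        return ["TABLE", "VIEW"]


class JdbcPostgreSQLClient(JdbcClient):
    def table_types(self):
        # The commit mentions adding MATERIALIZED VIEW and FOREIGN TABLE.
        return ["TABLE", "VIEW", "MATERIALIZED VIEW", "FOREIGN TABLE"]


def create_client(config: JdbcClientConfig) -> JdbcClient:
    """Hypothetical factory: choose the subclass from the JDBC URL prefix."""
    if config.jdbc_url.startswith("jdbc:mysql:"):
        return JdbcMySQLClient(config)
    if config.jdbc_url.startswith("jdbc:postgresql:"):
        return JdbcPostgreSQLClient(config)
    raise ValueError(f"unsupported JDBC URL: {config.jdbc_url}")


client = create_client(JdbcClientConfig("jdbc:postgresql://host:5432/db", "user"))
print(client.describe())
```

Adding support for another database in this layout means adding one subclass and one branch in the factory, which is the modularity benefit the commit message points to.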
422fcd6377 [fix](Nereids) forbid unexpected expression on filter and fix two more bugs (#20331)
Fix the following bugs:
1. the filter's expressions were not checked; aggregate functions, grouping scalar functions, and window expressions should not appear in a filter
2. should not change the nullable property of an aggregate function when it is the window function in a window expression
3. bitmap and other metric types should not appear in the order by or partition by of a window expression
2023-06-02 16:19:50 +08:00
098c735064 [pipeline](fix) rm github_token, no need for it (#20360) 2023-06-02 14:11:21 +08:00
d68f3f3b3d [Feature](array-functions)improve array functions for array_last_index (#20294)
Currently only array_first_index is supported for lambda input; there is no array_last_index.
2023-06-02 13:54:03 +08:00
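The relationship between the two functions named in commit d68f3f3b3d can be sketched in Python as follows. This is not the Doris implementation; the sketch assumes 1-based positions and a return value of 0 when no element satisfies the lambda, which should be checked against the Doris function documentation.

```python
def array_first_index(pred, arr):
    """Position of the first element satisfying pred (assumed 1-based, 0 if none)."""
    for i, x in enumerate(arr, start=1):
        if pred(x):
            return i
    return 0


def array_last_index(pred, arr):
    """Position of the last element satisfying pred (assumed 1-based, 0 if none)."""
    # Scan from the end so the first hit is the last matching position.
    for i in range(len(arr), 0, -1):
        if pred(arr[i - 1]):
            return i
    return 0


values = [1, 3, 5, 3]
print(array_first_index(lambda x: x > 2, values))  # 2
print(array_last_index(lambda x: x > 2, values))   # 4
```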
8ff8705b3f [fix](olap) deletion statement with space conditions did not take effect (#20349)
Deletion statement like this:

delete from tb where k1 = '  ';
The rows whose k1's value is ' ' will not be deleted.
2023-06-02 13:52:57 +08:00
9d8043e4c1 [Fix](Nereids) should not gather data when sink (#20330) 2023-06-02 10:33:11 +08:00
5a3b97bbf2 [enhancement](struct-type)support comment for struct field (#20200)
support comment for struct field
2023-06-02 10:29:56 +08:00
a8a4da9b9e [fix](nereids)dphyper join reorder may cache wrong project list for project node (#20209)
2023-06-02 09:35:28 +08:00
ecdc5124be [feature-wip](duplicate-no-keys) schame change support for duplicate no keys (#19326) 2023-06-02 09:22:41 +08:00
01770ba68a [fix](regression-test) variable's scope returned by curl (#20347) 2023-06-01 23:38:39 +08:00
9b936049b6 [feature-wip](duplicate_no_keys) Add some test cases of all the duplicate tables in test case tpcds_sf100_without_key_p2 and make them duplicate tables without keys (#20332) 2023-06-01 22:29:51 +08:00
05b7c65509 [fix](regression-test) fix multi-thread problem of regression-test #20322 2023-06-01 18:57:17 +08:00
608d2a3eca [Bug](exec) push down no group by agg min cause error result (#20289)
sql """
CREATE TABLE t1_int (
num int(11) NULL,
dgs_jkrq bigint(20) NULL
) ENGINE=OLAP
DUPLICATE KEY(num)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(num) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false",
"enable_single_replica_compaction" = "false"
);
"""
sql """insert into t1_int values(1,1),(1,2),(1,3),(1,4),(1,null);"""
qt_sql """
select min(dgs_jkrq) from t1_int;
"""

Before the fix we get the wrong result: 4

After the change we get the right result: 1
2023-06-01 17:29:46 +08:00
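The expected result in commit 608d2a3eca follows from the standard SQL rule that MIN ignores NULL values and returns NULL only when there is no non-NULL input. A minimal Python sketch of that rule, using the rows from the test case above (the helper name `sql_min` is hypothetical, and `None` stands in for NULL):

```python
def sql_min(values):
    """MIN with SQL semantics: skip NULLs (None); return None if nothing is left."""
    non_null = [v for v in values if v is not None]
    return min(non_null) if non_null else None


# Rows inserted into t1_int in the test case: (num, dgs_jkrq)
rows = [(1, 1), (1, 2), (1, 3), (1, 4), (1, None)]
print(sql_min(dgs_jkrq for _, dgs_jkrq in rows))  # 1
```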
24fcc2011f [Fix](Nereids) Fix function test case unstable by adding order by (#20295)
The Nereids function cases do not have an order by clause, so their results are unstable; an order by is added to ensure stability.
2023-06-01 15:18:25 +08:00
a8b273ae31 [P2](test) Fix P2 output (#20311) 2023-06-01 15:11:12 +08:00
519f01133a [feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811) 2023-06-01 13:09:58 +08:00
04644c6dfa [fix](regression) regression test test_bitmap_filter_nereids could not run (#20293) 2023-06-01 12:56:32 +08:00
1b968c4ade [fix](multi catalog)Fix nereids planner text format include extra column index bug (#20260)
The Nereids planner includes all column indexes in TFileScanRangeParams, which may make the column projection incorrect for
text format tables, because the csv reader uses the column index position to split a line and an extra column index produces a
wrong split result. This PR resets the column index after Projection and removes the useless column indexes.
2023-06-01 12:17:47 +08:00
cc41cb0e7e [Fix](Nereids) fix some insert into select bugs (#20052)
fix 3 bugs:

1. failed to insert into a table with mv.
```sql
create table t (
    id int,
    c1 int,
    c2 int,
    c3 int
) duplicate key(id)
distributed by hash(id) buckets 4;

create materialized view k12s3m as select id, sum(c1), max(c3) from t group by id;

insert into t select -4, -4, -4, 'd';
```
The insert raises an exception because the mv column is not handled. Now we add a target column and its value as defineExpr.

2. failed to insert into a table with not all the columns.
```sql
insert into t(c1, c2) select c1, c2 from t
```
With t(id ukey, c1, c2, c3), this inserts too much data; we fix it by changing the output partitions.

3. failed to insert into a table with complex select.
When the select statement has a join or agg, the bug is fixed in a way similar to the 2nd bug.
2023-06-01 12:15:19 +08:00
68e593fbf1 [fix](nereids)(planner) case when should return NullLiteral when all case result is NullLiteral (#20280) 2023-06-01 11:11:41 +08:00
4a682a0a46 [fix][regression-test] set timeout of curl in regression test to avoid hanged when be crashed. (#20222)
Currently in regression-test, when a BE crashes, the suite thread gets stuck because curl does not set a timeout.
To solve this, encapsulate the calls to the BE in a function and set the timeout uniformly to avoid getting stuck.
2023-06-01 11:00:09 +08:00
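The approach in commit 4a682a0a46, routing every curl call through one function that always sets a timeout, can be sketched as below. This is a Python illustration rather than the regression framework's own code; the function names, default timeout values, and the example URL are all assumptions. The curl options used (`--silent`, `--show-error`, `--connect-timeout`, `--max-time`, `-X`) are standard curl flags.

```python
import subprocess


def build_curl_command(url, method="GET", connect_timeout_s=10, max_time_s=60):
    """Build a curl command line that always carries timeouts (defaults are assumed)."""
    return [
        "curl", "--silent", "--show-error",
        "--connect-timeout", str(connect_timeout_s),  # limit on connection setup
        "--max-time", str(max_time_s),                # limit on the whole transfer
        "-X", method,
        url,
    ]


def curl(url, method="GET", connect_timeout_s=10, max_time_s=60):
    """Run curl with uniform timeouts so a crashed backend cannot hang the caller."""
    cmd = build_curl_command(url, method, connect_timeout_s, max_time_s)
    # The subprocess timeout is a second guard in case curl itself stalls.
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=max_time_s + 5)
    return proc.returncode, proc.stdout, proc.stderr
```

A caller would then write something like `curl("http://127.0.0.1:8040/api/health", max_time_s=30)` (hypothetical address) and treat a non-zero return code as a failed request instead of waiting indefinitely.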
492154ee55 [fix](regression-test) add jdbc timeout (#20228)
In some cases (or bugs), Doris may return a query result to jdbc, but jdbc cannot recognize what Doris sent back,
so it hangs. To fix this, add a timeout of 30 minutes to the jdbc connection.
2023-06-01 10:50:17 +08:00
9e21318834 [refactor](dynamic table) Make segment_writer unaware of dynamic schema, and ensure parsing is exception-safe. (#19594)
1. make ColumnObject exception safe
2. introduce FlushContext and construct schema at memtable flush stage to make segment independent from dynamic schema
3. add more test cases
2023-06-01 10:25:04 +08:00
90cd791789 [fix](tvf) s3 tvf specify region and s3.region params failed (#19921) 2023-06-01 10:00:49 +08:00
65a75abecb [Fix](Nereids) bitmap type should not be used in comparison predicate (#19807)
When using Nereids, applying a comparison operator to a bitmap type should throw an analysis exception.

For example:
select id from (select BITMAP_EMPTY() as c0 from expr_test) as ref0 where c0 = 1 order by id

Here c0 in subq0 is a bitmap type; this scenario is not supported right now.
2023-05-31 23:09:36 +08:00
6adb3fdf11 [fix](match_phrase) Fix the inconsistent query result for 'match_phrase' after creating index without support_phrase property (#20258)
If the inverted index is created without the support_phrase property, retain the match_phrase condition so that it is filtered by the match function.
2023-05-31 18:09:50 +08:00
c03a19ea23 [improvement](bitmap) Using set to store a small number of elements to improve performance (#19973)
Test on SSB 100g:

select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 4.388s

create materialized view:

create materialized view customer_uv as select lo_suppkey, bitmap_union(to_bitmap(lo_linenumber)) from lineorder group by lo_suppkey;
select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 12.908s

test with the patch, exec time: 5.790s
2023-05-31 16:13:42 +08:00
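The idea named in the title of commit c03a19ea23, keeping a small number of elements in a plain set and switching to a bitmap representation only when the collection grows, can be sketched as follows. This is a Python illustration, not the Doris C++ code; the class name, the threshold of 32, and the use of a Python integer as the bitmask are assumptions chosen for the example.

```python
class SmallSetBitmap:
    """Stores few elements in a set; promotes to a bitmask past a threshold."""

    SET_LIMIT = 32  # hypothetical threshold, not taken from the commit

    def __init__(self):
        self._small = set()  # used while the element count is small
        self._bits = None    # integer bitmask, used after promotion

    @property
    def is_small(self):
        return self._bits is None

    def add(self, value):
        if value < 0:
            raise ValueError("only non-negative integers are supported")
        if self._bits is not None:
            self._bits |= 1 << value
            return
        self._small.add(value)
        if len(self._small) > self.SET_LIMIT:
            self._promote()

    def _promote(self):
        # Move every element from the set into the bitmask representation.
        bits = 0
        for v in self._small:
            bits |= 1 << v
        self._bits = bits
        self._small = None

    def cardinality(self):
        if self._bits is None:
            return len(self._small)
        return bin(self._bits).count("1")

    def __iter__(self):
        if self._bits is None:
            yield from sorted(self._small)
            return
        bits, position = self._bits, 0
        while bits:
            if bits & 1:
                yield position
            bits >>= 1
            position += 1


bitmap = SmallSetBitmap()
for v in [3, 1, 3, 7]:
    bitmap.add(v)
print(bitmap.cardinality(), bitmap.is_small)  # 3 True
```

In the count(distinct lo_linenumber) case above, each group holds only a handful of distinct values, which is the situation where staying in the small representation avoids the overhead of a full bitmap.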
b53c42636e [Fix](Nereids) fold constant result is wrong on functions relative to timezone (#19863) 2023-05-31 15:52:40 +08:00
d93ff5d1ab [fix](pipeline) Enable pipeline explicitly in the plan shape check cases. (#20221)
enable pipeline explicitly in tpcds plan shape check
2023-05-31 14:40:24 +08:00
1f22aa6961 [fix](nereids) like function's nullable property should be PropagateNullable (#20237) 2023-05-31 12:13:38 +08:00
3f91127854 [fix](regression)Update external Brown test case out file. #20232
Update external Brown test case out file to match the new precision.
2023-05-31 09:21:04 +08:00
ff05217a1e [regression](p0) fix test for array_enumerate_uniq (#20231) 2023-05-30 22:14:19 +08:00
b7a69fbf4b [test](regression) add regression test from materialized slot bug (#20207)
The test query includes the conversion of string types to other types and the processing of materialized columns for nested subqueries; it is the regression test for the bug fix (#18783).
2023-05-30 21:23:05 +08:00