347cceb530
[Feature](inverted index) push count on index down to scan node ( #22687 )
...
Co-authored-by: airborne12 <airborne12@gmail.com>
2023-09-02 22:24:43 +08:00
20ff53c7ee
[regression-test](fix) fix case bug when define local variable ( #23785 )
2023-09-02 21:59:23 +08:00
eedd24316d
[Feature](CCR) Support MoW for CCR ( #22798 )
2023-09-02 20:40:06 +08:00
d2417c168b
[test](ColdHotSeparation) refresh case ( #23741 )
2023-09-02 14:43:57 +08:00
a6dff2faf0
[Feature](config) allow update multiple be configs in one request ( #23702 )
2023-09-02 14:26:54 +08:00
f1c354e0cf
[improvement](colocate table) forbid changing colocate table's replica allocation ( #23064 )
2023-09-02 13:54:25 +08:00
228f0ac5bb
[Feature](Multi-Catalog) support query doris bitmap column in external jdbc catalog ( #23021 )
2023-09-02 12:46:33 +08:00
68aa4867b0
[fix](map_agg) lost scale information for decimal type ( #23776 )
2023-09-02 08:03:33 +08:00
6630f92878
[Enhancement](Load) stream tvf support json ( #23752 )
...
stream tvf support json
[{"id":1, "name":"ftw", "age":18}]
[{"id":2, "name":"xxx", "age":17}]
[{"id":3, "name":"yyy", "age":19}]
example:
curl -v --location-trusted -u root: -H "sql: insert into test.t1(c1, c2) select id, name from http_stream(\"format\" = \"json\", \"strip_outer_array\" = \"true\", \"read_json_by_line\" = \"true\")" -T /root/json_file.json http://127.0.0.1:8030/api/_http_stream
2023-09-02 01:09:06 +08:00
75e2bc8a25
[function](bitmap) support bitmap_to_base64 and bitmap_from_base64 ( #23759 )
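A minimal round-trip sketch (hedged; `bitmap_from_string` and `bitmap_to_string` are existing helpers used only for illustration):
```sql
-- encode a bitmap to base64 and decode it back; expected output: 1,2,3
select bitmap_to_string(
    bitmap_from_base64(bitmap_to_base64(bitmap_from_string('1,2,3')))
);
```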
2023-09-02 00:58:48 +08:00
a8de805a7a
[fix](Nereids) fix stats inject in or_expansion.groovy ( #23748 )
...
make stats injection run first
2023-09-01 18:31:58 +08:00
e3bbba82cf
[Fix](planner) fix to_date failed in create table as select ( #23613 )
...
Problem:
When using the to_date function in create table as select, it would fail.
Example:
create table test_to_date properties('replication_num' = '1') as select to_date('20230816') as datev2;
Reason:
After release 2.0, datev1 is disabled, but the to_date function signature was not upgraded, so checking the return type of to_date failed.
Solution:
When resolving the function (getFunction), forbid to_date with return type datev1; datetimev1 is also changed to datetimev2 and decimalv2 is changed to decimalv3.
2023-09-01 17:28:40 +08:00
b5232ce0d7
[fix](nereids) NormalizeAggregate may push redundant expr to child project node ( #23700 )
...
NormalizeAggregate may push exprs down to the child project node. We need to make sure there is no redundant expr in the pushed-down expr list. This PR uses a Set to ensure that.
2023-09-01 17:16:10 +08:00
32853a529c
[Bug](cte) fix multi cast data stream source not open expr ( #23740 )
...
fix multi cast data stream source not open expr
2023-09-01 14:57:12 +08:00
eaf2a6a80e
[fix](date) return right date value even if out of the range of date dictionary( #23664 )
...
PR (https://github.com/apache/doris/pull/22360) and PR (https://github.com/apache/doris/pull/22384) optimized the performance of the date type. However, Hive supports dates outside 1970~2038, leading to wrong date values in the TPC-DS benchmark.
How to fix:
1. Increase the dictionary range to 1900 ~ 2038.
2. Dates outside 1900 ~ 2038 are regenerated.
2023-09-01 14:40:20 +08:00
c31cb5fd11
[enhance] use correct default value for show config action ( #19284 )
2023-09-01 11:28:26 +08:00
d96bc2de1a
[enhance](policy) Support changing a table's storage policy if the two policies have the same resource ( #23665 )
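A hedged sketch of the statement this enables (table and policy names are illustrative; both policies are assumed to reference the same resource):
```sql
-- switching from policy_a to policy_b is now allowed when both point at the same resource
ALTER TABLE t1 SET ("storage_policy" = "policy_b");
```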
2023-09-01 11:25:27 +08:00
d6450a3f1c
[Fix](statistics)Fix external table auto analyze bugs ( #23574 )
...
1. Fix the bug where auto analyzing an external table recursively loads the schema cache.
2. Move some functions from the StatisticsAutoAnalyzer class to TableIf, so that external tables and internal tables can implement the logic separately.
3. Disable external catalog auto analyze by default; it can be enabled by adding the catalog property "enable.auto.analyze"="true" (see the sketch below).
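A minimal sketch of re-enabling it for one catalog (the catalog name is illustrative; the property name comes from this PR):
```sql
-- opt an existing external catalog back into auto analyze
ALTER CATALOG hive_catalog SET PROPERTIES ("enable.auto.analyze" = "true");
```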
2023-09-01 10:58:14 +08:00
9a7e8b298a
[Improvement](statistics)Show column stats even when error occurred ( #23703 )
...
Before, show column stats would ignore columns with errors.
In this PR, when the min or max value fails to deserialize, show column stats uses N/A as the min or max value and still shows the rest of the stats (count, null_count, ndv and so on).
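A minimal usage sketch (the table name is illustrative; assumes its stats were collected earlier):
```sql
-- a column whose min/max fails to deserialize now shows N/A instead of being skipped
SHOW COLUMN STATS t1;
```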
2023-09-01 10:57:37 +08:00
52e645abd2
[Feature](Nereids): support cte for update and delete statements of Nereids ( #23384 )
2023-08-31 23:36:27 +08:00
e680d42fe7
[feature](information_schema) add metadata_name_ids for quickly getting catalogs, dbs and tables, and add profiling table in order to be compatible with MySQL ( #22702 )
...
Add information_schema.metadata_name_ids for quickly getting catalogs, dbs and tables.
1. Table structure:
```mysql
mysql> desc internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID | BIGINT | Yes | false | NULL | |
| CATALOG_NAME | VARCHAR(512) | Yes | false | NULL | |
| DATABASE_ID | BIGINT | Yes | false | NULL | |
| DATABASE_NAME | VARCHAR(64) | Yes | false | NULL | |
| TABLE_ID | BIGINT | Yes | false | NULL | |
| TABLE_NAME | VARCHAR(64) | Yes | false | NULL | |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)
mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
CATALOG_ID: 113008
CATALOG_NAME: hive1
DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
TABLE_ID: 114009
TABLE_NAME: dates
1 row in set (0.07 sec)
```
2. When you create/drop a catalog, there is no need to refresh the catalog.
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)
mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)
mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```
3. When you create/drop a table, there is no need to refresh the catalog.
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)
mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)
```
4. You can set a query timeout to prevent queries from taking too long.
```
fe.conf : query_metadata_name_ids_timeout
the time used to obtain all tables in one database
```
5. Add information_schema.profiling in order to be compatible with MySQL.
```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)
mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
2023-08-31 21:22:26 +08:00
3a34ec95af
[FE](function) add date_floor/ceil as FE functions ( #23539 )
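A hedged illustration of the expected results (plain interval arithmetic, mirroring the documented behavior of these functions):
```sql
select date_floor('2023-08-31 12:34:56', interval 1 hour); -- 2023-08-31 12:00:00
select date_ceil('2023-08-31 12:34:56', interval 1 hour);  -- 2023-08-31 13:00:00
```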
2023-08-31 19:26:47 +08:00
e54cd6a35d
[fix](regression) fix case test_outfile_orc_max_file_size by replacing table_export_name #23648
...
fix case test_outfile_orc_max_file_size by replacing table_export_name
2023-08-31 18:51:13 +08:00
f214485733
[fix](regression) try fix regression test no_await ( #23661 )
2023-08-31 16:22:51 +08:00
7379cdc995
[feature](nereids) support subquery in select list ( #23271 )
...
1. add the scalar subquery's output to LogicalApply's output (illustrated below)
2. for IN and EXISTS subqueries, add the mark join slot into LogicalApply's output
3. forbid pushing down alias through join if the project list has any mark join slots
4. move the normalize aggregate rule to the analysis phase
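A hedged illustration of a scalar subquery in the select list (table and column names are made up):
```sql
-- one scalar value per outer row; planned as a LogicalApply whose output now
-- includes the subquery's output slot
select o.id,
       (select max(p.amount) from payments p where p.order_id = o.id) as max_payment
from orders o;
```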
2023-08-31 15:51:32 +08:00
41c5e00071
[fix](planner)fix bug of resolve column ( #23512 )
...
If resolving an inline view column fails, we try to resolve it again by removing the table name. But this is wrong if the table name (which may be the inline view's alias) is the same as some table name inside the inline view. So this PR checks the table name, and only removes it when no table inside the inline view has the same name as the column's table name.
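A hedged sketch of the ambiguous shape this guards against (names are illustrative): the inline view's alias `t` collides with a table referenced inside the view, so the column's table prefix must not be stripped blindly.
```sql
select t.k1
from (select t.k1 from t join t2 on t.k1 = t2.k1) t;
```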
2023-08-31 12:25:26 +08:00
897151fc2b
[fix](Nereids) set operation syntax is not compatible with legacy planner ( #23668 )
...
For example:
```sql
WITH A AS (SELECT * FROM B)
SELECT * FROM C
UNION
SELECT * FROM D
```
The scope of the CTE in Nereids is the first set operand, while the scope of the CTE in the legacy planner is the whole statement.
2023-08-31 11:55:35 +08:00
96c4471b4a
[feature](udf) udf array/map support decimal and update doc ( #23560 )
...
* update
* decimal
* update table name
* remove log
* add log
2023-08-31 07:44:18 +08:00
a1505d25ce
[fix](case) drop storage policy before drop resource ( #23669 )
...
Co-authored-by: stephen <hello-stephen@qq.com>
2023-08-30 21:00:28 +08:00
7f4f39551a
[Bug](materialized-view) fix change base schema when create mv ( #23607 )
...
* fix change base schema when create mv
* fix
* fix
2023-08-30 21:00:12 +08:00
b7404896fa
[improvement](catalog) avoid calling checksum when replaying creating jdbc catalog and fix ranger issue ( #22369 )
...
1. jdbc
Before, in the constructor of the Jdbc catalog, we may call the checksum action of the jdbc driver.
But the download link of the jdbc driver may not be available when replaying, causing a replay error.
This PR changes the logic to avoid calling checksum when replaying the creation of a jdbc catalog.
2. ranger
After this PR, when creating a catalog, it will try to init the access controller to make sure the config is ok.
3. catalog priv check
When creating/dropping/altering a catalog, doris will only use the internal access controller to check catalog-level priv.
2023-08-30 19:24:11 +08:00
05771e8a14
[Enhancement](Load) stream Load using SQL ( #23362 )
...
Use stream load in SQL mode.
For example, given example.csv:
10000,北京
10001,天津
curl -v --location-trusted -u root: -H "sql: insert into test.t1(c1, c2) select c1,c2 from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
curl -v --location-trusted -u root: -H "sql: insert into test.t2(c1, c2, c3) select c1,c2, 'aaa' from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
curl -v --location-trusted -u root: -H "sql: insert into test.t3(c1, c2) select c1, count(1) from stream(\"format\" = \"CSV\", \"column_separator\" = \",\") group by c1" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql
2023-08-30 19:02:48 +08:00
25b8831afd
[fix](Outfile) fix core dump when export data to orc file format using outfile ( #23586 )
...
* fix
* add test
2023-08-30 19:01:44 +08:00
e1743b70f2
[enhancement](nereids)remove useless cast for floatlike type ( #23621 )
...
Convert cast(c1 AS double) > 2.0 to c1 >= 2 (where c1 is an integer-like type).
2023-08-30 19:00:16 +08:00
ade598e043
[feature](Nereids): eliminate distinct for max/min/any_value ( #23428 )
...
eliminate distinct for max/min/any_value function
```
max(distinct value) = max(value)
```
2023-08-30 17:23:10 +08:00
a136836770
[feature](Nereids) add two functions: char and convert ( #23104 )
...
add [char](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/string-functions/char/?_highlight=char) func
```
mysql> select char(68, 111, 114, 105, 115);
+--------------------------------------+
| char('utf8', 68, 111, 114, 105, 115) |
+--------------------------------------+
| Doris |
+--------------------------------------+
```
convert func
```
MySQL root@127.0.0.1:(none)> select convert(1 using gbk);
+-------------------+
| convert(1, 'gbk') |
+-------------------+
| 1 |
+-------------------+
```
2023-08-30 17:09:06 +08:00
d326cb0c99
[fix](planner) array constructor do type coercion with decimal in wrong way ( #23630 )
...
An array constructor with decimal-type and integer-type parameters should return array<decimal>,
but the legacy planner returns array<double>.
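A hedged illustration of the expected coercion:
```sql
-- mixing integer and decimal elements should yield array<decimal>, not array<double>
select array(1, 2.5, 3.25);
```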
2023-08-30 11:18:31 +08:00
e02747e976
[feature](Nereids) support struct type ( #23597 )
...
1. support the struct data type
2. add array / map / struct literal syntax (see the sketch below)
3. fix array union / intersect / except type coercion
4. fix explicit cast data type check for array
5. fix bound function type coercion
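A hedged sketch using the documented constructor functions (values are illustrative; the exact literal syntax added by this PR is not reproduced here):
```sql
select struct(1, 'a'),
       named_struct('id', 1, 'name', 'doris'),
       array(1, 2, 3),
       map('k1', 'v1', 'k2', 'v2');
```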
2023-08-29 20:41:24 +08:00
103fa4eb55
[feature](Export) support export with nereids ( #23319 )
2023-08-29 19:36:19 +08:00
94a8fa6bc9
[bug](function) fix explode_number function return wrong rows ( #23603 )
...
Before, the explode_number function result was random with a const value:
because _cur_size was reset, it could not insert values into the column.
2023-08-29 19:02:49 +08:00
f17241386e
[fix](regression) Fix test no_await #23599
2023-08-29 18:58:13 +08:00
cc1509ba11
[fix](view) The parameter positions of the timestamp diff function are reversed when converted to SQL ( #23601 )
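Argument order matters for this function, which is why reversing it in a view's generated SQL changes results; a hedged illustration:
```sql
select timestampdiff(day, '2023-01-01', '2023-09-01'); -- 243
select timestampdiff(day, '2023-09-01', '2023-01-01'); -- -243
```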
2023-08-29 18:30:16 +08:00
84006dd8c7
[Fix](Full compaction) Fix full compaction regression test ( #23487 )
2023-08-29 18:27:19 +08:00
8932a6fae7
[feature](Nereids) support Literal collate syntax ( #23600 )
...
Support the following SQL grammar, just for compatibility:
```sql
select table_name
from information_schema.tables
where table_schema collate utf8_general_ci = 'information_schema'
and table_name collate utf8_general_ci = 'parameters';
```
2023-08-29 17:01:13 +08:00
f7a3d2778a
[FIX](array) update array olapconvertor and support arrays nesting other complex types ( #23489 )
...
* update array olapconvertor and support arrays nesting other complex types
* update for inverted index
2023-08-29 16:18:11 +08:00
5fedb3285f
[fix](regression) add sync for loading data to avoid case failure in multi-fe test env ( #23604 )
2023-08-29 15:29:51 +08:00
598dc6960a
[fix](Nereids) make agg output unchanged after normalized ( #23499 )
...
The normalizedAgg rule can change the output of agg.
For example:
```
select c1 as c, c1 from t having c1 > 0
```
The normalizedAgg rule will produce a plan whose output is c, which can cause the having filter to error.
Therefore, the output exprIds should be unchanged after normalization.
2023-08-29 15:01:26 +08:00
4c00b1760b
[feature](partial update) Support partial update for broker load ( #22970 )
2023-08-29 14:41:01 +08:00
7dcde4d529
[bug](decimal) Use max value as result if overflow ( #23602 )
...
* [bug](decimal) Use max value as result if overflow
* update
2023-08-29 13:26:25 +08:00
93db9b455a
[test](fix case) fix sql user conflict in test case ( #23583 )
2023-08-29 11:33:49 +08:00