Commit Graph

1620 Commits

Author SHA1 Message Date
Pxl
8e4c0d1e81 [Bug](materialized-view) fix divide double can not match mv (#23504)
* fix divide double can not match mv

* fix

* fix
2023-08-28 18:01:08 +08:00
Pxl
3049533e63 [Bug](materialized-view) fix core dump on create materialized view when different mv column have same reference base column (#23425)
* Remove redundant predicates on scan node

update

fix core dump on create materialized view when different mv column have same reference base column

Revert "update"

This reverts commit d9ef8dca123b281dc8f1c936ae5130267dff2964.

Revert "Remove redundant predicates on scan node"

This reverts commit f24931758163f59bfc47ee10509634ca97358676.

* update

* fix

* update

* update
2023-08-28 14:40:51 +08:00
c05319b8eb [fix](agg) incorrect result of bitmap_agg and bitmap_union (#23558) 2023-08-28 14:22:19 +08:00
f7d2c1faf6 [feature](Nereids) support select key encryptKey (#23257)
Add select key

```
- CREATE ENCRYPTKEY key_name AS "key_string"
- select key my_key
+-----------------------------+
| encryptKeyRef('', 'my_key') |
+-----------------------------+
| ABCD123456789               |
+-----------------------------+
```
2023-08-28 14:07:26 +08:00
e84989fb6d [feature](Nereids) support map type (#23493) 2023-08-28 11:31:44 +08:00
d19dcd6bc1 [improve](jdbc catalog) support sqlserver uniqueidentifier data type (#23297) 2023-08-28 10:30:10 +08:00
a5761a25c5 [feature](move-memtable)[7/7] add regression tests (#23515)
Co-authored-by: laihui <1353307710@qq.com>
2023-08-26 17:52:10 +08:00
40be6a0b05 [fix](hive) do not split compress data file and support lz4/snappy block codec (#23245)
1. Do not split compressed data files
Some data files in hive are compressed with gzip, deflate, etc.
Files of these kinds cannot be split.

2. Support lz4 block codec
For the hive scan node, use the lz4 block codec instead of the lz4 frame codec

4. Support snappy block codec
For hadoop snappy

5. Optimize the `count(*)` query of csv file
For queries like `select count(*) from tbl`, only the lines need to be split, not the columns.
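
The `count(*)` idea can be sketched as follows. This is a minimal illustration, not the reader's actual code: the function names are hypothetical, and it assumes a single-byte line delimiter with no quoted newlines inside fields.

```python
def count_rows(data: bytes, line_delimiter: bytes = b"\n") -> int:
    """Count CSV rows by scanning for line delimiters only; columns are never split."""
    if not data:
        return 0
    rows = data.count(line_delimiter)
    # A final line without a trailing delimiter is still a row.
    if not data.endswith(line_delimiter):
        rows += 1
    return rows


def count_rows_by_splitting(data: bytes, column_separator: bytes = b",") -> int:
    """Baseline for comparison: split every line into columns, then count."""
    return len([line.split(column_separator) for line in data.splitlines()])


sample = b"1,ftw,18\n3,yy,19\n4,xx,21\n"
```

Both functions return the same count for `sample`, but the first one does no per-column work.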

Need to pick to branch-2.0 after this PR: #22304
2023-08-26 12:59:05 +08:00
f32efe5758 [Fix](Outfile) Fix that it does not report error when export table to S3 with an incorrect ak/sk/bucket (#23441)
Problem:
A result is returned even when a wrong ak/sk/bucket name is used, for example:
```sql
mysql> select * from demo.student
    -> into outfile "s3://xxxx/exp_"
    -> format as csv
    -> properties(
    ->   "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com",
    ->   "s3.region" = "ap-beijing",
    ->   "s3.access_key"= "xxx",
    ->   "s3.secret_key" = "yyyy"
    -> );
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                                                |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
|          1 |         3 |       26 | s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)
```

The reason is that we did not catch the error returned in the `close()` phase.
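
A minimal sketch of the failure mode, assuming a writer that buffers locally and only contacts the store on close. `BufferedRemoteWriter`, `UploadError` and `export_rows` are hypothetical names for illustration, not the classes touched by this PR.

```python
class UploadError(Exception):
    """Raised when the (hypothetical) remote store rejects the upload."""


class BufferedRemoteWriter:
    """Hypothetical writer that buffers rows and only uploads in close()."""

    def __init__(self, credentials_ok: bool):
        self._credentials_ok = credentials_ok
        self._buffer = []

    def write(self, row: str) -> None:
        # Nothing is sent yet, so bad credentials cannot be noticed here.
        self._buffer.append(row)

    def close(self) -> None:
        # The upload happens here; this is the first point that can fail.
        if not self._credentials_ok:
            raise UploadError("access denied: check ak/sk/bucket")


def export_rows(rows, writer) -> str:
    for row in rows:
        writer.write(row)
    try:
        writer.close()
    except UploadError as err:
        # Surface the close() failure instead of reporting success.
        return f"ERROR: {err}"
    return f"OK: {len(rows)} rows"
```

If the result of `close()` is ignored, every `write()` succeeds and the export looks fine even though nothing reached the bucket.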
2023-08-26 00:19:30 +08:00
8af1e7f27f [Fix](orc-reader) Fix incorrect result if null partition fields in orc file. (#23369)
Fix incorrect result if null partition fields in orc file. 

### Root Cause
Theoretically, the underlying files of a hive partition table should not contain partition fields. But we found that in some user scenarios the partition fields exist in the underlying orc/parquet files with null values. As a result, predicates pushed down on the partition fields are evaluated against those null values and filter incorrectly.

### Solution
We handle this case by reading only the non-partition fields. The parquet reader already works this way; this PR does the same for the orc reader.
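
A minimal sketch of the approach, assuming partition values come from the partition metadata. The helper names and the sample row are made up for illustration.

```python
def plan_read(table_columns, partition_values):
    """Return the columns to read from the data file: everything except partition columns."""
    return [c for c in table_columns if c not in partition_values]


def read_row(file_row, table_columns, partition_values):
    """Build an output row; partition columns come from the partition, never from the file."""
    out = {}
    for col in table_columns:
        if col in partition_values:
            out[col] = partition_values[col]
        else:
            out[col] = file_row[col]
    return out


# The file wrongly carries the partition column `dt`, stored as null.
file_row = {"id": 1, "name": "ftw", "dt": None}
table_columns = ["id", "name", "dt"]
partition_values = {"dt": "2023-08-26"}
row = read_row(file_row, table_columns, partition_values)
```

Because `dt` is never read from the file, a filter such as `dt = '2023-08-26'` sees the partition value rather than the stored null.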
2023-08-26 00:13:11 +08:00
00826185c1 [fix](tvf view)Support Table valued function view for nereids (#23317)
Nereids doesn't support views based on table valued functions, because a tvf view doesn't contain the proper qualifier (catalog, db and table name). This PR adds that support.

Also fix a bug where the output exprs in the explain output of Nereids table valued functions were incorrect.
2023-08-25 21:23:16 +08:00
29273771f7 [Fix](multi-catalog) Fix hive incorrect result by disable string dict filter if exprs contain null expr. (#23361)
Issue Number: close #21960

Fix incorrect hive results by disabling the string dict filter if the exprs contain a null expr.
2023-08-25 21:16:43 +08:00
e1367d509f [Fix](Full compaction) Fix full compaction by table id regression test #23496 2023-08-25 18:07:06 +08:00
1312c12236 Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)" (#23462)
* Revert "[fix](testcase) fix test case failure of insert null value into not null column (#20963)"

This reverts commit 55a6649da962fb170ddb40fea8ef26bdc552a51a.

Manual Revert "fix in strict mode, return error for insert if datatype convert fails (#20378)"

This manually reverts commit 1b94b6368f5e871c9a0fe53dd7c64409079a4c9d

* fix case failure
2023-08-25 16:47:14 +08:00
6d4f06689f [fix](Nereids) avoid Stats NaN (#23445)
tpcds 61 plan changed:
improved from 1.75 sec to 1.67 sec
2023-08-25 16:27:34 +08:00
0ccb7262a7 [feature](Nereids) add password func (#23244)
add password function
```
select password("123");
+-------------------------------------------+
| password('123')                           |
+-------------------------------------------+
| *23AE809DDACAF96AF0FD78ED04B6A265E05AA257 |
+-------------------------------------------+
```
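
The output above has the shape of MySQL's legacy `PASSWORD()` result (`*` followed by 40 hex digits). A minimal sketch, assuming the function follows that scheme (SHA-1 applied twice); that is an assumption, not something this commit states, and `mysql_style_password` is a hypothetical name.

```python
import hashlib


def mysql_style_password(plain: str) -> str:
    """Return '*' plus the uppercase hex of SHA1(SHA1(plain))."""
    first = hashlib.sha1(plain.encode("utf-8")).digest()
    return "*" + hashlib.sha1(first).hexdigest().upper()


hashed = mysql_style_password("123")
```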
2023-08-25 14:04:49 +08:00
8ef6b4d996 [fix](json) fix json int128 overflow (#22917)
* support int128 in jsonb

* fix jsonb int128 write

* fix jsonb to json int128

* fix json functions for int128

* add nereids function jsonb_extract_largeint

* add testcase for json int128

* change docs for json int128

* add nereids function jsonb_extract_largeint

* clang format

* fix check style

* using int128_t = __int128_t for all int128

* use fmt::format_to instead of snprintf digit by digit for int128

* clang format

* delete useless check

* add warn log

* clang format
2023-08-25 11:40:30 +08:00
372f83df5c [opt](Nereids) remove between expression to simplify planner (#23421) 2023-08-25 11:28:12 +08:00
37b90021b7 [fix](planner)literal expr should do nothing in substituteImpl() method (#23438)
Substituting a literal expr is pointless and wrong. This PR keeps literal exprs unchanged during the substitute process.
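
A minimal sketch of the rule on a toy expression tree. The classes and the `substitute` function are hypothetical stand-ins, not the planner's `substituteImpl()`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class SlotRef:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, SlotRef, BinaryOp]


def substitute(expr, mapping):
    """Replace sub-expressions found in `mapping`, leaving literals untouched."""
    if isinstance(expr, Literal):
        # Literals are returned as-is, even if an equal literal is a mapping key.
        return expr
    if expr in mapping:
        return mapping[expr]
    if isinstance(expr, BinaryOp):
        return BinaryOp(expr.op,
                        substitute(expr.left, mapping),
                        substitute(expr.right, mapping))
    return expr


mapping = {SlotRef("a"): SlotRef("mv_a"), Literal(1): SlotRef("oops")}
result = substitute(BinaryOp("+", SlotRef("a"), Literal(1)), mapping)
```

Here `a` is rewritten to `mv_a`, while the literal `1` survives even though the mapping happens to contain it.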
2023-08-25 11:21:35 +08:00
18094511e7 [fix](Outfile/Nereids) fix that csv_with_names and csv_with_names_and_types file format could not be exported on nereids (#23387)
This problem is caused by #21197

Fixed an issue where the `csv_with_names` and `csv_with_names_and_types` file formats could not be exported with the nereids optimizer when using `select...into outfile`.
2023-08-25 11:12:04 +08:00
3786ffec51 [opt](Nereids) add some array functions (#23324)
1. rename TVFProperties to Properties
2. add generating function explode and explode_outer
3. fix concat_ws could not apply on array
4. check tokenize second argument format on FE
5. add test case for concat_ws, tokenize, explode, explode_outer and split_by_string
2023-08-25 11:01:50 +08:00
7cfb3cc0aa [fix](functions) fix function substitute for datetimeV1/V2 (#23344)
* fix

* function fe
2023-08-25 09:59:38 +08:00
bc3d397759 [fix](case) update .out file, relate to #23272 (#23455)
Co-authored-by: stephen <hello-stephen@qq.com>
2023-08-25 09:15:27 +08:00
ceb931c513 [regression-test](hdfs_tvf)append regression test that hdfs_tvf read compression file (#23454) 2023-08-25 09:00:21 +08:00
441a9fff6d [fix](planner) fix now function param type error (#23446) 2023-08-25 00:12:21 +08:00
6a4976921d [fix](auth)Disable column auth temporarily (#23295)
- add config `enable_col_auth` to temporarily disable column permissions (because the old/new planner has a bug when selecting from a view)
- Restore the old optimizer to the previous authentication method
- Support authentication in the new optimizer (legacy issue: when querying a view, the permissions of the base table are authenticated. The view's own permissions should be authenticated instead; this will be handled after the new optimizer is improved)
- fix: show grants for non-existent users
- fix: role:`admin` can not grant/revoke to/from user
2023-08-24 23:37:06 +08:00
f6c5c8f7b5 [Fix](Nereids) fix that select...from tablets() are invalidated when there exists predicates (#23365)
Problem: `select...from tablets()` is invalidated when predicates exist, such as:
```sql
-- All of the data:
mysql> select * from student3;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    1 | ftw  |   18 |
|    3 | yy   |   19 |
|    4 | xx   |   21 |
|    2 | cyx  |   20 |
+------+------+------+

-- When we specify a tablet to read:
mysql> select * from student3 tablet(131131);
+------+------+------+
| id   | name | age  |
+------+------+------+
|    1 | ftw  |   18 |
|    3 | yy   |   19 |
+------+------+------+

-- However, when predicates exist, the `tablet(131131)` is invalidated
mysql> select * from student3 tablet(131131) where id > 1;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    4 | xx   |   21 |
|    3 | yy   |   19 |
|    2 | cyx  |   20 |
+------+------+------+
```

After the fix, we get the expected data:
```sql
mysql> select * from student3 tablet(131131) where id > 1;
+------+------+------+
| id   | name | age  |
+------+------+------+
|    3 | yy   |   19 |
+------+------+------+
```
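
The expected behaviour can be sketched as "restrict to the selected tablets first, then apply the predicate". Tablet 131131 and its rows match the example above; the second tablet id and the `scan` helper are made up for illustration.

```python
# Rows grouped by tablet id. 131132 is a hypothetical id for the remaining rows.
TABLETS = {
    131131: [(1, "ftw", 18), (3, "yy", 19)],
    131132: [(4, "xx", 21), (2, "cyx", 20)],
}


def scan(tablets, selected_tablet_ids=None, predicate=None):
    """Read only the selected tablets (all if None), then filter rows by the predicate."""
    tablet_ids = list(tablets) if selected_tablet_ids is None else selected_tablet_ids
    rows = [row for tid in tablet_ids for row in tablets[tid]]
    if predicate is not None:
        rows = [row for row in rows if predicate(row)]
    return rows


fixed = scan(TABLETS, [131131], lambda row: row[0] > 1)
```

The bug corresponds to dropping `selected_tablet_ids` whenever a predicate is present, which returns rows from every tablet.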
2023-08-24 23:29:59 +08:00
320eda78e6 [fix](nereids) remove useless cast in in-predicate (#23171)
Consider the sql "select * from test_simplify_in_predicate_t where a in ('1992-01-31', '1992-02-01', '1992-02-02', '1992-02-03', '1992-02-04');"
before:

```
|   0:VOlapScanNode                                                                                                                                                                                      |
|      TABLE: default_cluster:bugfix.test_simplify_in_predicate_t(test_simplify_in_predicate_t), PREAGGREGATION: OFF. Reason: No aggregate on scan.                                                      |
|      PREDICATES: CAST(a[#0] AS DATETIMEV2(0)) IN ('1992-01-31 00:00:00', '1992-02-01 00:00:00', '1992-02-02 00:00:00', '1992-02-03 00:00:00', '1992-02-04 00:00:00') AND __DORIS_DELETE_SIGN__[#1] = 0 |
|      partitions=0/1, tablets=0/0, tabletList=                                                                                                                                                          |
|      cardinality=1, avgRowSize=0.0, numNodes=1                                                                                                                                                         |
|      pushAggOp=NONE                                                                                                                                                                                    |
|      projections: a[#0]                                                                                                                                                                                |
|      project output tuple id: 1                                                                                                                                                                        |
|      tuple ids: 0  
```
after:

```
|   0:VOlapScanNode                                                                                                                                 |
|      TABLE: default_cluster:bugfix.test_simplify_in_predicate_t(test_simplify_in_predicate_t), PREAGGREGATION: OFF. Reason: No aggregate on scan. |
|      PREDICATES: a[#0] IN ('1992-01-31', '1992-02-01', '1992-02-02', '1992-02-03', '1992-02-04') AND __DORIS_DELETE_SIGN__[#1] = 0                |
|      partitions=0/1, tablets=0/0, tabletList=                                                                                                     |
|      cardinality=1, avgRowSize=0.0, numNodes=1                                                                                                    |
|      pushAggOp=NONE                                                                                                                               |
|      projections: a[#0]                                                                                                                           |
|      project output tuple id: 1                                                                                                                   |
|      tuple ids: 0  

```
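
One way to read the before/after plans: when every datetime literal in the IN list falls on midnight, the literals can be narrowed to dates and the cast on the column dropped. A minimal sketch under that assumption; `simplify_in_literals` is a hypothetical helper, not the Nereids rule itself.

```python
from datetime import datetime, time


def simplify_in_literals(datetime_literals):
    """Return date strings if every literal is at midnight, else None (cast must stay)."""
    parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in datetime_literals]
    if all(p.time() == time(0, 0) for p in parsed):
        return [p.date().isoformat() for p in parsed]
    return None


simplified = simplify_in_literals(["1992-01-31 00:00:00", "1992-02-01 00:00:00"])
```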
2023-08-24 18:14:43 +08:00
6c5072ffc5 [FIX](array-func) fix array index func with decimal (#23399)
fix array index func with decimal
In the old analyzer, a sql using array_position or array_contains with decimal arguments may lose precision, which makes the result wrong.
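
A minimal sketch of how precision loss changes the answer, using Python's `Decimal` and `float` as stand-ins for decimal and double. The local `array_position` helper is hypothetical and assumes a 1-based result with 0 for "not found".

```python
from decimal import Decimal


def array_position(values, target):
    """Return the 1-based position of the first match, or 0 if absent."""
    for i, v in enumerate(values, start=1):
        if v == target:
            return i
    return 0


values = [Decimal("1.00000000000000000001"), Decimal("1")]
target = Decimal("1")

exact = array_position(values, target)
# Converting to double collapses both elements to 1.0, so the first one matches.
lossy = array_position([float(v) for v in values], float(target))
```

With exact decimals the match is the second element; after conversion to double the first element matches instead.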
2023-08-24 17:58:20 +08:00
687c676160 [FIX](map)fix column map for offset next_array_item_rowid order (#23250)
* fix column map for offset next_array_item_rowid order

* add regression test
2023-08-24 10:57:40 +08:00
448b7755c6 [feature](jdbc catalog) support doris jdbc catalog array type (#23056) 2023-08-23 21:17:16 +08:00
daa4db097e [fix](Nereids) array_difference and array_position get wrong result (#23331)
1. change array_difference signature to let it return same type as arg
2. do not change precision when signature not use wildcard type
2023-08-23 20:38:09 +08:00
1b7d692d72 [fix](planner & nereids) convert to double if div decimal overflow (#23272) 2023-08-23 20:10:53 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
8140fc737e [Fix](inverted index) fix bug when match condition in hash join (#23105)
* [Fix](inverted index) fix bug when match condition in hash join
2023-08-23 17:48:31 +08:00
7f1857b4e7 [fix](regression-test) fix unstable case load_p0/insert/test_insert.groovy (#23326) 2023-08-23 16:22:49 +08:00
22e373a799 [feature](vector-search) add 4 distance functions to support vector search (#23129) 2023-08-23 15:51:15 +08:00
2dda44d7b5 [fix](csv-reader)fix bug of multi-char delimiter in csv reader
Fix a bug in how csv_reader parses a line to get its columns.
2023-08-23 15:19:13 +08:00
c7b9eb5f9c [enhancement](bitmap)support bitmap type for non-key column in unique table (#23228) 2023-08-23 14:21:22 +08:00
527293aa41 [refactor](dynamic table) remove dynamic table (#23298) 2023-08-23 14:15:14 +08:00
14296ee87f [fix](window_function) wrong order by range (#23346) 2023-08-23 11:23:00 +08:00
78c6b115c3 [fix](planner)avg function need support large int param (#23254)
* [fix](planner)avg function need support large int param
2023-08-23 10:05:08 +08:00
a7675243d9 [fix](jdbc catalog) fix adaptation to Oracle special character / table names (#23080)
The changes of this PR for JdbcOracleClient are as follows:

#### bug fixes:
  1. Fix the problem that, during Schema synchronization, the Columns of a table whose name contains `/` characters get confused when another table has a similar name
  2. Fix the NPE problem of metadata synchronization after enabling lower_case_table_names configuration

#### improvement:
  1. Modify the method of synchronizing Oracle User to Doris Database mapping, use `metadata.getSchemas` instead of `SELECT DISTINCT OWNER FROM all_tables`
  2. When synchronizing metadata, change `null` at the catalog level to `conn.getcatalog`
2023-08-22 15:25:42 +08:00
da2eb69eba [test](Nereids) add array scalar function test cases (#23303) 2023-08-22 15:05:28 +08:00
9d2e23b1aa [fix](parquet) A row of complex type may be stored across more pages (#23277)
A row of a complex type may be stored across two (or more) pages. The parameter `align_rows` indicates whether the reader should read the remaining values of the last row in the previous page.
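
A minimal sketch of why a row can continue into the next page, assuming Parquet-style repetition levels where level 0 starts a new row. The page layout and `assemble_rows` are simplified illustrations, not the reader's code.

```python
def assemble_rows(pages):
    """Rebuild rows from pages of (repetition_level, value) pairs.

    Level 0 starts a new row; any higher level continues the current row,
    even when the value sits at the start of the next page.
    """
    rows = []
    for page in pages:
        for rep_level, value in page:
            if rep_level == 0 or not rows:
                rows.append([value])
            else:
                rows[-1].append(value)
    return rows


pages = [
    [(0, "a"), (1, "b"), (0, "c")],  # the row starting at "c" is unfinished
    [(1, "d"), (1, "e"), (0, "f")],  # "d" and "e" belong to that same row
]
rows = assemble_rows(pages)
```

A reader that treats every page start as a row start would split the `c, d, e` row in two.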
2023-08-22 14:47:10 +08:00
6c8af92175 [fix](nereids)Support select catalog.db.table.column from xxx for nereids planner. #23221
Nereids doesn't support `select table.* from table`; this PR fixes that.
Support three-layer qualifiers (catalog.database.table).
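
A minimal sketch of qualifier matching, assuming a reference may spell out any trailing part of `catalog.database.table`. `resolve_column` and the sample names are hypothetical.

```python
def resolve_column(reference, table_qualifier, columns):
    """Resolve 'a.b.c.col' or 'c.*' against one table; return column names or None."""
    *qualifier, column = reference.split(".")
    if len(qualifier) > len(table_qualifier):
        return None
    # The written qualifier must match the tail of the table's full qualifier.
    if qualifier and tuple(qualifier) != tuple(table_qualifier[-len(qualifier):]):
        return None
    if column == "*":
        return list(columns)
    return [column] if column in columns else None


qualifier = ("internal", "demo", "student")
columns = ["id", "name", "age"]
```

With these values, `internal.demo.student.id`, `student.*` and a bare `name` all resolve, while `other.student.id` does not.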
2023-08-22 13:58:25 +08:00
5ff7b57fc1 [fix](parquet) parquet reader confuses logical/physical/slot id of columns (#23198)
`ParquetReader` confuses logical/physical/slot id of columns. Reading only scalar types works correctly, but when reading complex types, `RowGroup` and `PageIndex` get the wrong statistics. Therefore, if the query contains complex types and pushed-down predicates, the result set is likely to be incorrect.
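
A minimal sketch of why the ids diverge once a complex type appears: statistics are kept per leaf (physical) column, so a logical column index stops lining up with the physical one. The schema and `leaf_columns` are made up for illustration.

```python
def leaf_columns(schema):
    """Flatten a nested schema (name -> type string or nested dict) into leaf paths."""
    leaves = []

    def walk(prefix, node):
        if isinstance(node, dict):
            for name, child in node.items():
                walk(prefix + [name], child)
        else:
            leaves.append(".".join(prefix))

    walk([], schema)
    return leaves


schema = {"id": "int", "props": {"key": "string", "value": "int"}, "score": "double"}
logical = list(schema)           # top-level columns as the table sees them
physical = leaf_columns(schema)  # leaf columns as stored in the file
```

Here `score` is logical column 2 but physical column 3, so looking up its statistics by the logical index would read the stats of `props.value`.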
2023-08-22 13:35:29 +08:00
51db11ed0b [improve](jdbc catalog) Add a variable to accommodate the final keyword in ClickHouse Jdbc Catalog queries (#23282) 2023-08-22 12:13:36 +08:00
b471862dba [Fix](regression-test) fix es regression test (#23160) 2023-08-22 11:52:37 +08:00
5d9678700c [feature](Nereids) support select tablets with nereids optimizer (#23164) 2023-08-22 10:14:27 +08:00