1. this pr is used to fix the be core dump when select the invalid array.
2. before the change, we run "select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');" will cause be core dump.
MySQL [example_db]> select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
3. after the change, we run "select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');" will get error message.
MySQL [example_db]> select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');
errCode = 2, detailMessage = No matching function with signature: array_intersect(array<tinyint(4)>, varchar(-1))"
Co-authored-by: hucheng01 <hucheng01@baidu.com>
Support delete from partitioned table without partition specified in [DELETE] stmt.
## Usage
If it is a partitioned table, you can specify a partition.
If not specified, Doris will infer partition from the given conditions.
In two cases, Doris cannot infer the partition from conditions:
1) the conditions do not contain partition columns;
2) The operator of the partition column is `not in`.
When a partition table does not specify the partition,
or the partition cannot be inferred from the conditions,
the session variable `delete_without_partition` needs to be `true`
to make delete statement be applied to all partitions.
## Test case
Test case is added in `regression-test/suites/delete_p0/test_delete_from_partition.groovy`,
user can delete from partitioned table without partition specified now.
There are some issues with the compile flag `-Wno-unused-but-set-variable` for clang.
1. `-Wno-unused-but-set-variable` should be set when building source by clang-15 on Linux. (#13000#13016)
2. On macOS Monterey, Apple Clang 13 may treat it as a unknown warning option and the compilation process may interrupt.
This PR introduces a better way to make this compile flag more portable.
1. Test whether the compiler recognizes this flag.
2. Add this flag if the compiler recognizes it.
When running regression test, we always found that the query return empty result after loading finished,
even if we call "sync" before the query.
This is because for `stream load`, the load task result will be returned immediately after the txn's status changed to VISIBLE,
but before writing the edit log.
So if we do the query right after we got the load task result, it is possible that we can not see the latest loaded data.
Same issue with `insert` operation
Issue Number: close#12574
This pr adds `NewJsonReader` which implements GenericReader interface to support read json format file.
TODO:
1. modify `_scann_eof` later.
2. Rename `NewJsonReader` to `JsonReader` when `JsonReader` is deleted.
The index for external table columns from path is incorrect in new scanner. This is a fix for it.
e.g. In the next query, nation and city columns are from path
```
mysql> select nation, city, count(*) from parquet_two_part group by nation, city;
+--------+------------+----------+
| nation | city | count(*) |
+--------+------------+----------+
| cn | beijing | 1199969 |
| cn | shanghai | 1199771 |
| jp | tokyo | 599715 |
| rus | moscow | 600659 |
| us | chicago | 1199805 |
| us | washington | 1201296 |
+--------+------------+----------+
6 rows in set (0.39 sec)
```
1. add "sync" to avoid some potential meta sync problem when running regression test on multi-node cluster
2. Use /tmp dir as dest dir of outfile test, to avoid "No such file or directory" error.
This PR unified the selection of rollup index and materialized view index into uniform logic, which is called selecting materialized index.
Main steps:
### Find candidate indexes
1. When aggregate is present, it's handled in `SelectMaterializedIndexWithAggregate`. The base index and indexes that could use pre-aggregation should be used. The pre-aggregation status is determined by aggregation function, grouping expression, and pushdown predicates.
2. When aggregate is not on top of scan node, it's handled in `SelectMaterializedIndexWithoutAggregate`. The base index and indexes that have all the key columns could be used.
### Filter and order the candidate indexes
1. filter indexes that contain all the required output scan columns.
2. filter indexes that could match prefix index most.
3. order the result index by row count, column count, and index id.