Support deleting from a partitioned table without a partition specified in the `DELETE` statement.
## Usage
If it is a partitioned table, you can specify a partition.
If not specified, Doris will infer the partition from the given conditions.
There are two cases in which Doris cannot infer the partition from the conditions:
1) the conditions do not contain partition columns;
2) the operator of the partition column is `not in`.
When a partitioned table does not have a partition specified,
or the partition cannot be inferred from the conditions,
the session variable `delete_without_partition` needs to be `true`
so that the delete statement is applied to all partitions.
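A minimal sketch of this usage (table, partition, and column names are hypothetical):
```
-- Delete from an explicitly specified partition.
DELETE FROM sales PARTITION p2022 WHERE amount < 0;

-- No partition specified and none can be inferred from the condition
-- (amount is not a partition column), so enable the session variable
-- to apply the delete to all partitions.
SET delete_without_partition = true;
DELETE FROM sales WHERE amount < 0;
```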
## Test case
A test case is added in `regression-test/suites/delete_p0/test_delete_from_partition.groovy`;
users can now delete from a partitioned table without a partition specified.
When running regression tests, we sometimes found that the query returned an empty result after loading finished,
even if we called "sync" before the query.
This is because for `stream load`, the load task result is returned immediately after the txn's status changes to VISIBLE,
but before the edit log is written.
So if we run the query right after getting the load task result, it is possible that we cannot see the latest loaded data.
The `insert` operation has the same issue.
Issue Number: close #12574
This PR adds `NewJsonReader`, which implements the `GenericReader` interface to support reading JSON format files.
TODO:
1. Modify `_scann_eof` later.
2. Rename `NewJsonReader` to `JsonReader` when the old `JsonReader` is deleted.
The index for external table columns parsed from the file path is incorrect in the new scanner. This is a fix for it.
E.g., in the following query, the `nation` and `city` columns come from the path:
```
mysql> select nation, city, count(*) from parquet_two_part group by nation, city;
+--------+------------+----------+
| nation | city       | count(*) |
+--------+------------+----------+
| cn     | beijing    |  1199969 |
| cn     | shanghai   |  1199771 |
| jp     | tokyo      |   599715 |
| rus    | moscow     |   600659 |
| us     | chicago    |  1199805 |
| us     | washington |  1201296 |
+--------+------------+----------+
6 rows in set (0.39 sec)
```
This PR unifies the selection of the rollup index and the materialized view index into one uniform piece of logic, called materialized index selection.
Main steps:
### Find candidate indexes
1. When an aggregate is present on top of the scan node, it is handled in `SelectMaterializedIndexWithAggregate`. The base index and the indexes that could use pre-aggregation are candidates. The pre-aggregation status is determined by the aggregate functions, grouping expressions, and pushed-down predicates.
2. When no aggregate is on top of the scan node, it is handled in `SelectMaterializedIndexWithoutAggregate`. The base index and the indexes that contain all the key columns are candidates.
### Filter and order the candidate indexes
1. Keep the indexes that contain all the required output scan columns.
2. Keep the indexes that best match the prefix index.
3. Order the resulting indexes by row count, column count, and index id.
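For illustration only (table, column, and view names are hypothetical), a single-table materialized view and a query that the unified logic could serve with pre-aggregation:
```
CREATE MATERIALIZED VIEW mv_sales_by_city AS
SELECT city, SUM(amount) FROM sales GROUP BY city;

-- Aggregate on top of the scan node: SelectMaterializedIndexWithAggregate
-- takes the base index and mv_sales_by_city as candidates; the
-- pre-aggregation status depends on the aggregate function, grouping
-- expression, and pushed-down predicates.
SELECT city, SUM(amount) FROM sales WHERE city = 'beijing' GROUP BY city;
```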
Currently, we do not support type derivation for array functions' arguments,
so the cases below return wrong values or even cause a BE core dump.
```
mysql> select array_union([1],[10000000]);
+----------------------------------------+
| array_union(ARRAY(1), ARRAY(10000000)) |
+----------------------------------------+
| [1, -128]                              |
+----------------------------------------+
1 row in set (0.03 sec)

mysql> select array_union([NULL],[1]);
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason

mysql> select array_union([],[1]);
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
```
This commit makes a small fix to derive the argument types of the array functions:
1. For a NULL type in the arguments, cast the NULL type to the boolean type, because the NULL type should not be seen in the BE.
2. For different types in the arguments, cast all arguments to their compatible type.
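With the fix, the same cases are expected to behave as sketched below (expected results, not verified output):
```
-- Arguments are cast to a compatible wider integer type first, so the
-- expected result is [1, 10000000] instead of the overflowed [1, -128].
select array_union([1],[10000000]);

-- The NULL type is cast to boolean and then to the compatible type,
-- so the BE no longer crashes.
select array_union([NULL],[1]);
```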
Users do not need to set `create_table_timeout`: it is a DDL command, and when a timeout occurs,
users will set a larger timeout and retry.
Stream load 2PC is used by default in the Flink connector, so we should not disable it via a config; the config
item is useless.
How to reproduce:
1. Create a table with replica allocation, e.g.:
```
"replication_allocation"="tag.location.group_01:1, tag.location.group_02:1, tag.location.group_03:1"
```
2. Back up this table.
3. Restore this table with a specific replication allocation, e.g.: `"replication_allocation" = "tag.location.default: 3"`
4. After the restore, executing `show create table xxx`, you will see that the `replication_allocation` is still:
```
"replication_allocation"="tag.location.group_01:1, tag.location.group_02:1, tag.location.group_03:1"
```
This is not what we expected.
5. But if you execute `show partitions from xxx`, the replication allocation of each partition is what we expected:
```
"replication_allocation" = "tag.location.default: 3"
```
This is because when doing the restore job, we forgot to set the "default" replica allocation property of the table,
and the result of `show create table` comes from the "default" replica allocation property, not from the real replica property of each partition.
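For reference, a sketch of the restore in step 3 (repository, snapshot label, and timestamp are hypothetical):
```
RESTORE SNAPSHOT example_db.snapshot_label
FROM example_repo
ON (xxx)
PROPERTIES
(
    "backup_timestamp" = "2022-10-01-12-00-00",
    "replication_allocation" = "tag.location.default: 3"
);
-- After this fix, the table's "default" replica allocation property
-- should also be set to tag.location.default: 3.
```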
Some errors occur when building the FE with an arm64 JDK on M1 because the dependencies protoc and grpc-java do not support M1.
#13563 modified `build.sh` to fix this issue by adding `-Dos.arch=x86_64` to the build command.
However, if someone executes `mvn clean package -DskipTests=true` under the `fe` folder, the errors occur again.
This PR introduces a better way to fix them.
1. The field and line delimiters were missing.
2. When querying an external table with text (CSV) format, we should pass the column position map to the BE;
otherwise the column order is wrong.
TODO:
1. For now, if we query a CSV file with a non-existent column, it returns null.
But it should return null or the default value of that column.
2. Add regression tests after the Hive docker is ready.
The regression test may occasionally fail because of heartbeat failures.
So I added 2 new FE configs to relax this limit:
1. `disable_backend_black_list`
Set to true to avoid putting a Backend in the black list even if we fail to send a task to it. Default is false.
2. `max_backend_heartbeat_failure_tolerance_count`
A Backend is set as dead only if the number of heartbeat failures exceeds this config. Default is 1.
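For example, assuming both configs are mutable at runtime (otherwise set them in `fe.conf`):
```
-- Do not put a Backend in the black list on task-send failure.
ADMIN SET FRONTEND CONFIG ("disable_backend_black_list" = "true");

-- Tolerate up to 3 heartbeat failures before marking a Backend as dead.
ADMIN SET FRONTEND CONFIG ("max_backend_heartbeat_failure_tolerance_count" = "3");
```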
1. Modify the default behavior of `build.sh`
`BUILD_JAVA_UDF` now defaults to ON, so a JVM is needed for compilation and at runtime.
2. Add docker-compose files for MySQL 5.7, PostgreSQL 14, and Hive 2
See `docker/thirdparties/docker-compose`.
3. Add some regression test cases for JDBC queries on the MySQL, PG, and Hive catalogs
These cases are disabled by default (`false`); if set to `true`, you need to first start the Docker containers for MySQL/PG/Hive.
4. Support `if not exists` and `if exists` for create/drop resource and create/drop encryptkey
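A sketch of the new syntax (the JDBC resource and its properties are illustrative):
```
CREATE RESOURCE IF NOT EXISTS "jdbc_mysql" PROPERTIES (
    "type" = "jdbc",
    "user" = "root",
    "password" = "",
    "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
    "driver_url" = "mysql-connector-java-5.1.47.jar",
    "driver_class" = "com.mysql.jdbc.Driver"
);
DROP RESOURCE IF EXISTS "jdbc_mysql";

CREATE ENCRYPTKEY IF NOT EXISTS my_key AS "ABCD123456789";
DROP ENCRYPTKEY IF EXISTS my_key;
```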