Commit Graph

2951 Commits

Author SHA1 Message Date
45b31506c7 [improvement](delete) support delete from partitioned table without partition specified (#13533)
Support deleting from a partitioned table without specifying a partition in the `DELETE` statement.

## Usage
For a partitioned table, you can specify a partition explicitly.
If none is specified, Doris infers the partition from the given conditions.
Doris cannot infer the partition from the conditions in two cases:
1) the conditions do not contain any partition column;
2) the operator applied to the partition column is `not in`.
When a partitioned table does not specify the partition,
or the partition cannot be inferred from the conditions,
the session variable `delete_without_partition` must be set to `true`
for the delete statement to be applied to all partitions.
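
The two cases in which inference fails can be sketched as a small predicate. This is only an illustration of the rule as described above, not the Doris implementation; the `(column, operator)` shape of a condition and the function name `can_infer_partition` are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical model of a delete condition: (column name, operator).
Condition = Tuple[str, str]


def can_infer_partition(conditions: Iterable[Condition],
                        partition_columns: Iterable[str]) -> bool:
    """Return False in the two cases where inference is said to fail."""
    partition_cols = {c.lower() for c in partition_columns}
    on_partition = [(col, op) for col, op in conditions
                    if col.lower() in partition_cols]
    # Case 1: the conditions do not contain any partition column.
    if not on_partition:
        return False
    # Case 2: the operator applied to a partition column is `not in`.
    if any(op.strip().lower() == "not in" for _, op in on_partition):
        return False
    return True


if __name__ == "__main__":
    print(can_infer_partition([("dt", "=")], ["dt"]))
    print(can_infer_partition([("city", "=")], ["dt"]))
    print(can_infer_partition([("dt", "NOT IN")], ["dt"]))
```

When this predicate is false and no partition is named, the text above says `delete_without_partition` must be `true` for the delete to proceed.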

## Test case
A test case is added in `regression-test/suites/delete_p0/test_delete_from_partition.groovy`;
users can now delete from a partitioned table without specifying a partition.
2022-10-27 21:32:45 +08:00
ec86e9c9b2 [feature-wip][MTMV] The schedule framework for the MTMV (#13147)
Design document: https://github.com/apache/doris/issues/13146
2022-10-27 11:37:24 +08:00
0e70d681d9 [feature](Nereids): Construct join graph (#13679)
* feat: add hypergraph and its api
* feat: add visualization api
* remove unused code
* fix format
* remove unused test
* remove unused tests
* format

Signed-off-by: xiejiann <jianxie0@gmail.com>
2022-10-27 11:32:31 +08:00
2697f72d77 [Improvement][SET-PROPERTY] Support for set query_timeout property (#13444) 2022-10-27 10:03:39 +08:00
7557980d64 [improvement](regression-test) avoid query empty result after loading finished (#13682)
When running regression tests, we often found that a query returned an empty result after loading finished,
even if we called "sync" before the query.
This is because for `stream load`, the load task result is returned immediately after the txn's status changes to VISIBLE,
but before the edit log is written.
So if we run the query right after getting the load task result, we may not see the latest loaded data.

The same issue exists with the `insert` operation.
2022-10-27 09:47:18 +08:00
5bd66243ee [minor](log) remove some unused logs (#13689)
1. When running regression tests with specific suites or groups, do not print other suite names or file names.
2. Remove the unused alter table job log.
2022-10-27 09:37:32 +08:00
3c95106d45 [Bug](jdbc) Fix memory leak for JDBC datasource (#13657) 2022-10-27 00:02:25 +08:00
ddb27b9c3f nereids use decimal(27,9) (#13678) 2022-10-26 21:37:24 +08:00
f4c8d4ce85 [feature](nereids) estimate plan cost by column ndv and table row count (#13375)
In this version, we use column ndv information to estimate plan cost.

This is the first version and covers TPCH queries.
2022-10-26 20:35:10 +08:00
bed759b3f5 [Fix](array-type) support CTAS for ARRAY column from collect_list and collect_set (#13627)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-26 19:42:15 +08:00
0841c5bf28 [Bugfix](manager) fix query profile key incompatible with old versions (#13596) 2022-10-26 14:27:58 +08:00
3548d0b824 [fix](statistics) fix cross join statistics exception (#13645) 2022-10-26 14:10:57 +08:00
c418bbd2d1 [feature-wip](new-scan) support Json reader (#13546)
Issue Number: close #12574
This PR adds `NewJsonReader`, which implements the `GenericReader` interface to support reading JSON format files.

TODO:
1. modify `_scann_eof` later.
2. Rename `NewJsonReader` to `JsonReader` when `JsonReader` is deleted.
2022-10-26 12:52:21 +08:00
44c9163b3c [Fix](multi-catalog)Fix partition external table query bug. (#13535)
The index of external table columns that come from the path is incorrect in the new scanner. This PR fixes it.
For example, in the following query, the `nation` and `city` columns come from the path:
```
mysql> select nation, city, count(*) from parquet_two_part group by nation, city;
+--------+------------+----------+
| nation | city       | count(*) |
+--------+------------+----------+
| cn     | beijing    |  1199969 |
| cn     | shanghai   |  1199771 |
| jp     | tokyo      |   599715 |
| rus    | moscow     |   600659 |
| us     | chicago    |  1199805 |
| us     | washington |  1201296 |
+--------+------------+----------+
6 rows in set (0.39 sec)
```
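
As a rough illustration of what "columns from path" means, the sketch below extracts such values from a file path. It assumes a Hive-style `key=value` directory layout, which the commit message does not spell out; the function name and the example path are hypothetical, and this is not the scanner's actual code.

```python
def partition_values_from_path(path: str) -> dict:
    """Collect key=value directory segments from a path, in path order."""
    values = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Keep only segments of the form key=value.
        if sep and key:
            values[key] = value
    return values


if __name__ == "__main__":
    # Hypothetical file location for the table queried above.
    example = "/warehouse/parquet_two_part/nation=cn/city=beijing/part-00000.parquet"
    print(partition_values_from_path(example))
```

Because these values never appear inside the data file, the scanner has to track their column positions separately from the file columns, which is where an incorrect index can arise.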
2022-10-26 12:47:37 +08:00
15130c469f [fix](planner) cannot recogonize column's table when analyze rewrite expr (#13597)
We save the mv column with the alias as its table name, but search for it with the original table name.
2022-10-26 11:15:48 +08:00
e5b33abd3c [fix](planner) inlineView alias error (#13600) 2022-10-26 10:14:04 +08:00
e385cb063c [improvement](config) allow to modify the master-only configuration of non-master nodes (#13558)
Non-master nodes may switch to master, so we should allow the master-only configuration to be modified on all FE nodes.
2022-10-26 10:00:12 +08:00
c709998faa [improvement][refactor](mysql) remove old mysql server and add keep alive option (#13663)
2022-10-26 09:38:33 +08:00
a02a56eb38 [fix](postgresql) fix postgresql cann't find table (#13550) 2022-10-26 09:30:28 +08:00
e00734348b [Chore](regression) Fix wrong result for decimal (#13644) 2022-10-26 09:24:46 +08:00
c486d9746d [fix](broker) fix bug when broker load with s3a (#13650)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2022-10-26 09:22:38 +08:00
9691db7918 [Enhancement](metrics) add more metrics (#11693)
* Add `AutoMappedMetric` to measure dynamic object.
* Add query instance and rpc metrics
* Add thrift rpc metrics
* Add txn metrics
* Reorganize metrics init routine.

Co-authored-by: 迟成 <chicheng@meituan.com>
2022-10-26 08:31:03 +08:00
17ba40f947 [feature-wip](CN Node)Support compute node (#13231)
Introduce the node role to Doris; table creation and the tablet scheduler ensure that storage is assigned only to BE nodes.
2022-10-25 21:44:33 +08:00
cb39671a73 [fix](policy) Add readlock for show policy (#13497)
Add a read lock for show policy to resolve the ConcurrentModificationException.
2022-10-25 21:42:40 +08:00
d6c3470c8d [feature](Nereids) support materialized index selection (#13416)
This PR unifies the selection of rollup indexes and materialized view indexes into a single logic, called materialized index selection.

Main steps:

### Find candidate indexes
1. When an aggregate is present, it is handled in `SelectMaterializedIndexWithAggregate`. The base index and the indexes that can use pre-aggregation are candidates. The pre-aggregation status is determined by the aggregation functions, grouping expressions, and pushed-down predicates.
2. When there is no aggregate on top of the scan node, it is handled in `SelectMaterializedIndexWithoutAggregate`. The base index and the indexes that have all the key columns are candidates.

### Filter and order the candidate indexes
1. Keep the indexes that contain all the required output scan columns.
2. Keep the indexes that match the prefix index the most.
3. Order the resulting indexes by row count, column count, and index id.
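
The filter-and-order steps above can be sketched as follows. This is a minimal illustration, not the Nereids implementation: the `Index` class, the way the prefix match is measured (leading index columns that appear in predicates), and the ascending sort direction are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Index:
    """Hypothetical candidate index; `columns` are in index order."""
    index_id: int
    columns: Tuple[str, ...]
    row_count: int


def prefix_match_length(index: Index, predicate_columns: set) -> int:
    """Count the leading index columns that appear in predicates."""
    matched = 0
    for col in index.columns:
        if col not in predicate_columns:
            break
        matched += 1
    return matched


def select_index(candidates: Sequence[Index],
                 required_columns: Iterable[str],
                 predicate_columns: Iterable[str]) -> Optional[Index]:
    required = set(required_columns)
    predicates = set(predicate_columns)
    # Step 1: keep indexes that contain all required output scan columns.
    covering = [i for i in candidates if required <= set(i.columns)]
    if not covering:
        return None
    # Step 2: keep indexes with the longest prefix match.
    best = max(prefix_match_length(i, predicates) for i in covering)
    matched = [i for i in covering
               if prefix_match_length(i, predicates) == best]
    # Step 3: order by row count, column count, index id (ascending assumed).
    return min(matched, key=lambda i: (i.row_count, len(i.columns), i.index_id))


if __name__ == "__main__":
    base = Index(1, ("k1", "k2", "v1"), 1000)
    rollup = Index(2, ("k1", "v1"), 100)
    other = Index(3, ("k2", "v1"), 10)
    print(select_index([base, rollup, other], ["k1", "v1"], ["k1"]))
```

In the usage example, `other` is dropped in step 1 because it lacks `k1`, and `rollup` wins in step 3 because it has fewer rows than the base index.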
2022-10-25 19:25:58 +08:00
1741a20689 [opt](planer) remove unless cast of avg function (#13593) 2022-10-25 17:33:02 +08:00
f209b7ab6e [fix](Nereids) add exchange node check between local and global agg in plan translator (#12913)
### table schema
CREATE TABLE `t1` (
  `k1` int(11) NULL,
  `v1` int(11) NULL
) ENGINE=OLAP
DUPLICATE KEY(`k1`, `v1`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`)
BUCKETS 3
PROPERTIES('replication_num'='1')

### query
select k1,count(distinct v1+1) from t1 group by k1;

### error
java.lang.ClassCastException: org.apache.doris.planner.OlapScanNode cannot be cast to org.apache.doris.planner.AggregationNode
2022-10-25 16:55:29 +08:00
e103531e69 [fix](sort)order by constant expr bug (#13613)
Issue Number: close (#13350)
2022-10-25 16:43:18 +08:00
57e248e09b [feature-wip](unique-key-merge-on-write) check whether the partition column is a key column when create table for MOW table (#13490) 2022-10-24 21:16:38 +08:00
409bd76999 [improve](Nereids): ReorderJoin eliminate this recursion (#13505) 2022-10-24 17:11:43 +08:00
7faad9f004 [FIX](agg)fix group by constant child expr bug (#13485) 2022-10-24 16:32:36 +08:00
177e82bdab [Enhancement](array-type) Add type derivation for array functions (#13534)
Currently, type derivation is not supported for the arguments of array functions,
so the cases below return wrong values or even cause a BE core dump.

mysql> select array_union([1],[10000000]);
+----------------------------------------+
| array_union(ARRAY(1), ARRAY(10000000)) |
+----------------------------------------+
| [1, -128]                              |
+----------------------------------------+
1 row in set (0.03 sec)

mysql> select array_union([NULL],[1]);
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason

mysql> select array_union([],[1]);
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
This commit makes a small fix to derive the argument types of array functions:
1. For a null type in the arguments, cast the null type to boolean type, because the null type should not be seen in BE.
2. For different types in the arguments, cast all argument types to their compatible type.
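
A small sketch of the two rules, under stated assumptions: the widening order `BOOLEAN < TINYINT < SMALLINT < INT < BIGINT`, the helper names, and the treatment of two empty arrays are hypothetical and are not taken from the Doris source. The first line of the demo also shows that the `[1, -128]` result above is consistent with `10000000` being narrowed to a signed 8-bit element.

```python
import ctypes

# Hypothetical widening order used only for this illustration.
WIDENING = ["BOOLEAN", "TINYINT", "SMALLINT", "INT", "BIGINT"]
INT_TYPES = [("TINYINT", 8), ("SMALLINT", 16), ("INT", 32), ("BIGINT", 64)]


def literal_type(value) -> str:
    """Smallest type that holds a literal; None maps to the null type."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "BOOLEAN"
    for name, bits in INT_TYPES:
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return name
    raise ValueError("integer literal out of range")


def compatible_type(types) -> str:
    # Rule 1: the null type is cast to boolean.
    normalized = ["BOOLEAN" if t == "NULL" else t for t in types]
    # Rule 2: all arguments are cast to the widest type present.
    # Falling back to BOOLEAN for no elements at all is an assumption.
    return max(normalized, key=WIDENING.index, default="BOOLEAN")


def array_union_element_type(left, right) -> str:
    return compatible_type([literal_type(v) for v in list(left) + list(right)])


if __name__ == "__main__":
    print(ctypes.c_int8(10000000).value)           # 8-bit narrowing
    print(array_union_element_type([1], [10000000]))
    print(array_union_element_type([None], [1]))
    print(array_union_element_type([], [1]))
```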
2022-10-24 11:51:47 +08:00
54545c6446 [improvement](config) enlarge default value of create_table_timeout and remove disable_stream_load_2pc (#13520)
Users should not need to set `create_table_timeout`: creating a table is a DDL command, and when a timeout occurs,
users simply set a larger timeout and retry.

Stream load 2PC is used by default in the Flink connector, so we should not disable it by config; the config
item is useless.
2022-10-24 11:51:18 +08:00
e53baeea17 [Improvement](Audit): Remove default_cluster for audit log db #13499 2022-10-24 11:46:06 +08:00
9a47e2dab4 [improvement](profile) Change some profile warning log to debug log (#13539) 2022-10-24 10:34:15 +08:00
4754ccf16b [refactor](Nereids) remove useless explaration job in cascades framework (#13545) 2022-10-24 10:33:24 +08:00
d266b8fb50 [fix](restore) fix wrong replica allcation after restore (#13575)
How to reproduce:
1. create a table with replica allocation, eg:
    ```
    "replication_allocation"="tag.location.group_01:1, tag.location.group_02:1, tag.location.group_03:1"
    ```
2. Backup this table
3. Restore this table with specific replication allocation, eg: `"replication_allocation" = "tag.location.default: 3"`
4. After the restore, when executing `show create table xxx`, you will see that the `replication_allocation` is still:
    ```
    "replication_allocation"="tag.location.group_01:1, tag.location.group_02:1, tag.location.group_03:1"
    ```
    This is not what we expected.
5. But if you execute `show partitions from xxx`, the replication allocation of each partition is what we expected:
    ```
    "replication_allocation" = "tag.location.default: 3"
    ```

This is because the restore job forgets to set the "default" replica allocation property of the table,
and the result of `show create table` is obtained from the "default" replica allocation property, not from the real replica property of each partition.
2022-10-24 09:42:28 +08:00
477b28efac [deps](fe)upgrade commons-text to 1.10.0 (#13562) 2022-10-23 23:30:02 +08:00
b042ef9765 [chore](macOS) Fix the issues with protoc and protoc-gen-grpc-java on M1 (#13571)
Some errors occur when building FE with a JDK (arm64) on M1 because the dependencies protoc and grpc-java don't support M1.
#13563 modified the build.sh to fix these issues by adding -Dos.arch=x86_64 to the build command.
However, if someone executes `mvn clean package -DskipTests=true` under the folder fe, the errors will occur again.

This PR introduces a better way to fix them.
2022-10-23 14:10:46 +08:00
3a3def447d [fix](csv-reader) fix bug that csv reader can not read text format hms table (#13515)
1. The field and line delimiters were missing.
2. When querying an external table with text (csv) format, we should pass the column position map to BE,
    otherwise the column order is wrong.

TODO:
1. For now, if we query a csv file with a non-existent column, it returns null.
    But it should return null or the default value of that column.
2. Add regression test after hive docker is ready.
2022-10-22 22:40:03 +08:00
413d2332ce [improvement](heartbeat) Add some relaxation strategies to reduce the failure probability of regression testing (#13568)
The regression test may occasionally fail because of heartbeat failures.
So this PR adds 2 new FE configs to relax this limit:

1. `disable_backend_black_list`
    Set to true to avoid putting a Backend on the black list even if we fail to send a task to it. Default is false.
2. `max_backend_heartbeat_failure_tolerance_count`
   A Backend is marked as dead only if the number of heartbeat failures exceeds this value. Default is 1.
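
The intent of the two configs can be sketched as below. This is a simplified model, not the FE code: the `Backend` class, counting consecutive failures, resetting the counter on success, and using `>=` for the threshold are all assumptions (the message only says the failure count must "exceed" the config).

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical, minimal view of a Backend's liveness state."""
    alive: bool = True
    consecutive_failures: int = 0


def on_heartbeat(backend: Backend, ok: bool, tolerance: int = 1) -> None:
    """Update liveness; `tolerance` stands in for
    max_backend_heartbeat_failure_tolerance_count."""
    if ok:
        backend.consecutive_failures = 0
        backend.alive = True
        return
    backend.consecutive_failures += 1
    # Assumption: the Backend is dead once failures reach the tolerance.
    if backend.consecutive_failures >= tolerance:
        backend.alive = False


def should_blacklist(send_failed: bool,
                     disable_backend_black_list: bool = False) -> bool:
    """With the config set to true, a failed send never blacklists."""
    return send_failed and not disable_backend_black_list


if __name__ == "__main__":
    be = Backend()
    for ok in (False, False, True):
        on_heartbeat(be, ok, tolerance=3)
        print(be)
```

With a tolerance of 3, two isolated heartbeat failures leave the Backend alive in this model, which is the kind of relaxation a flaky regression environment benefits from.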
2022-10-22 17:53:07 +08:00
wxy
60e9fe2b3a [fix](plugin) bugfix for dirty uninstallation of dynamic plugin (#13540) (#13543)
Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>
2022-10-21 23:28:06 +08:00
122f36e5be [RuntimeFilter] (vec) support runtime filter in indeed slot id (#13556) 2022-10-21 21:41:23 +08:00
02598931e6 [fix](ddl) check view name by table name regex when create (#13240) 2022-10-21 17:41:23 +08:00
ddc08ee690 [enhancement](Nereids) turn on stream pre aggregate for nereids (#13538)
Exactly the same behavior as the legacy optimizer.
2022-10-21 17:23:53 +08:00
847b80ebfa [test](jdbc) add jdbc and hive regression test (#13143)
1. Modify default behavior of `build.sh`
    `BUILD_JAVA_UDF` is now ON by default, so a JVM is needed for compilation and at runtime.

2. Add docker-compose for MySQL 5.7, PostgreSQL 14 and Hive 2
   See `docker/thirdparties/docker-compose`.

3. Add some regression test cases for jdbc query on MySQL, PG and Hive Catalog
   The default is `false`; if set to true, you first need to start docker for MySQL/PG/Hive.

4. Support `if not exists` and `if exists` for create/drop resource and create/drop encryptkey
2022-10-21 15:29:27 +08:00
Pxl
88ceace855 [Bug](predicate) fix core dump on bool type runtime filter (#13417)
2022-10-21 13:15:22 +08:00
b861b66bef [improve](Nereids): verify the join reorder search space; (#13498)
2022-10-21 11:48:04 +08:00
9a3c1f0867 [Improvement](decimal) print decimal according to the real precision and scale (#13437) 2022-10-21 10:00:01 +08:00
27d84eafc5 [feature](alter) support rename column for table with unique column id (#13410) 2022-10-21 08:45:34 +08:00