5755 Commits

Author SHA1 Message Date
221e860cb7 [Feature](Routine Load)Support Partial Update (#22785) 2023-08-10 17:41:53 +08:00
df26fb2de4 [fix][alter table property] fix alter table property failed (#22791) 2023-08-10 17:12:42 +08:00
fd0c161081 [enhance](ColdHeatSeparation) forbid change storage policy to another one with different storage resource (#22519) 2023-08-10 16:32:09 +08:00
50fbe31f93 [fix](tablet report) fix not add replicas when a backend re join the cluster after changing its ip or port (#22700) 2023-08-10 15:29:28 +08:00
ec0cedab51 [opt](stats) Use single connect context for each olap analyze task
1. Add some comments
2. Fix potential NPE caused by deleting a running analyze job
3. Use single connect context for each olap analyze task
2023-08-10 15:04:28 +08:00
f7d00d467a [fix](multicatlog) fix read hive/iceberg catalog on cosn & fix read data via broker (#22087)
* Update FileSystemFactory.java
2023-08-10 14:44:53 +08:00
f2658dc7bd [Feature](multi-catalog) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema. (#22318)
Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema by session var `truncate_char_or_varchar_columns`.
2023-08-10 14:37:20 +08:00
f1db6bd8c1 [feature](hive)append support for struct and map column type on textfile format of hive table (#22347)
1. Add support for struct and map column types on the textfile format of hive tables.
2. Optimize the code that handles the array column type.

```mysql
+------+------------------------------------+
| id   | perf                               |
+------+------------------------------------+
| 1    | {"key1":"value1", "key2":"value2"} |
| 1    | {"key1":"value1", "key2":"value2"} |
| 2    | {"name":"John", "age":"30"}        |
+------+------------------------------------+
```

```mysql
+---------+------------------+
| column1 | column2          |
+---------+------------------+
|       1 | {10, "data1", 1} |
|       2 | {20, "data2", 0} |
|       3 | {30, "data3", 1} |
+---------+------------------+
```
Summary of supported complex types (a custom delimiter can be specified):

1. array< primitive_type > and array< array< ... > >
2. map< primitive_type , primitive_type >
3. struct< primitive_type , primitive_type ... >
2023-08-10 13:47:58 +08:00
57fb9799b5 [feature](agg) add aggregation function 'bitmap_agg' (#22768)
This function can replace bitmap_union(to_bitmap(expr)), which must first create many small bitmaps and then merge them into a single bitmap.
bitmap_agg converts the column values into a bitmap directly, so it performs better than bitmap_union(to_bitmap(expr)). In our test, the improvement was about 30%.
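The difference between the two approaches can be sketched as follows. This is only an illustration: Python sets stand in for bitmaps, and both function names are hypothetical rather than part of Doris.

```python
from functools import reduce


def union_of_small_bitmaps(values):
    """Mimics bitmap_union(to_bitmap(expr)): one small bitmap per row, then a merge."""
    small_bitmaps = [{v} for v in values]  # one single-element set per input row
    return reduce(set.union, small_bitmaps, set())


def direct_agg(values):
    """Mimics bitmap_agg(expr): each value is added straight into one bitmap."""
    bitmap = set()
    for v in values:
        bitmap.add(v)
    return bitmap


values = [3, 1, 3, 7, 1]
# Both paths produce the same result; only the intermediate objects differ.
assert union_of_small_bitmaps(values) == direct_agg(values)
```

The first path allocates one temporary object per row before merging, which is the overhead the commit message describes.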
2023-08-10 12:18:25 +08:00
35dd787ed7 [improvement](transaction) abort txn when be lost heartbeat over 1 min (#22781) 2023-08-10 12:04:42 +08:00
432c8f1d6a [opt](stats) No more sync unknown stats since cannot serialize (#22775)
Gson can't serialize INFINITY under current configuration
2023-08-10 11:46:56 +08:00
f001b9d5c8 [enhance](multi-catalog) support multi name service when config hive catalog #21825
when creating a catalog with multiple name services, as below:
CREATE CATALOG hive_prod_t1 PROPERTIES (
'type'='hms',
'hive.metastore.uris' = 'thrift://10.198.xxx:9011,thrift://11.11.xxx:9001,thrift://10.198.xxx:9011',
'hadoop.username' = 'user',
'dfs.nameservices'='ns1007,ns1017',
'dfs.ha.namenodes.ns1007'='nn1,nn2',
'dfs.namenode.rpc-address.ns1007.nn1'='10.198.xxxx:8120',
'dfs.namenode.rpc-address.ns1007.nn2'='10.198.xxx:8120',
'dfs.client.failover.proxy.provider.ns1007'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'dfs.ha.namenodes.ns1017'='nn1,nn2',
'dfs.namenode.rpc-address.ns1017.nn1'='10.198.xxxx:8120',
'dfs.namenode.rpc-address.ns1017.nn2'='10.198.xxxx:8120',
'dfs.client.failover.proxy.provider.ns1017'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
);

the result will be: ERROR 1105 (HY000): errCode = 2, detailMessage = Missing dfs.ha.namenodes.ns1007,ns1017 property
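
The error message suggests that the whole comma-separated value was treated as a single name service. A minimal sketch of a per-name-service check, assuming the properties arrive as a plain dictionary (the helper name `missing_ha_properties` is hypothetical, not Doris code):

```python
def missing_ha_properties(props):
    """Return the dfs.ha.namenodes.<ns> keys that are absent from props.

    Hypothetical helper: dfs.nameservices is split on commas so that each
    name service is checked on its own rather than as one combined name.
    """
    raw = props.get("dfs.nameservices", "")
    nameservices = [ns.strip() for ns in raw.split(",") if ns.strip()]
    return [
        f"dfs.ha.namenodes.{ns}"
        for ns in nameservices
        if f"dfs.ha.namenodes.{ns}" not in props
    ]


props = {
    "dfs.nameservices": "ns1007,ns1017",
    "dfs.ha.namenodes.ns1007": "nn1,nn2",
    "dfs.ha.namenodes.ns1017": "nn1,nn2",
}
# With both per-service keys present, nothing is reported as missing.
assert missing_ha_properties(props) == []
```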
2023-08-10 10:48:08 +08:00
eafdab0cfd [Enhancement](tvf) Add frontends_disks table-valued-function (#22568)
---------

Co-authored-by: yuxianbing <yuxianbing@yy.com>
Co-authored-by: yuxianbing <iloveqaz123>
2023-08-10 10:40:24 +08:00
b90a7748a6 [Feature](Job Schedule)implement Transient Task Register (#22665)
Implement the TransientTaskRegister to support submitting transient tasks which do not require a timer trigger.
Rename some classes:
TimerTaskDisruptor -> TaskDisruptor
TimerTaskEvent -> TaskEvent
TimerTaskExpirationHandler -> TaskHandler
AsyncJobManager -> TimerJobManager
MemoryTask -> TransientTask
2023-08-10 10:34:13 +08:00
8591257d74 [fix](nereids) parallel instance number is set to 1 incorrectly (#22748)
Make PlanNode.getNumInstance() abstract to force every PlanNode to specify how its numInstance is defined.
By default, PlanNode.numInstance is 1. A PlanNode other than exchangeNode should not use this default value directly. PlanNode.numInstance is intended for PlanNodes that change numInstance, such as the exchange node.
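The effect of making the method abstract can be sketched in Python with `abc`. The real classes are Java; the `ScanNode` subclass and its `scan_range_count` field below are hypothetical and only show that each subclass must supply its own definition.

```python
from abc import ABC, abstractmethod


class PlanNode(ABC):
    @abstractmethod
    def get_num_instance(self) -> int:
        """Every concrete node must say how its instance count is derived."""


class ExchangeNode(PlanNode):
    def __init__(self, num_instance: int = 1):
        # The exchange node keeps an explicit, changeable instance count.
        self.num_instance = num_instance

    def get_num_instance(self) -> int:
        return self.num_instance


class ScanNode(PlanNode):
    def __init__(self, scan_range_count: int):
        # Hypothetical rule: one instance per scan range.
        self.scan_range_count = scan_range_count

    def get_num_instance(self) -> int:
        return self.scan_range_count
```

A subclass that forgets to override `get_num_instance` cannot be instantiated, so the default of 1 can no longer be picked up by accident.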
2023-08-10 10:17:37 +08:00
8a5021c235 [Fix](Sql)NPE when the Delete statement does not specify a where condition (#22766)
Execute Sql

delete from test_table.
2023-08-09 11:51:46,586 WARN (mysql-nio-pool-7|540) [StmtExecutor.analyze():987] Analyze failed. stmt[25, 519f916eeb94a8b-afe8e1094fb39fc1]
java.lang.NullPointerException: null
        at org.apache.doris.rewrite.ExprRewriter.applyRuleBottomUp(ExprRewriter.java:236) ~[classes/:?]
        at org.apache.doris.rewrite.ExprRewriter.applyRule(ExprRewriter.java:226) ~[classes/:?]
        at org.apache.doris.rewrite.ExprRewriter.applyRuleRepeatedly(ExprRewriter.java:216) ~[classes/:?]
        at org.apache.doris.rewrite.ExprRewriter.rewrite(ExprRewriter.java:166) ~[classes/:?]
        at org.apache.doris.rewrite.ExprRewriter.rewrite(ExprRewriter.java:151) ~[classes/:?]
        at org.apache.doris.analysis.DeleteStmt.analyze(DeleteStmt.java:127) ~[classes/:?]
        at org.apache.doris.qe.StmtExecutor.analyze(StmtExecutor.java:983) ~[classes/:?]
        at org.apache.doris.qe.StmtExecutor.executeByLegacy(StmtExecutor.java:660) ~[classes/:?]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:448) ~[classes/:?]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:419) ~[classes/:?]
        at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:441) ~[classes/:?]
        at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:589) ~[classes/:?]
        at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:826) ~[classes/:?]
        at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[classes/:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:829) ~[?:?]
Fix Result

[HY000][1105] errCode = 2, detailMessage = Where clause is not set
Affected version

2.0-Alpha +
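
A minimal sketch of the kind of guard that turns the NullPointerException into a readable error. The names `AnalysisError` and `analyze_delete` are hypothetical; the actual fix lives in the Java analysis code.

```python
class AnalysisError(Exception):
    """Stand-in for an analysis-phase error reported to the client."""


def analyze_delete(table_name, where_clause):
    # Reject the statement early instead of passing None to the rewriter.
    if where_clause is None:
        raise AnalysisError("Where clause is not set")
    return f"DELETE FROM {table_name} WHERE {where_clause}"
```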
2023-08-10 10:15:49 +08:00
b25d52b736 [feature](cast) remove some unused in functioncast and support some function in nereids (#22729)
1. ConvertImplGenericFromString does not need a StringColumnType template parameter
2. remove timev1 in function cast
3. support time_to_sec and sec_to_time in nereids
2023-08-10 10:10:32 +08:00
919bfd73f1 [improvement](multi-catalog)add scanner isolation class loader (#22247)
Add scanner isolation class loader to make each plugin non-conflicting.
The BE will get scanner classes by JNI call and use JniClassLoader to load them.
In the last version, we always got scanner classes from the system class path by default,
so it could not isolate the classes for each scanner.
2023-08-10 10:02:46 +08:00
df1f67d835 [improve](insert) Support server side prepare insert stmt (#22353) 2023-08-10 09:59:17 +08:00
768088c95e [refactor](udaf) refactor call udaf function and support map type in return (#22508) 2023-08-09 22:44:07 +08:00
3b7a0a4713 [fix](cache) Fix enable sql cache lead to FE Full GC or OOM #22769 2023-08-09 19:24:04 +08:00
Pxl
89dc1f73b2 [Bug](materialized-view) make mv matched when preagg have value column predicate contained in mv'where clause (#22779)
1. make mv matched when preagg has a value column predicate contained in the mv's where clause
2. fix `org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = BITMAP_UNION need input a bitmap column, but input INVALID_TYPE`
3. make the error message more detailed when parsing a create mv stmt fails
2023-08-09 19:17:55 +08:00
HB
5147c096ef [Enhancement] Add an API to query session information for all FEs (#20134)
Currently, Doris only has an interface for querying the session information of a specific FE, but we often need to know how many sessions there are across the current cluster, so I added this API.

```
GET /rest/v1/session/all

{
    "msg": "success",
    "code": 0,
    "data": {
        "column_names": ["FE", "Id", "User", "Host", "Cluster", "Db", "Command", "Time", "State", "Info"],
        "rows": [{
            "FE": "10.14.170.23",
            "User": "root",
            "Command": "Sleep",
            "State": "",
            "Cluster": "default_cluster",
            "Host": "10.81.85.89:31465",
            "Time": "230",
            "Id": "0",
            "Info": "",
            "Db": "db1"
        },
        {
            "FE": "10.14.170.24",
            "User": "root",
            "Command": "Sleep",
            "State": "",
            "Cluster": "default_cluster",
            "Host": "10.81.85.88:61465",
            "Time": "460",
            "Id": "1",
            "Info": "",
            "Db": "db1"
        }]
    },
    "count": 2
}
```
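
A response of this shape could be summarized per FE as sketched below. The helper `sessions_per_fe` is hypothetical, and the sample string is a trimmed copy of the response above rather than live output.

```python
import json
from collections import Counter


def sessions_per_fe(response_text):
    """Count session rows per FE in a /rest/v1/session/all style response."""
    body = json.loads(response_text)
    if body.get("code") != 0:
        raise RuntimeError(f"request failed: {body.get('msg')}")
    return Counter(row["FE"] for row in body["data"]["rows"])


# Trimmed sample based on the response shown above.
sample = (
    '{"msg": "success", "code": 0, "data": {"rows": ['
    '{"FE": "10.14.170.23", "Id": "0"}, '
    '{"FE": "10.14.170.24", "Id": "1"}]}, "count": 2}'
)
counts = sessions_per_fe(sample)
```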
2023-08-09 19:02:45 +08:00
e6a860fc9e [memo](nereids) record the chosen group expression in Group (#22661)
1. remember the chosen plan in group
2. set groupId after RecomputeLogicalPropertiesProcessor
2023-08-09 18:44:46 +08:00
2a13d15d20 [feature](Nereids): disable join order when the join number > 63 (#22708) 2023-08-09 17:09:38 +08:00
77d3d4e324 [fix](cache) add sql cache conf cache_result_max_data_size (#22645)
Limiting only the maximum number of rows in the sql cache (cache_result_max_row_count) is not enough: if a single row of data is too large, the FE may still run out of memory.
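A minimal sketch of a check that applies both limits. Rows are modeled as `bytes` objects and the function name is hypothetical; the real check in the FE may measure size differently.

```python
def can_cache_result(rows, max_row_count, max_data_size):
    """Return True only if the result fits both the row and the byte limit."""
    if len(rows) > max_row_count:
        return False
    total_bytes = sum(len(row) for row in rows)
    return total_bytes <= max_data_size


# One very large row passes the row-count limit but not the size limit.
assert can_cache_result([b"x" * 1000], max_row_count=3000, max_data_size=100) is False
```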
2023-08-09 14:46:23 +08:00
690a519742 [fix](Nereids) disable or expansion when pipeline engine is disable (#22719) 2023-08-09 14:33:50 +08:00
9533918d4f [fix](delete) Fix parsing error of delete where date statement (#22690) 2023-08-09 12:33:03 +08:00
a8d690272f [refactor](Nereids) let topn runtime filter as PhysicalTopN's attr (#22745)
The original implementation used a MutableMap on PhysicalTopN.
It is easy to lose if we rewrite the plan after this processor.
The new implementation uses an attr to indicate whether to use the topn runtime filter.
2023-08-09 12:13:21 +08:00
4608dcb2d9 [fix](agg) fix coredump caused by push down count aggregation (#22699)
2023-08-09 10:21:20 +08:00
d3baac2952 [improvement](resource-tag) Add Backend tag location check (#22670)
Add Backend tag location check.
This prevents a user from setting a bad backend tag, which causes table creation and dynamic partition creation to fail.
For example, the default tag of all backends is default. When setting the replication_allocation of a table, a user can run the following command: ALTER TABLE example_db.mysql_table SET ("replication_allocation" = "tag.location.tag1: 1");. It succeeds even though tag1 does not exist, so dynamic partitions cannot be created.
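Such a check can be sketched as below. The parsing of the `tag.location.<tag>: <n>` entries and the function name are assumptions for illustration, not the FE implementation.

```python
import re

# Assumed entry format, taken from the example command: "tag.location.<tag>: <n>"
ENTRY = re.compile(r"\s*tag\.location\.(\w+)\s*:\s*(\d+)\s*")


def check_tag_locations(replication_allocation, backend_tags):
    """Raise ValueError if the allocation names a tag that no backend carries."""
    for part in replication_allocation.split(","):
        match = ENTRY.fullmatch(part)
        if match is None:
            raise ValueError(f"invalid replication_allocation entry: {part!r}")
        tag = match.group(1)
        if tag not in backend_tags:
            raise ValueError(f"no backend has tag location {tag!r}")


# All backends carry the default tag, so this allocation is accepted.
check_tag_locations("tag.location.default: 3", {"default"})
```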
2023-08-09 00:08:34 +08:00
7bfcee6e71 [improvement](variable) add annotations for variables (#22292) 2023-08-08 22:16:42 +08:00
b5d7e6e7d8 [improvement](stats) Add lifecycle hooks to AnalysisTask to make codes more clear (#22658) 2023-08-08 19:06:47 +08:00
a04e30d087 [Fix](Job)Fix Job schedule calculation start time (#22707)
Since we use integer division in the calculation, when the start time is not specified,
the computed time may deviate from the expected time.

For example, suppose it is now the 7th minute and the job runs every two minutes.
The first execution is then calculated to be at the 8th minute, because 7/2=3
and 3+1=4, and the 4th cycle starts at minute 8.
Ideally, it should first run at the 9th minute.
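The arithmetic can be reproduced with a short sketch. Both function names are hypothetical, and reading the expected 9th minute as "now plus one interval" is an assumption based on the example.

```python
def first_run_by_division(now, interval):
    """Old behaviour as described: round up to the next multiple of the interval."""
    return (now // interval + 1) * interval


def first_run_from_now(now, interval):
    """Assumed intended behaviour: one full interval after the current time."""
    return now + interval


assert first_run_by_division(7, 2) == 8  # 7 // 2 = 3, 3 + 1 = 4, 4 * 2 = 8
assert first_run_from_now(7, 2) == 9
```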
2023-08-08 18:30:38 +08:00
50dd318183 [style](jdbc catalog) Tidy the jdbc catalog java file directory (#22691) 2023-08-08 18:21:21 +08:00
f2dca848db [chore](Nereids): optimize to handle enforcer in MergeGroup() (#22709) 2023-08-08 16:56:34 +08:00
0f15d86c43 [fix](nereids) decimalv2 and float like type's common type should be consistant with old planner in arithmetic expr (#22654)
when an arithmetic expr contains both decimalv2 and a float-like type, the common type depends on the roundPreciseDecimalV2Value session variable. If it is true, the common type is DecimalV2Type.SYSTEM_DEFAULT; otherwise it is the double type.
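The rule can be sketched as a small function. Type names are plain strings here, and treating `float` and `double` as the float-like types is an assumption; the function is illustrative only.

```python
FLOAT_LIKE = {"float", "double"}  # assumed set of float-like types


def common_type(left, right, round_precise_decimalv2_value):
    """Common type for an arithmetic expr mixing decimalv2 with a float-like type."""
    types = {left, right}
    if "decimalv2" in types and types & FLOAT_LIKE:
        return "decimalv2" if round_precise_decimalv2_value else "double"
    raise NotImplementedError("other type combinations are outside this sketch")
```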
2023-08-08 15:22:04 +08:00
66784cef71 [Enhancement](Load) Stream Load using SQL (#22509)
This PR was originally #16940, but that PR has not been updated for a long time by the original author @Cai-Yao. For now, we will merge some of the code into master first.

thanks @Cai-Yao @yiguolei
2023-08-08 13:49:04 +08:00
c4def9db5c [feature](Nereids): add enforcers in Group (#22660) 2023-08-08 13:39:55 +08:00
1617368ee1 [fix](planner) fix bug of push constant conjuncts through set operation node (#22695)
when pushing down a constant conjunct into a set operation node, we should assign the conjunct to the agg node if there is one. This is consistent with pushing a constant conjunct into an inline view.
2023-08-08 12:25:42 +08:00
d77b77a33f [feature](Nereids) eliminate sort that is not directly below result sink (#22550)
eliminate sort that is not directly below result sink.
TODO:
handle select c1 + c2 from (select c1, c2 from t order by c1) v;
2023-08-08 11:19:10 +08:00
e578e1e6a2 [opt](Nereids) turnoff pipeline when dml temporary (#22693)
pipeline could not work well for dml
2023-08-08 10:26:40 +08:00
36cea89c22 [Fix](planner)support delete conditions contain non-key columns and add check in analyze phase for delete. (#22673) 2023-08-07 21:49:53 +08:00
f074909d3c [opt](Nereids) disable strict consistency dml by default temporary (#22672)
TODO:
1. optimize exchange performance
2. let table sink do merge on one replica
2023-08-07 19:38:35 +08:00
d1a2473944 [Feature](broker)Support GCS (#20904) 2023-08-07 19:37:18 +08:00
9c91e80b0c [feature](Nereids): pushdown COUNT(*) through join (#22545) 2023-08-07 12:53:27 +08:00
97adbaadb9 fix full auto analyze (#22650) 2023-08-07 11:41:38 +08:00
023815a4b4 [fix](planner)runtime filter shouldn't be pushed through window function node (#22501) 2023-08-07 09:57:12 +08:00
1a8a1e5b16 [Feature](count_by_enum) support count_by_enum function (#22071)
count_by_enum(expr1, expr2, ... , exprN);

Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
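
The counting described above can be sketched for a single column as follows. The result layout and key names are assumptions for illustration and may not match what the Doris function returns.

```python
from collections import Counter


def count_by_enum(column):
    """Count each distinct non-null value, plus non-null and null totals."""
    non_null = [v for v in column if v is not None]
    return {
        "counts": dict(Counter(non_null)),  # occurrences per enumerated value
        "not_null": len(non_null),
        "null": len(column) - len(non_null),
    }


result = count_by_enum(["F", "M", "F", None])
```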
2023-08-06 16:05:14 +08:00
fce78fff92 [fix](rest)check response code when get image (#22272) 2023-08-06 10:34:10 +08:00