13288 Commits

Author SHA1 Message Date
d2cd0c30c7 [improvement](jdbc catalog) optimize the JDBC Catalog connection error message (#23868) 2023-09-11 10:26:54 +08:00
480fcef0a1 [typo](errmsg) Improve partition error message (#23968) 2023-09-11 10:25:06 +08:00
cd13f9e8c6 [BUG](view) fix can't create view with lambda function (#23942)
Previously the lambda function Expr did not implement toSqlImpl(),
so the parent implementation was called, which does not suit lambda functions
and caused an error when creating a view.
2023-09-11 10:04:00 +08:00
0896aefce3 [fix](local exchange) fix bug of accessing released counter of local data stream receiver (#24148) 2023-09-11 09:52:31 +08:00
a0fcc30764 [Fix](Status) Handle status code correctly and add a new error code ENTRY_NOT_FOUND (#24139) 2023-09-11 09:32:11 +08:00
dcde83d6e6 [Improve](regresstests)add boundary regress tests for map & array #24133 2023-09-11 08:28:11 +08:00
31bffdb5fc [enhancement](stats) audit for stats collection #24074
Log stats collection SQL statements in the audit log.
2023-09-11 08:26:12 +08:00
71db844c64 [feature](invert index) add tokenizer CharFilter preprocessing (#24102) 2023-09-10 23:08:28 +08:00
ebac816e85 Revert "[improvement](bitshuffle)Enable avx512 support in bitshuffle for performance boost (#15972)" (#24146)
This reverts commit 28fcc093a8958a6870fec9802b23db07a42bbd7b.
2023-09-10 23:06:21 +08:00
586492c124 [Feature](multi-catalog) Support sql cache for hms catalog (#23391)
**Support SQL cache for the hms catalog. Both the legacy planner and the Nereids planner are supported.
Partition cache and federated queries are not supported yet.**
2023-09-10 21:56:35 +08:00
9b3be0ba7a [Fix](multi-catalog) Do not throw exceptions when file not exists for external hive tables. (#23799)
A bug similar to #22140.

When executing a query with an hms catalog, the query may fail because some HDFS files do not exist. We should recognize this kind of error and skip the missing files.

```
errCode = 2, detailMessage = (xxx.xxx.xxx.xxx)[CANCELLED][INTERNAL_ERROR]failed to init reader for file hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-00000-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc, err: [INTERNAL_ERROR]Init OrcReader failed. reason = Failed to read hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-00000-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc: [INTERNAL_ERROR]Read hdfs file failed. (BE: xxx.xxx.xxx.xxx) namenode:hdfs://xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-00000-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc, err: (2), No such file or directory), reason: RemoteException: File does not exist: /xxx/dwd_tmp.db/check_dam_table_relation_record_day_data/part-00000-c4ee3118-ae94-4bf7-8c40-1f12da07a292-c000.snappy.orc
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:158)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1927)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:426)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
```
2023-09-10 21:55:09 +08:00
f85da7d942 [improvement](jdbc) add profile for jdbc read and convert phase (#23962)
Add 2 metrics to the jdbc scan node profile:
- `CallJniNextTime`: time of calling get next on the jdbc result set
- `ConvertBatchTime`: time of converting jobject to column block

Also fix a potential concurrency issue when initializing the jdbc connection cache pool.
2023-09-10 21:42:06 +08:00
ff92b5bc29 [Bug](pipelineX) Fix runtime filter on pipelineX (#24141) 2023-09-10 20:53:54 +08:00
a05003fbe1 [fix](pipeline) fix remove pipeline_x_context from fragment manager (#24062) 2023-09-10 20:53:26 +08:00
1df2e4454f [improvement](file-cache) increase virtual node number to make file cache more even (#24143)
The original virtual node number is `Math.max(Math.min(512 / backends.size(), 32), 2)`, which is too small
and causes uneven cache distribution when file cache is enabled.
2023-09-10 19:56:53 +08:00
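To see how the original formula in the commit above behaves, the following minimal Python sketch evaluates it for a few cluster sizes. It assumes `backends.size()` is an integer, so the Java division truncates; the function name `old_virtual_node_count` is hypothetical and not part of Doris. The new value chosen by the commit is not stated in the message, so it is not modeled here.

```python
def old_virtual_node_count(backend_count: int) -> int:
    """Evaluate max(min(512 / backend_count, 32), 2) with integer division.

    Assumption: the Java expression divides two ints, so the result truncates.
    """
    if backend_count <= 0:
        raise ValueError("backend_count must be positive")
    return max(min(512 // backend_count, 32), 2)


if __name__ == "__main__":
    # Print virtual nodes per backend and the total ring size for a few cluster sizes.
    for n in (1, 3, 16, 100, 500):
        per_backend = old_virtual_node_count(n)
        print(f"backends={n} per_backend={per_backend} total={n * per_backend}")
```

The formula caps each backend at 32 virtual nodes and drops to the floor of 2 on large clusters, which is the range the commit describes as too small.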
102abff071 [Fix](spark-load) ignore column name case in spark load (#23947)
Doris is not case sensitive to field names, so when doing spark load, we can convert all fields to lowercase for matching and loading.
2023-09-10 19:45:01 +08:00
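The lowercase matching described in the commit above can be illustrated with a short Python sketch. This is not the Doris spark load code; `match_columns` and its arguments are hypothetical names used only to show the idea of comparing field names after lowercasing them.

```python
def match_columns(source_fields, table_columns):
    """Map each source field to the table column whose lowercased name matches.

    Hypothetical helper: both sides are lowercased before comparison, so
    "UserName" in the source matches "username" in the table.
    """
    by_lower = {col.lower(): col for col in table_columns}
    mapping = {}
    for field in source_fields:
        target = by_lower.get(field.lower())
        if target is None:
            raise KeyError(f"no table column matches source field {field!r}")
        mapping[field] = target
    return mapping
```

For example, `match_columns(["ID", "UserName"], ["id", "username"])` pairs `ID` with `id` and `UserName` with `username`.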
8e171f5cbf [Enhancement](multi-catalog) merge hms partition events. (#22869)
This PR mainly makes two changes:

1. Add merge processing for partition events.
2. Add a unit test for `MetastoreEventFactory`. First add some mock classes (`MockCatalog`/`MockDatabase` ...) to simulate the real hms catalog/databases/tables/partitions, then create an event producer that can randomly produce every kind of `MetastoreEvent`. The test uses two catalogs, `testCatalog` and `validateCatalog`: the event producer generates many events, `validateCatalog` handles all of them, while `testCatalog` handles only the events that have been merged by `MetastoreEventFactory`. Finally, check that `validateCatalog` equals `testCatalog`.
2023-09-10 18:29:54 +08:00
32a7eef96a [schedule](pipeline) Remove wait schedule time in pipeline query engine (#23994)
Co-authored-by: yiguolei <676222867@qq.com>
2023-09-10 17:06:51 +08:00
648bf77c72 [Fix](MemtableMemoryLimiter) fix memtable memory limiter trigger flush log (#24137) 2023-09-10 16:33:35 +08:00
14f8f0cae0 [Improvement](errorcode) use error code when disk exceed capacity limit (#24136) 2023-09-10 16:32:17 +08:00
cae5a9d3cd [Fix](auth) fix revoke role operation cause fe down (#23852)
If there are 3 or more FE nodes,
the following operations will bring all FE nodes down.

DROP USER revoke_test_user
DROP ROLE revoke_test_role
DROP DATABASE IF EXISTS revoke_test_db
CREATE DATABASE revoke_test_db
CREATE ROLE revoke_test_role
CREATE USER revoke_test_user IDENTIFIED BY 'revoke_test_pwd'
GRANT SELECT_PRIV ON revoke_test_db.* TO ROLE 'revoke_test_role'
GRANT 'revoke_test_role' TO revoke_test_user
SHOW GRANTS FOR revoke_test_user
REVOKE 'revoke_test_role' from revoke_test_user
SHOW GRANTS FOR revoke_test_user
DROP USER revoke_test_user
DROP ROLE revoke_test_role
DROP DATABASE revoke_test_db
2023-09-10 16:16:07 +08:00
71645a391c [debug](FileCache) fail over to remote file reader if local cache failed (#24097)
Fail over to the remote file reader if the local file cache fails. This makes the file cache more robust.
2023-09-10 12:26:17 +08:00
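The failover behavior described in the commit above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the BE implementation: `read_with_failover`, the two reader callables, and the use of `OSError` as the cache failure signal are all assumptions.

```python
def read_with_failover(cache_read, remote_read, offset, size):
    """Try the local cache first; on failure, read from the remote file instead.

    Assumption: a failing cache reader raises OSError. Both readers are
    hypothetical callables taking (offset, size) and returning bytes.
    """
    try:
        return cache_read(offset, size)
    except OSError:
        # The local cache is unusable for this range, so fall back to remote.
        return remote_read(offset, size)


def broken_cache_read(offset, size):
    """Hypothetical cache reader that always fails, used for demonstration."""
    raise OSError("local cache file is corrupted")


def fake_remote_read(offset, size):
    """Hypothetical remote reader returning placeholder bytes."""
    return b"r" * size
```

With these stand-ins, `read_with_failover(broken_cache_read, fake_remote_read, 0, 4)` returns the remote bytes instead of propagating the cache error.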
69f599bb53 [regression-test](fix)add test_ifnull. (#23956) 2023-09-10 12:11:43 +08:00
262c669918 [fix](jdbc catalog) fix jdbc catalog creating json columns when reading json data (#24122) 2023-09-10 12:00:53 +08:00
953958c486 [fix](create tablet) fix backend create tablet timeout (#23879) 2023-09-10 11:41:00 +08:00
93c1151f1a [fix](join) incorrect result of mark join (#24112) 2023-09-10 11:30:45 +08:00
232b58a27d [fix](broker-load) make sequence column name case insensitive (#24071) 2023-09-10 10:51:07 +08:00
a1090f20c7 [supplement](regression-test) pass ccr case when fe enable_feature_binlog=false (#24077) 2023-09-10 10:49:42 +08:00
5f2ca8c84c [log](load) print more message about load job on tablet error (#24096) 2023-09-10 10:30:43 +08:00
650af8f4df [fix](test) fix broker load with default value test case (#24123) 2023-09-10 10:28:22 +08:00
f9a75b5c4f [feature](csv_serde)1.append csv serde for serialize to csv and deserialize from csv. 2.let csvReader use csv serde not text_converter. (#23352)
1. append csv serde for serialize to csv and deserialize from csv.
2. let csvReader use csv serde not text_converter.
2023-09-10 00:16:21 +08:00
5eb9e10b51 [pipelineX](pick) pick 2 PRs to fix bugs (#24117) 2023-09-09 23:10:46 +08:00
7c7e44fcc8 [refactor](nereids) make forbid_unknown_col_stats check more accurate (#24061)
Ignore the unknown column stats check if:
- the column is not used in the query
- the column is of Array/Json/Map/Struct type
2023-09-09 22:42:17 +08:00
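The two exemptions listed in the commit above amount to a small predicate. The Python sketch below is a hedged illustration only; `needs_stats_check` and the string type names are hypothetical and do not mirror the Nereids classes.

```python
# Types the commit message lists as exempt from the unknown-stats check.
COMPLEX_TYPES = {"ARRAY", "JSON", "MAP", "STRUCT"}


def needs_stats_check(column_type: str, used_in_query: bool) -> bool:
    """Return True if a column with unknown stats should still be checked.

    Hypothetical helper: columns that the query does not use, or whose type
    is Array/Json/Map/Struct, are skipped.
    """
    if not used_in_query:
        return False
    if column_type.upper() in COMPLEX_TYPES:
        return False
    return True
```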
e09e030652 [fix](Nereids) mv in select materialized_view should disable show table (#24104)
An mv in select materialized_view should disable show table,
because the Nereids planner can output strings such as
slot#[0] in toSql() of SlotRef. Note that this is only a
temporary solution; an expression translator will be used later.
2023-09-09 21:57:52 +08:00
21e30d4374 [fix](planner)ctas's query part is not analyzed correctly (#24111)
2023-09-09 20:55:09 +08:00
8c2a721873 [opt](nereids)push down filter through window #23935
select rank() over (partition by A, B) as r, sum(x) over (partition by A, C) as s from T;
A is a common partition key of all window expressions, that is, A is the intersection of {A, B} and {A, C}.
A filter A=1 can be pushed down through this window because A is a common partition key. For example:
select * from (select a, row_number() over (partition by a) from win) T where a=1;
Original plan:

----filter((T.a = 1))
----------PhysicalWindow
------------PhysicalQuickSort
--------------PhysicalProject
------------------PhysicalOlapScan[win]
transformed to

----PhysicalWindow
------PhysicalQuickSort
--------PhysicalProject
----------filter((T.a = 1))
------------PhysicalOlapScan[win]
However, C=1 cannot be pushed through the window, because C is not a common partition key.
2023-09-09 20:53:31 +08:00
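The rule in the commit above (a filter can pass through the window only when it is on a partition key shared by every window expression) can be sketched in Python as a set intersection. The function names are hypothetical, and the check is simplified to column names only; the real optimizer rule works on plan expressions.

```python
def common_partition_keys(window_partitions):
    """Intersect the partition-key sets of all window expressions."""
    if not window_partitions:
        return set()
    keys = set(window_partitions[0])
    for partition in window_partitions[1:]:
        keys &= set(partition)
    return keys


def can_push_filter(filter_columns, window_partitions):
    """Return True if every column used by the filter is a common partition key.

    Simplifying assumption: the filter is described only by the set of
    columns it references.
    """
    common = common_partition_keys(window_partitions)
    return bool(filter_columns) and set(filter_columns) <= common
```

For the example in the commit message, the partition sets {A, B} and {A, C} intersect to {A}, so a filter on A qualifies while a filter on C does not.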
7b62013d21 [refactor](nereids) print "ifnull" instead of "nvl" in explain #23979
'ifnull' is more commonly used.
2023-09-09 20:33:23 +08:00
a8ed1d87d7 [enhancement](config): Change root log level to info in broker log (#24023) 2023-09-09 17:56:50 +08:00
6b9698a248 [bugfix](insert into) should not send profile during report process (#24127)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-09-09 17:12:35 +08:00
c3f3195721 [Fix](clucene) fix clucene build error in arm (#24130) 2023-09-09 15:31:40 +08:00
03757d0672 [bug](explode) fix table node not implement alloc_resource function (#24031)
fix table node not implement alloc_resource function
2023-09-09 08:25:28 +08:00
698fe55662 remove unused configs in be and broker (#24021) 2023-09-09 08:24:50 +08:00
153c7982f3 [Optimize](invert index) Optimize multiple terms conjunction query (#23871) 2023-09-09 01:52:58 +08:00
0f408d1192 [improvement](executor)Add name for task scheduler #23983 2023-09-09 00:56:39 +08:00
b5e1e36750 [fix](pipeline)add logs for unstable cases #24073
Issue Number: close #xxx

ShowTableStmtTest.testNoDb and DropDbStmtTest.testNoPriv are unstable cases. The error message is:

java.lang.Exception: Unexpected exception, expected<org.apache.doris.common.AnalysisException> but was<mockit.internal.expectations.invocation.MissingInvocation>
We cannot tell which invocation is missing, and the issue cannot be reproduced locally, so add some logs.
2023-09-09 00:49:40 +08:00
7abd23cad1 [fix](tablet clone) fix be load rebalancer choose candidate tablets #23915
When the BE load rebalancer chooses candidate tablets, it tries to move tablets from high-load backends to low-load backends. If the highest-load BE has no available slots, it should try the next high-load BE.
2023-09-09 00:48:27 +08:00
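The selection behavior described in the commit above can be sketched as a loop over high-load backends in descending load order. This Python snippet is an illustration under assumptions: `pick_source_backend` and the `(backend_id, load_score, available_slots)` tuple layout are hypothetical and not taken from the FE code.

```python
def pick_source_backend(high_load_backends):
    """Pick the highest-load backend that still has an available slot.

    high_load_backends: list of (backend_id, load_score, available_slots)
    tuples (hypothetical layout). Returns None if no backend has a free slot.
    """
    ordered = sorted(high_load_backends, key=lambda b: b[1], reverse=True)
    for backend_id, _load_score, available_slots in ordered:
        if available_slots > 0:
            return backend_id
        # No slot on this backend: continue with the next high-load backend.
    return None
```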
2fb4c818da [fix](tablet clone) delete tablet check other catchup #24038
Sometimes the replica version recorded in FE is unreliable: it may be larger than the BE's real version. We need to check whether the BE is missing a version (last failed version > 0).
2023-09-09 00:42:32 +08:00
aad3eb257f update gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b to 3.0.0 (#24056)
There is 1 security vulnerability found in gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b

CVE-2022-28948
What did I do?
Upgrade gopkg.in/yaml.v3 from v3.0.0-20210107192922-496545a6307b to 3.0.0 for vulnerability fix

What did you expect to happen?
Ideally, no insecure libs should be used.

How can we automate the detection of these types of issues?
By using the GitHub Actions configurations provided by murphysec, we can conduct automatic code security checks in our CI pipeline.

The specification of the pull request
PR Specification from OSCS
2023-09-09 00:37:39 +08:00
3e7f531d2b [fix](sec)upgrade org.yaml:snakeyaml to 2.0 #24057 2023-09-09 00:37:07 +08:00
0f0ffa3482 [Fix](Parquet Reader) fix parquet read issue (#24092) 2023-09-09 00:35:18 +08:00