Commit Graph

17908 Commits

Author SHA1 Message Date
80cdc74908 [fix](arrow-flight) Fix reach limit of connections error (#32911)
Fix "reach limit of connections" error:
in fe.conf, arrow_flight_token_cache_size must be less than qe_max_connection/2. Arrow Flight SQL is a stateless protocol and connections are usually not actively disconnected, so evicting a bearer token from the cache unregisters its ConnectContext.

Fix ConnectContext.command not being reset to COM_SLEEP in time, which resulted in connections being killed frequently after query timeouts.

Fix bearer token eviction logging and exceptions.
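The constraint described above could be expressed in fe.conf roughly as follows (the values are illustrative, not recommendations):

```
# fe.conf (illustrative values)
qe_max_connection = 1024
# must stay below qe_max_connection / 2
arrow_flight_token_cache_size = 256
```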

TODO: use arrow flight session: https://mail.google.com/mail/u/0/#inbox/FMfcgzGxRdxBLQLTcvvtRpqsvmhrHpdH
2024-04-10 11:34:29 +08:00
d959291c98 [improvement](decommission be) decommission check replica num (#32748) 2024-04-10 11:34:28 +08:00
06e5c6c966 [fix](grace-exit) Incorrect stop of report work causes heap-use-after-free #32929 2024-04-10 11:34:28 +08:00
f23a72b937 [chore](log) print query id before logging profile in be.INFO (#32922) 2024-04-10 11:34:28 +08:00
87f99271e1 [fix](spill) Avoid releasing resources while spill tasks are executing (#32783) 2024-04-10 11:34:28 +08:00
f5340039fc [fix](multicatalog) fix no data error when read hive table on cosn (#32815)
Currently, when reading a Hive-on-COSN table, Doris returns an empty result even though the table has data.
Iceberg on COSN is fine.
The cause is a misuse of COSN's file system: according to COSN's documentation, its fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem.
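Per the commit, the fix points at COSN's documented file system class; a sketch of the relevant Hadoop configuration (the file placement is illustrative):

```
<!-- core-site.xml -->
<property>
    <name>fs.cosn.impl</name>
    <value>org.apache.hadoop.fs.CosFileSystem</value>
</property>
```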
2024-04-10 11:34:28 +08:00
66536c2976 [fix](Nereids) NPE when create table with implicit index type (#32893) 2024-04-10 11:34:28 +08:00
dcfdbf0629 [chore](show) support statement to show views from table (#32358)
MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)
2024-04-10 11:34:28 +08:00
96b995504c [enhancement](statistics) excluded delta rows num for rollup&mv tablets (#32568)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Co-authored-by: tsy <tangsiyang2001@foxmail.com>
2024-04-10 11:34:28 +08:00
e8d67e79fd [fix](test) fix duplicated catalog name in regression cases (#32442)
Each suite should use a different catalog name;
otherwise suites will affect each other when cases run concurrently.
2024-04-10 11:34:28 +08:00
217514e5dd [minor](test) Add Iceberg hadoop catalog FE unit test (#32449)
To make it easy to test the behavior of Iceberg's HadoopCatalog.listNamespaces()
2024-04-10 11:34:28 +08:00
b130df2488 2.1.2-rc04 2024-04-09 16:28:27 +08:00
c35b2becdd [fix][docker] fix kafka test scripts (#33417)
Co-authored-by: 胥剑旭 <xujianxu@xujianxudeMacBook-Pro.local>
2024-04-09 16:11:09 +08:00
005f7af21f [bugfix](deadlock) should not use query cancelled in fragment mgr 2024-04-09 16:09:01 +08:00
e574b35833 [Enhancement](partition) Refine some auto partition behaviours (#32737) (#33412)
1. fix legacy planner grammar
2. fix Nereids planner parsing
3. fix cases
4. forbid auto range partition with null columns
5. fix CreateTableStmt with auto partition and some partition items.
Items 1 and 2 are about #31585
doc pr: apache/doris-website#488
2024-04-09 15:51:02 +08:00
97850cf2bb [fix](cooldown) Fix hdfs path (#33315) 2024-04-09 12:55:53 +08:00
a1f80eaa7a 2.1.2-rc03 2024-04-09 12:49:05 +08:00
2a0644f442 [Fix](function) Fix unix_timestamp core for string input (#32871) 2024-04-09 12:48:35 +08:00
b5b0181a79 2.1.2-rc02 2024-04-09 12:37:31 +08:00
3c4ccb3981 Revert "[opt](scan) read scan ranges in the order of partitions (#31630)"
This reverts commit 5d99dffe6f1a3fcb107ce56181aeff96ef222def.
2024-04-09 12:37:31 +08:00
bfc9260507 [bugfix](deadlock) avoid deadlock in memtracker cancel query (#33400)
get_query_ctx (holds query ctx map lock) ---> QueryCtx ---> runtime statistics mgr --->

runtime statistics mgr ---> allocate block memory ---> cancel query

The memtracker tries to cancel a query when memory is unavailable during allocation.
But the allocator is a fundamental API; if it calls an upper-layer API it may deadlock.
No upper-layer API should be called from inside the allocator.
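The cycle above can be illustrated with a minimal Python sketch (names are hypothetical and the Doris BE is C++; this only demonstrates the non-reentrant-lock hazard, not the actual code):

```python
import threading

# Non-reentrant lock standing in for the query-context-map lock
# (hypothetical name; a sketch of the hazard, not Doris code).
query_map_lock = threading.Lock()

def try_cancel_query() -> bool:
    # Upper-layer API: needs the map lock to cancel a query.
    if not query_map_lock.acquire(blocking=False):
        return False  # a blocking acquire here would deadlock
    try:
        return True  # ... cancel the query ...
    finally:
        query_map_lock.release()

def allocate_under_memory_pressure() -> bool:
    # Low-level allocator invoked while the caller already holds the map
    # lock; calling back up into try_cancel_query() re-enters that lock.
    return try_cancel_query()

query_map_lock.acquire()  # caller path: get_query_ctx holds the map lock
cancelled = allocate_under_memory_pressure()
query_map_lock.release()
# cancelled is False: the re-entrant acquire fails immediately; with a
# blocking acquire the allocator would hang forever, hence the rule above.
```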
2024-04-09 12:20:54 +08:00
0c8d3d007d [fix](jni) don't delete global ref if scanner is not opened (#33398) 2024-04-09 09:06:16 +08:00
0e1a15960c 2.1.2-rc01 2024-04-08 23:17:15 +08:00
7892e7300f [fix](external catalog) Reset external table creation status on log replay (#33393) 2024-04-08 23:17:15 +08:00
4d98fe23a2 [enhancement](rpc) should print fe address in error msg during thrift rpc call (#33381)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-04-08 23:10:17 +08:00
dbf2326f62 [regression-test](case) fix unstable test case in multi fe env (#33385)
* [regression-test](case) fix unstable test case in multi fe env
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-04-08 23:10:17 +08:00
0234976ab7 [refactor](meta scan) Remove RPC from execute threads (#33378) 2024-04-08 20:28:02 +08:00
5e5fffe4e3 Set enable_unique_key_partial_update to false in statistics session variable. (#33220) 2024-04-08 16:49:58 +08:00
a8232c67f9 [pipelineX](runtime filter) Fix task timeout caused by runtime filter (#33332) (#33369) 2024-04-08 16:30:32 +08:00
d60d804d9c [fix](memory) Fix task repeat attach task DCHECK failed #32784 (#33343)
[branch-2.1](memory) Fix CCR task repeat attach task DCHECK failed #33366
2024-04-08 16:15:04 +08:00
1f3ab4fd24 [fix](jdbc catalog) fix db2 test connection sql (#33335) 2024-04-08 09:05:44 +08:00
c318c48a38 [fix](compile) fix implicit float-to-int conversion in mem_info calculation (#33311) 2024-04-08 07:34:22 +08:00
ebbfb06162 [Bug](array) fix array column core dump in get_shrinked_column as not check type (#33295)
* [Bug](array) fix array column core dump in get_shrinked_column as not check type

* add function could_shrinked_column
2024-04-08 07:27:40 +08:00
1b3e4322e8 [improvement](serde) Handle NaN values in number for MySQL result write (#33227) 2024-04-07 23:24:23 +08:00
fae55e0e46 [Feature](information_schema) add processlist table for information_schema db (#32511) 2024-04-07 23:24:22 +08:00
29556f758e [fix](parquet) fix time zone error in parquet reader (#33217)
`isAdjustedToUTC` was interpreted as exactly the opposite in the parquet reader (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md), so times with `isAdjustedToUTC=true` were increased by eight hours (UTC+8).

The parquet with `isAdjustedToUTC=true` can be produced by spark-sql with the following configuration:
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS
```

However, with the following configuration there is no logical or converted type in the parquet metadata, so the time read by Doris will also be increased by eight hours (UTC+8). Users need to set the UTC time zone in Doris themselves (https://doris.apache.org/docs/dev/advanced/time-zone/):
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=INT96
```
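The corrected semantics can be sketched in Python (a simplified illustration of the `isAdjustedToUTC` rule from the parquet spec, not Doris's reader code):

```python
from datetime import datetime, timedelta, timezone

def read_parquet_timestamp(micros: int, is_adjusted_to_utc: bool, session_tz: timezone):
    # Sketch of the fixed interpretation: isAdjustedToUTC=true means the
    # stored value is an instant in UTC and must be converted to the
    # session time zone; false means it is a zone-less wall-clock time
    # that should be returned as-is.
    if is_adjusted_to_utc:
        return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc).astimezone(session_tz)
    return datetime(1970, 1, 1) + timedelta(microseconds=micros)

utc8 = timezone(timedelta(hours=8))
# Instant 0 rendered in UTC+8 is 08:00; the same raw value read as
# wall-clock time stays at 00:00.
print(read_parquet_timestamp(0, True, utc8).hour)   # 8
print(read_parquet_timestamp(0, False, utc8).hour)  # 0
```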
2024-04-07 23:24:22 +08:00
b882704eaf [fix](Export) Set the default value of the data_consistence property of export to partition (#32830) 2024-04-07 23:24:22 +08:00
69bf3b9da4 [fix](hdfs-writer) Catch error information after hdfsCloseFile() (#33195) 2024-04-07 23:24:17 +08:00
586df24b9d [fix](tvf) Support fs.defaultFS with postfix '/' (#33202)
For HDFS tvf like:
```
select count(*) from hdfs(
"uri" = "hdfs://HDFS8000871/path/to/1.parquet",
"fs.defaultFS" = "hdfs://HDFS8000871/",
"format" = "parquet"
);
```

Before, if `fs.defaultFS` ended with `/`, the query failed with an error like:
```
reason: RemoteException: File does not exist: /user/doris/path/to/1.parquet
```
Note the wrong path with the spurious `/user/doris` prefix.
Users had to set `fs.defaultFS` to `hdfs://HDFS8000871` to avoid the error.

This PR fixes the issue.
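The fix can be sketched as normalizing the configured defaultFS before use (the helper name is hypothetical, not Doris's actual function):

```python
def normalize_default_fs(default_fs: str) -> str:
    # Strip a trailing '/' so that joining defaultFS with an absolute path
    # does not yield a URI that HDFS resolves against the user home
    # directory (e.g. /user/doris). Hypothetical helper, not Doris code.
    return default_fs.rstrip("/")

print(normalize_default_fs("hdfs://HDFS8000871/"))  # hdfs://HDFS8000871
print(normalize_default_fs("hdfs://HDFS8000871"))   # hdfs://HDFS8000871
```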
2024-04-07 22:21:14 +08:00
466972926e [fix](dns-cache) do not detach the refresh thread (#33182) 2024-04-07 22:18:56 +08:00
feb2f4fae8 [feature](local-tvf) support local tvf on shared storage (#33050)
Previously, local tvf could only query data on one BE node.
But if the storage is shared (e.g., NAS), the tvf can be executed on multiple nodes.

This PR mainly changes:
1. Add a new property `"shared_storage" = "false/true"`

    Default is false. If set to true, "backend_id" is optional: if "backend_id" is set,
    the tvf is still executed on that BE; if it is not set, "shared_storage" must be "true"
    and the tvf will be executed on multiple nodes.

Doc: https://github.com/apache/doris-website/pull/494
2024-04-07 22:17:28 +08:00
95da52b9d8 [fix](avro) avoid BE crash if avro scanner's dependency jars are missing (#33031)
1. Check the return value of the avro reader's init_fetch_table_schema_reader().
2. Also fix a bug where the parse exception from Nereids may suppress the real exception from the old planner,
    making the real error message invisible.
2024-04-07 22:17:16 +08:00
ed93d6132f [fix](jni) avoid coredump if failed to get jni env (#32950)
PR #32217 found a problem where getting the JNI env may fail,
and added a workaround to avoid a BE crash.

This PR follows up on that issue, avoiding a BE crash when `close()` of JniConnector
is called but the JNI env cannot be obtained.

The `close()` method returns an error when:
1. It fails to get the JNI env.
2. It fails to release JNI resources.

This PR ignores the first error and still logs fatal for the second.
c758a25dd8 [opt](fqdn) Add DNS Cache for FE and BE (#32869)
Previously, when FQDN was enabled, Doris called the DNS resolver to get an IP from a hostname
every time 1) FE got a BE's grpc client, or 2) a BE got another BE's brpc client.
Under high concurrency the DNS resolver got overloaded and failed to resolve hostnames.

This PR mainly changes:

1. Add DNSCache for both FE and BE.
    The DNSCache runs on every FE and BE node. It holds a cache whose keys are hostnames and whose values are IPs.
    Callers get an IP by hostname from this cache; if the hostname is not cached, the cache resolves it
    and stores the result.
    In addition, DNSCache has a daemon thread that refreshes the cache every minute, in case an IP
    changes at any time.
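The mechanism described above, sketched in Python (Doris implements this in Java and C++; names and intervals here are illustrative):

```python
import socket
import threading
import time

class DNSCache:
    """Hostname -> IP cache with a daemon refresh thread (sketch)."""

    def __init__(self, refresh_interval_s: float = 60.0):
        self._cache: dict[str, str] = {}
        self._lock = threading.Lock()
        threading.Thread(
            target=self._refresh_loop, args=(refresh_interval_s,), daemon=True
        ).start()

    def get(self, hostname: str) -> str:
        with self._lock:
            ip = self._cache.get(hostname)
        if ip is None:
            ip = socket.gethostbyname(hostname)  # resolve on cache miss
            with self._lock:
                self._cache[hostname] = ip
        return ip

    def _refresh_loop(self, interval: float) -> None:
        # Periodically re-resolve every cached hostname so IP changes are
        # picked up even though callers keep hitting the cache.
        while True:
            time.sleep(interval)
            with self._lock:
                hosts = list(self._cache)
            for host in hosts:
                try:
                    ip = socket.gethostbyname(host)
                    with self._lock:
                        self._cache[host] = ip
                except OSError:
                    pass  # keep the stale entry on a transient resolver failure
```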

There are other implementations of this DNS cache:

1.  36fed13997
    This is for the BE side, but it does not handle the IP-change case.

2. https://github.com/apache/doris/pull/28479
    This is for the FE side, but it only works on the Master FE; other FE nodes are not aware of IP changes.
    Also, there are a bunch of BackendServiceProxy instances, and that PR only handles the cache in one of them.
2024-04-07 22:16:04 +08:00
8bb2ef1668 [opt](iceberg) no need to check the name format of iceberg's database (#32977)
No need to check the name format of iceberg's database.
We should accept all databases.
2024-04-07 22:14:51 +08:00
e9b67bc82d [bugfix](paimon)merge meta-inf/services for paimon FileIOLoader (#33166)
We introduced Paimon's OSS and S3 packages but did not register them in META-INF/services. As a result, when the BE used the S3 or OSS interface, an error was reported that the class could not be found (`Could not find a file io implementation for scheme 's3' in the classpath.`).

FYI:
https://stackoverflow.com/questions/47310215/merging-meta-inf-services-files-with-maven-assembly-plugin
https://stackoverflow.com/questions/1607220/how-can-i-merge-resource-files-in-a-maven-assembly
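A common way to merge META-INF/services files, in line with the links above, is maven-shade-plugin's ServicesResourceTransformer; a minimal fragment (surrounding plugin wiring elided):

```
<!-- maven-shade-plugin configuration fragment (sketch) -->
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
```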
2024-04-07 22:13:00 +08:00
d9d950d98e [fix](iceberg) fix iceberg predicate conversion bug (#33283)
Followup #32923

Some cases are not covered in #32923
2024-04-07 22:12:38 +08:00
190763e301 [bugfix](iceberg)Convert the datetime type in the predicate according to the target column (#32923)
Convert the datetime type in the predicate according to the target column.
And add a testcase for #32194
related #30478 #30162
2024-04-07 22:12:33 +08:00
ecb4372479 [Fix](pipelinex) Fix MaxScannerThreadNum calculation error in file scan operator when turn on pipelinex. (#33037)
The MaxScannerThreadNum in the file scan operator is calculated incorrectly when pipelinex is turned on; this costs a lot of memory and causes performance degradation. This PR fixes it.
2024-04-07 22:11:27 +08:00
32d6a4fdd5 [opt](rowcount) refresh external table's rowcount async (#32997)
In the previous implementation, the row count cache expired after 10 minutes (by default),
and after expiration the next row count request would miss the cache, causing an unstable query plan.

In this PR, the cache is refreshed after Config.external_cache_expire_time_minutes_after_access,
so that the cache entry remains fresh.
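The change can be illustrated with a refresh-style cache sketch (illustrative Python, not the Doris implementation; the real refresh happens asynchronously):

```python
import time

class RefreshingCache:
    """Refresh stale entries instead of evicting them, so readers never
    see an expiration-driven cache miss (sketch)."""

    def __init__(self, loader, refresh_after_s: float):
        self._loader = loader
        self._refresh_after_s = refresh_after_s
        self._data = {}  # key -> (value, loaded_at)

    def get(self, key):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is None:
            value = self._loader(key)  # first access loads synchronously
            self._data[key] = (value, now)
            return value
        value, loaded_at = entry
        if now - loaded_at > self._refresh_after_s:
            # Stale: serve the old value and refresh it, so the planner keeps
            # getting a row count instead of falling back to a cache miss.
            self._data[key] = (self._loader(key), now)
        return value

# Hypothetical loader returning a table's row count.
cache = RefreshingCache(lambda table: 1000, refresh_after_s=600)
print(cache.get("t1"))  # 1000
```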
2024-04-07 22:11:14 +08:00