Commit Graph

14887 Commits

Author SHA1 Message Date
ee12297cd9 [fix](test) Fix fe ut BDBJEJournalTest not stable (#27192) 2023-11-17 19:24:12 +08:00
a8720e645f [fix](fe ut) Fix borrow object throw npe (#27072)
Occasional failure of FE UT: borrowObject throws an NPE.
```
get agent task request. type: CREATE, signature: 10008, fe addr: null
java.lang.NullPointerException
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.register(GenericKeyedObjectPool.java:1079)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:347)
get agent task request. type: CREATE, signature: 10012, fe addr: TNetworkAddress(hostname:127.0.0.1, port:56072)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:277)
	at org.apache.doris.common.GenericPool.borrowObject(GenericPool.java:99)
	at org.apache.doris.utframe.MockedBackendFactory$DefaultBeThriftServiceImpl$1.run(MockedBackendFactory.java:219)
	at java.lang.Thread.run(Thread.java:750)
```
2023-11-17 19:16:29 +08:00
52995c528e [fix](iceberg) iceberg use custom method to encode special characters of field name (#27108)
Fix two bugs:
1. Missing-column detection is case sensitive; change the column name to lower case in FE for hive/iceberg/hudi.
2. Iceberg uses a custom method to encode special characters in column names. Decode the column name to match the right column in the parquet reader.
2023-11-17 18:38:55 +08:00
f8b61d3d8e [Enhance](fe) select BE local broker to scan Hive table when 'broker.name' in hms catalog is specified (#27122)
Since #24830 introduced `broker.name` in the hms catalog, data scans run on the specified brokers.
[doris operator](https://github.com/selectdb/doris-operator) supports deploying BE and broker in the same pod, and a BE accessing its local broker is the fastest way to access data.
In the previous logic, every inputSplit selected one BE to execute and then randomly selected one broker for the actual data access, so the BE and the broker it used were always located on separate K8S pods.
This PR optimizes the broker selection strategy to prioritize the BE-local broker when `broker.name` is specified in the hms catalog.
2023-11-17 18:29:55 +08:00
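The selection strategy described in f8b61d3d8e can be sketched roughly as follows. This is only an illustration of "prefer the broker on the BE's own host, otherwise fall back to a random one"; the `Broker` type, the `select_broker` function, and the host addresses are hypothetical and are not taken from the Doris code base.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Broker:
    # Hypothetical record: a broker name plus the host it runs on.
    name: str
    host: str


def select_broker(be_host, brokers, rng=random):
    """Prefer a broker on the same host as the BE; otherwise pick one at random."""
    if not brokers:
        raise ValueError("no broker available")
    local = [b for b in brokers if b.host == be_host]
    if local:
        # BE-local broker found: data access stays inside the same pod/host.
        return local[0]
    # No local broker: fall back to the previous behaviour (random choice).
    return rng.choice(brokers)


brokers = [
    Broker("hdfs_broker", "10.0.0.1"),
    Broker("hdfs_broker", "10.0.0.2"),
    Broker("hdfs_broker", "10.0.0.3"),
]
chosen = select_broker("10.0.0.2", brokers)
```

Keeping the random fallback means a BE without a co-located broker behaves as before.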
0ece18d6cd [FIX](regresstest) fix test_map_nested_array csv file for id (#27105) 2023-11-17 04:20:02 -06:00
fa7e1b7fc7 [fix](Nereids) result type of add precision is 1 more than expected (#27136) 2023-11-17 04:13:09 -06:00
xy
fdec286e82 [optimize](cooldown)Shorten the _meta_lock lock interval (#27118)
Change the two passes of _rs_version_map to one, reducing cpu overhead and shortening the lock interval of _meta_lock

Co-authored-by: xingying01@corp.netease.com <xingying01@corp.netease.com>
2023-11-17 16:59:36 +08:00
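As a rough illustration of the idea in fdec286e82 (merging two traversals of a map into one so the lock is held for a single pass), here is a minimal sketch. The map contents, the two quantities being computed, and the function names are assumptions made up for the example; the commit message does not say what the two original passes computed.

```python
import threading

meta_lock = threading.Lock()
# Hypothetical stand-in for _rs_version_map: version -> (is_local, num_rows)
rs_version_map = {1: (True, 100), 2: (False, 250), 3: (True, 40)}


def scan_two_passes():
    """Two traversals of the map while holding the lock."""
    with meta_lock:
        local_versions = [v for v, (is_local, _) in rs_version_map.items() if is_local]
        total_rows = sum(rows for _, rows in rs_version_map.values())
    return local_versions, total_rows


def scan_one_pass():
    """One traversal that gathers both results, so the lock is held for less work."""
    local_versions, total_rows = [], 0
    with meta_lock:
        for version, (is_local, rows) in rs_version_map.items():
            if is_local:
                local_versions.append(version)
            total_rows += rows
    return local_versions, total_rows
```

Both functions return the same result; the second walks the map only once inside the critical section.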
xy
ab322eaa2b [improvement](detailMessage) add AvailCapacity prompt in detailMessage (#26328)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-17 16:54:31 +08:00
593e3662b0 [Fix](match) fix match null for no index (#26983)
This pull request addresses an issue observed with inverted index tables or tables without indices when querying null values using the MATCH function. 
Previously, executing a query like `SELECT * FROM table WHERE column MATCH null;` would yield incorrect results. 

The update introduces enhanced handling of nullable columns within the MATCH function, ensuring accurate query results when null values are involved.
2023-11-17 15:57:50 +08:00
9b040b3fbd [fix](nereids) partition prune fails in case of NOT expression (#27047)
* handle NOT expressions and add a regression test
2023-11-17 15:50:09 +08:00
ec92ba4af1 [fix](statistics)Fix alter column stats bug (#27093)
Encode the min and max values with a base64 encoder when injecting the column stats.
2023-11-17 15:40:47 +08:00
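The base64 step mentioned in ec92ba4af1 can be pictured with a small round trip. This is a generic sketch in Python, not the FE code; the helper names `encode_stat` and `decode_stat` are hypothetical, and the assumption that the goal is to keep quotes and separators in min/max values from being misread is mine rather than the commit's.

```python
import base64


def encode_stat(value: str) -> str:
    """Encode a min/max value so quotes or separators inside it cannot be misread."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_stat(encoded: str) -> str:
    """Recover the original min/max value."""
    return base64.b64decode(encoded).decode("utf-8")


raw_min = "O'Brien, \"a\""
stored = encode_stat(raw_min)
restored = decode_stat(stored)
```

The stored form contains only base64 characters, and decoding returns the original value unchanged.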
4d2fb1fffb [fix](load) add lock in active_memtable_mem_consumption (#27101) 2023-11-17 15:03:15 +08:00
e1b180d53d [improve](streamload) Explicitly judge the return value of close #27134 2023-11-17 14:17:09 +08:00
285c617a5f [minor](stats) Add start/end time for analyze job, precise to seconds of TableStats update time #27123 2023-11-17 13:59:53 +08:00
b359fff097 [regression test](http_stream) Case for Invalid file format (#27133) 2023-11-17 13:46:17 +08:00
a0661ed9d2 [Fix](multi-catalog) Fix complex type crash when using dict filter facility in the parquet-reader. (#27151)
- Fix complex type crash when using the dict filter facility in the parquet-reader by turning off the dict filter facility in this case.
- Add orc complex types regression test.
2023-11-17 13:43:58 +08:00
c7d961cb11 [regression test](stream load) add case for strict_mode=true and max_filter_ratio=0.5 (#27125) 2023-11-17 13:39:01 +08:00
ee08958526 [regression test](http_stream) case for timezone (#27149)
It does not work yet, but we need a case anyway.
2023-11-17 13:36:41 +08:00
4fff9a5937 [Improvement](inverted index) delay inverted index col read to reduce IO (#26080) (#26337) 2023-11-17 13:12:12 +08:00
06f0c10c8b [fix](nereids) count in correlated subquery should not output null value (#27064)
Consider the SQL:

SELECT * FROM t1 WHERE t1.a <= (SELECT COUNT(t2.a) FROM t2 WHERE (t1.b = t2.b));

When unnesting the correlated subquery, we create a left join node.
Assume the outer query is the left table and the subquery is the right one.
If there is no match, the row from the right table is filled with nulls.
But the COUNT function is never nullable.
So wrap COUNT with Nvl to ensure its result is 0 instead of null, which gives the correct result.
2023-11-16 22:31:42 -06:00
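The effect described in 06f0c10c8b can be reproduced outside Doris. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine; the sample rows and the hand-written join rewrite are assumptions for illustration, and `COALESCE` plays the role of Nvl because SQLite has no `NVL` function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER);
    INSERT INTO t1 VALUES (0, 1), (1, 2), (0, 3);
    INSERT INTO t2 VALUES (10, 2), (20, 2);
""")

# Original correlated form: COUNT yields 0 for t1 rows with no match in t2.
original = conn.execute(
    "SELECT a, b FROM t1 WHERE t1.a <= "
    "(SELECT COUNT(t2.a) FROM t2 WHERE t1.b = t2.b) ORDER BY b"
).fetchall()

# Naive unnesting: unmatched rows get NULL from the left join and are filtered out.
naive = conn.execute(
    "SELECT t1.a, t1.b FROM t1 "
    "LEFT JOIN (SELECT b, COUNT(a) AS cnt FROM t2 GROUP BY b) s ON t1.b = s.b "
    "WHERE t1.a <= s.cnt ORDER BY t1.b"
).fetchall()

# Wrapping the count so NULL becomes 0 restores the original semantics.
fixed = conn.execute(
    "SELECT t1.a, t1.b FROM t1 "
    "LEFT JOIN (SELECT b, COUNT(a) AS cnt FROM t2 GROUP BY b) s ON t1.b = s.b "
    "WHERE t1.a <= COALESCE(s.cnt, 0) ORDER BY t1.b"
).fetchall()
```

With this data, `naive` loses the two rows of t1 that have no match in t2, while `fixed` returns the same rows as `original`.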
4ac460af28 [decimal](tests) add test case for least/greatest for decimalv3 type (#26930) 2023-11-17 12:09:59 +08:00
91af86bc78 [fix](function) fix error when use negative number in explode_numbers #27020 2023-11-17 12:02:14 +08:00
Pxl
1188d88a10 [Chore](status) catch some error status on storage (#27132)
catch some error status on storage
2023-11-17 12:00:39 +08:00
43ffcc5012 [fix](fe) Fix enable_nereids_planner forward not take effect (#26782)
* The Java reflection method `getFields()` only returns public fields,
  but `enable_nereids_planner` is private
2023-11-17 11:13:07 +08:00
334260dff7 [feature](function) support ip function ipv4stringtonum(ordefault, ornull), inet_aton (#25510) 2023-11-17 10:27:07 +08:00
a4d78682ff [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) 2023-11-17 10:18:21 +08:00
0c264c8a14 [fix](pipelineX) fix scheduling bug in union operator (#27131) 2023-11-17 10:02:54 +08:00
a510b5be81 [regression](delete) add regression test for every type delete (#26954) 2023-11-16 08:03:31 -06:00
Pxl
fd6a2cba5e [Chore](clang-tidy)enable readability-function-size.LineThreshold and readability-functi…
set readability-function-size.LineThreshold to 80 and enable readability-function-cognitive-complexity
2023-11-16 20:37:12 +08:00
492a22dced select coordinator node from user's tag when exec streaming load (#27106) 2023-11-16 19:55:50 +08:00
0ac3984d4b [doc](fix) en docs for k8s operator (#27049) 2023-11-16 18:40:56 +08:00
afffcfd14c [fix](load) skip cancel already cancelled channels (#27111) 2023-11-16 18:38:40 +08:00
e29d8cb110 [feature](move-memtable) support pipelineX in sink v2 (#27067) 2023-11-16 15:00:55 +08:00
54989175fb [case] Load json data with enable_simdjson_reader=false (#26601) 2023-11-16 14:40:59 +08:00
f10ab4e113 [enhancement](JNI) Provide default environment variables if it is unset (#27037) 2023-11-16 14:37:11 +08:00
754ca1fa46 [fix](Nereids) nested type coercion should not process struct (#27068) 2023-11-16 00:08:38 -06:00
2b401785ce [fix](Nereids) build array and map literal expression failed (#27060)
1. empty array and map literal
2. multi-layer nested array and map literal
2023-11-16 00:08:24 -06:00
d20441f002 [doc](Nereids) add doc for NO_BACKSLASH_ESCAPES (#26995) 2023-11-16 13:51:43 +08:00
343d58123d [Fix](nereids)Fix nereids fail to parse tablesample rows bug (#26981) 2023-11-16 12:23:37 +08:00
bf6a9383bc [fix](stats) table not exists error msg not print objects name (#27074) 2023-11-15 22:10:50 -06:00
6be74d22ea [fix](nereids)fix bug that query information_schema.rowsets fe send fragment to one of multi be. (#27025)
Fixed the bug of incomplete query results when querying information_schema.rowsets in the case of multiple BEs.

The reason is that the schema scanner sends the scan fragment to only one of the BEs, and that BE queries FE's information through RPC. Since the rowsets information requires information from all BEs, the scan fragment needs to be sent to all BEs.
2023-11-16 12:08:22 +08:00
7e82e7651a [Improve](txn) Add some fuzzy test stub in txn (#26712) 2023-11-16 11:50:06 +08:00
624d372dcd [FIX](map)fix map element_at with decimal value (#27030) 2023-11-16 11:49:51 +08:00
230d8af777 [regression test](temporary_partitions) add case for temporary_partitions #27063 2023-11-16 11:49:37 +08:00
7fbc6d26a7 [debug](log) add some log to debug issue about insert (#27045) 2023-11-16 11:46:47 +08:00
f3ee6dd55a [feature](Nereids): eliminate sort under subquery (#26993) 2023-11-16 10:30:28 +08:00
a54cfb7558 [fix](backup) return if status not ok and reduce submit job (#26940)
When backup runs prepareAndSendSnapshotTask(), if some table has an error, the status is set to not ok but the method does not return, so other tables continue to put snapshot jobs into batchTask and submit jobs to BE, and these jobs then need to be cancelled. So when the status is not ok, return and do not submit jobs.
2023-11-16 10:16:56 +08:00
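The control-flow change in a54cfb7558 (stop as soon as one table fails, before anything is sent) might look like the sketch below. All names here (`prepare_and_send_snapshot_tasks`, `make_task`, the table names) are hypothetical stand-ins and do not mirror the actual FE method signatures.

```python
def prepare_and_send_snapshot_tasks(tables, make_task, submit):
    """Return (ok, error); send the batch only if every table prepared cleanly."""
    batch = []
    for table in tables:
        task, error = make_task(table)
        if error is not None:
            # Return right away: nothing has been sent, so nothing needs cancelling.
            return False, error
        batch.append(task)
    for task in batch:
        submit(task)
    return True, None


submitted = []


def make_task(table):
    # Hypothetical task builder that fails for one specific table.
    if table == "bad_table":
        return None, f"failed to prepare {table}"
    return f"snapshot:{table}", None


ok, error = prepare_and_send_snapshot_tasks(
    ["t1", "bad_table", "t2"], make_task, submitted.append
)
```

Because sending happens only after the whole batch has been prepared, a failure on any table leaves `submitted` empty.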
612d9dd7c6 [fix](errmsg) fix multiple FE processes start err msg (#27009) 2023-11-16 10:16:35 +08:00
042f6e8458 [cleanup](move-memtable) cleanup unused fields in rowset writer v2 (#27073) 2023-11-16 10:13:00 +08:00
xy
b8b86a7262 [enhance](cooldown) Reduce the locking interval for cooldown task (#26984)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-16 10:02:32 +08:00