Commit Graph

5130 Commits

Author SHA1 Message Date
43d783ae21 [fix](vertical compaction) compaction block reader should return error when reading next block failed (#22431) 2023-08-01 14:09:18 +08:00
f842067354 [fix](merge-on-write) fix duplicate keys occur when be restart (#22437)
For MoW (merge-on-write) tables, the delete bitmap of stale rowsets is not persisted. When a BE restarts, duplicate keys can appear if stale rowsets are read.
Therefore, reading stale rowsets is not allowed for MoW tables. Although this may cause a VERSION_ALREADY_MERGED error for queries issued after a BE restart, the probability of that is relatively low.
2023-08-01 14:07:04 +08:00
3a11de889f [Opt](exec) opt the performance of date parquet convert by date dict (#22384)
before:

mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (0.86 sec)
after:

mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (0.36 sec)
2023-08-01 12:24:00 +08:00
a371e1d4c5 [fix](window_funnel_function) fix upgrade compatibility due to the added field in WindowFunnelState (#22416) 2023-08-01 12:08:55 +08:00
d585a8acc1 [Improvement](shuffle) Accumulate rows in a batch for shuffling (#22218) 2023-08-01 09:55:06 +08:00
5f25b924b3 [opt](conf) Modify brpc eovercrowded conf (#22407)
Make brpc ignore eovercrowded for the data stream sender and the exchange sink buffer.
Modify the default value of brpc_socket_max_unwritten_bytes.
2023-08-01 08:47:55 +08:00
66e540bebe [Fix](executor)Fix incorrect mem_limit return value type (#22415) 2023-07-31 22:28:41 +08:00
c1f36639fd [fix](sort) VSortedRunMerger does not return any rows with a large offset value (#22191) 2023-07-31 22:28:13 +08:00
89433f6a13 [fix](complex_type) throw error when reading complex types in broker/stream load (#22331)
Check for complex types in the parquet/orc reader during broker/stream load. Broker/stream load casts every type to string, so complex types are cast incorrectly. This is a temporary measure and will be replaced by tvf.
2023-07-31 22:23:08 +08:00
c25b9071ad [opt](conf) Modify brpc work pool conf default value #22406
By default, with 32 cores or fewer, the following configs are set to 128, 128, 10240, and 10240, in order.
With more than 32 cores, they are set to core num * 4, core num * 4, core num * 320, and core num * 320, in order.

brpc_heavy_work_pool_threads
brpc_light_work_pool_threads
brpc_heavy_work_pool_max_queue_size
brpc_light_work_pool_max_queue_size
2023-07-31 20:38:34 +08:00
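The default rule described in commit c25b9071ad above can be sketched as a small function. This is only an illustration of the arithmetic in the commit message, not the BE's actual code; the function name `brpc_work_pool_defaults` and the `num_cores` parameter are hypothetical.

```python
def brpc_work_pool_defaults(num_cores: int) -> dict[str, int]:
    """Return the default brpc work pool settings for a given core count.

    Hypothetical helper: it only mirrors the rule stated in the commit message.
    """
    if num_cores <= 32:
        # 32 cores or fewer: fixed defaults.
        threads, queue_size = 128, 10240
    else:
        # More than 32 cores: scale with the core count.
        threads, queue_size = num_cores * 4, num_cores * 320
    return {
        "brpc_heavy_work_pool_threads": threads,
        "brpc_light_work_pool_threads": threads,
        "brpc_heavy_work_pool_max_queue_size": queue_size,
        "brpc_light_work_pool_max_queue_size": queue_size,
    }
```

Note that 32 * 4 = 128 and 32 * 320 = 10240, so the two branches agree at the 32-core boundary and the defaults never drop below the fixed values.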
3b1be39033 [fix](load) load core dump print load id (#22388)
Save the load id to the thread context.
Eventually, all task ids (compaction, schema change, etc.) are expected to be saved in the thread context.
2023-07-31 18:29:38 +08:00
7261845b3d [FIX](complex-type)fix complex type nested col_const (#22375)
For array/map/struct in mysql_writer, unpack_if_const only unpacks the column itself, not its nested columns, so col_const should not be used for nested columns.
2023-07-31 14:53:18 +08:00
147a148364 [refactor](segcompaction) simplify submit_seg_compaction_task interface (#22387) 2023-07-31 13:53:38 +08:00
f2919567df [feature](datetime) Support timezone when insert datetime value (#21898) 2023-07-31 13:08:28 +08:00
b64f62647b [runtime filter](profile) add merge time on non-pipeline engine (#22363) 2023-07-31 12:52:42 +08:00
ee754307bb [refactor](load) refactor memtable flush actively (#21634) 2023-07-30 21:31:54 +08:00
79289e32dc [fix](cast) fix wrong result of casting empty string to array date (#22281) 2023-07-30 21:15:03 +08:00
63a9a886f5 [enhance](S3) add s3 bvar metrics for all s3 operation (#22105) 2023-07-30 21:09:17 +08:00
06e4061b94 [enhance](ColdHeatSeparation) carry use path style info along with cold heat separation to support using minio (#22249) 2023-07-30 21:03:33 +08:00
4077338284 [Opt](parquet) opt the performance of date convertion (#22360)
before:
```
mysql>  select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (1.61 sec)
```

after:
```
mysql>  select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (0.86 sec)
```
2023-07-30 15:54:13 +08:00
e47d1fccf5 [bugfix](be core) fragment executor's destruct method should be called before query context (#22362)
The fragment executor's destructor calls close, which depends on the query context's object pool, because many objects, such as runtime filters, are stored in that pool.
The fragment executor must therefore be destroyed before the query context; otherwise a heap-use-after-free error occurs.
This was fixed in #17675, but for unknown reasons the fix is not in master, so 1.2-lts does not have this problem.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-07-29 22:41:46 +08:00
765f1b6efe [Refactor](load) Extract load public code (#22304) 2023-07-29 12:56:31 +08:00
47c2cc5c74 [vectorized](udf) java udf support with return map type (#22300) 2023-07-29 12:52:27 +08:00
Pxl
210f6661b4 [Bug](profile) add lock on add_filter_info #22355
Multiple scanners may update the profile at the same time.
2023-07-29 12:45:50 +08:00
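The race fixed by commit 210f6661b4 above (several scanners calling add_filter_info concurrently) follows a common pattern: guard the shared structure with a mutex. The sketch below is a Python stand-in for that idea, not the BE's C++ code; the `Profile` class, its fields, and `run_scanners` are hypothetical names used only for illustration.

```python
import threading


class Profile:
    """Hypothetical profile object shared by several scanner threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._filter_info: dict[int, list[str]] = {}

    def add_filter_info(self, filter_id: int, info: str) -> None:
        # The lock serializes concurrent updates from different scanners.
        with self._lock:
            self._filter_info.setdefault(filter_id, []).append(info)

    def filter_info(self, filter_id: int) -> list[str]:
        # Return a copy so callers never observe a list that is being mutated.
        with self._lock:
            return list(self._filter_info.get(filter_id, []))


def run_scanners(profile: Profile, num_scanners: int, updates_per_scanner: int) -> None:
    """Start num_scanners threads that all update the same profile entry."""

    def scan(scanner_id: int) -> None:
        for i in range(updates_per_scanner):
            profile.add_filter_info(0, f"scanner={scanner_id} update={i}")

    threads = [threading.Thread(target=scan, args=(s,)) for s in range(num_scanners)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

For example, `run_scanners(Profile(), 4, 100)` leaves 400 entries under filter id 0, with none lost to interleaved updates.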
bc88d34b16 [bug](distinct-agg) fix distinct-agg outblock columns size not equal key size (#22357)
* [imporve](flex) support scientific notation(aEb) parser

* update

* [bug](distinct-agg) fix distinct-agg outblock columns size not equal key size
2023-07-29 12:44:44 +08:00
302de27985 [Refactor] Refactor some code with three-way comparison (#22170)
Refactor some code with three-way comparison
2023-07-29 11:30:15 +08:00
ae8a26335c [opt](hive)opt select count(*) stmt push down agg on parquet in hive . (#22115)
Optimize the "select count(*) from table" statement by pushing the "count" aggregation down to BE.
Supported file types: parquet and orc in hive.

1. 4k files, 60kw (600 million) rows
    before: 1 min 37.70 sec
    after: 50.18 sec

2. 50 files, 60kw (600 million) rows
    before: 1.12 sec
    after: 0.82 sec
2023-07-29 00:31:01 +08:00
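One plausible reason a pushed-down count(*) (commit ae8a26335c above) is cheaper is that Parquet and ORC files record row counts in their metadata, so a reader can answer the count without decoding column data. The sketch below illustrates only that idea under this assumption; `FileMeta` and `count_star` are hypothetical names, and the commit message does not confirm that the BE works this way.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    """Hypothetical stand-in for a file footer: one row count per row group or stripe."""

    path: str
    row_group_rows: list[int]


def count_star(files: list[FileMeta]) -> int:
    """Answer count(*) from metadata alone, without reading any column values."""
    return sum(sum(f.row_group_rows) for f in files)
```

With this approach the cost scales with the number of files and row groups rather than the number of rows, which is consistent with the 4k-file case remaining much slower than the 50-file case in the timings above.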
53d255f482 [fix](partial update) remove CHECK on illegal number of partial columns (#22319) 2023-07-28 23:11:58 +08:00
5b14d9fcdc [fix](compaction) fix time series compaction policy corner case (#22238) 2023-07-28 23:07:36 +08:00
0cc3232d6f [Improve](topn opt) modify fetch rpc timeout from 20s to 30s, since fetch is quite heavy sometimes (#22163) 2023-07-28 17:56:18 +08:00
Pxl
f7e0479605 [Chore](refactor) remove some unused code (#22152)
remove some unused code
2023-07-28 17:30:46 +08:00
ec1a4d172b (vertical compaction) fix vertical compaction core (#22275)
* (vertical compaction) fix vertical compaction core
co-author:@zhannngchen
2023-07-28 16:41:00 +08:00
0c734a861e [Enhancement](delete) eliminate reading the old values of non-key columns for delete stmt (#22270) 2023-07-28 14:37:33 +08:00
c2155678ca [fix](functions) fix now(null) crash (#22321)
before: BE crash
now:

mysql [test]>select now(null);
+-----------+
| now(NULL) |
+-----------+
| NULL      |
+-----------+
1 row in set (0.06 sec)
2023-07-28 14:07:56 +08:00
1c6246f7ee [improve](agg) support distinct agg node (#22169)
select c_name from customer union select c_name from customer
This SQL uses an agg node to get the distinct rows of c_name,
so there is no need to wait until all data has been inserted into the hash map;
rows can be output as soon as they are successfully inserted into the hash map.
2023-07-28 13:54:10 +08:00
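The idea in commit 1c6246f7ee above, emitting a row as soon as it is newly inserted into the hash map instead of waiting for all input, can be sketched with a generator. This is a simplified illustration, not the BE's implementation; `streaming_distinct` is a hypothetical name and a Python set stands in for the hash map.

```python
from typing import Hashable, Iterable, Iterator


def streaming_distinct(batches: Iterable[list[Hashable]]) -> Iterator[list[Hashable]]:
    """Yield, per input batch, only the rows that have not been seen before."""
    seen: set[Hashable] = set()  # stands in for the agg node's hash map
    for batch in batches:
        newly_inserted = []
        for row in batch:
            if row not in seen:
                seen.add(row)
                newly_inserted.append(row)
        # Output right away; no need to wait for the remaining batches.
        if newly_inserted:
            yield newly_inserted
```

Because each key is emitted exactly once, at the moment it first enters the set, downstream operators can start consuming results while the input is still being read.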
ad080c691f [chore](log)Move non-user-friendly error message to be.WARNING (#22315)
Move non-user-friendly error message to be.WARNING
2023-07-28 13:15:25 +08:00
7be349a10b [opt](inverted index) add session variable enable_inverted_index_query to control whether query with inverted index (#22255) 2023-07-28 12:43:26 +08:00
0d7d9b92db [fix](multi-catalog) complex types parsing failed, with unexpected nulls and rows (#22228)
Fix two bugs:
1. Unexpected null values in array columns. If 65535 consecutive values are not null in a nullable array column, this error is triggered, because the array parser did not handle boundary conditions.
2. The number of rows in the key field and in the value field of a map column are not equal. Similarly, the number of rows differs among the fields of a struct column. This is triggered when the number of rows differs among the parquet pages of different columns in a row group.
2023-07-28 10:03:08 +08:00
7d5d416b25 [Fix](EsCatalog) fix be core when query the table of Es catalog with null fields (#22279) 2023-07-28 09:53:55 +08:00
8caa5a9ba4 [Fix](mutli-catalog) Fix null partitions error in iceberg tables. (#22185)
### Issue
When a table has null partitions, the following error is thrown:
`Failed to fill partition column: t_int=null`

### Resolution
- Fix the null partition error above in iceberg tables by replacing null partitions with '\N'.
- Add regression test for hive null partition.
2023-07-27 23:57:35 +08:00
00863f25e9 [improvement](profile) add table name for file scan node (#22299)
```
VFILE_SCAN_NODE(region)  (id=0):(Active:  3.537us,  %  non-child:  0.00%)
                                -  RuntimeFilters:  :  
                              -  UseSpecificThreadToken:  False
                              -  AcquireRuntimeFilterTime:  501ns
                              -  AllocateResourceTime:  105.598us
```
2023-07-27 23:54:31 +08:00
b5fa29e138 [fix](bitmap) incorrect result of function 'bitmap_from_array' (#22305) 2023-07-27 22:44:06 +08:00
5584d7a5ba [Improve](point query) Improve lookup connection cache from DoubleBuffer to LRU cache for better item pruning (#22041) 2023-07-27 22:22:50 +08:00
8371171e44 [Feature](inverted index) add inverted index tool (#22207) 2023-07-27 21:28:34 +08:00
687d97e648 [improvement][default_config] enlarge default value compaction related (#22286)
configs

1. Because vertical compaction is enabled by default and consumes less
memory, we can enlarge the default values of compaction-related configs.
2. Enlarge the default value of the lock-related shard size.
2023-07-27 20:17:43 +08:00
d0c369d61b [fix](vec) Arena was not initialized in PartitionMethodSerialized (#22295) 2023-07-27 18:55:57 +08:00
6f1c03c766 [fix](jdbc_catalog) fix int and bigint in mysql view when use doris catalog (#22251) 2023-07-27 16:50:42 +08:00
aa75f79fad [fix](executor)cancel exchange buffer rpc when query is cancelled (#22226)
When a brpc client sends a request to a server and the server does not respond, and may never respond (for example, after a BE restart), the query can be cancelled immediately, but the ExchangeSinkBuffer cannot be cancelled until the rpc times out.
With this change, the ExchangeSinkBuffer is closed as soon as the query is cancelled.
2023-07-27 14:38:25 +08:00
9a95d664b9 [chore](third-party) Fix the build order for libunwind (#22244)
1. libunwind depends on lzma
2. Fix the missing headers issues reported by GCC-13
2023-07-27 14:07:08 +08:00
Pxl
05be45bd35 [Improvement](brpc) adjust brpc_light_work_pool_threads/brpc_heavy_work_pool_threads (#22241)
adjust brpc_light_work_pool_threads/brpc_heavy_work_pool_threads
2023-07-27 14:03:46 +08:00