324 Commits

Author SHA1 Message Date
c6b1c903e4 [fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test (#25115) 2023-10-11 18:30:26 +08:00
9e31cb26bb [fix](parse_url) fix parse_url is not working in some case to extract the HOST (#25040)
Issue Number: close #24452
2023-10-09 00:14:58 +08:00
b91335dbb8 [refactor](columndecimal) is_decimal_v2 member is useless because column decimal could detect by itself (#25110)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-08 18:09:19 +08:00
0df32c8e3e [Fix](Outfile) Use data_type_serde to export data to csv file format (#24721)
Modify the outfile logic, use the data type serde framework.
2023-10-07 22:50:44 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
68087f6c82 [fix](json function) Fix the slow performance of get_json_path when processing JSONB (#24631)
When processing JSONB, automatically convert to jsonb_extract_string
2023-09-27 21:17:39 +08:00
5d138b6928 [remove](function) make execute_impl const and remove running_difference function (#24935) 2023-09-27 18:17:28 +08:00
ded8ba108f [test](be-ut) skip some be ut case (#24917)
skip be ut "VTimestampFunctionsTest.convert_tz_test"
2023-09-26 19:51:15 +08:00
082bcd820b [feature](insert) Support wal for group commit insert (#23053) 2023-09-26 14:46:24 +08:00
8191cd1dad [Bug](ScanNode) Fix potential incorrect query result caused by concurrent NewOlapScanNode initialization and Compaction (#24638)
* Optimize fetch delete predicates

* Fix incorrect query result when compaction eliminates delete predicates between `NewOlapScanNode::_init_scanners` and `NewOlapScanner::init`

* Fix be ut
2023-09-25 22:24:35 +08:00
22616d125d [function](bitmap) add function alias bitmap_andnot and bitmap_andnot_count (#24771) 2023-09-22 12:18:31 +08:00
85fb46bb71 [refactor](cache) Refactor preloaded timezone global cache (#24694)
Refactor preloaded timezone global cache
2023-09-21 17:26:41 +08:00
dc9fa1a4f1 [Refactor](Sink) convert to tablet sink to tablet writer (#24474) 2023-09-20 14:47:18 +08:00
8aea31e383 [fix](timezone) fix timezone parse when there is no tzfile (#24578) 2023-09-20 14:28:12 +08:00
4b5cea1ef8 [enhancement](fix)change ordinary type null value is \N,complex type null value is null (#24207) 2023-09-16 21:46:42 +08:00
268c867679 [Improve](serde)replace function_cast from_string to serde (#24087)
Previously, stream load could not handle a column whose type is a map/array nested inside a map/array.
Serde can handle this now, so it replaces from_string.
Note: if an item inside a complex type is empty, an error is returned instead of a made-up default value, because no correct default can currently be defined for complex types.
2023-09-14 13:53:16 +08:00
563c3f75ff [feature](move-memtable) share delta writer v2 among sinks (#24066) 2023-09-13 14:39:29 +08:00
c7ae2a7d22 [Refactor & Bugfix](static variables) move some static vairables to exec_env (#24029) 2023-09-13 09:27:03 +08:00
4bb9a12038 [function](bitmap) support bitmap_remove (#24190) 2023-09-12 14:52:04 +08:00
f9a75b5c4f [feature](csv_serde)1.append csv serde for serialize to csv and deserialize from csv. 2.let csvReader use csv serde not text_converter. (#23352)
1. append csv serde for serialize to csv and deserialize from csv.
2. let csvReader use csv serde not text_converter.
2023-09-10 00:16:21 +08:00
2f8b075b71 [improvement](bitmap) support version for ser/deser of bitmap (#23959) 2023-09-07 09:55:29 +08:00
Pxl
a96adc01aa [Chore](function) refactor of quantile_state (#23862)
refactor of quantile_state
2023-09-06 15:39:19 +08:00
75e2bc8a25 [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759) 2023-09-02 00:58:48 +08:00
eaf2a6a80e [fix](date) return right date value even if out of the range of date dictionary(#23664)
PR (https://github.com/apache/doris/pull/22360) and PR (https://github.com/apache/doris/pull/22384) optimized the performance of the date type. However, Hive supports dates outside 1970~2038, which led to wrong date values in the TPC-DS benchmark.
How to fix:
1. Increase the dictionary range to 1900 ~ 2038.
2. Dates outside 1900 ~ 2038 are regenerated.
2023-09-01 14:40:20 +08:00
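The two-step fix described in the commit above can be illustrated with a minimal Python sketch. It is not the Doris implementation: the names `DATE_DICT` and `days_to_date`, the days-since-1970 encoding, and the exact first and last day of the dictionary are assumptions made for illustration. Values inside the dictionary range are looked up, and values outside it are computed directly.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
# Assumed bounds, based on the "1900 ~ 2038" range in the commit message.
DICT_START = date(1900, 1, 1)
DICT_END = date(2038, 12, 31)

# Precompute days-since-epoch -> date for every day in the dictionary range.
DATE_DICT = {
    (DICT_START - EPOCH).days + i: DICT_START + timedelta(days=i)
    for i in range((DICT_END - DICT_START).days + 1)
}


def days_to_date(days: int) -> date:
    """Look up a days-since-epoch value; compute it directly when it is
    outside the dictionary range."""
    cached = DATE_DICT.get(days)
    if cached is not None:
        return cached
    # Fallback for dates outside the dictionary (hypothetical equivalent of
    # "regenerating" the value).
    return EPOCH + timedelta(days=days)
```

The fallback keeps results correct for any input, while the dictionary only serves as a fast path for the common range.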
62c075bf7e [improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672) 2023-08-31 14:44:17 +08:00
94a8fa6bc9 [bug](function) fix explode_number function return wrong rows (#23603)
Previously, the result of the explode_number function was random when called with a const value,
because _cur_size was reset, so values could not be inserted into the column.
2023-08-29 19:02:49 +08:00
5be8d57f52 [fix](be-ut) fix ColumnFixedLenghtObjectTest on 32 bits system (#23519) 2023-08-28 14:02:05 +08:00
f80b067990 [fix](column) add unimplemented function of ColumnFixedLengthObject (#23468) 2023-08-25 17:38:01 +08:00
9cacf9535a [Opt](functions) Use preloaded cache to accelerate timezone parsing (#22694)
* opt

* bugfix

* fix ut

* fix stylecheck
2023-08-25 10:00:48 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
Pxl
8ed4045df9 [Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842) 2023-08-23 08:37:40 +08:00
5ff7b57fc1 [fix](parquet) parquet reader confuses logical/physical/slot id of columns (#23198)
`ParquetReader` confuses the logical/physical/slot ids of columns. When reading only scalar types nothing goes wrong, but when reading complex types, `RowGroup` and `PageIndex` get the wrong statistics. Therefore, if the query contains complex types and pushed-down predicates, the result set is likely to be incorrect.
2023-08-22 13:35:29 +08:00
12075f9853 [pipelineX](projection) Support projection and blocking agg (#23256) 2023-08-21 22:23:02 +08:00
33dfa0c454 [Improve](serde) support text serde for nested type-array/map (#22738)
Nested array/map types are not currently supported,
so this PR aims to:
1. Add a format option for string conversion of the defined data type, to stay consistent with the original from_string.
2. Allow array and map types to nest arrays and maps.
2023-08-21 10:32:28 +08:00
1c3cc77a54 [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)
* [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty

* add ut

* fix nereids

* fix regression-test
2023-08-18 14:37:49 +08:00
4e880288c6 [refactor]use clear concept to replace std::enable_if_t (#22801)
---------

Signed-off-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-08-12 15:10:30 +08:00
1a8a1e5b16 [Feature](count_by_enum) support count_by_enum function (#22071)
count_by_enum(expr1, expr2, ... , exprN);

Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
2023-08-06 16:05:14 +08:00
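As a rough illustration of the behavior described in the commit above, the following Python sketch counts each distinct value in a single column, along with the non-null and null totals. The function name `count_by_enum_sketch` and the result keys are hypothetical; the actual output format of `count_by_enum` is not given in the commit message.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def count_by_enum_sketch(column: Sequence[Optional[str]]) -> Dict[str, object]:
    """Treat the column's values as an enumeration and count them."""
    non_null = [v for v in column if v is not None]
    return {
        "counts": dict(Counter(non_null)),    # occurrences of each distinct value
        "notnull": len(non_null),             # number of non-null values
        "null": len(column) - len(non_null),  # number of null values
    }
```

For several columns, the same function would be applied to each column in turn.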
b122f9b80c [fix](concat) ColumnString::chars is resized with wrong size (#22610)
FunctionStringConcat::execute_impl resized ColumnString::chars with a size that included the string null terminator, so ColumnString::chars.size() did not match ColumnString::offsets.back. This causes problems for some string functions, e.g. like and regexp.
2023-08-04 19:13:35 +08:00
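The invariant behind the concat fix above can be sketched in Python. This is a simplified model, not Doris code: a string column is assumed to be a flat byte buffer (`chars`) plus a list of cumulative end offsets (`offsets`), and the helpers `concat_rows` and `is_consistent` are hypothetical.

```python
from typing import List, Tuple


def concat_rows(left: List[bytes], right: List[bytes]) -> Tuple[bytearray, List[int]]:
    """Concatenate two string columns row by row into (chars, offsets)."""
    chars = bytearray()
    offsets: List[int] = []
    for a, b in zip(left, right):
        chars += a + b              # append the row bytes, no trailing terminator
        offsets.append(len(chars))  # cumulative end offset of this row
    return chars, offsets


def is_consistent(chars: bytearray, offsets: List[int]) -> bool:
    """The buffer length must equal the last offset (0 for an empty column)."""
    return len(chars) == (offsets[-1] if offsets else 0)
```

In this model, sizing `chars` with one extra byte per row, as if each string carried a terminator, leaves trailing bytes that no offset accounts for, which is the kind of mismatch the commit describes.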
86e6f5d039 [FIX](decimal)fix decimal precision (#22364)
Decimal parsing from a string is currently wrong.
If the precision of the given string is larger than the defined decimal precision, an overflow error is returned. However, the overflow error should be returned only when the digit part is longer than the digit length of the type, and this should be checked while traversing the given string to build the decimal value.
2023-08-03 21:13:58 +08:00
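One reading of the decimal fix above is that only the integer digits should trigger an overflow, not the total number of digits in the string. The Python sketch below illustrates that reading under stated assumptions: the function `fits_decimal` is hypothetical, the limit on integer digits is taken to be `precision - scale`, and the handling of extra fractional digits (rounding or truncation) is left out.

```python
def fits_decimal(text: str, precision: int, scale: int) -> bool:
    """Return True if the integer part of text fits DECIMAL(precision, scale).

    Assumed rule: overflow only when the integer digits exceed
    precision - scale; extra fractional digits do not count as overflow.
    """
    unsigned = text.strip().lstrip("+-")
    int_part, _, _frac_part = unsigned.partition(".")
    int_digits = int_part.lstrip("0")  # leading zeros do not use up precision
    return len(int_digits) <= precision - scale
```

Under this rule, `"1.23456"` fits `DECIMAL(5, 2)` even though it has six digits in total, while `"12345.6"` does not.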
f16a39aea1 [feature](time) using timev2 type to replace the old time type. (#22269) 2023-08-01 15:59:07 +08:00
3a11de889f [Opt](exec) opt the performance of date parquet convert by date dict (#22384)
before:

mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
| 600037902 |
+---------------------+
1 row in set (0.86 sec)
after:

mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
| 600037902 |
+---------------------+
1 row in set (0.36 sec)
2023-08-01 12:24:00 +08:00
d585a8acc1 [Improvement](shuffle) Accumulate rows in a batch for shuffling (#22218) 2023-08-01 09:55:06 +08:00
ec1a4d172b (vertical compaction) fix vertical compaction core (#22275)
* (vertical compaction) fix vertical compaction core
co-author:@zhannngchen
2023-07-28 16:41:00 +08:00
d4a4c172ea [Improve](serde)update serialize and deserialize text for data type (#21109) 2023-07-26 10:06:16 +08:00
Pxl
19ba6bec38 [Improvement](pipeline) support send eos on local exchange and remove some unused code (#22086)
support send eos on local exchange and remove some unused code
2023-07-24 09:25:32 +08:00
ce397a8d32 [FIX](map)fix arrow serde with map null key #21955 2023-07-19 12:09:34 +08:00
b35cfc5d5e [opt](join) Opt the performance of join probe (#21845) 2023-07-19 01:21:22 +08:00
c6063ed92f [Revert](lazy open) revert lazy open and add case (#21821) 2023-07-18 19:41:33 +08:00
cbddff0694 [FIX](map) fix map key-column nullable for arrow serde #21762
Arrow does not support null elements in a map key column, but the Doris map key column is nullable by default. So if the key column of a Doris map row contains a null element, null is written to Arrow.
2023-07-14 00:30:07 +08:00
3163841a3a [FIX](serde)Fix decimal for arrow serde (#21716) 2023-07-12 19:15:48 +08:00