doris

Author	SHA1	Message	Date
TengJianPing	22616d125d	[function](bitmap) add function alias bitmap_andnot and bitmap_andnot_count (#24771 )	2023-09-22 12:18:31 +08:00
zclllyybb	85fb46bb71	[refactor](cache) Refactor preloaded timezone global cache (#24694 )	2023-09-21 17:26:41 +08:00
HappenLee	dc9fa1a4f1	[Refactor](Sink) convert tablet sink to tablet writer (#24474 )	2023-09-20 14:47:18 +08:00
zclllyybb	8aea31e383	[fix](timezone) fix timezone parse when there is no tzfile (#24578 )	2023-09-20 14:28:12 +08:00
daidai	4b5cea1ef8	[enhancement](fix) change the null value of ordinary types to \N and the null value of complex types to null (#24207 )	2023-09-16 21:46:42 +08:00
amory	268c867679	[Improve](serde) replace function_cast from_string with serde (#24087 ) Stream load could not handle columns where map/array nests map/array; serde can do this now, so it replaces from_string. Note: if an item in complex-type data is empty, we return an error instead of making up a default value, because we currently cannot define a correct default for complex types.	2023-09-14 13:53:16 +08:00
Kaijie Chen	563c3f75ff	[feature](move-memtable) share delta writer v2 among sinks (#24066 )	2023-09-13 14:39:29 +08:00
zhiqqqq	c7ae2a7d22	[Refactor & Bugfix](static variables) move some static variables to exec_env (#24029 )	2023-09-13 09:27:03 +08:00
TengJianPing	4bb9a12038	[function](bitmap) support bitmap_remove (#24190 )	2023-09-12 14:52:04 +08:00
daidai	f9a75b5c4f	[feature](csv_serde) 1. add csv serde for serializing to csv and deserializing from csv. 2. let csvReader use csv serde instead of text_converter. (#23352 )	2023-09-10 00:16:21 +08:00
TengJianPing	2f8b075b71	[improvement](bitmap) support version for ser/deser of bitmap (#23959 )	2023-09-07 09:55:29 +08:00
Pxl	a96adc01aa	[Chore](function) refactor of quantile_state (#23862 )	2023-09-06 15:39:19 +08:00
TengJianPing	75e2bc8a25	[function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759 )	2023-09-02 00:58:48 +08:00
Ashin Gau	eaf2a6a80e	[fix](date) return the correct date value even when it is outside the range of the date dictionary (#23664 ) PR(https://github.com/apache/doris/pull/22360) and PR(https://github.com/apache/doris/pull/22384) optimized the performance of the date type. However, Hive supports dates outside 1970~2038, which led to wrong date values in the tpcds benchmark. How to fix: 1. Increase the dictionary range to 1900 ~ 2038. 2. Dates outside 1900 ~ 2038 are regenerated.	2023-09-01 14:40:20 +08:00
TengJianPing	62c075bf7e	[improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672 )	2023-08-31 14:44:17 +08:00
zhangstar333	94a8fa6bc9	[bug](function) fix explode_number function returning wrong rows (#23603 ) Previously the result of explode_number was random with a const value: because _cur_size was reset, values could not be inserted into the column.	2023-08-29 19:02:49 +08:00
Jerry Hu	5be8d57f52	[fix](be-ut) fix ColumnFixedLenghtObjectTest on 32-bit systems (#23519 )	2023-08-28 14:02:05 +08:00
Jerry Hu	f80b067990	[fix](column) add unimplemented function of ColumnFixedLengthObject (#23468 )	2023-08-25 17:38:01 +08:00
zclllyybb	9cacf9535a	[Opt](functions) Use preloaded cache to accelerate timezone parsing (#22694 ) * opt * bugfix * fix ut * fix stylecheck	2023-08-25 10:00:48 +08:00
zclllyybb	51ac92f65c	Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236 )" (#23368 ) This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.	2023-08-23 18:27:35 +08:00
Pxl	8ed4045df9	[Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842 )	2023-08-23 08:37:40 +08:00
Ashin Gau	5ff7b57fc1	[fix](parquet) parquet reader confuses logical/physical/slot id of columns (#23198 ) `ParquetReader` confuses the logical/physical/slot ids of columns. When reading only scalar types nothing goes wrong, but when reading complex types, `RowGroup` and `PageIndex` get the wrong statistics. Therefore, if the query contains complex types and pushed-down predicates, the result set is likely to be incorrect.	2023-08-22 13:35:29 +08:00
Gabriel	12075f9853	[pipelineX](projection) Support projection and blocking agg (#23256 )	2023-08-21 22:23:02 +08:00
amory	33dfa0c454	[Improve](serde) support text serde for nested type-array/map (#22738 ) Nested array/map types are not supported yet, so this PR aims to: 1. add a format option for string conversion of defined data types, to stay consistent with the original from_string 2. support arrays and maps nested inside arrays and maps	2023-08-21 10:32:28 +08:00
ZenoYang	1c3cc77a54	[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236 ) * [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty * add ut * fix nereids * fix regression-test	2023-08-18 14:37:49 +08:00
flynn	4e880288c6	[refactor] use clear concept to replace std::enable_if_t (#22801 ) Signed-off-by: flynn <fenglv15@mails.ucas.ac.cn>	2023-08-12 15:10:30 +08:00
czzmmc	1a8a1e5b16	[Feature](count_by_enum) support count_by_enum function (#22071 ) count_by_enum(expr1, expr2, ... , exprN); Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.	2023-08-06 16:05:14 +08:00
TengJianPing	b122f9b80c	[fix](concat) ColumnString::chars is resized with wrong size (#22610 ) FunctionStringConcat::execute_impl resized with a size that includes the string null terminator, so ColumnString::chars.size() does not match ColumnString::offsets.back. This causes problems for some string functions, e.g. like and regexp.	2023-08-04 19:13:35 +08:00
amory	86e6f5d039	[FIX](decimal) fix decimal precision (#22364 ) Decimal parsing from string is currently wrong: if the precision of the given string is bigger than the defined decimal precision, we return an overflow error. But only when the digit part is longer than the typed digit length should we return an overflow error, while traversing the given string into a decimal value.	2023-08-03 21:13:58 +08:00
Mryange	f16a39aea1	[feature](time) using timev2 type to replace the old time type. (#22269 )	2023-08-01 15:59:07 +08:00
HappenLee	3a11de889f	[Opt](exec) opt the performance of date parquet convert by date dict (#22384 ) before: mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (0.86 sec) after: mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (0.36 sec)	2023-08-01 12:24:00 +08:00
Gabriel	d585a8acc1	[Improvement](shuffle) Accumulate rows in a batch for shuffling (#22218 )	2023-08-01 09:55:06 +08:00
huanghaibin	ec1a4d172b	(vertical compaction) fix vertical compaction core (#22275 ) co-author:@zhannngchen	2023-07-28 16:41:00 +08:00
amory	d4a4c172ea	[Improve](serde)update serialize and deserialize text for data type (#21109 )	2023-07-26 10:06:16 +08:00
Pxl	19ba6bec38	[Improvement](pipeline) support sending eos on local exchange and remove some unused code (#22086 )	2023-07-24 09:25:32 +08:00
amory	ce397a8d32	[FIX](map)fix arrow serde with map null key #21955	2023-07-19 12:09:34 +08:00
HappenLee	b35cfc5d5e	[opt](join) Opt the performance of join probe (#21845 )	2023-07-19 01:21:22 +08:00
HHoflittlefish777	c6063ed92f	[Revert](lazy open) revert lazy open and add case (#21821 )	2023-07-18 19:41:33 +08:00
amory	cbddff0694	[FIX](map) fix map key-column nullable for arrow serde #21762 Arrow does not support null elements in a map key column, but the Doris map key column is nullable by default, so this case needs handling: for a Doris map row whose key column has a null element, we put null to Arrow.	2023-07-14 00:30:07 +08:00
amory	3163841a3a	[FIX](serde)Fix decimal for arrow serde (#21716 )	2023-07-12 19:15:48 +08:00
amory	d0eb4d7da3	[Improve](hash-fun) improve nested hash with range #21699 1. When calculating an array hash, the element size does not need to seed the hash: hash = HashUtil::zlib_crc_hash(reinterpret_cast<const char*>(&elem_size), sizeof(elem_size), hash); but we need to take care with [[], [1]] vs [[1], []]: when an array nests arrays and a nested array is empty, we should still seed the hash to tell the two apart. 2. Use a range for one hash value to avoid a virtual function call in the loop, which doubles the performance. I measured it in a UT on a column of array[int64] with 50 rows, where a single array has 100,000 elements.	2023-07-11 14:40:40 +08:00
Pxl	ca71048f7f	[Chore](status) avoid empty error msg on status (#21454 )	2023-07-11 13:48:16 +08:00
TengJianPing	736d6f3b4c	[improvement](timezone) support mixed upper-lower case of timezone names (#21572 )	2023-07-11 09:37:14 +08:00
Mryange	8973610543	[feature](datetime) "timediff" supports calculating microseconds (#21371 )	2023-07-10 19:21:32 +08:00
amory	7caab87bbe	[FIX](serde) fix map/struct/array support arrow #21628 Support arrow format for map/struct; fix string arrow format; fix largeInt 128 for arrow builder.	2023-07-08 15:51:14 +08:00
amory	b7d6a70868	[FIX](datatype) Implement hash func with array/map/struct type (#21334 ) No hash functions were implemented for array/map/struct columns, so SQL like the following makes BE core: select * from ( select bdp.nc_num, collect_list(distinct(bd.catalog_name)) as catalog_name, material_qty from dataease.bu_delivery_product bdp left join dataease.bu_trans_transfer btt on bdp.delivery_product_id = btt.delivery_product_id left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id where bd.val_status in ('10', '20', '30', '90') and bd.delivery_type in (0, 1, 2) group by nc_num, material_qty union ALL select bdp.nc_num, collect_list(distinct(bd.catalog_name)) as catalog_name, material_qty from dataease.bu_trans_transfer btt left join dataease.bu_delivery_product bdp on bdp.delivery_product_id = btt.delivery_product_id left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id where bd.val_status in ('10', '20', '30', '90') and bd.delivery_type in (0, 1, 2) group by nc_num, material_qty ) aa; core :	2023-06-30 17:11:35 +08:00
Liqf	d76fa427a3	[improve](jsonb) Invalid json path prompts an error instead of null (#19646 ) 1. Invalid json path prompts an error instead of null: before: ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]'); +-------------------------------------------------------------+ | jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]') | +-------------------------------------------------------------+ | NULL | +-------------------------------------------------------------+ 1 row in set (0.01 sec) ``` now: ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[a]'); ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Json path error: Invalid Json Path for value: $[a] ``` 2. Fix some problems from https://github.com/apache/doris/pull/19185: a. support negative numbers ```sql mysql> SELECT jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[-2]'); +--------------------------------------------------------------+ | jsonb_extract('[{"k1":"v41","k2":400},1,"a",3.14]', '$[-2]') | +--------------------------------------------------------------+ | "a" | +--------------------------------------------------------------+ 1 row in set (0.02 sec) ``` b. avoid using unnecessary memory 3. Supplementary regression test	2023-06-30 14:29:21 +08:00
Xinyi Zou	0396f78590	[fix](memory) Remove ChunkAllocator & fix Allocator no use mmap (#21259 )	2023-06-28 16:10:24 +08:00
Pxl	b6835840f7	[Bug](table-function) return InvalidArgument when explode_split meets empty delimiter (#20795 )	2023-06-15 15:17:22 +08:00
yiguolei	31a4f96f01	[refactor](exprcontext) move close to expr context's destructor (#20747 ) The close method does nothing, but I am not sure we can remove it, so I moved it into the destructor and removed many calls.	2023-06-14 18:01:07 +08:00

314 Commits