doris

Author	SHA1	Message	Date
HappenLee	a6bc9cbe53	[Function] Refactor the function code of log (#8199 ) 1. Support returning null when input is invalid 2. Delete the useless code in vec function Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-24 11:06:58 +08:00
Pxl	90a8ca808a	[Bug][Vectorized] fix bitmap_min(empty) not return null (#8190 )	2022-02-24 11:06:27 +08:00
Mingyu Chen	9a7931cfed	[fix](mem-pool) fix bug that mem pool failed to allocate in ASAN mode (#8216 ) Also fix BE ut: 1. fix scheme_change_test memory leak 2. fix mem_pool_test: do not use DEFAULT_PADDING_SIZE = 0x10 in mem_pool when running ut. 3. remove plugin_test	2022-02-24 10:52:58 +08:00
Adonis Ling	0726a43a2a	[fix](be-ut) Fix unused-but-set-variable errors. (#8211 )	2022-02-23 21:43:15 +08:00
wangbo	83543c67fe	[improvement](storage)Using Be config to switch storage layer vectorization #8166	2022-02-23 20:11:28 +08:00
HappenLee	01fb25a498	[UT] Fix the UT of column_nullable_test (#8180 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-23 15:37:40 +08:00
wangbo	e3f1efcbbf	[Vec][Storage] Support delete condition;ut (#8091 ) Co-authored-by: Wang Bo <wangbo36@meituan.com>	2022-02-23 12:48:18 +08:00
wangbo	d17ed5e27a	[vectorization](storage)support seq column in storage layer (#8186 )	2022-02-23 12:23:31 +08:00
zhangstar333	31ab569c1d	[Vectorized][Feature] support some bitmap functions (#8138 )	2022-02-23 11:42:16 +08:00
awakeljw	b1e7343532	[Vectorized] [HashJoin] Opt HashJoin Performance (#8119 ) Co-authored-by: lihaopeng <happenlee@hotmail.com>	2022-02-23 10:28:16 +08:00
zuochunwei	802fcbbb05	(#8162 )refactor binary dict Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-22 11:23:54 +08:00
Pxl	87e555c27d	[Feature][Vectorized] support function json_array/json_object/json_quote (#8158 )	2022-02-22 09:29:56 +08:00
dataroaring	d6aebc0c2c	[improvement] make asan work as much as possible (#8148 ) * make ASAN poisoning work as much as possible Before this patch a use after poison is reported like below ==19305==ERROR: AddressSanitizer: unknown-crash on address 0x625000137013 at pc 0x561c44bcf6b8 bp 0x7ffb75a00910 sp 0x7ffb75a000b8 After this patch the use after poison is reported like below ==17782==ERROR: AddressSanitizer: use-after-poison on address 0x625000137033 at pc 0x55633c8f56b8 bp 0x7ff3dc437930 sp 0x7ff3dc43 Before this patch, a false memory usage is reported like below ==33080==AddressSanitizer CHECK failed: ../../../../src/libsanitizer/ asan/asan_allocator.cpp:189 "((old)) == ((kAllocBegMagic))"	2022-02-22 09:29:22 +08:00
Mingyu Chen	6e8d52f3fc	[fix](stream-load) fix bug that stream load may be blocked with unqualified data (#8176 ) Co-authored-by: morningman <chenmingyu@baidu.com>	2022-02-22 09:26:23 +08:00
zuochunwei	47067e40a6	[refactor](common) optimize Status implementation: no dynamic new (#8117 )	2022-02-22 09:23:29 +08:00
jacktengg	f13fd13e1b	[fix] (schema change) Fix BE crash after schema change int column to varchar column(#8073 ) (#8142 ) Co-authored-by: jianping.teng <tengjp@outlook.com>	2022-02-22 09:22:00 +08:00
zuochunwei	d0ee101c2f	[refactor] (runtime)tidy up the plan_fragment_executor codes (#8110 ) Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-22 09:20:27 +08:00
Zhengguo Yang	c47368f80c	[fix] (udf) fix check_fn and fn_call function name not same (#8132 )	2022-02-22 09:18:07 +08:00
Mingyu Chen	16020cbdf9	[fix](lateral-view) Fix bug that explode_json_array_string return unstable result (#8152 ) Co-authored-by: morningman <chenmingyu@baidu.com>	2022-02-21 09:38:36 +08:00
Zhengguo Yang	409aefdfbf	[refactor] add some log when close parquet file (#8144 )	2022-02-21 09:36:53 +08:00
zhannngchen	826738d97f	[docs]Some doc improvements and typo fix (#8153 )	2022-02-21 09:36:01 +08:00
HappenLee	56adc7f56b	[Bug][vec] Fix bug of nullable const value convert to argument cause coredump (#8139 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-20 20:05:23 +08:00
HZYWPT	4926c0bee7	[typo] translate the comments of byte_buffer.h (#8127 )	2022-02-19 12:06:35 +08:00
zuochunwei	5f50d9ae3b	predicate test bugfix (#8134 ) Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-19 12:05:26 +08:00
Mingyu Chen	0f7a25367d	[fix](rowset-meta) Fix bug that rowset meta is not deleted (#8118 ) As described in #8120, a large number of rowset metas remain in rocksdb. They may be generated by: 1. drop tablet: The drop tablet task itself just sets the state of the tablet meta to `SHUTDOWN` and moves the tablet to the `_shutdown_tablets` vector. The background thread then periodically cleans up the tablets in `_shutdown_tablets` (that's why even if we execute `drop table xx force`, the tablet may be delayed by 10min to 1 hour before it goes into the trash directory). The regular cleanup thread in the background saves the complete tablet meta as a `.hdr` file when deleting the tablet, and then moves it to the trash directory along with the data files. But this process does not handle the rowset meta (before the checkpoint of the tablet meta is done, the rowset meta is stored independently in rocksdb as a key-value). So this leaves a residual rowset meta. 2. clone task: The clone task may migrate back and forth between BEs, which may result in a situation where the tablet id is the same on the BE, but the tablet uuid is different. As a result, some rowset metas cannot find the corresponding tablet, and since there is no thread to process these rowsets, they eventually become residual. In this PR, I handle it in the regular cleanup thread with the method `_clean_unused_rowset_metas()`. I did not delete the rowset meta along with the "drop tablet" task, because "drop tablet" itself is not a synchronous operation. It also relies on a background thread to clean up the tablet periodically. So I put this operation in the background cleanup thread.	2022-02-19 12:00:48 +08:00
zuochunwei	9cb9781d86	[chore](storage) add STORAGE_LAYER_VECTORIZED_SWITCH (#8005 ) Previously, if you wanted to test the vectorized storage layer, you had to modify code in several places to get it working, which was tedious. Now you only need to change one line (redefine the macro STORAGE_LAYER_VECTORIZED_SWITCH as 1 or 0), which is more convenient.	2022-02-19 11:47:36 +08:00
Zhengguo Yang	50864aca7d	[refactor] fix warnings when compile with clang (#8069 )	2022-02-19 11:29:02 +08:00
zhangstar333	8892780091	[Vectorized][Feature] support agg function percentile&&percentile_approx (#8066 )	2022-02-18 13:42:24 +08:00
HappenLee	bcde1f265a	[Function][Vectorized] Support least/greatest function (#8107 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-18 11:57:07 +08:00
HappenLee	68b24d608f	[fix] (vectorization)Fix nullable column compute the hash value error (#8105 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-18 11:20:47 +08:00
yiguolei	d383821fd5	[refactor] Remove unused code in data dir (#8092 )	2022-02-18 11:14:02 +08:00
HappenLee	31399d5876	[Bug][Vec] Fix the bug of coredump when vec exec engine with delete condition (#8109 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-18 11:09:05 +08:00
wangbo	b9f0b5565c	[refactor](storage) refactor some interfaces of storage layer column (#8064 ) 1 format binary plain 2 remove batch_set_null_bitmap 3 fix segiter return value 4 set insert_many_binary_data args	2022-02-18 10:54:51 +08:00
yinzhijian	936da4f10a	[feature](thread-pool) Support thread pool per disk for scanners (#7994 ) Support a thread pool per disk for scanners to prevent poor performance caused by some high ioutil disks. Key points: 1. each disk has a thread pool for scanners 2. whenever a thread pool of one disk runs out of local work, tasks can be retrieved from other threads (disks). This is done round-robin. Performance testing: vec version: 25% faster than single thread pool in a high io util disk test case; normal version: 8% faster than single thread pool in a high io util disk test case	2022-02-18 09:40:58 +08:00
zuochunwei	a162f56284	(test) resolve unit test failed problem for VGenericIteratorsTest Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-17 20:03:07 +08:00
awakeljw	bdd78f20c8	[Vectorized][HashJoin] Eliminate hashjoin branch prediction (#8051 ) Co-authored-by: jewisliu <jewisliu@tencent.com>	2022-02-17 19:00:26 +08:00
Pxl	e0dbf48682	[Vectorized] [AggFunction] Support group_concat (#8086 )	2022-02-17 14:19:07 +08:00
HappenLee	f6e2a4fe16	[Vectorized][Function] Support year/month/week/hour/minute/day/second floor/ceil function (#8068 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-17 14:18:02 +08:00
zhangstar333	f8411f3c6a	[refactor](mysql_table_writer)split into two parts of vectorized and row mode (#8081 )	2022-02-17 11:29:25 +08:00
Mingyu Chen	26289c28b0	[fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072 ) 1. Fix the problem of BE crash caused by destruct sequence. (close #8058) 2. Add a new BE config `compaction_task_num_per_fast_disk`. This config specifies the max number of concurrent compaction tasks on a fast disk (typically SSD), so that on a high speed disk we can execute more compaction tasks at the same time and compact the data as soon as possible. 3. Avoid frequent selection of unqualified tablets to perform compaction. 4. Modify some log levels to reduce the log size of BE. 5. Modify some clone logic to handle errors correctly.	2022-02-17 10:52:08 +08:00
Pxl	f06c13a828	[feature](vec)(function) support function `convert_tz()` (#8060 )	2022-02-17 10:51:32 +08:00
HappenLee	bef1b55c1f	[feature][fix](vec)(function) Fix multi args function call the DATETIME type not effective in DATE type and add the alias function (#8050 ) 1. Support some function alias of mod/fmod, adddate/add_data 2. Support some function of multi args: week, yearweek 3. Fix bug of multi args function call the DATETIME type not effective in DATE type	2022-02-17 10:49:25 +08:00
spaces-x	53f22bbc14	[fix] fix incorrect serialized_size of TDigest object (#8046 )	2022-02-17 10:47:22 +08:00
yiguolei	d1cb2913c1	[improvement] check simd instructions before start (#8042 ) Sometimes BE is built on a machine with SIMD instructions such as AVX2, but the BE binary is then copied to a machine without AVX2. It will crash without any error message. This PR checks the required SIMD instructions and prints error messages during startup.	2022-02-17 10:46:03 +08:00
zhangstar333	0003822da7	[feature](vec) add ColumnHLL to support hll type (#7828 )	2022-02-17 10:44:42 +08:00
Pxl	143c4085ee	[Feature][Vectorized] support aggregate function ndv()/approx_count_distinct() (#8044 )	2022-02-16 14:30:13 +08:00
weizuo93	a6bf8c13eb	[Feature](Transaction) Support two phase commit (2PC) for stream load (#7473 ) The two phase batch commit means: During Stream load, after data is written, the message will be returned to the client. The data is invisible at this point and the transaction status is PRECOMMITTED. The data will be visible only after COMMIT is triggered by the client. 1. User can invoke the following interface to trigger commit operations for a transaction: curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:commit" http://fe_host:http_port/api/{db}/_stream_load_2pc or curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:commit" http://be_host:webserver_port/api/{db}/_stream_load_2pc 2. User can invoke the following interface to trigger abort operations for a transaction: curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:abort" http://fe_host:http_port/api/{db}/_stream_load_2pc or curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:abort" http://be_host:webserver_port/api/{db}/_stream_load_2pc	2022-02-16 11:55:04 +08:00
zhangstar333	25d64775d1	[Vectorized][Feature] Support mysql external table insert into stm (#7979 )	2022-02-15 14:58:58 +08:00
Mingyu Chen	884fddbf33	[fix](compatibility) Fix compatibility issue of PRowBatch and some tablet sink bugs (#8000 ) 1. Set both `tuple_offsets` and `new_tuple_offsets` in PRowBatch for compatibility. 2. Set FE config `repair_slow_replica` default to false, to avoid impacting the load process after upgrading. E.g., if there are only 2 replicas and one has a high version count, then after upgrade that replica will be set to bad, so the load process will be stopped because only 1 replica is alive. 3. Fix a bug that NodeChannel may be blocked at `close_wait()`: the `add_batch_finish` flag was not set after the last rpc finished. 4. Fix a NPE of RoutineLoadScheduler	2022-02-15 11:23:19 +08:00
yiguolei	a390b766d4	[Improvement] BE could print log foreground when not use daemon mode (#8031 )	2022-02-14 09:30:12 +08:00
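
The two phase commit interface described in commit a6bf8c13eb above can also be driven from code rather than curl. The following is a minimal sketch, not the project's client: it only builds the PUT request with the `txn_id` and `txn_operation` headers named in that commit message and does not send it. The function name `build_2pc_request` and the host, port, database and credential values are hypothetical placeholders.

```python
import base64
import urllib.request


def build_2pc_request(host, port, db, txn_id, operation, user, passwd):
    """Build (but do not send) a stream load 2PC request.

    `operation` must be "commit" or "abort", matching the two
    txn_operation values listed in the commit message.
    """
    if operation not in ("commit", "abort"):
        raise ValueError("operation must be 'commit' or 'abort'")

    # URL shape taken from the commit message:
    #   http://fe_host:http_port/api/{db}/_stream_load_2pc
    url = f"http://{host}:{port}/api/{db}/_stream_load_2pc"

    # Equivalent of curl's `-u user:passwd` (HTTP Basic auth).
    token = base64.b64encode(f"{user}:{passwd}".encode()).decode()

    headers = {
        "txn_id": str(txn_id),
        "txn_operation": operation,
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(url, headers=headers, method="PUT")


# Hypothetical values, for illustration only.
req = build_2pc_request("fe_host", 8030, "mydb", 12345, "commit", "user", "passwd")
print(req.get_method(), req.full_url)
```

Note that curl's `--location-trusted` follows redirects and resends the credentials; `urllib` does not follow redirects for PUT by default, so a real client sending this request to an FE would need to handle a redirect itself.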


1754 Commits