doris

Author	SHA1	Message	Date
Xinyi Zou	9dc5dd382a	[enhancement](memtracker) Fix Brpc mem count and refactored thread context macro (#13469 )	2022-10-21 12:01:38 +08:00
zhangstar333	3ca8bfaf30	[Function](array) support array_difference function (#13440 )	2022-10-21 10:57:37 +08:00
Gabriel	9a3c1f0867	[Improvement](decimal) print decimal according to the real precision and scale (#13437 )	2022-10-21 10:00:01 +08:00
Gabriel	d3f65aa746	[Improvement](join) remove unnecessary state for join (#13472 )	2022-10-21 09:59:34 +08:00
camby	1f7829e099	[Fix](array-type) bugfix for array column with delete condition (#13361 ) Fix for SQL with an array column: `delete from tbl where c_array is null;` For more information, please refer to #13360. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-10-21 09:29:02 +08:00
zhannngchen	1b0dafcaa1	[Enhancement](load) consider memtable in flush while reducing load me… (#13480 ) We should consider memory that is being flushed from memtable to disk when trying to reduce memory by flushing memtables. Otherwise, we might not release memory space as expected (e.g. when lots of large memtables are in flush, the reduce_mem_usage method picks some small memtables to flush; this can't release enough memory and can also generate lots of small segments, which can cause a -238 error).	2022-10-21 08:35:35 +08:00
HappenLee	e62d3dd8e5	[opt](function) refactor extract_url to use StringValue (#13508 ) Change extract_url to use StringValue in place of std::string to speed it up.	2022-10-21 08:33:39 +08:00
Yongqiang YANG	3dd00df24b	[fix](jsonreader) release memory of both value and parse allocator (#13513 )	2022-10-21 08:33:05 +08:00
HappenLee	d2be5096d6	[Revert](mem) revert the mem config that caused performance degradation (#13526 ) * Revert "[fix](mem) failure of allocating memory (#13414)" This reverts commit 971eb9172f3e925c0b46ec1ffd1a9037a1b49801. * Revert "[improvement](memory) disable page cache and chunk allocator, optimize memory allocate size (#13285)" This reverts commit a5f3880649b094b58061f25c15dccdb50a4a2973.	2022-10-21 08:32:16 +08:00
Xinyi Zou	736d113700	[fix](memtracker) Fix transmit_tracker null pointer because phmap is not thread safe #13528	2022-10-21 08:30:30 +08:00
Adonis Ling	d624ff0580	[chore](macOS) Avoid using binutils from Homebrew to build third parties (#13512 ) Overwrite the environment variable PATH to avoid using binutils from Homebrew to build third parties which may cause compilation errors. Error: building for macOS-x86_64 but attempting to link with file built for unknown-unsupported file format	2022-10-21 01:28:30 +08:00
Xin Liao	7109cbfe6f	[feature-wip](unique-key-merge-on-write) fix that delete the bitmap of stale rowset (#13393 )	2022-10-20 21:53:13 +08:00
ChPi	1e774036f1	[fix](function)fix be coredump when using json_object function (#13443 )	2022-10-20 17:32:37 +08:00
Mingyu Chen	32b1456b28	[feature-wip](array) remove array config and check array nested depth (#13428 ) 1. Remove FE config `enable_array_type`. 2. Limit the nested depth of array on the FE side. 3. Fix bug that when loading array from parquet, the decimal type is treated as bigint. 4. Fix loading array from csv (vec-engine), handle null and "null". 5. Change the csv array loading behavior: if the array string format is invalid in csv, it will be converted to null. 6. Remove `check_array_format()`, because its logic is wrong and meaningless. 7. Add stream load csv test cases and more parquet broker load tests.	2022-10-20 15:52:31 +08:00
Pxl	1892e8f66e	[Enhancement](scanner) support split avg key range (#13166 )	2022-10-20 14:53:16 +08:00
DongLiang-0	2b328eafbb	[function](string_function) add new string function 'extract_url_parameter' (#13323 )	2022-10-20 11:11:43 +08:00
TengJianPing	b5cd167713	[fix](hashjoin) fix coredump of hash join in ubsan build (#13479 ) * [fix](hashjoin) fix coredump of hash join in ubsan build	2022-10-20 10:16:19 +08:00
Ashin Gau	f7c69ade18	[feature-wip](multi-catalog) implement predicate pushdown in native OrcReader (#13453 ) # Proposed changes Implement predicate pushdown in `OrcReader` by converting doris `ColumnValueRange` to orc `SearchArgument`. ## Remaining problems 1. Orc supports `not in`, which may have an effect on bloom filter. However, doris `ScanNode` does not push down `not in` to file scanner. 2. Orc supports `is null`, and row range has a `hasNull` identifier. However, `_contain_null` in `ColumnValueRange` is ambiguous. `_contain_null = true` only means that the value can be nullable, not that it equals null. 3. `DateTimeV2` loses microsecond precision in `ColumnValueRange`, which may cause filtering errors when a min-max value equals the predicate value. 4. `DateTimeV1` is not accurate enough, and is only saved to seconds. 5. Orc supports predicate pushdown of `float&double` types, but doris does not push down `float&double` types for precision reasons.	2022-10-20 10:07:36 +08:00
xiaojunjie	4996eafe74	[bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str (#13446 ) * [bugfix](VecDateTimeValue) eat the value of microsecond in function from_date_format_str * add sql based regression test Co-authored-by: xiaojunjie <xiaojunjie@baidu.com>	2022-10-20 09:02:33 +08:00
xy720	f329d33666	[chore](fix) Fix some spelling errors in be's comments. #13452	2022-10-20 08:56:01 +08:00
HappenLee	3821f8420d	[opt](tpch) after change the config to speed up q21 (#13460 )	2022-10-20 08:54:35 +08:00
HappenLee	50e2d0fd3e	[opt](storage) opt the read by column decimal (#13488 ) Results of the optimization: TPCH Q18 36s->33s, Q20 18s->17s	2022-10-20 08:53:23 +08:00
Zhengguo Yang	3a2d5db914	[fix](String) fix string type length set to -1 when loading string data (#13475 ) The string type length may be set to -1 when creating a TypeDescriptor from thrift or protobuf; this will cause the limit check to overflow.	2022-10-20 08:45:25 +08:00
Adonis Ling	410e36ef5b	[enhancement](macOS) Refine the build scripts for macOS (#13473 ) Set the environment up before running the build scripts on macOS.	2022-10-19 22:52:22 +08:00
camby	9ac4cfc9bb	[bugfix](array-type) ColumnDate lost is_date_type after cloned (#13420 ) Problem: the IColumn::is_date property is lost after ColumnDate::clone is called. Fix: after ColumnDate is created, also set IColumn::is_date. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-10-19 21:29:36 +08:00
Gabriel	c4b5ba2a4f	[Regression](java-udf) Move source code used by Java UDF test case (#13476 )	2022-10-19 21:05:06 +08:00
Zhengguo Yang	0b368fbbfa	[Bugfix](vec) Fix all create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true (#13448 ) * [Bugfix] add negative value check when create mv using vec	2022-10-19 15:40:04 +08:00
Mingyu Chen	5423de68dd	[refactor](new-scan) remove old file scan node (#13433 ) All these files are not used anymore, can be removed.	2022-10-19 14:25:32 +08:00
yiguolei	1e42598fe6	[memory](podarray) revert not allocate too much memory in podarray change (#13457 ) revert not allocate too much memory in podarray change	2022-10-19 14:08:44 +08:00
Xinyi Zou	2745a88814	[enhancement](memtracker) Fix brpc causing query mem tracker to be inaccurate #13401	2022-10-19 12:28:20 +08:00
luozenglin	c449028a5f	[fix](year) fix `year()` results are not as expected (#13426 ) fix `year()` results are not as expected	2022-10-19 11:28:00 +08:00
zy-kkk	8a068c8c92	[function](string_function) add new string function 'not_null_or_empty' (#13418 )	2022-10-19 11:10:37 +08:00
Kang	755a946516	[feature](jsonb) jsonb functions (#13366 ) Issue Number: Step3 of DSIP-016: Support JSON type	2022-10-19 08:44:08 +08:00
starocean999	ac037e57f5	[fix](sort)the sort expr's nullability property may not be right (#13328 )	2022-10-18 22:09:02 +08:00
Jerry Hu	971eb9172f	[fix](mem) failure of allocating memory (#13414 ) When the target size to allocate is 8164, MemPool will return nullptr.	2022-10-18 21:11:30 +08:00
Yongqiang YANG	174054e32d	[fix](conf) aggressive_memory_decommit and chunk_reserve_limits can not be changed when running (#13427 )	2022-10-18 18:21:38 +08:00
yixiutt	6d322f85ac	[improvement](compaction) delete num based compaction policy (#13409 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-18 16:13:28 +08:00
Ashin Gau	21f233d7e7	[feature-wip](multi-catalog) use apache orc reader to read orc file (#13404 ) Use apache orc to read orc file, and convert ColumnVectorBatch to doris block.	2022-10-18 13:47:56 +08:00
Adonis Ling	125def5102	[enhancement](macOS M1) Support building from source on macOS (M1) (#13195 ) # Proposed changes This PR fixed lots of issues when building from source on macOS with Apple M1 chip. ## ATTENTION The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime: 1. Some errors with memory tracker occur when BE (RELEASE) starts. 2. Some UT cases fail. ... Temporarily, the following changes are made on macOS to start BE successfully. 1. Disable memory tracker. 2. Use tcmalloc instead of jemalloc. This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues. ## Use case ```shell ./build.sh -j 8 --be --clean cd output/be/bin ulimit -n 60000 ./start_be.sh --daemon ``` ## Something else It takes around _10+_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.	2022-10-18 13:10:13 +08:00
Gabriel	3f964ad5a8	[Regression](javaudf) add regression test for javaudf (#13266 )	2022-10-18 12:48:57 +08:00
Gabriel	cd3450bd9d	[Improvement](join) optimize join probing phase (#13357 )	2022-10-18 12:37:17 +08:00
HappenLee	f0dbbe5b46	[Bug](funciton) fix repeat coredump when step is too long (#13408 )	2022-10-18 09:55:06 +08:00
carlvinhust2012	49b060418a	[optimization](array-type) array_min/array_max function support the date/datetime type (#13407 ) This PR expands the data types supported by the array_min/array_max functions. Before the change, array_min/array_max did not support the date/datetime types. After the change, they do. Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-10-17 23:38:20 +08:00
Mingyu Chen	dbf71ed3be	[feature-wip](new-scan) Support stream load with csv in new scan framework (#13354 ) 1. Refactor the file reader creation in FileFactory, for simplicity. Previously, FileFactory had too many `create_file_reader` interfaces. These are now unified into two categories: the interface used by the previous BrokerScanNode, and the interface used by the new FileScanNode. Also separate the creation methods of readers that read `StreamLoadPipe` from those of other readers that read files. 2. Modify the StreamLoadPlanner on the FE side to support using ExternalFileScanNode. 3. Now for the generic reader, the file reader will be created inside the reader, not passed in from the outside. 4. Add some test cases for csv stream load; the behavior is the same as the old broker scanner.	2022-10-17 23:33:41 +08:00
xy720	c114d87d13	[Enhancement](array-type) Tuple is null predicate support array type (#13307 ) Issue Number: #12689	2022-10-17 18:50:56 +08:00
luozenglin	207f4e559e	[feature](agg) support `group_bitmap_xor` agg function. (#13287 ) support `group_bitmap_xor` agg function	2022-10-17 18:40:06 +08:00
Xinyi Zou	87a6b1a13b	[enhancement](memtracker) Fix bthread local consume mem tracker (#13368 ) Previously, bthread_getspecific was called every time bthread local was used. In the test at #10823, it was found that frequent calls to bthread_getspecific had performance problems. So a cache was implemented on pthread local based on the btls key, but the btls key cannot correctly sense bthread switching. The cache is therefore now keyed on the bthread id obtained from bthread_self.	2022-10-17 18:31:07 +08:00
abmdocrt	045bccdbea	[Feature](Retention) support retention function (#13056 )	2022-10-17 11:00:47 +08:00
HappenLee	6ea9a65bb6	[Opt](vec) opt runtime filter for TPCH Q22 (#13339 )	2022-10-17 10:30:07 +08:00
Xinyi Zou	9454bcca12	[fix](memory) Fix USE_JEMALLOC=true UBSAN compilation error #13398	2022-10-17 08:52:14 +08:00

3010 Commits