doris

Author	SHA1	Message	Date
Zhengguo Yang	0b368fbbfa	[Bugfix](vec) Fix all create mv using to_bitmap() on negative value columns when enable_vectorized_alter_table is true (#13448) * [Bugfix] add negative value check when create mv using vec	2022-10-19 15:40:04 +08:00
Mingyu Chen	5423de68dd	[refactor](new-scan) remove old file scan node (#13433 ) All these files are not used anymore, can be removed.	2022-10-19 14:25:32 +08:00
yiguolei	1e42598fe6	[memory](podarray) revert not allocate too much memory in podarray change (#13457 ) revert not allocate too much memory in podarray change	2022-10-19 14:08:44 +08:00
Xinyi Zou	2745a88814	[enhancement](memtracker) Fix brpc causing query mem tracker to be inaccurate #13401	2022-10-19 12:28:20 +08:00
luozenglin	c449028a5f	[fix](year) fix `year()` results are not as expected (#13426 ) fix `year()` results are not as expected	2022-10-19 11:28:00 +08:00
zy-kkk	8a068c8c92	[function](string_function) add new string function 'not_null_or_empty' (#13418 )	2022-10-19 11:10:37 +08:00
Yongqiang YANG	248ca14df7	[fix](test) let each case uses its own table name (#13419 )	2022-10-19 10:58:56 +08:00
Kang	755a946516	[feature](jsonb) jsonb functions (#13366 ) Issue Number: Step3 of DSIP-016: Support JSON type	2022-10-19 08:44:08 +08:00
starocean999	ac037e57f5	[fix](sort)the sort expr's nullability property may not be right (#13328 )	2022-10-18 22:09:02 +08:00
Jerry Hu	971eb9172f	[fix](mem) failure of allocating memory (#13414 ) When the target size to allocate is 8164, MemPool will return nullptr.	2022-10-18 21:11:30 +08:00
luozenglin	a8fd76fe32	[Fix](docs) fix error description of `LDAP_ADMIN_PASSWORD` in the document (#13405 ) co-author:@luozenglin	2022-10-18 18:53:10 +08:00
Yongqiang YANG	174054e32d	[fix](conf) aggressive_memory_decommit and chunk_reserve_limits can not be changed when running (#13427 )	2022-10-18 18:21:38 +08:00
ElvinWei	d8e53da764	[feature-wip](statistics) collect statistics by sampling sql-tasks (#13399 ) 1. Collect statistics by sampling sql-tasks. 2. Consolidate statistics SQL statements and remove redundant statements.	2022-10-18 16:34:01 +08:00
yixiutt	6d322f85ac	[improvement](compaction) delete num based compaction policy (#13409 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-18 16:13:28 +08:00
Ashin Gau	21f233d7e7	[feature-wip](multi-catalog) use apache orc reader to read orc file (#13404 ) Use apache orc to read orc file, and convert ColumnVectorBatch to doris block.	2022-10-18 13:47:56 +08:00
Adonis Ling	125def5102	[enhancement](macOS M1) Support building from source on macOS (M1) (#13195 ) # Proposed changes This PR fixed lots of issues when building from source on macOS with Apple M1 chip. ## ATTENTION The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime: 1. Some errors with memory tracker occur when BE (RELEASE) starts. 2. Some UT cases fail. ... Temporarily, the following changes are made on macOS to start BE successfully. 1. Disable memory tracker. 2. Use tcmalloc instead of jemalloc. This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues. ## Use case ```shell ./build.sh -j 8 --be --clean cd output/be/bin ulimit -n 60000 ./start_be.sh --daemon ``` ## Something else It takes around _10+_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.	2022-10-18 13:10:13 +08:00
Gabriel	3f964ad5a8	[Regression](javaudf) add regression test for javaudf (#13266 )	2022-10-18 12:48:57 +08:00
Gabriel	cd3450bd9d	[Improvement](join) optimize join probing phase (#13357 )	2022-10-18 12:37:17 +08:00
minghong	18f2db6064	[feature](nereids) let minValue and maxValue in stats support Date, CHAR and VARCHAR types (#13311) 1. Enable setting min/max values for varchar/char types: take the first 8 chars as a long and convert it to double. 2. Fix bug when setting min/max value for date and datev2	2022-10-18 12:12:33 +08:00
HappenLee	f0dbbe5b46	[Bug](function) fix repeat coredump when step is too long (#13408)	2022-10-18 09:55:06 +08:00
carlvinhust2012	49b060418a	[optimization](array-type) array_min/array_max function support the date/datetime type (#13407) This PR expands the data types supported by the array_min/array_max functions. Before the change, array_min/array_max did not support the date/datetime types. After the change, they do. Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-10-17 23:38:20 +08:00
Mingyu Chen	dbf71ed3be	[feature-wip](new-scan) Support stream load with csv in new scan framework (#13354 ) 1. Refactor the file reader creation in FileFactory, for simplicity. Previously, FileFactory had too many `create_file_reader` interfaces. Now unified into two categories: the interface used by the previous BrokerScanNode, and the interface used by the new FileScanNode. And separate the creation methods of readers that read `StreamLoadPipe` and other readers that read files. 2. Modify the StreamLoadPlanner on FE side to support using ExternalFileScanNode 3. Now for generic reader, the file reader will be created inside the reader, not passed from the outside. 4. Add some test cases for csv stream load, the behavior is same as the old broker scanner.	2022-10-17 23:33:41 +08:00
xy720	c114d87d13	[Enhancement](array-type) Tuple is null predicate support array type (#13307 ) Issue Number: #12689	2022-10-17 18:50:56 +08:00
luozenglin	207f4e559e	[feature](agg) support `group_bitmap_xor` agg function. (#13287 ) support `group_bitmap_xor` agg function	2022-10-17 18:40:06 +08:00
Xinyi Zou	87a6b1a13b	[enhancement](memtracker) Fix bthread local consume mem tracker (#13368 ) Previously, bthread_getspecific was called every time bthread local was used. In the test at #10823, it was found that frequent calls to bthread_getspecific had performance problems. So a cache is implemented on pthread local based on the btls key, but the btls key cannot correctly sense bthread switching. So, based on bthread_self to get the bthread id to implement the cache.	2022-10-17 18:31:07 +08:00
Yongqiang YANG	3b5b7ae12b	[improvement](config) let default value of alter and load timeout suitable for most cases (#13370) It is frustrating when a long-running job fails due to a small timeout. Actually, users do not expect a timeout for a long-running job.	2022-10-17 14:55:05 +08:00
Hong Liu	53286794c6	[typo](docs) Fixed thrift_client_timeout_ms's incorrect description of en docs. (#13391 ) Co-authored-by: smallhibiscus <8449081280@qq.com>	2022-10-17 14:54:38 +08:00
carlvinhust2012	4caa1e8041	[optimization](array-type) update the docs for import data to array column (#13345) This PR updates the JSON load docs for importing data into array columns. When JSON is used to import data into an array column, RapidJSON can cause precision problems, so the json-load docs now specify how to avoid them. Issue Number: #7570 Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-10-17 12:43:22 +08:00
abmdocrt	045bccdbea	[Feature](Retention) support retention function (#13056 )	2022-10-17 11:00:47 +08:00
HappenLee	6ea9a65bb6	[Opt](vec) opt runtime filter for TPCH Q22 (#13339 )	2022-10-17 10:30:07 +08:00
Dongyang Li	c1588b2900	[thirdparty](zstd)update dist info and thirdparty change log (#13392 )	2022-10-17 09:09:16 +08:00
Xin Liao	2da7fe940c	[fix](regression-test) fix that multiple cases conflict with the same table name (#13395 )	2022-10-17 09:08:30 +08:00
Xinyi Zou	9454bcca12	[fix](memory) Fix USE_JEMALLOC=true UBSAN compilation error #13398	2022-10-17 08:52:14 +08:00
xy720	e84d9a6c87	[fix](array-type) Fix cast null to array make be core (#13324) Doris does not support explicitly casting NULL_TYPE to any type. ``` mysql> select cast(NULL as int); ERROR 1105 (HY000): errCode = 2, detailMessage = Invalid type cast of NULL from NULL_TYPE to INT ``` So we should also forbid users from casting NULL_TYPE to ARRAY type. This commit produces the following effect: ``` mysql> select cast(NULL as array<int>); ERROR 1105 (HY000): errCode = 2, detailMessage = Invalid type cast of NULL from NULL_TYPE to ARRAY<INT(11)> ```	2022-10-17 00:04:50 +08:00
camby	162e60eb19	[fix](array-type) check value valid while insert data into array column (#13365) We should prevent the insert when a value overflows. 1. Create table: `CREATE TABLE test_array_load_test_array_int_insert_db.test_array_load_test_array_int_insert_tb ( k1 int NULL, k2 array<int> NULL ) DUPLICATE KEY(k1) DISTRIBUTED BY HASH(k1) BUCKETS 5` 2. Try to insert data less than INT_MIN: `insert into test_array_load_test_array_int_insert_tb values (1005, [-2147483649])` Before this PR, the insert would succeed, but the stored value was not correct. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-10-17 00:01:03 +08:00
zxealous	a83eaddfcf	[test](cache)Add remote cache ut (#13377 )	2022-10-16 23:59:50 +08:00
Gabriel	1d5ba9cbcc	[Improvement](like) Change `like` function to batch call (#13314 )	2022-10-16 16:18:22 +08:00
Pxl	632670a49c	[Enhancement](function) refactor of date function (#13362 ) refactor of date function	2022-10-16 14:31:26 +08:00
HappenLee	144486e220	[Opt](fun) simd the substring function and use stack buf to speed up (#13338 )	2022-10-16 11:48:34 +08:00
yiguolei	a5f3880649	[improvement](memory) disable page cache and chunk allocator, optimize memory allocate size (#13285) 1. Disable page cache by default. 2. Disable chunk allocator by default. 3. Do not use chunk allocator for vectorized allocator by default. 4. Add a new config memory_linear_growth_threshold = 128Mb: do not allocate memory by RoundUpToPowerOf2 if the allocated size is larger than this threshold. This config is added to MemPool, ChunkAllocator, PodArray, Arena.	2022-10-15 17:27:17 +08:00
starocean999	bf2e20c4c4	[fix](agg) reset the content of grouping exprs instead of replace it with original exprs (#13376) * [fix](agg) reset the content of grouping exprs instead of replace it with original exprs * keep old behavior if the grouping type is not GROUP_BY	2022-10-15 11:07:35 +08:00
Dongyang Li	52397df9f0	[thirdparty](update) zstd 1.5.0 to 1.5.2 #13378	2022-10-15 10:50:20 +08:00
starocean999	f2fa9606c9	[fix](agg)count function should return 0 for null value (#13247) count(null) should return 0 instead of 1. The streaming_agg_serialize_to_column function did not handle the case where the input value is null; this PR fixes it.	2022-10-15 10:40:52 +08:00
zhangstar333	4bc33a54a1	[Fix](agg) fix bitmap agg core dump when phmap pointer assert alignment (#13381 )	2022-10-15 10:39:23 +08:00
Gabriel	8218cfed40	[Bug](function) Fix constant predicate evaluation (#13346 )	2022-10-15 01:05:29 +08:00
Gabriel	79a5125eff	[Improvement](predicates) Use datev2 as the compatible type between string and datev2 (#13348 ) If string literal can be converted to dateV2, we use datev2 as the compatible type instead of datetimev2.	2022-10-14 19:00:37 +08:00
jakevin	993f38fe3c	[feature](Nereids): use Multi join to rearrange join to eliminate cross join by using predicate. (#13353 )	2022-10-14 17:26:34 +08:00
Yongqiang YANG	5bc8858571	[fix](jsonreader) teach jsonreader to release memory (#13336) The rapidjson allocator does not release memory. This fix uses an allocator with a local buffer and calls Clear to release memory allocated beyond the local buffer.	2022-10-14 15:52:05 +08:00
TengJianPing	6746434770	[improvement](schema change) avoid using column ptr swap (#13273 )	2022-10-14 15:19:08 +08:00
ElvinWei	b82e54a525	[feature](statistics) support to drop table or partition statistics (#13303 ) Manually drop statistics for tables or partitions. Table or partition can be specified, if neither is specified, all statistics under the current database will be deleted. syntax: ```SQL DROP STATS [tableName [PARTITIONS(partitionNames)]]; -- e.g. DROP STATS; -- drop all table statistics under the current database DROP STATS t0; -- drop t0 statistics DROP STATS t1 PARTITIONS(p1); -- drop partition p1 statistics of t1 ```	2022-10-14 15:15:37 +08:00
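
Commit 18f2db6064 above says that CHAR/VARCHAR min/max statistics are derived by taking the first 8 chars as a long and converting that to a double. The following is a minimal Python sketch of that idea, not the actual Nereids code. The helper name `string_to_stat_value`, the UTF-8 encoding, the zero padding of short strings, and the big-endian byte order are all assumptions made for illustration.

```python
def string_to_stat_value(s: str) -> float:
    """Hypothetical helper: map a string to a double using only its first 8 bytes."""
    raw = s.encode("utf-8")[:8]
    # Assumption: strings shorter than 8 bytes are right-padded with zero bytes.
    padded = raw.ljust(8, b"\x00")
    # Assumption: bytes are read big-endian, so the numeric order follows
    # the byte-wise order of the 8-byte prefixes.
    as_long = int.from_bytes(padded, byteorder="big", signed=False)
    return float(as_long)
```

Under these assumptions, strings that share the same first 8 bytes map to the same value, so the result is only a coarse ordering key for estimation.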

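Commit a5f3880649 above introduces memory_linear_growth_threshold: allocations larger than the threshold are no longer rounded up to a power of two. The commit message does not say what rounding is used above the threshold, so the sketch below assumes rounding up to a fixed alignment. `ALIGNMENT`, `allocation_size`, and `round_up_to_power_of_2` are hypothetical names, and "128Mb" is read here as 128 MiB.

```python
LINEAR_GROWTH_THRESHOLD = 128 * 1024 * 1024  # threshold from the commit message, read as 128 MiB
ALIGNMENT = 4096  # hypothetical alignment used above the threshold (assumption)


def round_up_to_power_of_2(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def allocation_size(requested: int) -> int:
    """Hypothetical sizing policy: power-of-two growth below the threshold, aligned growth above it."""
    if requested > LINEAR_GROWTH_THRESHOLD:
        # Assumption: round up to the next multiple of ALIGNMENT instead of doubling.
        return -(-requested // ALIGNMENT) * ALIGNMENT
    return round_up_to_power_of_2(requested)
```

With power-of-two rounding alone, a request just over 128 MiB would reserve 256 MiB; in this sketch it reserves only one extra alignment unit.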
6753 Commits