doris

Author	SHA1	Message	Date
Xinyi Zou	c8d303a82c	[bugfix] Fix BE core about vectorized join build thread memtracker switch, and FileStat duplicate	2022-05-31 19:12:42 +08:00
Pxl	fa50b63cee	fix core dump on vcase_expr::close (#9875 )	2022-05-31 15:45:39 +08:00
HappenLee	0cba6b7d95	[Bug][Fix] One Rowset have same key output in unique table (#9858 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-05-31 12:29:16 +08:00
HappenLee	7199102d7c	[Opt][VecLoad] Opt the vec stream load performance (#9772 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-05-31 11:53:32 +08:00
Gabriel	7b55d4cb88	[BUG] return NULL for invalid date value (#9862 )	2022-05-30 21:35:41 +08:00
Amos Bird	85f525e991	[Bugfix(Vec)] Close result_sink properly (#9849 ) Close result_sink properly so that error code is reported and expr_context is always closed.	2022-05-30 19:03:33 +08:00
Adonis Ling	f377c26bf7	[refactor][be] Optimize headers (#9708 )	2022-05-30 16:12:10 +08:00
yiguolei	4af2493c42	[Improvement] optimize scannode concurrency query performance in vectorized engine. (#9792 )	2022-05-30 16:04:40 +08:00
Jing Shen	7b98dd438d	[feature](function) Add nvl function (#9726 )	2022-05-30 09:43:00 +08:00
EmmyMiao87	0683181fef	[API changed](parser) Remove merge join syntax (#9795 ) Remove merge join sql and merge join node	2022-05-30 09:04:21 +08:00
Gabriel	a96b41db7a	[Improvement] Simplify expressions for _vconjunct_ctx_ptr (#9816 )	2022-05-29 23:05:21 +08:00
Amos Bird	63aab5ee5d	[Bugfix(Vec)] Fix some memory leak issues (#9824 )	2022-05-29 23:04:11 +08:00
yixiutt	1aeb16d153	[improvement](load) reduce useless err_msg format in VOlapTableSink send (#9531 )	2022-05-29 16:02:57 +08:00
Mingyu Chen	9fe3827239	[fix](ut) fix BE ut (#9831) The failure was introduced by #8923; the GitHub checks had a problem and failed to check the BE ut in #8923.	2022-05-29 12:25:41 +08:00
Pxl	f33ef32d92	[Bug] [Bitmap] change to_bitmap to always_not_nullable (#9716 )	2022-05-28 17:33:55 +08:00
Dayue Gao	4d1e926b6c	[feature][config] introduce a new BE config storage_page_cache_shard_size (#9821 ) Co-authored-by: gaodayue <gaodayue@bytedance.com>	2022-05-28 10:17:09 +08:00
Kang	efdb3b79a5	[feature] add zstd compression codec (#9747) ZSTD compression is fast with a high compression ratio. It can be used to achieve a higher compression ratio than the default Lz4f codec when storing cost-sensitive data such as logs. Compared to the Lz4f codec, the zstd codec gives a 35% smaller compressed size, is 30% faster on the first read without the OS page cache, and is 40% slower on the second read with the OS page cache in the following comparison test. Test data: 25GB text log, 110 million rows. Test table: test_table(ts varchar(30), log string). Test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table. be.conf: disable_storage_page_cache = true (this disables the Doris page cache so that the data is not all cached in memory and the test measures real decompression speed). Test result, master branch with lz4f codec: - compressed size: 4.3G - SQL first exec time (read data from disk + decompress + little computation): 18.3s - SQL second exec time (read data from OS page cache + decompress + little computation): 2.4s. This branch with zstd codec (hardcoded to enable it): - compressed size: 2.8G - SQL first exec time: 12.8s - SQL second exec time: 3.4s	2022-05-27 21:56:18 +08:00
Lightman	b2c2cdb122	[feature] Support compression prop (#8923 )	2022-05-27 21:52:05 +08:00
Luwei	af2cfa2db4	[fix] Fix bug of bloom filter hash value calculation error (#9802 ) * Fix bug of bloom filter hash value calculation error * fix code style	2022-05-27 20:44:26 +08:00
yinzhijian	cbbda7857b	[feature-wip](parquet-orc) Support orc scanner in vectorized engine (#9541 )	2022-05-26 21:39:12 +08:00
Pxl	13c1d20426	[Bug] [Vectorized] add padding when load char type data (#9734 )	2022-05-26 16:51:01 +08:00
jacktengg	9236c2efc9	[improvement] Show detail status code string for be http api (#9771 ) 1. move to_json method to common/status 2. modify related usage in http folder	2022-05-26 15:09:21 +08:00
jacktengg	f4dd3bf013	[bugfix] fix memleak in olapscannode(#9736 )	2022-05-26 15:06:54 +08:00
Gabriel	24631915ed	[bugfix] fix correctness for vectorized compaction (#9773 )	2022-05-26 15:05:50 +08:00
Gabriel	cd99c24844	[Improvement] remove unused code in vectorized compaction (#9774 )	2022-05-26 15:05:27 +08:00
Adonis Ling	2a11a4ab99	[feature-wip][array-type] Support more sub types. (#9466 ) Please refer to #9465	2022-05-26 08:41:34 +08:00
spaces-x	73e31a2179	[stream-load-vec]: memtable flush only if necessary after aggregated (#9459 ) Co-authored-by: weixiang <weixiang06@meituan.com>	2022-05-25 21:12:24 +08:00
Gabriel	8470543144	[Improvement] fix typo (#9743 )	2022-05-25 19:29:01 +08:00
Zhengguo Yang	f5bef328fe	[fix] disable transfer of data larger than 2GB by brpc (#9770) brpc and protobuf cannot transfer data larger than 2GB (anything larger overflows), so add a check before sending.	2022-05-25 18:41:13 +08:00
camby	2725127421	[fix] group by with two NULL rows after left join (#9688 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-05-25 16:43:55 +08:00
Xinyi Zou	ca05d1ee01	[fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661 ) 1. Fix Lru Cache MemTracker consumption value is negative. 2. Fix compaction Cache MemTracker has no track. 3. Add USE_MEM_TRACKER compile option. 4. Make sure the malloc/free hook is not stopped at any time.	2022-05-25 08:56:17 +08:00
Dongyang Li	90e8cda5f2	[Enhancement](Vectorized)build hash table with new thread, as non-vec… (#9290 ) * [Enhancement][Vectorized]build hash table with new thread, as non-vectorized past do edit after comments * format code with clang format Co-authored-by: lidongyang <dongyang.li@rateup.com.cn> Co-authored-by: stephen <hello-stephen@qq.com>	2022-05-24 10:23:15 +08:00
Yongqiang YANG	6353539ef7	[bugfix]teach BufferedBlockMgr2 track memory right (#9722 ) The problem was introduced by e2d3d0134eee5d50b6619fd9194a2e5f9cb557dc.	2022-05-24 10:18:51 +08:00
Kang	8b7bb2d07c	[bugfix]fix column reader compress codec unsafe problem (#9741 ) by moving codec from shared reader to unshared iterator	2022-05-23 20:25:49 +08:00
HappenLee	5039ec4570	[vec][opt] opt hash join build resize hash table before insert data (#9735 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-05-23 15:13:57 +08:00
HappenLee	500c36717d	[Bug-Fix][Vectorized] Full join return error result (#9690 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-05-23 13:29:37 +08:00
Yongqiang YANG	c13a6a1d8a	[fix] NullPredicate should implement evaluate_vec (#9689 ) select column from table where column is null	2022-05-22 21:29:53 +08:00
pengxiangyu	75b3707a28	[refactor](load) add tablet errors when close_wait return error (#9619 )	2022-05-22 21:27:42 +08:00
gtchaos	b3a2a92bf5	[deps] libhdfs3 build enable kerberos support (#9524) Currently, the libhdfs3 library integrated by Doris BE does not support accessing clusters with kerberos authentication enabled, because the kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3. This PR enables kerberos support and rebuilds libhdfs3 with the dependencies gsasl and krb5: - gsasl version: 1.8.0 - krb5 version: 1.19	2022-05-22 20:58:19 +08:00
xiepengcheng01	31e40191a8	[Refactor] add vpre_filter_expr for vectorized to improve performance (#9508 )	2022-05-22 11:45:57 +08:00
Gabriel	61a60d1dcc	[code style] minor update for code style (#9695 )	2022-05-20 11:47:49 +08:00
HappenLee	8fa677b59c	[Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (#9666) 1. fix bug of vjson scanner not supporting `range_from_file_path` 2. fix bug of vjson/vbroker scanner core dump when src/dest slot nullability differs 3. fix bug of vparquet filter_block reference of column is not 1 4. refactor to simplify the code. Only the vectorized load is changed, not the original row-based load. Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-05-20 11:43:03 +08:00
zhangstar333	6f61af7682	[Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (#9440 )	2022-05-20 10:26:09 +08:00
Jibing-Li	5fa6e892be	[fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (#9190) Hive and trino/presto automatically trim trailing spaces but Doris doesn't, which causes query results that differ from hive. Add a new session variable "trim_tailing_spaces_for_external_table_query". If set to true, the trailing spaces of each column are trimmed when reading csv from the broker scan node.	2022-05-20 09:55:13 +08:00
Yongqiang YANG	defdae1e7d	[improvement](stream-load) adjust read unit of http to optimize stream load (#9154 )	2022-05-20 09:52:36 +08:00
yiguolei	2c79d223e4	[refactor][rowset]move rowset writer to a single place (#9368 )	2022-05-19 23:57:02 +08:00
Lightman	ef65f484df	[Enhancement] improve parquet reader via arrow's prefetch and multi thread (#9472) * add ArrowReaderProperties to parquet::arrow::FileReader * support prefetch batch	2022-05-19 23:52:01 +08:00
Pxl	6951c42d5c	[Bug][Vectorized] fix schema change add varchar type column default value get wrong result (#9523 )	2022-05-19 23:38:57 +08:00
Dayue Gao	c09858671d	[improvement][performance] improve lru cache resize performance and memory usage (#9521 )	2022-05-19 23:37:59 +08:00
huangzhaowei	0f9ef26576	[Bug] Fix timestamp_diff issue when timeunit is year and month (#9574 )	2022-05-19 21:24:43 +08:00


2117 Commits