2099 Commits

Author SHA1 Message Date
af2cfa2db4 [fix] Fix bug of bloom filter hash value calculation error (#9802)
* Fix bug of bloom filter hash value calculation error

* fix code style
2022-05-27 20:44:26 +08:00
cbbda7857b [feature-wip](parquet-orc) Support orc scanner in vectorized engine (#9541) 2022-05-26 21:39:12 +08:00
Pxl
13c1d20426 [Bug] [Vectorized] add padding when load char type data (#9734) 2022-05-26 16:51:01 +08:00
9236c2efc9 [improvement] Show detail status code string for be http api (#9771)
1. move to_json method to common/status
2. modify related usage in http folder
2022-05-26 15:09:21 +08:00
f4dd3bf013 [bugfix] fix memleak in olapscannode(#9736) 2022-05-26 15:06:54 +08:00
24631915ed [bugfix] fix correctness for vectorized compaction (#9773) 2022-05-26 15:05:50 +08:00
cd99c24844 [Improvement] remove unused code in vectorized compaction (#9774) 2022-05-26 15:05:27 +08:00
2a11a4ab99 [feature-wip][array-type] Support more sub types. (#9466)
Please refer to #9465
2022-05-26 08:41:34 +08:00
73e31a2179 [stream-load-vec]: memtable flush only if necessary after aggregated (#9459)
Co-authored-by: weixiang <weixiang06@meituan.com>
2022-05-25 21:12:24 +08:00
8470543144 [Improvement] fix typo (#9743) 2022-05-25 19:29:01 +08:00
f5bef328fe [fix] disable transfer data large than 2GB by brpc (#9770)
Because brpc and protobuf cannot transfer data larger than 2GB (anything larger overflows), add a check before sending.
2022-05-25 18:41:13 +08:00
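The size check described in the commit above can be illustrated with a minimal sketch. This is not the Doris BE code: `send_checked`, `PayloadTooLargeError`, and `MAX_PAYLOAD_BYTES` are hypothetical names, and the exact limit of 2^31 - 1 bytes is an assumption, since the commit message only says "2GB".

```python
# Assumption: the limit is the largest signed 32-bit value; the commit only says "2GB".
MAX_PAYLOAD_BYTES = 2**31 - 1


class PayloadTooLargeError(ValueError):
    """Raised when a payload exceeds the transport limit."""


def send_checked(payload, transport, limit=MAX_PAYLOAD_BYTES):
    """Call transport(payload) only if the payload fits within `limit` bytes."""
    size = len(payload)
    if size > limit:
        # Reject up front instead of letting a 32-bit length field overflow.
        raise PayloadTooLargeError(f"payload is {size} bytes, limit is {limit}")
    transport(payload)


if __name__ == "__main__":
    sent = []
    send_checked(b"abc", sent.append, limit=4)  # fits, so it is sent
    try:
        send_checked(b"abcde", sent.append, limit=4)  # too large, so it is rejected
    except PayloadTooLargeError as err:
        print("rejected:", err)
```

Failing before the send keeps the error at the caller, where the oversized payload can be reported or split, rather than surfacing later as a corrupted length.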
2725127421 [fix] group by with two NULL rows after left join (#9688)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-25 16:43:55 +08:00
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix the Lru Cache MemTracker consumption value being negative.
2. Fix the compaction Cache MemTracker not being tracked.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
90e8cda5f2 [Enhancement](Vectorized)build hash table with new thread, as non-vec… (#9290)
* [Enhancement][Vectorized]build hash table with new thread, as non-vectorized past do

edit after comments

* format code with clang format

Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>
2022-05-24 10:23:15 +08:00
6353539ef7 [bugfix]teach BufferedBlockMgr2 track memory right (#9722)
The problem was introduced by e2d3d0134eee5d50b6619fd9194a2e5f9cb557dc.
2022-05-24 10:18:51 +08:00
8b7bb2d07c [bugfix]fix column reader compress codec unsafe problem (#9741)
by moving codec from shared reader to unshared iterator
2022-05-23 20:25:49 +08:00
5039ec4570 [vec][opt] opt hash join build resize hash table before insert data (#9735)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-23 15:13:57 +08:00
500c36717d [Bug-Fix][Vectorized] Full join return error result (#9690)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-23 13:29:37 +08:00
c13a6a1d8a [fix] NullPredicate should implement evaluate_vec (#9689)
select column from table where column is null
2022-05-22 21:29:53 +08:00
75b3707a28 [refactor](load) add tablet errors when close_wait return error (#9619) 2022-05-22 21:27:42 +08:00
b3a2a92bf5 [deps] libhdfs3 build enable kerberos support (#9524)
Currently, the libhdfs3 library integrated into Doris BE does not support accessing clusters with Kerberos authentication enabled; the Kerberos-related dependencies (gsasl and krb5) were found to be missing from the libhdfs3 build.

This PR enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:

- gsasl version: 1.8.0
- krb5 version: 1.19
2022-05-22 20:58:19 +08:00
31e40191a8 [Refactor] add vpre_filter_expr for vectorized to improve performance (#9508) 2022-05-22 11:45:57 +08:00
61a60d1dcc [code style] minor update for code style (#9695) 2022-05-20 11:47:49 +08:00
8fa677b59c [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (#9666)
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug of vjson scanner not supporting `range_from_file_path`
2. fix bug of vjson/vbroker scanner core dump when src/dest slot nullability differs
3. fix bug of vparquet filter_block when the reference count of a column is not 1
4. refactor to simplify the code

It only changed vectorized load, not original row based load.

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-20 11:43:03 +08:00
6f61af7682 [Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (#9440) 2022-05-20 10:26:09 +08:00
5fa6e892be [fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (#9190)
Hive and Trino/Presto automatically trim trailing spaces, but Doris does not.
This causes query results to differ from Hive.

Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, the trailing spaces of each column are trimmed when reading CSV from the broker scan node.
2022-05-20 09:55:13 +08:00
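The behavior described in the commit above can be sketched as follows. This is an illustration only, not the broker scanner implementation: `split_fields` and its `trim_trailing_spaces` parameter are hypothetical stand-ins for the scanner and the session variable.

```python
def split_fields(line, sep=",", trim_trailing_spaces=False):
    """Split one CSV line into fields, optionally trimming trailing spaces.

    Only trailing spaces are removed; leading spaces are kept.
    """
    fields = line.split(sep)
    if trim_trailing_spaces:
        fields = [field.rstrip(" ") for field in fields]
    return fields


if __name__ == "__main__":
    line = "a  ,b , c"
    print(split_fields(line))                             # fields kept verbatim
    print(split_fields(line, trim_trailing_spaces=True))  # trailing spaces removed
```

Keeping the flag off by default mirrors the commit's approach of making the new behavior opt-in through a session variable.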
defdae1e7d [improvement](stream-load) adjust read unit of http to optimize stream load (#9154) 2022-05-20 09:52:36 +08:00
2c79d223e4 [refactor][rowset]move rowset writer to a single place (#9368) 2022-05-19 23:57:02 +08:00
ef65f484df [Enhancement] improve parquet reader via arrow's prefetch and multi thread (#9472)
* add ArrowReaderProperties to parquet::arrow::FileReader

* support prefetch batch
2022-05-19 23:52:01 +08:00
Pxl
6951c42d5c [Bug][Vectorized] fix schema change add varchar type column default value get wrong result (#9523) 2022-05-19 23:38:57 +08:00
c09858671d [improvement][performance] improve lru cache resize performance and memory usage (#9521) 2022-05-19 23:37:59 +08:00
0f9ef26576 [Bug] Fix timestamp_diff issue when timeunit is year and month (#9574) 2022-05-19 21:24:43 +08:00
73c4ec7167 Fix some typos in be/. (#9681) 2022-05-19 20:55:39 +08:00
119ff2c02d [enhancement] Improve debugging experience. (#9677) 2022-05-19 16:36:37 +08:00
7c2db79b73 [BUG] fix bug for vectorized compaction and some storage vectorization bug (#9610) 2022-05-19 16:35:15 +08:00
bdaf0b3fcc [fix](storage) low_cardinality_optimize core dump when is null predicate (#9586)
Issue Number: close #9555
Make the last value of the dictionary null, when ColumnDict inserts a null value,
add the encoding corresponding to the last value of the dictionary.
2022-05-18 14:57:13 +08:00
c9ab5e22fe [fixbug](vec-load) fix core of segment_writer while it is not thread-safe (#9569)
Introduced in stream-load-vec #9280. BetaRowset enables multi-threaded
memtable flush, and memtable flush calls rowset_writer.add_block, which
writes through the member variable _segment_writer. As a result, multiple
threads operate on the same segment writer, which is not thread-safe.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-18 11:29:15 +08:00
bfb1ab059d [BUG] fix information_schema.columns results not correctly on vec engine (#9612)
* VSchemaScanNode get_next bugfix

* add regression-test case for VSchemaScanNode

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-18 07:44:32 +08:00
Pxl
26353ba8b5 [clang build]fix clang compile error (#9615) 2022-05-18 07:42:31 +08:00
908f9cb7b9 [Improvement][ASAN] make BE can exit normally and ASAN memory leak checking work (#9620) 2022-05-18 07:40:57 +08:00
4312ef93d7 [Improvement] reduce string size in serialization (#9550) 2022-05-17 22:38:34 +08:00
ec2cd0083a [code format]Upgrade clang-format in BE Code Formatter from 8 to 13 (#9602) 2022-05-17 19:28:15 +08:00
536d8ca1ed [Bug][Vectorized] Fix insert bimmap column with nullable column (#9408)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-17 14:42:20 +08:00
1cc9653bd8 [Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization (#9547)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-17 14:01:22 +08:00
7d9fa04472 [fix](storage-vectorized) fix VMergeIterator core dump (#9564)
It can be reproduced on a rowset with many segments, that is, with overlapping segments. It may not be easy to reproduce.
2022-05-17 11:58:59 +08:00
bee5c2f8aa [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (#9433) 2022-05-17 09:37:17 +08:00
5660815dc6 [chore] Fix compilation errors reported by clang (#9584) 2022-05-16 22:36:16 +08:00
953429e370 [fix](function) fix last_value get wrong result when have order by clause (#9247) 2022-05-15 23:56:01 +08:00
e0c790094c [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (#9566) 2022-05-15 21:18:32 +08:00
3cfa83784e [bugfix](vectorized) vectorized write: invalid memory access caused by podarray resize (#9556) 2022-05-14 19:03:51 +08:00