doris

Author	SHA1	Message	Date
Pxl	f5e3cd2737	[Improvement](aggregation) optimization for aggregation hash_table_lazy_emplace (#22327 ) optimization for aggregation hash_table_lazy_emplace	2023-08-02 11:50:21 +08:00
Xinyi Zou	bc87002028	[opt](conf) remote scanner thread num is changed to core num * 10 (#22427 )	2023-08-01 23:09:49 +08:00
HappenLee	3a11de889f	[Opt](exec) opt the performance of date parquet convert by date dict (#22384 ) before: mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (0.86 sec) after: mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (0.36 sec)	2023-08-01 12:24:00 +08:00
Ashin Gau	89433f6a13	[fix](complex_type) throw error when reading complex types in broker/stream load (#22331 ) Check whether there are complex types in the parquet/orc reader in broker/stream load. Broker/stream load casts every type to string, so complex types would be cast incorrectly. This is a temporary measure and will be replaced by tvf.	2023-07-31 22:23:08 +08:00
zclllyybb	f2919567df	[feature](datetime) Support timezone when insert datetime value (#21898 )	2023-07-31 13:08:28 +08:00
HappenLee	4077338284	[Opt](parquet) opt the performance of date conversion (#22360 ) before: ``` mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (1.61 sec) ``` after: ``` mysql> select count(l_commitdate) from lineitem; +---------------------+ | count(l_commitdate) | +---------------------+ | 600037902 | +---------------------+ 1 row in set (0.86 sec) ```	2023-07-30 15:54:13 +08:00
Pxl	210f6661b4	[Bug](profile) add lock on add_filter_info #22355 multiple scanner may update profile at same time	2023-07-29 12:45:50 +08:00
daidai	ae8a26335c	[opt](hive) opt select count() stmt push down agg on parquet in hive. (#22115 ) Optimize the "select count() from table" statement by pushing the "count" aggregation down to BE. Supported file types: parquet, orc in hive. 1. 4k files, 60kw (600 million) lines: before: 1 min 37.70 sec, after: 50.18 sec. 2. 50 files, 60kw (600 million) lines: before: 1.12 sec, after: 0.82 sec.	2023-07-29 00:31:01 +08:00
Pxl	f7e0479605	[Chore](refactor) remove some unused code (#22152 ) remove some unused code	2023-07-28 17:30:46 +08:00
zhangstar333	1c6246f7ee	[improve](agg) support distinct agg node (#22169 ) For `select c_name from customer union select c_name from customer`, the SQL uses an agg node to get the distinct rows of c_name, so there is no need to wait until all data has been inserted into the hash map; rows that have been successfully inserted into the hash map can be output right away.	2023-07-28 13:54:10 +08:00
Ashin Gau	0d7d9b92db	[fix](multi-catalog) complex types parsing failed, with unexpected nulls and rows (#22228 ) Fix two bugs: 1. Unexpected null values in array column. If 65535 consecutive values are not null in a nullable array column, this error will be triggered. The reason is that the array parser did not handle boundary conditions. 2. The number of rows of the key field and that of the value field in a map column are not equal. Similarly, the number of rows among the fields in a struct column are not the same. This would be triggered when the number of rows is not equal among the parquet pages of different columns in a row group.	2023-07-28 10:03:08 +08:00
Qi Chen	8caa5a9ba4	[Fix](mutli-catalog) Fix null partitions error in iceberg tables. (#22185 ) ### Issue When a table has null partitions, it throws the error `Failed to fill partition column: t_int=null` ### Resolution - Fix the null partitions error in iceberg tables by replacing the null partition with '\N'. - Add regression test for hive null partition.	2023-07-27 23:57:35 +08:00
Mingyu Chen	00863f25e9	[improvement](profile) add table name for file scan node (#22299 ) ``` VFILE_SCAN_NODE(region) (id=0):(Active: 3.537us, % non-child: 0.00%) - RuntimeFilters: : - UseSpecificThreadToken: False - AcquireRuntimeFilterTime: 501ns - AllocateResourceTime: 105.598us ```	2023-07-27 23:54:31 +08:00
Jerry Hu	d0c369d61b	[fix](vec) Arena was not initialized in PartitionMethodSerialized (#22295 )	2023-07-27 18:55:57 +08:00
lsy3993	6f1c03c766	[fix](jdbc_catalog) fix int and bigint in mysql view when use doris catalog (#22251 )	2023-07-27 16:50:42 +08:00
Pxl	560731f392	[Bug](runtime-filter) fix probe expr prepared twice on minmax runtime filter (#22229 ) fix probe expr prepared twice on minmax runtime filter	2023-07-26 19:44:35 +08:00
TengJianPing	21ea0055fc	[improvement](scanner) use batch size of session instead of limit to improve performance of reading (#22240 )	2023-07-26 18:57:42 +08:00
Pxl	9451382428	[Improvement](aggregate) optimization for AggregationMethodKeysFixed::insert_keys_into_columns (#22216 ) optimization for AggregationMethodKeysFixed::insert_keys_into_columns	2023-07-26 16:19:15 +08:00
Ashin Gau	4c4f08f805	[fix](hudi) the required fields are empty if only reading partition columns (#22187 ) 1. If only the partition columns are read, the `JniConnector` will produce empty required fields, so `HudiJniScanner` should read at least the "_hoodie_record_key" field to know how many rows are in the current hoodie split. Even if the `JniConnector` doesn't read this field, the call of `releaseTable` in `JniConnector` will reclaim the resource. 2. To prevent BE failure and exit, `JniConnector` should call release methods only after `HudiJniScanner` is initialized. It should be noted that `VectorTable` is created lazily in `JniScanner`, so we don't need to reclaim the resource when `HudiJniScanner` fails to initialize. ## Remaining work Other jni readers like `paimon` and `maxcompute` may encounter the same problems. Each jni reader needs to handle this abnormal situation on its own; currently this fix only ensures that BE will not exit.	2023-07-26 10:59:45 +08:00
Qi Chen	7b270d1ae9	[Fix](mutli-catalog) Fix orc reader crashed when hdfs reading error by catching exception. (#22193 ) orc reader crashed when hdfs reading error. 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t, void) at /home/zcp/repo_center/zcp_repo/be/src/common/signal_handler.h:413 1# 0x00007F6F8B3C00C0 in /lib/x86_64-linux-gnu/libc.so.6 2# raise in /lib/x86_64-linux-gnu/libc.so.6 3# abort in /lib/x86_64-linux-gnu/libc.so.6 4# _gnu_cxx::_verbose_terminate_handler() [clone .cold] at ../../../../libstdc+-v3/libsupc+/vterminate.cc:75 5# _cxxabiv1::_terminate(void ()) at ../../../../libstdc+-v3/libsupc+/eh_terminate.cc:48 6# 0x0000555CBC4718C1 in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 7# 0x0000555CBC471A14 in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 8# doris::vectorized::ORCFileInputStream::read(void, unsigned long, unsigned long) at /home/zcp/repo_center/zcp_repo/be/src/vec/exec/format/orc/vorc_reader.cpp:121 9# orc::SeekableFileInputStream::Next(void const, int) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 10# orc::DecompressionStream::readHeader() in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 11# orc::DecompressionStream::Next(void const, int) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 12# void orc::RleDecoderV2::next<long>(long, unsigned long, char const) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 13# orc::StringDictionaryColumnReader::loadDictionary() in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 14# orc::StructColumnReader::loadStringDicts(std::unordered_map<unsigned long, std::_cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, std::cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, std::unordered_map<std::cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, orc::StringDictionary, std::hash<std::cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::_cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, orc::StringDictionary> > >, orc::StringDictFilter const) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 15# orc::RowReaderImpl::startNextStripe(orc::ReadPhase const&) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 16# orc::RowReaderImpl::nextBatch(orc::ColumnVectorBatch&, void) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 17# doris::vectorized::OrcReader::get_next_block(doris::vectorized::Block, unsigned long, bool) at /home/zcp/repo_center/zcp_repo/be/src/vec/exec/format/orc/vorc_reader.cpp:1420 18# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/zcp_repo/be/src/vec/exec/scan/vfile_scanner.cpp:250 19# doris::vectorized::VScanner::get_block(doris::RuntimeState, doris::vectorized::Block, bool) in /mnt/hdd01/STRESS_ENV/be/lib/doris_be 20# doris::vectorized::ScannerScheduler::_scanner_scan(doris::vectorized::ScannerScheduler, doris::vectorized::ScannerContext, std::shared_ptr<doris::vectorized::VScanner>) at /home/zcp/repo_center/zcp_repo/be/src/vec/exec/scan/scanner_scheduler.cpp:335 21# std::_Function_handler<void (), doris::vectorized::ScannerScheduler::_schedule_scanners(doris::vectorized::ScannerContext)::$_1::operator()() const::	2023-07-26 08:57:31 +08:00
zy-kkk	cf677b327b	[fix](jdbc catalog) Fixed mappings with type errors for bool and tinyint(1) (#22089 ) MySQL does not have a real boolean type; its boolean type is actually tinyint(1). The previous logic forced tinyint(1) to be a boolean by passing tinyInt1isBit=true, which causes an error if a tinyint(1) value is not 0 or 1. Therefore, tinyint(1) needs to be mapped as tinyint instead of boolean. This change does not affect the correctness of where k = 1 or where k = true queries.	2023-07-25 22:45:22 +08:00
Gabriel	23e7423748	[pipeline](refactor) refactor pipeline task schedule logics (#22028 )	2023-07-25 17:18:26 +08:00
Gabriel	103c473b96	[Bug](pipeline) fix pipeline shared scan + topn optimization (#21940 )	2023-07-25 12:48:27 +08:00
Qi Chen	752cec9e19	[Fix](multi-catalog) Fix not single slot filter conjuncts with dict filter issue. (#22052 ) ### Issue Dictionary filtering is a mechanism that directly reads the dictionary encoding of a single string column filter condition for filter comparison. But a dictionary-filtered single string column may also be included in other multi-column filter conditions. This can cause problems. For example: `select * from multi_catalog.lineitem_string_date_orc where l_commitdate < l_receiptdate and l_receiptdate = '1995-01-01' order by l_orderkey, l_partkey, l_suppkey, l_linenumber limit 10;` `l_receiptdate` is a string filter column, and it is also included in the multi-column filter condition `l_commitdate < l_receiptdate`. ### Solution Resolve it by separating the multi-column filter conditions and executing them after the dictionary filter column is converted to string.	2023-07-24 22:31:18 +08:00
zhangdong	7fcf702081	[improvement](multi catalog)paimon support filesystem metastore (#21910 ) 1. support filesystem metastore 2. support predicate and project when split 3. fix partition table query error. TODO: for now, you need to manually put paimon-s3-0.4.0-incubating.jar in be/lib/java_extensions when using the s3 filesystem. doc pr: #21966	2023-07-24 22:02:57 +08:00
Pxl	19ba6bec38	[Improvement](pipeline) support send eos on local exchange and remove some unused code (#22086 ) support send eos on local exchange and remove some unused code	2023-07-24 09:25:32 +08:00
yiguolei	2c16fe0da9	[bugfix](runtimefilter) runtime filter is shared between multi instances with same node id, should not cache exprs (#22114 ) The runtime filter is shared among multiple instances. In the past, we cached the pushdown expr (generated by the runtime filter). Every scan node (runtime filter consumer) will try to call prepare on the expr, but the expr may have been generated with a different fn_context_id. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-07-23 13:04:33 +08:00
zhangstar333	256051a965	[bug](node) fix partition sort node core dump when eos (#22108 ) fix partition sort node core dump when eos	2023-07-23 12:00:53 +08:00
yiguolei	f8307f1a1a	[bugfix](scanner) when scanner init failed during get tablet, not need call update counters (#22117 ) Co-authored-by: yiguolei <yiguolei@gmail.com> If the scanner fails during init or open, there is no need to update counters, because the query has failed and the counters are useless. Updating counters may also cause a core dump. For example, updating counters depends on the scanner's tablet, but the tablet == null when init failed.	2023-07-23 10:19:20 +08:00
zhangstar333	afeac4419f	[Bug](node) fix partition sort node forget handle some type of key in hashmap (#22037 ) * [enhancement](repeat) add filter in repeat node in BE * update	2023-07-21 23:30:40 +08:00
Kaijie Chen	bed940b7fc	[fix](log) column index off-by-one error in scanner logs (#19747 )	2023-07-21 18:30:01 +08:00
ZenoYang	6512893257	[refactor](vectorized) Remove useless control variables to simplify aggregation node code (#22026 ) * [refactor](vectorized) Remove useless control variables to simplify aggregation node code * fix	2023-07-21 12:45:23 +08:00
Mryange	6875ef4b8b	[refactor](mem_reuse) refactor mem_reuse in MutableBlock (#21564 )	2023-07-20 22:53:19 +08:00
HappenLee	7947569993	[Bug][RegressionTest] fix the DCHECK failed in join code (#22021 )	2023-07-20 18:12:20 +08:00
zhangstar333	650d7cfc8c	[enhancement](repeat) add filter in repeat node in BE (#21984 ) [enhancement](repeat) add filter in repeat node in BE (#21984)	2023-07-20 17:25:13 +08:00
lihangyu	20242d9a0e	[Improve](simdjson) put unescaped string value after parsed (#21866 ) In some cases, it is necessary to unescape the original value, such as when converting a string to JSONB. If the value is not unescaped, the later JSONB parse will fail.	2023-07-20 10:33:17 +08:00
HappenLee	b35cfc5d5e	[opt](join) Opt the performance of join probe (#21845 )	2023-07-19 01:21:22 +08:00
Pxl	4171309b9b	[Bug](scanner) fix core dump due to release ScannerContext too early #21946	2023-07-19 00:53:23 +08:00
Mryange	c36d225a27	[feature](profile) add process hashtable time in join node (#21878 ) add process hashtable time in join node	2023-07-18 18:09:42 +08:00
Pxl	3089e4b3b6	[Bug](excution) fix ScannerContext is done make query failed (#21923 ) fix ScannerContext is done make query failed	2023-07-18 17:58:00 +08:00
Pxl	19492b06c1	[Bug](decimalv3) fix failed on test_dup_tab_decimalv3 due to wrong precision (#21890 ) fix failed on test_dup_tab_decimalv3 due to wrong precision	2023-07-18 12:53:09 +08:00
Pxl	b3d3ffa2de	[Bug](pipeline) adjust scanner scheduler.submit and _num_scheduling_ctx maintain (#21843 ) adjust scanner scheduler.submit and _num_scheduling_ctx maintain	2023-07-18 11:55:21 +08:00
Mingyu Chen	5fc0a84735	[improvement](catalog) reduce the size thrift params for external table query (#21771 ) ### 1 In the previous implementation, each FileSplit had its own `TFileScanRange`, and each `TFileScanRange` contained a list of `TFileRangeDesc` and a `TFileScanRangeParams`. So if there are thousands of FileSplits, there will be thousands of `TFileScanRange`, which makes the thrift data sent to BE too large, resulting in: 1. the rpc of sending fragment may fail due to timeout 2. FE will OOM. For a given query request, the `TFileScanRangeParams` is the common part and is the same for all `TFileScanRange`. So I moved it to the `TExecPlanFragmentParams`. After that, each FileSplit only has a list of `TFileRangeDesc`. In my test, querying a hive table with 100000 partitions, the size of the thrift data was reduced from 151MB to 15MB, and the above 2 issues are gone. ### 2 When `max_external_file_meta_cache_num` is set to <= 0, the file meta cache for parquet footer will not be used. I found that for some wide tables the footer is too large (1MB after compact, and much more after being deserialized to thrift), so it consumes too much BE memory when there are many files. This will be optimized later; here I just support disabling this cache.	2023-07-17 13:37:02 +08:00
HappenLee	a7eb186801	[Bug](CSVReader) fix null pointer coredump in CSVReader in p2 (#20811 )	2023-07-15 22:50:10 +08:00
HappenLee	7f50c07219	[Opt](exec) opt the outer join performance in TPCDS Q95 (#21806 )	2023-07-14 18:42:08 +08:00
Siyang Tang	b013f8006d	[enhancement](multi-table) enable multi table routine load on pipeline engine (#21729 )	2023-07-14 12:16:32 +08:00
daidai	ca6e33ec0c	[feature](table-value-functions)add catalogs table-value-function (#21790 ) mysql> select * from catalogs() order by CatalogId;	2023-07-14 10:25:16 +08:00
HappenLee	254f76f61d	[Agg](exec) support aggregation_node limit short circuit (#21767 )	2023-07-14 00:29:19 +08:00
Qi Chen	6fd8f5cd2f	[Fix](parquet-reader) Fix parquet string column min max statistics issue which caused query result incorrectly. (#21675 ) In parquet, min and max statistics may not be able to handle UTF8 correctly. The current approach uses the min_value and max_value statistics introduced by PARQUET-1025 if they are present. If not, the statistics are ignored for now. A better way is to try reading the min and max statistics when they contain only ASCII characters. I will improve this in a future PR.	2023-07-14 00:09:41 +08:00
lihangyu	9cad929e96	[Fix](rowset) When a rowset is cooled down, it is directly deleted. This can result in data query misses in the second phase of a two-phase query. (#21741 ) * [Fix](rowset) When a rowset is cooled down, it is directly deleted. This can result in data query misses in the second phase of a two-phase query. Related PR: #20732. There are two reasons for moving the logic of delayed deletion from the Tablet to the StorageEngine. The first reason is to consolidate the logic and unify the delayed operations. The second reason is that delayed garbage collection during queries can cause rowsets to remain in the "stale rowsets" state, preventing the timely deletion of rowset metadata, which may cause the rowset metadata to grow too large. * not use unused rowsets	2023-07-13 11:46:12 +08:00

894 Commits