doris

Author	SHA1	Message	Date
Xinyi Zou	2c9bdd64fa	[fix](memory) arena support memory reuse after clear() (#21033 )	2023-06-21 23:27:21 +08:00
Gabriel	2ce8cfbebd	[profile](sort) add some metrics in profile (#21056 )	2023-06-21 22:57:46 +08:00
Xinyi Zou	661e1ae7c5	[fix](memory) no switch bthread context in UBSAN compile (#21064 ) When UBSAN is compiled, all memory will be tracked to the orphan (unknown) mem tracker, and the bthread context and mem tracker will no longer be switched. The supplementary fixes are as follows: #20999	2023-06-21 21:14:07 +08:00
Xinyi Zou	84b97860a1	[fix](memory) Fix memory exceed limit and query has been canceled, Allocator will block 100ms (#20959 )	2023-06-21 17:35:19 +08:00
Qi Chen	bad22dd4e2	[Fix](orc-reader) Fix orc dict filter null value issue in `_convert_dict_cols_to_string_cols` which caused incorrect result. (#21047 ) Query results should not have empty values. ``` use regresssion.multi_catalog; select commit_id from github_events_orc WHERE (event_type = 'CommitCommentEvent') AND commit_id != "" limit 10; ``` ``` +------------------------------------------+ \| commit_id \| +------------------------------------------+ \| 685c1fd8dbbdc10c042932f9a9f88be00ff96c75 \| \| 685c1fd8dbbdc10c042932f9a9f88be00ff96c75 \| \| 4e3ab2ff2d2474f5d51334b9b0fdf17e9845a166 \| \| \| \| \| \| \| \| \| \| \| \| \| \| 7191c20cb49da07a7fc16aa32dc0de4faff528b2 \| +------------------------------------------+ 10 rows in set (0.54 sec) ```	2023-06-21 14:54:01 +08:00
Gabriel	81abdeffbc	[Improvement](pipeline) Improve shared scan performance (#20785 )	2023-06-21 14:36:05 +08:00
Pxl	5f0bb49d46	[Feature](materialized-view) support create mv contain aggstate column (#20812 )	2023-06-21 13:06:52 +08:00
Ashin Gau	ef17289925	[feature](jni) add jni metrics and attach to BE profile automatically (#21004 ) Add JNI metrics, for example: ``` - HudiJniScanner: 0ns - FillBlockTime: 31.29ms - GetRecordReaderTime: 1m5s - JavaScanTime: 35s991ms - OpenScannerTime: 1m6s ``` Add three common performance metrics for the JNI scanner: 1. `OpenScannerTime`: Time to init and open the JNI scanner 2. `JavaScanTime`: Time to scan data and insert it into the vector table on the Java side 3. `FillBlockTime`: Time to convert the Java vector table to a C++ block. User-defined metrics on the Java side are also supported. For example, when `OpenScannerTime` shows that the open process takes a long time and we want to determine which sub-process is responsible, we can add `GetRecordReaderTime` on the Java side. User-defined metrics on the Java side are attached to the BE profile automatically.	2023-06-21 11:19:02 +08:00
dujl	0cf9de8cef	[fix](decimalv3) fix result error when cast a round decimalv3 to double (#20678 )	2023-06-21 00:02:48 +08:00
HappenLee	ca6f51fcd5	[Performance] disable mmap alloc for doris performance (#21034 ) disable mmap alloc for some benchmark	2023-06-20 23:27:49 +08:00
Xinyi Zou	6d579d924d	[fix](profile) delete useless profile add_child #20989	2023-06-20 23:21:52 +08:00
Kang	2c11ce0a02	[bugfix](topn) fix key topn merge block conflict with index predicate result columns (#20820 )	2023-06-20 21:23:00 +08:00
Xinyi Zou	622ef63c69	[fix](memory) fix `bthread_setspecific` error in rpc done.run() (#20999 )	2023-06-20 21:00:45 +08:00
TengJianPing	55a6649da9	[fix](testcase) fix test case failure of insert null value into not null column (#20963 )	2023-06-20 20:46:07 +08:00
zzzxl	190debaac9	[Improvement](load) single partition load optimize (#20876 ) 1. When creating a single partition, partition and tablet are not looked up for each row of data 2. Only DISTRIBUTED BY random	2023-06-20 20:29:39 +08:00
Qi Chen	c85271d2ae	[Fix](orc-reader) Fix filter size mismatch in orc reader. (#20998 ) Fix filter size mismatch in orc reader introduced by #20806	2023-06-20 12:27:16 +08:00
Ashin Gau	923f7edad0	[opt](hudi) using native reader to read the base file with no log file (#20988 ) Two optimizations: 1. Insert string bytes directly to remove decoding&encoding process. 2. Use native reader to read the hudi base file if it has no log file. Use `explain` to show how many splits are read natively.	2023-06-20 11:20:21 +08:00
zzzzzzzs	824bc02603	[Function] Support date function: microsecond() (#20044 )	2023-06-20 10:32:54 +08:00
yongjinhou	26cca5e00a	[Enhancement](tvf) Add frontends table-valued-function (#20857 )	2023-06-19 13:57:40 +08:00
Jerry Hu	08fff8923f	[improvement](serde) Optimizing the performance of mysql result writer (#20928 ) Converting query results into MySQL format means transforming columnar storage into row-based storage, which forces a choice between sequential reading and sequential writing. In practice, sequential writing performs better. In a test with 9M rows of more than 20 columns, this patch reduces the conversion time from 20s to 11s.	2023-06-19 12:29:01 +08:00
TengJianPing	fb9fcf460a	[fix](leftjoin) fix bug of left and full join with other conjuncts (#20946 ) Fix bug of left and full outer join with other conjuncts. When the number of equal-matched rows for a probe row exceeds batch_size, the _join_node->_is_any_probe_match_row_output flag is sometimes not set correctly, which results in outputting extra rows for the probe row.	2023-06-19 12:27:06 +08:00
Pxl	85c5d7c6a9	[Chore](materialized-view) add ssb_flat mv test case (#20869 )	2023-06-19 10:51:50 +08:00
YueW	d6b7640cf0	[fix](inverted index) fix check failed for block erase temp column (#20924 )	2023-06-18 19:27:48 +08:00
zhangstar333	0a59580aa4	[Enhancement](function) fix compatibility issues of sum/count during upgrade process (#20890 ) Fixes the incompatibility of the sum/count aggregates during the upgrade process. PR #20370 ([refactor](agg_state) refactor agg_state type to support fixed length object type) changed the serialized type and serialized column of sum/count: they previously used ColumnVector and now use ColumnFixedLengthObject. As a result, sum/count is incompatible during an upgrade while old and new BEs coexist.	2023-06-17 12:51:01 +08:00
xzj7019	ab32299ba4	[feature](nereids) Support multi target rf #20714 Support multi target runtime filter, mainly for set operation, such as union/intersect/except.	2023-06-16 20:26:00 +08:00
yuxuan-luo	97135a1cbb	[Feature] (json)add json_contains function (#20824 )	2023-06-16 15:10:12 +08:00
Qi Chen	b7a50a09fe	[Opt](orc-reader) Optimize orc reader by dict filtering. (#20806 ) Similar to #17594. Test result ssb-flat-100: (3 nodes) \| Query \| before opt \| after opt \| \| ------------- \|:-------------:\| ---------:\| Q1.1 \| 1.239 \| 1.145 Q1.2 \| 1.254 \| 1.128 Q1.3 \| 1.931 \| 1.644 Q2.1 \| 1.359 \| 1.006 Q2.2 \| 1.229 \| 0.674 Q2.3 \| 0.934 \| 0.427 Q3.1 \| 2.226 \| 1.712 Q3.2 \| 2.042 \| 1.562 Q3.3 \| 1.631 \| 1.021 Q3.4 \| 1.618 \| 0.732 Q4.1 \| 2.294 \| 1.858 Q4.2 \| 2.511 \| 1.961 Q4.3 \| 1.736 \| 1.446 total \| 22.004 \| 16.316	2023-06-16 13:11:37 +08:00
YueW	420603317b	[improve](match) Improve performance for match query without inverted index (#20815 )	2023-06-16 10:35:38 +08:00
HHoflittlefish777	bb5d36c5cb	[Log](load) change VLOG to INFO when write replica failing (#20783 )	2023-06-15 16:11:14 +08:00
Pxl	b6835840f7	[Bug](table-function) return InvalidArgument when explode_split meet empty delimiter (#20795 )	2023-06-15 15:17:22 +08:00
Pxl	01e53f4e67	[Bug](materialized-view) fix problems about create mv on ssb_flat q4.1 failed (#20658 )	2023-06-15 14:38:21 +08:00
Pxl	17a395f5e3	[Bug](runtime-filter) fix runtime filter not register on vdata_gen_scan_node (#20787 )	2023-06-15 14:06:14 +08:00
fornaix	2151f5d04d	[fix](bitmap) fix bug: incorrect orthogonal bitmap result in some cases (#20819 ) (#20822 ) Issue Number: close #20819 If there is only one aggregation (update finalize) phase, the result field will not be updated. This PR aims to resolve it.	2023-06-15 14:05:24 +08:00
Xinyi Zou	1ce8f13837	[fix](memory) fix mem tracker in NodeChannel rpc callback (#20779 )	2023-06-15 10:35:25 +08:00
Mryange	460399f214	[fix](profile) remove same profile in join node (#20734 )	2023-06-15 08:08:39 +08:00
zy-kkk	09d187ec77	[improvement](ck jdbc) Optimized reading of datetime and ip types of the ClickHouse JDBC Catalog (#20804 )	2023-06-14 23:28:08 +08:00
slothever	bb617ee2cc	[fix](parquet-reader)fix page v2 header offset (#20814 ) fix page v2 header offset. get correct offset when read next page in file.	2023-06-14 23:27:31 +08:00
yiguolei	31a4f96f01	[refactor](exprcontext) move close to expr context's destructor method (#20747 ) The close method does nothing, but I am not sure we can remove it, so I moved it into the destructor and removed the many explicit calls.	2023-06-14 18:01:07 +08:00
Pxl	a0d4f11667	[Bug](function) catch error state in function cast to avoid core dump (#20751 )	2023-06-14 17:34:34 +08:00
lihangyu	0f470fec0e	[Bug](topn opt) Fix Two-Phase read when some rowset swept (#20732 ) If this is a Two-Phase read query, we need to delay the release of the rowset via row->update_delayed_expired_timestamp() to extend the lifespan of rowsets. This is necessary to avoid data loss during the second-phase read, where some stale rowsets may be swept, resulting in missing data.	2023-06-14 15:46:29 +08:00
Pxl	9244cb6553	[Chore](runtime-filter) do not make query fail when rf publish failed (#20742 )	2023-06-13 18:23:46 +08:00
TengJianPing	feb21fc9e9	[fix](group_concat) use default separator ',' instead of ', ' for group_concat, to be consistent with mysql (#20741 )	2023-06-13 17:20:29 +08:00
lihangyu	2dddab03a1	[compatibility](schema cache) ensure schema version when using schema cache (#20729 ) When the FE is an old version and the BE is a new version, issuing a schema change (add column) and then querying can read a stale schema from the schema cache, because the old FE queries without a schema version.	2023-06-13 15:19:26 +08:00
Mingyu Chen	4b15185e25	[improvement](hdfs) add parquet footer cache and hdfs file handle cache (#20544 ) 1. Add hdfs file handle cache for hdfs file reader. Copied from Impala, `https://github.com/apache/impala/blob/master/be/src/util/lru-multi-cache.h`. (Thanks to the Impala team.) This is an LRU cache that can store multiple entries with the same key. The key is built from {file name + modification time}. The value is the hdfsFile pointer that points to a certain hdfs file. This cache avoids reopening the same hdfs file multiple times, which can save query time. Add a BE config `max_hdfs_file_handle_cache_num` to limit the maximum number of cached file handles; the default is 20000. 2. Add file meta cache. The file meta cache is an LRU cache. The key is {file name + modification time}, and the value is the parsed file meta info of the file, which saves the time of re-parsing the file meta every time. Currently, it is only used for caching the parquet file footer. Tests show that when the cache is hit, `FileOpenTime` and `ParseFooterTime` are reduced to almost 0 in the query profile, which can save time when there are lots of files to read.	2023-06-13 15:13:57 +08:00
Pxl	e010fa8d4f	[Chore](runtime filter) remove runtime filter ready_for_publish/publish_finally (#20593 )	2023-06-13 11:20:49 +08:00
lexluo09	57656b2459	[Enhancement](java-udf) java-udf module split to sub modules (#20185 ) The java-udf module has become increasingly large and difficult to manage, making it inconvenient to package and use as needed. It needs to be split into multiple sub-modules, such as: java-commom, java-udf, jdbc-scanner, hudi-scanner, paimon-scanner. Co-authored-by: lexluo <lexluo@tencent.com>	2023-06-13 09:41:22 +08:00
HappenLee	51bbf17786	[Refactor](Profile) Add and refactor the join profile (#20693 )	2023-06-13 09:06:51 +08:00
Qi Chen	73ad885e19	[Feature][Fix](multi-catalog) Implements transactional hive full acid tables. (#20679 ) After supporting insert-only transactional hive full acid tables #19518, #19419, this PR supports transactional hive full acid tables. Supports hive3 transactional hive full acid tables. Hive2 transactional hive full acid tables need to run major compactions.	2023-06-13 08:55:16 +08:00
Xinyi Zou	1433544c56	[fix](case expr) fix coredump of case for null value 3 #20711	2023-06-12 20:58:01 +08:00
TengJianPing	9d47c6a871	[fix](columnstring) fix bug of columnstring prefetch (#20698 )	2023-06-12 17:03:44 +08:00


1822 Commits