doris

Author	SHA1	Message	Date
xzj7019	ab32299ba4	[feature](nereids) Support multi target rf #20714 Support multi target runtime filter, mainly for set operation, such as union/intersect/except.	2023-06-16 20:26:00 +08:00
zhannngchen	baed26acc6	[fix](merge-on-write) fix publish timeout (#20891 )	2023-06-16 18:11:41 +08:00
yongjinhou	2e295a1ee9	[Enhancement](http) unify http auth config (#20864 )	2023-06-16 16:55:46 +08:00
yuxuan-luo	97135a1cbb	[Feature] (json)add json_contains function (#20824 )	2023-06-16 15:10:12 +08:00
Chenyang Sun	f80bf3c1bc	[fix](compaction) fix time series compaction policy (#20837 )	2023-06-16 13:25:55 +08:00
Qi Chen	b7a50a09fe	[Opt](orc-reader) Optimize orc reader by dict filtering. (#20806 ) Optimize orc reader by dict filtering. It is similar with #17594. Test result ssb-flat-100: (3 nodes) \| Query \| before opt \| after opt \| \| ------------- \|:-------------:\| ---------:\| Q1.1 \| 1.239 \| 1.145 Q1.2 \| 1.254 \| 1.128 Q1.3 \| 1.931 \| 1.644 Q2.1 \| 1.359 \| 1.006 Q2.2 \| 1.229 \| 0.674 Q2.3 \| 0.934 \| 0.427 Q3.1 \| 2.226 \| 1.712 Q3.2 \| 2.042 \| 1.562 Q3.3 \| 1.631 \| 1.021 Q3.4 \| 1.618 \| 0.732 Q4.1 \| 2.294 \| 1.858 Q4.2 \| 2.511 \| 1.961 Q4.3 \| 1.736 \| 1.446 total \| 22.004 \| 16.316	2023-06-16 13:11:37 +08:00
Xinyi Zou	1ec7f0e50a	[fix](memory) memory management thread exits gracefully memory_maintenance_thread, memory_gc_thread, load_channel_tracker_refresh_thread, memory_tracker_profile_refresh_thread	2023-06-16 11:40:24 +08:00
Jack Drogon	9d41edd9eb	[Feature](binlog) Add binlog gc && Auth master_token (#20854 ) Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-06-16 11:25:11 +08:00
Pxl	db1da6b787	[Chore](pipeline) add some profile log when pipeline canceled (#20825 ) add some profile log when pipeline canceled	2023-06-16 10:54:54 +08:00
YueW	420603317b	[improve](match) Improve performance for match query without inverted index (#20815 )	2023-06-16 10:35:38 +08:00
Jack Drogon	b38417358b	[chore](compile) Remove fs/benchmark/fs_benchmark_tool.cpp in IO_FILES (#20845 )	2023-06-15 21:16:59 +08:00
Xin Liao	f1af09ef87	[Enhancement](merge-on-write) parallel calculate delete bitmap when tablet has multi segments (#20706 )	2023-06-15 21:11:39 +08:00
Pxl	4bfceb7acb	[Bug](pipeline) fix pipeline task call finish_p_dependency more than once (#20851 ) fix pipeline task call finish_p_dependency more than once When pipeline task meet eos->PENDING_FINISH->CANCELED, this task will call finish_p_dependency twice.	2023-06-15 19:21:40 +08:00
HHoflittlefish777	bb5d36c5cb	[Log](load) change VLOG to INFO when write replica failing (#20783 )	2023-06-15 16:11:14 +08:00
Pxl	b6835840f7	[Bug](table-function) return InvalidArgument when explode_split meet empty delimiter (#20795 ) return InvalidArgument when explode_split meet empty delimiter	2023-06-15 15:17:22 +08:00
Pxl	01e53f4e67	[Bug](materialized-view) fix problems about create mv on ssb_flat q4.1 failed (#20658 ) fix problems about create mv on ssb_flat q4.1 failed	2023-06-15 14:38:21 +08:00
zhannngchen	15b9830859	[fix](partial-update) sequence column is not proceeded correctly #20813 When checking the keys in PrimaryKeyIndex, seq_col_length is not set to correct value, then we got a NOT_FOUND result for an existing key.	2023-06-15 14:07:00 +08:00
Pxl	17a395f5e3	[Bug](runtime-filter) fix runtime filter not register on vdata_gen_scan_node (#20787 ) fix runtime filter not register on vdata_gen_scan_node	2023-06-15 14:06:14 +08:00
fornaix	2151f5d04d	[fix](bitmap) fix bug: incorrect orthogonal bitmap result in some cases (#20819 ) (#20822 ) Issue Number: close #20819 If there is only one aggregation (update finalize) phase, result field will not be updated. This pr is aim to resolve it.	2023-06-15 14:05:24 +08:00
zhengyu	5b6761acb8	[enhencement](streamload) add on_close callback for httpserver (#20826 ) Sometimes connection cannot be released properly during on_free. We need on_close callback as the last resort. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-06-15 13:44:02 +08:00
airborne12	4b18cde0c7	[Update](clucene) synchronize clucene version to improve clcuene performance (#20794 ) Improve clucene performance by zero copy memory and SIMD string compare. Related PR: apache/doris-thirdparty#87	2023-06-15 10:35:51 +08:00
Xinyi Zou	1ce8f13837	[fix](memory) fix mem tracker in NodeChannel rpc callback (#20779 )	2023-06-15 10:35:25 +08:00
abmdocrt	5c4f6b4ca9	[fix](MOW)Fix publish timeout when enable MOW (#20828 )	2023-06-15 10:34:20 +08:00
zhengyu	4bf15b9788	[fix](load) fix race condition problem when insert commitinfo (#20823 ) Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-06-15 09:53:32 +08:00
Mryange	460399f214	[fix](profile) remove same profile in join node (#20734 )	2023-06-15 08:08:39 +08:00
Chenyang Sun	2a2e485456	[Enhancement](compaction) time-series scenario cumulative compaction policy (#20715 ) new compaction policy for log and time-series scenario	2023-06-14 23:48:44 +08:00
zy-kkk	09d187ec77	[improvement](ck jdbc) Optimized reading of datetime and ip types of the ClickHouse JDBC Catalog (#20804 )	2023-06-14 23:28:08 +08:00
slothever	bb617ee2cc	[fix](parquet-reader)fix page v2 header offset (#20814 ) fix page v2 header offset. get correct offset when read next page in file.	2023-06-14 23:27:31 +08:00
Pxl	3727483c06	[Chore](build) update ldb_toolchain to v0.18 (#20802 ) * update ldb_toolchain to v0.18 * update	2023-06-14 18:38:35 +08:00
lihangyu	0ecc98df82	[Bug](rowset) expire delayed rowsets should be ignored and should not be deleted in _tablet_meta (#20803 )	2023-06-14 18:30:13 +08:00
yiguolei	31a4f96f01	[refactor](exprcontext) move close to expr context's dector method (#20747 ) The close method does nothing. But I am not sure we could remove it. So that I add it to dector method and remove many many calls.	2023-06-14 18:01:07 +08:00
abmdocrt	b97537b04b	[Fix](MOW) Fix load data publish timeout when enable unique key MOW (#20720 )	2023-06-14 17:56:02 +08:00
Mingyu Chen	615778924e	[feature](fs) add fs benchmark tool framework (#20770 ) Add an optional executable binary fs_benchmark_tool, for test the performance of file system such as hdfs, s3. Usage: ./fs_benchmark_tool --conf my.conf --fs_type=s3 --operation=read --iterations=5 in my.conf, you can add any config key value with following format: key1=value1 key2=value2 By default, this binary will not be built. Only build it when setting BUILD_FS_BENCHMARK=ON. The binary will be installed in output/be/lib. For developer, you can add new subclass of BaseBenchmark to add your own benchmark. See be/src/io/fs/benchmark/s3_benchmark.hpp for an example	2023-06-14 17:50:06 +08:00
Pxl	a0d4f11667	[Bug](function) catch error state in function cast to avoid core dump (#20751 ) catch error state in function cast to avoid core dump	2023-06-14 17:34:34 +08:00
airborne12	d922a4a9fa	[Feature-WIP](inverted index) add inverted index file size method (#20758 ) This PR calculates the size of the inverted index files. The changes consist of: Introduction of a new get_inverted_index_size() method in different column writers such as ScalarColumnWriter, StructColumnWriter, ArrayColumnWriter, and MapColumnWriter. This method will fetch the size of the inverted index file associated with that column. If the file size cannot be fetched, it defaults to 0. A new method file_size() has been added in InvertedIndexColumnWriter class which retrieves the size of the file stored on disk. If the file size cannot be fetched, it logs an error and returns -1. Additionally, a new method get_inverted_index_file_size() is introduced in SegmentWriter which aggregates the inverted index file sizes of all the column writers.	2023-06-14 17:18:20 +08:00
Xin Liao	dd5b82fe00	[Enhancement](merge-on-write) optimize contains_agg when calculate delete bitmap (#20762 )	2023-06-14 16:25:11 +08:00
lihangyu	0f470fec0e	[Bug](topn opt) Fix Two-Phase read when some rowset swept (#20732 ) * [Bug](topn opt) Fix Two-Phase read when some rowset swept If this is a Two-Phase read query, and we need to delay the release of Rowset by row->update_delayed_expired_timestamp() to expand the lifespan of rowsets. This is necessary to avoid data loss during the second phase reading, where some stale rowsets may be swept and result in missing data.	2023-06-14 15:46:29 +08:00
Xinyi Zou	f2025b9eed	[fix](memory) before compaction run, check memory exceed limit #20782	2023-06-14 14:20:48 +08:00
plat1ko	9b4b0d4bf9	[fix](cooldown) Fix bug when cooldown a dropped tablet (#20750 )	2023-06-14 09:42:55 +08:00
Xin Liao	fd97587aff	[fix](merge-on-write) fix the merged rows is not equal to missed rows when do cumulative compaction (#20754 )	2023-06-13 22:18:59 +08:00
Pxl	9244cb6553	[Chore](runtime-filter) do not make query fail when rf publish failed (#20742 ) do not make query fail when rf publish failed	2023-06-13 18:23:46 +08:00
airborne12	ad2f1b5647	[Update](clucene) synchronize clucene version to address PFOR adaptation issue (#20736 )	2023-06-13 18:04:48 +08:00
TengJianPing	feb21fc9e9	[fix](group_concat) use default seperator ',' instead of ', ' for group_concat, to be consistant with mysql (#20741 )	2023-06-13 17:20:29 +08:00
lihangyu	2dddab03a1	[compatibility](schema cache) ensure schema version when using schema cache (#20729 ) When FE is old version, be is new version, issue a schema change(add column) and then query, old version of FE query without schema version could result in reading stale schema from schema cache	2023-06-13 15:19:26 +08:00
Mingyu Chen	4b15185e25	[improvement](hdfs) add parquet footer cache and hdfs file handle cache (#20544 ) 1. Add hdfs file handle cache for hdfs file reader Copied from Impala, `https://github.com/apache/impala/blob/master/be/src/util/lru-multi-cache.h`. (Thanks for the Impala team) This is a lru cache that can store multi entries with same key. The key is build with {file name + modification time} The value is the hdfsFile pointer that point to a certain hdfs file. This cache is to avoid reopen same hdfs file mutli time, which can save query time. Add a BE config `max_hdfs_file_handle_cache_num` to limit the max number of file handle cache, default is 20000. 2. Add file meta cache The file meta cache is a lru cache. the key is {file name + modification time}, the value is the parsed file meta info of the certain file, which can save the time of re-parsing file meta everytime. Currently, it is only used for caching parquet file footer. The test show that is cache is hit, the `FileOpenTime` and `ParseFooterTime` is reduce to almost 0 in query profile, which can save time when there are lots of files to read.	2023-06-13 15:13:57 +08:00
Pxl	e010fa8d4f	[Chore](runtime filter) remove runtime filter ready_for_publish/publish_finally (#20593 )	2023-06-13 11:20:49 +08:00
lexluo09	57656b2459	[Enhancement](java-udf) java-udf module split to sub modules (#20185 ) The java-udf module has become increasingly large and difficult to manage, making it inconvenient to package and use as needed. It needs to be split into multiple sub-modules, such as : java-commom、java-udf、jdbc-scanner、hudi-scanner、 paimon-scanner. Co-authored-by: lexluo <lexluo@tencent.com>	2023-06-13 09:41:22 +08:00
HappenLee	51bbf17786	[Refactor](Profile) Add and refactor the join profile (#20693 )	2023-06-13 09:06:51 +08:00
Qi Chen	73ad885e19	[Feature][Fix](multi-catalog) Implements transactional hive full acid tables. (#20679 ) After supporting insert-only transactional hive full acid tables #19518, #19419, this PR support transactional hive full acid tables. Support hive3 transactional hive full acid tables. Hive2 transactional hive full acid tables need to run major compactions.	2023-06-13 08:55:16 +08:00
Pxl	5e3a96d605	[Bug](pipeline) fix memory leak because pipeline shared ptr not release #20710	2023-06-13 08:50:34 +08:00

1 2 3 4 5 ...

4771 Commits