doris

Author	SHA1	Message	Date
Gabriel	a3714981fd	[Bug](schema change) Fix bug for vectorized schema change (#11652 )	2022-08-10 21:42:51 +08:00
zhannngchen	70b39475cf	[fix](scanner) delete predicates might be inconsistent with rowset readers (#11598 )	2022-08-10 19:40:54 +08:00
Jerry Hu	c8418d13b5	[improvement](config)Use session variable to replace configuration for 'enable_function_pushdown' (#11641 )	2022-08-10 19:25:02 +08:00
Jerry Hu	0291f84a9e	[fix](like-predicate) Add missing functions in LikeColumnPredicate (#11631 )	2022-08-10 15:03:14 +08:00
caiconghui	71d9b383d4	[Enhancement](hdfs) Support loading hdfs config for be from hdfs-site.xml (#11634 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-08-10 14:49:50 +08:00
Xin Liao	aaaf6915e4	[feature-wip](unique-key-merge-on-write) fix rowid conversion ut that may create a directory under an incorrect path (#11628 )	2022-08-10 08:17:47 +08:00
starocean999	601f28dd90	[fix](regexpr)regexpr functions' contexts should be THREAD_LOCAL (#11595 )	2022-08-10 06:58:24 +08:00
camby	01e4522612	[fix]collect_list/collect_set without GROUP BY for NOT NULL column (#11529 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-08-09 20:49:37 +08:00
carlvinhust2012	df47b6941d	[feature-wip](array-type) support the array type in reverse function (#11213 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-08-09 20:49:09 +08:00
Tiewei Fang	169996d8e4	[feature](information_schema) add `rowsets` table into information_s… (#11266 ) * [feature](information_schema) add 'segments' table into information_schema	2022-08-09 18:15:54 +08:00
Kang	f9b151744d	optimize topn query if order by columns is prefix of sort keys of table (#10694 ) * [feature](planner): push limit to olapscan when meet sort. * if olap_scan_node's sort_info is set, push sort_limit, read_orderby_key and read_orderby_key_reverse for olap scanner * There is a common query pattern to find the latest time series data, e.g. SELECT * from t_log WHERE t>t1 AND t<t2 ORDER BY t DESC LIMIT 100 If the ORDER BY columns are a prefix of the table's sort key, the query can be greatly optimized to read much less data instead of reading all data between t1 and t2. Because the ORDER BY columns and the table's sort key share the same order, it is enough to read LIMIT N rows from each related segment and merge them. 1. set read_orderby_key to true for read_params and _reader_context if olap_scan_node's sort info is set. 2. set read_orderby_key_reverse to true for read_params and _reader_context if is_asc_order is false. 3. rowset reader forces a merge read of segments if read_orderby_key is true. 4. block reader and tablet reader force a merge read of rowsets if read_orderby_key is true. 5. for ORDER BY DESC, read and compare in reverse order 5.1 segment iterator reads backward using a new BackwardBitmapRangeIterator and reverses the result block before returning it to the caller. 5.2 VCollectIterator::LevelIteratorComparator and VMergeIteratorContext return the opposite result for _is_reverse order in their compare functions. Co-authored-by: jackwener <jakevingoo@gmail.com>	2022-08-09 09:08:44 +08:00
pengxiangyu	b44c47fc10	[fix] (remote storage) fix bug for storage policy (#11597 )	2022-08-09 09:05:48 +08:00
Gabriel	ed7f7dead9	[Refactor](push-down predicate) Derive push-down predicate from vconjuncts (#11468 )	2022-08-08 19:19:26 +08:00
yixiutt	0a5fd99d02	[feature-wip](unique-key-merge-on-write) speed up publish_txn (#11557 ) In our original design, we calculated the delete bitmap in publish_txn. This operation costs too much time because it loads segment data and looks up row keys in previous rowsets and segments. Publish version tasks must also run in order, so this leads to timeouts in publish_txn. In this PR, we separate the delete_bitmap calculation into two parts. One part is done when flushing the memtable, so this work can run in parallel. The final delete_bitmap is then calculated in publish_txn: get the set of rowset_ids that should be included and remove rowsets that have been compacted. The rowset difference between memtable flush and publish_txn is really small, so publish_txn becomes very fast. In our test, publish_txn costs about 10ms. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-08 18:57:55 +08:00
lihangyu	9349746987	[Fix](stream-load-json) fix VJsonReader::_write_data_to_column invalid column type cast when meet null (#11564 ) column_ptr will be a non-nullable column pointer after `column_ptr = &nullable_column->get_nested_column()`, so we should not cast column_ptr to ColumnNullable any more	2022-08-08 15:57:39 +08:00
Gabriel	87f56914e9	[Improvement](debug message) add necessary info to DCHECK message (#11586 )	2022-08-08 15:54:09 +08:00
Ashin Gau	37d1180cca	[feature-wip](parquet-reader)decode parquet data (#11536 )	2022-08-08 12:44:06 +08:00
Pxl	2cd3bf80dc	[bugfix](schema change)fix core dump on vectorized_alter_table (#11538 )	2022-08-08 10:45:28 +08:00
Xin Liao	1e6a3610a7	[feature-wip](unique-key-merge-on-write) optimize rowid conversion and add ut (#11541 )	2022-08-08 10:41:44 +08:00
slothever	e8a344b683	[feature-wip](parquet-reader) add predicate filter and column reader (#11488 )	2022-08-08 10:21:24 +08:00
yixiutt	bd4048f8fb	[enhancement](compaction) add idle schedule and max_size limit for base compaction (#11542 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-07 16:21:57 +08:00
slothever	95753ec868	[feature](parquet-reader) add group filter util (#11533 ) * [feature-wip](parquet-reader) add group filter util Co-authored-by: jinzhe <jinzhe@selectdb.com>	2022-08-05 14:02:48 +08:00
yiguolei	321107cb40	[refactor](schema change) Using tablet schema shared ptr instead of raw ptr (#11475 ) * Using tabletschema shared ptr instead of raw ptrs Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-05 11:04:38 +08:00
huangzhaowei	6eb8ac0ebf	[feature-wip][multi-catalog]Support caseSensitive field name in file scan node (#11310 ) * Impl case sensitive in file scan node	2022-08-05 08:03:16 +08:00
Lightman	b5531c5caf	[BugFix](BE) fix condition index doesn't match (#11474 )	2022-08-05 07:57:18 +08:00
starocean999	092a394782	[improvement](agg)limit the output of agg node (#11461 )	2022-08-05 07:53:55 +08:00
Gabriel	75fc830573	[Bug](date function) fix wrong year for format '%x' (#11520 )	2022-08-05 06:22:22 +08:00
Gabriel	b6118acc19	[feature](functions) support `add_months` on vectorized engine (#11518 )	2022-08-04 21:39:10 +08:00
Xinyi Zou	346fdeeee0	[fix](ut) Fix BE UT BetaRowsetTest failed (#11500 )	2022-08-04 17:57:57 +08:00
Ashin Gau	aed0282046	[feature-wip](parquet-reader)get compressed parquet page data (#11493 )	2022-08-04 17:44:52 +08:00
Pxl	ec3c911f97	[Feature][Materialized-View] support materialized view on vectorized engine (#10792 )	2022-08-04 14:07:48 +08:00
pengxiangyu	a943adac1a	[feature](cache) Add FileCache for RemoteFile (#11186 ) Add FileCache for RemoteFile, it will be opened in StoragePolicy. Cold data in remote files will be downloaded to local cache files.	2022-08-04 10:57:32 +08:00
Xinyi Zou	ecbf87d77b	[bugfix](memtracker)fix exceed memory limit log (#11485 )	2022-08-04 10:22:20 +08:00
Pxl	ce68d24e95	[Bug](function) fix current_date not equal to curdate (#11463 )	2022-08-04 09:25:50 +08:00
weizuo93	838fdc1354	[Bug](httpserver) Fix bug that http server should not be stopped in destructor if it is not running Co-authored-by: weizuo <weizuo@xiaomi.com>	2022-08-03 19:44:46 +08:00
Gabriel	e1b878fe10	[Improvement](datev2) apply time LUT to datev2/datetimev2 (#11401 )	2022-08-03 17:15:09 +08:00
Gabriel	5b9b6c9065	[WIP](decimalv3) WIP (#11443 ) * [feature-WIP](decimalv3) fix some bugs of decimalv3	2022-08-03 11:21:36 +08:00
HappenLee	77d82bb292	[Bug](MaterializedView) Fix bug of light schema change do not set right unique id cause MV coredump (#11396 )	2022-08-03 11:21:28 +08:00
TengJianPing	b892dfdbbd	[Improvement](regresstion test) Fix regression test case failure for ASAN build (#11400 ) * [improvement](regresstion test) Improve performance of ASAN build by using -O3 and fix mem limit exceed error for nereids test cases * exclude tpcds_sf1 q72 for ASAN build because this query takes too long time	2022-08-03 11:19:50 +08:00
Adonis Ling	573ebf235e	[enhancement](build) Support customizing extra compile flags (#11444 )	2022-08-03 11:02:17 +08:00
slothever	1b4d6a620a	(feature-wip)[parquet-reader] support page index serde (#11415 )	2022-08-03 10:36:06 +08:00
yiguolei	de4466624d	[refactor](schema change)Remove delete from sc (#11441 ) * not need call delete handler to filter rows since they are filtered in rowset reader * need not call delete eval in schema change and remove related code Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-03 03:29:41 +08:00
Jerry Hu	842a5b8e24	[refactor](agg) Abstract the hash operation into a method" (#11399 )	2022-08-02 17:27:19 +08:00
awakeljw	1db8a2d136	[bugfix](runtimefilter)fix runtimefilter access violation when stub is nullptr (#11180 )	2022-08-02 16:57:17 +08:00
HappenLee	38ffe685b5	[Bug](ODBC) fix vectorized null value error report in odbc scan node (#11420 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-08-02 15:44:12 +08:00
weizuo93	f730a048b1	[feature-wip](load) Support single replica load (#10298 ) During the load process, the same operations, such as sort and aggregation, are performed on all replicas, and these are resource-intensive. Concurrent data loads would consume a lot of CPU and memory. It is better to perform the write process (writing data into the MemTable and then flushing it) on a single replica and synchronize the data files to the other replicas before the transaction finishes.	2022-08-02 11:44:18 +08:00
Mingyu Chen	abbf75d302	[doc][refactor](metrics) Reorganize FE and BE metrics and add document (#11307 )	2022-08-02 11:34:06 +08:00
huangzhaowei	0ac5228c05	[feature-wip][multi-catalog]Support prefetch for orc file format (#11292 ) Refactor the prefetch code in parquet and support prefetch for orc file format	2022-08-02 11:01:15 +08:00
muyizi	bd6e3cf132	[improvement]lock_times_limit (#11404 ) Co-authored-by: songning03 <songning03@meituan.com>	2022-08-02 10:59:58 +08:00
Ashin Gau	44a1a20e65	[feature-wip](parquet-reader)parse parquet schema (#11381 ) Analyze schema elements in parquet FileMetaData, and generate the hierarchy of nested fields. For example: 1. primitive type ``` // thrift: optional int32 <column-name>; // sql definition: <column-name> int32; ``` 2. nested type ``` // thrift: optional group <column-name> (LIST) { repeated group bag { optional group array_element (LIST) { repeated group bag { optional int32 array_element } } } } // sql definition: <column-name> array<array<int32>> ```	2022-08-02 10:56:13 +08:00


2552 Commits