7941 Commits

Author SHA1 Message Date
edecc2e706 [feature-wip](inverted index) API for inverted index reader and syntax for fulltext match (#14211)
* [feature-wip](inverted index)inverted index api: reader

* [feature-wip](inverted index) Fulltext query syntax with MATCH/MATCH_ALL/MATCH_ALL

* [feature-wip](inverted index) Adapt to index meta

* [enhance] add more metrics

* [enhance] add fulltext match query check for column type and index parser

* [feature-wip](inverted index) Support applying the inverted index in compound predicates, except for leaf nodes of AND nodes
2022-12-30 21:48:14 +08:00
b23d068281 [refactor](remove-non-vec) Remove non vec load from memtable and delta writer (#15517)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-12-30 21:22:58 +08:00
aacd11336a [typo](docs)update java udf demo (#15521) 2022-12-30 21:12:34 +08:00
aeaa319203 [fix](fe)change session variable group_concat_max_len from int to long (#15515) 2022-12-30 20:45:44 +08:00
ec52907b06 [fix](index) fix wrong dcheck in indexed column writer (#15520) 2022-12-30 20:12:41 +08:00
8e58d92e77 [typo](docs) fix document info missing in SHOW-TABLETS.md (#15488) 2022-12-30 18:39:21 +08:00
084eec87ee [docs](docs)update en docs (#15470)
* Update basic-summary.md
2022-12-30 18:38:26 +08:00
a7895ba169 [feature](Nereids): Support variance_samp function. (#15500) 2022-12-30 17:32:06 +08:00
34d7eeb571 [doc](session variable) add doc content for adding variables called rewrite_or_to_in_predicate_threshold (#15513)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-12-30 17:11:45 +08:00
93a25e1af5 [fix](nereids) the project node is lost when creating PhysicalStorageLayerAggregate node (#15467) 2022-12-30 16:33:24 +08:00
08d4dcefff [typo](doc)data partition doc including en and zh-CN #15379
Co-authored-by: Chen Jinquan 陈金泉 (690) <chenjinq@haier.com>
2022-12-30 15:38:25 +08:00
dec1eb360c [fix](brokerload) be core dump caused by a null pointer when broker load reads an orc format file (#15460) 2022-12-30 15:37:33 +08:00
2f572ccc43 [fix](index) fix that the last element of each batch will be read repeatedly for binary prefix page (#15481) 2022-12-30 15:36:55 +08:00
9246e03932 [Enhancement](hdfs) make libhdfs3 compatible with hdfs2 server (#15497)
When the Doris BE calls getFileStatus on an HDFS2 server, libhdfs3 throws an exception because the permission code returned by the HDFS2 server is greater than 1<<12.
Bit 12 of the permission code is the aclBit, which has been deprecated in Hadoop 3, so we remove the check in libhdfs3, the same as the Hadoop 3 Java project does.
2022-12-30 15:36:39 +08:00
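The bit arithmetic behind the libhdfs3 change above can be sketched as follows. This is only an illustration in Python, not libhdfs3 code: the names `ACL_BIT`, `MODE_MASK`, `has_acl_bit`, and `mode_bits` are hypothetical, and the sketch assumes nothing beyond what the commit message states, namely that bit 12 of the permission code is the ACL bit.

```python
# Hypothetical sketch; these names do not come from libhdfs3.
ACL_BIT = 1 << 12        # bit 12, described in the commit message as aclBit
MODE_MASK = ACL_BIT - 1  # the 12 bits below it (0o7777)


def has_acl_bit(permission: int) -> bool:
    """Return True if bit 12 is set in the permission code."""
    return bool(permission & ACL_BIT)


def mode_bits(permission: int) -> int:
    """Return the permission code with bit 12 and anything above it cleared."""
    return permission & MODE_MASK
```

A permission code such as `0o10644` has bit 12 set, so a strict range check against `1 << 12` would reject it even though the lower bits (`0o644`) are an ordinary mode.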
2704651fde [fix](nereids) hll and bitmap type can't be used as order by and group by exprs (#15471)
hll, bitmap, array and quantile state type can't be used in order by, group by and some agg exprs.
2022-12-30 14:26:21 +08:00
5ec4e5586f [refactor]remove seek block in segmentIterator (#15413)
* remove seek block

* add reg test

Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-12-30 14:14:16 +08:00
520b6d7910 [Improvement](decimalv3) Add a config to check overflow for DECIMALV3 (#15463) 2022-12-30 14:02:24 +08:00
5db8b52441 [Fix](SparkLoad): fix the timeout aborted loadtasks are not cleaned up. (#15480)
Co-authored-by: spaces-x <weixiang06@meituan.com>
2022-12-30 14:02:00 +08:00
5c5b7a5c6f [Broker](bos) support baidu bos object storage for broker (#15448) 2022-12-30 12:39:10 +08:00
2339dcda05 [fix](icebergv2)update icebergv2 regression case (#15442)
update icebergv2 regression case
Co-authored-by: jinzhe <jinzhe@selectdb.com>
2022-12-30 12:24:26 +08:00
917b266799 [fix](planner) table valued function could not be used in subquery (#15496) 2022-12-30 10:01:25 +08:00
10be583e52 [chore](pipeline) optimize profile information (#15433) 2022-12-30 09:56:33 +08:00
2c8de30cce [optimize](multi-catalog) use dictionary encode&filter to process delete files (#15441)
**Optimize**
PR #14470 used `Expr` to filter delete rows so that they match the current data file,
but the rows in the delete file are [sorted by file_path then position](https://iceberg.apache.org/spec/#position-delete-files)
to optimize row filtering while scanning, so this PR removes `Expr` and uses binary search to filter delete rows.

In addition, delete files are likely to be dictionary-encoded, and it is time-consuming to decode the `file_path`
column into `ColumnString`, so this PR uses `ColumnDictionary` to read the `file_path` column.

After testing, the performance of iceberg v2's MOR is improved by 30%+.

**Fix Bug**
The lazy-read block may not have the filter column if the whole group is filtered by `Expr`
and the batch_eof is generated from the next batch.
2022-12-30 08:57:55 +08:00
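The binary-search idea described in the commit above can be sketched in a few lines. This is a minimal Python illustration, not the Doris implementation: the function name `deleted_positions` and the in-memory list of `(file_path, position)` tuples are assumptions made for the example. It relies only on the ordering the commit message cites, rows sorted by `file_path` and then by position.

```python
from bisect import bisect_left, bisect_right


def deleted_positions(delete_rows, data_file_path):
    """Return the deleted row positions for one data file.

    delete_rows is assumed to be a list of (file_path, position) tuples
    sorted by file_path and then position (hypothetical representation).
    """
    # (path,) sorts before every (path, position) tuple with the same path.
    lo = bisect_left(delete_rows, (data_file_path,))
    # (path, inf) sorts after every (path, position) tuple with the same path.
    hi = bisect_right(delete_rows, (data_file_path, float("inf")))
    return [position for _, position in delete_rows[lo:hi]]


rows = [
    ("a.parquet", 1),
    ("a.parquet", 7),
    ("b.parquet", 0),
    ("b.parquet", 3),
    ("c.parquet", 2),
]
print(deleted_positions(rows, "b.parquet"))
```

Because the rows are sorted, the matching range is found with two logarithmic searches instead of evaluating a predicate on every delete row.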
85c7c531f1 [vectorized](jdbc) support array type in jdbc external table (#15303) 2022-12-30 00:29:08 +08:00
edb9a3b58d [Bug](timediff) Fix wrong result for function timediff (#15312) 2022-12-30 00:28:51 +08:00
9a517d6a8f [DataType](Decimalv3) change the avg function scale of decimalv3 (#15445) 2022-12-30 00:27:51 +08:00
73f7ccb58f [typo](docs) fix document display error in SHOW-ALTER.md and SHOW-PARTITION-ID.md and SHOW-PARTITIONS.md (#15453) 2022-12-30 00:27:22 +08:00
3ff01ca799 [feature-wip](multi-catalog) support Iceberg time travel in external table (#15418)
For example
SELECT * FROM tbl FOR VERSION AS OF 10963874102873;
SELECT * FROM tbl FOR TIME AS OF '1986-10-26 01:21:00';
2022-12-30 00:25:21 +08:00
6c847daba0 [Feature](Nereids) Support grouping set for materialized index. (#15383)
This PR adds support for materialized index selection when the query has grouping sets.
2022-12-29 23:17:02 +08:00
dda505487c [fix](nereids) SimplifyArithmeticRuleTest ut failed (#15486)
This PR removes typeCoercion on the expected expr in ExpressionRewriteTestHelper, because we should not rewrite the expected expr at all; doing so would change it unexpectedly.
2022-12-29 22:53:27 +08:00
bb305aa572 [chore](badges) Remove daily test badges for origin engine (#15482)
The code of the origin engine will be removed later and we have already stopped the daily test for it, so we should remove these badges from the home page.
2022-12-29 21:25:15 +08:00
c54c2f8035 [fix](statistics) fix npe when __internal_schema not created (#15464) 2022-12-29 21:24:33 +08:00
9b371f6b0b [fix](web ui) fix fe web ui (#14887) 2022-12-29 21:19:44 +08:00
79113b0cd1 [Fix](storage) Fix bug that cooldown time is wrong (#15444)
The cooldown time is wrong for data on SSD, because the cooldown time for all tables/partitions
is calculated only once, when class `DataProperty` is loaded, and cannot be updated later.
This patch ensures that the cooldown time for each table/partition is calculated in real time
when the table/partition is created.
Co-authored-by: weizuo <weizuo@xiaomi.com>
2022-12-29 21:01:36 +08:00
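The class of bug described in the commit above, a value computed once at class load instead of at creation time, can be illustrated with a short Python sketch. It is not the Doris `DataProperty` code: the class names `StaleDataProperty` and `FreshDataProperty`, the `COOLDOWN_SECONDS` interval, and the injectable `now` parameter are all hypothetical and exist only for illustration.

```python
import time

COOLDOWN_SECONDS = 30 * 24 * 3600  # hypothetical interval, not taken from Doris


class StaleDataProperty:
    # Evaluated exactly once, when the class body runs ("class load").
    COOLDOWN_AT = time.time() + COOLDOWN_SECONDS

    def __init__(self):
        # Every instance shares the same timestamp, however late it is created.
        self.cooldown_at = StaleDataProperty.COOLDOWN_AT


class FreshDataProperty:
    def __init__(self, now=None):
        # Evaluated for each instance, at the moment it is created.
        created = time.time() if now is None else now
        self.cooldown_at = created + COOLDOWN_SECONDS
```

With `StaleDataProperty`, two objects created hours apart report the same cooldown time; with `FreshDataProperty`, the cooldown time follows each object's creation time.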
e651a9bb11 [feature](nereids) add variance function for nereids (#15370)
Support the variance function. Currently, it does not support the decimalV3 type.
2022-12-29 18:33:52 +08:00
43c8e7b465 [chore](thirdparty) Support cleaning extracted data before building them (#15458)
Currently, we may fail to build the third-party libraries if we keep the outdated extracted data.

Consider the following scenario: Bob adds patches to some libraries, and Alice updates the codebase and builds
the third-party libraries. If Alice keeps the outdated extracted data, she will fail to build the third-party libraries
because the patches are not applied due to the outdated `patched_marks`.

This PR introduces a way to clean the outdated data before building the third-party libraries.
2022-12-29 16:01:23 +08:00
c22ba8e160 [Bug](Decimalv3) coredump of decimalv3 multiply (#15452) 2022-12-29 15:35:17 +08:00
89e2fb4301 [docs](readme)update the readme.md (#15465) 2022-12-29 14:51:17 +08:00
25b257e37c [enhancement](session var) variable to control whether to rewrite OR to IN (#15437) 2022-12-29 14:50:32 +08:00
e2603ca883 [fix](docs) fix some docs about stream load and select. (#15372)
* [fix](docs) fix some docs about stream load and select.

* update
2022-12-29 14:50:06 +08:00
d95be84629 [enhancement](profile) add session variable parallel_fragment_exec_instance_num to profile (#15457) 2022-12-29 14:46:07 +08:00
657f3e6318 [fix](pipeline) disable sharing hashtable for broadcast join for pipeline engine (#15432) 2022-12-29 14:19:57 +08:00
2ae28ea9dd [typo](docs)fix-doc #15438 2022-12-29 14:19:24 +08:00
4179ea31bd [typo](docs) fix typo in SHOW-ALTER.md and SHOW-LOAD-WARNINGS.md (#15431) 2022-12-29 14:19:05 +08:00
4157d8c0a6 [Fix](Regression_test)add order by to increase test stability #15439 2022-12-29 14:18:51 +08:00
298c0a2391 [typo](docs)fix be dynamic configuration doc #15443 2022-12-29 14:18:14 +08:00
f5b4faf682 fix compile doc (#15454) 2022-12-29 14:17:53 +08:00
c5277bb4d6 [doc](label)update the label (#15455) 2022-12-29 14:17:40 +08:00
7ab6ea684b [Improvement](meta) hide password of show catalog xxx stmt and for es catalog (#15410)
* [Improvement](meta) hide password of show catalog xxx

* hide es password in show create ctlg and show ctlg xx stmt
2022-12-29 14:16:32 +08:00
5b09d27d54 [feature-wip](nereids) Made decimal in nereids more complete (#15087)
1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriter to keep expression type consistent between operators
3. Support the conversion between float type and decimal type.

After this PR, the cases below can be executed normally, as with the legacy optimizer:
  use test_query_db;
  select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
  select avg(k9) as a from test group by k1 having a < 100.0 order by a;
2022-12-29 13:01:47 +08:00