
4996 Commits

Author SHA1 Message Date
ed1e130ef6 [BUGFIX] fix wrong children quantity in debug string (#10348) 2022-06-23 09:10:30 +08:00
274a0f2603 [fix] do not read seq column when reading a compacted rowset (#10344)
SEQ_COL is used on unique-key tables to order data within one transaction (rowset).
When there is only one rowset and it has been compacted, its rows are already sorted
and rows with the same key have been resolved by compaction, so the scanner sets direct_mode
so that the read iterator skips sorting and aggregating, and the iterators do not need SEQ_COL.
However, init_return_columns adds SEQ_COL to return_columns, which is passed to SegmentIterator.
SegmentIterator is then called via get_next with a block that lacks SEQ_COL, and it
creates the columns that are in return_columns but not in the block. SEQ_COL is nullable, and
SegmentIterator does not handle that, so a core dump happens.

In this case, SegmentIterator does not actually need to read SEQ_COL.
When SEQ_COL is really needed, the iterators create the SEQ_COL column in the block,
so SegmentIterator does not need to create SEQ_COL at all.
2022-06-23 08:44:43 +08:00
46d20818f1 [fix][vectorized] Fix the bug of window function result not matching plan nullable (#10340)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-23 08:40:12 +08:00
a2b5020375 [fix](partition-cache) fix result may not write when enable partition cache (#10319)
Fix the result possibly not being written when partition cache is enabled.
2022-06-22 23:42:46 +08:00
2bd8b0713a [feature](Nereids): cost and stats; (#10199)
feature: cost and stats;

+ Add cost model data struct.
+ Add operator visitor for calculating cost.
+ Add plan context.
2022-06-22 22:46:22 +08:00
8204b41d88 [feature](nereids) temporarily disable exploration jobs in cascades (#10290)
Only do implementation in cascades.

To get the SSB test running, the explore-rule process in cascades is temporarily commented out at this stage.
It needs to be added back in the future.
2022-06-22 20:05:01 +08:00
2a0ac05ce7 [fix] Fix duplicate code for PropertyAnalyzer.analyzeDataProperty() (#10190)
PropertyAnalyzer.analyzeDataProperty(Map<String, String> properties, final DataProperty oldDataProperty) does not behave as it should.
The oldDataProperty parameter is the existing DataProperty; properties should replace only some of its members.
If properties does not contain a member, the old value should be kept rather than reset to the default.
modifyPartitionsProperty() calls analyzeDataProperty() but then creates a new DataProperty again, which is duplicate code.
2022-06-22 15:28:04 +08:00
98b3306e05 [docs]add key words for helps (#10263)
* add key words for helps
2022-06-22 14:41:15 +08:00
b913d59560 [docs] aes docs fix (#10251)
* fix aes docs

* update keywords inside aes.md

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-22 14:40:40 +08:00
200557052a [BUGFIX] wrong answer with with as + two phase agg (#10303) 2022-06-22 14:39:39 +08:00
994feb9dbe [bugfix][compaction][vectorized]fix compaction OOM (#10289) 2022-06-22 14:38:30 +08:00
f7ed2817ad [fix] [ubsan] Fix TCMalloc Hook deadlocks when ThreadContext is initialized (#10310) 2022-06-22 14:37:48 +08:00
0361c5b0cf [TEST] use DB in config file instead of hard coding (#10321) 2022-06-22 14:37:15 +08:00
239ef84835 [Minor](nereids): Get plan directly (#10270) 2022-06-22 14:07:28 +08:00
531c9abac8 [feature](nereids) Add ssb related expressions and PlanNode (#10227)
Add related expressions and AggPlan and SortPlan.

This PR includes the addition of the following expression types:
Arithmetic/BetweenPredicate/CompoundPredicate/FunctionCall.

PlanNode:
LogicalAggregation/LogicalSort

Add a ut to verify related expressions and node parsing
2022-06-22 11:54:28 +08:00
e49fa6075f [doc] fix wrong number (#10305)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-06-22 08:58:08 +08:00
60d43dc730 [docs]update the correct link (#10318)
Co-authored-by: smallhibiscus <844981280>
2022-06-22 08:51:01 +08:00
5248b21a01 [fix UT] for pr10249 evaluate interface changed (#10269)
* UT fix for pr10249: the evaluate interface changed, but the UT was not updated.

* fix be code format

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-22 08:49:53 +08:00
83803d1968 [feature](nereids) implementation rules. (#10194) 2022-06-21 23:34:08 +08:00
b3b054c9ff [fix](schema-change) fix schema_change failed because of empty origin_column_name passed and fix ut fail on isFullyQualified (#10281) 2022-06-21 19:53:58 +08:00
47dba440d0 Revert "[feature-wip](multi-catalog) add CatalogPrivTable to support unified authority management of datalake (#10246)" (#10297)
This reverts commit 41cb4c8f9cf1b58fb33a1e46d2b7db803a15a59f.
2022-06-21 15:55:15 +08:00
d056f5873b [Fix](compile) Fix compilation errors reported by clang (#10221)
Fix the failure to build the codebase with clang.
2022-06-21 11:04:22 +08:00
24732126d2 [feature-wip](Nereids)Add bind slot reference rules and support any/multi/group/multigroup with predicate pattern (#10241)
Add bind slot reference rules and support any/multi/group/multigroup with predicate pattern
2022-06-21 10:51:46 +08:00
41cb4c8f9c [feature-wip](multi-catalog) add CatalogPrivTable to support unified authority management of datalake (#10246)
Supported:
1. Change FeMetaVersion to 111, compatible with upgrade from 110.
2. Add catalog level privileges, and degrade global level privileges to catalog level if FeMetaVersion < 111.
3. Support 'show all grants', 'show roles' statement.
4. Previous version of SQL syntax.

Todo:
1. three-segment format catalog.database.table in SQL syntax.
2. User document for the unified authority management of datalake.
3. LDAP services to provide authentication.
2022-06-21 10:26:50 +08:00
7a2134d7cc [TLP](step-2) update .asf.yaml (#10273) 2022-06-21 09:43:56 +08:00
d18808d0eb [docs] Add some user cases to user list (#10256)
* [docs] Add user cases to user list
2022-06-21 09:25:57 +08:00
84f57398d9 [Improvement] set debug string for VExpressions (#10166) 2022-06-21 07:43:25 +08:00
f5e5880fb6 [Improvement] make expression for template argument a constexpr (#10268) 2022-06-21 07:42:02 +08:00
dd39287a0a [fix](compile) Fix compile error in external file scan node (#10274) 2022-06-20 22:39:32 +08:00
5974e452bc [enhancement] CRC32 instructions compatible arm arch (#10261)
The performance of some CPUs that do not implement CRC instructions is particularly poor
2022-06-20 17:49:06 +08:00
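As an illustration of the CRC32 commit above (5974e452bc), the following is a minimal sketch, not the Doris implementation, of how a CRC-32C routine can use hardware instructions when the compiler advertises them and fall back to a slow bitwise loop otherwise. The function names `crc32c` and `crc32c_sw` are hypothetical; the choice of the Castagnoli polynomial is an assumption, since the commit message does not say which CRC variant is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   // __crc32cb (ARMv8 CRC extension)
#elif defined(__SSE4_2__)
#include <nmmintrin.h>  // _mm_crc32_u8 (x86 SSE4.2)
#endif

// Portable bitwise CRC-32C (reflected polynomial 0x82F63B78).
// Slow, but it works on CPUs without CRC instructions.
uint32_t crc32c_sw(const uint8_t* data, size_t len) {
    assert(data != nullptr || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical dispatcher: hardware path when the target supports it,
// software path otherwise. The choice is made at compile time.
uint32_t crc32c(const uint8_t* data, size_t len) {
#if defined(__ARM_FEATURE_CRC32) || defined(__SSE4_2__)
    assert(data != nullptr || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
#if defined(__ARM_FEATURE_CRC32)
        crc = __crc32cb(crc, data[i]);
#else
        crc = _mm_crc32_u8(crc, data[i]);
#endif
    }
    return crc ^ 0xFFFFFFFFu;
#else
    return crc32c_sw(data, len);
#endif
}
```

Both paths compute the same value; for the standard check input "123456789", CRC-32C is 0xE3069283. A production version would process 4 or 8 bytes per instruction and might select the path at run time rather than at compile time.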
75719bca92 [doc](website)Remove incubator prefix and add graduate note (#10257) 2022-06-20 17:48:29 +08:00
c3743ec9aa [enhancement] optimize 2 cases in seg_iter: all/none rows passed predicate (#10259)
* [enhancement] optimize 2 cases: all/none rows passed predicate in seg_iter.

* format
2022-06-20 17:47:52 +08:00
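The two fast paths named in commit c3743ec9aa above can be pictured with a small sketch. This is not the segment iterator's code: `copy_selected` is a hypothetical helper, and it assumes the selection vector holds distinct row indices in ascending order, so that "every row selected" means the input can be copied as-is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy the rows listed in sel[0..selected) out of input.
std::vector<int32_t> copy_selected(const std::vector<int32_t>& input,
                                   const std::vector<uint16_t>& sel,
                                   size_t selected) {
    assert(selected <= input.size() && selected <= sel.size());
    if (selected == 0) {
        return {};  // no row passed the predicate: nothing to copy
    }
    if (selected == input.size()) {
        return input;  // every row passed: bulk copy, sel is never consulted
    }
    // General case: gather row by row through the selection vector.
    std::vector<int32_t> out;
    out.reserve(selected);
    for (size_t i = 0; i < selected; ++i) {
        out.push_back(input[sel[i]]);
    }
    return out;
}
```

The point of the two early returns is that the common extremes avoid the per-row indirection through `sel` entirely.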
8a8d24b4a6 [feature-wip](multi-catalog) External file scan node (#9973) 2022-06-20 17:46:24 +08:00
8531dcb885 [refactor](nereids) Abstract interface of statistics framework for new optimizer reuse (#10240)
The statistics framework could not previously be reused by the new optimizer, so this change abstracts some interfaces to make it reusable.

1. Make Slot extend Id
2. Add new interfaces: ExprStats, PlanStats
3. Move the definition of PlanNode.NodeType to the statistics sub-directory
2022-06-20 15:16:06 +08:00
57327e6236 [improvement]Separate input and output parameters in ColumnPredicate (#10249)
```cpp
for (uint16_t i = 0; i < *size; ++i) {
	// some code here
}
```
The value of `*size` is re-read on every evaluation of the loop condition, which also prevents possible vectorization.
2022-06-20 15:04:57 +08:00
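To make the point of commit 57327e6236 above concrete, here is a sketch of the two signature shapes. It is not the actual `ColumnPredicate` interface: both function names and the "less than a bound" predicate are hypothetical. In the first version the compiler generally has to reload `*size` on each iteration, because the stores through `sel` (also `uint16_t*`) might alias it; in the second, the count is a by-value input and the new count is the return value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical "before" shape: size is both input and output.
void keep_less_than_inout(const int32_t* values, uint16_t* sel,
                          uint16_t* size, int32_t bound) {
    assert(size != nullptr);
    uint16_t kept = 0;
    for (uint16_t i = 0; i < *size; ++i) {  // *size is read on every test
        uint16_t row = sel[i];
        sel[kept] = row;  // this store may alias *size
        kept += values[row] < bound ? 1 : 0;
    }
    *size = kept;
}

// Hypothetical "after" shape: input count by value, output count returned.
uint16_t keep_less_than(const int32_t* values, uint16_t* sel,
                        uint16_t size, int32_t bound) {
    uint16_t kept = 0;
    for (uint16_t i = 0; i < size; ++i) {  // size lives in a register
        uint16_t row = sel[i];
        sel[kept] = row;
        kept += values[row] < bound ? 1 : 0;
    }
    return kept;
}
```

Both functions compact `sel` in place so that its first `kept` entries are the rows whose value is below `bound`; only the way the row count travels differs.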
087fc596b1 [feature] add remote storage policy config for create table properties (#10159)
Add remote storage policy config for create table properties. It sets the storage policy for the table and its partitions in `CREATE TABLE` and `ALTER TABLE`.
The policy is used when a partition is migrated from local to remote storage.
Syntax:
1.
`CREATE TABLE TblPxy1
(...)
ENGINE=olap
DISTRIBUTED BY HASH (aa) BUCKETS 1
PROPERTIES(
    "remote_storage_policy" = "testPolicy3"
);`
2.
`ALTER TABLE TblPxy01 SET ("remote_storage_policy" = "testPolicy3");`
3.
`ALTER TABLE TblPxy01 MODIFY PARTITION p2 SET ("remote_storage_policy" = "testPolicy3");`
2022-06-20 12:42:23 +08:00
acf07d8966 [test] Add bitmap_intersect and schema_change test for regression test. (#9854) 2022-06-20 09:51:31 +08:00
588634ddf6 [feature] support runtime filter on vectorized engine (#10103) 2022-06-20 09:46:38 +08:00
2f37e108e3 [feature-wip](array-type) add ArrayType support for FeFunctions (#10041)
FEFunctionSignature does not support ArrayType as an argument, so the following SQL fails:
`> select array_contains([1,2,3], 1);`
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: org.apache.doris.catalog.ArrayType cannot be cast to org.apache.doris.catalog.ScalarType
2022-06-20 09:35:06 +08:00
185de4dd43 [docs]update develop document (#10242) 2022-06-20 09:24:49 +08:00
ecdf8bcfdd [comments]Replace some Chinese comments in product code (#10243) 2022-06-20 09:24:19 +08:00
1c9ce29440 [improvement]Avoid frequently allocating and releasing flags in InListPredicate (#10248) 2022-06-20 09:08:02 +08:00
ab29ad2144 [typo] Fix typos in comments (#10247) 2022-06-20 09:06:29 +08:00
9a1f1c3864 [improvement](variables) change session variable when set global variable (#10238)
Currently, setting a variable with the `global` keyword does not affect
the value of the current session variable, which often confuses users.

This CL mainly changes:

1. Change the session variable when the global variable is set
2022-06-20 09:05:50 +08:00
f728fd4933 [fix](auth) Authentication exception when the name of database or table contains an underscore in grant statement. (#10213) 2022-06-19 22:20:01 +08:00
67f341f44e [TLP](step-1) Remove incubator prefix (#10230)
Remove some `incubator-` prefixes in the source code.
The documentation is not modified; that will be done in the next PR.
2022-06-19 19:34:52 +08:00
e09066c7ee [Improvement] delete deprecated config in document and regression test (#10231) 2022-06-19 18:16:59 +08:00
6ad024a2bf [fix] (mem tracker) Refactor memtable mem tracker, fix flush memtable DCHECK failed (#10156)
1. Add memory leak detection for the `DeltaWriter` and `MemTable` mem trackers.
2. Make the memtable mem tracker virtual to avoid frequent recursive consumption of the parent tracker.
3. Stop the memtable flush thread from attaching the memtable tracker, so that the memtable mem tracker is completely accurate.
4. Set `memory_verbose_track=false`. At present, frequently switching the thread mem tracker has a performance problem.
      - The mem tracker is held as a shared_ptr in thread-local storage. On each switch, the atomic use_count in the shared_ptr of the current tracker is decremented and the use_count of the tracker being switched to is incremented; frequent changes to the same tracker shared_ptr from multiple threads are slow.
      - TODO: 1. Reduce unnecessary thread mem tracker switches. 2. Consider using raw pointers for the mem tracker in thread-local storage.
2022-06-19 16:48:42 +08:00
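The cost described in item 4 of commit 6ad024a2bf above, and the raw-pointer idea from its TODO, can be sketched as follows. This is an illustration only: `MemTracker` here is an empty stand-in, and `tls_tracker`, `tls_tracker_raw`, `switch_tracker_shared`, and `switch_tracker_raw` are hypothetical names, not Doris identifiers.

```cpp
#include <cassert>
#include <memory>

struct MemTracker {  // hypothetical stand-in for the real tracker
    long consumed = 0;
};

thread_local std::shared_ptr<MemTracker> tls_tracker;  // owning slot
thread_local MemTracker* tls_tracker_raw = nullptr;    // non-owning slot

// Copy-assigning a shared_ptr bumps the new tracker's reference count and
// drops the old one's. When many threads switch to the same tracker, they
// all contend on that one shared counter.
void switch_tracker_shared(const std::shared_ptr<MemTracker>& next) {
    assert(next != nullptr);
    tls_tracker = next;
}

// A raw pointer switch touches no reference count, but the caller must
// guarantee that the tracker outlives every thread that points at it.
void switch_tracker_raw(MemTracker* next) {
    assert(next != nullptr);
    tls_tracker_raw = next;
}
```

After `switch_tracker_shared(a)`, `a.use_count()` grows by one and the previously installed tracker's count shrinks by one; `switch_tracker_raw(a.get())` leaves every count unchanged. The trade-off is lifetime safety: the raw-pointer version is only sound if ownership is managed elsewhere.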
61d7724ab3 [tpch] Change all replication_num to 1 (#10244) 2022-06-19 10:42:04 +08:00
8439adad05 [doc] update array functions docs' location (#10226)
Move the docs about array functions to the correct directory,
because the docs directory has already been refactored.

```
docs/en/sql-manual/sql-functions/array-functions/    ===>
docs/en/docs/sql-manual/sql-functions/array-functions

```
```
docs/zh-CN/sql-manual/sql-functions/array-functions/    ===>
docs/zh-CN/docs/sql-manual/sql-functions/array-functions/
```
2022-06-19 10:40:40 +08:00