18263 Commits

Author SHA1 Message Date
e7e13bc338 [optimize](array function) array_apply function vectorized compute column_filter loop (#17687) 2023-03-19 10:18:09 +08:00
5f2b68df24 [fix](regression-test) fix unstable regression test cases found in p0 (#17900) 2023-03-19 10:11:57 +08:00
0de7f9787b [doc](typo) Fix a few character errors (#17911) 2023-03-19 09:09:02 +08:00
c5c89f3016 [Improve](hana catalog)Currently logged in users should only see the schemas they can access (#17918)
In the case of hana catalog, I think the current logged-in users should only see the schemas they can access.
2023-03-18 22:21:01 +08:00
d79da2f926 [Fix](parquet-reader) Fix dict filter not enabled. (#17882) 2023-03-18 22:16:37 +08:00
c95eb8a67f [enhancement] Function(create/drop) support the global operation (#16973) (#17608)
Support creating and dropping global functions.
Previously, a custom function could only be used within the database where it was created, not in other databases or catalogs. With many databases or catalogs, the function had to be created in each one separately.

## Problem summary

1. Add an optional GLOBAL keyword to the statements that create or drop a function:

CREATE [GLOBAL] [AGGREGATE] [ALIAS] FUNCTION function_name (arg_type [, ...]) [RETURNS ret_type] [INTERMEDIATE inter_type] [WITH PARAMETER(param [,...]) AS origin_function] [PROPERTIES ("key" = "value" [, ...]) ]

DROP [GLOBAL] FUNCTION function_name (arg_type [, ...])

2. Global functions are fully global, and their metadata is stored in the image. Function lookup checks the database first and falls back to the global functions if no match is found there.
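
The lookup order described in item 2 can be sketched as follows. This is a minimal illustration in Python, not the Doris FE code: `FunctionRegistry`, `create`, `resolve`, and the signature tuple are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical key: (function name, argument types).
Signature = Tuple[str, Tuple[str, ...]]


class FunctionRegistry:
    """Toy registry with per-database functions and one global namespace."""

    def __init__(self) -> None:
        self.by_database: Dict[str, Dict[Signature, Callable]] = {}
        self.global_functions: Dict[Signature, Callable] = {}

    def create(self, sig: Signature, fn: Callable,
               database: Optional[str] = None, is_global: bool = False) -> None:
        """Register fn globally, or in the given database."""
        if is_global:
            self.global_functions[sig] = fn
            return
        if database is None:
            raise ValueError("a non-global function needs a database")
        self.by_database.setdefault(database, {})[sig] = fn

    def resolve(self, database: str, sig: Signature) -> Optional[Callable]:
        """Look in the database first, then fall back to the global functions."""
        local = self.by_database.get(database, {})
        if sig in local:
            return local[sig]
        return self.global_functions.get(sig)
```

With this ordering, a function defined in a database shadows a global function with the same signature, and every other database sees the global one.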
Co-authored-by: lexluo <lexluo@tencent.com>
2023-03-18 22:06:48 +08:00
49a053b3da [typo](docs) Add a hyperlink to facilitate user redirect. (#17899) 2023-03-18 21:21:36 +08:00
ee0f6120db [typo](doc)Replace invalid urls (#17886)
Co-authored-by: hechao <hechao@selectdb.com>
2023-03-18 21:20:52 +08:00
88713037bf [Bug][Fix] pipeline exec engine get wrong result when run regression test (#17896)
Fix regression p1:regression-test/suites/datev2/tpcds_sf1_p1/sql/pipeline case
2023-03-18 20:41:10 +08:00
e4d9ecd389 [regression-test](array) Fix array case, add order by (#17906) 2023-03-17 21:24:41 +08:00
3593b82498 [fix](schema change) Fix fe restart failed because of replay schema change alter job failed (#17825) 2023-03-17 20:54:50 +08:00
5c5dcfda78 Revert "[enhancement](memory) PODArray replaces MemPool in PredicateColumn (#17800)" (#17910)
This reverts commit 17d1c1bc7f6cc95eecd224eaa219c976b60fa17e.
2023-03-17 20:50:01 +08:00
46d88ede02 [Refactor](Metadata tvf) Reconstruct Metadata table-value function into a more general framework. (#17590) 2023-03-17 19:54:50 +08:00
8debc96d74 [enhancement](nereids) update FilterEstimation and Agg in stats derive (#17790)
* 1. update ndv in Stats,
2. skip __DORIS_DELETE_SIGN__=0 in stats derive,
3. equalTo in stats derive
4. update agg stats derive, support the case: all column_stats are unknown

* computeSize

* fix ut
2023-03-17 18:01:50 +08:00
043f77200f [Bug](dynamic-table) Fix column alignment logic and support filtering null values when slot is not null (#17842)
Before this PR, null values in columns declared `NOT NULL` were not filtered, which did not match the original load behavior.
Second, the column alignment logic has a bug:

```
template <typename ColumnInserterFn>
void align_variant_by_name_and_type(ColumnObject& dst, const ColumnObject& src, size_t row_cnt,
                                    ColumnInserterFn inserter) {
    CHECK(dst.is_finalized() && src.is_finalized());
    // Use rows() here instead of size(), since size() will check_consistency
    // but we could not check_consistency since num_rows will be upgraded even
    // if src and dst is empty, we just increase the num_rows of dst and fill
    // num_rows of default values when meet new data
    size_t num_rows = dst.rows();
```
2023-03-17 16:53:30 +08:00
5bd5402378 [bug](udf) add synchronized to test resolve error of zip file closed (#17812) 2023-03-17 14:35:26 +08:00
1080a413a2 [fix](metric) Fix bug for that register txn replica failed (#17855) 2023-03-17 11:42:40 +08:00
bd44cc3f73 [fix](regression-test) move some case in test_query_sys_tables to p2 #17859 2023-03-17 11:26:06 +08:00
b95cd7eca2 [Refactor](function) Reconstruct default logic for const args. (#17830) 2023-03-17 11:13:13 +08:00
5d3de05976 [feature](map) basic functions for map datatype (#16916)
basic functions for map datatype:
- MAP<K, V> map(K k1, V v1, ...)
- BIGINT map_size(MAP<K, V> m)
- BOOL map_contains_key(MAP<K, V> m, K k1)
- BOOL map_contains_value(MAP<K, V> m, V v1)
- ARRAY< K> map_keys(MAP<K, V> m)
- ARRAY< V> map_values(MAP<K, V> m)
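
As a rough illustration of what the listed functions return, here is a sketch that models a MAP as a Python dict. It is an analogy only: NULL handling and the treatment of duplicate keys in Doris are not covered by the commit message, so the behavior below (last duplicate key wins, no NULL semantics) is an assumption.

```python
from typing import Any, Dict, List


def make_map(*args: Any) -> Dict[Any, Any]:
    """map(k1, v1, ...): build a map from alternating keys and values."""
    if len(args) % 2 != 0:
        raise ValueError("map() expects an even number of arguments")
    # Assumption: a repeated key keeps its last value, as in a Python dict.
    return dict(zip(args[0::2], args[1::2]))


def map_size(m: Dict[Any, Any]) -> int:
    """Number of key-value pairs."""
    return len(m)


def map_contains_key(m: Dict[Any, Any], key: Any) -> bool:
    return key in m


def map_contains_value(m: Dict[Any, Any], value: Any) -> bool:
    return value in m.values()


def map_keys(m: Dict[Any, Any]) -> List[Any]:
    return list(m.keys())


def map_values(m: Dict[Any, Any]) -> List[Any]:
    return list(m.values())
```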
2023-03-17 10:28:17 +08:00
f099b7d7b5 [Doc](typo) Remove redundant symbols and Fix some misspelled words (#17609) 2023-03-17 09:01:59 +08:00
1222403233 [doc](typo)Fix docs insert into manual #17856 2023-03-17 08:57:02 +08:00
cf56a76108 [doc](typo)fix the different between cn and en document #17875 2023-03-17 08:52:51 +08:00
0ec10d4836 [Enhancement](fe exception) write a java annotation to catch throwable from a method and print log (#17797)
How does it work?
AspectJ is used to implement the aspect behavior of the annotations. During compilation, the aspectj-maven-plugin automatically weaves the code for aspect annotations into the generated class files.
When to use it?
When a method needs a try/catch to record exception information, the LogException annotation can be used. When a method must not fail, the NoException annotation can be used.
What is the result of adding these annotations?
The LogException annotation automatically captures exceptions into the log file, which keeps the code more concise. The NoException annotation automatically captures the exception into the log file and exits the program when an exception occurs.
2023-03-17 08:52:27 +08:00
b4b126b817 [Feature](parquet-reader) Implements dict filter functionality parquet reader. (#17594)
Implements dict filter functionality parquet reader to improve performance.
2023-03-16 20:29:27 +08:00
24dbdb6a2f Update es.md (#17816) 2023-03-16 20:14:47 +08:00
e98143d44a [chore](thrift proto) add mysql_row_binary_format to PaloInternalService.thrift (#17844) 2023-03-16 18:48:56 +08:00
9173096b2d [fix](ui) fix download txt format error (#17789)
fix download txt format error
2023-03-16 17:59:29 +08:00
b5867176d4 [enhance](cooldown) add new cooldown cases (#16888) 2023-03-16 17:34:32 +08:00
b17c421f52 [fix](datetime) will get String index out of range exception (#17735)
A String index out of range exception is thrown when using erroneous datetime values like '2020-02-01'
before:
MySQL [test]> select test121.k1 from test121 where k1 != ('9102-12-');
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: String index out of range: 8

after:
MySQL [test]> select test121.k1 from test121 where k1 = '9102-12-';
ERROR 1105 (HY000): errCode = 2, detailMessage = Incorrect datetime value: '9102-12-' in expression: k1 = '9102-12-'
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-16 16:13:47 +08:00
5dde910931 [feature](profile) add clean all profile sql (#17751)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-16 16:12:21 +08:00
cbfbe67508 [enhancement](fe query schedule) use try write lock to avoid too much wait time for planner (#17822)
* [enhancement](fe query schedule) use try write lock to avoid too much wait time for planner; print ascii id instead of big int
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-16 16:01:33 +08:00
ffda858f01 [fix](regression) fix unstable test cases and remove redundant cases (#17845)
aggregate_strategies runs too slowly; use a smaller table-valued function to speed it up
add a p2 case nereids_syntax_p2/aggregate_strategies that uses a larger table-valued function to ensure correctness
remove case nereids_syntax_p0/test_join_nereids since it is redundant with nereids_p0/join/test_join
remove unstable case in query_p0/aggregate/aggregate
2023-03-16 15:59:26 +08:00
c29582bd57 [pipeline](split by segment)support segment split by scanner (#17738)
* support segment split by scanner

* change code by cr
2023-03-16 15:25:52 +08:00
ea943415a0 [bugfix](compaction) remove useless check (#17804)
The transient size may not equal the candidate_rowset size.
For example, if one rowset has many segments but its size is smaller than the
promotion size, it breaks the rowset-picking loop because the compaction
score is already sufficient, yet it is then filtered out by the level_size check.
As a result, the transient size does not equal the candidate size.
2023-03-16 15:23:49 +08:00
caed2155f5 [test](fix) use vectorized interface in test (#17649) 2023-03-16 15:23:07 +08:00
731ba93773 [fix](regression) fix regression case (#17846) 2023-03-16 14:33:16 +08:00
b3d8be7cac [fix](cooldown)add push conf for alter storage policy (#17818)
* add push conf for alter storage policy
2023-03-16 14:27:27 +08:00
ee7226348d [FIX](Map) fix map compaction error (#17795)
During compaction, the in-memory map offsets arriving at the same OLAP convertor run from 0 to 0+size for each block,
but they should continue across pages within one segment writer.
For example:
last block with map offsets: [3, 6, 8, ... 100]
this block with map offsets: [5, 10, 15 ..., 100]
The convertor should record the last offset so that later offsets follow on from it.
After conversion,
the current offsets should be [105, 110, 115, ... 200]; the column writer then just calls append_data() to append the correct offset data to the pages.
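
The offset continuation described above can be sketched in a few lines. This is an illustrative Python model, not the actual C++ convertor: the class name `OffsetConverter` and its `convert` method are hypothetical.

```python
from typing import List


class OffsetConverter:
    """Keeps the last emitted offset so each block continues from it."""

    def __init__(self) -> None:
        self._base = 0  # last offset emitted so far within this segment

    def convert(self, block_offsets: List[int]) -> List[int]:
        """Shift block-local offsets by the last offset of the previous block."""
        shifted = [self._base + off for off in block_offsets]
        if shifted:
            # Remember where this block ended for the next call.
            self._base = shifted[-1]
        return shifted
```

Feeding a first block that ends at 100 and then a block `[5, 10, 15, 100]` yields `[105, 110, 115, 200]` for the second block, matching the example in the message.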
2023-03-16 13:54:01 +08:00
0086fdbbdb [enhancement](planner) support delete from using syntax (#17787)
Support the DELETE ... USING syntax. This syntax only supports the UNIQUE KEY model.

Use the result of `t2` join `t3` to remove rows from `t1`:

```sql
-- create t1, t2, t3 tables
CREATE TABLE t1
  (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE)
UNIQUE KEY (id)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1', "function_column.sequence_col" = "c4");

CREATE TABLE t2
  (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1');

CREATE TABLE t3
  (id INT)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1');

-- insert data
INSERT INTO t1 VALUES
  (1, 1, '1', 1.0, '2000-01-01'),
  (2, 2, '2', 2.0, '2000-01-02'),
  (3, 3, '3', 3.0, '2000-01-03');

INSERT INTO t2 VALUES
  (1, 10, '10', 10.0, '2000-01-10'),
  (2, 20, '20', 20.0, '2000-01-20'),
  (3, 30, '30', 30.0, '2000-01-30'),
  (4, 4, '4', 4.0, '2000-01-04'),
  (5, 5, '5', 5.0, '2000-01-05');

INSERT INTO t3 VALUES
  (1),
  (4),
  (5);

-- remove rows from t1
DELETE FROM t1
  USING t2 INNER JOIN t3 ON t2.id = t3.id
  WHERE t1.id = t2.id;
```

The expected result is that only the row with id = 1 is removed from table t1:

```
+----+----+----+--------+------------+
| id | c1 | c2 | c3     | c4         |
+----+----+----+--------+------------+
| 2  | 2  | 2  |    2.0 | 2000-01-02 |
| 3  | 3  | 3  |    3.0 | 2000-01-03 |
+----+----+----+--------+------------+
```
2023-03-16 13:12:00 +08:00
bece027135 [enhancement](profile) Add HTTP interface for q-error (#17786)
1. Add an HTTP interface for querying q-error
2. Fix the selectivity calculation of inner join; previously it was always 0 when there was only one join condition
2023-03-16 12:19:23 +08:00
67b7128e8a [tools](tpcds) fix bug of generating and loading data (#17835)
---------

Co-authored-by: stephen <hello_stephen@@qq.com>
2023-03-16 11:59:39 +08:00
c2edca7bda [fix](Nereids) construct project with all slots in semi-semi-transpose-project rule (#17811)
error msg in tpch 20
```
SlotRef have invalid slot id: , desc: 22, slot_desc: tuple_desc_map: [Tuple(id=10 slots=[Slot(id=51 type=DECIMALV2(27, 9) col=-1, colname= null=(offset=0 mask=80)), Slot(id=52 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=53 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=54 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=55 type=INT col=-1, colname= null=(offset=0 mask=0))] has_varlen_slots=0)] tuple_id_map: [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0] tuple_is_nullable: [0] , desc_tbl: Slot(id=22 type=INT col=-1, colname= null=(offset=0 mask=0))
```

Before, we only used the slots in `hashJoin` conditions to construct projects, which could lose some slots in `project`, such as:
```
LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT

LogicalJoin[1135] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(PS_PARTKEY#0 = P_PARTKEY#6)], otherJoinConjuncts=[] )
|--LogicalProject[1128] ( distinct=false, projects=[PS_PARTKEY#0, PS_SUPPKEY#1], excepts=[], canEliminate=true )
|  +--LogicalJoin[1120] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(L_PARTKEY#17 = PS_PARTKEY#0), (L_SUPPKEY#18 = PS_SUPPKEY#1)], otherJoinConjuncts=[(cast(PS_AVAILQTY#2 as DECIMAL(27, 9)) > (0.5 * sum(L_QUANTITY))#33)] )
|     |--GroupPlan( GroupId#2 )
|     +--GroupPlan( GroupId#7 )
+--GroupPlan( GroupId#12 )
----------------------after----------------------
LogicalJoin[1141] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(L_PARTKEY#17 = PS_PARTKEY#0), (L_SUPPKEY#18 = PS_SUPPKEY#1)], otherJoinConjuncts=[(cast(PS_AVAILQTY#2 as DECIMAL(27, 9)) > (0.5 * sum(L_QUANTITY))#33)] )
|--LogicalProject[1140] ( distinct=false, projects=[PS_PARTKEY#0, PS_SUPPKEY#1], excepts=[], canEliminate=true )
|  +--LogicalJoin[1139] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(PS_PARTKEY#0 = P_PARTKEY#6)], otherJoinConjuncts=[] )
|     |--GroupPlan( GroupId#2 )
|     +--GroupPlan( GroupId#12 )
+--GroupPlan( GroupId#7 )
```
`PS_AVAILQTY#2` is lost in the project.

Now we use all slots to construct the project.
2023-03-16 11:53:32 +08:00
12d9d19366 [docs](Nereids) add nereids zh-CN docs (#16743) 2023-03-16 11:52:30 +08:00
a627a563cc [pipeline](ckb) the pull request which id is even also needs run clickbench (#17759)
Pull requests with an even id also need to run clickbench.
2023-03-16 11:36:22 +08:00
ffa1d4d96a [regression-test](mtmv) drop table and mv before running the case (#17802)
To avoid the "table or mv already exists" problem
2023-03-16 11:16:06 +08:00
ebe651dae9 [Fix](Planner)Add call once logic to analyze of function aes_decrypt #17829
The problem is an exception thrown during analysis:
java.lang.IllegalStateException: exceptions :
errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): xxx

The scenario is:
select aes_decrypt(xxx,xxx) as c0 from table group by c0;

Analysis of the problem:
The direct cause is a slotref mismatch, which in turn results from a mismatch in the number of parameters of the aes_decrypt function. When debugging, we can see that the slotref of the group column is added to ExprSubstitutionMap but cannot be matched with the select result columns. This is because the expression is analyzed again when it is substituted, so the parameter is added twice. The function then no longer matches, so it is not substituted as a slotref and the exception is thrown.

Fix:
Add call-once logic for adding the third parameter of aes_decrypt-type functions. Compare the child we want to add with the last child of the function. If they are the same, do not add it.
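
The guard described in the fix can be sketched as below. This is an illustrative Python version, not the Doris Java code: `add_argument_once` and the `"DEFAULT_MODE"` value are hypothetical stand-ins for the function's child list and the implicitly added third parameter.

```python
from typing import List


def add_argument_once(children: List[str], extra: str) -> List[str]:
    """Append extra unless it is already the last child.

    Running the analysis step twice therefore leaves the argument
    list unchanged after the first call.
    """
    if children and children[-1] == extra:
        return children
    children.append(extra)
    return children


# Simulate analyzing the same call twice.
args = ["col", "key"]
add_argument_once(args, "DEFAULT_MODE")
add_argument_once(args, "DEFAULT_MODE")
print(args)
```

One limitation of comparing only against the last child: if a caller explicitly passes a final argument equal to the value being added, the guard treats it as already present, which is the intended outcome here.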
2023-03-16 11:04:21 +08:00
1da3e7596e [fix](point query) Fix NegativeArraySizeException when prepared statement contains a long string (#17651) 2023-03-16 10:24:33 +08:00
f8ad01f55d [fix](fe) fix drop frontend removeUnReadyElectableNode incorrectly (#17680)
* When adding two non-existent FEs and then dropping two non-existent FEs, we may hit an exception like this:
'''
    java.lang.IllegalArgumentException: com.sleepycat.je.config.IntConfigParam:
             param je.rep.electableGroupSizeOverride doesn't validate, -1 is less than min of 0
    at com.sleepycat.je.config.IntConfigParam.validate(IntConfigParam.java:47)
    at com.sleepycat.je.config.IntConfigParam.validateValue(IntConfigParam.java:75)
    at com.sleepycat.je.dbi.DbConfigManager.setVal(DbConfigManager.java:648)
    at com.sleepycat.je.dbi.DbConfigManager.setIntVal(DbConfigManager.java:694)
    at com.sleepycat.je.rep.ReplicationMutableConfig.setElectableGroupSizeOverrideVoid(ReplicationMutableConfig.java:523)
    at com.sleepycat.je.rep.ReplicationMutableConfig.setElectableGroupSizeOverride(ReplicationMutableConfig.java:512)
    at org.apache.doris.ha.BDBHA.removeUnReadyElectableNode(BDBHA.java:236)
    at org.apache.doris.catalog.Env.dropFrontend(Env.java:2533)
'''
2023-03-16 10:22:42 +08:00
b043b9798d [feature](bdbje) Add config param for bdbje logging level (#17064)
Add new config param bdbje_file_logging_level
2023-03-16 09:50:44 +08:00