Commit Graph

894 Commits

Author SHA1 Message Date
4ef46159ae [vectorized](udaf) support array type for java-udaf (#17351) 2023-03-09 11:30:07 +08:00
06dee69174 [Refactor](map) remove using column array in map to reduce offset column (#17330)
1. remove column array in map 
2. add offsets column in map 
Aim to reduce duplicate offset  from key-array and value-array in disk
2023-03-09 11:22:26 +08:00
368e6a4f9c [Bug](array filter) Fix bug due to ColumnArray::filter_generic invalid inplace size_at after set_end_ptr (#17554)
We should make a new PodArray to add items instead of do it inplace
2023-03-09 10:59:29 +08:00
00727e8c11 [fix](in-bitmap) fix result may be wrong if the left side of the in bitmap predicate is a constant (#17570) 2023-03-09 10:59:05 +08:00
397cc011c4 [fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420)
ECB algorithm, block_encryption_mode does not take effect, it only takes effect when init vector is provided.
Solved: 192/256 supports calculation without init vector

For other algorithms, an error should be reported when there is no init vector

Initialization Vector. The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector.

Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt

Note: This fix does not support smooth upgrades. during upgrade process, query may report error: funciton not found
2023-03-09 09:51:41 +08:00
2b6d971c2f [fix](nereids)fix first_value/lead/lag window function bug in nereids (#17315)
* [fix](nereids)fix first_value/lead/lag window function bug in nereids

* add more test

* add order by to fix test case

* fix test cases
2023-03-09 09:35:27 +08:00
4822b9811a [feature](nereids)support bitmap runtime filter on nereids (#16927)
* A in(B) -> bitmap_contains(bitmap_union(B), A)
support bitmap runtime filter on nereids

* GroupPlan -> Plan

* fmt

* fix target cast problem
remove test code
2023-03-09 09:30:24 +08:00
f0bd002911 [fix](DOE) Fix esquery not working (#17566)
Function esquery does not work because there is a problem parsing the first parameter type.
The first parameter, which is SlotRef, will be cast to CastExpr. This will cause error while generating ES DSL.
Add more types to adapt esquery function.
2023-03-08 21:51:17 +08:00
bd5ed2b0c2 [enhancement](histogram) optimize the histogram bucketing strategy, etc (#17264)
* optimize the histogram bucketing strategy, etc

* fix p0 regression of histogram
2023-03-08 20:12:05 +08:00
eea6d770d7 [fix](bitmap) fix wrong result of bitmap_or for null (#17456)
Result of select bitmap_to_string(bitmap_or(to_bitmap(1), null)) should be 1 instead of null.

This PR fix logic of bitmap_or and bitmap_or_count.

Other count related funcitons should also be checked and fix, they will be fixed in another PR.
2023-03-08 16:29:01 +08:00
aab14922af [Feature](Nereids) support MarkJoin (#16616)
# Proposed changes
1.The new optimizer supports the combination of subquery and disjunction.In the way of MarkJoin, it behaves the same as the old optimizer. For design details see:https://emmymiao87.github.io/jekyll/update/2021/07/25/Mark-Join.html.
2.Implicit type conversion is performed when conjects are generated after subquery parsing
3.Convert the unnesting of scalarSubquery in filter from filter+join to join + Conjuncts.
2023-03-08 14:26:24 +08:00
626fbc34f9 [bugfix](jsonb) Fix create mv using jsonb key cause be crash (#17430) 2023-03-08 14:18:26 +08:00
4ea0d6c5fa [feature](array_function) add support for array_popfront (#17416) 2023-03-08 13:57:38 +08:00
b1d65f855d [Feature](array-function) Support array_concat function (#17436) 2023-03-08 13:57:16 +08:00
778acb3c5b [opt](string) optimize string equal comparision (#17336)
Optimize string equal and not-equal comparison by using memequal_small_allow_overflow15.
2023-03-08 11:30:00 +08:00
4b743061b4 [feature](function) support type template in SQL function (#17344)
A new way just like c++ template is proposed in this PR. The previous functions can be defined much simpler using template function. 

    # map element extract template function
    [['element_at', '%element_extract%'], 'E', ['ARRAY<E>', 'BIGINT'], 'ALWAYS_NULLABLE', ['E']],

    # map element extract template function
    [['element_at', '%element_extract%'], 'V', ['MAP<K, V>', 'K'], 'ALWAYS_NULLABLE', ['K', 'V']],


BTW, the plain type function is not affected and the legacy ARRAY_X MAP_K_V is still supported for compatability.
2023-03-08 10:51:31 +08:00
69c62b6c6c [Fix](vectorization) fixed that when a column's _fixed_values exceeds the max_pushdown_conditions_per_column limit, the column will not perform predicate pushdown, but if there are subsequent columns that need to be pushed down, the subsequent column pushdown will be misplaced in _scan_keys and it causes query results to be wrong (#17405)
the max_pushdown_conditions_per_column limit, the column will not perform predicate pushdown, but if there are subsequent columns that need to be pushed down, the subsequent column pushdown will be misplaced in _scan_keys and it causes query results to be wrong
Co-authored-by: tongyang.hty <hantongyang@douyu.tv>
2023-03-08 07:23:56 +08:00
06468ba627 [vectorized](bug) fix array constructor function change origin column from block (#17296) 2023-03-07 16:42:23 +08:00
fd8adb492d [fix](nereids) fix bugs in nereids window function (#17284)
fix two problems:

1. push agg-fun in windowExpression down to AggregateNode
for example, sql:
select sum(sum(a)) over (order by b)
Plan:
windowExpression( sum(y) over (order by b))
+--- Agg(sum(a) as y, b)

2. push other expr to upper proj
for example, sql:
select sum(a+1) over ()
Plan:
windowExpression(sum(y) over ())
+--- Project(a + 1 as y,...)
+--- Agg(a,...)
2023-03-07 16:35:37 +08:00
8ccc805cd0 [Fix](Lightweight schema Change) query error caused by array default type is unsupported (#17331)
We have supportted array type default [], but when using lightweight schema Change to add column array type, query failed as follows:

Fix "array default type is unsupported" error.
Fix the default value filling assignment digit problem.
2023-03-07 16:30:41 +08:00
86252e25bf [regression-test](Nereids) add binary arithmetic regression test cases(#17363)
add all of the valid binary arithmetic expressions test for nereids.
currently, float, double, stringlike(string, char, varchar) doesn't support div, bitand, bitor, bitxor.
some results with float type are incorrect because of inaccurate precision of regression-test framework.
2023-03-07 15:48:22 +08:00
caacee253d [fix](olap)Crashing caused by IS NULL expression (#17463)
Issue Number: close #17462
2023-03-07 15:32:52 +08:00
7e96b06e6c [Enhance](auth)Users support multiple roles (#17236)
Describe your changes.
1.support GRANT role [, role] TO user_identity
2.support REVOKE role [, role] FROM user_identity
3.’Show grants‘ Add a column to display the roles owned by users
4.‘alter user’ prohibit deleting user's role
5.Repair Logic of roleName cannot start with RoleManager.DEFAULT_ ROLE
2023-03-07 10:28:56 +08:00
f85f89f240 [fix](planner) Fix incosistency between groupby expression and output of aggregation node (#17438) 2023-03-07 09:38:20 +08:00
440cf526c8 [fix](type compatibility) fix unsigned int type compatibility problem (#17427)
Fix unsigned int type compatibility value scope problem.

When defining columns, map UNSIGNED INT to BIGINT for compatibility.
The problems are as follows:
It is not consistent with this doc
image

We support the unsigned int type to be compatible with mysql types, but the unsigned int type is created as the int at the time of definition. This will cause numerical overflow.
2023-03-07 08:55:38 +08:00
aedbc5fcb1 [fix](planner) Slots in the cojuncts of table function node didn't got materialized #17460 2023-03-07 08:50:33 +08:00
e7cba11680 [fix](array)(parquet) fix be core dump due to load from parquet file containing array types (#17298) 2023-03-06 15:18:42 +08:00
ee1be6edd7 [chore](fe) enhance_mysql_data_type (#17429) 2023-03-06 10:42:01 +08:00
a8f20eb4ac [Enhencement](schema_scanner) Optimize the performance of reading information schema tables (#17371)
batch fill block
batch call rpc from FE to get table desc
For 34w colunms

SELECT COUNT( * ) FROM information_schema.columns;
time: 10.3s --> 0.4s
2023-03-06 09:53:01 +08:00
9aecd517b0 [test](Nereids) turn on all test in scalar function w (#17269)
turn on all test case in scalar function W except width_bucket(fix be bug in next PR)
turn off all test case for group_concat(distinct order by)
fix return nullable in TimestampArithmetic
2023-03-04 08:23:50 +08:00
e82b827bc8 [optimize](vectorization)Optimize to_string's performance. (#17076) 2023-03-03 10:35:59 +08:00
ba82cd10c6 [Enhencement](Jdbc catalog) Add two optional properties for jdbc catalog (#17245)
1. The first property is `only_specified_database`:
In the past, `Jdbc Catalog` will synchronize all database from source database.
Now we add a parameter called `only_specified_database` to jdbc catalog to allow only the specified database to be synchronized, eg:

```sql
create resource if not exists ${resource_name} properties(
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:mysql://172.18.0.1:${mysql_port}/doris_test?useSSL=false",
    "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "only_specified_database" = "true"
);
```
if `only_specified_database` is `true`, jdbc catalog will only synchronize the database which is specified in `jdbc_url`.

2. The second property is `lower_case_table_names`:
This property will synchronize jdbc external data source table names in lower case.

```sql
create resource if not exists ${resource_name} properties(
  "type"="jdbc",
  "user"="doris_test",
  "password"="123456",
  "jdbc_url" = "jdbc:oracle:thin:@172.18.0.1:${oracle_port}:${SID}",
  "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/ojdbc8.jar",
  "driver_class" = "oracle.jdbc.driver.OracleDriver",
  "lower_case_table_names" = "true"
);
```
2023-03-03 00:47:46 +08:00
3eeeff09fd [enhancement](nereids) convert string literal to commontype in in-expr and cass-when-expr (#17200) 2023-03-02 22:05:35 +08:00
27352afdf6 [fix](fe)support multi distinct group_concat (#17237)
* [fix](fe)support multi distinct group_concat

* update based on comments
2023-03-02 17:53:13 +08:00
9f088f6e90 [feature](json) add json_valid function (#17247)
add json_valid function

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-02 14:08:52 +08:00
9155d8b9d1 [fix](delete) fix 'is null' or 'is not null' delete predicate will get wrong result (#17190)
fix 'is null' or 'is not null' delete predicate will get wrong result

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-02 14:05:44 +08:00
543539cf18 [Feature](multi catalog)(nereids)Support ES external table for new planner. (#17290)
Support ES external table query using Nereids planner.
2023-03-01 22:32:41 +08:00
722755efe9 [fix](planner) change back legacy planner type coercion (#17070)
revert legacy planner change in #16844
2023-03-01 20:55:56 +08:00
0732eb54bc [feature](struct-type) support csv format stream load for struct type (#17143)
Refactor from_string method in data_type_struct.cpp to support csv format stream load for struct type.
2023-03-01 15:48:48 +08:00
Pxl
62440f3140 [Bug](Materialized-View) forbiden mv rewrite on create view and remove duplicate method getIsM… (#17194)
1. forbiden mv rewrite on create view to avoid select fail
2. remove duplicate method getIsMaterialized
2023-03-01 13:46:56 +08:00
b8ebcdff78 [Bug](bloomfilter) Fix wrong result using bloomfilter with date type (#17225) 2023-03-01 12:29:20 +08:00
979cf42d7a [Bug](decimalv3) Use correct decimal scale for function round (#17232)
Co-authored-by: maochongxin <maochongxin@gmail.com>
2023-03-01 12:28:41 +08:00
62ec74f4e7 segcompaction featuring verticalcompaction (#16731)
This patchset applies the following changes:

using vertical compaction machanism to do segcompaction
basic (WIP) refraction to separate segcompaction logic from BetaRowsetWriter
add segcompaction specific ut and regression tests
2023-03-01 10:55:40 +08:00
1b58f7f2ea [fix](Nereids) json object and json array should always not nullable (#17205) 2023-02-28 20:26:21 +08:00
9bcc3ae283 [Fix](DOE)Fix be core dump when parse es epoch_millis date format (#17100) 2023-02-28 20:09:35 +08:00
94cea0ea6d [fix](Nereids) Disable preagg when there is DELETE_SIGN filter (#17157)
1. disable preAgg when there is delete sign when binding relation
2. keep the preAgg status in SelectMaterializeIndex rule
2023-02-28 19:59:05 +08:00
34813bae13 [improvement](meta) make database,table,column names to support unicode (replace PR #13467 with this) (#14531)
Make database, table, column and other names support unicode by changing LABEL_REGEX COMMON_NAME_REGIEX COMMON_TABLE_NAME_REGEX COLUMN_NAME_REGEX regular expressions in class FeNameFormat.

P.S. @SharpRay has transfered PR #13467 to me, and I‘m responsible for the task now. There will be some modifications during the review period, so I create a new PR and the original #13467 could be closed. Thanks.
2023-02-28 18:50:36 +08:00
727853017c [regression-test](Nereids) add agg function, tvf, generator, window function test cases (#16824)
add agg_function, tvf, generator, window_function test for nereids and add more feature to gen.py
2023-02-28 17:51:39 +08:00
1dd2a41e38 [vectorized](bug) fix window function can't handle first row of beyond (#17084)
Issue Number: close #16845
2023-02-28 17:30:23 +08:00
3e40467ce6 [Bug](vec) Fix chinese pinyin order by (#17152)
bug: some chinese word not sort by pinyin in GBK coding

CREATE TABLE `test_convert` (
                 `a` varchar(100) NULL
             ) ENGINE=OLAP
               DUPLICATE KEY(`a`)
               DISTRIBUTED BY HASH(`a`) BUCKETS 3
               PROPERTIES (
               "replication_allocation" = "tag.location.default: 1"
               );
insert into test_convert values("b"), ("a"), ("c"), ("睿"), ("多"), ("丝");
Query OK, 6 rows affected (0.03 sec)
{'label':'insert_ca73a6acc2194d5b_888218a3949355a6', 'status':'VISIBLE', 'txnId':'18068'}
mysql [test]>select * from test_convert;
+------+
| a    |
+------+
| a    |
| c    |
| 丝   |
| b    |
| 多   |
| 睿   |
+------+
6 rows in set (0.01 sec)
mysql [test]>select * from test_convert order by convert(a using gbk);          
+------+
| a    |
+------+
| a    |
| b    |
| c    |
| 多   |
| 丝   |
| 睿   |
+------+
6 rows in set (0.01 sec)
2023-02-28 14:29:56 +08:00