3524 Commits

Author SHA1 Message Date
0a57f12578 [Bug](datev2) Fix bugs for datev2 (#15860)
These bugs were found when running the regression tests with enable_date_conversion turned on.
2023-01-14 18:36:36 +08:00
313e14d220 [Bugfix] (ROLLUP) fix the coredump when add rollup by link schema change (#15654)
Because the rollup has the same keys as the base table, in the same order, BE performs a linked schema change: the base tablet's segments are linked to the new rollup tablet. However, column unique ids start from 0 in the base tablet and in the rollup tablet alike. In this case, unique id 4 is column 'city' in the base table but 'cost' in the rollup tablet, so BE decodes a varchar page as a bigint page and core dumps.

I think that if a rollup is added by linked schema change, the rollup is redundant: it brings no additional benefit and wastes storage space. So it needs to be rejected.
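
To make the unique id collision concrete, here is a minimal sketch, not the BE code. The schemas are hypothetical (only 'city' and 'cost' appear in the message above), and `assign_unique_ids` and `mismatched_ids` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    unique_id: int
    name: str
    col_type: str


def assign_unique_ids(columns):
    """Number columns from 0 in order, as both tablets are described as doing."""
    return [Column(i, name, col_type) for i, (name, col_type) in enumerate(columns)]


def mismatched_ids(base, rollup):
    """Return the unique ids that point at different columns in the two schemas."""
    base_by_id = {c.unique_id: c for c in base}
    conflicts = []
    for col in rollup:
        other = base_by_id.get(col.unique_id)
        if other is not None and (other.name, other.col_type) != (col.name, col.col_type):
            conflicts.append(col.unique_id)
    return conflicts


# Hypothetical schemas; the first four columns play the role of the shared keys.
base = assign_unique_ids([
    ("user_id", "bigint"), ("date", "date"), ("age", "int"), ("sex", "int"),
    ("city", "varchar"), ("cost", "bigint"),
])
rollup = assign_unique_ids([
    ("user_id", "bigint"), ("date", "date"), ("age", "int"), ("sex", "int"),
    ("cost", "bigint"),
])

# Unique id 4 is 'city' (varchar) in the base schema but 'cost' (bigint) in the rollup.
conflicts = mismatched_ids(base, rollup)
```

Reading linked segments through the rollup schema would resolve id 4 to the wrong column, which is the situation the fix rejects up front.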
2023-01-14 10:20:07 +08:00
2810029d24 [fix](multi-catalog) fix bug that replay init catalog may happen after catalog is dropped (#15919) 2023-01-14 09:41:37 +08:00
cedbed67be [feature-wip](MTMV) Support table aliases when creating a materialized view with multiple tables (#15849)
## Use Case

mysql> CREATE TABLE t_user (
    ->   event_day DATE,
    ->   id bigint,
    ->   username varchar(20)
    -> )
    -> DISTRIBUTED BY HASH(id) BUCKETS 10
    -> PROPERTIES ('replication_num' = '1');
Query OK, 0 rows affected (0.07 sec)

mysql> CREATE TABLE t_user_pv(
    ->   event_day DATE,
    ->   id bigint,
    ->   pv bigint
    -> )
    -> DISTRIBUTED BY HASH(id) BUCKETS 10
    -> PROPERTIES ('replication_num' = '1');
Query OK, 0 rows affected (0.09 sec)

mysql> CREATE MATERIALIZED VIEW mv
    -> BUILD IMMEDIATE REFRESH COMPLETE
    -> START WITH "2022-10-27 19:35:00"
    -> NEXT 1 SECOND
    -> KEY (username)
    -> DISTRIBUTED BY HASH(username) BUCKETS 10
    -> PROPERTIES ('replication_num' = '1')
    -> AS SELECT t1.username ,t2.pv FROM t_user t1 LEFT JOIN t_user_pv t2 on t1.id = t2.id;
Query OK, 0 rows affected (0.10 sec)

mysql> DESC mv;
+----------+-------------+------+-------+---------+-------+
| Field    | Type        | Null | Key   | Default | Extra |
+----------+-------------+------+-------+---------+-------+
| username | VARCHAR(20) | Yes  | true  | NULL    |       |
| pv       | BIGINT      | Yes  | false | NULL    | NONE  |
+----------+-------------+------+-------+---------+-------+
2 rows in set (0.02 sec)
2023-01-14 01:29:32 +08:00
2580c88c1b [feature](multi-catalog) support oracle jdbc catalog (#15862) 2023-01-14 00:01:33 +08:00
bd2280b4ce [fix](planner) move join reorder to the single node planner (#15817)
Reordering joins in the analyze phase could produce a stmt whose corresponding SQL cannot be analyzed correctly. This may cause an analyze exception during stmt rewrite, because the rewriter resets and reanalyzes the rewritten stmt.
2023-01-13 19:42:12 +08:00
e979cc444a [improvement](multi-catalog) support hive 1.x (#15886)
The interface of the Hive metastore changes from version to version.
Currently, Doris uses Hive 2.3.7 as the HMS client version.
When it is used to connect to Hive 1.x, some interfaces, such as get_table_req, do not exist
in Hive 1.x, so we cannot get metadata from Hive 1.x.

In this PR, I copied the HiveMetastoreClient from the Hive 2.3.7 release and modified the implementation
of some interfaces so that it uses the old interfaces to connect to Hive 1.x.

When creating an HMS catalog, you can specify the Hive version, e.g.:

CREATE CATALOG `hive` PROPERTIES (
  "hive.metastore.uris" = "thrift://127.0.0.1:9083",
  "type" = "hms",
  "hive.version" = "1.1"
);
If hive.version is not specified, Doris uses the Hive 2.3.x compatible interface to access HMS.
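
The fallback described above can be sketched roughly as follows. This is an illustration, not Doris code: `resolve_hive_version`, `uses_legacy_interface`, and the default string "2.3" are assumptions standing in for "2.3.x compatible".

```python
# Assumption: "2.3" stands in for the "hive 2.3.x compatible" default.
DEFAULT_HIVE_VERSION = "2.3"


def resolve_hive_version(properties):
    """Return the configured hive.version, or the default when it is absent."""
    return properties.get("hive.version", DEFAULT_HIVE_VERSION)


def uses_legacy_interface(version):
    """Assumption: only major version 1 needs the old metastore interfaces."""
    major = int(version.split(".")[0])
    return major == 1


catalog_properties = {
    "hive.metastore.uris": "thrift://127.0.0.1:9083",
    "type": "hms",
    "hive.version": "1.1",
}
version = resolve_hive_version(catalog_properties)
legacy = uses_legacy_interface(version)
```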
2023-01-13 18:32:12 +08:00
a8dacfbfd9 [opt](planner) return bigint literal when cast date literal to bigint type (#15613) 2023-01-13 12:58:04 +08:00
c1963e799a [fix](nereids)upgrade signature datatype bug (#15867)
In ComputeSignatureHelper.upgradeDateOrDateTimeToV2(),
we upgrade the return date type but forget to upgrade the argument data types.

The same problem exists in upgradeDecimalV2ToV3().
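
A small sketch of the intended behaviour, assuming a simplified signature model: both the return type and every argument type are upgraded. The `Signature` class, the type names, and the mapping are hypothetical and only mirror the idea, not the Nereids classes.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical v1 -> v2 type name mapping, used only for this illustration.
V1_TO_V2 = {"date": "datev2", "datetime": "datetimev2"}


@dataclass(frozen=True)
class Signature:
    return_type: str
    argument_types: Tuple[str, ...]


def upgrade_type(type_name):
    """Map a v1 date type to its v2 counterpart; leave other types unchanged."""
    return V1_TO_V2.get(type_name, type_name)


def upgrade_date_or_datetime_to_v2(signature):
    """Upgrade the return type and the argument types together."""
    return Signature(
        return_type=upgrade_type(signature.return_type),
        argument_types=tuple(upgrade_type(t) for t in signature.argument_types),
    )


upgraded = upgrade_date_or_datetime_to_v2(Signature("date", ("date", "int")))
```

Upgrading only `return_type` would leave a signature that returns a v2 type while still accepting v1 arguments, which is the inconsistency described above.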
2023-01-13 12:54:25 +08:00
67378a2dc3 [fix](nereids) fix bug in SequenceFunction legality check (#15812)
1. fix bug in sequence_match function
2. do type promotion instead of explicit cast for
  - varcharLiteral -> stringLiteral
  - charLiteral->stringLiteral
2023-01-13 12:09:53 +08:00
688a0bb96a [feature](multi-catalog) support clickhouse jdbc catalog (#15780) 2023-01-13 10:07:22 +08:00
a7af869bfd [opt](Nereids) group_concat to support more cases (#15815)
Enhance group_concat to support group_concat(cast(slot), ...) and to support calling it with one argument.
2023-01-13 00:41:13 +08:00
9d41994c17 [opt](Nereids) throw exception when aliasedQuery has no alias(#15854) 2023-01-13 00:35:16 +08:00
d23646793c [fix](nereids) binding group by key on agg.output if output is slot (#15623)
case 1
`select count(1) from t1 join t2 on t1.a = t2.a group by a`
`group by a` is ambiguous

case 2
`select t1.a from t1 join t2 on t1.a = t2.a group by a`
`group by a` is bound on t1.a
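
The two cases can be sketched as a name lookup that prefers output slots. This is a simplified model under stated assumptions: slots are plain qualified-name strings, and `bind_group_by_key` and `AmbiguousColumn` are hypothetical names, not the Nereids binder.

```python
class AmbiguousColumn(Exception):
    """Raised when an unqualified name matches more than one slot."""


def bind_group_by_key(name, output_slots, child_slots):
    """Bind a group by key, preferring select-list items that are plain slots.

    output_slots: qualified names of select-list items that are slots.
    child_slots:  qualified names of all slots produced by the join.
    """
    in_output = [s for s in output_slots if s.split(".")[-1] == name]
    if len(in_output) == 1:
        return in_output[0]
    candidates = [s for s in child_slots if s.split(".")[-1] == name]
    if not candidates:
        raise KeyError(name)
    if len(candidates) > 1:
        raise AmbiguousColumn(name)
    return candidates[0]


child = ["t1.a", "t2.a"]

# case 2: the select list contains the slot t1.a, so `a` binds to it.
bound = bind_group_by_key("a", ["t1.a"], child)

# case 1: the select list is count(1), which is not a slot, so `a` stays ambiguous.
try:
    bind_group_by_key("a", [], child)
    case1_ambiguous = False
except AmbiguousColumn:
    case1_ambiguous = True
```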
2023-01-12 16:34:56 +08:00
39697bb83e [fix](Nereids) make the type of the first parameter in window_funnel is intergerLike (#15810) 2023-01-12 11:53:28 +08:00
ea0ef0d880 [fix](session-variable) repeat_max_num should be forwarded (#15840)
repeat_max_num should be forwarded to master; otherwise a stmt like
insert into tbl values(repeat("a", 1000)) is not affected by this session variable.
2023-01-12 10:51:35 +08:00
88a2088c1d [feature](Nereids) parse pipe_concat symbol as concat when sql mode set to PIPES_AS_CONCAT (#15775) 2023-01-11 21:41:14 +08:00
ea1493d946 [fix](Nereids) can not parse left and right function (#15655) 2023-01-11 21:29:32 +08:00
330ed9a84c [fix](Nereids) toSql is not work well in non-query statement (#15752) 2023-01-11 18:56:55 +08:00
cfb110c905 [fix](nereids) fix some nereids bugs (#15714)
1. remove forcing nullable for slot on EmptySetNode.
2. order by xxx desc should use nulls last as default order.
3. don't create runtime filter if runtime filter mode is OFF.
4. group by on a constant value needs to check that the corresponding expr does not contain any aggregation functions.
5. fix two left outer join reorder bug (A left join B left join C).
6. fix semi join and left outer join reorder bug (A left join B semi join C).
7. fix group by NULL bug.
8. change ceil and floor function to correct signature.
9. add literal comparison for string and date type.
10. fix the bug that the getOnClauseUsedSlots method may not return a valid value.
11. the tightest common type of string and date should be date.
12. the nullability of set operation node's result exprs is not set correctly.
13. Sort node should remove redundant ordering exprs.
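
As an illustration of item 2 only, here is a sketch of a descending sort that places NULLs last by default. The `order_by` helper is hypothetical, and the nulls-first behaviour for ascending order is an assumption added for symmetry; the message above only states the descending default.

```python
def order_by(rows, key, desc=False):
    """Sort rows by key; NULLs go last for desc (and, by assumption, first for asc)."""
    non_null = sorted(
        (r for r in rows if r[key] is not None),
        key=lambda r: r[key],
        reverse=desc,
    )
    nulls = [r for r in rows if r[key] is None]
    return non_null + nulls if desc else nulls + non_null


rows = [{"k": 1}, {"k": None}, {"k": 3}]
desc_values = [r["k"] for r in order_by(rows, "k", desc=True)]
```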
2023-01-11 17:18:44 +08:00
d4e4e18b47 [fix](DOE): Fix query _id error and es properties error (#15792)
Fix query _id error
_id does not exist in the mapping, but BE can still query it, so we need to skip the check that it exists in the mapping.
2023-01-11 17:00:59 +08:00
18a3b75626 [fix](QueryDetail) fix QueryDetail may be incorrect and null pointer exception (#15765)
2023-01-11 16:38:55 +08:00
4424874237 [fix](Nereids): move parentExpression in moveOwnership() (#15786) 2023-01-11 15:47:37 +08:00
006b3bd61a [fix](nereids) orthogonal_bitmap_intersect's return type should be bitmap (#15784) 2023-01-11 12:53:37 +08:00
7f2c433e08 [feature](Nereids) add relation id to unboundTVFRelation to avoid incorrect group expression comparison (#15740) 2023-01-11 12:49:14 +08:00
Pxl
2587095811 [Bug](mv) fix mv selector check group expr && forbid create dup mv with bitmap/hll && add some case (#15738) 2023-01-11 11:38:56 +08:00
3c8c31a5f8 [chore](Session) remove unused codes for enable_lateral_view
The session variable `enable_lateral_view` was removed a long time ago.
This CL just removes the variable name `enable_lateral_view`.
2023-01-11 11:24:28 +08:00
89c21af87d [chore](fe) update fe snapshot to 1.2 and fix auditloader compile error (#15787)
PR #14925 changed some fields of AuditEvent, so we need to upgrade fe-core's SNAPSHOT to 1.2
because auditloader depends on fe-core.

The 1.2-SNAPSHOT has already been pushed to
https://repository.apache.org/content/repositories/snapshots/org/apache/doris/fe-core/1.2-SNAPSHOT/
2023-01-11 08:46:48 +08:00
8f31a36429 [feature] support spill to disk for sort node (#15624) 2023-01-11 08:40:58 +08:00
bc34a44f06 [Fix](Nereids) fix type coercion for binary arithmetic (#15185)
Support SQL like `select true + 1 + '2.0'` and reject `select true + 1 + 'x'`.
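
The coercion idea can be sketched as follows: booleans and numeric strings are converted to numbers before the addition, and a non-numeric string is rejected. This is a rough model, not the Nereids type coercion rules; `to_number` and `add_all` are hypothetical helpers, and the float result type is an assumption.

```python
def to_number(value):
    """Coerce a boolean, number, or numeric string to a number, else raise TypeError."""
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            raise TypeError(f"cannot coerce {value!r} to a number") from None
    raise TypeError(f"unsupported operand: {value!r}")


def add_all(*operands):
    """Add operands after coercing each one to a number."""
    return sum(to_number(o) for o in operands)


total = add_all(True, 1, "2.0")  # models: select true + 1 + '2.0'

try:
    add_all(True, 1, "x")  # models: select true + 1 + 'x'
    rejected = False
except TypeError:
    rejected = True
```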
2023-01-11 02:55:44 +08:00
c87a9a5949 [fix](Nereids) Add varchar literal compare (#15672)
support "1" = "123"
2023-01-11 02:41:50 +08:00
280603b253 [fix](nereids) bind sort key priority problem (#15646)
`a.b.c` should only bind on `a.b.c`, not on `b.c` or `c`
2023-01-11 02:03:09 +08:00
ab2e0fd397 [fix](tvf) cancel strict restrictions on tvf parameters (#15764)
Cancel strict restrictions on tvf parameters.
2023-01-10 22:40:19 +08:00
79b24cdb1f [fix](JdbcResource) fix that JdbcResource does not support the jdbcurl of Oracle and SQLServer (#15757)
Actually, `JdbcResource` should support `Oracle` jdbcurl and `SQLServer` jdbcurl for jdbc external table.
2023-01-10 22:38:30 +08:00
90a92f0643 [feature-wip](multi-catalog) add iceberg tvf to read snapshots (#15618)
Support a new table-valued function `iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")`.
We can use the SQL `select * from iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")` to get the snapshot info of a table. Other iceberg metadata will be supported later when needed.

One usage:

We use the following SQL to time travel:
`select * from ice_table FOR TIME AS OF "2022-10-10 11:11:11"`;
`select * from ice_table FOR VERSION AS OF "snapshot_id"`;
Before doing so, we can use the snapshots metadata to get the `committed time` or `snapshot_id`,
and then use it as the time or version in the time travel clause.
2023-01-10 22:37:35 +08:00
542542a4b2 [fix](nereids) fix bug in estimation of min/max of Year (#15712)
1. fix bug in estimation of min/max of Year
2. remove Utils.getLocalDatetimeFromLong(Long). This method throws an exception if the input parameter is too big, and it is no longer used once the above bug is fixed.
2023-01-10 21:29:16 +08:00
fec89ad58c [fix](nereids) week should be able to recognized as function name in function call context (#15735) 2023-01-10 19:54:59 +08:00
7767931aca [ehancement](nereids) let parser support utf8 identifier (#15721)
After this PR, the following SQL can also be parsed:
- SELECT k1 AS 测试 FROM  test;
- SELECT k1 AS テスト FROM test;
2023-01-10 19:43:04 +08:00
bb28144c76 [fix](schema change) bugfix for light schema change while with rollup (#15681)
This problem comes from PR #11494.

After adding a column to a rollup index, the column UniqueId inside the base index also changes.
2023-01-10 19:03:06 +08:00
a67cea2d27 [Enhancement](metric) add current edit log metric (#15657) 2023-01-10 18:46:57 +08:00
503b6ee4da [chore](vulnerability) fix fe high risk vulnerability scanned by bug scanner (#15649) 2023-01-10 17:44:18 +08:00
47097a3db8 [fix](having) revert 15143 and fix having clause with multi-conditions (#15745)
First, the HAVING clause of MySQL is very complex and it is hard to follow all of its rules, so we revert PR #15143 to keep the logic the same as before.

Second, the original implementation has a problem when the HAVING clause has multiple conditions.
For example:

case1: here v2 inside the HAVING clause refers to the table column test_having_alias_tb.v2
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1);
ERROR 1105 (HY000): errCode = 2, detailMessage = HAVING clause not produced by aggregation output (missing from GROUP BY clause?): (`v2` > 1)
case2: here v2 inside the HAVING clause refers to the alias v2 = sum(test_having_alias_tb.v2); adding another condition changes how v2 is resolved.
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v>0 AND v2>1) ORDER BY id,v;
+------+------+------+
| id   | v    | v2   |
+------+------+------+
|    2 |    1 |    3 |
+------+------+------+
So here we try to keep the HAVING clause rules simple:
Rule1: if an alias name inside the HAVING clause is the same as a column name, we use the column, not the alias;
Rule2: if an alias name inside the HAVING clause does not match any column name, we use the alias.
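
The two rules can be sketched as a small name resolver. This is only an illustration of the rules as stated, not the planner code; `resolve_having_name` is a hypothetical helper, and the column set {id, v1, v2} is inferred from the example queries.

```python
def resolve_having_name(name, table_columns, select_aliases):
    """Resolve a name used inside HAVING according to Rule1 and Rule2."""
    if name in table_columns:
        # Rule1: a table column wins over an alias with the same name.
        return ("column", name)
    if name in select_aliases:
        # Rule2: the alias is used only when no column has that name.
        return ("alias", select_aliases[name])
    raise KeyError(name)


# Columns inferred from the example queries on test_having_alias_tb.
table_columns = {"id", "v1", "v2"}
# Aliases from: SELECT id, v1-2 as v, sum(v2) v2 ...
select_aliases = {"v": "v1-2", "v2": "sum(v2)"}

v2_binding = resolve_having_name("v2", table_columns, select_aliases)
v_binding = resolve_having_name("v", table_columns, select_aliases)
```

Under Rule1, `v2` in both example queries resolves to the table column, so it no longer depends on what other conditions appear in the HAVING clause.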

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2023-01-10 15:57:29 +08:00
dec79c000b [fix](MTMV) build mode is missing after restart FE (#15551) 2023-01-10 11:38:56 +08:00
1888aba301 [fix](MTMV) fix replayReplaceTable error when restart fe (#15564) 2023-01-10 11:36:17 +08:00
025623a124 [feature](Nereids) Support lots of aggregate functions (#15671)
1. generate lots of aggregate functions
2. support `group_concat(columns order by order_columns)` grammar
3. support and generate array aggregate/scalar functions, like `array_union`. we should support array grammar in the future, e.g. `select [1, 2, 3]`
4. add `checkLegalityBeforeTypeCoercion` and `checkLegalityAfterRewrite` function to check the legality of expression before type coercion and after rewrite, copy the semantic check of `FunctionCallExpr` to the checkLegality; remove the `ForbiddenMetricTypeArguments`; move the check of aes/sm4 crypto function from translator to checkLegalityBeforeTypeCoercion
5. refactor the `NullableAggregateFunction`: distinct is the first parameter, alwaysNullable is the second parameter; fix some wrong initialization order: some functions invoke super(distinct, alwaysNullable) but others invoke super(alwaysNullable, distinct)
2023-01-10 11:20:27 +08:00
601d9af23b [fix](planner) disconjunct in sub-query failed when plan it on hash join (#15653)
All conjuncts should be added before HashJoinNode init. Otherwise, some slots in the conjuncts are linked to a tuple that is not in the intermediate tuple of the HashJoinNode.
2023-01-10 11:10:12 +08:00
c19e391d32 [fix](profile) show query profile for pipeline engine (#15687) 2023-01-10 10:12:34 +08:00
9e3a61989b [refactor](es) remove BE generated dsl for es query #15751
Remove the FE config enable_new_es_dsl and all related code.
The DSL for ES is now always generated on the FE side.
2023-01-10 08:40:32 +08:00
05f6e4c48a [fix](predicate) fix be core dump caused by pushing down the double column predicate (#15693) 2023-01-09 19:31:04 +08:00
2b0e5e42a5 [ehancement](nereids) Support list parttion prune (#15724) 2023-01-09 19:00:53 +08:00