Commit Graph

1805 Commits

Author SHA1 Message Date
d88711aabb [fix] fix TableRef.java checkstyle failed (#7538) 2021-12-30 10:47:26 +08:00
4d01219849 [fix](lower_case_table_names) Fix the bug of case-sensitive aliases in the query when lower_case_table_names=1 is set (#7495)
2021-12-30 10:23:45 +08:00
0894848045 fix having clause constant folding (#7507)
Change-Id: I49d7f2b17e498e8b393a8c67d85aa1196f961393

Co-authored-by: qijianliang01 <qijianliang01@baidu.com>
2021-12-30 10:22:07 +08:00
85c30fc720 [deps] Upgrade Log4j to 2.17.1 to solve the CVE-2021-44832 security vulnerability (#7536)
Upgrade Log4j to 2.17.1 to solve the CVE-2021-44832 security vulnerability

Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
2021-12-30 10:21:37 +08:00
2872dbfeb8 [refactor] Standardize the writing of pom files, prepare for deployment to maven (#7477) 2021-12-30 10:16:37 +08:00
e93360791f Revert "[improvement](planner) make BinaryPredicate do not cast date to datetime/varchar (#7045)" (#7517) 2021-12-28 23:05:27 +08:00
3a5de976a3 [Feature](Partition pruning) Implement V2 version of partition prune. (#7434)
Implement a V2 version of partition prune algorithm. We use session variable partition_prune_algorithm_version as the control flag, with a default value of 2.

1. Support disjunctive predicates when pruning partitions for both list and range partitions.
2. Optimize partition prune for multiple-column list partitions.

Closed #7433
2021-12-28 22:32:34 +08:00
a2d6e6e06f [improvement](config) Modify default value of some brpc config (#7493)
1. Change `brpc_socket_max_unwritten_bytes` to 1GB

    This can make the system more fault-tolerant.
    Especially in the case of high system load, try to reduce EOVERCROWDED errors.

2. Change `brpc_max_body_size` to 3GB

    To handle large objects such as bitmaps or strings.
2021-12-28 16:47:53 +08:00
3454735eba [fix](balance) fix partition rebalance bug (#7213)
The number of replicas on the specified medium returned by `getReplicaNumByBeIdAndStorageMedium()` is
defined by table properties, but the backend may not actually have an SSD/HDD disk.
If no SSD/HDD disk is found on the backend, the replica number is set to 0.
However, partitionInfoBySkew does not consider this case: a medium with no SSD/HDD disk is still treated as skewed,
which causes a rebalance exception.
2021-12-28 15:03:29 +08:00
07e2acb2f3 [feature] Support national secret (national commercial password) algorithm SM3/SM4 (#7464)
SM3 is a cryptographic hash algorithm.
SM4 is a block cipher used to replace DES / AES and other international algorithms.
2021-12-28 10:39:54 +08:00
ab60c5eb59 [fix](spark-load) fix Roaring64Map big-endian read/write in de/serialization (#7480)
See #7479
This bug is triggered when the bitmap exceeds 32 bits.
2021-12-26 11:09:50 +08:00
98551f8e5e [fix](grouping-set) Grouping set clause act wrong for function expr in view (#7410) (#7411)
Fix #7410
2021-12-26 11:05:48 +08:00
fe1d0c1428 [fix](materialized-view)(planner) fix mv rewrite bug (#7362)
Close related [#7361]

As the sql described in [#7361](https://github.com/apache/incubator-doris/issues/7361)

```
select k1, count(k2) / count(1) from UserTable group by k1
``` 

Before this PR, `count(k2) / count(1)` was rewritten as `sum(UserTable.mv_count_k2) / count(1)`
and kept in the second-round analysis, which could cause the mv select to fail.

After this PR, `count(k2) / count(1)` is still rewritten as `sum(UserTable.mv_count_k2) / count(1)`,
but it is not kept in the second-round analysis, so the query can run successfully.
2021-12-26 11:00:39 +08:00
f33d1f7143 [fix](ut) Fix FE ut SelectStmtTest.testDeduplicateOrs (#7481)
Fix UT failure in `SelectStmtTest.testDeduplicateOrs`
2021-12-24 21:36:19 +08:00
2347f128b0 [fix](fe-ut) Fix NPE when start FE in unit test. (#7471)
Before this PR, an NPE would be thrown when starting FE in a unit test.
2021-12-24 21:33:30 +08:00
c596b0362c [docs](docker) Add document of docker dev (#7447)
Add development document using docker
2021-12-24 21:27:39 +08:00
Pxl
ff5a0e98b0 [improvement](planner) make BinaryPredicate do not cast date to datetime/varchar (#7045) 2021-12-24 21:22:43 +08:00
3128c7cd37 [fix](ut) fix testPartitionRebalancer ut (#7468) 2021-12-23 23:29:07 +08:00
a8c444d6d5 [fix](sql-rewrite) Rewrite Bigint SlotRef to compare DecimalLiteral in Binary predicate (#7265)
Convert a binary predicate of the form
`<CastExpr<SlotRef(ResultType=BIGINT)>> <op> <DecimalLiteral>`
to a binary predicate of the form
`<SlotRef(ResultType=BIGINT)> <new op> <new DecimalLiteral>`,
so that the predicate can be pushed down and bucket pruning can be applied.

For the query `select * from T where t1 = 2.0`, when the ResultType of column t1 is BIGINT,
binary predicate analysis unifies the types to DECIMALV2, so the predicate is converted to the form
`<CastExpr<SlotRef>> <op> <DecimalLiteral>`. Because the cast wraps column t1, the predicate cannot be pushed
down, resulting in poor performance. We convert it to the equivalent query `select * from T where t1 = 2` so that it
can be pushed down, which improves performance.

SSB test:
1. query `select * from LINEORDER3 where LO_ORDERKEY <2.2`

Performance improvement: `1.587s` -> `0.012s`.
The result and performance match those of `select * from LINEORDER3 where LO_ORDERKEY <3`; the other comparison operators behave the same way.

2. query `select * from LINEORDER3 where LO_ORDERKEY = 2.2`
Performance improvement: `0.012s` -> `0.006s`.
2021-12-22 23:28:19 +08:00
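The rewrite described in a8c444d6d5 can be illustrated with a minimal Java sketch. It is not the code from the PR: the class name `BigintDecimalRewrite` and the string-based predicate form are hypothetical, the literal is assumed to fit in the BIGINT range, and folding `=` to `FALSE` is only valid when the predicate is used as a WHERE filter. The idea is that an integer column compared with a non-integral decimal literal can be compared with a neighbouring integer instead, so no cast on the column is needed.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch: rewrite "<bigint column> <op> <decimal literal>"
// into an equivalent predicate that only uses an integer literal.
final class BigintDecimalRewrite {

    static String rewrite(String column, String op, BigDecimal literal) {
        BigDecimal floor = literal.setScale(0, RoundingMode.FLOOR);
        // A literal such as 2.0 has no fractional part, so the operator is kept.
        if (literal.compareTo(floor) == 0) {
            return column + " " + op + " " + floor.toPlainString();
        }
        BigDecimal ceiling = floor.add(BigDecimal.ONE);
        switch (op) {
            case "=":
                // An integer can never equal a non-integral value.
                // Assumption: used as a WHERE filter, where NULL and FALSE both drop the row.
                return "FALSE";
            case "<":
            case "<=":
                // x < 2.2 and x <= 2.2 both mean x < 3 for an integer x.
                return column + " < " + ceiling.toPlainString();
            case ">":
            case ">=":
                // x > 2.2 and x >= 2.2 both mean x > 2 for an integer x.
                return column + " > " + floor.toPlainString();
            default:
                throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(rewrite("t1", "=", new BigDecimal("2.0")));
        System.out.println(rewrite("LO_ORDERKEY", "<", new BigDecimal("2.2")));
        System.out.println(rewrite("LO_ORDERKEY", "=", new BigDecimal("2.2")));
    }
}
```

With these inputs the sketch turns `t1 = 2.0` into `t1 = 2` and `LO_ORDERKEY < 2.2` into `LO_ORDERKEY < 3`, which matches the equivalences stated in the commit message.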
97749ed85b [community][chore] Modify .asf.yaml and fix BE build warning (#7439) 2021-12-21 11:06:12 +08:00
HB
560c8b8911 [enhancement] Remove the two lines of duplicate import. (#7331) (#7332)
2021-12-21 11:04:53 +08:00
998489ac50 [fix](sql-block-rule) move sql block rule check from ConnectProcessor to StmtExecutor (#7407)
SqlBlockRule should block only query statements and should exclude explain statements.
2021-12-21 10:25:09 +08:00
30db2cdd19 [fix](cache) Some view stmt cannot be obtained when view in the subquery and add cache key UT (#7375)
1. Fix the bug that some view stmts cannot be obtained when the view is in a subquery
2. Add cache key UT
2021-12-21 10:21:28 +08:00
7a1bb5b335 log4j upgrade to 2.17.0 (#7440)
Solved the third security vulnerability CVE-2021-45105 that was discovered
2021-12-21 09:28:02 +08:00
06c38ce46e [enhancement] Make concurrent_number for routine load task can be larger than be num (#7386)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2021-12-17 11:04:29 +08:00
c873c8c162 [fix](lateral view)(subquery) Forbidden directly AGG/SORT on lateral view (#7337)
This PR mainly prohibits operations such as aggregation/sorting/window functions
on lateral views containing subqueries.
For example:
select min(e1) from (select c1 from table group by c1)tmp1 lateral view explode_split(c1, ",") tmp2 as e1
But the query can be written in another way, and the result is the same.
select min(e1) from (select e1 from (select c1 from table group by c1)tmp1 lateral view explode_split(c1, ",") tmp2 as e1) tmp3

The reason is that when a lateral view is applied to the results of an inline view
and the outer query performs aggregation or sorting on non-table-function columns,
the output slot ids of the table function node are empty or contain fewer columns.

The underlying cause is that when the inner layer contains an inline view,
the outer expression needs to be mapped to the correct tuple, rather than the virtual tuple,
through the substitute method according to the smap.
But the substitute method of a slot ref cannot recurse into its own source exprs.

E.g.
SlotRef: c2 <source expr min(c1)> from agg tuple
smap: <c1, c3>
before: c2 <source expr min(c1)>
after: c2 <source expr min(c1)> (unchanged)
2021-12-16 15:42:39 +08:00
0499b2211b [feat](lateral-view) Support execution of lateral view stmt (#7255)
1. Add table function node
2. Add 3 table functions: explode_split, explode_bitmap and explode_json_array
2021-12-16 10:46:15 +08:00
2b90967c4c [fix][refactor](broker load) refactor the scheduling logic of broker load (#7371)
1. Refactor the scheduling logic of broker load. Details see #7367 
2. Fix bug that loadedBytes in SHOW LOAD result is wrong.
3. Cancel the thread of LoadTimeoutChecker
   PENDING load jobs no longer time out. The timeout of a load job
   starts when the pending load task is scheduled.
4. Fix a bug that the loading task is never submitted to the pool.
   The logic of BlockedPolicy is wrong. We should make sure the task is submitted to the pool,
   or the RejectedExecutionException should be thrown.
5. Now the transaction of a load job will begin in pending task, instead of when submitting the job.
2021-12-16 10:39:22 +08:00
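Item 4 of 2b90967c4c states the intended contract: either the task ends up in the pool, or a RejectedExecutionException is thrown. The following Java sketch shows one way to get that behaviour with the standard `java.util.concurrent` API. It is an illustration under assumptions, not the BlockedPolicy class from the repository: the class name `BlockingSubmitPolicy` and the timeout parameter are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: when the pool is saturated, wait for queue space up to
// a timeout; if the task still cannot be queued, fail loudly instead of dropping it.
final class BlockingSubmitPolicy implements RejectedExecutionHandler {
    private final long timeoutMillis;

    BlockingSubmitPolicy(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        try {
            boolean queued = executor.getQueue().offer(task, timeoutMillis, TimeUnit.MILLISECONDS);
            if (!queued) {
                throw new RejectedExecutionException("queue still full after " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting for queue space", e);
        }
    }

    public static void main(String[] args) {
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1), new BlockingSubmitPolicy(100));
        Runnable waitForGate = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(waitForGate); // occupies the single worker thread
        pool.execute(waitForGate); // fills the queue
        try {
            pool.execute(waitForGate); // no room: the policy waits, then throws
        } catch (RejectedExecutionException e) {
            System.out.println("rejected: " + e.getMessage());
        } finally {
            gate.countDown();
            pool.shutdown();
        }
    }
}
```

Because the caller blocks inside `execute` for up to the timeout, such a policy should not be invoked while holding a lock that other threads need, which is the situation described in 03ad8c1fe3. The sketch also offers directly to the queue, so it does not re-check whether the pool has been shut down in the meantime.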
6ede693839 [fix](insert) modify code logic of InsertStmt (#7360)
When entry is null, a NullPointerException is thrown.
2021-12-16 10:38:05 +08:00
382351b0ee [fix](ut) Fix run fe ut failed, be ut memory leak and build thirdparty failed (#7377) 2021-12-15 11:00:20 +08:00
926540c561 [feature] Support return bitmap/hll data in select statement (#7276)
Support returning bitmap/hll data in select statements. This can be used after running `set show_object_data=true;`
2021-12-15 09:48:27 +08:00
e64da03866 [deps](log4j) Upgrade log4j 2 to 2.16.0 (#7394)
Upgrade log4j 2 to 2.16.0; upgrading to this version is strongly recommended officially
2021-12-14 15:57:16 +08:00
568f6611df [deps](log4j) upgrade log4j (#7364)
to 2.15.0
2021-12-10 23:19:11 +08:00
ac739fec10 [refactor] modify the control flow code to improve code readability (#7302)
Now the code of command handler isn't clear.
We can modify `if` and `else` to improve code readability.
2021-12-09 22:35:46 +08:00
db57c42c83 [improvement](compaction)(tablet repair) Add missing rowsets in compaction status url and support force dropping redundant replica (#7283)
1. Add missing rowsets in compaction status url
2. Add a new config `force_drop_redundant_replica` to force drop redundant replicas.
3. Fix FE ut
2021-12-09 22:34:57 +08:00
dc281ebc34 [fix](routine load) fix bug that can not read image when using keyword STREAM (#7323)
issue #7322 

1. Support `stream` as an identifier.
2. Optimize exception log output in `RoutineLoad`
2021-12-08 20:51:17 +08:00
10ccadacce [fix](forward) Avoid endless forward execution (#7335)
Close related #7334

1. Fix the bug described in [Bug] show frontends cause FE oom #7334
2. Fix error of CurrentConnected fields in show frontends result.
3. Add more FAQ
2021-12-08 16:25:04 +08:00
2ae9c41aa1 [fix](lateral view)(subquery) Fix column materialization error (#7330)
Fix the problem that when the source column of the lateral view comes from an inline view,
the column in the inline view cannot be materialized correctly.

At the same time, fix the problem that the correct output column cannot be projected
when the source column of the lateral view comes from an inline view.

Note that when a column in the query comes from an inline view,
it needs to be converted from the virtual tuple to the real tuple during semantic analysis and planning.
2021-12-07 10:23:33 +08:00
3b10002536 [community][typo](github) modify PR template (#7310)
I found some small problems while reading the code, so I added some small enhancements.

1. Modify the PR template. The current template isn't simple and clear, so it is worth refactoring.
2. Some small changes (typos, formatting, ...)
2021-12-07 10:03:28 +08:00
5e32ae3c3f [improvement](cache) Optimize sql cache (#7231)
issue: #7230
When getting the latest update time of a table, only compare the partitions of this query,
not all partitions of a table.
The goal is to improve the SqlCache hit rate.
2021-12-07 09:59:31 +08:00
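The freshness check described in 5e32ae3c3f can be sketched in Java as follows. This is a simplified illustration, not the SqlCache code: the class `SqlCacheFreshness`, the map from partition id to last update time, and the timestamps are all hypothetical. It shows why looking only at the partitions a query scans raises the hit rate: an update to an unrelated partition no longer invalidates the cached result.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decide cache freshness from the partitions a query
// actually scans instead of from every partition of the table.
final class SqlCacheFreshness {

    // Latest update time among the scanned partitions only.
    static long latestUpdateTime(Map<Long, Long> updateTimeByPartition,
                                 Collection<Long> scannedPartitions) {
        long latest = Long.MIN_VALUE;
        for (Long partitionId : scannedPartitions) {
            Long updatedAt = updateTimeByPartition.get(partitionId);
            if (updatedAt != null && updatedAt > latest) {
                latest = updatedAt;
            }
        }
        return latest;
    }

    // The cached result is usable if no scanned partition changed after it was cached.
    static boolean isCacheFresh(long cachedAt,
                                Map<Long, Long> updateTimeByPartition,
                                Collection<Long> scannedPartitions) {
        return latestUpdateTime(updateTimeByPartition, scannedPartitions) <= cachedAt;
    }

    public static void main(String[] args) {
        Map<Long, Long> updateTimes = new HashMap<>();
        updateTimes.put(1L, 100L);
        updateTimes.put(2L, 200L);
        updateTimes.put(3L, 900L); // recently loaded partition, not scanned by the query

        long cachedAt = 500L;
        System.out.println(isCacheFresh(cachedAt, updateTimes, Arrays.asList(1L, 2L)));
        System.out.println(isCacheFresh(cachedAt, updateTimes, Arrays.asList(1L, 2L, 3L)));
    }
}
```

In this example the first check passes because partitions 1 and 2 were last updated before the result was cached, while a check across all three partitions would discard the cache.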
03ad8c1fe3 [fix](load) Fix bug that show load may be blocked (#7254)
When a broker load task fails, it may be retried by holding the
LoadJob's write lock and submitting the loading task to a thread pool.

But submitting a task to the thread pool may block for up to 60 seconds
(depending on BlockPolicy), so the write lock is held for too long.
2021-12-07 09:58:50 +08:00
62d12067aa [feature](udf) make orthogonal bitmap udaf as build in functions (#7211)
Move the orthogonal bitmap UDAFs into the built-in functions.
Add three built-in bitmap functions:

- orthogonal_bitmap_intersect
- orthogonal_bitmap_intersect_count
- orthogonal_bitmap_union_count
2021-12-07 09:57:26 +08:00
8660bf69ff [fix](select join) Make selected slotRef nullable when slotRef is from nullable tuple in outer join sql block (#7290) 2021-12-06 16:17:10 +08:00
164b27412c [revert] "[improvement](bdbje) clean too many bdbje log (#7273)" (#7312)
Reverts #7273
Because there is no EnvironmentConfig.RESERVED_DISK.
2021-12-06 11:32:45 +08:00
200210e708 [fix] (ut) fix fe unit test failed, this is because we fix the MAX_PHYSICAL_PACKET_LENGTH to 0xffffff 2021-12-06 11:13:01 +08:00
bffc2836d7 [fix](show) Fix bug that AdminShowDataSkew operation may cause fe oom (#7297) 2021-12-06 10:32:00 +08:00
e080afa186 [typo] update comment of MasterDaemon (#7285)
The comment of MasterDaemon is out of date and may mislead readers.
2021-12-06 10:30:48 +08:00
974ab9b90c [improvement](bdbje) clean too many bdbje log (#7273)
In an HA environment, JE retains as many reserved files as possible,
so the bdbje log becomes too large.
We should therefore limit the size of reserved files; the default is 1GB.
2021-12-06 10:28:36 +08:00
4bfee42ba1 [feature-wip](lateral view) Support lateral view based on subquery (#7269)
Support lateral view of the result column in subquery.
For example:
  ```
  select e1 from (select k2 as a from test_explode group by a) tmp1
  lateral view explode_split(a, ",") tmp2 as e1;
  ```
The lateral view will parse the inline view column
and put the table function node above the subquery.
2021-12-06 10:26:36 +08:00
845f931098 [fix](select outfile) Remove optional properties check of hdfs storage (#7272) 2021-12-03 13:42:56 +08:00