Commit Graph

3669 Commits

Author SHA1 Message Date
dc9cd34047 [docs] Add user manual for hdfs load and transaction. (#7497) 2021-12-30 10:22:48 +08:00
0894848045 fix having clause constant folding (#7507)
Change-Id: I49d7f2b17e498e8b393a8c67d85aa1196f961393

Co-authored-by: qijianliang01 <qijianliang01@baidu.com>
2021-12-30 10:22:07 +08:00
85c30fc720 [deps] Upgrade Log4j to 2.17.1 to solve the CVE-2021-44832 security vulnerability (#7536)
Upgrade Log4j to 2.17.1 to solve the CVE-2021-44832 security vulnerability

Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
2021-12-30 10:21:37 +08:00
bc4ceeca44 [improvement] optimize java cmd find (#7428)
* optimize java cmd find, if java_home not set use java in PATH
2021-12-30 10:16:56 +08:00
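
Note on bc4ceeca44: the fallback described above (use `JAVA_HOME` when set, otherwise take `java` from `PATH`) can be sketched as follows. This is a Python illustration of the lookup order only, not the project's actual startup script; the function name `find_java` is hypothetical.

```python
import os
import shutil
from typing import Mapping, Optional


def find_java(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path of the java executable, or None if none is found.

    Prefers $JAVA_HOME/bin/java; falls back to the first `java` on PATH.
    """
    java_home = env.get("JAVA_HOME")
    if java_home:
        return os.path.join(java_home, "bin", "java")
    # JAVA_HOME is unset or empty: search the directories listed in PATH.
    return shutil.which("java", path=env.get("PATH", ""))
```

Passing the environment as a parameter keeps the lookup easy to exercise without touching the real process environment.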
2872dbfeb8 [refactor] Standardize the writing of pom files, prepare for deployment to maven (#7477) 2021-12-30 10:16:37 +08:00
e93360791f Revert "[improvement](planner) make BinaryPredicate do not cast date to datetime/varchar (#7045)" (#7517) 2021-12-28 23:05:27 +08:00
3a5de976a3 [Feature](Partition pruning) Implement V2 version of partition prune. (#7434)
Implement a V2 version of partition prune algorithm. We use session variable partition_prune_algorithm_version as the control flag, with a default value of 2.

1. Support disjunctive predicates when pruning partitions for both list and range partitions.
2. Optimize partition pruning for multiple-column list partitions.

Closed #7433
2021-12-28 22:32:34 +08:00
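
Note on 3a5de976a3: a rough sketch of how disjunctive predicates can be handled when pruning range partitions. It is a simplified single-column model and not the algorithm implemented in the commit; the partition names, the tuple-based predicate encoding, and the `prune` function are all hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical range partitions: name -> [lower, upper) bounds on one column.
PARTITIONS: Dict[str, Tuple[int, int]] = {
    "p1": (0, 10),
    "p2": (10, 20),
    "p3": (20, 30),
}


def prune(pred, partitions=PARTITIONS) -> Set[str]:
    """Return names of partitions that may contain rows matching pred.

    pred is a nested tuple:
      ("range", lo, hi)   rows with lo <= col < hi
      ("and", a, b)       both sub-predicates hold
      ("or", a, b)        at least one sub-predicate holds
    """
    kind = pred[0]
    if kind == "range":
        _, lo, hi = pred
        # A partition survives if its interval overlaps [lo, hi).
        return {name for name, (p_lo, p_hi) in partitions.items()
                if p_lo < hi and lo < p_hi}
    left, right = prune(pred[1], partitions), prune(pred[2], partitions)
    if kind == "and":
        return left & right   # must be reachable by both sides
    if kind == "or":
        return left | right   # reachable by either side
    raise ValueError(f"unknown predicate kind: {kind}")
```

For example, `prune(("or", ("range", 0, 5), ("range", 25, 30)))` keeps only the first and last partitions instead of falling back to a full scan.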
a2d6e6e06f [improvement](config) Modify default value of some brpc config (#7493)
1. Change `brpc_socket_max_unwritten_bytes` to 1GB

    This can make the system more fault-tolerant.
    Especially in the case of high system load, try to reduce EOVERCROWDED errors.

2. Change `brpc_max_body_size` to 3GB

    To handle large objects such as bitmaps or strings.
2021-12-28 16:47:53 +08:00
Pxl
9fb89004aa [revert] part of "[improvement](planner) make BinaryPredicate do not cast date to datetime/varchar (#7045)" (#7501) 2021-12-28 15:07:10 +08:00
3454735eba [fix](balance) fix partition rebalance bug (#7213)
The number of replicas on a given storage medium returned by `getReplicaNumByBeIdAndStorageMedium()` is
defined by table properties, but the backend may not actually have an SSD/HDD disk.
So when no SSD/HDD disk is found on the backend, the replica number is set to 0.
However, partitionInfoBySkew does not consider this case: a medium with no SSD/HDD disk is still counted in the skew,
which causes a rebalance exception.
2021-12-28 15:03:29 +08:00
07e2acb2f3 [feature] Support national secret (national commercial password) algorithm SM3/SM4 (#7464)
SM3 is a cryptographic hash algorithm.
SM4 is a block cipher used to replace DES / AES and other international algorithms.
2021-12-28 10:39:54 +08:00
6e052f4ede [Doc][Website] blogs are sorted by date (#7491)
* blogs are sorted by date

Co-authored-by: 943155336 <wangyongfeng>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
2021-12-27 14:30:08 +08:00
80587e7ac2 [improvement](spark-connector)(flink-connector) Modify the max num of batch written by Spark/Flink connector each time. (#7485)
Increase the default batch size and flush interval
2021-12-26 11:13:47 +08:00
755e0693b9 [feature](broker) support ks3 for kmr in ksyun (#7484) 2021-12-26 11:10:47 +08:00
ab60c5eb59 [fix](spark-load) fix Roaring64Map big-endian read/write in de/serialization (#7480)
See #7479
This bug is triggered when the bitmap exceeds 32 bits.
2021-12-26 11:09:50 +08:00
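
Note on ab60c5eb59: the commit message does not detail the defect, so the following is only a generic illustration of why the writer and the reader must agree on byte order once a value no longer fits in 32 bits. It uses Python's `struct` module and does not reproduce the Roaring64Map format.

```python
import struct

value = (1 << 32) + 5  # a 64-bit value that does not fit in 32 bits

# The writer serializes the value as 8 bytes in big-endian order.
payload = struct.pack(">Q", value)

# A reader using the same byte order recovers the value.
same_order = struct.unpack(">Q", payload)[0]

# A reader assuming little-endian order decodes a different number.
wrong_order = struct.unpack("<Q", payload)[0]
```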
43e93180c5 [chore](docker) Add clang11 in docker dev image (#7470) 2021-12-26 11:09:17 +08:00
ca97535491 [docs](executor) correct some be error code (#7460)
correct some be error code in doc.
2021-12-26 11:06:54 +08:00
98551f8e5e [fix](grouping-set) Grouping set clause act wrong for function expr in view (#7410) (#7411)
Fix #7410
2021-12-26 11:05:48 +08:00
0c154733e0 [feature](function) support multiple column parameters for bitmap_union/intersect (#7379)
Support multiple bitmap parameters for all bitmap aggregation functions.
2021-12-26 11:03:20 +08:00
fe1d0c1428 [fix](materialized-view)(planner) fix mv rewrite bug (#7362)
Close related [#7361]

Take the SQL described in [#7361](https://github.com/apache/incubator-doris/issues/7361):

```
select k1, count(k2) / count(1) from UserTable group by k1
``` 

Before this PR, `count(k2) / count(1)` was rewritten as `sum(UserTable.mv_count_k2) / count(1)`
and the rewritten expression was kept in the second round of analysis, which could cause materialized view selection to fail.

After this PR, `count(k2) / count(1)` is still rewritten as `sum(UserTable.mv_count_k2) / count(1)`,
but the rewritten expression is not kept in the second round of analysis, so the query runs successfully.
2021-12-26 11:00:39 +08:00
4ed1846369 [fix](ut) Fix BE broker scanner unit test bug (#7486)
introduced from #7454
2021-12-26 10:30:37 +08:00
f33d1f7143 [fix](ut) Fix FE ut SelectStmtTest.testDeduplicateOrs (#7481)
Fix UT failure in `SelectStmtTest.testDeduplicateOrs`
2021-12-24 21:36:19 +08:00
2347f128b0 [fix](fe-ut) Fix NPE when start FE in unit test. (#7471)
Before this PR, an NPE was thrown when starting FE in a unit test.
2021-12-24 21:33:30 +08:00
91332fa6bd [fix](reader) fix logic error for Tablet::capture_rs_readers (#7469) 2021-12-24 21:32:49 +08:00
43ed54faa1 [docs] The name of hidden column is incorrect in batch-delete-manual.md(#7465) (#7466) 2021-12-24 21:30:57 +08:00
a8a5c0a6a8 [improvement](load) memory usage optimization for load job (#7454)
Reduce memory usage when loading unqualified data
2021-12-24 21:30:28 +08:00
b4ce189646 [improvement](flink-connector) flush data without multi httpclients (#7329) (#7450)
reuse http client to flush data
2021-12-24 21:28:35 +08:00
c596b0362c [docs](docker) Add document of docker dev (#7447)
Add development document using docker
2021-12-24 21:27:39 +08:00
Pxl
bfa6bc3b0a [fix](function) fix aggregate function min() at type varchar (#7437) 2021-12-24 21:27:01 +08:00
Pxl
6d1cf599f8 [fix] DCHECK fail at BitmapValue getSizeInBytes (#7430) 2021-12-24 21:23:58 +08:00
3ba6dcf236 [fix](function) fix round function for inaccuracy (#7421) 2021-12-24 21:23:11 +08:00
Pxl
ff5a0e98b0 [improvement](planner) make BinaryPredicate do not cast date to datetime/varchar (#7045) 2021-12-24 21:22:43 +08:00
3128c7cd37 [fix](ut) fix testPartitionRebalancer ut (#7468) 2021-12-23 23:29:07 +08:00
889e33d53d [docs](seatunnel) Seatunnel Supports Doris connector (#7453) 2021-12-22 23:29:02 +08:00
a8c444d6d5 [fix](sql-rewrite) Rewrite Bigint SlotRef to compare DecimalLiteral in Binary predicate (#7265)
Convert a binary predicate of the form
`<CastExpr<SlotRef(ResultType=BIGINT)>> <op> <DecimalLiteral>`
to a binary predicate of the form
`<SlotRef(ResultType=BIGINT)> <new op> <new DecimalLiteral>`,
so that the predicate can be pushed down and bucket pruning can take place.

For the query `select * from T where t1 = 2.0`, where the ResultType of column t1 is BIGINT,
binary predicate analysis unifies the operand types to DECIMALV2, so the predicate takes the form
`<CastExpr<SlotRef>> <op> <DecimalLiteral>`. Because the cast wraps column t1, the predicate cannot be pushed
down, resulting in poor performance. We convert it to the equivalent query `select * from T where t1 = 2` so that it can be pushed down
and performance improves.

SSB test:
1. query `select * from LINEORDER3 where LO_ORDERKEY <2.2`

Performance improvement: `1.587s` -> `0.012s`.
The result and performance are equivalent to those of `select * from LINEORDER3 where LO_ORDERKEY <3`; the other comparison operators behave the same way.

2. query `select * from LINEORDER3 where LO_ORDERKEY = 2.2`

Performance improvement: `0.012s` -> `0.006s`.
2021-12-22 23:28:19 +08:00
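
Note on a8c444d6d5: a minimal sketch of the kind of rewrite described above, which turns a comparison between a BIGINT column and a decimal literal into a comparison against an integer literal. Only the `< 2.2` to `< 3` and `= 2.0` to `= 2` cases come from the commit message; the remaining cases are derived from integer arithmetic and are an assumption about the behavior, not a description of the actual FE code. The function name is hypothetical and NULL handling is ignored.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_FLOOR
from typing import Tuple, Union

Rewritten = Union[Tuple[str, int], bool]


def rewrite_bigint_vs_decimal(op: str, literal: Decimal) -> Rewritten:
    """Rewrite `bigint_col <op> literal` so the column needs no cast.

    Returns (new_op, new_int_literal), or a bool when the predicate is
    constant for every integer value of the column.
    """
    floor = int(literal.to_integral_value(rounding=ROUND_FLOOR))
    ceil = int(literal.to_integral_value(rounding=ROUND_CEILING))
    if floor == ceil:
        # Literal is a whole number, e.g. 2.0: only the literal type changes.
        return (op, floor)
    if op == "=":
        return False          # no integer equals a fractional literal
    if op == "!=":
        return True
    if op in ("<", "<="):
        return ("<", ceil)    # x < 2.2  <=>  x <= 2.2  <=>  x < 3
    if op in (">", ">="):
        return (">", floor)   # x > 2.2  <=>  x >= 2.2  <=>  x > 2
    raise ValueError(f"unsupported operator: {op}")
```

With the cast removed from the column side, the comparison is a plain column-versus-constant predicate, which is the shape that the commit message says can be pushed down.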
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
First, we need a parameter that describes whether the data is local or remote.
Then, we need some basic functions to support operations on remote storage.
2021-12-22 22:58:23 +08:00
2ab3a66e7a [docs][community] Remove articles (#7449)
The articles will be moved to https://github.com/apache/incubator-doris-website
And I will modify the READ of incubator-doris-website later
2021-12-21 18:50:09 +08:00
97749ed85b [community][chore] Modify .asf.yaml and fix BE build warning (#7439) 2021-12-21 11:06:12 +08:00
e9049605b6 [fix](flink-connector) Connector should visit the surviving BE nodes (#7435) 2021-12-21 11:05:42 +08:00
695eca8cbc [docs] add bloomfilter index doc (#7318)
* add bloomfilter index doc
2021-12-21 11:05:20 +08:00
HB
560c8b8911 [enhancement] Remove the two lines of duplicate import. (#7331) (#7332)
* [Enhancement] Remove the two lines of duplicate import. (#7331)
2021-12-21 11:04:53 +08:00
998489ac50 [fix](sql-block-rule) move sql block rule check from ConnectProcessor to StmtExecutor (#7407)
SqlBlockRule should block only query statements, and should exclude explain statements.
2021-12-21 10:25:09 +08:00
30db2cdd19 [fix](cache) Some view stmt cannot be obtained when view in the subquery and add cache key UT (#7375)
1. Fix bug that some view stmt cannot be obtained when view in the subquery
2. Add cache key UT
2021-12-21 10:21:28 +08:00
2d72c039ad [deps](openssl) upgrade openssl to 1.1.1m (#7446)
Upgrade openssl to 1.1.1m, in preparation for supporting the SM2 / SM3 / SM4 national secret (national commercial password) algorithms
2021-12-21 10:09:36 +08:00
7a1bb5b335 log4j upgrade to 2.17.0 (#7440)
Resolves CVE-2021-45105, the third security vulnerability discovered in Log4j
2021-12-21 09:28:02 +08:00
6c320dffe5 [community](github) Add .asf.yaml (#7431) 2021-12-20 15:13:24 +08:00
e74e55d2a4 [docs] Fix typos (#7404)
There are a few typos in the document, which have been corrected by me
2021-12-19 18:31:35 +08:00
f6e598dca2 Revert "[improvement](reader) optimize for single rowset reading (#7351)" (#7427)
Reverts apache/incubator-doris#7351

The reverted commit causes wrong results with agg tables.
For example, given an agg table `(k1, k2, v1 sum)` with a single non-overlapping rowset,

`select count(k1) from tbl1;` should use `_direct_agg_key_next_row` instead of `_agg_key_next_row`.

Otherwise it returns fewer rows than expected (because `_agg_key_next_row` aggregates only on `k1`).
2021-12-19 18:31:11 +08:00
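
Note on f6e598dca2: the row-count discrepancy described above can be illustrated with a small model. The sample rows are made up, and the code does not reproduce `_direct_agg_key_next_row` or `_agg_key_next_row`; it only contrasts counting stored rows directly with merging rows on `k1` alone.

```python
from collections import defaultdict

# Hypothetical rows of an aggregate table (k1, k2, v1 SUM); (k1, k2) is the key.
rows = [(1, "a", 10), (1, "b", 20), (2, "a", 5)]

# With a single non-overlapping rowset every (k1, k2) key appears once,
# so each stored row can be counted directly.
direct_count = sum(1 for k1, _k2, _v1 in rows if k1 is not None)

# Merging rows on the queried column k1 alone collapses distinct keys.
merged = defaultdict(int)
for k1, _k2, v1 in rows:
    merged[k1] += v1
merged_count = len(merged)
```

Here `direct_count` is 3 while `merged_count` is 2, which mirrors the "fewer rows than expected" symptom.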
e9536a8cf1 [deps](cyrus_sasl) Add -fPIC for cyrus_sasl (#7408) 2021-12-17 13:11:25 +08:00
06c38ce46e [enhancement] Allow concurrent_number for routine load tasks to be larger than the number of BEs (#7386)
* [enhancement] Allow concurrent_number for routine load tasks to be larger than the number of BEs

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2021-12-17 11:04:29 +08:00