6608 Commits

Author SHA1 Message Date
491dd34ba7 [fix](planner) fix orthogonal_bitmap_union_count plan : wrong PREAGGREGATION (#12095)
The execution plan displayed when using the orthogonal_bitmap_union_count function:

PREAGGREGATION: OFF

Reason: Invalid Aggregate Operator: orthogonal_bitmap_union_count

The correct plan is: PREAGGREGATION: ON
Co-authored-by: lihuigang <lihuigang@meituan.com>
2022-09-08 15:00:43 +08:00
461a4cc94e [Enhancement](Error Msg) show details of COLUMN and TABLE name regex #11999
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-09-08 14:59:39 +08:00
824a192f8f [enhancement](http) executeSQL rest api support streaming response (#12239) 2022-09-08 14:57:15 +08:00
9225dd16ca [fix](grouping sets) grouping sets cause be core or return wrong results (#12313) 2022-09-08 14:55:50 +08:00
c3af60eff8 [fix](threadpool) threadpool schedules does not work right on concurr… (#12370)
* [fix](threadpool) threadpool schedules does not work right on concurrent token

Assume there is a concurrent thread token whose concurrency is 2. The first
task submitted on the token is handed to the thread pool, while the second is
not because the thread pool is busy. The token's active_threads is then 1, and
the thread pool does not schedule the token.

This patch fixes the problem.
2022-09-08 14:54:46 +08:00
26cf2d3742 [enhancement](array-type) avoid abuse of Offset and Offset64 #12378
PR #12341 already separated Array Offset64 from String Offset (32-bit).

This change restricts Offset to IColumn and Offset64 to ColumnArray to prevent misuse.
Using the wrong one now fails to compile.
2022-09-08 14:53:07 +08:00
53b619c487 [brpc]using pooled connection and enlarge brpc connection timeout and retry… (#10443)
* using pooled connection and enlarge brpc connection timeout and retry times

When a connection failure happens, Doris fails the queries that use the connection.
We should lower the impact of a connection failure by using pooled connections
and enlarging the connection timeout and retry times.

* clang format
2022-09-08 14:50:15 +08:00
af0f4584d5 fix cache cleaner (#12432) 2022-09-08 13:31:19 +08:00
6cd06f7586 [typo](docs) INSERT documentation fix (#12455)
INSERT documentation fix
2022-09-08 13:09:08 +08:00
74ffdbeebc [feature](Nereids) Support OneRowRelation and EmptyRelation (#12416)
Support OneRowRelation and EmptyRelation.

OneRowRelation: `select 100, 'abc', substring('abc', 1, 2)`
EmptyRelation: `select * from tbl limit 0`

Note:
PhysicalOneRowRelation will translate to UnionNode(constExpr) for BE execution
2022-09-08 12:21:13 +08:00
2a64571bef [enhancement](generic_iterator) fix num check and add some notes (#12434)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-09-08 12:09:02 +08:00
a6880ca573 [fix](Nereids) throw IndexOutOfBoundsException in DistributionSpecHash#equalsSatisfy (#12446)
In the earlier PR #11976, we changed DistributionSpecHash#equalsSatisfy but forgot to check whether both sides have the same length. When the required spec has more shuffle slots than the current one, an exception is thrown.
2022-09-08 11:41:48 +08:00
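The missing length check described in a6880ca573 can be illustrated with a minimal Python sketch. The function names and the list-of-ids representation are hypothetical and not taken from the Doris source; the real `DistributionSpecHash#equalsSatisfy` is Java and compares shuffle slots with richer equivalence rules.

```python
def equals_satisfy_unchecked(current_slots, required_slots):
    # Buggy shape (assumed): walks the positions of required_slots and
    # indexes current_slots without comparing lengths first.
    for i in range(len(required_slots)):
        if current_slots[i] != required_slots[i]:
            return False
    return True


def equals_satisfy(current_slots, required_slots):
    # Fixed shape: specs of different lengths can never satisfy each other,
    # so reject them before any positional comparison.
    if len(current_slots) != len(required_slots):
        return False
    return equals_satisfy_unchecked(current_slots, required_slots)
```

In this sketch, `equals_satisfy_unchecked([1], [1, 2])` raises `IndexError`, the Python counterpart of the `IndexOutOfBoundsException` named in the commit title, while `equals_satisfy([1], [1, 2])` returns `False`.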
dd2f834c79 [feature-wip](parquet-reader) bug fix, create compress codec before parsing dictionary (#12422)
## Fix five bugs:
1. Parquet dictionary data may be compressed, but `ColumnChunkReader` tries to parse the dictionary data before creating the compression codec, causing unexpected data errors.
2. `FE` doesn't resolve the array type.
3. `ParquetFileHdfsScanner` doesn't fill partition values when the table is partitioned.
4. `ParquetFileHdfsScanner` sets `_scanner_eof = true` when a scan range is empty, which ends the scanner early and results in data loss.
5. Typographical error in `PageReader`.
2022-09-08 09:54:25 +08:00
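The first bug in dd2f834c79 is an ordering problem: a compressed dictionary page has to be decompressed before its entries are parsed. A minimal Python sketch of that ordering follows. It assumes little-endian int32 dictionary entries and uses `zlib` as a stand-in codec; the function names are hypothetical, and the real reader is C++ and picks the codec from the Parquet column metadata.

```python
import struct
import zlib


def parse_dictionary(raw):
    # Interpret the raw bytes as consecutive little-endian int32 entries.
    count = len(raw) // 4
    return list(struct.unpack("<%di" % count, raw[: count * 4]))


def read_dictionary(page_bytes, compressed):
    # Decompress first, then parse. Parsing the compressed bytes directly
    # would yield garbage values instead of the real dictionary.
    data = zlib.decompress(page_bytes) if compressed else page_bytes
    return parse_dictionary(data)


values = [7, 8, 9]
page = zlib.compress(struct.pack("<3i", *values))
decoded = read_dictionary(page, compressed=True)
```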
d40a9d0555 [fix](memtracker) Fix memtracker did not subtract the memory released by load channel cancel (#12405)
When the load channel is canceled, the memtracker does not subtract the memory released by the load channel. This will cause the memory usage counted by the memtracker of the load channel mgr to be larger than the actual memory usage.
2022-09-08 09:22:11 +08:00
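The accounting rule behind d40a9d0555 is that every byte reported to a tracker must be reported back when the owner releases it, including on cancel. The Python sketch below illustrates that rule only; the class and method names are hypothetical and do not mirror the Doris `MemTracker` or load channel APIs.

```python
class MemTracker:
    """Toy tracker that only counts bytes currently in use."""

    def __init__(self):
        self.consumption = 0

    def consume(self, num_bytes):
        self.consumption += num_bytes

    def release(self, num_bytes):
        self.consumption -= num_bytes


class LoadChannel:
    """Toy load channel that reports its buffered bytes to a parent tracker."""

    def __init__(self, parent_tracker):
        self._parent = parent_tracker
        self._buffered = 0

    def add_batch(self, num_bytes):
        self._buffered += num_bytes
        self._parent.consume(num_bytes)

    def cancel(self):
        # Cancelling frees the buffers, so the parent tracker must be told.
        # Skipping this release leaves the parent over-counting.
        self._parent.release(self._buffered)
        self._buffered = 0


tracker = MemTracker()
channel = LoadChannel(tracker)
channel.add_batch(100)
channel.add_batch(50)
before_cancel = tracker.consumption
channel.cancel()
after_cancel = tracker.consumption
```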
41bc6b857d [refactor](shuffle) remove unused code (#12442) 2022-09-08 09:15:25 +08:00
018b4b7e1e [bugfix](report) fix continuous version miss check (#12415)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-09-08 08:39:22 +08:00
e7aa131506 [enhancement](tcmalloc) add aggressive_memory_decommit conf and make it disable (#12436)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-09-08 08:37:16 +08:00
a536030979 [FOLLOWUP](load) fix nullable and add regression (#12375)
* [FOLLOWUP](load) fix nullable and add regression
2022-09-08 00:05:04 +08:00
86e347f3bb [Bug](doe) fix closing scanner twice (#12408) 2022-09-07 22:45:30 +08:00
569ab30556 [bug](NodeChannel) fix OOM caused by pending queue in sink send (#12359) (#12362)
Each NodeChannel has its own queue, with a size of up to 1/20 of exec_mem_limit.
Users will run into OOM if exec_mem_limit is set high. This commit uses a
fixed number to control the total maximum memory used by NodeChannels.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2022-09-07 20:49:08 +08:00
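The arithmetic behind 569ab30556 can be sketched in Python. Only the 1/20 ratio comes from the commit message; the 40 GiB limit, the channel count, and the `can_enqueue` helper with its global cap are hypothetical values and names chosen for illustration.

```python
GIB = 1024 ** 3


def per_channel_limit(exec_mem_limit):
    # Old behaviour per the commit message: each queue may grow to
    # 1/20 of exec_mem_limit.
    return exec_mem_limit // 20


def worst_case_total(exec_mem_limit, num_channels):
    # The per-channel bound scales with the number of channels,
    # so the total is unbounded in practice.
    return per_channel_limit(exec_mem_limit) * num_channels


def can_enqueue(total_pending_bytes, batch_bytes, global_cap_bytes):
    # Assumed shape of a fixed global cap: admit a batch only if the sum
    # over all channels stays under one constant.
    return total_pending_bytes + batch_bytes <= global_cap_bytes


total = worst_case_total(40 * GIB, 100)
```

With these illustrative numbers, each channel may hold 2 GiB, so 100 channels can hold 200 GiB of pending data, five times the 40 GiB limit itself.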
bdbce77227 [fix](nereids) cast left child of TimestampArithmetic to wrong type in BindFunction (#12423) 2022-09-07 20:32:47 +08:00
c2808de867 [Doc](balance)add replica balance speed param (#12406)
* update balance param
2022-09-07 19:41:45 +08:00
184be8d13c [fix](array-type) ARRAY is not supported in bloomfilter index (#12353)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-09-07 18:00:01 +08:00
941bda5a20 [enhancement](spark-load)support dynamic set env (#12276)
* [enhancement](spark-load)support dynamic set env and display spark appid

* [enhancement](spark-load)support dynamic set env
2022-09-07 16:24:29 +08:00
40f481049a [fix](Nereids)lowest cost plan map do not be merged when do group merge (#12396)
* [fix](Nereids)lowest cost plan map do not be merged when do group merge
2022-09-07 16:13:11 +08:00
09b45f2b71 [Function](ELT)Add elt function (#12321) 2022-09-07 15:21:08 +08:00
f2923f9180 [Refactor](Nereids) Simplify get input and output slots for plan/expression. (#12356)
Simplify the code for getting input/output slots from `Expression` or `Plan`.

**New interfaces**

`Expression`:
- `getInputSlots`: Get all the input slots of the expression.

`Plan`:
- `getOutputSet`: Get the output slot set of the plan.
- `getInputSlots`: Get the input slot set of the plan.

**Changed interface**

`TreeNode`:
- `collect`: returns a `set` instead of a `list`.
2022-09-07 14:05:37 +08:00
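The effect of the `collect` change in f2923f9180 can be sketched in Python: when the same node is reachable more than once, a set result holds it once. The `Node` class below is a hypothetical stand-in, not the Nereids `TreeNode` interface.

```python
class Node:
    """Toy tree node; equality and hashing fall back to object identity."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def collect(self, predicate):
        # Gather matching nodes into a set, so a node reachable through
        # several parents appears only once in the result.
        found = set()
        stack = [self]
        while stack:
            node = stack.pop()
            if predicate(node):
                found.add(node)
            stack.extend(node.children)
        return found


slot = Node("slot#1")
root = Node("add", [slot, slot])  # the same slot is referenced twice
slots = root.collect(lambda n: n.name.startswith("slot"))
```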
0bb06a1fa7 [feature](Nereids) let nullable of Year, WeekOfYear and Divide be the same as implementation in BE (#12374)
These functions/expressions should always be nullable, so the overridden method simply returns true:
- Year
- WeekOfYear
- Divide
2022-09-07 13:09:08 +08:00
46776af2a3 [fix](Nereids)plan translator lost other conjuncts on hash join node (#12391)
In the earlier PR #11812, we split the join condition into two parts: hash join conjuncts and other conditions. But we forgot to translate the other conditions into the other conjuncts of the legacy planner's HashJoinNode, so a query with other conditions on the join node returns wrong results. For example:

SELECT * FROM lineorder INNER JOIN part ON lo_partkey = p_partkey WHERE lo_orderkey > p_size;
2022-09-07 11:32:05 +08:00
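Why dropping the other conjuncts in 46776af2a3 changes results can be shown with a small Python hash join. This is a sketch under stated assumptions: rows are plain dicts, the function and parameter names are hypothetical, and the sample rows are made up. It is not the Doris `HashJoinNode`.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key, other_conjuncts=()):
    # Build phase: index the right side by its equi-join key.
    table = defaultdict(list)
    for right in right_rows:
        table[right[right_key]].append(right)

    # Probe phase: match on the key, then apply the non-equi conditions.
    # If other_conjuncts is lost, every key match is emitted.
    result = []
    for left in left_rows:
        for right in table.get(left[left_key], []):
            joined = {**left, **right}
            if all(pred(joined) for pred in other_conjuncts):
                result.append(joined)
    return result


lineorder = [
    {"lo_orderkey": 5, "lo_partkey": 1},
    {"lo_orderkey": 1, "lo_partkey": 1},
]
part = [{"p_partkey": 1, "p_size": 3}]

with_filter = hash_join(
    lineorder, part, "lo_partkey", "p_partkey",
    other_conjuncts=[lambda row: row["lo_orderkey"] > row["p_size"]],
)
without_filter = hash_join(lineorder, part, "lo_partkey", "p_partkey")
```

Here `without_filter` keeps both key matches, while `with_filter` keeps only the row with `lo_orderkey` 5, which is the result the query above expects.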
449d0c219f [Improvement](sort) Accumulate blocks to do partial sort (#12336) 2022-09-07 10:34:28 +08:00
42bdde8750 [Feature](Vectorized) support jdbc scan node (#12010) 2022-09-07 10:29:41 +08:00
54d1630c42 [Opt](vectorized) speed up hash function compute in hash partition (#12334)
After optimizing the hash function, the time spent computing siphash for HASH_PARTITION in vdata_stream_sender:

Before: 1s800ms
After: 800ms
2022-09-07 10:11:40 +08:00
e4b894a318 [Bug](remote) Fix BE crash because of call the future's get method twice (#12357)
Call the future's get method once and save the result.
2022-09-07 10:11:27 +08:00
445f0882d1 [Enhancement](log) improve error msg for delta writer fail (#12121) (#12360)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2022-09-07 10:10:51 +08:00
de9b9b3e8e [chore](ut) enable asan core dump when running be ut (#12371) 2022-09-07 10:09:18 +08:00
3485dfa927 [chore](profile) add some counters in aggregatation & sender (#12385) 2022-09-07 10:09:05 +08:00
d410797200 [fix](regression p0) fix regression p0 test qt_window_hang2 always failed because of timeout #12388
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-09-07 10:08:12 +08:00
232d17efea [Enhancement](sparkload) cast the src slot types of bitmap columns to bitmap when FE push tasks in spark load (#12394)
In the current spark load implementation, the types of the source data that BE reads from the Broker are all set to varchar.
However, varchar and bitmap are no longer compatible after version 1.1.0, which causes spark load to fail.

An example of spark load error message:

detailMessage = type not match, originType=VARCHAR(*), targeType=BITMAP

Set the src type of the bitmap columns from varchar to bitmap when FE pushes tasks.
2022-09-07 10:07:38 +08:00
9ccc39c164 [Enhancement](regression) add regression tests for executeSQL http rest api #12265 2022-09-07 10:02:37 +08:00
a465549f5e [feature](Nereids)support parse and analyze having clause (#12129)
Implement the having clause for Nereids Planner.

NOTE:

This PR only aims to make the Nereids Planner generate the correct logical and physical plans. Runtime correctness is not a goal of this PR because GROUP BY is not yet ready in the Nereids Planner.
2022-09-07 09:47:03 +08:00
922b04fdc1 [Improvement](vectorized) change static_cast to assert_cast for reference (#12379)
* [Improvement](vectorized) change `static_cast` to `assert_cast` for reference
2022-09-07 09:27:13 +08:00
772e5907f2 [enhancement](test) add some p0 cases (#12240) 2022-09-07 09:10:42 +08:00
5f255af065 [Enhancement](docker): Add elasticsearch docker file (#12377) 2022-09-07 08:47:10 +08:00
893567628e [fix](exec-node) fix nullptr of runtime state (#12395)
Remove default nullptr runtime state, which is very error-prone
2022-09-07 08:46:42 +08:00
55fb90d6ae [feature](Nereids)add colocate, shuffle and bucket shuffle join algorithm to Nereids (#11976)
This PR:
1. adds support in Nereids for the following join algorithms, which the legacy planner already supports
- colocate join
- bucket shuffle join
- shuffle join
- broadcast join

2. updates all cost, enforce, and derive utils
- ChildOutputPropertyDeriver
- EnforceMissingPropertiesHelper
- RequestPropertyDeriver

3. adds a local quick sort plan used when enforcing properties
4. sets PhysicalProperties on PhysicalPlan when choosing the best plan from the memo
5. renames Job#pushTask to Job#pushJob
2022-09-07 00:31:21 +08:00
4c36e3dfa6 [fix](Nereids)LogicalAggregate's equals and hashCode missing two attributes (#12393)
After applying the NormalizeAggregate rule, the owner groups of all aggregate children are removed.
The root cause is that the new aggregate node is regarded as equal to the old aggregate node, because LogicalAggregate.equals() does not take some attributes ("normalized", "disassembled") into account.
2022-09-07 00:07:26 +08:00
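The failure mode in 4c36e3dfa6 can be reproduced in miniature with Python dataclasses. The class names are hypothetical; only the attribute names "normalized" and "disassembled" come from the commit message. Marking a field with `compare=False` excludes it from the generated `__eq__` and `__hash__`, which mimics an `equals`/`hashCode` pair that ignores those attributes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BuggyAggregate:
    group_by: tuple
    outputs: tuple
    # Excluded from equality and hashing, like the missing attributes.
    normalized: bool = field(default=False, compare=False)
    disassembled: bool = field(default=False, compare=False)


@dataclass(frozen=True)
class FixedAggregate:
    group_by: tuple
    outputs: tuple
    # Included in equality and hashing.
    normalized: bool = False
    disassembled: bool = False


old_buggy = BuggyAggregate(("k",), ("sum(v)",))
new_buggy = BuggyAggregate(("k",), ("sum(v)",), normalized=True)

old_fixed = FixedAggregate(("k",), ("sum(v)",))
new_fixed = FixedAggregate(("k",), ("sum(v)",), normalized=True)
```

In this sketch a hash-based container treats `new_buggy` as a duplicate of `old_buggy`, so the rewritten node is silently mistaken for the original, while the fixed variants stay distinct.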
3a0aae1b82 [enhancement](explain)add projections and output id in explain string (#12358)
In the earlier PR #11842, we added the ability to project on each ExecNode.
But the projection expr list is not shown in explain, which makes debugging inconvenient.
This PR adds the projections to the explain string if they exist.
2022-09-06 21:03:02 +08:00
b8cc576cba [fix](array-type) add data valid check for ARRAY type while insert or load (#12283)
Add data valid check for ARRAY type while insert or load
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-09-06 20:48:58 +08:00
c975d71fd4 [typo](docs)Sql blacklist documentation fix (#12376)
SQL blacklist documentation fix
2022-09-06 19:34:05 +08:00
b398fd60fc [DOCS](function) Add docs for new time functions (#12382)
Add docs for new time functions
2022-09-06 19:33:41 +08:00