Commit Graph

1308 Commits

Author SHA1 Message Date
be7273da83 [refactor](executor)Refactor workload meta update to be #26710 2023-11-18 11:19:38 +08:00
b1eef30b49 [pipelineX](dependency) Wake up task by dependencies (#26879)
Co-authored-by: Mryange <2319153948@qq.com>
2023-11-18 03:20:24 +08:00
5d548935e0 [improvement](insert) support schema change and decommission for group commit (#26359) 2023-11-17 21:41:38 +08:00
0a1a6cf02f [fix](topn) add defensive code in topn opt to avoid crash due to column not in tablet schema 2023-11-17 21:14:10 +08:00
3ad865fef9 [refactor](storage) Expressing the types of computation layer and storage layer in PrimitiveTypeTraits (#26191) 2023-11-15 21:34:49 +08:00
30d1e6036c [feature](runtime filter) New session variable runtime_filter_wait_infinitely (#26888)
New session variable: runtime_filter_wait_infinitely. If runtime_filter_wait_infinitely = true, the consumer of a runtime filter (rf) keeps waiting to receive it until the query times out.
2023-11-14 21:05:59 +08:00
f6a9914bc7 [feature](move-memtable) support auto partition in sink v2 (#26914) 2023-11-14 11:39:44 +08:00
de6ecd2035 [fix](tls) Manually track memory in Allocator instead of mem hook and ThreadContext life cycle to manual control (#26904)
Manually track query/load/compaction/etc. memory in Allocator instead of mem hook.
Mem Hook can still be used for code segments whose memory cannot be tracked manually, and to find memory locations during debugging.
This causes some loss of memory tracking for Query, less than 10% compared to the past, but tracking is expected to be more controllable.
Similarly, Mem Hook will no longer track unowned memory to the orphan mem tracker by default, so the total memory of all MemTrackers will be less than before.
Mem Hook no longer needs to get the memory size from jemalloc on each memory alloc and free, which cost performance in the past.
The memory hook no longer requires caching bthread local in pthread local. In the past this caused core dumps inside bthread, which seems to be a bug in bthread.
ThreadContext life cycle to manual control
In the past, ThreadContext was automatically created when it was first used (usually in the Jemalloc Hook on the first malloc), and was automatically destroyed when the thread exited.
Now the creation and destruction of ThreadContext are controlled manually instead: it is mainly created when the task thread starts and destroyed before the task thread ends.
Run 43 clickbench query tests.
Use MemHook in the past:
2023-11-14 10:30:42 +08:00
b19abac5e2 [fix](move-memtable) pass num local sink to backends (#26897) 2023-11-14 08:28:49 +08:00
c0fda8c5c2 [improve](group commit) Add a switch to wait internal group commit lo… (#26734)
* [improve](group commit) Add a switch to make internal group commit load finish

* modify group commit tvf plan
2023-11-13 10:35:35 +08:00
d767804815 [feature](merge-cloud) Decouple rowset id generator and local rowsets gc implementation (#25921) 2023-11-10 10:07:02 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
58bf79f79e [fix](move-memtable) pass load stream num to backends (#26198) 2023-11-08 16:16:33 +08:00
6637f9c15f Add enable_cgroup_cpu_soft_limit (#26510) 2023-11-08 15:52:13 +08:00
a3666aa87e [feature](decimal) support decimal256 when creating table (#26308) 2023-11-08 15:21:01 +08:00
47ba4aaf30 [Enhancement](load) add timer and partitions number limit (#26549)
2023-11-08 11:22:40 +08:00
1544110c1b [feature-wip](arrow-flight)(step4) Support other DML and DDL statements, besides Select (#25919)
Design Documentation Linked to #25514
2023-11-08 10:50:42 +08:00
607a5d25f1 [feature](streamload) support HTTP request with chunked transfer (#26520) 2023-11-08 10:07:05 +08:00
a354f87d2e [refactor](pipeline) simplify runtime state ctor (#26461) 2023-11-08 09:57:09 +08:00
4995ca8fba [fix](move-memtable) ensure segment is flushed before add segment (#26522) 2023-11-07 22:42:16 +08:00
32b36d3c9c [refactor](move-memtable) rename proto OpenStreamSink to OpenLoadStream (#26527) 2023-11-07 22:41:20 +08:00
5d80e7dc2f [Improvement](pipelineX) Improve local exchange on pipelineX engine (#26464) 2023-11-07 22:11:44 +08:00
a404ff5ab9 [fix](regression) fix group commit regression test (#26519) 2023-11-07 18:17:45 +08:00
ef95e962c7 [fix](timev2) fix Type not implemented in fold by be (#26478) 2023-11-07 17:25:20 +08:00
277329c035 [fix](auditlog) fix without lock in QueryStatisticsRecvr find #26440 2023-11-07 13:53:22 +08:00
1a83a39aec Revert "[fix](auto-partition) Fix auto partition concurrent conflict (#26166)" (#26448)
This reverts commit f22611769944e78c28f1b0a1eeb7b7414a16e8db.
2023-11-06 16:39:19 +08:00
2cc68381ec [feature](binlog) Add ingest_binlog/http_get_snapshot limit download speed && Add async ingest_binlog (#26323) 2023-11-06 11:14:44 +08:00
f226117699 [fix](auto-partition) Fix auto partition concurrent conflict (#26166) 2023-11-06 10:34:26 +08:00
b19f275714 [improvement](insert) refactor group commit insert into (#25795) 2023-11-03 12:02:40 +08:00
a5ef90dacc [enhancement](recover) support skipping missing version in select by session variable (#25654) 2023-11-02 20:01:51 +08:00
387e33fa34 [enhancement](group commit)Add group commit block queues memory back pressure (#26045) 2023-11-01 16:29:45 +08:00
4644191fd0 [fix](broker-read) refactor broker reading process to avoid null broker connection (#26050) 2023-11-01 15:58:30 +08:00
8f15f9adf6 [test](case) add test case to improve code coverage (#25516)
2023-11-01 12:51:12 +08:00
Pxl
696ecc8c83 [Chore](log) adjust error code on too many filtered rows (#26168) 2023-11-01 00:15:56 +08:00
Pxl
15ba886725 do not print stack when stream load catch failed status on thrift (#26062)
2023-10-30 10:36:01 +08:00
e20cab64f4 [improvement](scan) avoid too many scanners for file scan node (#25727)
Previously, when using a file scan node (e.g., querying a Hive table), the max number of scanners for each scan node
was `doris_scanner_thread_pool_thread_num` (default is 48).
If the query parallelism is N, the total number of scanners would be 48 * N, which is too many.

This PR changes the logic: the max number of scanners for each scan node
is now `doris_scanner_thread_pool_thread_num / query parallelism`, so that the total number of scanners
is at most `doris_scanner_thread_pool_thread_num`.

Reducing the number of scanners can significantly reduce the memory usage of a query.
2023-10-29 17:41:31 +08:00
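The arithmetic in the e20cab64f4 message above can be sketched as follows. This is only an illustration of the division described there, not the code from the PR: the function name, the use of integer division, and the floor of one scanner per node are assumptions made for this example.

```python
def max_scanners_per_scan_node(thread_pool_size: int, query_parallelism: int) -> int:
    """Per-scan-node scanner cap, following the commit message's formula.

    thread_pool_size stands in for doris_scanner_thread_pool_thread_num.
    Integer division and the minimum of 1 are assumptions of this sketch.
    """
    if thread_pool_size < 1 or query_parallelism < 1:
        raise ValueError("both arguments must be positive")
    # Assumption: never return 0, so every scan node can still make progress.
    # When parallelism exceeds the pool size this floor lets the total
    # exceed thread_pool_size.
    return max(1, thread_pool_size // query_parallelism)


def total_scanners(thread_pool_size: int, query_parallelism: int) -> int:
    """Total scanners across all parallel instances of one scan node."""
    return max_scanners_per_scan_node(thread_pool_size, query_parallelism) * query_parallelism


# Old behaviour per the message: 48 scanners per node, so 48 * N in total.
# New behaviour with the default of 48 and a parallelism of 8:
per_node = max_scanners_per_scan_node(48, 8)   # 6 scanners per scan node
total = total_scanners(48, 8)                  # 48 scanners in total
```

With a parallelism that does not divide the pool size evenly (for example 5), this sketch rounds down, so the total stays below the pool size.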
606223ab62 Revert "[refactor](pipeline) simplify runtime state ctor (#25995)" (#26029)
This reverts commit a01922cdc55e2b3a63d9a9aafb38ac5ed64c6dd3.
2023-10-27 18:15:30 +08:00
a01922cdc5 [refactor](pipeline) simplify runtime state ctor (#25995) 2023-10-27 15:45:29 +08:00
46d40b1952 [refactor](executor)Remove empty group logic #26005 2023-10-27 14:24:41 +08:00
c1d64a7128 [Feature](datatype) Add IPv4/v6 data type for doris (#24965) 2023-10-26 17:33:28 +08:00
1ba8a9bae4 [feature-wip](executor)Fe send topic info to be (#25798) 2023-10-26 15:52:48 +08:00
d6c64d305f [chore](log) Add log to trace query execution #25739 2023-10-26 14:09:25 +08:00
4434b3f32e [fix](move-memtable) use pthread mutex in LoadStreamMgr (#25882) 2023-10-26 09:19:59 +08:00
e783ef716f [fix](multi-table) fix unknown source slot descriptor when load multi table (#25762) 2023-10-25 21:52:01 +08:00
e8f479882d [pipelineX](local exchange) Add local exchange operator (#25846) 2023-10-25 18:45:02 +08:00
693982fd1a [feature](decimal) support decimal256 (#25386) 2023-10-25 15:47:51 +08:00
97c2fe75d1 [feature](pipelineX) use expected<T, Status> in local_state (#25878) 2023-10-25 15:23:17 +08:00
235ae9ded4 [improvement](fragment) optimize to get query context logic (#25621) 2023-10-25 14:03:47 +08:00
5e3277e8fb [improvement](routine-load) add routine load rows check (#25818) 2023-10-25 11:04:28 +08:00
552091f21f [performance](pipelineX) optimize pipelineX (#25713) 2023-10-25 10:13:17 +08:00