c8845c9e07
[opt](scanner) Improve the efficiency of TOPN opt ( #29937 )
2024-01-16 18:37:44 +08:00
5e697990a8
[bugfix](timeout) serving_blocks_num may cause timeout, try to fix it ( #29912 )
...
Although serving_blocks_num is an atomic variable, its ++ and -- are not protected by the transfer lock.
I am not sure about the memory order of ++ and --.
I think this may be the root cause of the query timeout, so I removed the check and will test it in the GitHub pipeline.
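On the memory-order question above: in standard C++, `++` and `--` on a `std::atomic` integer are atomic read-modify-write operations that use `std::memory_order_seq_cst` by default. The following is a minimal sketch of that behavior only, not the Doris code; `serving_blocks_num` here is a stand-in variable and `run_workers` is a hypothetical helper.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Stand-in counter for illustration; not the actual Doris member.
std::atomic<int> serving_blocks_num{0};

// Hypothetical helper: every worker performs `iterations` ++/-- pairs without a lock.
// Returns the counter value after all workers have joined.
int run_workers(int workers, int iterations) {
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                ++serving_blocks_num; // equivalent to fetch_add(1, std::memory_order_seq_cst)
                --serving_blocks_num; // equivalent to fetch_sub(1, std::memory_order_seq_cst)
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return serving_blocks_num.load();
}
```

Because each increment and decrement is atomic, the counter itself cannot lose updates. The sketch does not cover a check that combines the counter with other state, which would still need a lock to be atomic as a whole.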
2024-01-16 18:34:19 +08:00
e4e57e9b05
[chore](removelogs) remove debug query timeout logs
2024-01-12 14:37:20 +08:00
ad2c13e009
[Optimize](kill-query)Support the scanners exits as soon as possible when kill query #29803
2024-01-12 13:58:19 +08:00
0d691c638b
[Feature](profile)Support report runtime workload statistics #29591
2024-01-12 11:59:27 +08:00
ca75c9b8ab
add more logs to debug timeout
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-12 11:48:39 +08:00
abb7640d37
[debug](timeout) add more log in scanner ctx to find timeout problem #29704
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-12 11:44:21 +08:00
8fc9c18c85
[improvement](jdbc catalog) Put the jdbc connection pool parameters into catalog properties ( #29195 )
2024-01-12 11:40:28 +08:00
9ef4e49307
[bugfix](scannerdeadloop) there is a dead loop in scanner ctx ( #29794 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-11 16:47:54 +08:00
c497f749ce
[debug](timeout) debug select timeout ( #29627 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-07 19:54:02 +08:00
0b731800a0
[enhancement](group_commit) refactor wal manager code ( #29560 )
2024-01-07 18:54:41 +08:00
f28dbc702c
[bugfix](scanner done) should not set process status to query context ( #29512 )
...
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-04 15:18:10 +08:00
bfe65565d8
[feature](paimon)support native reader ( #29339 )
...
Support native reader for Paimon.
Upgrade Paimon from 0.5 to 0.6: apache/doris-shade#32
2024-01-04 14:31:48 +08:00
bd8113f424
[bugfix](scannerscheduler) should minus num_of_scanners before check should schedule #28926 ( #29331 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-03 20:47:35 +08:00
c84cd30223
[pipelineX](fix) Fix query cancel timeout ( #29460 )
...
There are 2 potential causes of the pipelineX query cancel timeout:
1. Cancelling the fragment context first and then setting it ready to execute resets the cancel flag to false.
2. A deadlock.
2024-01-03 20:29:04 +08:00
2ed122b787
[improvement](task exec context) add parent class HasTaskExecutionCtx to own the task ctx ( #29388 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-02 15:28:27 +08:00
738abac9ed
[minor](context) duplicate query context in fragment ctx ( #29364 )
...
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-01 22:08:23 +08:00
b9572f9de0
[pipelineX](fix) Fix pip scanner context bug ( #29229 )
2023-12-29 13:24:39 +08:00
ffd178f5ff
[feat](pipelinex) support parallel scan on pipeline x engine ( #29070 )
...
* [feat](pipelinex) support parallel scan on pipeline x engine
* make parallel scan be independent of shared scan
2023-12-28 21:29:07 +08:00
c75e63a2a5
[Improvement](scan) Use scanner to do projection of scan node ( #29124 )
2023-12-27 16:00:52 +08:00
6440fbfab6
[feature](scan) Implement parallel scanning by dividing the tablets based on the row range ( #28967 )
...
* [feature](scan) parallel scan in dup/mow mode
* fix bugs
2023-12-26 17:18:41 +08:00
f30e50676e
[opt](scanner) optimize the number of threads of scanners ( #28640 )
...
1. Remove `doris_max_remote_scanner_thread_pool_thread_num`, use `doris_scanner_thread_pool_thread_num` only.
2. Set the default value of `doris_scanner_thread_pool_thread_num` to `std::max(48, CpuInfo::num_cores() * 4)`.
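
The default in item 2 can be sketched as follows. This is an illustration of the formula only: `default_scanner_thread_num` is a hypothetical helper name, and `std::thread::hardware_concurrency()` is assumed here as a stand-in for `CpuInfo::num_cores()`.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper mirroring the formula std::max(48, num_cores * 4).
int default_scanner_thread_num(int num_cores) {
    return std::max(48, num_cores * 4);
}

// Assumption: hardware_concurrency() stands in for CpuInfo::num_cores().
// It may return 0 when the core count is unknown, which still yields the floor of 48.
int default_scanner_thread_num() {
    return default_scanner_thread_num(
            static_cast<int>(std::thread::hardware_concurrency()));
}
```

With this formula the floor of 48 applies up to 12 cores; beyond that the pool size grows by 4 threads per core.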
2023-12-26 10:24:12 +08:00
7081139bdc
[fix](block) fix be core while mutable block merge may cause different row size between columns in origin block ( #27943 )
2023-12-25 20:35:22 +08:00
e9e1e2894b
[performance](variant) support topn 2phase read for variant column ( #28318 )
...
[performance](variant) support topn 2phase read for variant column
2023-12-25 11:50:41 +08:00
1545c36d16
Revert "[bugfix](scannercore) scanner will core in deconstructor during collect profile ( #28727 )" ( #28931 )
...
This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.
2023-12-24 20:37:33 +08:00
db1da161f5
[optimize](zonemap) skip zonemap if predicate does not support_zonemap ( #28595 )
...
* [optimize](zonemap) skip zonemap if predicate does not support_zonemap #27608 (#28506 )
2023-12-24 19:34:13 +08:00
4066de375e
[bugfix](scannercore) scanner will core in deconstructor during collect profile ( #28727 )
2023-12-23 11:09:46 +08:00
aca8406e31
[refactor](executor)remove scan group #28847
2023-12-22 17:05:50 +08:00
012e66729a
[improvement](executor) Add tvf and regression test for Workload Scheduler ( #28733 )
...
1. Add select workload schedule policy tvf
2. Add regression test
2023-12-22 12:09:51 +08:00
bcf2683b9d
[fix](scanner) fix concurrency bugs when scanner is stopped or finished ( #28650 )
...
`ScannerContext` keeps scheduling scanners even after it has been stopped, and the roles of `_is_finished` and `_should_stop` are confused.
This change only fixes the concurrency bugs that occur when a scanner is stopped or finished, as reported in https://github.com/apache/doris/pull/28384
2023-12-21 10:37:58 +08:00
970e1c8475
[fix](group_commit) fix group commit cancel stuck ( #28749 )
2023-12-21 10:32:21 +08:00
2b2d3d0eb1
[fix](meta_scanner) fix meta_scanner process ColumnNullable ( #28711 )
2023-12-20 17:41:38 +08:00
b142ade69e
[refactor](renamefile) rename some files according to the class names ( #28606 )
2023-12-19 14:10:11 +08:00
66fbb22ad7
[fix](group commit) Fix some wal problems on group commit ( #28554 )
2023-12-19 09:51:03 +08:00
73f7b61019
[refactor](scanner) use weak ptr to lock task execution context to avoid core in scanner dctor ( #28493 )
...
Use a weak ptr as a lock between the fragment execution thread and the scanner thread, to fix the core dump that occurs when the scanner's destructor accesses the scan node's profile.
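
The weak pointer idea can be sketched as below. This is a minimal illustration under assumptions, not the Doris implementation: `TaskExecutionContext` and `Scanner` are simplified hypothetical stand-ins, and `update_profile` is an invented method name.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the context that owns the profile.
struct TaskExecutionContext {
    std::string profile;
};

// Hypothetical scanner that holds only a weak reference to the context.
class Scanner {
public:
    explicit Scanner(std::weak_ptr<TaskExecutionContext> ctx) : _ctx(std::move(ctx)) {}

    // Returns true if the profile was updated, false if the context is already gone.
    bool update_profile(const std::string& info) {
        // lock() returns a shared_ptr that keeps the context alive for this scope,
        // or an empty pointer if the context has already been destroyed.
        std::shared_ptr<TaskExecutionContext> ctx = _ctx.lock();
        if (!ctx) {
            return false;
        }
        ctx->profile += info;
        return true;
    }

private:
    std::weak_ptr<TaskExecutionContext> _ctx;
};
```

The scanner never extends the lifetime of the context on its own; it only pins the context for the duration of one access, so a late access after teardown is skipped instead of touching freed memory.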
2023-12-18 14:09:32 +08:00
469edbdd3d
[feature](executor)make scan task wait timeout config #28467
2023-12-16 11:36:15 +08:00
9fe2fce306
[minor](refactor) remove unused code ( #28383 )
2023-12-14 17:16:41 +08:00
ec91dd1129
[opt](vfilescanner) interrupt running parquet/orc readers when scannode is finished ( #28223 )
...
VScanNode::get_next checks whether the ScanNode has reached its limit condition and, if so, sends eos to TaskScheduler, which then tries to close the ScanNode.
However, the ScanNode must wait for all running scanners to finish, so even if it has reached the limit condition, it cannot be closed immediately.
This PR tries to interrupt the running readers so that the ScanNode can end as soon as possible.
2023-12-13 19:31:08 +08:00
13b9350aeb
[Bug](scan)fix some case query timeout of not schedule scanner ( #28243 )
...
Currently, in the pipeline engine, when the result block queue is empty, the scan is rescheduled and a batch of scanners is chosen.
However, get_available_thread_slot_num() sometimes returns thread_slot_num <= 0, in which case nothing is scheduled
and the block queue stays empty.
There is then no chance to reschedule again until the query times out.
2023-12-12 21:00:22 +08:00
5ff110e845
[exec](profile) only build expr debug string enable profile ( #28261 )
2023-12-12 09:13:37 +08:00
8f2202c89d
[minor](log) Add debug info in operators ( #28211 )
2023-12-11 10:02:24 +08:00
9461e86b10
[pipelineX](debug) add debug string ( #28137 )
...
* [pipelineX](debug) add debug string
* update
2023-12-07 23:21:10 +08:00
8df94f0d07
[fix](remote-scanner-pool) missing _remote_thread_pool_max_size value ( #28057 )
2023-12-07 11:18:42 +08:00
54d062ddee
[feature](stream load) (step one)Add arrow data type for stream load ( #26709 )
...
By using the Arrow data format, we can reduce the amount of data transferred by stream load and improve data import performance.
2023-12-06 23:29:46 +08:00
1be513b927
[pipelineX](local shuffle) Fix local shuffle for colocate/bucket join ( #28032 )
2023-12-06 10:02:36 +08:00
6074cddcf8
[feature](mtmv)add Job and task tvf ( #27967 )
...
add:
select * from jobs("type"="mv");
select * from tasks("type"="mv");
select * from jobs("type"="insert");
select * from tasks("type"="insert");
add privilege check for mv_infos("database"="xxx");
change JobType from MTMV to MV
2023-12-05 15:12:36 +08:00
54fe1a166b
[Refactor](scan) refactor scan scheduler to improve performance ( #27948 )
...
* [Refactor](scan) refactor scan scheduler to improve performance
* fix pipeline x core
2023-12-05 13:03:16 +08:00
e3d2425d47
[Improvement](join) remove insert_indices_from_join and special judge for -1 ( #27779 )
...
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
d2a99aa03b
[refactor](scan) change scan reschedule into scan context ( #27766 )
...
* [refactor](scan) change scan reschedule into scan context
2023-12-04 10:25:52 +08:00
54b5d04ff9
[improve](csv_reader) handle csv reader error ( #27892 )
2023-12-02 10:05:02 +08:00