14630 Commits

Author SHA1 Message Date
f018b00646 [ci](perf) add new pipeline of tpch-sf100 (#26334)
* [ci](perf) add new pipeline of tpch-sf100
Co-authored-by: stephen <hello-stephen@qq.com>
2023-11-08 15:32:02 +08:00
a3666aa87e [feature](decimal) support decimal256 when creating table (#26308) 2023-11-08 15:21:01 +08:00
xy
be7d49cb9f [Fix](doc) Fixed some errors in the documentation (#26410)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-08 15:19:34 +08:00
f80495da83 [fix](Nereids) ban right outer, right anti, full outer with bucket shuffle (#26529)
If the left bucket has no data, we do not generate a left bucket instance.
These joins should preserve all right-side data, but because the left
instance does not exist, the right data is discarded since no destination is set.

We ban these joins temporarily until we can generate all instances
for the left side in the Coordinator.
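
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Doris code: the function name, the modulo bucketing, and the dictionary layout are all hypothetical and exist purely to show why right-side rows vanish when no instance exists for an empty left bucket.

```python
def bucket_shuffle_right_rows(left_buckets, right_keys, num_buckets):
    """Toy model: route right-side keys to left bucket instances.

    Assumption (hypothetical): an instance exists only for a left bucket
    that holds data, and a key belongs to bucket `key % num_buckets`.
    """
    # Instances are generated only for non-empty left buckets.
    instances = {bucket for bucket, rows in left_buckets.items() if rows}

    delivered, dropped = [], []
    for key in right_keys:
        bucket = key % num_buckets
        if bucket in instances:
            delivered.append(key)
        else:
            # No destination is set, so the row is silently discarded.
            dropped.append(key)
    return delivered, dropped


# Left bucket 1 is empty, so every right row hashed to it is lost,
# which is wrong for right outer, right anti, and full outer joins.
delivered, dropped = bucket_shuffle_right_rows({0: [0, 4], 1: []}, [0, 1, 2, 3], 2)
```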
2023-11-08 01:16:50 -06:00
5d4557938a [regression-test](fix) fix export_struct bug (#26561) 2023-11-08 14:57:07 +08:00
fc304c0e7c (metric) add histogramJsonMetric and nodeInfo (#26172)
Add histogramJsonMetric and nodeInfo to the interface "http://fe_host:http_port/metrics?type=json".
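
A minimal sketch of querying that interface from Python follows. The helper names are hypothetical, the port 8030 in the usage comment is an assumed FE HTTP port, and the response is only parsed as JSON because this commit message does not describe the exact layout of the histogramJsonMetric and nodeInfo fields.

```python
import json
import urllib.request


def metrics_url(fe_host: str, http_port: int) -> str:
    """Build the JSON metrics URL named in the commit message."""
    return f"http://{fe_host}:{http_port}/metrics?type=json"


def fetch_metrics(fe_host: str, http_port: int, timeout: float = 5.0):
    """Fetch and decode the metrics payload (requires a reachable FE)."""
    with urllib.request.urlopen(metrics_url(fe_host, http_port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, assuming an FE is listening on the given host and port:
# metrics = fetch_metrics("127.0.0.1", 8030)
```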
2023-11-08 14:46:18 +08:00
44b51bf0b9 [Feature](Variant) support variant load (#26572) 2023-11-08 00:37:57 -06:00
0f3e97f9c5 [regression-test][framework] support cases that can only run in non-concurrent-mode. (#26487) 2023-11-08 12:46:36 +08:00
9502cc758d [fix](regression) fix group commit regression test (#26557) 2023-11-08 11:57:07 +08:00
f8f3bc6a67 Revert "[Chore](ci)Temporarily cancel the mandatory restrictions of ShellCheck (#26553)" (#26565)
This reverts commit b7c81bc73625b26df746fc2213980c16b9d8f1a0.
2023-11-08 11:52:08 +08:00
a2419a8eb4 [enhancement](sink) refactor code of auto partition and where clause and enable them on sinkv2 (#26432)
For better performance and elasticity, we moved the memtable from loadchannel to
the sink and introduced VTabletSinkV2, so both VTabletWriter and
VTabletSinkV2 now distribute rows to tablets. Where clauses on mvs are
executed in VTabletWriter, and VTabletSinkV2 needs them too, so the common code
is moved to row distribution.

Layering the code by the rows' data flow makes it much easier to
understand and maintain.

ScanNode -> Sink/Writer (RowDistribution -> IndexChannel / DeltaWriter)
2023-11-08 11:51:40 +08:00
7bad2e1d9f [opt](nereids) infer result column name in ctas and query stmt (#26055)
Infer the column name when a result column is an expression without an explicit alias in a create or select stmt in Nereids.
The name inference strategy is the same as in #24990.
2023-11-07 21:28:48 -06:00
f4cbbe6429 [chore](workflow) Fix security issues with pull_request_target (#26525)
In the Code Checks workflow, we use the pull_request_target event, which has write permission, so that the actions can comment on our PRs. We should be careful with the write permission and must forbid running any user code. The previous PR #24761 tried its best to achieve this goal.
However, one scenario was not considered (see #26494): #26494 attacks the workflow through a git submodule. This PR fixes this scenario by explicitly checking out the external action in the workflow.
2023-11-08 11:23:13 +08:00
47ba4aaf30 [Enhancement](load) add timer and partitions number limit (#26549)
add timer and partitions number limit
2023-11-08 11:22:40 +08:00
c93c8f6105 [opt](nereids) make AGG_SCALAR_SUBQUERY_TO_WINDOW_FUNCTION rewrite rule #25969 2023-11-08 11:04:08 +08:00
290070074a [refactor](stats) refactor collection logic and opt some config (#26163)
1. not collect partition stats anymore
2. merge insert of stats
3. delete period collector since it is useless
4. remove enable_auto_sample
5. move some config related to stats to global session variable

Before this PR, when analyzing a table, the insert count equals the column count times 2.

After this PR, the insert count for analyzing a table is reduced to column count / insert_merge_item_count.

According to my test, when analyzing tpch lineitem, the insert SQL count is 1.
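
The arithmetic above can be sketched as follows. This is an illustration only: rounding up to a whole insert and the value 200 for insert_merge_item_count are assumptions, not values stated in this commit message. The 16 columns are those of the standard TPC-H lineitem table.

```python
import math


def inserts_before(column_count: int) -> int:
    """Insert count before the PR: column count times 2."""
    return column_count * 2


def inserts_after(column_count: int, insert_merge_item_count: int) -> int:
    """Insert count after the PR: column count / insert_merge_item_count.

    Rounding up to a whole statement is an assumption of this sketch.
    """
    return math.ceil(column_count / insert_merge_item_count)


LINEITEM_COLUMNS = 16          # standard TPC-H lineitem schema
ASSUMED_MERGE_ITEM_COUNT = 200  # hypothetical setting, for illustration

before = inserts_before(LINEITEM_COLUMNS)
after = inserts_after(LINEITEM_COLUMNS, ASSUMED_MERGE_ITEM_COUNT)
```

Under these assumptions the count drops from 32 inserts to 1, which matches the single insert reported for lineitem.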
2023-11-08 11:03:44 +08:00
1544110c1b [feature-wip](arrow-flight)(step4) Support other DML and DDL statements, besides Select (#25919)
Design Documentation Linked to #25514
2023-11-08 10:50:42 +08:00
806461721c [opt](Nereids) remove Nondeterministic trait from date related functions (#26444) 2023-11-07 20:43:37 -06:00
b7c81bc736 [Chore](ci)Temporarily cancel the mandatory restrictions of ShellCheck (#26553)
To let #26525 pass.
2023-11-08 10:42:22 +08:00
daea751a98 [Improvement](auditlog) add column catalog for audit log and audit log table (#26403) 2023-11-08 10:25:15 +08:00
Pxl
3cdbb6e637 [Bug](materialized-view) fix some bugs on create mv with percentile_approx (#26528)
1. percentile_approx has the wrong symbol
2. fnCall.getParams() gets obsolete children
2023-11-08 10:09:37 +08:00
519b48648e [fix](move-memtable) handle status when possible (#26526) 2023-11-08 10:09:06 +08:00
607a5d25f1 [feature](streamload) support HTTP request with chunked transfer (#26520) 2023-11-08 10:07:05 +08:00
a354f87d2e [refactor](pipeline) simplify runtime state ctor (#26461) 2023-11-08 09:57:09 +08:00
70bc8600a9 [fix](regression) fix regression framework bug: if real test result is negative, it will miss check test result (#25734) 2023-11-08 09:05:58 +08:00
a6756b4660 [pipelineX](bug) Fix broadcast buffer reference count (#26545) 2023-11-08 00:14:48 +08:00
4995ca8fba [fix](move-memtable) ensure segment is flushed before add segment (#26522) 2023-11-07 22:42:16 +08:00
32b36d3c9c [refactor](move-memtable) rename proto OpenStreamSink to OpenLoadStream (#26527) 2023-11-07 22:41:20 +08:00
3faf3b4118 [chore] Print FE version even if it has been started (#26427)
In the previous implementation, `bin/start_fe.sh --version` would
complain that "Frontend running as process xxx. Stop it first."

To show the version:
1. `bin/start_fe.sh --version` prints version info to fe.out
2. `bin/start_fe.sh --console --version` prints version info to stdout
2023-11-07 22:33:02 +08:00
5d80e7dc2f [Improvement](pipelineX) Improve local exchange on pipelineX engine (#26464) 2023-11-07 22:11:44 +08:00
ceccc451fa [enhancement](Nereids): add LOG info to show the phase of NereidsPlanner. (#26538)
Add LOG info to show the phase of NereidsPlanner, we can use these info to debug.
2023-11-07 21:46:54 +08:00
2be6c9ff7d [enhancement](Nereids): when the DPhyper failed, roll back to cascades without join reorder (#26390)
when the DPhyper failed, roll back to cascades without join reorder
2023-11-07 20:05:40 +08:00
5e9a23e643 [fix](prepare statement) Not supported such prepared statement if prepare a forward master sql (#26512) 2023-11-07 19:41:44 +08:00
2bb3ef1981 [refactor](scan) delete bloom_filter_predicate (#26499) 2023-11-07 19:37:31 +08:00
d6eb3324a1 [cleanup](load) remove unused code in sink v2 header (#26521) 2023-11-07 19:35:12 +08:00
ad1f635070 [Feature](auditloader) Plugin auditloader use auth token to avoid using cleartext passwords in config (#26278)
Doris FE checks whether a stream load HTTP request carries an auth token after the password check fails.
The audit-log loader plugin can use the auth token if the plugin config sets use_auth_token to true.

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2023-11-07 19:14:57 +08:00
38a14c3325 [docs](fix) add bitmap_remove in sidebars.json (#26523) 2023-11-07 19:01:27 +08:00
2feed57f47 [Fix](fs_benchmark_tools) Fix run_fs_benchmark.sh classpath issue. (#26183)
Fix run_fs_benchmark.sh classpath issue.
2023-11-07 18:43:30 +08:00
a404ff5ab9 [fix](regression) fix group commit regression test (#26519) 2023-11-07 18:17:45 +08:00
ef95e962c7 [fix](timev2) fix Type not implemented in fold by be (#26478) 2023-11-07 17:25:20 +08:00
b0788652bd [bugfix](clickhouse) fix datetime convert error. (#26128) 2023-11-07 17:16:07 +08:00
3ad8e27b09 [Fix](autoinc) Init auto increment info in VOlapTableSinkV2 (#26502) 2023-11-07 16:51:38 +08:00
f0bf3fadad [test](executor)Add workload group regression test (#26446) 2023-11-07 16:37:54 +08:00
efd1aa3016 [Revert](code-style) revert FE code-format #25033 and #26488 (#26505) 2023-11-07 16:37:24 +08:00
8da1a9a370 [pipeline](fix) remove unreasonable CHECK (#26504) 2023-11-07 15:48:07 +08:00
277329c035 [fix](auditlog) fix without lock in QueryStatisticsRecvr find #26440 2023-11-07 13:53:22 +08:00
9687932d57 [refactor](function) improve compoundPred optimization work with children is nullable (#26160)
Previously this optimization was limited: its children had to be non-nullable.
2023-11-07 13:52:10 +08:00
f138aaa07a [fix](nereids) unnest in-subquery with agg node in proper condition (#25800)
Consider a SQL statement with an in-subquery:

SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
    (SELECT k1
     FROM
        (SELECT k1,
                sum(k3) AS bbb,
                count(k2) AS aaa
         FROM sub_query_correlated_subquery7
         WHERE k1 > 0
               AND k3 > 0
         GROUP BY k1) y
     WHERE y.aaa > 0
           AND k1 > 1);

The subquery part containing the aggregate (the derived table y) is uncorrelated, so it can be unnested.

On the other hand:

SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
    (SELECT k1
     FROM
        (SELECT k1,
                sum(k3) AS bbb,
                count(k2) AS aaa
         FROM sub_query_correlated_subquery7
         WHERE k1 > 0
               AND k3 > 0
               AND sub_query_correlated_subquery6.k1 > 2
         GROUP BY k1) y
     WHERE y.aaa > 0
           AND k1 > 1);

Here the subquery part containing the aggregate is correlated (it references sub_query_correlated_subquery6.k1), so it can't be unnested.
2023-11-06 20:35:13 -06:00
16644eff7f [opt](load) optimize the performance of row distribution (#25546)
For non-pipeline non-sinkv2:
before: 14s
now: 6s-
For pipeline + sinkv2:
before: 230ms *48 instances
now: 38ms *48 instances
2023-11-07 10:04:59 +08:00
fa7a38b587 [fix](runtime filter) append late arrival runtime filters in vfilecanner (#25996)
`VFileScanner` will try to append late arrival runtime filters in each loop of `ScannerScheduler::_scanner_scan`.  However, `VFileScanner::_get_next_reader` only generates the `_push_down_conjuncts` in the first loop, so the late arrival runtime filters are ignored.
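
The bug pattern described here can be illustrated with a toy model. This is a hedged sketch, not the Doris implementation: the class, its method names, and the `rebuild_each_loop` flag are hypothetical stand-ins for the scanner loop, and the real fix may be structured differently.

```python
class ToyFileScanner:
    """Toy scanner that caches push-down conjuncts (hypothetical model)."""

    def __init__(self, rebuild_each_loop: bool):
        # False models the behavior described above (built once, first loop);
        # True models one possible fix (rebuilt on every loop).
        self.rebuild_each_loop = rebuild_each_loop
        self.runtime_filters = []
        self.push_down_conjuncts = None

    def append_late_runtime_filter(self, runtime_filter: str) -> None:
        """Called on each scan loop when a filter arrives late."""
        self.runtime_filters.append(runtime_filter)

    def get_next_reader(self) -> list:
        """Return the conjuncts the next reader would be created with."""
        if self.push_down_conjuncts is None or self.rebuild_each_loop:
            self.push_down_conjuncts = list(self.runtime_filters)
        return list(self.push_down_conjuncts)


def run(scanner: ToyFileScanner) -> list:
    scanner.append_late_runtime_filter("rf_a")
    scanner.get_next_reader()                   # first loop
    scanner.append_late_runtime_filter("rf_b")  # arrives late
    return scanner.get_next_reader()            # second loop


stale = run(ToyFileScanner(rebuild_each_loop=False))
fresh = run(ToyFileScanner(rebuild_each_loop=True))
```

In the stale variant the second reader never sees `rf_b`, which mirrors how the late-arrival runtime filters were ignored.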
2023-11-07 09:50:35 +08:00