15962 Commits

Author SHA1 Message Date
f34b46a366 [fix](glue) support amazonaws.com.cn endpoint (#29128) 2023-12-29 13:50:30 +08:00
9fc613de9c [fix](nereids) Fix query rewrite by mv fail when self join (#29227)
Fix the failure of query rewrite by materialized view on self joins. After this fix, queries like the following can be rewritten:

def materialized_view = """
    select 
    a.o_orderkey,
    count(distinct a.o_orderstatus) num1,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate = '2023-12-08' AND b.o_orderdate = '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num2,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate >= '2023-12-01' AND a.o_orderdate <= '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num3,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority in (1,2) AND a.o_orderdate >= '2023-12-08' AND b.o_orderdate <= '2023-12-09' THEN a.o_shippriority-b.o_custkey ELSE 0 END) num4,
    AVG(a.o_totalprice) num5,
    MAX(b.o_totalprice) num6,
    MIN(a.o_totalprice) num7
    from
    orders a
    left outer join orders b
    on a.o_orderkey = b.o_orderkey
    and a.o_custkey = b.o_custkey
    group by a.o_orderkey;
"""

def query = """
    select 
    a.o_orderkey,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate = '2023-12-08' AND b.o_orderdate = '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num2,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority = 1 AND a.o_orderdate >= '2023-12-01' AND a.o_orderdate <= '2023-12-09' THEN a.o_shippriority+b.o_custkey ELSE 0 END) num3,
    SUM(CASE WHEN a.o_orderstatus = 'o' AND a.o_shippriority in (1,2) AND a.o_orderdate >= '2023-12-08' AND b.o_orderdate <= '2023-12-09' THEN a.o_shippriority-b.o_custkey ELSE 0 END) num4,
    AVG(a.o_totalprice) num5,
    MAX(b.o_totalprice) num6,
    MIN(a.o_totalprice) num7
    from
    orders a
    left outer join orders b
    on a.o_orderkey = b.o_orderkey
    and a.o_custkey = b.o_custkey
    group by a.o_orderkey;
"""
2023-12-29 13:45:33 +08:00
2794427e7f [enhancement](Nereids): refactor eliminating inner join by foreign key (#28816) 2023-12-29 13:41:54 +08:00
b9572f9de0 [pipelineX](fix) Fix pip scanner context bug (#29229) 2023-12-29 13:24:39 +08:00
70f5a26f44 [pipelineX](fix) Fix heap-use-after-free for AggSource dependency (#29272) 2023-12-29 12:50:41 +08:00
a57ad36c4e [fix](docs) Update Link for 404 page of dev docs (#29213) 2023-12-29 11:37:50 +08:00
3cddb597c6 [fix](regression) fix pipeline load regression test error (#29216)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-12-29 11:06:20 +08:00
48d41a8c8b [feature](Nereids): support comparing mv with inferred predicate (#29132) 2023-12-29 10:38:53 +08:00
253846a99d [improve](compaction) enable compaction priority scheduling (#29261) 2023-12-29 10:13:08 +08:00
a525d5c5a3 [refactor](decimal) change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion (#29265)
change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion
2023-12-29 10:11:44 +08:00
d2dc12bed5 [fix](nereids)exists subquery should handle top level scalar agg correctly (#29135) 2023-12-29 09:45:20 +08:00
61677d1d4b [ci](perf) 1. add perf check of tpcds, 2. adjust clickbench and tpch check (#28431) 2023-12-29 09:26:15 +08:00
269c1b189d [improve](vtablet_writer) check runtime state is cancel when back pressure (#29260) 2023-12-29 09:11:24 +08:00
2f29dda5aa [Fix](core) Fix file system scan deleted file (#29266) 2023-12-29 09:07:59 +08:00
c3679a2750 [opt](Nereids) derive physical properties of Project and Filter (#29171) 2023-12-29 07:08:12 +08:00
7d44c5a1f1 [FIX](map)fix element_at in old planner make fe exception and regress cases from ck #29241 2023-12-29 01:00:47 +08:00
ce13a1d951 [fix](nereids) make runtime filter order stable #29203 2023-12-29 01:00:27 +08:00
69c90b1640 [fix](group commit)fix group commit regression test (#29079) 2023-12-29 00:50:22 +08:00
825732a395 [fix](regression) fix unstable test case (#29264) 2023-12-29 00:45:34 +08:00
fb0ed8c253 [fix](move-memtable) check missing tablets before commit (#29223) 2023-12-29 00:33:58 +08:00
efea006f3a [ut](move-memtable) add CLOSE_LOAD before EOS ut case (#29253)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-12-29 00:33:34 +08:00
99a1e066b5 [fix](group_commit) group_commit is not support on table with property light_schema_change=false (#29244) 2023-12-29 00:26:38 +08:00
9be0f04506 (improv)[group commit] refactor some group commit code (#29180) 2023-12-29 00:26:10 +08:00
9a277a6f11 [fix](move-memtable) don't abort in replica write layer unless all replica fails (#29257) 2023-12-29 00:03:28 +08:00
feebe3e6fb [FIX](literal) fix expression literal error #29157 2023-12-28 23:08:01 +08:00
a90304c208 [fix](parquet) complex type in parquet is case sensitive (#29245)
Make name matching for complex types in Parquet case-insensitive. Otherwise, complex-type columns with uppercase names return null.
2023-12-28 22:43:11 +08:00
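The case-insensitive matching described in commit a90304c208 can be illustrated with a minimal sketch. This is not the Doris Parquet reader; the helper names `build_case_insensitive_index` and `resolve_field` and the sample field names are assumptions made up for illustration.

```python
def build_case_insensitive_index(file_field_names):
    """Map lower-cased field names to the names as stored in the file."""
    index = {}
    for name in file_field_names:
        # Keep the first occurrence if two names differ only by case.
        index.setdefault(name.lower(), name)
    return index


def resolve_field(index, requested_name):
    """Return the file's field name for a requested column, or None if absent."""
    return index.get(requested_name.lower())


# Hypothetical field names as written in a file.
file_fields = ["ID", "Address", "Tags"]
index = build_case_insensitive_index(file_fields)

exact = "address" in file_fields            # case-sensitive lookup misses
resolved = resolve_field(index, "address")  # case-insensitive lookup finds "Address"
```

With a case-sensitive lookup the column is treated as missing, which is the situation the commit message says produced null results.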
8a491e7b1d Fix workload scheduler start too early may cause npe (#29258) 2023-12-28 22:41:42 +08:00
e64c5687f2 [fix](index compaction)support compact multi segments in one index (#28889) 2023-12-28 21:33:21 +08:00
ffd178f5ff [feat](pipelinex) support parallel scan on pipeline x engine (#29070)
* [feat](pipelinex) support parallel scan on pipeline x engine

* make parallel scan be independent of shared scan
2023-12-28 21:29:07 +08:00
0912b137e6 [Improvement](pipelineX) optimize local exchange sink (#29250) 2023-12-28 21:22:29 +08:00
b093097bc3 [improvement](statistic)Improve auto analyze visibility. (#29046)
SHOW AUTO ANALYZE now shows running jobs, not only finished or failed ones.
SHOW ANALYZE TASK STATUS now shows auto analyze tasks as well.
Remove some unused code.
Auto analyze now processes catalogs, databases, and tables in ascending order of id.
2023-12-28 21:21:17 +08:00
5129ab5738 [fix](decimalv2) fix decimalv2 agg errors (#29246) 2023-12-28 21:17:16 +08:00
c8a0d3e03c [fix](invert index) fix error handling for match_regexp resulting in an empty match. (#29233) 2023-12-28 19:58:41 +08:00
a14daca7ba [feature](inverted index)write separated index files in RAM directory to reduce IO(#28810)
Normally we write the separate index files to disk before merging them into a compound idx file.
In high-frequency load scenarios, disk IO can become a bottleneck.
To reduce the pressure on the disk, we first write the separate index files to a RAM directory, and write them to disk only when merging them into the compound file.

Add config `index_inverted_index_by_ram_dir_enable`, default is `false`.
2023-12-28 17:18:59 +08:00
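The idea in commit a14daca7ba (buffer the separate index files in memory, touch the disk only at merge time) can be sketched as follows. This is an illustrative model, not the Doris or CLucene implementation; `RamDirectory`, `merge_to_compound_file`, the file names, and the flat offset layout are all assumptions.

```python
import io
import os
import tempfile


class RamDirectory:
    """Hypothetical in-memory stand-in for a directory of small index files."""

    def __init__(self):
        self._files = {}

    def write_file(self, name, data):
        # The bytes stay in memory; nothing is written to disk here.
        buf = io.BytesIO()
        buf.write(data)
        self._files[name] = buf

    def items(self):
        for name in sorted(self._files):
            yield name, self._files[name].getvalue()


def merge_to_compound_file(ram_dir, path):
    """Write every buffered file into one on-disk file.

    Returns a mapping of file name to (offset, length) in the compound file.
    """
    entries = {}
    with open(path, "wb") as out:
        for name, data in ram_dir.items():
            entries[name] = (out.tell(), len(data))
            out.write(data)
    return entries


ram_dir = RamDirectory()
ram_dir.write_file("terms", b"alpha,beta")
ram_dir.write_file("postings", b"\x01\x02\x03")

# The only disk write happens here, once, for the merged file.
compound_path = os.path.join(tempfile.mkdtemp(), "segment.idx")
entries = merge_to_compound_file(ram_dir, compound_path)
```

The trade-off in this sketch is memory for IO: each separate file is held in RAM until the merge, so the disk sees one sequential write instead of several small ones followed by a rewrite.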
e610044bae [Enhancement] (schema) add column type check (#28718) 2023-12-28 17:11:24 +08:00
6323c17ad5 [fix](test) fix wrong DDL in test pipeline load #29211 2023-12-28 16:51:48 +08:00
b31494b18c [test](regression) add fault injection cases for LoadStream (#29101)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-12-28 16:16:26 +08:00
03a6a2880a [fix](journal) Fix infinite block due to initial BDB journal failed (#29205)
Opening a BDBJournal will acquire the max journal id, but it doesn't
need to check whether the replica txn is matched with the master.
2023-12-28 15:57:51 +08:00
8becf053cb [fix](multi-catalog)unsupported hive input format should throw an exception and remove useless method (#29087)
Introduced by #28644.
2023-12-28 15:43:28 +08:00
ba7b7c1f60 [Chore](Job)It is forbidden to change the status of internal JOB through PAUSE/RESUME (#29036) 2023-12-28 15:40:16 +08:00
5171a77f9e [fix](Nereids): merge Offset in Limit Translator (#29100) 2023-12-28 15:32:45 +08:00
14c902b504 [fix](regression test) fix test_alter_colocate_table (#29009) 2023-12-28 15:09:21 +08:00
31b3be456c add workload scheduler in be (#29116) 2023-12-28 15:04:22 +08:00
Pxl
118775f913 [Bug](schame-change) fix wrong result after reorder mor table (#29045)
* fix wrong result after reorder mor table

* update
2023-12-28 14:57:31 +08:00
Pxl
c98489fc09 [Feature](materialized-view) support visitBitmapUnion mv rewrite (#29200)
* support visitBitmapUnion rewrite

* add case
2023-12-28 14:56:33 +08:00
29a7c0d677 [pipelineX](scan) ignore storage data distribution by default (#29192) 2023-12-28 14:54:09 +08:00
fe93a8f1d0 [cleanup](move-memtable) remove unused log in load stream stub (#29084) 2023-12-28 14:39:10 +08:00
2e910dac2a [enhencement](segcompaction) cancel inflight segcompaction tasks faster when load finish (#28901)
[Goal]
When building the rowset writer, avoid waiting for inflight segcompaction
to eliminate long-tail latency for load.

[Current situation]
1. The segcompaction of a rowset is executed serially. During the build phase,
we need to wait for the completion of the inflight segcompaction task.

2. If the rowset writer finishes writing and starts building meta, then segments
that have not been compacted will not be submitted to segcompaction worker.
We simply ignore them to accelerate the build process.

3. But this is not enough. If a segcompaction task has already been submitted to
the worker thread pool, we set a cancelled flag for the worker, so that the task
does no work when it is executed and completes as soon as possible.

4. But this is still not enough. Although the latency of the segcompaction task
has been shortened by the aforementioned method, tasks may still be queuing in the
thread pool.

[Solution]
We could enlarge the worker thread pool to avoid queuing congestion, but this is
not the best solution.
Segcompaction should be best-effort work and should not use too much CPU or
memory. So we adopted the strategy of decoupling build from segcompaction,
specifically:

1. For a segcompaction task that is currently performing compaction, we should
not interrupt it; otherwise it may cause file corruption.

2. For tasks still queued, we no longer care about their results (because
these tasks will know they are cancelled and will not perform any actual operations),
so we just ignore them and continue with the subsequent rowset build process.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-12-28 14:32:29 +08:00
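The strategy described in commit 2e910dac2a (flag queued tasks as cancelled and move on, never interrupt a task that is already compacting) can be sketched roughly as below. This is a simplified model and not the Doris C++ code; `SegcompactionTask`, `build_rowset`, and the task names are hypothetical, and the sketch assumes the build step still waits for tasks that have already started.

```python
import threading
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"
    SKIPPED = "skipped"


class SegcompactionTask:
    def __init__(self, name):
        self.name = name
        self.state = State.QUEUED
        self.cancelled = False
        self._lock = threading.Lock()
        self.finished = threading.Event()

    def cancel_if_queued(self):
        """Flag a task that has not started; return True if it was flagged."""
        with self._lock:
            if self.state is State.QUEUED:
                self.cancelled = True
                return True
            return False

    def run(self):
        """Called by a worker: do nothing if cancelled, else run to completion."""
        with self._lock:
            if self.cancelled:
                self.state = State.SKIPPED
                self.finished.set()
                return
            self.state = State.RUNNING
        # The actual compaction would happen here, without interruption.
        with self._lock:
            self.state = State.DONE
        self.finished.set()


def build_rowset(tasks):
    """Cancel queued tasks, then wait only for tasks that already started."""
    must_wait = [t for t in tasks if not t.cancel_if_queued()]
    for t in must_wait:
        t.finished.wait()
    return [t.name for t in must_wait]


started = SegcompactionTask("seg-0-2")
queued = SegcompactionTask("seg-3-5")

started.run()                            # a worker already picked this one up
waited = build_rowset([started, queued]) # build does not wait for the queued task
queued.run()                             # the worker later sees the flag and skips
```

The demo runs the tasks synchronously to stay deterministic; in a real thread pool `run` would be invoked by worker threads while `build_rowset` runs on the load path.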
4f2d54d462 [fix](DatabaseTransactionMgr) Fix clean label bug which may cause inconsistent editlog operation (#29198) 2023-12-28 14:17:35 +08:00
f816d13c56 [feature](Nereids): eliminate groupby (#28615) 2023-12-28 14:00:41 +08:00