Commit Graph

14667 Commits

Author SHA1 Message Date
db317841a0 [hotfix](editlog) Fix upsert replay on follower not contains loadedTableIndexIds (#26597)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-11-09 12:21:43 +08:00
124a8a9b34 [enhancement](regression) add profile before datev2 insert for debug (#26617)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-11-09 12:21:15 +08:00
33e46ee13d [enhancement](config) enable single_replica_load by default in BE (#26619)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-11-09 12:14:37 +08:00
5d52162484 [Test](statistics) Add test cases for external table statistics (#26511)
1. Test disabling and enabling auto collection for external catalogs.
2. Test for analyze table table_name (column) and whole table.
2023-11-09 12:12:29 +08:00
57ed781bb6 [fix](regression-test) Add tvf regression tests (#26455) 2023-11-09 12:09:32 +08:00
d1438a8563 [Fix](orc-reader) Fix orc complex types when late materialization was turned on by disabling late materialization in this case. (#26548)
Fix orc complex types when late materialization was turned on in orc reader by disabling late materialization in this case.
2023-11-09 12:05:43 +08:00
f6b7046a6e [fix](regression-test) add tests for jdbc catalog (#26608) 2023-11-09 11:59:35 +08:00
01094fd25e [Coverage](BE) Delete vinfo_func in BE (#26562)
Delete vinfo_func in BE
2023-11-09 11:00:15 +08:00
95f74f1544 [FIX](complextype)fix shrink in topN for complex type #26609 2023-11-09 10:56:14 +08:00
74e452f19c [bug](bitmap) fix bitmap value copy operator not call reset (#26451)
When an empty bitmap is assigned to another bitmap,
the target bitmap should first reset itself and then set the empty type.
2023-11-09 10:05:09 +08:00
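The bug class addressed by 74e452f19c can be sketched as below. This is a hypothetical Python model, not the BE's C++ bitmap code: the class `TinyBitmap`, its fields, and the `assign` method are assumptions made up for illustration. It only shows why the target should be reset before it takes the empty type.

```python
class TinyBitmap:
    """Hypothetical stand-in for a bitmap that tracks its own type tag."""

    EMPTY, SINGLE, SET = "EMPTY", "SINGLE", "SET"

    def __init__(self, values=()):
        self.kind = self.EMPTY
        self.single = None
        self.members = set()
        for value in values:
            self.add(value)

    def add(self, value):
        if self.kind == self.EMPTY:
            self.kind, self.single = self.SINGLE, value
        elif self.kind == self.SINGLE:
            if value != self.single:
                self.members = {self.single, value}
                self.single = None
                self.kind = self.SET
        else:
            self.members.add(value)

    def reset(self):
        # Drop every piece of state left over from the previous value.
        self.kind = self.EMPTY
        self.single = None
        self.members = set()

    def assign(self, other):
        # Reset first, then copy, so no stale members survive the assignment.
        self.reset()
        self.kind = other.kind
        self.single = other.single
        self.members = set(other.members)

    def cardinality(self):
        if self.kind == self.EMPTY:
            return 0
        if self.kind == self.SINGLE:
            return 1
        return len(self.members)
```

Assigning an empty `TinyBitmap()` to one that previously held several values leaves the target with no members and the `EMPTY` type.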
7df60a4980 [Refactor](Tvf) delete some unused code of tvf and add doc for queries tvf (#26460)
1. delete some unused code of tvf
2. add doc for `queries` tvf: #25051
2023-11-09 09:06:09 +08:00
66e591f7f2 [enhancement](brpc) add an auto release closure to ensure closure safety (#26567) 2023-11-09 08:50:42 +08:00
55b2988bfd [Opt](date_add/sub) Throw exception when result of date_add/sub out of range (#26475) 2023-11-09 08:46:51 +08:00
faaf0ecc85 [regression-test](framework) Support running tests multiple times and reporting correctly to TeamCity (#26606) 2023-11-08 09:42:53 -06:00
9c828ff79c [cases](regression-test) Add backup & restore test case of dup table (#26490)
Co-authored-by: Bears0haunt <bearshaunt0@gamil.com>
2023-11-08 22:30:01 +08:00
18b3d0ec6b [cases](regression-test) add unique and duplicate backup and restore … (#26491)
* [cases](regression-test) add unique and duplicate backup and restore table models

* Add delete and mor scenes
2023-11-08 22:29:34 +08:00
b3ae7f04f9 [fix](backup) Add repo id to local meta/info files to avoid overwriting (#26536)
The local meta/info files generated during backup are not distinguished
by repo names. If two backup jobs with the same name are submitted to
different repos at the same time, meta/info may be overwritten by another
backup job.
2023-11-08 22:28:49 +08:00
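A minimal sketch of the idea in b3ae7f04f9, assuming a made-up file naming scheme: the function name `local_job_file`, its parameters, and the `kind__repo__job` layout are hypothetical and are not taken from the FE code. The point is only that including the repo id keeps two same-named jobs in different repos from colliding on the same local file.

```python
from pathlib import Path


def local_job_file(base_dir, repo_id, job_name, kind):
    """Build a local path for a backup job's meta or info file.

    The layout is hypothetical; it only demonstrates that the repo id
    is part of the file name.
    """
    if kind not in ("meta", "info"):
        raise ValueError(f"unknown file kind: {kind}")
    return Path(base_dir) / f"{kind}__{repo_id}__{job_name}"
```

With this scheme, `local_job_file("/tmp/backup", 10001, "snapshot_a", "meta")` and the same call with repo id `10002` produce different paths, so neither job overwrites the other's file.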
ee6e6911da [regression-test](stream load) Invalid merge type check (#26599) 2023-11-08 22:28:30 +08:00
8e332fa979 (selectdb-cloud) Reduce FE db lock range for ShowDataStmt (#26588)
Reduce read lock critical sections and avoid execution timeouts
2023-11-08 22:24:17 +08:00
b7a2c2e9c4 [chore](regression) Do stale resource reclaim before executing cold heat separation p2 case(#26596) 2023-11-08 22:14:54 +08:00
e92d2fcb5a [improvement](group commit) Group commit insert into can be executed on observer fe (#26589) 2023-11-08 22:10:06 +08:00
a6f9df7096 [LOG] Add fatal log in exchange sink buffer (#26594) 2023-11-08 21:52:21 +08:00
5bcf6bfd46 [fix](jdbc catalog) fix mysql zero date (#26569) 2023-11-08 21:41:56 +08:00
1fc360df19 [ci](p0) support run p0 10 times (#26603)
* [ci](p0) support run multiple times

Co-authored-by: stephen <hello-stephen@qq.com>
2023-11-08 21:25:55 +08:00
06343e6d68 [opt](nereids)replace scan by empty relation when all partitions are pruned (#26514)
* replace scan by empty relation when all partitions are pruned
2023-11-08 20:54:35 +08:00
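The rewrite described in 06343e6d68 can be illustrated with a toy plan model. This is a sketch under assumptions: `Scan`, `EmptyRelation`, and `prune_and_rewrite` are hypothetical names, not Nereids classes or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    table: str
    partitions: tuple


@dataclass(frozen=True)
class EmptyRelation:
    output_columns: tuple


def prune_and_rewrite(scan, keep_partition, output_columns):
    """Prune partitions; if none survive, replace the scan entirely."""
    remaining = tuple(p for p in scan.partitions if keep_partition(p))
    if not remaining:
        # Nothing can be read, so the plan needs no scan node at all.
        return EmptyRelation(tuple(output_columns))
    return Scan(scan.table, remaining)
```

When the predicate rejects every partition, the function returns an `EmptyRelation` that still carries the output columns, so operators above it keep a valid schema.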
e718952e89 [fix](nereids)only enable colocate scan for one phase global partition topn in some condition (#26473) 2023-11-08 20:46:40 +08:00
0c1458f21f [fix](planner)isnull predicate can't be safely constant folded in inlineview (#25377)
Disable the IS NULL predicate constant-folding rule for inline views.
Consider this SQL:
select c.*
from (
select a.*, b.x
from test_insert a left join
(select 'some_const_str' x from test_insert) b on true
) c
where c.x is null;

When "c.x is null" is pushed into c and the constant-folding rule is applied, the query returns an empty result, because x is 'some_const_str' and "x is null" is evaluated to false. This is wrong.
2023-11-08 20:46:29 +08:00
d749d99fe2 [fix](nereids)don't normalize column name for base index (#26476) 2023-11-08 20:45:58 +08:00
d0960bac56 [Fix](partial update) Fix partial update info loss when the delete bitmaps of the committed transactions are calculated by the compaction (#26556)
a fix for #25147
2023-11-08 19:56:31 +08:00
223be6947c [opt](Nereids) let DataType toSql same with legacy planner (#26576) 2023-11-08 05:34:32 -06:00
ec87401581 Fix workload group regression test failed (#26579) 2023-11-08 19:23:49 +08:00
3bce6d3828 [Opt](orc-reader) Optimize orc string dict filter in not_single_conjunct case. (#26386)
Optimize the orc/parquet string dict filter in the not_single_conjunct case. The block is filtered first by dict code and then by the not_single_conjunct. Because dict codes are integers, this filters faster than comparing strings.

For example:
```
select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
```
`l_receiptdate` and `l_shipmode` use string dict filtering, and `l_commitdate < l_receiptdate` is a not_single_conjunct that contains a dict filter field. The block is filtered first by dict code and then by the not_single_conjunct. Because dict codes are integers, this filters faster than comparing strings.

### Test Result:
Before:
 mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
|             49314694 |
+----------------------+
1 row in set (6.87 sec)

After:
mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
|             49314694 |
+----------------------+
1 row in set (4.85 sec)
2023-11-08 18:03:18 +08:00
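The filtering order described in 3bce6d3828 can be sketched in a few lines. This is an illustrative Python model, not the orc reader's C++ code: `build_dict` and `count_matching` are hypothetical helpers, and dates are assumed to be ISO-formatted strings so that string comparison matches date order.

```python
def build_dict(values):
    """Dictionary-encode a string column: return (dictionary, int codes)."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def count_matching(shipmode, commitdate, receiptdate, modes, lo, hi):
    mode_dict, mode_codes = build_dict(shipmode)
    date_dict, date_codes = build_dict(receiptdate)

    # Evaluate each single-column predicate once per dictionary entry.
    ok_modes = {c for c, v in enumerate(mode_dict) if v in modes}
    ok_dates = {c for c, v in enumerate(date_dict) if lo <= v < hi}

    # Step 1: filter rows by comparing integer dict codes only.
    survivors = [
        i for i in range(len(shipmode))
        if mode_codes[i] in ok_modes and date_codes[i] in ok_dates
    ]

    # Step 2: apply the multi-column conjunct only to surviving rows.
    return sum(1 for i in survivors if commitdate[i] < receiptdate[i])
```

The multi-column comparison in step 2 touches only the rows that passed the integer-code filter, which is where the saving comes from in this model.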
45c2fa62a4 [pipeline](exec) disable shared scan in default and disable shared scan in limit with where scan (#25952) 2023-11-08 17:51:12 +08:00
a6d2013802 [opt](nereids) use 2 phase agg above union all (#26245)
forbid one phase agg for pattern: agg-unionAll
one phase agg plan: agg-union-hashDistribute-children
two phase agg plan: agg(global) - hashDistribute-agg(local)-union-randomDistribute
The key point is that the cost of randomDistribute is much lower than that of hashDistribute, so the two-phase agg wins.
2023-11-08 17:15:53 +08:00
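The two-phase plan shape from a6d2013802 can be modelled roughly as follows. This is a simplified sketch, not planner code: the function names are hypothetical, random distribution is approximated by round-robin, and the aggregate is assumed to be a per-key sum.

```python
from collections import Counter


def local_agg(rows):
    """Phase 1: each instance pre-aggregates the rows it received."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial


def global_agg(partials, num_buckets):
    """Phase 2: hash-distribute partial results by key, then merge."""
    buckets = [Counter() for _ in range(num_buckets)]
    for partial in partials:
        for key, value in partial.items():
            buckets[hash(key) % num_buckets][key] += value
    result = {}
    for bucket in buckets:
        result.update(bucket)
    return result


def two_phase_sum(union_children, num_instances, num_buckets):
    # Round-robin stands in for the random distribution above the union.
    instances = [[] for _ in range(num_instances)]
    i = 0
    for child in union_children:
        for row in child:
            instances[i % num_instances].append(row)
            i += 1
    return global_agg([local_agg(rows) for rows in instances], num_buckets)
```

Only the already-reduced partial results cross the hash distribution in this model, while the raw union output goes through the cheaper random distribution.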
96d2e3394a [opt](meta) Improve the performance of getting expr name (#26341)
CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, name)
This conversion is time-consuming when called many times, so it is now called lazily, only when necessary.
2023-11-08 03:14:15 -06:00
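The lazy approach in 96d2e3394a can be illustrated with a Python analogue. The original uses Guava's `CaseFormat` in Java; the regex conversion below only approximates it for simple names, and `ExprNode` and the use of caching are assumptions for illustration.

```python
import re
from functools import cached_property


def upper_camel_to_lower_underscore(name):
    """Approximate UPPER_CAMEL -> LOWER_UNDERSCORE for simple names."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class ExprNode:
    """Hypothetical expression node that derives its name on demand."""

    def __init__(self, class_name):
        # No conversion happens at construction time.
        self.class_name = class_name

    @cached_property
    def expr_name(self):
        # Computed on first access, then stored on the instance.
        return upper_camel_to_lower_underscore(self.class_name)
```

Nodes whose name is never read pay nothing for the conversion, and nodes whose name is read repeatedly pay for it once.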
58bf79f79e [fix](move-memtable) pass load stream num to backends (#26198) 2023-11-08 16:16:33 +08:00
6637f9c15f Add enable_cgroup_cpu_soft_limit (#26510) 2023-11-08 15:52:13 +08:00
f018b00646 [ci](perf) add new pipeline of tpch-sf100 (#26334)
* [ci](perf) add new pipeline of tpch-sf100
Co-authored-by: stephen <hello-stephen@qq.com>
2023-11-08 15:32:02 +08:00
a3666aa87e [feature](decimal) support decimal256 when creating table (#26308) 2023-11-08 15:21:01 +08:00
be7d49cb9f [Fix](doc) Fixed some errors in the documentation (#26410)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-08 15:19:34 +08:00
f80495da83 [fix](Nereids) ban right outer, right anti, full outer with bucket shuffle (#26529)
If a left bucket has no data, we do not generate an instance for that bucket.
These joins must preserve all right-side data, but because the left instance
does not exist, the right data is discarded since no destination is set.

We ban these joins temporarily until we can generate all instances
for the left side in Coordinator.
2023-11-08 01:16:50 -06:00
5d4557938a [regression-test](fix) fix export_struct bug (#26561) 2023-11-08 14:57:07 +08:00
fc304c0e7c (metric) add histogramJsonMetric and nodeInfo (#26172)
Add histogramJsonMetric and nodeInfo to the interface "http://fe_host:http_port/metrics?type=json".
2023-11-08 14:46:18 +08:00
44b51bf0b9 [Feature](Variant) support variant load (#26572) 2023-11-08 00:37:57 -06:00
0f3e97f9c5 [regression-test][framework] support cases that can only run in non-concurrent-mode. (#26487) 2023-11-08 12:46:36 +08:00
9502cc758d [fix](regression) fix group commit regression test (#26557) 2023-11-08 11:57:07 +08:00
f8f3bc6a67 Revert "[Chore](ci)Temporarily cancel the mandatory restrictions of ShellCheck (#26553)" (#26565)
This reverts commit b7c81bc73625b26df746fc2213980c16b9d8f1a0.
2023-11-08 11:52:08 +08:00
a2419a8eb4 [enhancement](sink) refactor code of auto partition and where clause and enable them on sinkv2 (#26432)
For better performance and elasticity, the memtable is moved from the load channel to
the sink, and VTabletSinkV2 is introduced. Both VTabletWriter and
VTabletSinkV2 now distribute rows to tablets. Where clauses on mvs are
executed in VTabletWriter, and VTabletSinkV2 needs them too, so the common code
is moved to row distribution.

Layering the code by the rows' data flow makes it much easier to
understand and maintain:

ScanNode -> Sink/Writer (RowDistribution -> IndexChannel / DeltaWriter)
2023-11-08 11:51:40 +08:00
7bad2e1d9f [opt](nereids) infer result column name in ctas and query stmt (#26055)
Infer the column name when it is an expression without an explicit alias in a CTAS or query statement in Nereids.
The infer name strategy is the same as #24990
2023-11-07 21:28:48 -06:00
f4cbbe6429 [chore](workflow) Fix security issues with pull_request_target (#26525)
In the Code Checks workflow, we use the pull_request_target event, which has write permission, so that the actions can comment on our PRs. We must be careful with that write permission and never run any user code. The previous PR #24761 tried its best to achieve this goal.
However, one scenario was not considered (see #26494): #26494 attacks the workflow through a git submodule. This PR closes that gap by checking out the external action explicitly in the workflow.
2023-11-08 11:23:13 +08:00