f52067415b
Improve analyze mv/mtmv wait row count report logic. ( #33695 )
2024-04-17 23:42:13 +08:00
0499d4013e
Support identical column name in different index. ( #32792 )
2024-04-10 11:34:29 +08:00
96b995504c
[enhancement](statistics) excluded delta rows num for rollup&mv tablets ( #32568 )
...
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Co-authored-by: tsy <tangsiyang2001@foxmail.com>
2024-04-10 11:34:28 +08:00
5e5fffe4e3
Set enable_unique_key_partial_update to false in statistics session variable. ( #33220 )
2024-04-08 16:49:58 +08:00
735f5213c1
Add log for case to help debug. ( #32675 )
2024-03-24 08:04:46 +08:00
60eeff8e18
[enhance](mtmv)refresh mtmv must add auto ( #32522 )
2024-03-22 08:52:16 +08:00
fe1b7a7b0a
Improve analyze stats case, avoid cluster delay caused failure. ( #32507 )
2024-03-22 08:52:16 +08:00
ea2fbfaffa
[feature](Nereids) support agg state type in create table ( #32171 )
...
This PR introduces a behavior change: the syntax of CREATE TABLE with the agg_state type has changed.
2024-03-15 18:04:49 +08:00
e9c1638507
Add waiting timeout while creating mv and row count report. ( #31944 )
2024-03-09 19:44:54 +08:00
de9b5f7b69
[improvement](statistics)Log one bdbje record for one load transaction. #31619 ( #31697 )
2024-03-02 23:04:26 +08:00
82faa7469b
Support analyze rollup. ( #31576 )
2024-03-01 04:25:43 +08:00
ddb37d7371
Fix analyze mv from follower case bug. ( #31523 )
2024-02-29 08:42:35 +08:00
3ca412efe3
Return UNKNOWN column stats if ndv is 0. ( #31439 )
2024-02-29 08:42:35 +08:00
57d604c48a
Fix statistics p0 case. ( #31330 )
2024-02-23 20:44:43 +08:00
04c295c4c2
Improve show column stats performance. ( #31298 )
2024-02-23 19:03:28 +08:00
18955174e9
Fix analyze mv and mtmv p0 case. ( #31191 )
2024-02-21 17:01:19 +08:00
4aaab6fb44
[fix](statistics)Refresh follower FE cache after alter column stats. Support alter index column stats ( #31108 )
...
1. Refresh the follower FE cache after altering column stats, so that followers also update their cached stats.
2. Support alter index column stats.
2024-02-20 16:23:53 +08:00
edfa5e750b
Remove slow auto analyze test case. ( #31037 )
2024-02-18 14:45:25 +08:00
e086d0d719
[test](statistics)Add analyze mtmv test case ( #30847 )
2024-02-16 10:12:23 +08:00
042934e545
Add auto analyze mv and show task case. ( #30894 )
2024-02-16 10:12:23 +08:00
e9f9fdf9af
Fix unstable analyze mv case. ( #30859 )
2024-02-05 22:00:36 +08:00
9e76592297
Support analyze materialized view. ( #30540 )
2024-02-04 22:21:16 +08:00
afab713048
[fix](Nereids) query mv column directly ( #30444 )
2024-01-29 19:03:47 +08:00
9174ac921b
Fix statistics p0. ( #30351 )
2024-01-25 21:33:50 +08:00
86d7a8be44
[improvement](statistics nereids)Nereids support select mv. ( #30267 )
2024-01-25 13:24:09 +08:00
b3e37b3efa
[unit test](statistics)Add unit test case for auto analyze. #29904
...
Add unit and p0 test case for auto analyze.
2024-01-16 18:31:27 +08:00
e4707154fa
[opt](statistics) create or update table stats after alter column stats.
...
Create or update table stats after altering column stats.
Set a flag to disable auto analyze for the table after a user injects column stats.
2024-01-12 11:44:21 +08:00
ddaa645a4f
[improvement](statistics) Force to use zonemap for collecting string type min max. ( #29631 )
...
Force the use of the zonemap for collecting string type min/max.
String types normally do not use the zonemap for min/max, because the zonemap value on the BE side is truncated at 512 bytes, which may make the value inaccurate. That is acceptable for statistics min/max, and it also avoids scanning the whole table while sampling.
2024-01-12 11:34:07 +08:00
612e0631ac
Do not collect min max for agg table value columns while doing sample analyze. ( #29483 )
2024-01-06 17:15:40 +08:00
2308881e9f
[improvement](statistics) Analyze partition columns when new partition loaded data for the first time. ( #29154 )
...
The first time data is loaded into a partition, we need to analyze the partition columns even when the health rate is high. Otherwise, the min/max values of the column may not include the new partition's values, which may cause a bad plan.
2023-12-29 14:36:48 +08:00
f4c5ce260b
[fix](statistics)Fix rowCount==0 while analyzing bug ( #28969 )
...
Sample analysis needs to get the row count by calling table.getRowCount(). This value is not updated in real time, which may cause the sample task to scan the whole table.
This PR fixes that: set a flag indicating that the analyze job is for an empty table and skip scanning the table. Meanwhile, don't reset updatedRows in this case.
Set hugeTableAutoAnalyzeIntervalInMillis = 0 because all default huge table sizes have been set to 0.
2023-12-27 23:04:37 +08:00
9d5b9cc452
[fix](statistics)Fix drop stats fail silently bug. ( #28635 )
...
Drop stats uses an IN predicate to filter the column stats to delete. The default length limit of an IN predicate is 1024, so dropping table stats for more than 1024 columns may fail.
This PR splits the delete SQL based on the IN predicate length.
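The splitting described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the table name `stats_tbl`, the column `col_id`, and the function names are all hypothetical, and only the limit of 1024 comes from the commit message.

```python
from typing import Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_delete_statements(table: str, columns: List[str],
                            max_in_items: int = 1024) -> List[str]:
    """Build one DELETE per batch so no IN list exceeds `max_in_items`.

    `table` and the `col_id` column are hypothetical names for illustration.
    """
    statements = []
    for batch in chunked(columns, max_in_items):
        # Double single quotes so a column name cannot break the literal.
        quoted = ", ".join("'" + c.replace("'", "''") + "'" for c in batch)
        statements.append(f"DELETE FROM {table} WHERE col_id IN ({quoted})")
    return statements
```

With 2500 column names and a limit of 1024, this produces three statements instead of one oversized statement.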
2023-12-20 15:41:25 +08:00
099b1b7106
[fix](statistics)Fix column stats trigger info bug ( #28303 )
...
Previously, we didn't update the jobType info in ColStatsMeta. This caused the jobType to always stay at the type
it was first set to. For example, if we manually analyzed a table, the jobType would always be MANUAL,
even if the table was auto analyzed again later.
2023-12-13 20:31:03 +08:00
cd3d31ba13
[fix](statistics)Escape load stats sql ( #28117 )
...
Escape the load stats SQL, because column names may contain special characters.
2023-12-11 20:25:18 +08:00
4cac07be30
[improvement](statistics)Analyze empty table. #28077
...
Analyze a table even when it's empty. The result should be like this:
mysql> show column stats nation;
+-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+-------------+---------------------+
| column_name | count | ndv | num_null | data_size | avg_size_byte | min | max | method | type | trigger | query_times | updated_time |
+-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+-------------+---------------------+
| n_comment | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 |
| n_nationkey | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 |
| n_regionkey | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 |
| n_name | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 |
+-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+-------------+---------------------+
2023-12-07 10:16:52 +08:00
7f1b558011
[fix](stats) truncate min/max if too long ( #27955 )
...
For some string values the max/min might be a very long string,
which might take too much FE memory,
so we truncate it to 1024 chars if it's too long.
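A minimal sketch of this kind of truncation, assuming the limit counts characters as the message states. The function name and the handling of missing values are assumptions for illustration, not the FE implementation.

```python
from typing import Optional

MAX_STAT_CHARS = 1024  # limit taken from the commit message


def truncate_stat_value(value: Optional[str],
                        max_chars: int = MAX_STAT_CHARS) -> Optional[str]:
    """Return `value` cut to at most `max_chars` characters.

    None (no min/max available) is passed through unchanged; that choice
    is an assumption of this sketch.
    """
    if value is None:
        return None
    return value if len(value) <= max_chars else value[:max_chars]
```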
2023-12-05 20:40:38 +08:00
02512cd0e2
[fix](stats)Drop stats or update updated rows after truncate table ( #27931 )
...
1. Also clear the followers' stats cache when dropping stats.
2. Drop stats when truncating a table.
2023-12-05 14:53:35 +08:00
fc2129a09f
[fix](stats) skip collect agg_state type ( #27640 )
2023-11-28 11:43:48 +08:00
732a3fa9c8
[fix](stats) fix auto collector always create sample job no matter the table size ( #26968 )
2023-11-22 02:42:40 -06:00
4e105e94a2
[fix](statistics) fix updated rows incorrect due to typo in code ( #26979 )
2023-11-15 05:25:46 -06:00
290070074a
[refactor](stats) refactor collection logic and opt some config ( #26163 )
...
1. No longer collect partition stats.
2. Merge the inserts of stats.
3. Delete the period collector since it is useless.
4. Remove enable_auto_sample.
5. Move some stats-related configs to global session variables.
Before this PR, analyzing a table issued column count times 2 inserts.
After this PR, the insert count for analyzing a table is reduced to column count / insert_merge_item_count.
According to my test, when analyzing TPC-H lineitem, the insert SQL count is 1.
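The arithmetic above can be checked with a small sketch. Rounding the division up is an assumption (a partial batch still needs one insert), and the merge value of 200 used in the usage note is illustrative, not taken from the PR.

```python
import math


def inserts_before(column_count: int) -> int:
    """Insert count before the refactor: two inserts per column."""
    return column_count * 2


def inserts_after(column_count: int, insert_merge_item_count: int) -> int:
    """Insert count after the refactor, rounding up for a partial batch.

    Rounding up is an assumption of this sketch.
    """
    if insert_merge_item_count <= 0:
        raise ValueError("insert_merge_item_count must be positive")
    return math.ceil(column_count / insert_merge_item_count)
```

TPC-H lineitem has 16 columns, so `inserts_before(16)` gives 32, while `inserts_after(16, 200)` gives 1, which matches the single insert reported in the test above for any merge count of at least 16.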
2023-11-08 11:03:44 +08:00
f831774121
[test](regression) Add more regression test for FE ( #26384 )
2023-11-06 11:10:37 +08:00
268c69971d
[fix](stats) Store max/min by base64
2023-11-01 14:31:35 +08:00
d698fb9225
[regression-test](fix) fix two regression test case bug ( #26071 )
2023-10-31 03:48:29 -05:00
0eea19403e
[fix](stats) analyze specific column only if indicate column in analyze stmt ( #25660 )
2023-10-25 04:08:10 -05:00
8c5af5a088
[fix](case) Fix test_analyze case ( #25476 )
...
The case had the following problems before this PR:
1. It used count(*) to check whether all columns were analyzed.
2. It returned directly when the FE count > 1.
Co-authored-by: AKIHA <cyborgz1999@example.com>
2023-10-17 15:06:01 +08:00
9deda929b9
[refactor](stats) Use id instead name in analysis info ( #25213 )
2023-10-16 03:49:53 -05:00
b17bac6323
[fix](case) use the custom DB explicitly in analyze_stats.groovy ( #25285 )
...
Co-authored-by: stephen <hello-stephen@qq.com>
Use the custom DB explicitly in analyze_stats.groovy.
2023-10-11 19:28:14 +08:00
771b8b5bec
[fix](case) Update analyze_stats.groovy ( #25146 )
2023-10-10 12:51:29 +08:00
7ceb029a17
[Fix](statistics)Fix alter column stats data size is always 0 bug ( #24891 )
...
Fix a bug where the data size set by alter column stats is always 0.
2023-10-09 15:48:11 +08:00