doris

Author	SHA1	Message	Date
morrySnow	ea2fbfaffa	[feature](Nereids) support agg state type in create table (#32171 ) this PR introduce a behavior change, syntax of create table with agg_state type is changed.	2024-03-15 18:04:49 +08:00
Jibing-Li	e9c1638507	Add waiting timeout while creating mv and row count report. (#31944 )	2024-03-09 19:44:54 +08:00
Jibing-Li	de9b5f7b69	[improvement](statistics)Log one bdbje record for one load transaction. #31619 (#31697 )	2024-03-02 23:04:26 +08:00
Jibing-Li	82faa7469b	Support analyze rollup. (#31576 )	2024-03-01 04:25:43 +08:00
Jibing-Li	ddb37d7371	Fix analyze mv from follower case bug. (#31523 )	2024-02-29 08:42:35 +08:00
Jibing-Li	3ca412efe3	Return UNKNOWN column stats if ndv is 0. (#31439 )	2024-02-29 08:42:35 +08:00
Jibing-Li	57d604c48a	Fix statistics p0 case. (#31330 )	2024-02-23 20:44:43 +08:00
Jibing-Li	04c295c4c2	Improve show column stats performance. (#31298 )	2024-02-23 19:03:28 +08:00
Jibing-Li	18955174e9	Fix analyze mv and mtmv p0 case. (#31191 )	2024-02-21 17:01:19 +08:00
Jibing-Li	4aaab6fb44	[fix](statistics)Refresh follower FE cache after alter column stats. Support alter index column stats (#31108 ) 1. Refresh follower FE cache after alter column stats. So that follower could update the cached stats too. 2. Support alter index column stats.	2024-02-20 16:23:53 +08:00
Jibing-Li	edfa5e750b	Remove slow auto analyze test case. (#31037 )	2024-02-18 14:45:25 +08:00
Jibing-Li	e086d0d719	[test](statistics)Add analyze mtmv test case (#30847 )	2024-02-16 10:12:23 +08:00
Jibing-Li	042934e545	Add auto analyze mv and show task case. (#30894 )	2024-02-16 10:12:23 +08:00
Jibing-Li	e9f9fdf9af	Fix unstable analyze mv case. (#30859 )	2024-02-05 22:00:36 +08:00
Jibing-Li	9e76592297	Support analyze materialized view. (#30540 )	2024-02-04 22:21:16 +08:00
morrySnow	afab713048	[fix](Nereids) query mv column directly (#30444 )	2024-01-29 19:03:47 +08:00
Jibing-Li	9174ac921b	Fix statistics p0. (#30351 )	2024-01-25 21:33:50 +08:00
Jibing-Li	86d7a8be44	[improvement](statistics nereids)Nereids support select mv. (#30267 )	2024-01-25 13:24:09 +08:00
Jibing-Li	b3e37b3efa	[unit test](statistics)Add unit test case for auto analyze. #29904 Add unit and p0 test case for auto analyze.	2024-01-16 18:31:27 +08:00
Jibing-Li	e4707154fa	[opt](statistics) create or update table stats after alter column stats. Create or update table stats after alter column stats. Set flag to disable auto analyze for the table after user inject column stats.	2024-01-12 11:44:21 +08:00
Jibing-Li	ddaa645a4f	[improvement](statistics) Force to use zonemap for collecting string type min max. (#29631 ) Force to use zonemap for collecting string type min max. String type does not use zonemap for min max, because the zonemap value on the BE side is truncated at 512 bytes, which may make the value inaccurate. But that is acceptable for statistics min max, and it also avoids scanning the whole table while sampling.	2024-01-12 11:34:07 +08:00
Jibing-Li	612e0631ac	Do not collect min max for agg table value columns while doing sample analyze. (#29483 )	2024-01-06 17:15:40 +08:00
Jibing-Li	2308881e9f	[improvement](statistics) Analyze partition columns when new partition loaded data for the first time. (#29154 ) The first time data is loaded into a partition, we need to analyze the partition columns even when the health rate is high. Otherwise, the min max value of the column may not include the new partition values, which may cause a bad plan.	2023-12-29 14:36:48 +08:00
Jibing-Li	f4c5ce260b	[fix](statistics)Fix rowCount==0 while analyzing bug (#28969 ) Sample analyze needs to get the row count by using table.getRowCount(). This method is not updated in real time, which may cause the sample task to scan the whole table. This PR fixes that: set a flag that indicates the analyze job is for an empty table and skip scanning the table. Meanwhile, don't reset updatedRows in this case. Set hugeTableAutoAnalyzeIntervalInMillis = 0 because all default huge table size has been set to 0.	2023-12-27 23:04:37 +08:00
Jibing-Li	9d5b9cc452	[fix](statistics)Fix drop stats fail silently bug. (#28635 ) Drop stats uses an IN predicate to filter the column stats to delete. The default length of an IN predicate is 1024, so dropping stats for a table with more than 1024 columns may fail. This PR splits the delete sql based on the IN predicate length.	2023-12-20 15:41:25 +08:00
Jibing-Li	099b1b7106	[fix](statistics)Fix column stats trigger info bug (#28303 ) Before, we didn't update the jobType info in ColStatsMeta. This caused the jobType to always stay at the value it was first set to. For example, if we manually analyzed a table, the jobType would always be MANUAL, even if the table was auto analyzed again later.	2023-12-13 20:31:03 +08:00
Jibing-Li	cd3d31ba13	[fix](statistics)Escape load stats sql (#28117 ) Escape load stats sql, because column name may contain special characters.	2023-12-11 20:25:18 +08:00
Jibing-Li	4cac07be30	[improvement](statistics)Analyze empty table. #28077 Analyze a table even when it's empty. The result should be like this: mysql> show column stats nation; +-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+-------------+---------------------+ | column_name | count | ndv | num_null | data_size | avg_size_byte | min | max | method | type | trigger | query_times | updated_time | +-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+-------------+---------------------+ | n_comment | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 | | n_nationkey | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 | | n_regionkey | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 | | n_name | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | N/A | N/A | FULL | FUNDAMENTALS | MANUAL | 0 | 2023-12-06 19:22:09 | +-------------+-------+------+----------+-----------+---------------+------+------+--------+--------------+---------+----	2023-12-07 10:16:52 +08:00
AKIRA	7f1b558011	[fix](stats) truncate min/max if too long (#27955 ) For some string values the max/min might be a very long string that takes too much FE memory, so we truncate it to 1024 chars if it's too long	2023-12-05 20:40:38 +08:00
Jibing-Li	02512cd0e2	[fix](stats)Drop stats or update updated rows after truncate table (#27931 ) 1. Also clear follower's stats cache when doing drop stats. 2. Drop stats when truncate a table.	2023-12-05 14:53:35 +08:00
AKIRA	fc2129a09f	[fix](stats) skip collect agg_state type (#27640 )	2023-11-28 11:43:48 +08:00
AKIRA	732a3fa9c8	[fix](stats) fix auto collector always create sample job no matter the table size (#26968 )	2023-11-22 02:42:40 -06:00
Siyang Tang	4e105e94a2	[fix](statistics) fix updated rows incorrect due to typo in code (#26979 )	2023-11-15 05:25:46 -06:00
AKIRA	290070074a	[refactor](stats) refactor collection logic and opt some config (#26163 ) 1. not collect partition stats anymore 2. merge insert of stats 3. delete period collector since it is useless 4. remove enable_auto_sample 5. move some config related to stats to global session variable. Before this PR, when analyzing a table, the insert count equals column count times 2. After this PR, the insert count for analyzing a table is reduced to column count / insert_merge_item_count. According to my test, when analyzing tpch lineitem, the insert sql count is 1	2023-11-08 11:03:44 +08:00
walter	f831774121	[test](regression) Add more regression test for FE (#26384 )	2023-11-06 11:10:37 +08:00
AKIRA	268c69971d	[fix](stats) Store max/min by base64	2023-11-01 14:31:35 +08:00
shuke	d698fb9225	[regression-test](fix) fix two regression test case bug (#26071 )	2023-10-31 03:48:29 -05:00
AKIRA	0eea19403e	[fix](stats) analyze specific column only if indicate column in analyze stmt (#25660 )	2023-10-25 04:08:10 -05:00
AKIRA	8c5af5a088	[fix](case) Fix test_analyze case (#25476 ) It had the following problems before this PR: 1. used count(*) to check if all columns were analyzed 2. returned directly when fe count > 1 Co-authored-by: AKIHA <cyborgz1999@example.com>	2023-10-17 15:06:01 +08:00
AKIRA	9deda929b9	[refactor](stats) Use id instead name in analysis info (#25213 )	2023-10-16 03:49:53 -05:00
Dongyang Li	b17bac6323	[fix](case) use the custom DB explicitly in analyze_stats.groovy (#25285 ) Co-authored-by: stephen <hello-stephen@qq.com> use the custom DB explicitly in analyze_stats.groovy	2023-10-11 19:28:14 +08:00
Dongyang Li	771b8b5bec	[fix](case) Update analyze_stats.groovy (#25146 )	2023-10-10 12:51:29 +08:00
Jibing-Li	7ceb029a17	[Fix](statistics)Fix alter column stats data size is always 0 bug (#24891 ) Fix alter column stats data size is always 0 bug.	2023-10-09 15:48:11 +08:00
AKIRA	e5fe4e5b83	[refactor](stats) Refactor TableStatsMeta 1. Add an abstraction for column stats status, which is required for further optimization and feature development 2. Enable analyze test in p0 that was unexpectedly disabled before	2023-10-07 19:48:54 +08:00
AKIRA	ffad945dd1	[opt](optimizer) Recycle expired table stats #24777 Remove table stats when olap table is dropped	2023-10-07 11:31:45 +08:00
Jibing-Li	b87ea68720	[Fix](statistics) Fix analyze olap table couldn't get partition names bug (#24696 ) Call getPartitionNames to get all partitions while analyzing an olap table. It must not return NULL, otherwise analyze for an olap table will do nothing.	2023-09-21 10:28:37 +08:00
AKIRA	67e8951b72	[fix](stats) Fix analyze failed when there are thousands of partitions. (#24521 ) It was caused by using the same query id for multiple queries of the same olap analyze task, while many structures related to query execution depend on the query id.	2023-09-18 17:27:10 +08:00
AKIRA	fa37a8bba8	[opt](stats) remove corresponding col stats status if the loading at the end of analyze task is failed (#24405 )	2023-09-15 17:46:48 +08:00
AKIRA	786a721e03	[feat](stats) Support analyze with sample automatically (#23978 ) 1. Analyze with sample automatically when table size is greater than huge_table_lower_bound_size_in_bytes(5G by default). User can disable this feature by fe option enable_auto_sample 2. Support grammar like `ANALYZE TABLE test WITH FULL` to force a full analyze regardless of table size 3. Fix bugs where table stats don't get updated properly when stats are dropped, or when only a few columns are analyzed	2023-09-13 19:42:10 +08:00
zy-kkk	0dee7246bc	Revert "[opt](stats) remove table stats when table has been removed (#23803 )" (#24058 ) This reverts commit 66d3371400207f568c7ff6ff6bf5f4f0da32bd2c. Reverts #23803	2023-09-07 23:25:09 +08:00


79 Commits