This PR adds a plan checker to simplify plan verification in unit tests.
The plan checker is used as follows:
```java
new PlanChecker()
        .plan(myPlan)
        .applyBottomUp(myRule)
        .matches(expectedPattern);
```
I tried to fix the bug in #10095. The error occurred when I first created an empty table and queried it.
But I can't reproduce it.
So I added a warn log here to observe it.
Previously, if a table was unpartitioned, when executing the following alter stmt:
```
alter table tbl1 set ("replication_num" = "1");
```
Only the tbl1 partition's replication_num was changed
(an unpartitioned table also has a single partition with the same name as the table),
but the table's default replication_num was unchanged.
So when executing `show create table tbl1`, the replication_num still showed the original value.
This CL mainly changes:
1. For an unpartitioned table, if the user changes its replication num, both the table's and the partition's replication_num are changed.
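The intended behavior can be illustrated with a minimal sketch. `ReplicationSketch` and its fields are hypothetical names for illustration only; they are not the actual Doris catalog classes, and the sketch models only an unpartitioned table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of an unpartitioned table; not the real Doris catalog classes. */
class ReplicationSketch {
    final String name;
    // Table-level default, which is what `show create table` reports.
    short defaultReplicationNum;
    // partition name -> replication_num
    final Map<String, Short> partitionReplicationNum = new LinkedHashMap<>();

    ReplicationSketch(String name, short replicationNum) {
        this.name = name;
        this.defaultReplicationNum = replicationNum;
        // An unpartitioned table still owns one partition named after the table.
        partitionReplicationNum.put(name, replicationNum);
    }

    /** Models: alter table <name> set ("replication_num" = "<n>"); */
    void setReplicationNum(short n) {
        // Before the change only the partition was updated; now both are.
        partitionReplicationNum.put(name, n);
        defaultReplicationNum = n;
    }
}
```

After `setReplicationNum((short) 1)` on a table created with replication 3, both `defaultReplicationNum` and the entry for the single partition hold 1, so the table-level value no longer lags behind.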
This PR mainly supplements the syntax of the previous PR (#8861).
It supports manually injecting statistics, including table, partition, and column statistics.
Table/partition stats types:
- row_count
- data_size
Column stats types:
- ndv
- avg_size
- max_size
- num_nulls
- min_value
- max_value
Modify table or partition statistics:
```
ALTER TABLE table_name
SET STATS ('k1' = 'v1', ...) [PARTITIONS(p_name1, p_name2...)]
```
Modify column statistics:
```
ALTER TABLE table_name MODIFY COLUMN columnName
SET STATS ('k1' = 'v1', ...) [PARTITIONS(p_name1, p_name2...)]
```
Some notes:
- Statistics can only be injected into OLAP tables.
- Injecting statistics into temporary partitions is not supported.
- When injecting statistics into a partitioned table, users must specify a partition name.
- If multiple partitions are specified, the same stats are injected into each of them.
- The current code also has mock statistics @zhengshij
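The notes above can be sketched as a small validation routine. This is only an illustration under assumptions: `StatsInjectionSketch` and `inject` are hypothetical names, not the classes added by this PR, and the fallback to the table name for an unpartitioned table is an assumption based on such a table having a single partition named after it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of table/partition stats injection; not the PR's code. */
class StatsInjectionSketch {
    // Stats types accepted at table/partition level, as listed in this description.
    static final Set<String> TABLE_STATS =
            new HashSet<>(Arrays.asList("row_count", "data_size"));

    /** Returns target (partition) name -> injected stats. */
    static Map<String, Map<String, String>> inject(
            String table, boolean partitioned,
            List<String> partitions, Map<String, String> stats) {
        for (String type : stats.keySet()) {
            if (!TABLE_STATS.contains(type)) {
                throw new IllegalArgumentException("unsupported stats type: " + type);
            }
        }
        if (partitioned && partitions.isEmpty()) {
            throw new IllegalArgumentException("partitioned table requires a partition name");
        }
        // Assumption: an unpartitioned table is addressed by its own name.
        List<String> targets =
                partitions.isEmpty() ? Collections.singletonList(table) : partitions;
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (String target : targets) {
            // The same stats are copied to every specified partition.
            result.put(target, new HashMap<>(stats));
        }
        return result;
    }
}
```

Calling `inject` with partitions `p1` and `p2` and `row_count = 100` yields identical entries for both partitions, while a partitioned table with an empty partition list is rejected.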
* [feature](planner): push limit to olapscan when meet sort.
* if olap_scan_node's sort_info is set, push sort_limit, read_orderby_key
and read_orderby_key_reverse for olap scanner
* There is a common query pattern to find the latest time series data.
e.g. SELECT * from t_log WHERE t>t1 AND t<t2 ORDER BY t DESC LIMIT 100
If the ORDER BY columns are a prefix of the table's sort key, the query
can be greatly optimized to read much less data instead of reading all data
between t1 and t2.
Because the ORDER BY columns and the table's sort key have the same order,
it is enough to read the LIMIT N rows of each related segment and merge them.
1. set read_orderby_key to true for read_params and _reader_context
if olap_scan_node's sort info is set.
2. set read_orderby_key_reverse to true for read_params and _reader_context
if is_asc_order is false.
3. rowset reader forces a merge read of segments if read_orderby_key is true.
4. block reader and tablet reader force a merge read of rowsets if read_orderby_key is true.
5. for ORDER BY DESC, read and compare in reverse order
5.1 segment iterator reads backward using a new BackwardBitmapRangeIterator and
reverses the result block before returning it to the caller.
5.2 VCollectIterator::LevelIteratorComparator, VMergeIteratorContext return the
opposite result for _is_reverse order in their compare functions.
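The idea behind the steps above can be sketched in a few lines. This is a simplified illustration, not the BE implementation (which is C++): `TopNMergeSketch` and `topN` are hypothetical names, each segment is modeled as an ascending list of integer keys, and the backward cursor only stands in for what BackwardBitmapRangeIterator does on real segments.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a sort-key-aware top-N merge; not the Doris BE code. */
class TopNMergeSketch {
    /**
     * @param segments  each inner list is sorted ascending by the sort key
     * @param limit     the LIMIT N of the query
     * @param ascending false models ORDER BY ... DESC
     */
    static List<Integer> topN(List<List<Integer>> segments, int limit, boolean ascending) {
        // For DESC the comparison result is simply inverted.
        Comparator<Integer> order = ascending
                ? Comparator.<Integer>naturalOrder()
                : Comparator.<Integer>reverseOrder();
        // Heap of cursors: {segment index, position within that segment}.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> order.compare(
                segments.get(a[0]).get(a[1]), segments.get(b[0]).get(b[1])));
        for (int s = 0; s < segments.size(); s++) {
            List<Integer> seg = segments.get(s);
            if (!seg.isEmpty()) {
                // DESC starts at the tail of each segment and walks backward.
                heap.add(new int[] {s, ascending ? 0 : seg.size() - 1});
            }
        }
        List<Integer> result = new ArrayList<>();
        // Stop as soon as LIMIT rows are produced; the rest is never read.
        while (result.size() < limit && !heap.isEmpty()) {
            int[] cur = heap.poll();
            List<Integer> seg = segments.get(cur[0]);
            result.add(seg.get(cur[1]));
            int next = ascending ? cur[1] + 1 : cur[1] - 1;
            if (next >= 0 && next < seg.size()) {
                heap.add(new int[] {cur[0], next});
            }
        }
        return result;
    }
}
```

With segments `[1, 4, 7]`, `[2, 5, 8]` and `[3, 6, 9]`, `topN(segments, 4, false)` returns `[9, 8, 7, 6]` and touches at most N rows per segment, which is the saving the optimization relies on.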
Co-authored-by: jackwener <jakevingoo@gmail.com>
1. Add InPredicate expression parser and translator
2. Add regression-test for In predicate (in nereids_syntax)
3. Support NOT EqualTo and NOT InPredicate in ExpressionTranslator#visitNot()
This pull request includes part of the implementation of statistics (https://github.com/apache/incubator-doris/issues/6370). It does not affect any existing code, and users will not yet be able to create statistics jobs.
It implements the display of statistics job information. With the following syntax, users can view the corresponding job information.
syntax:
```
SHOW ANALYZE
[TABLE | ID]
[
WHERE
[STATE = ["PENDING"|"SCHEDULING"|"RUNNING"|"FINISHED"|"FAILED"|"CANCELLED"]]
]
[ORDER BY ...]
[LIMIT limit][OFFSET offset];
```
e.g.
| id | create_time | start_time | finish_time | error_msg | scope | progress | state |
| ----- | ----------------------- | ----------------------- | ----------------------- | --------- | ------------------- | -------- | -------- |
| 60051 | 2022-07-21 01:26:26.173 | 2022-07-21 01:26:26.186 | 2022-07-21 01:26:27.104 | | table1(citycode,pv) | 5/5 | FINISHED |
* [fix](hive-table) fix bug that hive external table can not query table created by Tez
If the Hive table is created by Tez, the data files are located in second-level directories under the table location, e.g.:
/user/hive/warehouse/region_tmp_union_all/
---/user/hive/warehouse/region_tmp_union_all/1
---/user/hive/warehouse/region_tmp_union_all/2
We should recursively traverse the directory to get the real files.
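A minimal sketch of such a recursive listing is shown below. It uses `java.nio.file` on a local file system purely as a stand-in; the actual fix works against the Hive table's storage, and `RecursiveListingSketch` and `listDataFiles` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: collect data files below a table location at any depth. */
class RecursiveListingSketch {
    static List<Path> listDataFiles(Path tableLocation) throws IOException {
        // Files.walk descends into subdirectories such as .../1 and .../2.
        try (Stream<Path> paths = Files.walk(tableLocation)) {
            return paths
                    .filter(Files::isRegularFile) // skip the directories themselves
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Listing only the direct children of the table location would return the directories `1` and `2` but no data files, which is why the non-recursive lookup found nothing to query.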
When upgrading to 1.x, the state of some tables may change to ROLLUP,
making it impossible to create/drop/modify partitions.
I haven't found the root cause yet, so I added some logs to observe
the change of the table's state.