Commit Graph

2524 Commits

Author SHA1 Message Date
04d26ddf22 [feature-wip](multi-catalog)Support use catalog.db and show databases from catalog stmt (#11338)
Support use catalog.db and show databases from catalog stmt.
2022-08-11 09:50:32 +08:00
02a3f21b65 [fix](analyzer) InferFilterRule bug: equations in on clause of outer/anti join are not inferable. (#11515) 2022-08-11 09:36:43 +08:00
d8427037be [Bug](doe) Fix some bug (#11594) 2022-08-10 21:00:05 +08:00
976e7685db [minor](*): remove redundant log and unused code. (#11620) 2022-08-10 19:28:04 +08:00
8c344d33e6 [Enhancement](meta) sort result by tablename when show tables like show data (#11638)
* [improvement] sort result by tablename when show tables like 'show data'
2022-08-10 19:26:30 +08:00
c8418d13b5 [improvement](config)Use session variable to replace configuration for 'enable_function_pushdown' (#11641) 2022-08-10 19:25:02 +08:00
89d3809a0e [feature](Nereids): Enable the costAndEnforcerJob (#11604)
1. Enable the costAndEnforcerJob
2. Fix some bug of enforcer.
3. polish property name and method
2022-08-10 15:17:15 +08:00
ae90d45594 [Bug](show data skew)fix show data skew logic (#11616)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-08-10 08:18:39 +08:00
169996d8e4 [feature](information_schema) add rowsets table into information_s… (#11266)
* [feature](information_schema) add 'segments' table into information_schema
2022-08-09 18:15:54 +08:00
7b67661262 add plan checker (#11619)
This PR proposes to add a plan checker to facilitate plan checking in unit tests.

Usage of plan checker is like below:
```java
new PlanChecker()
  .plan(myPlan)
  .applyBottomUp(myRule)
  .matches(expectedPattern);
```
2022-08-09 17:19:30 +08:00
cc6c92935a [minor](log) add a warn log to observe invalid query profile (#11588)
I tried to fix the bug in #10095. The error occurred when I first created an empty table and queried it,
but I could not reproduce it again.
So I added a warn log here to observe it.
2022-08-09 14:10:03 +08:00
2cadf85988 [improvement](alter) modify table's default replica if table is unpartitioned (#11550)
Previously, when the following alter statement was executed on an unpartitioned table:
```
alter table tbl1 set ("replication_num" = "1");
```
only the replication_num of tbl1's partition was changed
(an unpartitioned table still has a single partition with the same name as the table),
but the table's default replication_num was left unchanged.
So `show create table tbl1` still showed the original replication_num.

This CL mainly changes:
1. For an unpartitioned table, if the user changes its replication_num, both the table's and the partition's replication_num are changed.
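
The change described above can be pictured with a minimal sketch. It assumes a toy in-memory model of an unpartitioned table; the class, field, and method names are hypothetical and not taken from the Doris code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model, not Doris code: an unpartitioned table keeps a
// table-level default replication_num plus one partition named after the table.
final class ReplicationNumSketch {

    static final class UnpartitionedTable {
        final String name;
        int defaultReplicationNum;
        final Map<String, Integer> partitionReplicationNum = new LinkedHashMap<>();

        UnpartitionedTable(String name, int replicationNum) {
            this.name = name;
            this.defaultReplicationNum = replicationNum;
            // The single implicit partition shares the table's name.
            this.partitionReplicationNum.put(name, replicationNum);
        }

        // Models: alter table <name> set ("replication_num" = "<n>");
        void setReplicationNum(int replicationNum) {
            // Behavior before the change: only the partition was updated.
            partitionReplicationNum.put(name, replicationNum);
            // Behavior added by the change: the table default follows as well.
            defaultReplicationNum = replicationNum;
        }
    }

    public static void main(String[] args) {
        UnpartitionedTable tbl1 = new UnpartitionedTable("tbl1", 3);
        tbl1.setReplicationNum(1);
        System.out.println(tbl1.defaultReplicationNum + " " + tbl1.partitionReplicationNum);
    }
}
```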
2022-08-09 14:09:38 +08:00
436ee0dd1d [feature-wip](statistics) step4.1: manually inject statistics for a table or column (#11030)
This PR mainly supplements the syntax of the previous PR (#8861):
it lets users manually inject statistics, including table, partition, and column statistics.

table/partition stats type:
- row_count
- data_size

column stats type:
- ndv
- avg_size
- max_size
- num_nulls
- min_value
- max_value

Modify table or partition statistics:
```
ALTER TABLE table_name 
SET STATS ('k1' = 'v1', ...) [PARTITIONS(p_name1, p_name2...)]
```

Modify column statistics:
```
ALTER TABLE table_name MODIFY COLUMN columnName 
SET STATS ('k1' = 'v1', ...) [PARTITIONS(p_name1, p_name2...)]
```

Some notes:
- Statistics can only be injected into OLAP tables.
- Injecting statistics into temporary partitions is not supported.
- When injecting statistics into a partitioned table, users need to specify a partition name.
- If multiple partitions are specified, the same stats are injected into each of them.
- The current code also has mock statistics @zhengshij
2022-08-09 11:24:23 +08:00
2b918eaccd [fix](Doris On ES) Fix es not support aliases error (#11547)
1. Fix es not support aliases error
2. Fix multicatalog query es error
3. add ut
2022-08-09 09:36:05 +08:00
f9b151744d optimize topn query if order by columns is prefix of sort keys of table (#10694)
* [feature](planner): push limit to olapscan when meet sort.

* if olap_scan_node's sort_info is set, push sort_limit, read_orderby_key
and read_orderby_key_reverse for olap scanner

* There is a common query pattern to find the latest time series data,
 e.g. SELECT * from t_log WHERE t>t1 AND t<t2 ORDER BY t DESC LIMIT 100

If the ORDER BY columns are a prefix of the table's sort key, the query can
be greatly optimized to read far less data instead of reading all data
between t1 and t2.

Because the ORDER BY columns and the table's sort key share the same order,
it is enough to read LIMIT N rows from each related segment and merge them into N rows.

1. set read_orderby_key to true for read_params and _reader_context
   if olap_scan_node's sort info is set.
2. set read_orderby_key_reverse to true for read_params and _reader_context
   if is_asc_order is false.
3. rowset reader force merge read segments if read_orderby_key is true.
4. block reader and tablet reader force merge read rowsets if read_orderby_key is true.

5. for ORDER BY DESC, read and compare in reverse order
5.1 segment iterator read backward using a new BackwardBitmapRangeIterator and
    reverse the result block before return to caller.
5.2 VCollectIterator::LevelIteratorComparator, VMergeIteratorContext return
    opposite result for _is_reverse order in its compare function.
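
The merge idea above can be sketched as a small k-way merge. This is a minimal illustration under stated assumptions, not the Doris implementation: each segment is modeled as an in-memory list already sorted ascending by the sort key, and `TopNMergeSketch`, `Cursor`, and `topN` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration, not Doris code: every "segment" is a list of
// timestamps sorted ascending, like rows stored in the table's sort key order.
final class TopNMergeSketch {

    // Position inside one segment; moves backward for DESC, forward for ASC.
    private static final class Cursor {
        final List<Long> segment;
        int pos;

        Cursor(List<Long> segment, int pos) {
            this.segment = segment;
            this.pos = pos;
        }

        long current() {
            return segment.get(pos);
        }
    }

    // Returns the first `limit` values in the requested order and stops as soon
    // as enough rows are merged, instead of reading every segment completely.
    static List<Long> topN(List<List<Long>> segments, int limit, boolean ascending) {
        Comparator<Cursor> byValue = Comparator.comparingLong(Cursor::current);
        // For DESC the comparison is reversed, so the largest value is polled first.
        PriorityQueue<Cursor> heap =
                new PriorityQueue<>(ascending ? byValue : byValue.reversed());
        for (List<Long> segment : segments) {
            if (!segment.isEmpty()) {
                heap.add(new Cursor(segment, ascending ? 0 : segment.size() - 1));
            }
        }
        List<Long> result = new ArrayList<>();
        while (result.size() < limit && !heap.isEmpty()) {
            Cursor cursor = heap.poll();
            result.add(cursor.current());
            cursor.pos += ascending ? 1 : -1;
            if (cursor.pos >= 0 && cursor.pos < cursor.segment.size()) {
                heap.add(cursor); // re-insert with the cursor's next value
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Long>> segments = List.of(
                List.of(1L, 4L, 9L),
                List.of(2L, 3L, 10L),
                List.of(5L, 6L, 7L));
        // Roughly corresponds to ORDER BY t DESC LIMIT 4 over three segments.
        System.out.println(topN(segments, 4, false));
    }
}
```

The reversed comparator plays the role that the opposite compare result plays for ORDER BY DESC in step 5.2, and walking each list from its tail stands in for the backward segment read in step 5.1.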

Co-authored-by: jackwener <jakevingoo@gmail.com>
2022-08-09 09:08:44 +08:00
b9f7f63c81 [Fix](planner) Fix wrong planner with count(*) optimizer for cross join optimization (#11569) 2022-08-09 09:01:25 +08:00
7c950c7cd5 [feature](Nereids) support cross join in Nereids (#11502)
support cross join in Nereids

1. add PhysicalNestedLoopJoin
2. Translate PhysicalNestedLoopJoin to CrossJoinNode in PhysicalPlanTranslator
2022-08-08 22:14:27 +08:00
1701ffa7c0 [fix](planner)push constant expr in predicate to outer join's other conjuncts by mistake (#11527)
constant expr in predicate should not be pushed to outer join's other conjuncts
2022-08-08 20:56:08 +08:00
4f60b37402 [feature](Nereids):refactor and add outer join LAsscom. (#11531)
refactor and add outer join LAsscom.
Extract the common function to LAsscomHelper.
2022-08-08 20:08:12 +08:00
Fy
647b6e843a [feature](nereids)add InPredicate in expressions (#11264)
1. Add InPredicate expression parser and translator
2. Add regression-test for In predicate (in nereids_syntax)
3. Support NOT EqualTo and NOT InPredicate in ExpressionTranslator#visitNot()
2022-08-08 19:59:54 +08:00
c1c635e944 [Refactor](Nereids) Fix expression constant and improve SlotExtractor (#11513)
1. Fix expression constant and add unit test.
2. Improve logic in SlotExtractor and remove useless class IterationVisitor.
2022-08-08 17:36:21 +08:00
411254c128 [Enhancement](hdfs) Support loading hdfs config from hdfs-site.xml (#11571) 2022-08-08 14:18:28 +08:00
4f5db35990 [fix](date) fix the value may be changed during the parsing of date and datetime types (#11573)
* [fix](date) fix the value may be changed during the parsing of date and datetime types
2022-08-08 08:58:30 +08:00
8802a41918 fix profile may cause query slow (#11386)
Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-08-07 20:52:52 +08:00
683a1261c6 [Enhancement](vectorized) Runtime Filter support equivalent slot of outer join (#11530) 2022-08-06 08:10:28 +08:00
57b7a416d2 [chore](build) add apache snapshot maven repo to repositories (#11549) 2022-08-06 07:15:28 +08:00
3070318f95 [Enhancement](IdGenerator) Use IdGeneratorBuffer to get better performance for creating tablet in fe when do alter table job (#11524)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-08-05 23:27:29 +08:00
d88d1239c5 [feature] (Nereids) support limit clause (#11209)
including:
1. limit clause parser
2. implementation rule to transform LogicalLimit to PhysicalLimit
2022-08-05 11:58:45 +08:00
6eb8ac0ebf [feature-wip][multi-catalog]Support caseSensitive field name in file scan node (#11310)
* Implement case-sensitive field names in file scan node
2022-08-05 08:03:16 +08:00
d4e6e3edfd [bugfix]fix time accuracy (#11521) 2022-08-04 21:36:20 +08:00
e11024f5cc [enhancement](Nereids)set default join type to CROSS_JOIN (#11459)
Set the default join type to CROSS_JOIN for joins that have no equality ON condition when parsing the SQL string.
2022-08-04 21:25:24 +08:00
6dc41d57f3 [enhancement](Nereids)support count, min and avg function (#11374)
1. add count function
2. add min function
3. add avg function
2022-08-04 21:19:32 +08:00
591b7f3f92 [multi-catalog](oss)Support hive external table on Ali oss. (#11489) 2022-08-04 17:44:34 +08:00
95091256b0 [chore](deps) update bdbje to doris bdbje, update libhdfs3 to improve performance (#11497) 2022-08-04 17:10:56 +08:00
9f221a703b [feature-wip](statistics) step5: show statistics job information (#8862)
This pull request implements part of the statistics feature (https://github.com/apache/incubator-doris/issues/6370). It does not affect any existing code, and users will not be able to create statistics jobs.

It implements the display of statistics job information. With the following syntax, users can view the corresponding job information.

syntax:
```
SHOW ANALYZE
    [TABLE | ID]
    [
        WHERE
        [STATE = ["PENDING"|"SCHEDULING"|"RUNNING"|"FINISHED"|"FAILED"|"CANCELLED"]]
    ]
    [ORDER BY ...]
    [LIMIT limit][OFFSET offset];
```

e.g.
| id    | create_time             | start_time              | finish_time             | error_msg | scope               | progress | state    |
| ----- | ----------------------- | ----------------------- | ----------------------- | --------- | ------------------- | -------- | -------- |
| 60051 | 2022-07-21 01:26:26.173 | 2022-07-21 01:26:26.186 | 2022-07-21 01:26:27.104 |           | table1(citycode,pv) | 5/5      | FINISHED |
2022-08-04 16:10:49 +08:00
397bf354db [WIP](optional) using hash set to distinct single value (#11246)
* [WIP](optional) using hash set to distinct single value


Co-authored-by: wangbo36@meituan.com <wangbo36@meituan.com>
2022-08-04 15:52:58 +08:00
9078ab4d24 [feature](FE): add new property to control whether use light schema change or not (#11169) 2022-08-04 15:49:05 +08:00
Pxl
ec3c911f97 [Feature][Materialized-View] support materialized view on vectorized engine (#10792) 2022-08-04 14:07:48 +08:00
e7f378fec6 [Enhancement](IdGenerator) Use IdGeneratorBuffer to get better performance for getNextId operation when create table, truncate table, add partition and so on (#11479)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-08-04 11:21:35 +08:00
7703912b3e [improvement](error msg)improve the err msg returned when the key not in columns when create table (#11490) 2022-08-04 11:20:49 +08:00
Pxl
ce68d24e95 [Bug](function) fix current_date not equal to curdate (#11463)
* fix current_date not equal to curdate
2022-08-04 09:25:50 +08:00
33053ad1fe [improvement](outfile) support multibyte separator in outfile clause (#11487) 2022-08-04 09:06:06 +08:00
4ba2422039 [improvement](fe) Remove constant keys in aggregation (#11434) 2022-08-03 19:43:35 +08:00
a47eff1e46 [enhancement](Nereids) support all join type in Nereids that could do join by HashJoinNode (#11446)
add and test join type:
1. inner join
2. left outer join
3. right outer join
4. left semi join
5. right semi join
6. left anti join
7. right anti join
2022-08-03 12:14:17 +08:00
5b9b6c9065 [WIP](decimalv3) WIP (#11443)
* [feature-WIP](decimalv3) fix some bugs of decimalv3
2022-08-03 11:21:36 +08:00
77d82bb292 [Bug](MaterializedView) Fix bug of light schema change do not set right unique id cause MV coredump (#11396)
Fix bug of light schema change do not set right unique id cause MV coredump
2022-08-03 11:21:28 +08:00
4ae4909a2b [refactor](tvf) table-valued-function table (#11452) 2022-08-03 10:39:16 +08:00
c581855a41 [fix](hive-table) fix bug that hive external table can not query table created by Tez (#11345)
* [fix](hive-table) fix bug that hive external table can not query table created by Tez

If the Hive table is created by Tez, the table location contains second-level directories, e.g.:

/user/hive/warehouse/region_tmp_union_all/
---/user/hive/warehouse/region_tmp_union_all/1
---/user/hive/warehouse/region_tmp_union_all/2

We should recursively traverse the directory to get the real files.
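
A recursive traversal of this kind could look like the sketch below. It is only an illustration: it uses the local file system through `java.nio.file` as a stand-in for HDFS, and the class name, method names, and the `000000_0` file name are hypothetical rather than taken from the fix itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: the local file system stands in for HDFS, and none of
// these names come from the Doris code base.
final class RecursiveListingSketch {

    // Collects every regular file under `dir`, descending into subdirectories.
    static List<Path> listDataFiles(Path dir) {
        List<Path> files = new ArrayList<>();
        collect(dir, files);
        return files;
    }

    private static void collect(Path dir, List<Path> out) {
        List<Path> children;
        try (Stream<Path> stream = Files.list(dir)) {
            children = stream.sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (Path child : children) {
            if (Files.isDirectory(child)) {
                collect(child, out); // second-level (or deeper) directory
            } else {
                out.add(child);
            }
        }
    }

    // Builds a Tez-like layout in a temp directory and lists it, returning
    // paths relative to the table location.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("region_tmp_union_all");
            for (String sub : List.of("1", "2")) {
                Path subDir = Files.createDirectories(root.resolve(sub));
                Files.createFile(subDir.resolve("000000_0")); // made-up file name
            }
            List<String> relative = new ArrayList<>();
            for (Path file : listDataFiles(root)) {
                relative.add(root.relativize(file).toString().replace('\\', '/'));
            }
            return relative;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```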
2022-08-03 09:07:47 +08:00
db3ba02993 [fix](planner) Fix an issue where outputSmap's size could grow exponentially (#11378)
* [fix](planner) Fix an issue where outputSmap's size could grow exponentially
2022-08-03 09:07:00 +08:00
1e59c4054a [minor](log) add some log to observe the change of table's state. (#11448)
When upgrading to 1.x, some tables' state may change to ROLLUP,
which makes it impossible to create/drop/modify partitions.

I haven't found the root cause yet, so I added some logs to observe
the changes of the table's state.
2022-08-03 08:43:14 +08:00