2381 Commits

Author SHA1 Message Date
ebfbe0c8dd [opt](information_schema) support information_schema in external catalog (#28919)
Add `information_schema` database for all catalog.
This is useful when using BI tools to connect to Doris,
the tools can get meta info from `information_schema`.

This PR mainly changes:

1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, which defaults to false. If set to true,
    the `TABLE_SCHEMA` column's value looks like `ctl.db`, because:

	When connecting to Doris, the `database` info in the connection url will be: `xxx?db=ctl.db`.
	
	Some BI tools will then try to query `information_schema` with sql like:
	
	`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
	
	So the value has to be formatted as `ctl.db`.
	
	E.g., the `information_schema.columns` table in the external catalog `doris` looks like:
	
	```
	mysql> select * from information_schema.columns limit 1\G
	*************************** 1. row ***************************
	           TABLE_CATALOG: doris
	            TABLE_SCHEMA: doris.__internal_schema
	              TABLE_NAME: column_statistics
	             COLUMN_NAME: id
	        ORDINAL_POSITION: 1
	          COLUMN_DEFAULT: NULL
	             IS_NULLABLE: NO
	               DATA_TYPE: varchar
	CHARACTER_MAXIMUM_LENGTH: 4096
	  CHARACTER_OCTET_LENGTH: 16384
	       NUMERIC_PRECISION: NULL
	           NUMERIC_SCALE: NULL
	      DATETIME_PRECISION: NULL
	      CHARACTER_SET_NAME: NULL
	          COLLATION_NAME: NULL
	             COLUMN_TYPE: varchar(4096)
	              COLUMN_KEY:
	                   EXTRA:
	              PRIVILEGES:
	          COLUMN_COMMENT:
	             COLUMN_SIZE: 4096
	          DECIMAL_DIGITS: NULL
	   GENERATION_EXPRESSION: NULL
	                  SRS_ID: NULL
	```
	
5. Modify the behavior of

	- show tables
	- show databases
	- show columns
	- show table status

	The above statements may query the `information_schema` db if they are followed by a `where` predicate
2024-01-12 13:58:19 +08:00
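The effect of `show_full_dbname_in_info_schema_db` described in commit ebfbe0c8dd can be sketched in a few lines. This is only an illustration of the naming rule stated in the commit message, not the Doris implementation; the function name `table_schema_value` and its parameters are hypothetical.

```python
def table_schema_value(catalog: str, database: str, show_full_dbname: bool = False) -> str:
    """Return the value a TABLE_SCHEMA column would carry.

    `show_full_dbname` stands in for the global variable
    `show_full_dbname_in_info_schema_db` (default false in the commit message).
    This helper is hypothetical and only mirrors the rule described there.
    """
    if show_full_dbname:
        # BI tools that connect with `db=ctl.db` filter on the qualified name.
        return f"{catalog}.{database}"
    # Default behaviour: the plain database name.
    return database


print(table_schema_value("doris", "__internal_schema"))
print(table_schema_value("doris", "__internal_schema", show_full_dbname=True))
```

With the flag enabled, the second call yields `doris.__internal_schema`, which matches the `TABLE_SCHEMA` value in the sample output of that commit.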
f67a00ffbb [opt](nereids) prune runtime redundant filters (#29828)
1. expand_runtime_filter_by_inner_join creates some redundant rfs, e.g., in tpch q5 and q9; we need to remove one of them
2. hive: prune rf if target only used as probe
2024-01-12 13:58:19 +08:00
4d97f8ea75 [enhance](function) support two special format for str_to_date (#29823) 2024-01-12 12:00:32 +08:00
885d8b28ba [fix](Nerids): fix function deps when check unique and not null #29797 2024-01-12 11:59:52 +08:00
c9a949130b [Case](wal) Add wal group commit sink case with low disk space fault injection (#29731) 2024-01-12 11:59:52 +08:00
e93a16ac6e [fix](Nereids) support complex literal cast in fe (#29599) 2024-01-12 11:59:52 +08:00
17a2b89945 [runtimeFilter](nereids) expand runtime filter by join condition by default (#29633)
1. expand rf by join condition 
2. fix ignore_shape_nodes bug
2024-01-12 11:59:27 +08:00
e17809a684 [fix](nereids)logicalhaving is in wrong place after logicalagg and logicalwindow (#29463) 2024-01-12 11:48:39 +08:00
2c44951543 [fix](planner)only allow null safe equal when both children are nullable (#29470) 2024-01-12 11:46:29 +08:00
Pxl
7738eca6da [Bug](stream-load) fix stream load failed on table with rollup (#29665)
fix stream load failed on table with rollup
2024-01-12 11:46:29 +08:00
9cbb55d49b [fix](Nereids) create double literal when create decimal literal failed (#28959)
FIX
1. remove the float and double literal toString and getStringValue introduced by
  PR #23504 and PR #23271.
  These functions lead to wrong cast results for double and float literals
2. fix compute signature for datetimev2 always producing scale 6
3. fix stats calculator failure when generating node stats with two identical columns
4. fix constant folding on fe failing when casting double to integral

TODO
after the first problem is fixed, some mv matching does not work well; fix these cases later
- test_dup_mv_div
- test_dup_mv_json
- test_tcu
2024-01-12 11:46:29 +08:00
fda001b6d3 [Improvement](nereids) Support join derivation when mv rewrite (#29609)
materialized view def is as follows:
>            select l_linenumber, o_custkey
>            from orders
>            left join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY
>            where o_custkey = 1;

the query below can be rewritten by the mv above.
this requires that the query has null-rejecting filters on the right input of the join;
the currently supported filters are "=", "<", "<=", ">", ">=", "<=>"
>            select IFNULL(orders.O_CUSTKEY, 0) as custkey_not_null,
>            case when l_linenumber in (1,2,3) then l_linenumber else o_custkey end as case_when
>            from orders
>            inner join lineitem on orders.O_ORDERKEY = lineitem.L_ORDERKEY
>            where o_custkey = 1 and l_linenumber > 0;
2024-01-12 11:44:21 +08:00
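The reasoning behind the join derivation in commit fda001b6d3 is that a null-rejecting filter on the right input makes a left join return the same rows as an inner join. A minimal sketch on in-memory rows, assuming made-up sample data and hypothetical helper names (`left_join`, `inner_join`, `greater_than`); it does not reflect how Nereids implements the rewrite:

```python
# Sample rows; the values are invented for illustration only.
orders = [
    {"o_orderkey": 1, "o_custkey": 1},
    {"o_orderkey": 2, "o_custkey": 1},  # has no matching lineitem row
]
lineitem = [
    {"l_orderkey": 1, "l_linenumber": 3},
]


def left_join(left, right):
    """orders LEFT JOIN lineitem ON l_orderkey = o_orderkey."""
    out = []
    for o in left:
        matches = [l for l in right if l["l_orderkey"] == o["o_orderkey"]]
        if matches:
            out.extend({**o, **l} for l in matches)
        else:
            # Unmatched left rows are padded with NULLs (None) on the right side.
            out.append({**o, "l_orderkey": None, "l_linenumber": None})
    return out


def inner_join(left, right):
    """orders INNER JOIN lineitem ON l_orderkey = o_orderkey."""
    return [{**o, **l} for o in left for l in right
            if l["l_orderkey"] == o["o_orderkey"]]


def greater_than(value, bound):
    """SQL-style `value > bound`: a NULL input never satisfies the filter."""
    return value is not None and value > bound


# `l_linenumber > 0` rejects the NULL-padded rows, so both joins agree.
left_filtered = [r for r in left_join(orders, lineitem)
                 if greater_than(r["l_linenumber"], 0)]
inner_filtered = [r for r in inner_join(orders, lineitem)
                  if greater_than(r["l_linenumber"], 0)]
assert left_filtered == inner_filtered
```

Without the filter the left join keeps the padded row for order 2, which is why the rewrite needs a null-rejecting predicate on the right input.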
34fe5ee38b [feat](Nereids) support show constraint command (#29667)
show constraints from t1;
+------+-------------+-----------------------------------------+
| Name | Type        | Definition                              |
+------+-------------+-----------------------------------------+
| fk   | FOREIGN KEY | FOREIGN KEY (id) REFERENCES cir.t1 (id) |
| uk   | UNIQUE      | UNIQUE (id)                             |
| pk   | PRIMARY KEY | PRIMARY KEY (id)                        |
+------+-------------+-----------------------------------------+
2024-01-12 11:44:21 +08:00
be56bf06cf [feature](function) support ip function named is_ip_address_in_range(addr, cidr) (#29681) 2024-01-12 11:44:21 +08:00
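The intended meaning of `is_ip_address_in_range(addr, cidr)` can be sketched with Python's standard `ipaddress` module. This is an assumption about the semantics based only on the function name and argument list in the commit title; how Doris treats invalid input or mixed IPv4/IPv6 arguments is not stated there.

```python
import ipaddress


def is_ip_address_in_range(addr: str, cidr: str) -> bool:
    """Return True if `addr` falls inside the network described by `cidr`.

    Sketch only: mirrors the name of the Doris function, not its implementation.
    """
    # strict=False accepts CIDR strings whose host bits are set, e.g. "10.0.0.5/8".
    network = ipaddress.ip_network(cidr, strict=False)
    # Membership is False when the address and network versions differ.
    return ipaddress.ip_address(addr) in network


print(is_ip_address_in_range("192.168.1.7", "192.168.1.0/24"))
print(is_ip_address_in_range("192.168.1.7", "10.0.0.0/8"))
```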
028e59efab [refactor](Nereids): unify all replaceNamedExpressions (#28228)
Use a unified function `replaceNamedExpressions` instead of reimplementing it repeatedly.
2024-01-12 11:44:21 +08:00
d50c8b6d3a [Improvement](nereids) Query rewrite by mv support bitmap_union and bitmap_union_count roll up (#29418)
Query rewrite by mv supports bitmap_union and bitmap_union_count roll up. The aggregate functions that support roll up are listed below:

| Function in query  | Function in materialized view | Function after roll up |
|--------------------|-------------------------------|------------------------|
| max                | max                           | max                    |
| min                | min                           | min                    |
| sum                | sum                           | sum                    |
| count              | count                         | sum                    |
| count(distinct )   | bitmap_union                  | bitmap_union_count     |
| bitmap_union       | bitmap_union                  | bitmap_union           |
| bitmap_union_count | bitmap_union                  | bitmap_union_count     |

this depends on https://github.com/apache/doris/pull/29256
2024-01-12 11:44:21 +08:00
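The roll-up table in commit d50c8b6d3a can be read as a lookup from (function in query, function in mv) to the function applied over the mv rows. The sketch below encodes that table and checks two rows on made-up data, modelling a bitmap as a Python `set`; the names `ROLLUP` and `mv_rows` are hypothetical and the data is invented.

```python
# (function in query, function in mv) -> function applied when rolling up.
ROLLUP = {
    ("max", "max"): "max",
    ("min", "min"): "min",
    ("sum", "sum"): "sum",
    ("count", "count"): "sum",
    ("count(distinct)", "bitmap_union"): "bitmap_union_count",
    ("bitmap_union", "bitmap_union"): "bitmap_union",
    ("bitmap_union_count", "bitmap_union"): "bitmap_union_count",
}

# Pretend the mv groups by (day, region) and stores a count and a bitmap of users.
mv_rows = [
    {"day": "d1", "region": "a", "cnt": 2, "users": {1, 2}},
    {"day": "d1", "region": "b", "cnt": 3, "users": {2, 3}},
]

# A query grouping by day only must combine the finer-grained mv rows.
rolled_count = sum(r["cnt"] for r in mv_rows)               # count -> sum
rolled_users = set().union(*(r["users"] for r in mv_rows))  # bitmap_union
distinct_users = len(rolled_users)                          # bitmap_union_count

print(rolled_count, distinct_users)
```

Summing the stored counts gives the coarser count, while the distinct count needs the union of the bitmaps first, because user 2 appears in both mv rows.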
87023d3b7a [Fix](inverted index) fix memory leak in inverted index when encountering fault (#29676) 2024-01-12 11:44:21 +08:00
a2da434e3b [refactor](Nereids): refactor PredicatePropagation & support to infer Equal Condition (#29644) 2024-01-12 11:40:57 +08:00
3cd1c7745a [fix](jdbc catalog) Fix the precision of decimal type mapping to 0 (#29407) 2024-01-12 11:39:57 +08:00
eea657a610 [rf](nereids)prune rf for external db according to jump count (#29634)
* prune some rf for external db
2024-01-12 11:37:16 +08:00
97ed06a92c [regression-test](Variant) fix unstable case (#29648) 2024-01-12 11:36:45 +08:00
53f1521308 [feature](inverted index)Support failover when index compaction failed (#29553) 2024-01-12 11:33:15 +08:00
59d7f64360 [Fix](Nereids) fix pipelineX distribute expr list with child output expr ids (#29621) 2024-01-08 10:46:27 +08:00
17cf4ab2c1 [case](regression) streamload publish timeout (#29457)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2024-01-07 19:50:16 +08:00
1ea51e9f20 [Feature](group commit) Support table property "group commit data bytes" (#29484) 2024-01-07 19:46:42 +08:00
eb4c389b0b [feature](function) support ip functions isipv4string and isipv6string (#28556) 2024-01-07 13:03:11 +08:00
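The checks that `isipv4string` and `isipv6string` are named for can be approximated with Python's standard `ipaddress` module. This is a sketch of the likely semantics, assumed from the function names alone; the exact strings Doris accepts may differ, and the Python function names below are hypothetical.

```python
import ipaddress


def is_ipv4_string(value: str) -> bool:
    """True if `value` parses as an IPv4 address (sketch, not the Doris code)."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False


def is_ipv6_string(value: str) -> bool:
    """True if `value` parses as an IPv6 address (sketch, not the Doris code)."""
    try:
        ipaddress.IPv6Address(value)
        return True
    except ValueError:
        return False


print(is_ipv4_string("127.0.0.1"), is_ipv6_string("::1"))
```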
df43b671de [case](regression) Backup & restore with view (#29573) 2024-01-07 00:03:22 +08:00
734b258e15 [feature](create table) show create table print storage medium (#29080) 2024-01-06 22:40:51 +08:00
911635fac6 [feature](nereids) judge if the join is at bottom of join cluster (#29383) 2024-01-06 17:15:19 +08:00
5789b7e380 [fix](jin) add datetimev2 precision (#29528) 2024-01-06 13:35:26 +08:00
7a0734dbd6 [feature](Nereids): InferPredicates support In (#29458) 2024-01-05 21:25:30 +08:00
7402fee1fc [feature](function) support ip function ipv6_string_to_num(_or_default, _or_null), inet6_aton (#28361) 2024-01-05 19:24:45 +08:00
2b3e75bb27 [fix](Nereids) exists should not return null (#29435) 2024-01-05 18:13:21 +08:00
77fbbf63ed [test](Nereids): add more test for eliminate inner join by fk (#29390) 2024-01-05 16:21:24 +08:00
0eee560f94 [enhancement](Nereids): add test for push filter through operator (#27294) 2024-01-05 15:48:23 +08:00
64696829d1 [fix](Nereids) mark join should not eliminate join when child is empty (#29409) 2024-01-05 11:55:37 +08:00
c0f63915f7 [chore](test) make configuartion of parallel scan be fuzzy (#29356) 2024-01-05 11:09:43 +08:00
7a4ef90110 [Improve](regresstests)add test cases for array functions (#28492) 2024-01-04 20:39:35 +08:00
43b19fd99e [docs](timezone) refactor docs of timezone 2024-01-04 20:20:40 +08:00
abd9000368 [Feat](Nereids) add distribute hint to leading hint (#28562)
add distribute hint to leading hint. After this commit, a leading hint can be written like:
/*+ leading(t1 broadcast{t2 t3}) */
2024-01-04 17:51:06 +08:00
Pxl
441fb49345 [Bug](load) fix load failed on stream load tvf into agg state (#28420)
fix load failed on stream load tvf into agg state
2024-01-04 17:38:31 +08:00
Pxl
d8a08dad90 [Bug](mark-join) fix wrong result on mark join + other conjunct (#29321)
fix wrong result on mark join + other conjunct
2024-01-04 11:58:39 +08:00
0d0b9d64dd [improve](move-memtable) add move memtable too many segments fault injection (#29342) 2024-01-03 21:27:54 +08:00
afaefa3a9e [regression](decimalv2) add schema change test case for decimalv2 (#29474) 2024-01-03 21:02:10 +08:00
49a3bab399 [fix](nereids) fix aggregate function roll up when expression arguments is not equals (#29256)
when rolling up an aggregate function, we should check that the query and mv function arguments are equal.
For example, with the mv def and query sql below, the rewrite should not succeed, because the argument of the bitmap_union_basic field is
not equal to that of the `count(distinct case when o_shippriority > 10 and o_orderkey IN (1, 3) then o_custkey else null end)` field in the query

mv def:
>            select l_shipdate, o_orderdate, l_partkey, l_suppkey,
>            sum(o_totalprice) as sum_total,
>            max(o_totalprice) as max_total,
>            min(o_totalprice) as min_total,
>            count(*) as count_all,
>            bitmap_union(to_bitmap(case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)) as bitmap_union_basic
>            from lineitem
>            left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
>            group by
>            l_shipdate,
>            o_orderdate,
>            l_partkey,
>            l_suppkey;

query sql:

>            select t1.l_partkey, t1.l_suppkey, o_orderdate,
>            sum(o_totalprice),
>            max(o_totalprice),
>            min(o_totalprice),
>            count(*),
>            count(distinct case when o_shippriority > 10 and o_orderkey IN (1, 3) then o_custkey else null end)
>            from (select * from lineitem where l_shipdate = '2023-12-11') t1
>            left join orders on t1.l_orderkey = orders.o_orderkey and t1.l_shipdate = o_orderdate
>            group by
>            o_orderdate,
>            l_partkey,
>            l_suppkey;
2024-01-03 18:58:18 +08:00
d19530c4c2 [Fix](Nereids) fix leading hint dealing with big brace (#29405)
Co-authored-by: libinfeng <libinfeng@selectdb.com>
2024-01-03 18:13:38 +08:00
2a9b4a0f76 [enhancement](paimon)support predict for null and notnull (#29134) 2024-01-03 12:53:39 +08:00
f54f79515c [Bug](fix) str_to_date "" should be null (#29402) 2024-01-03 08:25:22 +08:00
1edf5b31b6 [Regression-test](wal) Add fault injection case for wal mem back pressure (#29298) 2024-01-03 00:06:52 +08:00
5db496d844 [Improve](Variant) make output stable (#29389) 2024-01-02 20:29:17 +08:00