Commit Graph

2204 Commits

Author SHA1 Message Date
5734e2bd30 [opt](meta-cache) refine the meta cache (#33449) (#33754)
backport of #33449
2024-04-17 23:42:13 +08:00
2890f6c3cf [opt](Nereids) date literal support basic format with timezone (#33662) 2024-04-17 23:42:13 +08:00
ca728a2405 [feature](proc)Add table's indexes info in show proc interface (#33438)
1. Add show proc `/dbs/db_id/table_id/indexes` impl
2. Remove index_id in `show index from table`
3. Add test cases

---------

Co-authored-by: Luennng <luennng@gmail.com>
2024-04-17 23:42:13 +08:00
8e38549a92 [fix](nereids) Use correct PREAGGREGATION in agg(filter(scan)) (#33454)
1. set `PreAggStatus` to `ON` when aggregating a key column by max or min;
2. #28747 may change the scan's `PreAggStatus`; inherit it from the previous one.
2024-04-17 23:42:13 +08:00
d18f5e2544 [refactor](refresh-catalog) refactor the refresh catalog code (#33653)
To unify the code.
Previously, catalog refresh was done in `CatalogMgr`, while database
and table refresh was done in `RefreshMgr`, which was confusing.

This PR moves all `refresh`-related code from CatalogMgr to RefreshMgr.

No logic is changed in this PR.
2024-04-17 23:42:12 +08:00
466b9f35d5 [fix](nereids)EliminateGroupBy should keep the output's datatype same as old ones (#33585) 2024-04-17 23:42:12 +08:00
7659b1aa67 [opt](Nereids) prefer slot type to support delete task better (#33559) 2024-04-17 23:42:12 +08:00
e53a76d75b [fix](planner) fix bug of InlineViewRef's tableNameToSql method (#33575) 2024-04-17 23:42:12 +08:00
b2face0d20 [feature](Nereids): date literal support time zone (#33534)
Support literals such as:
```
'2022-05-01 01:02:55+02:30'
'2022-05-01 01:02:55Asia/Shanghai'
```
2024-04-17 23:42:12 +08:00
2cd4012541 [opt](scan) read scan ranges in the order of partitions (#33515) (#33657)
backport: #33515
2024-04-17 23:42:12 +08:00
1be753ed75 [enhancement](mysql compatible) add user and procs_priv tables to mysql db in all catalogs (#33058)
Issue Number: close #xxx

This PR aims to enhance the compatibility of BI tools (such as DBeaver and DataGrip) that use the MySQL connector to connect to Doris, because some BI tools query tables in the mysql database. In our tests, the user and procs_priv tables were the ones mainly queried. This PR adds these two tables and fills the user table with actual data. However, note that most of the fields in the user table are in Doris' own format rather than MySQL's format, so this only ensures that no error is reported when a BI tool queries these tables; it does not guarantee that the data is displayed completely, and the tables under Doris' mysql database do not support data modification.
Thanks to @liujiwen-up for assisting in testing
2024-04-17 23:42:12 +08:00
b2b385a4ff [improve](fold) support complex type for constant folding (#32867) 2024-04-17 23:41:59 +08:00
92d28e497b [refactor](Nereids): compute unique and uniform property respectively (#32908) 2024-04-17 23:41:59 +08:00
e26a53d8a6 [fix](nereids) SemiJoinSemiJoinTransposeProject rule didn't handle mark join correctly (#33401) 2024-04-12 15:09:25 +08:00
5f30463bb3 [Chore](descriptors) remove unused codes for descriptors (#33408)
remove unused code for descriptors
2024-04-12 15:09:25 +08:00
ef64d7a011 [feature](profile) add transaction statistics for profile (#33488)
1. commit total time
2. fs operator total time
   - rename file count
   - rename dir count
   - delete dir count
3. add partition total time
   - add partition count
4. update partition total time
   - update partition count
Example output:
```
      -  Transaction  Commit  Time:  906ms
          -  FileSystem  Operator  Time:  833ms
              -  Rename  File  Count:  4
              -  Rename  Dir  Count:  0
              -  Delete  Dir  Count:  0
          -  HMS  Add  Partition  Time:  0ms
              -  HMS  Add  Partition  Count:  0
          -  HMS  Update  Partition  Time:  68ms
              -  HMS  Update  Partition  Count:  4
```
2024-04-12 15:06:16 +08:00
d31bca199f [feature](iceberg)The new DDL syntax is added to create iceberg partitioned tables (#33338)
support partition by:

```
create table tb1 (c1 string, ts datetime) engine = iceberg partition by (c1, day(ts)) () properties ("a"="b")
```
2024-04-12 10:45:16 +08:00
18fb8407ae [feature](insert)use optional location and add hive regression test (#33153) 2024-04-12 10:38:54 +08:00
31a7060dbd [testcase](hive)add exception test for hive txn (#33278)
Issue #31442
#32726

1. add LocalDfsFileSystem to manipulate local files.
2. add HMSCachedClientTest to simulate HMS services.
3. add test for rollback commit.
2024-04-12 10:38:48 +08:00
e11db3f050 [feature](hive)support ExternalTransaction for writing external tables (#32726)
Issue #31442

Add `TransactionManager` and `Transaction`. 

```
public interface Transaction {
    void commit() throws UserException;
    void rollback();
}
public interface TransactionManager {
    long begin();
    void commit(long id) throws UserException;
    void rollback(long id);
    Transaction getTransaction(long id);
}
```
`TransactionManager` is used to manage all external transactions:
The application layer should drive the entire transaction through this `TransactionManager`, for example:
```
long id = transactionManager.begin();
transactionManager.commit(id);   // or transactionManager.rollback(id) on failure
```

`Transaction` is an interface; it can be implemented for a specific engine, such as the `HMSTransaction` implemented here, or an Iceberg transaction that may be implemented in the future.
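To make the flow concrete, below is a minimal, self-contained sketch (not Doris code; `SimpleTransactionManager`, `TxnDemo`, and the in-memory bookkeeping are illustrative) of how an application layer might drive such a manager with begin/commit/rollback:
```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-ins; in Doris these would be the real Transaction/TransactionManager
// interfaces above and the FE's UserException.
class UserException extends Exception {
    UserException(String msg) { super(msg); }
}

interface Transaction {
    void commit() throws UserException;
    void rollback();
}

// A toy in-memory manager: begin() hands out an id, commit()/rollback() look the
// transaction up by id and delegate to it, mirroring the interface in the commit message.
class SimpleTransactionManager {
    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<Long, Transaction> active = new HashMap<>();

    long begin() {
        long id = nextId.getAndIncrement();
        active.put(id, new Transaction() {
            @Override public void commit() throws UserException {
                System.out.println("commit txn " + id);   // e.g. move staged files, update HMS
            }
            @Override public void rollback() {
                System.out.println("rollback txn " + id); // e.g. delete staged files
            }
        });
        return id;
    }

    void commit(long id) throws UserException {
        Transaction txn = active.remove(id);
        if (txn == null) {
            throw new UserException("unknown transaction id " + id);
        }
        txn.commit();
    }

    void rollback(long id) {
        Transaction txn = active.remove(id);
        if (txn != null) {
            txn.rollback();
        }
    }
}

public class TxnDemo {
    public static void main(String[] args) {
        SimpleTransactionManager manager = new SimpleTransactionManager();
        long id = manager.begin();
        try {
            // ... write data files for the external table here, then:
            manager.commit(id);
        } catch (UserException e) {
            manager.rollback(id);   // undo any partially written data
        }
    }
}
```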
2024-04-12 10:38:12 +08:00
f0ac21e231 [feature](external) process tbl/db exist when create/drop db/tbl (#33119)
Issue Number: #31442
2024-04-12 10:36:43 +08:00
7a05396cd1 [feature](multi-catalog)support catalog name when create/drop db (#33116)
Issue Number: #31442
2024-04-12 10:36:18 +08:00
01b21da82d [feature](insert)add hive insert plan ut and remove redundant fields (#33051)
add hive insert sink plan UT case
remove some deprecated code
2024-04-12 10:30:08 +08:00
07f296734a [regression](insert)add hive DDL and CTAS regression case (#32924)
Issue Number: #31442

dependent on #32824

- add DDL (create and drop) test
- add CTAS test
- add complex type test

TODO:
- bucketed table test
- truncate test
- add/drop partition test
2024-04-12 10:24:23 +08:00
716c146750 [fix](insert)fix hive external return msgs and exception and pass all columns to BE (#32824)
2024-04-12 10:23:52 +08:00
3343322965 [fix](insert)fix conversion of doris type to hive type (#32735)
#31442

When creating a table, fix the Doris-to-Hive type conversion: use primitiveType to check the Doris type.
2024-04-12 10:01:30 +08:00
c68b353017 [feature][insert]add FE UT and support CTAS for external table (#32525)
1. add FE ut for create hive table
2. support external CTAS:

> source table:
```
mysql> show create table hive.jz3.test;

CREATE TABLE `test`(
  `id` int COMMENT 'col1',
  `name` string COMMENT 'col2')
PARTITIONED BY (
 `dt` string,
 `dtm` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://HDFS8000871/usr/hive/warehouse/jz3.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1710837792',
  'file_format'='orc')
```


> create unpartitioned target table
```
mysql> create table hive.jz3.ctas engine=hive as select * from hive.jz3.test;
mysql> show create table ctas;

CREATE TABLE `ctas`(
  `id` int COMMENT '',
  `name` string COMMENT '',
  `dt` string COMMENT '',
  `dtm` string COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://HDFS8000871/usr/hive/warehouse/jz3.db/ctas'
TBLPROPERTIES (
  'transient_lastDdlTime'='1710860377')

```


> create partitioned target table
```
mysql> create table hive.jz3.ctas1 engine=hive partition by list (dt,dtm) () as select * from hive.jz3.test;
mysql> show create table hive.jz3.ctas1;

CREATE TABLE `ctas1`(
  `id` int COMMENT '',
  `name` string COMMENT '')
PARTITIONED BY (
 `dt` string,
 `dtm` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://HDFS8000871/usr/hive/warehouse/jz3.db/ctas1'
TBLPROPERTIES (
  'transient_lastDdlTime'='1710919070')
```
2024-04-12 09:58:49 +08:00
36a1bf1d73 [feature][insert]Adapt the create table statement to the Nereids SQL (#32458)
issue: #31442

1. adapt the create table statement from Doris to Hive
2. fix insert overwrite for table sink

> The doris create hive table statement:

```
mysql> CREATE TABLE buck2(
    ->     id int COMMENT 'col1',
    ->     name string COMMENT 'col2',
    ->     dt string COMMENT 'part1',
    ->     dtm string COMMENT 'part2'
    -> ) ENGINE=hive
    -> COMMENT "create tbl"
    -> PARTITION BY LIST (dt, dtm) ()
    -> DISTRIBUTED BY HASH (id) BUCKETS 16
    -> PROPERTIES(
    ->     "file_format" = "orc"
    -> );
```

> generated Hive create table statement:

```
CREATE TABLE `buck2`(
  `id` int COMMENT 'col1',
  `name` string COMMENT 'col2')
PARTITIONED BY (
 `dt` string,
 `dtm` string)
CLUSTERED BY (
  id)
INTO 16 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://HDFS8000871/usr/hive/warehouse/jz3.db/buck2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1710840747',
  'doris.file_format'='orc')

```
2024-04-12 09:57:37 +08:00
dc8da9ee89 [Fix](nereids) fix qualifier problem that affects delete stmt in another catalog (#33528)
* [Fix](nereids) fix qualifier problem that affects delete stmt in another catalog

---------

Co-authored-by: feiniaofeiafei <moailing@selectdb.com>
2024-04-11 21:43:01 +08:00
3d66723214 [branch-2.1](auto-partition) pick auto partition and some more prs (#33523) 2024-04-11 17:12:17 +08:00
045dd05f2a [fix](Nereids): don't transpose agg and join if join is mark join (#33312) 2024-04-10 16:23:20 +08:00
16f8afc408 [refactor](coordinator) split profile logic and instance report logic (#32010)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-04-10 15:51:32 +08:00
5e59c09a60 [Fix](nereids) modify the binding aggregate function in order by (#32758)
Modify the binding logic so that ORDER BY has the same behavior as MySQL when the sort child is an aggregate.
When an ORDER BY expression contains an aggregate function, all slots in that expression should first be bound to the LogicalAggregate's non-AggFunction outputs, and only then to the LogicalAggregate's child.
e.g.
select 2*abs(sum(c1)) as c1, c1, sum(c1)+c1 from t_order_by_bind_priority group by c1 order by sum(c1)+c1 asc;
In this SQL, both occurrences of c1 in the ORDER BY are bound to the c1 of t_order_by_bind_priority.
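As a rough illustration of that binding priority (purely hypothetical names, not the Nereids binder), a slot referenced in such an ORDER BY expression could be resolved like this:
```java
import java.util.List;
import java.util.Optional;

final class OrderByBindingSketch {
    // Minimal stand-in for a slot produced by a plan node.
    static final class Slot {
        final String name;
        Slot(String name) { this.name = name; }
    }

    // Resolve a slot name referenced inside an ORDER BY expression that contains an
    // aggregate function: try the LogicalAggregate's non-aggregate-function outputs
    // (e.g. group-by keys) first, and only fall back to the aggregate's child outputs.
    static Optional<Slot> bindOrderBySlot(String name,
                                          List<Slot> aggNonAggFuncOutputs,
                                          List<Slot> aggChildOutputs) {
        Optional<Slot> bound = aggNonAggFuncOutputs.stream()
                .filter(s -> s.name.equals(name))
                .findFirst();
        if (bound.isPresent()) {
            return bound;
        }
        return aggChildOutputs.stream()
                .filter(s -> s.name.equals(name))
                .findFirst();
    }
}
```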
2024-04-10 15:26:09 +08:00
38d580dfb7 [fix](Nereids) fix link children failed (#33134)
#32617 introduced a bug: the rewrite may not work when a plan's arity >= 3.
This PR fixes it.

(cherry picked from commit 8b070d1a9d43aa7d25225a79da81573c384ee825)
2024-04-10 14:59:45 +08:00
ff990eb869 [enhancement](Nereids) refactor expression rewriter to pattern match (#32617)
This PR improves the performance of the Nereids planner in the plan stage.

1. refactor the expression rewriter to pattern match, so that the many expression rewrite rules can be applied criss-crossed in one big bottom-up iteration, rewriting until the expression becomes stable (a sketch of this fixed-point loop follows the list below). We can now handle more cases, because originally there was no loop and sometimes only the top expression was processed, as in `SimplifyArithmeticRule`.
2. replace `Collection.stream()` with `ImmutableXxx.Builder` to avoid useless method calls
3. unroll loops in some code, like `Expression.<init>`, `PlanTreeRewriteBottomUpJob.pushChildrenJobs`
4. use type/arity-specific code, like `OneRangePartitionEvaluator.toNereidsLiterals()`, `PartitionRangeExpander.tryExpandRange()`, `PartitionRangeExpander.enumerableCount()`
5. refactor `ExtractCommonFactorRule`: we can now extract more cases, and fix the dead loop when `ExtractCommonFactorRule` and `SimplifyRange` are used in one iteration, because `SimplifyRange` generates a right-deep tree but `ExtractCommonFactorRule` generates a left-deep tree
6. refactor `FoldConstantRuleOnFE` to support both visitor and pattern-match modes; in ExpressionNormalization, pattern match can be applied criss-crossed with other rules; in PartitionPruner, the visitor can evaluate expressions faster
7. lazily compute and cache some operations
8. use an int field to compare dates
9. use a BitSet to find disableNereidsRules
10. a two-level loop is usually faster than building a Multimap when binding slots in a Scope, so revert that code
11. `PlanTreeRewriteBottomUpJob` doesn't need to clearStatePhase any more
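A minimal, self-contained sketch of that fixed-point loop (illustrative only; `Expr`, `RewriteRule`, and `Rewriter` below are simplified stand-ins, not the Nereids classes):
```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Simplified expression tree and rule types, standing in for the Nereids ones.
// Structural equals() on Expr implementations is assumed.
interface Expr {
    List<Expr> children();
    Expr withChildren(List<Expr> newChildren);
}

interface RewriteRule extends UnaryOperator<Expr> {}  // returns the input unchanged if it doesn't match

final class Rewriter {
    // Apply all rules once, bottom-up: children first, then the current node.
    static Expr rewriteBottomUp(Expr expr, List<RewriteRule> rules) {
        List<Expr> newChildren = expr.children().stream()
                .map(c -> rewriteBottomUp(c, rules))
                .collect(Collectors.toList());
        Expr current = newChildren.equals(expr.children()) ? expr : expr.withChildren(newChildren);
        for (RewriteRule rule : rules) {
            current = rule.apply(current);
        }
        return current;
    }

    // Repeat the bottom-up pass until the expression stops changing (fixed point),
    // so rules can enable each other across iterations.
    static Expr rewriteUntilStable(Expr expr, List<RewriteRule> rules, int maxIterations) {
        for (int i = 0; i < maxIterations; i++) {
            Expr next = rewriteBottomUp(expr, rules);
            if (next.equals(expr)) {
                return next;
            }
            expr = next;
        }
        return expr;  // safety valve against rules that ping-pong forever
    }
}
```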

### test case
100 threads continuously send the following SQL in parallel against an empty table; tested on my Mac (M2 chip, 8 cores) with the SQL cache enabled:
```sql
select  count(1),date_format(time_col,'%Y%m%d'),varchar_col1
from tbl
where  partition_date>'2024-02-15'  and (varchar_col2 ='73130' or varchar_col3='73130') and time_col>'2024-03-04'
  and  time_col<'2024-03-05'
group by date_format(time_col,'%Y%m%d'),varchar_col1
order by date_format(time_col,'%Y%m%d') desc, varchar_col1 desc,count(1) asc
limit 1000
```

before this pr: 3100 peak QPS, about 2700 avg QPS
after this pr: 4800 peak QPS, about 4400 avg QPS

(cherry picked from commit 7338683fdbdf77711f2ce61e580c19f4ea100723)
2024-04-10 14:59:45 +08:00
a7c8abe58c [feature](nereids) support common sub expression by multi-layer projections (fe part) (#33087)
* cse fe part
2024-04-10 14:53:56 +08:00
dcddd88e01 Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470) 2024-04-10 11:34:29 +08:00
0499d4013e Support identical column name in different index. (#32792) 2024-04-10 11:34:29 +08:00
407f8642da [Enhancement](data skew) extends show data skew (#32732) 2024-04-10 11:34:29 +08:00
e980cd3e7f [feature](Nereids): add ColumnPruningPostProcessor. (#32800) 2024-04-10 11:34:29 +08:00
26e86d53a4 [enhance](mtmv)support olap table partition column is null (#32698) 2024-04-10 11:34:29 +08:00
22a7fc3c55 [improvement](mtmv) Support to get tables in materialized view when collecting table in plan (#32797)
Support getting the tables inside a materialized view when collecting tables in a plan

Table schema is as follows:

create materialized view mv1
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 1 
PROPERTIES ('replication_num' = '1')
 as 
select 
  t1.c1, 
  t3.c2 
from 
  table1 t1 
  inner join table3 t3 on t1.c1 = t3.c2

If we collect tables from the plan below, we get [table1, table3, table2]; mv1 is expanded to its base tables:

SELECT 
  mv1.*, 
  uuid() 
FROM 
  mv1 LEFT SEMI 
  JOIN table2 ON mv1.c1 = table2.c1 
WHERE 
  mv1.c1 IN (
    SELECT 
      c1 
    FROM 
      table2
  ) 
  OR mv1.c1 < 10
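A rough sketch of that expansion idea (purely illustrative; `Table`, `MaterializedView`, and `collectTables` below are hypothetical stand-ins, not the Doris collector):
```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: a plan references tables; a materialized view additionally
// carries the tables of its defining query.
interface Table {
    String name();
}

interface MaterializedView extends Table {
    List<Table> baseTables();   // tables referenced by the MV's defining query
}

final class TableCollectorSketch {
    // Collect every table referenced by the given tables, expanding materialized
    // views into their base tables (matching the [table1, table3, table2] result
    // in the example above, where mv1 itself is replaced by its base tables).
    static Set<Table> collectTables(List<Table> referenced) {
        Set<Table> result = new LinkedHashSet<>();
        for (Table t : referenced) {
            if (t instanceof MaterializedView) {
                result.addAll(collectTables(((MaterializedView) t).baseTables()));
            } else {
                result.add(t);
            }
        }
        return result;
    }
}
```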
2024-04-10 11:34:29 +08:00
dcfdbf0629 [chore](show) support statement to show views from table (#32358)
MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)
2024-04-10 11:34:28 +08:00
217514e5dd [minor](test) Add Iceberg hadoop catalog FE unit test (#32449)
For easy testing the behavior of Iceberg's HadoopCatalog.listNamespaces()
2024-04-10 11:34:28 +08:00
e574b35833 [Enhancement](partition) Refine some auto partition behaviours (#32737) (#33412)
1. fix legacy planner grammar
2. fix nereids planner parsing
3. fix cases
4. forbid auto range partition with null column
5. fix CreateTableStmt with auto partition and some partition items
1 and 2 are about #31585
doc pr: apache/doris-website#488
2024-04-09 15:51:02 +08:00
fae55e0e46 [Feature](information_schema) add processlist table for information_schema db (#32511) 2024-04-07 23:24:22 +08:00
b882704eaf [fix](Export) Set the default value of the data_consistence property of export to partition (#32830) 2024-04-07 23:24:22 +08:00
d9d950d98e [fix](iceberg) fix iceberg predicate conversion bug (#33283)
Follow-up of #32923

Some cases were not covered in #32923
2024-04-07 22:12:38 +08:00
190763e301 [bugfix](iceberg)Convert the datetime type in the predicate according to the target column (#32923)
Convert the datetime type in the predicate according to the target column.
And add a testcase for #32194
related #30478 #30162
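As a hedged illustration of the idea (not the Doris/Iceberg conversion code; `TargetType` and `toTargetValue` are made up), the predicate's datetime literal is converted into whatever representation the target column's type expects, e.g. epoch days for a date column and epoch microseconds for a timestamp column:
```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

final class DatetimePredicateValueSketch {
    // Hypothetical target column types for the sketch.
    enum TargetType { DATE, TIMESTAMP }

    // Convert a datetime literal from the predicate into the value a predicate on the
    // target column needs: epoch days for DATE, epoch microseconds for TIMESTAMP.
    // Time-zone handling is simplified here (UTC assumed).
    static long toTargetValue(LocalDateTime literal, TargetType target) {
        switch (target) {
            case DATE:
                return literal.toLocalDate().toEpochDay();
            case TIMESTAMP:
                return ChronoUnit.MICROS.between(Instant.EPOCH, literal.toInstant(ZoneOffset.UTC));
            default:
                throw new IllegalArgumentException("unsupported target type: " + target);
        }
    }

    public static void main(String[] args) {
        LocalDateTime literal = LocalDateTime.of(2024, 3, 4, 0, 0, 0);
        System.out.println(toTargetValue(literal, TargetType.DATE));       // days since 1970-01-01
        System.out.println(toTargetValue(literal, TargetType.TIMESTAMP));  // microseconds since epoch
    }
}
```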
2024-04-07 22:12:33 +08:00
32d6a4fdd5 [opt](rowcount) refresh external table's rowcount async (#32997)
In the previous implementation, the row count cache expired after 10 minutes (by default),
and after expiration, the next row count request would miss the cache, causing an unstable query plan.

In this PR, the cache is refreshed after Config.external_cache_expire_time_minutes_after_access,
so that the cache entry remains fresh.
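A minimal sketch of the difference, assuming a Caffeine-style loading cache (the `loadRowCountFromRemote` placeholder below is made up): with plain expiration, a request after expiry blocks on a cold miss, whereas refresh-after-write reloads the entry in the background on a stale access and keeps serving the previous value meanwhile.
```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.util.concurrent.TimeUnit;

final class RowCountCacheSketch {
    // Made-up placeholder for the real row-count fetch from the external catalog.
    private static long loadRowCountFromRemote(String tableKey) {
        return 42L;
    }

    // Before: entries simply expire, so the next request after expiry misses the cache
    // and the planner sees an unknown row count until the reload finishes.
    static LoadingCache<String, Long> expiringCache(long minutes) {
        return Caffeine.newBuilder()
                .expireAfterAccess(minutes, TimeUnit.MINUTES)
                .build(RowCountCacheSketch::loadRowCountFromRemote);
    }

    // After: a read past the refresh interval triggers an asynchronous reload and
    // still returns the cached (slightly stale) value, so the entry stays warm.
    static LoadingCache<String, Long> refreshingCache(long minutes) {
        return Caffeine.newBuilder()
                .refreshAfterWrite(minutes, TimeUnit.MINUTES)
                .build(RowCountCacheSketch::loadRowCountFromRemote);
    }
}
```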
2024-04-07 22:11:14 +08:00