* [Improve](dynamic schema) Support filtering invalid data
1. Support dynamic schema to filter out illegal data.
2. Expand the regular expression for ColumnName to support more column names.
3. Stay compatible with PropertyAnalyzer and support legacy tables.
4. Disable parsing of multi-dimensional arrays by default, since some bugs remain unresolved.
1. Fix an OOM in the MaxCompute (MC) JNI scanner.
2. Add a second datetime type for the MC SDK timestamp.
3. Also make S3 URIs case-insensitive along the way.
4. Optimize the MaxCompute scanner's parallel model.
The default value of `RefreshCatalogStmt.invalidCache` is now false, but `RefreshManager.RefreshTask` does not invoke `RefreshCatalogStmt.analyze()`, so the cache is never invalidated. This PR mainly fixes that problem.
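A minimal sketch of the fix idea, using hypothetical stand-ins for the `RefreshCatalogStmt` and `RefreshManager.RefreshTask` classes named above (not the actual Doris code): the task reads the flag directly instead of relying on a value that only `analyze()` would have set.

```
// Hypothetical sketch, not the actual Doris classes.
interface CatalogMgr {
    void refreshCatalog(String catalogName, boolean invalidCache);
}

final class RefreshCatalogStmtSketch {
    private final String catalogName;
    private final boolean invalidCache; // defaults to false, matching the bug report

    RefreshCatalogStmtSketch(String catalogName, boolean invalidCache) {
        this.catalogName = catalogName;
        this.invalidCache = invalidCache;
    }

    String getCatalogName() { return catalogName; }
    boolean isInvalidCache() { return invalidCache; }
}

final class RefreshTaskSketch {
    // The task must decide about cache invalidation itself, because
    // analyze() is never invoked on this code path.
    void run(CatalogMgr mgr, RefreshCatalogStmtSketch stmt) {
        mgr.refreshCatalog(stmt.getCatalogName(), stmt.isInvalidCache());
    }
}
```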
In the previous implementation, the check on the group-by expressions was skipped. This PR adds the necessary check to make sure it works. You can reproduce the issue by running the SQL below:
```
CREATE TABLE t_push_filter_through_agg (col1 varchar(11451) not null, col2 int not null, col3 int not null)
UNIQUE KEY(col1)
DISTRIBUTED BY HASH(col1)
BUCKETS 3
PROPERTIES(
    "replication_num"="1"
);

CREATE VIEW `view_i` AS
SELECT
    `b`.`col1` AS `col1`,
    `b`.`col2` AS `col2`
FROM
    (
        SELECT
            `col1` AS `col1`,
            sum(`cost`) AS `col2`
        FROM
            (
                SELECT
                    `col1` AS `col1`,
                    sum(CAST(`col3` AS INT)) AS `cost`
                FROM
                    `t_push_filter_through_agg`
                GROUP BY
                    `col1`
            ) a
        GROUP BY
            `col1`
    ) b;

SELECT SUM(`col2`) FROM view_i WHERE `col1` BETWEEN '2023-06-12' AND '2023-06-18' LIMIT 1;
```
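A hedged sketch of the kind of check that was missing: a filter may only be pushed below an aggregate if every slot it references is a group-by key. The names here are illustrative, not the actual optimizer rule:

```
import java.util.List;
import java.util.Set;

// Illustrative sketch of the missing guard in a push-filter-through-agg rule.
final class PushFilterThroughAggSketch {
    /** A predicate may only be pushed below the aggregate if every column
     *  it references is a group-by key; otherwise it must stay above. */
    static boolean canPushDown(Set<String> predicateSlots, List<String> groupByExprs) {
        return groupByExprs.containsAll(predicateSlots);
    }
}
```

In the repro above the predicate references only the group-by key `col1`, so pushing it down is legal, but the rule must verify this rather than skip the check.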
When a client is connected to a follower node, SQL may be forwarded to the master to execute; in that case the result should be set to `StmtExecutor#proxyResultSet`.
Before this PR, submitting an analyze SQL in this scenario via the MySQL client/JDBC would return a "malformed packet" / "communication failed" error.
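A minimal sketch of the intended behavior, with hypothetical types around the `StmtExecutor#proxyResultSet` field named above (not the actual Doris code):

```
// Hypothetical sketch: when a statement is forwarded from a follower to the
// master, the master's result must be stored in proxyResultSet so the
// follower can relay a well-formed packet to the MySQL client.
final class StmtExecutorSketch {
    interface ResultSet {}
    interface MasterOpExecutor { ResultSet getResult(); }

    private ResultSet proxyResultSet;

    void forwardToMaster(MasterOpExecutor masterExecutor) {
        // Without this assignment the follower answers with an empty or
        // truncated packet, which clients report as "malformed packet".
        this.proxyResultSet = masterExecutor.getResult();
    }

    ResultSet getProxyResultSet() { return proxyResultSet; }
}
```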
Set `enable_function_pushdown` to false by default, and enable it in fuzzy mode to keep testing the feature.
We should remove function pushdown in the future, since we already have common expression pushdown.
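For reference, toggling the session variable from JDBC might look like the sketch below; the host, port, and credentials are placeholders, and the variable name `enable_function_pushdown` should be checked against your version's docs:

```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch: re-enable function pushdown for one session.
public class EnableFunctionPushdown {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://127.0.0.1:9030/demo", "root", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("SET enable_function_pushdown = true");
        }
    }
}
```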
Co-authored-by: yiguolei <yiguolei@gmail.com>
In this PR, I remove the `makeSureInitialized()` call in the `createTable()` method, because it is wrong and useless.
I also rename the method to make its purpose clearer.
1. The class 'ExternalDatabase' already implements the 'GsonPostProcessable' interface, so some of its subclasses contain redundant code.
2. An unused LOG object in this file is removed.
Since the polling interval was 0, the CPU was polled continuously whenever there was no data.
A before-and-after comparison test shows CPU usage time reduced by a factor of about 2000.
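A generic sketch of the fix pattern, assuming a hypothetical consumer loop (not the actual Doris code): block with a non-zero timeout instead of spinning with a 0ms poll interval.

```
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the fix: a 0ms poll interval busy-spins and burns
// CPU when there is no data; a blocking poll with a timeout parks the thread.
final class PollLoopSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    void runLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            // Before: queue.poll() in a tight loop (interval 0) -> ~100% CPU.
            // After: wait up to 100ms for work, consuming no CPU while idle.
            Runnable task = queue.poll(100, TimeUnit.MILLISECONDS);
            if (task != null) {
                task.run();
            }
        }
    }
}
```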
1. The MySQL Go driver has logic that terminates when it reads an EOF (end-of-file) packet and expects no further data in the buffer. However, the frontend (FE) mistakenly returned an additional OK packet, which caused an exception to be thrown when reading the buffer.
2. Refactor some logic to support placeholders in the full prepared statement, not just in the WHERE clause, e.g.:
```
select ?, ? from tbl
```
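For example, a JDBC client can exercise this with server-side prepared statements; the connection properties below are standard MySQL Connector/J flags, and the host, port, database, and table are placeholders:

```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Placeholders outside the WHERE clause, sent as a server-side prepared
// statement (useServerPrepStmts=true forces the binary protocol).
public class FullPreparedExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://127.0.0.1:9030/demo"
                + "?useServerPrepStmts=true&cachePrepStmts=true";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             PreparedStatement ps = conn.prepareStatement("select ?, ? from tbl")) {
            ps.setInt(1, 1);
            ps.setString(2, "hello");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + ", " + rs.getString(2));
                }
            }
        }
    }
}
```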
The FE fold-constant rule turns an `array()` function expression with constant literals into an array literal and does not pass the expression to the BE, so we must make the FE's array string output format identical to the BE's.
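A tiny sketch of the consistency requirement; the `[a, b, c]` rendering below is an assumption about the BE's output shape, not a confirmed format:

```
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: the FE's constant-folded array literal must print
// exactly the way the BE prints arrays, whatever that shape is.
final class ArrayLiteralFormatSketch {
    static String toSql(List<String> elementSql) {
        return elementSql.stream().collect(Collectors.joining(", ", "[", "]"));
    }
}
```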
Add JNI metrics, for example:
```
- HudiJniScanner: 0ns
- FillBlockTime: 31.29ms
- GetRecordReaderTime: 1m5s
- JavaScanTime: 35s991ms
- OpenScannerTime: 1m6s
```
Add three common performance metrics for the JNI scanner:
1. `OpenScannerTime`: Time to initialize and open the JNI scanner
2. `JavaScanTime`: Time to scan data and insert it into the vector table on the Java side
3. `FillBlockTime`: Time to convert the Java vector table to a C++ block
Also support user-defined metrics on the Java side. For example, `OpenScannerTime` covers a long open process; to determine which sub-step takes too much time, we add `GetRecordReaderTime` on the Java side.
User-defined metrics on the Java side are attached to the BE profile automatically.
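A hedged sketch of what recording a user-defined metric on the Java side could look like; the `ScannerMetricsSketch` helper and its methods are hypothetical, not the actual connector API:

```
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the scanner accumulates named timers and exposes them
// as a map that the BE can fold into its profile automatically.
final class ScannerMetricsSketch {
    private final Map<String, Long> timersNs = new ConcurrentHashMap<>();

    void addNanos(String name, long nanos) {
        timersNs.merge(name, nanos, Long::sum);
    }

    /** Snapshot handed back through JNI and attached to the BE profile. */
    Map<String, Long> snapshot() {
        return Map.copyOf(timersNs);
    }

    // Example use inside open(): time the record-reader construction.
    void openExample() {
        long start = System.nanoTime();
        // ... create the Hudi record reader here ...
        addNanos("GetRecordReaderTime", System.nanoTime() - start);
    }
}
```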
Problem:
When using `select group_concat(distinct a, 'seg1'), group_concat(distinct b, 'seg2') ...`, an error is raised.
Reason:
The `group_concat` function also treats the separator (e.g. `'seg1'`) as an argument, so a multi-distinct-column error is raised.
Solution:
Let the multi-distinct `group_concat` function take only its first argument as the real argument.
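A minimal sketch of the fix, with hypothetical names (not the actual planner code):

```
import java.util.List;

// Hypothetical sketch: when collecting the distinct columns of
// group_concat(DISTINCT col, 'sep'), keep only the first argument; the
// trailing literal is the separator, not another distinct column.
final class GroupConcatArgsSketch {
    static <T> List<T> distinctColumnsOf(List<T> groupConcatArgs) {
        // Only the leading argument is a real aggregated column.
        return groupConcatArgs.subList(0, 1);
    }
}
```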
The current column statistic cache loader loads data from the `column_statistics` OLAP table.
This PR changes the loader logic: first load from the `column_statistics` OLAP table; if no data was loaded, fall back to the table metadata. This is mainly to support fetching statistics for external catalogs through the HMS or Iceberg API.
This is the first PR; the next PR will implement the fetch logic for the different external catalogs.
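A minimal sketch of the new loading order, with hypothetical helper names (only the `column_statistics` table name comes from the description above):

```
import java.util.Optional;

// Hypothetical sketch of the loader's fallback: try the internal
// column_statistics OLAP table first, then fall back to table metadata
// (e.g. stats fetched via the HMS or Iceberg API for external catalogs).
final class ColumnStatisticCacheLoaderSketch {
    interface StatsSource {
        Optional<ColumnStatistic> load(String catalog, String db, String table, String column);
    }

    static final class ColumnStatistic { /* row count, ndv, nulls, min/max ... */ }

    private final StatsSource olapStatsTable;   // reads column_statistics
    private final StatsSource tableMetadata;    // external catalog metadata

    ColumnStatisticCacheLoaderSketch(StatsSource olapStatsTable, StatsSource tableMetadata) {
        this.olapStatsTable = olapStatsTable;
        this.tableMetadata = tableMetadata;
    }

    Optional<ColumnStatistic> load(String catalog, String db, String table, String column) {
        Optional<ColumnStatistic> fromTable = olapStatsTable.load(catalog, db, table, column);
        return fromTable.isPresent() ? fromTable
                : tableMetadata.load(catalog, db, table, column);
    }
}
```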