doris

Author	SHA1	Message	Date
Tiewei Fang	6f9a084d99	[Fix](Outfile) Use data_type_serde to export data to `parquet` file format (#24998 )	2023-10-13 13:58:34 +08:00
zhangdong	4f65a9c425	[fix](auth)fix not display be_port (#25197 ) fix not display be_port who has ADMIN_PRIV	2023-10-13 11:56:00 +08:00
Jack Drogon	ffacbe7d74	[feature](thrift) Add FE thrift rpc redirect master address (#25371 ) Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-10-13 11:17:46 +08:00
DuRipeng	aa0b74d63a	[improvement](fe and broker) support specify broker to getSplits, check isSplitable, file scan for HMS Multi-catalog (#24830 ) I want to use Doris Multi-catalog to accelerate HMS queries. My organization has a custom distributed file system, and we think wrapping the fs access differences into the broker (listLocatedFiles, openReader..) would be an elegant approach. This PR introduces the HMS catalog conf `bind.broker.name`. If this conf is set, file split and query scan operations are sent to the broker. Usage: create an HMS catalog with a broker ``` CREATE CATALOG hive_catalog_broker PROPERTIES ( 'type'='hms', 'hive.metastore.uris' = 'thrift://xxx', 'broker.name' = 'hdfs_broker' ); ``` When we query this catalog, file split and query scan requests are sent to the broker `hdfs_broker`. More details about this PR: 1. Introduce the HMS catalog property `bind.broker.name` to specify the broker that does the remote path work. When `broker.name` is set, `enable.self.splitter` must be `true` to ensure the file splitting process is executed in FE 2. Introduce 2 more interfaces to the broker service: - `TBrokerIsSplittableResponse isSplittable(1: TBrokerIsSplittableRequest request)`, which invokes the input format's `isSplitable` interface. - `TBrokerListResponse listLocatedFiles(1: TBrokerListPathRequest request)`, which does `listFiles` or `listLocatedStatus` for the remote file system 3. Three parts of the whole process are executed in the broker: - Check whether the path with the specified input format name `isSplittable` - `listLocatedFiles` of table / partition locations. - `OpenReader` for specified file splits. Co-authored-by: chenlinzhong <490103404@qq.com>	2023-10-13 11:04:38 +08:00
Mingyu Chen	a30d30e7b5	[improvement](resource-tag) limit the default user's resource tag to 'default' (#25331 ) Previously, if the user property `'resource_tags.location'` was not set, the user could use Backends with any resource tag. This is confusing: when the DBA assigns some Backends to resource group A, an existing user should not be able to use group A until its `'resource_tags.location'` is set. So this PR changes the behavior: if the user property `'resource_tags.location'` is not set, the user can only use the Backends with the `default` tag.	2023-10-13 10:50:00 +08:00
zhangdong	11bbeb9a21	[Enhance](resource group)db support replication_allocation (#25195 ) - db supports replication_allocation; when creating a table, if `replication_num` or `replication_allocation` is not set, the db's value is used - fix the partition property disappearing when the table partition is not null	2023-10-13 10:24:01 +08:00
yongjinhou	21223e65c5	[Enhancement](show-backends-disks) Add show backends disks (#24229 ) * Add statement to query disk information corresponding to data directory of BE node [mysql]->'show backends disks;' +-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+ \| BackendId \| Host \| RootPath \| DirType \| DiskState\| TotalCapacity \| UsedCapacity\| AvailableCapacity \| UsedPct \| +-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+ \| 10002 \| 10.xx.xx.90 \| /home/work/output/be/storage \| STORAGE \| ONLINE \| 7.049 TB \| 2.478 TB \| 4.571 TB \| 35.16 % \| \| 10002 \| 10.xx.xx.90 \| /home/work/output/be \| DEPLOY \| ONLINE \| 7.049 TB \| 2.478 TB \| 4.571 TB \| 35.16 % \| \| 10002 \| 10.xx.xx.90 \| /home/work/output/be/log \| LOG \| ONLINE \| 7.049 TB \| 2.478 TB \| 4.571 TB \| 35.16 % \| +-----------+-------------+------------------------------+---------+----------+---------------+-------------+-------------------+---------+	2023-10-12 20:24:45 +08:00
morrySnow	0a38546596	[opt](Nereids) reject group commit insert temporarily (#25359 ) Group commit insert was introduced by PR #22829. Since Nereids does not support it yet, we temporarily forbid it on Nereids until it is implemented.	2023-10-12 06:20:59 -05:00
Nitin-Kashyap	bdb64eab73	[feature](meta) queries as table valued function (#25052 ) (#25052 ) 1. Add queries view as table function. 2. Proxy result to other FEs and return merged results back to BE. Co-authored-by: yiguolei <676222867@qq.com>	2023-10-12 16:26:14 +08:00
谢健	d6ff9744c9	[feature](Nereids) convert predicate to SARGABLE (#25180 ) convert predicates to SARGABLE form 1. support formats like `1 - a` 2. support rearranging `year/month/week/day/minutes/seconds_sub/add` functions	2023-10-12 14:46:56 +08:00
Jibing-Li	c63bf24c84	[Improvement](statistics) Improve sample count accuracy (#25175 ) During sample analyze, the row count, null count and data size results need to be multiplied by a coefficient based on the sample percent/rows. This PR mainly calculates the coefficient according to the sampled file size over the total size.	2023-10-12 14:42:02 +08:00
starocean999	80a49ed97a	[fix](nereids)fix some function signature issue (#25301 ) 1. remove wrong signature of nvl 2. the promoted type datetimev2 for datetime should be datetimev2(0)	2023-10-12 01:23:20 -05:00
morrySnow	a0d3206d78	[fix](Nereids) support nested complex type literal (#25287 )	2023-10-12 01:17:38 -05:00
minghong	42f8b253aa	[function](nereids) support array_apply/array_repeat/group_uniq_array/ipv4numtostring (#25249 ) nereids support functions: array_apply/array_repeat/group_uniq_array/ipv4numtostring	2023-10-12 11:08:42 +08:00
Pxl	a0d2b1ec56	[Bug](materialized-view) fix not match mv when some alias on agg (#25321 ) fix not match mv when some alias on agg	2023-10-12 11:02:55 +08:00
daidai	9a4baf7ccf	[fix](Nereids)Fix the bug that count() does not push down for tables with only one column. (#25222 ) after pr #22115 . Fixed the bug that when selecting count() from table, if the table has only one column, the aggregate count is not pushed down.	2023-10-11 23:17:30 +08:00
zhangdong	d1f59a4025	[fix](catalog)fix when modifying comments in property, it will modify the comments in the catalog (#24857 ) - fix when modifying comments in property, it will modify the comments in the catalog - add `alter catalog modify comment` to modify comment for catalog - abstract some logic of `alter catalog` to parent class	2023-10-11 23:16:19 +08:00
yujun	73c3e3ab55	[Feature](x-load) support config min replica num for loading data (#21118 )	2023-10-11 21:07:35 +08:00
Mryange	ba87f7d3a3	[fix](pipelineX) add table sink and some fix in pipelineX (#25314 )	2023-10-11 20:18:08 +08:00
morrySnow	46be6c07e1	[opt](Nereids) expose multi distinct functions (#25309 )	2023-10-11 05:42:39 -05:00
yujun	1e300d895d	[improvement](checkpoint) checkpoint thread update tablet invert index (#25098 )	2023-10-11 18:18:03 +08:00
starocean999	2d19f2fbfe	[fix](planner)need call materializeSrcExpr for materialized slots in join node (#25204 )	2023-10-11 16:34:53 +08:00
starocean999	dabeeb0338	[fix](planner)should always use plan node's getTblRefIds method to get unassigned conjuncts for this node (#25130 )	2023-10-11 16:34:21 +08:00
starocean999	2221c8e2ed	[fix](planner)implicit cast should use type member variable instead of targetTypeDef (#24582 )	2023-10-11 16:33:48 +08:00
starocean999	e9554e36a8	[fix](nereids)disable parallel scan in some case (#25089 )	2023-10-11 16:32:09 +08:00
starocean999	6d999f5b95	[enhancement](nereids)add eliminate filter on one row relation rule (#24980 ) 1.simplify PushdownFilterThroughSetOperation rule 2.add eliminate filter on one row relation rule	2023-10-11 16:12:24 +08:00
谢健	47578c0fc9	[fix](Nereids) fix toSql of date literal (#25243 ) toSql should return '2023-2-1 ' for DateLiteral 2023-2-1	2023-10-11 13:04:05 +08:00
Gabriel	0d603dd4c3	[Bug](delete) Use date as common type for date comparison (#25262 )	2023-10-11 11:51:43 +08:00
Xiangyu Wang	1e6d34d1d0	[Enhancement](sql-cache) Add partition update time for hms table and use it at sql-cache. (#24491 ) Currently FE does not record the update time of an hms table's partitions, so the sql cache may be hit even when the hive table's partitions have changed. This PR adds a field to record the partition update time and uses it when sql-cache is enabled. The cache will be missed if any partition has changed on the hive side. It uses System.currentTimeMillis() rather than the event time of the hms event, to keep the same measurement as the schemaUpdateTime of the external table. The value is added to ExternalObjectLog and replayed by slave FEs so that all FEs hold the same value and the sql-cache can be hit by queries through different FEs.	2023-10-11 11:05:16 +08:00
morrySnow	b91bce8a62	[feature](Nereids) add array distance functions (#25196 ) - l1_distance - l2_distance - cosine_distance - inner_product	2023-10-10 21:35:06 -05:00
Calvin Kirs	d4673ce28a	[Feature](Job)Jobs in the Finish state will be automatically deleted after three days. (#25170 )	2023-10-11 10:04:19 +08:00
zhangstar333	fb3b888ff1	[prune](partition)support prune partition when is auto partition with function call (#24747 ) When a table is created with auto partitioning, e.g. AUTO PARTITION BY RANGE date_trunc(event_day, 'day'), each value of event_day is inserted into the partition of date_trunc(event_day, 'day'). For a query such as: select * from partition_range where date_trunc(event_day,"day")= "2023-08-07 11:00:00"; we can prune some partitions by invoking date_trunc("2023-08-07 11:00:00","day" );	2023-10-10 20:39:43 +08:00
zhiqqqq	62a6b132be	[Fix](func numbers) Remove backend_nums argument of numbers function (#25200 )	2023-10-10 20:25:58 +08:00
morrySnow	fc1bad9a6b	[feature](Nereids) support query MATERIALIZED_VIEW type table (#25227 )	2023-10-10 06:44:29 -05:00
Petrichor	67ddfb1abc	[fix](httpserver) creating this cookie without the "secure" flag and enabling cross-origin resource safe (#25107 )	2023-10-10 06:25:09 -05:00
morrySnow	8b56ca84c7	[fix](Nereids) support AnyDataType in function signature (#25173 ) 1. support AnyDataType in function signature 2. update histogram signature	2023-10-10 06:09:47 -05:00
morrySnow	0435b286fb	[feature](Nereids) support metadata tvf and fix bugs in group_commit() (#25224 ) metadata tvf list: - backends - catalogs - frontends - frontends_disks - group_commit - iceberg_meta - workload_groups fix group_commit bugs - throw NPE when properties do not contain 'table_id' - throw NPE when table_id's table do not exist - throw class Cast failed when table_id's table's type is not OLAP	2023-10-10 05:20:19 -05:00
谢健	7276665f1e	[enhancement](Nereids) avoiding broadcast join heuristically and pruning more in CostAndEnforceJob (#25137 ) When the rowCount exceeds a certain threshold, refrain from generating a broadcast join. Only enforce the best expression in CostAndEnforce Job, rather than enforcing every expression. Remove lower bound group pruning	2023-10-10 13:38:10 +08:00
morrySnow	181c58c691	[fix](Nereids) count_by_enum signature is wrong (#25167 )	2023-10-10 13:05:20 +08:00
HappenLee	880d0d7e70	[Bug](pipeline) Support the auto partition in pipeline load (#25176 )	2023-10-10 11:51:12 +08:00
morrySnow	59dee6b235	[fix](Nereids) support string cast to complex type (#25154 )	2023-10-10 10:26:33 +08:00
Jerry Hu	f5b826b66d	[fix](mark join) mark join column should be nullable (#24910 )	2023-10-10 10:10:36 +08:00
Mryange	90ad48cdb7	[feature](pipelineX) add node id and profilev2 in pipelineX (#25084 )	2023-10-10 09:09:26 +08:00
谢健	5e8aef4756	[feature](Nereids) fold weeks_sub/add on fe (#25155 ) support folding weeks_sub/add on fe	2023-10-09 21:52:44 +08:00
amory	53b46b7e6c	[FIX](filter) update for filter_by_select logic (#25007 ) This PR updates the filter_by_select logic and limits the where condition of delete statements to scalar types. Only nullable columns and predicate columns support the filter_by_select logic, because non-scalar types cannot be pushed down to the storage layer and packed into a predicate column, yet still need the filter logic.	2023-10-09 21:27:40 +08:00
morrySnow	37247ac449	[opt](Nereids) add two args signature to trim family functions (#25169 )	2023-10-09 07:17:52 -05:00
AKIRA	08e7a7b932	[feat](optimizer) Scale sample stats with ratio to make it more precise (#25079 ) Since Doris supports querying specific tablets only, we don't depend on tableSample for sampling and instead use the grammar TABLET(id). In OlapAnalyzeTask, we calculate which tablets would be hit and set their ids in it, so we can get how many rows were actually queried and from that the scale-up ratio.	2023-10-09 07:01:59 -05:00
zy-kkk	400b9f2f97	[Enhancement](log) Improve Safety and Robustness of Log4j Configuration (#24861 )	2023-10-09 06:44:37 -05:00
morrySnow	f8eb36158a	[fix](Nereids) alias function support arithmetic functions (#25162 )	2023-10-09 19:04:47 +08:00
Tiewei Fang	977d119545	[fix](Insert select tvf) fix NPE because tvf do not have catalog name (#25149 )	2023-10-09 18:02:43 +08:00
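
Commits c63bf24c84 and 08e7a7b932 above describe scaling sampled statistics by a coefficient. The following is a minimal Python sketch of that idea, not the Doris implementation. It assumes a simple linear scale-up, with the coefficient taken as total size divided by sampled size. The function name, parameter names, and the returned dictionary keys are hypothetical.

```python
def scale_sampled_stats(sampled_rows, sampled_nulls, sampled_bytes, total_bytes):
    """Scale counts measured on a sample up to an estimate for the full table.

    Assumption (not from the Doris source): rows are spread evenly across the
    data, so the coefficient is total size divided by sampled size.
    """
    if sampled_bytes <= 0:
        raise ValueError("sampled_bytes must be positive")
    factor = total_bytes / sampled_bytes
    return {
        "row_count": round(sampled_rows * factor),
        "null_count": round(sampled_nulls * factor),
        "scale_factor": factor,
    }
```

For example, a sample covering one tenth of the data by size would have its row and null counts multiplied by ten.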


4840 Commits
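
Commit b91bce8a62 above adds four array distance functions to Nereids: l1_distance, l2_distance, cosine_distance and inner_product. The Python sketch below shows the conventional mathematical definitions these names usually refer to. It assumes cosine distance means one minus cosine similarity, and it does not reproduce how Doris handles nulls, empty arrays, or zero vectors. The helper `_check` is hypothetical.

```python
import math


def _check(a, b):
    # Hypothetical helper: both arrays must have the same length.
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")


def l1_distance(a, b):
    """Sum of absolute element-wise differences (Manhattan distance)."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Square root of the sum of squared differences (Euclidean distance)."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of element-wise products (dot product)."""
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus cosine similarity (assumed definition)."""
    _check(a, b)
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)
```

Under these definitions, orthogonal vectors have a cosine distance of 1 and parallel vectors have a cosine distance close to 0.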