doris

Author	SHA1	Message	Date
yuxuan-luo	fe63a0a3bb	[Feature](multi-catalog)support paimon catalog (#19681 ) CREATE CATALOG paimon_n2 PROPERTIES ( "dfs.ha.namenodes.HDFS1006531" = "nn2,nn1", "dfs.namenode.rpc-address.HDFS1006531.nn2" = "172.16.65.xx:4007", "dfs.namenode.rpc-address.HDFS1006531.nn1" = "172.16.65.xx:4007", "hive.metastore.uris" = "thrift://172.16.65.xx:7004", "type" = "paimon", "dfs.nameservices" = "HDFS1006531", "hadoop.username" = "hadoop", "paimon.catalog.type" = "hms", "warehouse" = "hdfs://HDFS1006531/data/paimon1", "dfs.client.failover.proxy.provider.HDFS1006531" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" );	2023-06-06 15:08:30 +08:00
Chengpeng Yan	ae428c29e2	[feature](planner)(nereids) support user defined variable (#20334 ) Support user-defined variables. After this PR, we can use `set @a = xx` to define a user variable and use it in the query like `select @a`. the changes of this PR: 1. Support the grammar for `set user variable` in the parser. 2. Add the `userVars` in `VariableMgr` to store the user-defined variables. 3. For the `set @a = xx`, we will store the variable name and its value in the `userVars` in `VariableMgr`. 4. For the `select @a`, we will get the value for the variable name in `userVars`.	2023-06-06 14:35:16 +08:00
yuxuan-luo	0fce7b9011	[fix](http) Let the sdk find the httpclient package determined (#20205 )	2023-06-06 14:20:38 +08:00
amory	1f032a551d	[Improve](array-functions) support array first function (#20397 ) add array_first(lambda, [1,2,3,null]) function for doris	2023-06-06 12:08:46 +08:00
TengJianPing	1b94b6368f	[fix](load) in strict mode, return error for insert if datatype convert fails (#20378 ) * [fix](load) in strict mode, return error for load and insert if datatype convert fails Revert "[fix](MySQL) the way Doris handles boolean type is consistent with MySQL (#19416)" This reverts commit 68eb420cabe5b26b09d6d4a2724ae12699bdee87. Since it changed other behaviours, e.g. in strict mode insert into t_int values ("a"), it will result in 0 being inserted into the table, but it should return an error instead. * fix be ut * fix regression tests	2023-06-06 12:04:03 +08:00
morrySnow	e553615a27	[opt](Nereids) prefer to use datev2 / datetimev2 in date related functions (#20224 ) 1. update the signature order of all date related functions. 1.1. if the return value needs to be computed with time info, args with datetimev2 are at the top of the list, followed by datev2, datetime and date 1.2. if the return value needs to be computed with only date info, args with datev2 are at the top of the list, followed by datetimev2, date and datetime 2. Give priority to datev2 if we must cast date to datev2 or datetime/datetimev2	2023-06-06 11:42:29 +08:00
zy-kkk	c56eddbfa9	[bug](jdbc) fix trino date/datetime filter (#20443 ) When querying Trino's JDBC catalog, if our WHERE filter condition is k1 >= '2022-01-01', this format is incorrect. In Trino, the correct format should be k1 >= date '2022-01-01' or k1 >= timestamp '2022-01-01 00:00:00'. Therefore, the date string in the WHERE condition needs to be converted to the date or timestamp format supported by Trino.	2023-06-06 11:20:42 +08:00
Yang, Xu	d02737a293	[feature](struct-type) support struct_element function (#19045 ) This commit support a function allows return a field column in named struct column. Since the function can return any type, this commit also supports ANY_STRUCT_TYPE and ANY_ELEMENT_TYPE.	2023-06-06 10:44:08 +08:00
Mingyu Chen	f839c90c27	[fix][refactor](backend-policy)(compute) refactor the hierarchy of external scan node and fix compute node bug #20402 There should be 2 kinds of ScanNode: OlapScanNode and ExternalScanNode. The Backends used for ExternalScanNode should be controlled by FederationBackendPolicy. But currently, only FileScanNode is controlled by FederationBackendPolicy; other scan nodes such as MysqlScanNode and JdbcScanNode will use Mix Backend even if we enable and prefer to use Compute Backend. In this PR, I modified the hierarchy of ExternalScanNode, the new hierarchy is: ScanNode OlapScanNode SchemaScanNode ExternalScanNode MetadataScanNode DataGenScanNode EsScanNode OdbcScanNode MysqlScanNode JdbcScanNode FileScanNode FileLoadScanNode FileQueryScanNode MaxComputeScanNode IcebergScanNode TVFScanNode HiveScanNode HudiScanNode And previously, the BackendPolicy was a member of FileScanNode; now I moved it to the ExternalScanNode, so that all ExternalScanNode subtypes can use BackendPolicy to choose Compute Backend to execute the query. All ExternalScanNode subtypes should implement the abstract method createScanRangeLocations(). For scan nodes like jdbc scan node/mysql scan node, the scan range locations will be selected randomly from compute nodes (if preferred). As for compute node selection: if all scan nodes are external scan nodes, and prefer_compute_node_for_external_table is set to true, the BE for this query will only select compute nodes.	2023-06-06 10:35:30 +08:00
slothever	b7fc17da68	[feature-wip](multi-catalog)(step2)support read max compute data by JNI (#19819 ) Issue Number: #19679	2023-06-05 22:10:08 +08:00
mch_ucchi	fac0b50f56	[Fix](Planner)fix cast date/datev2/datetime to float/double return null. (#20008 )	2023-06-05 19:06:50 +08:00
minghong	92721c84d3	[improve](nereids)derive analytics node stats (#20340 ) 1. derive analytic node stats, add support for rank() 2. filter estimation stats derive updated. update row count of filter column. 3. use ColumnStatistics.orginal to replace ColumnStatistics.orginalNdv, where ColumnStatistics.orginal is the column statistics obtained from TableScan. TPCDS 70 on tpcds_sf100 improved from 23 sec to 2 sec. This PR causes no performance downgrade on other tpcds queries and tpch queries.	2023-06-05 18:56:20 +08:00
wangbo	c7dd7c2eba	Fix query hang when using queue (#20434 )	2023-06-05 18:12:26 +08:00
morrySnow	7d11db0807	[fix](Nereids) throw NPE when sql cannot be parsed by all planner (#20440 )	2023-06-05 17:49:08 +08:00
zhangdong	bc65e9b5fb	[fix](MTMV) Support star expressions in select list (#20355 )	2023-06-05 17:06:05 +08:00
jakevin	9d39fd7aae	[fix](Nereids): fix filter can't be pushdown unionAll (#20310 )	2023-06-05 16:56:25 +08:00
LiBinfeng	f0b0bda04a	[Fix](Nereids) Fix duplicated name in view does not throw exception (#20374 ) When using nereids, if we have duplicated name in output of view, we need to throw an exception. A check rule was added in bindExpression rule set	2023-06-05 16:10:54 +08:00
luozenglin	a66d5a6ae0	[fix](workload-group) fix workload group non-existence error (#20428 )	2023-06-05 15:33:26 +08:00
LiBinfeng	fe942eaf44	[Fix](Nereids) Fix minidump using put all of hashmap (#20268 ) The minidump file tries to capture as much information as possible, but when the switch is off, these methods should not be called after refactor PR #20049. Other places that do extra work since the Minidump feature was added were also checked.	2023-06-05 13:05:15 +08:00
minghong	0dc6d3a568	[fix](nereids) avg size of column stats always be 0 (#20341 ) it takes lot of effort to compute the avgSizeByte for col stats. we use schema information to avoid compute actual average size	2023-06-05 13:01:58 +08:00
AKIRA	cd0379df4e	[fix](nereids) select with specified partition name does not work as expected (#20269 ) This PR fixes the select-specific-partition issue: certain code related to this feature was accidentally deleted.	2023-06-05 12:48:54 +08:00
camby	3c28a71378	[fix](dynamic partition) partition create failed after alter distributed column (#20239 ) This pr fix following two problems: Problem1: Alter column comment make add dynamic partition failed inside issue #10811 create table with dynamic partition policy; restart FE; alter distribution column comment; alter dynamic_partition.end to trigger add new partition by dynamic partition scheduler; Then we got the error log, and the new partition create failed. dynamic add partition failed: errCode = 2, detailMessage = Cannot assign hash distribution with different distribution cols. default is: [id int(11) NULL COMMENT 'new_comment_of_id'], db: default_cluster:example_db, table: test_2 Problem2: rename distributed column, make old partition insert failed. inside #20405 The key point of the reproduce steps is restart FE. It seems all versions will be affected, include master and lts-1.1 and so on.	2023-06-05 12:20:50 +08:00
Yulei-Yang	a6d8115cbc	[Improvement](planner) expand sql-block-rule to make it can be used on all kinds of sql stmt (#19540 ) Currently, sql-block-rule can only be used for query statements, while it's useful for other stmts like insert / delete / alter / drop etc. Now remove the limitation and expand its using scenario.	2023-06-05 11:01:43 +08:00
Yulei-Yang	660ab34147	[fix](multicatalog) support read from hive table with HIVE_UNION_SUBDIR in path location (#20329 )	2023-06-05 11:01:24 +08:00
AKIRA	12f89b879f	[fix](stats) Analysis info lost after checkpoint (#20412 ) 1. Implement write/read for AnalysisManager 2. If database or table has any column with complex type, the analyze stmt would fail directly. Enable to ignore complex type columns and analyze rest of them in this PR	2023-06-05 10:51:02 +08:00
starocean999	c6387847aa	[fix](nereids) change defaultConcreteType function's return value for decimal (#20380 ) 1. add default decimalv2 and decimalv3 for NullType 2. change defaultConcreteType of decimalv3 to this	2023-06-05 10:50:07 +08:00
amory	59a0f80233	[Improve](array-function)Improve array function intersect (#20085 ) now we just support array function with 2 arrays, but intersect operator can support more than 2 arrays	2023-06-05 10:38:48 +08:00
Pxl	8e39f0cf6b	[Enhancement](Agg State) store function name and result is nullable in agg state type (#20298 ) store function name and result is nullable in agg state type	2023-06-04 22:44:48 +08:00
ElvinWei	ad5e34ab9c	[Doc](statistics) supplement stats doc (regression test and automatic collection) (#20071 )	2023-06-03 17:25:33 +08:00
YueW	77855fcd43	[fix](inverted index) fix transaction id changed when light index change (#20302 )	2023-06-03 16:05:02 +08:00
Kang	ffadaa4935	[improvement](inverted index) skip write index on load and generate index on compaction (#20325 )	2023-06-03 16:03:21 +08:00
caiconghui	6958a8f92f	[fix](dynamic_partition) fix dead lock when modify dynamic partition property for olap table (#20390 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-06-03 08:25:20 +08:00
morrySnow	299c3dc396	[fix](Nereids) should not inherit child's limit and offset when generate exchange node (#20373 ) in legacy planner, when we new exchange, it inherit its child's limit and offset. but in Nereids, we should not do this. because if we need set limit or offset, we will set it manually. In this PR, we use a new ctor of ExchangeNode to ensure not set limit or offset unexpected.	2023-06-02 19:55:33 +08:00
luozenglin	a8e0841ef1	[fix](workload-group) fix incorrect memoryLimitPercent value (#20377 )	2023-06-02 18:57:57 +08:00
zy-kkk	a20a6d2bea	[refactor](jdbc catalog) Refactor the JdbcClient code (#20109 ) This PR does the following: 1. This PR is a substantial refactor of the JDBC client architecture. The previous monolithic JDBC client has been refactored into an abstract base class `JdbcClient`, and a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`, etc.), and the JdbcClient required config, abstract into an object. This allows for improved modularity, easier addition of support for new databases, and cleaner, more maintainable code. This change is backward-compatible and does not affect existing functionality. 2. As a result of client refactoring, OceanBaseClient can automatically recognize the mode of operation as MySQL or Oracle, so we cancel the oceanbase_mode property in the Jdbc Catalog, but due to the cancellation of the property, When creating a single OceanBase Jdbc Table, the table type needs to be filled in as oceanbase(mysql mode) or oceanbase_oracle(oracle_mode). The above work is a change in the usage behavior, please note. 3. For the PostgreSQL Jdbc Catalog, I did two things: 1. The adaptation to MATERIALIZED VIEW and FOREIGN TABLE is added 2. Fixed reading jsonb, which had been incorrectly changed to json in a previous PR 4. fix some jdbc catalog test case 5. modify oceanbase jdbc doc And,Thanks @wolfboys for the guidance	2023-06-02 17:58:10 +08:00
yongjinhou	4395fb70c4	[Enhancement](tvf) Backends tvf supports authentication (#20333 ) Add authentication for backends tvf.	2023-06-02 17:53:44 +08:00
minghong	386a4a0b43	[fix](nereids) add fragment id on all PhysicalRelation (#20371 ) fix "cannot find fragment id for scan" exception	2023-06-02 17:13:09 +08:00
morrySnow	422fcd6377	[fix](Nereids) forbid unexpected expression on filter and fix two more bugs (#20331 ) fix below bugs: 1. filter's expression was not checked: aggregate function, grouping scalar function and window expression should not appear in filter 2. should not change nullable of aggregate function when it is window function in window expression 3. bitmap and other metric types should not appear in order by or partition by of window expression	2023-06-02 16:19:50 +08:00
Yongqiang YANG	b1e6c6ffe5	[enhancement](txn) print commit backends when commit fails (#20367 ) Print commit backends when a commit fails.	2023-06-02 15:10:38 +08:00
amory	d68f3f3b3d	[Feature](array-functions)improve array functions for array_last_index (#20294 ) Now we just support array_first_index for lambda input , but no array_last_index	2023-06-02 13:54:03 +08:00
AKIRA	e32eba8fdf	[refactor](stats) Persist status of analyze task to FE meta data (#20264 ) 1. In the past, we used a BE table named `analysis_jobs` to persist the status of analyze jobs/tasks; however, this had many flaws, e.g. if a BE crashed the analyze job/task would fail, yet its status could not be updated. 2. Support `DROP ANALYZE JOB [job_id]` to delete analyze job 3. Support `SHOW ANALYZE TASK STATUS [job_id]` to get the task status of specific job 4. Restrict the execute condition of auto analyze: it can be executed again only when the last execution of the auto analyze job finished a while ago 5. Support analyze whole DB	2023-06-02 12:33:31 +08:00
mch_ucchi	9d8043e4c1	[Fix](Nereids) should not gather data when sink (#20330 )	2023-06-02 10:33:11 +08:00
xy720	5a3b97bbf2	[enhancement](struct-type)support comment for struct field (#20200 ) support comment for struct field	2023-06-02 10:29:56 +08:00
Gabriel	937f04033f	[Bug](runtime filter) fix NPE if runtime filter has no target (#20338 )	2023-06-02 09:54:37 +08:00
starocean999	a8a4da9b9e	[fix](nereids)dphyper join reorder may cache wrong project list for project node (#20209 ) * [fix](nereids)dphyper join reorder may cache wrong project list for project node	2023-06-02 09:35:28 +08:00
xueweizhang	ecdc5124be	[feature-wip](duplicate-no-keys) schema change support for duplicate no keys (#19326 )	2023-06-02 09:22:41 +08:00
wangbo	0df073699d	[fix](planner)Fix missing kw for workload #20319 1 add usage document for Workload Group query queue; 2 Fix missing KW for workload, which may cause creating a workload group to fail.	2023-06-02 09:04:22 +08:00
Yongqiang YANG	363e78f08f	[enhancement](publish) print detailed info for failed publish (#20309 )	2023-06-01 22:24:16 +08:00
zhangstar333	34c1cda14a	[bug](udaf) fix java-udaf test case failed with decimal (#20315 ) Some java-udaf test cases with decimal fail in P0 because the scale of the decimal is not set correctly	2023-06-01 20:14:54 +08:00
lihangyu	f0513a861d	[Improve](Scan) add a session variable to make scan run serial (#20220 ) Parallel scanning can result in some read amplification, for example, select * from xx where limit 1 actually requires only one row of data. However, due to parallel scanning of multiple tablets, read amplification occurs, leading to performance bottlenecks in high-concurrency scenarios. This PR adds a SessionVariable to enforce serial scanning, which can help mitigate this issue.	2023-06-01 15:06:35 +08:00
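
The Trino date/datetime filter fix (#20443) listed above describes rewriting a plain string such as '2022-01-01' into the typed literal form that Trino accepts. A minimal sketch of that idea follows. It is not the Doris implementation: the function names `to_trino_literal` and `build_filter` are hypothetical, and only the two formats quoted in the commit message are recognized.

```python
import re

# Hypothetical helper names; this only illustrates the rewrite described
# in the commit message and is not code from the Doris repository.
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
_DATETIME = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def to_trino_literal(value: str) -> str:
    """Wrap a date-like string in a Trino typed literal."""
    if _DATE.fullmatch(value):
        return f"date '{value}'"
    if _DATETIME.fullmatch(value):
        return f"timestamp '{value}'"
    # Anything else is passed through as an ordinary string literal.
    return f"'{value}'"


def build_filter(column: str, op: str, value: str) -> str:
    """Assemble a single comparison predicate for the pushed-down WHERE clause."""
    return f"{column} {op} {to_trino_literal(value)}"


if __name__ == "__main__":
    print(build_filter("k1", ">=", "2022-01-01"))
    print(build_filter("k1", ">=", "2022-01-01 00:00:00"))
```

The sketch checks the string format only; a real implementation would presumably decide based on the column type rather than on how the literal looks.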

8289 Commits