doris

Author	SHA1	Message	Date
Kang	295ea482a1	[improvement](log) optimize template function log for performance (#23746 ) change log level to debug and use format in template function log for performance.	2023-09-01 19:02:33 +08:00
minghong	0b94eee4c7	[fix](rest)query_info returns empty rows #23595	2023-09-01 18:50:49 +08:00
谢健	797d9de192	[fix](Nereids) When col stats is Unknown, a NOT expression should return the stats with selectivity of 1	2023-09-01 17:36:31 +08:00
LiBinfeng	e3bbba82cf	[Fix](planner) fix to_date failed in create table as select (#23613 ) Problem: create table as select fails when it uses the to_date function. Example: create table test_to_date properties('replication_num' = '1') as select to_date('20230816') as datev2; Reason: since release version 2.0, datev1 is disabled, but the to_date function signature was not upgraded, so checking the return type of to_date fails. Solution: when getting the function, forbid to_date with return type date_v1; datetime v1 is likewise changed to datetime v2 and decimal v2 to decimal v3	2023-09-01 17:28:40 +08:00
starocean999	b5232ce0d7	[fix](nereids) NormalizeAggregate may push redundant expr to child project node (#23700 ) NormalizeAggregate may push exprs to the child project node. We need to make sure there is no redundant expr in the pushed-down expr list. This PR uses a 'Set' to ensure that.	2023-09-01 17:16:10 +08:00
yujun	e3886bcf2a	[fix](tablet scheduler) change sched period back to 1s (#23573 ) This reverts commit 285bf978442fdff65fda5264ff40bd8291954ef2. * change tablet sched period back to 1s	2023-09-01 15:29:59 +08:00
plat1ko	9d2fc78bd5	[fix](cooldown) Fix potential data loss when clone task's dst tablet is cooldown replica (#17644 ) Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com> Co-authored-by: Kang <kxiao.tiger@gmail.com>	2023-09-01 15:27:52 +08:00
yujun	b843b79ddc	[fix](tablet clone) fix tablet sched ctx toString causing null exception (#23731 )	2023-09-01 15:05:28 +08:00
Pxl	0e9dd348fb	[Improvement](materialized-view) add short circuit for selectBestMV #23743	2023-09-01 14:46:54 +08:00
morrySnow	5b2360e836	[opt](planner) speed up computeColumnsFilter on ScanNode (#23742 ) computeColumnsFilter computes filters on all columns of the table's base schema. However, if the table is very wide, such as 5000 columns, this takes a long time. This PR compares the conjuncts size with the columns size. If the conjuncts size is smaller than the columns size, it collects slots from the conjuncts to avoid traversing all columns.	2023-09-01 14:22:17 +08:00
Calvin Kirs	e88c218390	[Improve](Job)Job internal interface provides immediate scheduling (#23735 ) Delete meaningless job status. System scheduling is executed in the time wheel. Optimize window calculation code.	2023-09-01 12:50:08 +08:00
AlexYue	d96bc2de1a	[enhance](policy) Support changing a table's storage policy if the two policies have the same resource (#23665 )	2023-09-01 11:25:27 +08:00
Jibing-Li	d6450a3f1c	[Fix](statistics)Fix external table auto analyze bugs (#23574 ) 1. Fix the bug where auto analyze of an external table recursively loads the schema cache. 2. Move some functions in the StatisticsAutoAnalyzer class to TableIf, so that external tables and internal tables can implement the logic separately. 3. Disable external catalog auto analyze by default; it can be enabled by adding the catalog property "enable.auto.analyze"="true"	2023-09-01 10:58:14 +08:00
Jibing-Li	9a7e8b298a	[Improvement](statistics)Show column stats even when error occurred (#23703 ) Before, show column stats ignored columns with errors. With this PR, when the min or max value fails to deserialize, show column stats uses N/A as the value of min or max and still shows the rest of the stats (count, null_count, ndv and so on).	2023-09-01 10:57:37 +08:00
morrySnow	b93a1a83a5	[opt](Nereids) make the keywords list the same as the legacy planner's (#23632 )	2023-09-01 10:24:30 +08:00
hzq	16d6357266	[fix] (mac compile) Fix mac compile error & fe start time related (#23727 ) Fix of PR #23582. Some FE code was deleted by "[Improvement](pipeline) Cancel outdated query if original fe restarts" (#23582) and needs to be added back. Also fix the mac build failure caused by wrong thrift declaration order.	2023-09-01 08:02:30 +08:00
mch_ucchi	52e645abd2	[Feature](Nereids): support cte for update and delete statements of Nereids (#23384 )	2023-08-31 23:36:27 +08:00
daidai	e680d42fe7	[feature](information_schema)add metadata_name_ids to quickly get catalogs,db,table and add profiling table for compatibility with mysql (#22702 ) add information_schema.metadata_name_ids to quickly get catalogs,db,table. 1. table struct : ```mysql mysql> desc internal.information_schema.metadata_name_ids; +---------------+--------------+------+-------+---------+-------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +---------------+--------------+------+-------+---------+-------+ \| CATALOG_ID \| BIGINT \| Yes \| false \| NULL \| \| \| CATALOG_NAME \| VARCHAR(512) \| Yes \| false \| NULL \| \| \| DATABASE_ID \| BIGINT \| Yes \| false \| NULL \| \| \| DATABASE_NAME \| VARCHAR(64) \| Yes \| false \| NULL \| \| \| TABLE_ID \| BIGINT \| Yes \| false \| NULL \| \| \| TABLE_NAME \| VARCHAR(64) \| Yes \| false \| NULL \| \| +---------------+--------------+------+-------+---------+-------+ 6 rows in set (0.00 sec) mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G; ************************* 1. row ************************* CATALOG_ID: 113008 CATALOG_NAME: hive1 DATABASE_ID: 113042 DATABASE_NAME: ssb1_parquet TABLE_ID: 114009 TABLE_NAME: dates 1 row in set (0.07 sec) ``` 2. when you create / drop a catalog , there is no need to refresh the catalog . ```mysql mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************ 1. row ************************* count(): 21301 1 row in set (0.34 sec) mysql> drop catalog hive2; Query OK, 0 rows affected (0.01 sec) mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10665 1 row in set (0.04 sec) mysql> create catalog hive3 ... mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 21301 1 row in set (0.32 sec) ``` 3. when you create / drop a table , there is no need to refresh the catalog . ```mysql mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ; mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10666 1 row in set (0.04 sec) mysql> drop table demo.example_tbl; Query OK, 0 rows affected (0.01 sec) mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10665 1 row in set (0.04 sec) ``` 4. you can set a query timeout to prevent queries from taking too long . ``` fe.conf : query_metadata_name_ids_timeout the time used to obtain all tables in one database ``` 5. add information_schema.profiling for compatibility with mysql ```mysql mysql> select from information_schema.profiling; Empty set (0.07 sec) mysql> set profiling=1; Query OK, 0 rows affected (0.01 sec) ```	2023-08-31 21:22:26 +08:00
morrySnow	b5e8217743	[opt](Nereids) speed up deepEquals of TreeNode (#23710 )	2023-08-31 19:38:44 +08:00
zhangstar333	3a34ec95af	[FE](function) add date_floor/ceil in FE function (#23539 )	2023-08-31 19:26:47 +08:00
morrySnow	da5c78019c	[opt](fe-ui) support read hardware info from aarch64 MacOS (#23708 ) update the version of oshi and jna to support read hardware info from aarch64 MacOS	2023-08-31 18:16:33 +08:00
hzq	c083336bbe	[Improvement](pipeline) Cancel outdated query if original fe restarts (#23582 ) If any FE restarts, queries emitted from that FE will be cancelled. Implementation of #23704	2023-08-31 17:58:52 +08:00
starocean999	7379cdc995	[feature](nereids) support subquery in select list (#23271 ) 1. add scalar subquery's output to LogicalApply's output 2. for in and exists subqueries, add mark join slot into LogicalApply's output 3. forbid pushing down alias through join if the project list has any mark join slots. 4. move normalize aggregate rule to analysis phase	2023-08-31 15:51:32 +08:00
Xiangyu Wang	126606cb4d	[Fix](cache) fix query cache returns wrong result after deleting partitions. (#23555 ) The reason is that the sql cache uses only partitionKey , latestVersion and latestTime to check whether the cache should be returned. If we delete some partition(s) other than the latest updated partition, none of these values change, so the cache still hits. Use a field to save the partition num of these tables, sum the partition nums, and send the sum to BE. There are two situations that contain delete-partition ops: - only some partition(s) are deleted, so the sum of partition nums will be lower than before. - deleting some partition(s) coexists with adding some partition(s), so the latest time or latest version will be higher than before.	2023-08-31 14:22:52 +08:00
Pxl	f35ab37e1e	[Bug](materialized-view) fix load db use analyzer to analyze different metaindex (#23673 ) fix load db use analyzer to analyze different metaindex	2023-08-31 12:35:38 +08:00
starocean999	41c5e00071	[fix](planner)fix bug of resolve column (#23512 ) if resolving an inline view column fails, we try to resolve it again by removing the table name. But this is wrong if the table name (which may be the inline view's alias) is the same as some table name inside the inline view. So this PR checks the table name and only removes it when no table inside the inline view has the same name as the column's table name	2023-08-31 12:25:26 +08:00
morrySnow	897151fc2b	[fix](Nereids) set operation syntax is not compatible with legacy planner (#23668 ) for example ```sql WITH A AS (SELECT * FROM B) SELECT * FROM C UNION SELECT * FROM D ``` the scope of CTE in Nereids is the first set operand. the scope of CTE in legacy planner is the whole statement.	2023-08-31 11:55:35 +08:00
airborne12	ab85fb3592	[Fix](PhysicalPlanTranslator) forget setPushDownAggNoGrouping in OlapScanNode (#23675 ) * [Fix](PhysicalPlanTranslator) forget setPushDownAggNoGrouping in OlapScanNode * use relation id instead of table id	2023-08-31 11:49:55 +08:00
zxealous	08b4977d44	[fix](deploy) fix deploy manager can't drop node (#23667 )	2023-08-31 10:53:34 +08:00
Mryange	96c4471b4a	[feature](udf) udf array/map support decimal and update doc (#23560 ) * update * decimal * update table name * remove log * add log	2023-08-31 07:44:18 +08:00
Pxl	7f4f39551a	[Bug](materialized-view) fix change base schema when create mv (#23607 ) * fix change base schema when create mv * fix * fix	2023-08-30 21:00:12 +08:00
Jibing-Li	4f26750c91	[Improvement](statistics)Disable file cache while running analysis tasks. (#23663 ) Disable file cache while running analysis tasks. Analyze tasks are background tasks and shouldn't affect users' local cache data.	2023-08-30 20:50:12 +08:00
Mingyu Chen	b7404896fa	[improvement](catalog) avoid calling checksum when replaying creating jdbc catalog and fix ranger issue (#22369 ) 1. jdbc Before, in the constructor of Jdbc catalog, we may call the checksum action of the jdbc driver. But the download link of the jdbc driver may not be available when replaying, causing a replay error. This PR changes the logic to avoid calling checksum when replaying creating jdbc catalog. 2. ranger After this PR, when creating a catalog, it will try to init the access controller to make sure the config is ok. 3. catalog priv check When creating/dropping/altering a catalog, doris will only use the internal access controller to check catalog level priv.	2023-08-30 19:24:11 +08:00
zzzzzzzs	05771e8a14	[Enhancement](Load) stream Load using SQL (#23362 ) Using stream load in SQL mode for example: example.csv 10000,北京 10001,天津 curl -v --location-trusted -u root: -H "sql: insert into test.t1(c1, c2) select c1,c2 from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql curl -v --location-trusted -u root: -H "sql: insert into test.t2(c1, c2, c3) select c1,c2, 'aaa' from stream(\"format\" = \"CSV\", \"column_separator\" = \",\")" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql curl -v --location-trusted -u root: -H "sql: insert into test.t3(c1, c2) select c1, count(1) from stream(\"format\" = \"CSV\", \"column_separator\" = \",\") group by c1" -T example.csv http://127.0.0.1:8030/api/_stream_load_with_sql	2023-08-30 19:02:48 +08:00
starocean999	e1743b70f2	[enhancement](nereids)remove useless cast for floatlike type (#23621 ) convert cast(c1 AS double) > 2.0 to c1 > 2 (c1 is integer like type)	2023-08-30 19:00:16 +08:00
zy-kkk	cc6cbc04af	[improvement](jdbc catalog) Remove useless mysql jdbc connection parameters (#23647 )	2023-08-30 17:47:04 +08:00
谢健	ade598e043	[feature](Nereids): eliminate distinct for max/min/any_value (#23428 ) eliminate distinct for max/min/any_value function ``` max(distinct value) = max(value) ```	2023-08-30 17:23:10 +08:00
谢健	a136836770	[feature](Nereids) add two functions: char and convert (#23104 ) add [char](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/string-functions/char/?_highlight=char) func ``` mysql> select char(68, 111, 114, 105, 115); +--------------------------------------+ \| char('utf8', 68, 111, 114, 105, 115) \| +--------------------------------------+ \| Doris \| +--------------------------------------+ ``` convert func ``` MySQL root@127.0.0.1:(none)> select convert(1 using gbk); +-------------------+ \| convert(1, 'gbk') \| +-------------------+ \| 1 \| +-------------------+ ```	2023-08-30 17:09:06 +08:00
jakevin	509d865760	[feature](Nereids): convert CaseWhen to If (#23040 ) Add a rule to optimize CASE WHEN expression. Rewrite rule to convert CASE WHEN to IF. For example: CASE WHEN a > 1 THEN 1 ELSE 0 END -> IF(a > 1, 1, 0)	2023-08-30 15:47:29 +08:00
Chuang Li	3a0a79b4a0	[Improvement][SparkLoad] Use system env configs when users don't set env configs. (#21837 )	2023-08-30 15:14:40 +08:00
zhangstar333	aef162ad4c	[test](log) add some log in udf function when an exception is thrown (#23651 )	2023-08-30 14:16:05 +08:00
jakevin	4fec0826f8	[fix](Nereids): avoid Exception causing analyze time to be too long (#23627 ) AnyDataType causes toCatalogDataType to throw an Exception, which costs much time. Avoid throwing Exception in Analyzer.	2023-08-30 12:25:31 +08:00
amory	d326cb0c99	[fix](planner) array constructor does type coercion with decimal in the wrong way (#23630 ) array creator with decimal type and integer type parameters should return array<decimal>, but the legacy planner returns array<double>	2023-08-30 11:18:31 +08:00
Guangdong Liu	f786689044	[refactor](TableRowCountAction) Fine-tune sql execution code (#23541 )	2023-08-30 10:11:30 +08:00
Calvin Kirs	ca55bd88ad	[Fix](Job)Fix the window time is not updated when no job is registered (#23628 ) Fix resume job grammar definition is inconsistent Show Job task Add execution results JOB allows to define update operations	2023-08-30 09:48:21 +08:00
morrySnow	e02747e976	[feature](Nereids) support struct type (#23597 ) 1. support struct data type 2. add array / map / struct literal syntax 3. fix array union / intersect / except type coercion 4. fix explicit cast data type check for array 5. fix bound function type coercion	2023-08-29 20:41:24 +08:00
DeadlineFen	4f7e7040ad	[bugfix] (dynamic partition) dynamic partition job is removed when tbl is sync (#23404 )	2023-08-29 20:35:56 +08:00
Siyang Tang	1ac0ff0ea9	[feature](delete-predicate) support delete sub predicate v2 (#22442 ) New structure for delete sub predicate. Delete sub predicate currently uses a string-typed condition_str for temporary storage, and fields are extracted from it using std::regex, which may introduce a stack overflow when matching an extremely large string (a bug of libc). Now we attempt to use a new PB structure to hold the delete sub predicate, to avoid that problem. message DeleteSubPredicatePB { optional int32 column_unique_id = 1; optional string column_name = 2; optional string op = 3; optional string cond_value = 4; } Currently, both versions of the sub predicate will be filled. For query, we use v2, and during compaction we still use v1. For old rowset meta with delete predicates that have sub predicate v1, an attempt is made to convert them to v2 when read from PB. Moreover, efforts will be made to rewrite these meta with the new delete sub predicate. Make preparation to use column unique id to specify a column globally. Using the column unique id rather than the column name to identify a column is vital for flexible schema change. The rewritten delete predicate will attach the column unique id.	2023-08-29 19:37:23 +08:00
Tiewei Fang	103fa4eb55	[feature](Export) support export with nereids (#23319 )	2023-08-29 19:36:19 +08:00
morrySnow	cc1509ba11	[fix](view) The parameter positions of timestamp diff function to sql are reversed (#23601 )	2023-08-29 18:30:16 +08:00

5743 Commits
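
The cache fix described in commit 126606cb4d can be illustrated with a small sketch. This is not Doris code: the `Partition` class and both signature functions are hypothetical names chosen for illustration, and a single table stands in for the sum over tables that the commit message describes. The sketch only shows why a check based on the latest updated partition alone misses a dropped older partition, and why adding the partition count catches it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    # Hypothetical stand-in for a table partition; not a Doris class.
    name: str
    version: int
    update_time: int


def old_signature(partitions):
    """Pre-fix style check: only the most recently updated partition matters."""
    latest = max(partitions, key=lambda p: (p.update_time, p.version))
    return (latest.name, latest.version, latest.update_time)


def new_signature(partitions):
    """Post-fix style check: the same values plus the number of partitions."""
    return old_signature(partitions) + (len(partitions),)


before = [
    Partition("p1", 2, 100),
    Partition("p2", 3, 200),
    Partition("p3", 5, 300),  # the latest updated partition
]
# Drop a partition that is NOT the latest updated one.
after_drop = [p for p in before if p.name != "p1"]

# True means the cache would still be considered valid (a stale hit).
stale_hit_old = old_signature(before) == old_signature(after_drop)
stale_hit_new = new_signature(before) == new_signature(after_drop)
```

Dropping `p1` leaves the most recently updated partition untouched, so the old signature is unchanged and the stale result would still be served; only the signature that includes the partition count differs.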