doris

Author	SHA1	Message	Date
starocean999	bcc37c9405	[fix](planner)the common type of floating and decimal should be floating type (#20634 ) * fix test cases	2023-06-12 11:32:23 +08:00
Jibing-Li	87bc405c41	[Improvement](statistics)Support external table partition statistics (#20415 ) Support collecting statistics for specific partitions of an HMS external table. Add session variables to limit the partitions scanned when collecting the whole-table row count and column statistics.	2023-06-10 12:28:53 +08:00
caoliang-web	079fb0e56d	[improvement](config)update FE config max_running_txn_num_per_db default value (#20478 ) Old value: 100; new value: 1000	2023-06-09 08:54:37 +08:00
Gabriel	325ddab34e	[conf](pipeline) turn pipeline on by default (#20458 )	2023-06-08 09:20:51 +08:00
zhangdong	c910e9b78b	[doc](disk)fix disk capacity doc error (#20506 )	2023-06-07 15:20:04 +08:00
Yang, Xu	d02737a293	[feature](struct-type) support struct_element function (#19045 ) This commit adds a function that returns a field column from a named struct column. Since the function can return any type, this commit also adds support for ANY_STRUCT_TYPE and ANY_ELEMENT_TYPE.	2023-06-06 10:44:08 +08:00
amory	59a0f80233	[Improve](array-function)Improve array function intersect (#20085 ) Currently the array function supports only 2 arrays, but the intersect operator can support more than 2 arrays.	2023-06-05 10:38:48 +08:00
Pxl	8e39f0cf6b	[Enchancement](Agg State) storage function name and result is nullable in agg state type (#20298 )	2023-06-04 22:44:48 +08:00
morrySnow	422fcd6377	[fix](Nereids) forbid unexpected expression on filter and fix two more bugs (#20331 ) Fixes the following bugs: 1. The filter's expression was not checked; aggregate functions, grouping scalar functions and window expressions should not appear in a filter. 2. The nullability of an aggregate function should not change when it is a window function in a window expression. 3. Bitmap and other metric types should not appear in the ORDER BY or PARTITION BY of a window expression.	2023-06-02 16:19:50 +08:00
AKIRA	e32eba8fdf	[refactor](stats) Persist status of analyze task to FE meta data (#20264 ) 1. Previously, a BE table named `analysis_jobs` was used to persist the status of analyze jobs/tasks, which had many flaws; for example, if a BE crashed, the analyze job/task would fail but its status could not be updated. 2. Support `DROP ANALYZE JOB [job_id]` to delete an analyze job. 3. Support `SHOW ANALYZE TASK STATUS [job_id] ` to get the task status of a specific job. 4. Restrict the execution condition of auto analyze: it can run again only when the last auto analyze job finished a while ago. 5. Support analyzing a whole DB.	2023-06-02 12:33:31 +08:00
xy720	5a3b97bbf2	[enhancement](struct-type)support comment for struct field (#20200 )	2023-06-02 10:29:56 +08:00
xueweizhang	ecdc5124be	[feature-wip](duplicate-no-keys) schame change support for duplicate no keys (#19326 )	2023-06-02 09:22:41 +08:00
Mryange	519f01133a	[feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811 )	2023-06-01 13:09:58 +08:00
Lijia Liu	f9dfcb923d	[Enhancement] Change Create Resource Group Grammar (#20249 )	2023-05-31 15:23:24 +08:00
wangbo	6f68ec9de0	support query queue (#20048 )	2023-05-30 19:52:27 +08:00
Mingyu Chen	0c98355fff	[fix](catalog) fix create catalog with resource replay issue and kerberos auth issue (#20137 ) 1. Fix the replay bug when creating a catalog with a resource. If a user creates a catalog using `create catalog hive with resource xxx`, the resource may be dropped while replaying the edit log, causing an NPE, and the FE fails to start. This PR adds a new FE config `disallow_create_catalog_with_resource`, default true, so that `with resource` is not allowed; it will be deprecated later. The replay bug is also fixed to avoid the NPE. 2. Fix an issue when creating 2 hive catalogs that connect with and without kerberos authentication. When a user creates 2 hive catalogs, one using simple auth and the other using kerberos auth, a query may fail with an error like: `Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.` This PR adds a default property for hive catalogs: `"ipc.client.fallback-to-simple-auth-allowed" = "true"`, which is added automatically when a user creates a hive catalog, to avoid this problem. 3. Fix an issue when calling `hdfsExists()`: when `hdfsExists()` returns a non-zero code, check whether it encountered an error or the file was not found. 4. Some code refactoring: avoid importing `org.apache.parquet.Strings`.	2023-05-30 16:57:39 +08:00
Long Zhao	d76be1315f	[BUG]storage_min_left_capacity_bytes default value has integer overflow #19943	2023-05-29 19:50:31 +08:00
AKIRA	cc47ee480c	[feat](stats) delete data size stat and Made task timeout configurable (#20090 ) 1. Delete the stats for data size, since collecting them costs too much time and is of little use. 2. Make the task timeout configurable, since it is common to analyze a very large table for which the default 10 min is not suitable.	2023-05-29 16:40:59 +08:00
Gabriel	55ccddb62c	[Conf](decimalv3) enable decimalv3 by default	2023-05-29 15:38:31 +08:00
Changming Xiao	5f9c6e076f	[Fix](load)Make insert timeout accurate in `show load` statistics (#20068 )	2023-05-28 21:19:06 +08:00
YueW	ae352997b4	[Enhancement](alter inverted index) Improve alter inverted index performance with light weight add or drop inverted index (#19063 )	2023-05-28 11:23:07 +08:00
HappenLee	9539bbf8ae	Revert "[test](executor)add crud regression test for resource group (#19659 )" (#20121 ) This reverts commit 8b9813663d87afa7b359b31782f3864dc54881df.	2023-05-27 08:25:00 +08:00
Jack Drogon	93933308e6	[Feature-WIP](CCR): Add ccr doris interface (WIP) (#17881 )	2023-05-26 23:40:49 +08:00
qiye	04415d0b35	[opt](balance) add config balance_slot_num_per_path (#19869 ) Make balance_slot_num_per_path configurable.	2023-05-25 13:39:42 +08:00
starocean999	70f2e8ff80	[fix](nereids)enable decimalv3 by default for nereids (#19906 )	2023-05-24 13:36:24 +08:00
amory	b4669eaeba	[Improve](complex-type)add switch for array/struct/map nesting complex type (#19928 ) BE does not yet support nesting array/map/struct within each other for many operations. If FE does not prohibit it, many undefined behaviors occur in BE, so this commit adds a switch to prohibit nested complex types. Once nesting is fully supported, the switch can enable it.	2023-05-24 11:39:53 +08:00
Pxl	e9223f6a19	[Feature](aggregation) add agg_state define and ddl support (#19824 )	2023-05-22 11:45:53 +08:00
wangbo	8b9813663d	[test](executor)add crud regression test for resource group (#19659 )	2023-05-20 13:49:02 +08:00
zhangdong	a81db3e984	[improvement](FQDN) broker support fqdn (#19821 ) 1.broker support fqdn 2.change 'master_only' attr of 'enable_fqdn_mode'	2023-05-20 11:25:58 +08:00
amory	67dc68630b	[Improve](complex-type)improve array/map/struct creating and function with decimalv3 (#19830 )	2023-05-19 17:43:36 +08:00
airborne12	9d54545bac	[Fix](inverted index) add datev2/datetimev2 for inverted index column type (#19845 ) When we try to query array of datetimev2 column by inverted index, it returns an error like this: CREATE TABLE `nested` ( `qid` bigint(20) NULL, `tag` array<text> NULL, `creationDate` datetime NULL, `title` text NULL, `user` text NULL, `answers.user` array<text> NULL, `answers.date` array<datetimev2(0)> NULL, INDEX tag_idx (`tag`) USING INVERTED PROPERTIES("parser" = "english") COMMENT '', INDEX creation_date_idx (`creationDate`) USING INVERTED COMMENT '', INDEX title_idx (`title`) USING INVERTED COMMENT '', INDEX user_idx (`user`) USING INVERTED COMMENT '', INDEX answers_user_idx (`answers.user`) USING INVERTED COMMENT '', INDEX answers_date_idx (`answers.date`) USING INVERTED COMMENT '' ) ENGINE=OLAP DUPLICATE KEY(`qid`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`qid`) BUCKETS 18 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "storage_format" = "V2", "compression" = "ZSTD", "light_schema_change" = "true", "dynamic_schema" = "true", "disable_auto_compaction" = "false" ); mysql> select * from nested.nested where tag match 'java' and `answers.date` element_le '2012-04-08T21:15:33.873Z' limit 10; ERROR 1105 (HY000): errCode = 2, detailMessage = no function found for MATCH_ELEMENT_LE,`answers.date` MA	2023-05-19 14:57:01 +08:00
Mingyu Chen	14620a6766	[minor](log) add details for unqueryable replicas (#19792 ) Add a new FE config: show_details_for_unaccessible_tablet. Default is false, when set to true, if a query is unable to select a healthy replica, the detailed information of all the replicas of the tablet including the specific reason why they are unqueryable, will be printed out.	2023-05-19 08:53:57 +08:00
Kang	294599ee45	[feature](jsonb) rename JSONB type name and function name to JSON (#19774 ) To be more compatible with MySQL, rename the JSONB type name and function names to JSON. The old JSONB type name and jsonb_xx functions can still be used for backward compatibility. The jsonb_extract function remains, since json_extract is already used by the JSON string functions and changing it needs more work. It will be changed later.	2023-05-18 16:16:52 +08:00
mch_ucchi	1d05feea1b	[Feature](Nereids) add executable function to support fold constant for functions (#18209 ) 1. Add date-time functions for fold constant for Nereids. This is the list of executable date-time function nereids supports up to now: - now() - now(int) - current_timestamp() - current_timestamp(int) - localtime() - localtimestamp() - curdate() - current_date() - curtime() - current_time() - date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}() - datediff() - {date/datev2}() - {year/quarter/month/day/hour/minute/second}() - dayof{year/month/week}() - date_format() - date_trunc() - from_days() - last_day() - to_monday() - from_unixtime() - unix_timestamp() - utc_timestamp() - to_date() - to_days() - str_to_date() - makedate() 2. solved problem: - enable datev2/datetimev2 default. - refactor Nereids foldConstantOnFE and support fold nested expression. - separate the executable into multi-files for easily-reading and adding new functions	2023-05-17 21:26:31 +08:00
Ashin Gau	30c4f25cb3	[fix](multi-catalog) verify the precision of datetime types for each data source (#19544 ) Fix three bugs of timestampv2 precision: 1. Hive catalog doesn't set the precision of timestampv2 and can't get the precision from hive metastore, so set the largest precision for timestampv2. 2. Jdbc catalog uses datetimev1 to parse timestamp and converts it to timestampv2, so the precision is lost. 3. TVF doesn't use the precision from the metadata of the file format.	2023-05-17 20:50:15 +08:00
Pxl	4eb2604789	[Bug](function) fix function define of Retention inconsist and change some static_cast to assert cast (#19455 ) 1. Fix the inconsistent function definition of `Retention`: this function returns tinyint on `FE` and uint8 on `BE`. 2. Make assert_cast support casting to a derived type. 3. Change some static_cast to assert_cast. 4. Support sum(bool)/avg(bool).	2023-05-15 11:50:02 +08:00
Zhengguo Yang	6748ae4a57	[Feature] Collect the information statistics of the query hit (#18805 ) 1. Show the query hit statistics for `baseall` ```sql MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 0 \| 0 \| \| k1 \| 0 \| 0 \| \| k2 \| 0 \| 0 \| \| k3 \| 0 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 0 \| 0 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.002 sec) MySQL [test_query_db]> select k0, k1,k2, sum(k3) from baseall where k9 > 1 group by k0,k1,k2; +------+------+--------+-------------+ \| k0 \| k1 \| k2 \| sum(`k3`) \| +------+------+--------+-------------+ \| 0 \| 6 \| 32767 \| 3021 \| \| 1 \| 12 \| 32767 \| -2147483647 \| \| 0 \| 3 \| 1989 \| 1002 \| \| 0 \| 7 \| -32767 \| 1002 \| \| 1 \| 8 \| 255 \| 2147483647 \| \| 1 \| 9 \| 1991 \| -2147483647 \| \| 1 \| 11 \| 1989 \| 25699 \| \| 1 \| 13 \| -32767 \| 2147483647 \| \| 1 \| 14 \| 255 \| 103 \| \| 0 \| 1 \| 1989 \| 1001 \| \| 0 \| 2 \| 1986 \| 1001 \| \| 1 \| 15 \| 1992 \| 3021 \| +------+------+--------+-------------+ 12 rows in set (0.050 sec) MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 1 \| 0 \| \| k1 \| 1 \| 0 \| \| k2 \| 1 \| 0 \| \| k3 \| 1 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 1 \| 1 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.001 sec) ``` 2. 
Show the query hit statistics summary for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all; +-----------+------------+ \| IndexName \| QueryCount \| +-----------+------------+ \| baseall \| 1 \| +-----------+------------+ 1 row in set (0.005 sec) ``` 3. Show the query hit statistics detail info for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all verbose; +-----------+-------+------------+-------------+ \| IndexName \| Field \| QueryCount \| FilterCount \| +-----------+-------+------------+-------------+ \| baseall \| k0 \| 1 \| 0 \| \| \| k1 \| 1 \| 0 \| \| \| k2 \| 1 \| 0 \| \| \| k3 \| 1 \| 0 \| \| \| k4 \| 0 \| 0 \| \| \| k5 \| 0 \| 0 \| \| \| k6 \| 0 \| 0 \| \| \| k10 \| 0 \| 0 \| \| \| k11 \| 0 \| 0 \| \| \| k7 \| 0 \| 0 \| \| \| k8 \| 0 \| 0 \| \| \| k9 \| 1 \| 1 \| \| \| k12 \| 0 \| 0 \| \| \| k13 \| 0 \| 0 \| +-----------+-------+------------+-------------+ 14 rows in set (0.017 sec) ``` 4. Show the query hit for a database ```sql MySQL [test_query_db]> show query stats for test_query_db; +----------------------------+------------+ \| TableName \| QueryCount \| +----------------------------+------------+ \| compaction_tbl \| 0 \| \| bigtable \| 0 \| \| empty \| 0 \| \| tempbaseall \| 0 \| \| test \| 0 \| \| test_data_type \| 0 \| \| test_string_function_field \| 0 \| \| baseall \| 1 \| \| nullable \| 0 \| +----------------------------+------------+ 9 rows in set (0.005 sec) ``` 5. Show query hit statistics for all the databases ```sql MySQL [(none)]> show query stats; +-----------------+------------+ \| Database \| QueryCount \| +-----------------+------------+ \| test_query_db \| 1 \| +-----------------+------------+ 1 rows in set (0.005 sec) ```	2023-05-15 10:56:34 +08:00
Mingyu Chen	26e930eed1	[Fix](multi-catalog) Make BE selection policy works fine when enable prefer_compute_node_for_external_table (#19346 )	2023-05-12 15:32:50 +08:00
Chuang Li	a041f8eabe	[fix](fe) Fx SimpleDateFormatter thread unsafe issue by replacing to DateTimeFormatter. (#19265 ) DateTimeFormatter replace SimpleDateFormat in fe module because SimpleDateFormat is not thread-safe.	2023-05-11 22:50:24 +08:00
AKIRA	6d2070c59d	[enhancement](stats) Make stats cache item size configurable (#19205 )	2023-05-11 13:59:37 +08:00
Mryange	d20b5f90d8	[feature](executor) Automatically set the instance_num using the info from be. (#19345 ) 1. Fixed some failing regression tests (wrong results with a large instance_num due to an incorrect order by). 2. If parallel_fragment_exec_instance_num is set to 0, the concurrency in the Pipeline execution engine is automatically set to half the number of CPU cores. 3. Add a limit so that parallel_fragment_exec_instance_num cannot be set to more than fe.conf::max_instance_num (default: 128) ``` mysql [(none)]>set parallel_fragment_exec_instance_num = 514; ERROR 1231 (42000): errCode = 2, detailMessage = Variable 'parallel_fragment_exec_instance_num' can't be set to the value of '514(Should not be set to more than 128)' ```	2023-05-10 17:07:41 +08:00
Jibing-Li	78435823b6	[Fix](multi catalog)Return all partition values while reading hive table. (#19434 ) Return all partition values while reading hive table. Add a config item for the max value of hive table to partition list cache. Default value is 100.	2023-05-10 10:55:33 +08:00
ZashJie	4302ceaee8	[Improvement](data types) enhance show data types stmt (#18831 )	2023-05-09 09:42:44 +08:00
Tiewei Fang	e78149cb65	[Enhencement](Export) add property for outfile/export and add test (#18997 ) This pr does three things: 1. add `delete_existing_files` property for outfile/export. If `delete_existing_files = true`, export/outfile will delete all files under file_path first. 2. add p2 test for export 3. modify docs	2023-05-08 14:02:20 +08:00
Mingyu Chen	abc73ac1eb	[refactor](cluster)(step-1) remove cluster related stmt (#19355 )	2023-05-07 18:44:42 +08:00
ElvinWei	3f6e5118e6	[enchancement](statistics) support periodic collection of statistics (#19247 ) This PR enables periodic collection of statistics and is a precursor to automatic statistics collection. It mainly includes the following: Support periodic collection of statistics. Change the type of Date in statistics p0 to DateV2 (see [Enhancement](data-type) add FE config to prohibit create date and decimalv2 type #19077) for local testing. Complement cases (remove Chinese characters, optimize code, etc.) and improve stability. Support setting whether to keep records of statistics synchronization job info, convenient for use in p0 testing. The statistics job table was modified, and some auxiliary checks were added so that users do not notice the modification. This function will be removed once the table schema is stable.	2023-05-06 14:53:06 +08:00
Luwei	3287f350de	[feature](table) implement the round robin selection be when create tablet (#19167 )	2023-05-06 14:46:48 +08:00
Mingyu Chen	70236adc1f	[Refactor](doc)(config)(variable) use script to generate doc for FE config and session variables (#19246 ) The documentation of configs (FE and BE) and session variables is hard to maintain, because developers need to modify both the code and the documentation, and some configs' documentation is missing. So I plan to write the documentation of configs and variables directly in code and use a script to generate the documentation automatically. How To: This CL mainly makes the following changes: Add fields to the annotations of Config and Session Variables: description: the description of the config or variable item. It is a String array; the first element is in Chinese, the second is in English. options: the valid options if the config or variable is an enum. Add a script docs/generate-config-and-variable-doc.sh. Simply run sh docs/generate-config-and-variable-doc.sh and it will generate docs for FE configs and variables and save them under docs/admin-manual/config/fe-config.md and docs/advanced/variables.md, both in Chinese and in English. There are template markdown files for this script to read and fill in with the real doc content. TODO: Too many descriptions need to be filled in; I will finish them in the next PR. For now the original docs remain unchanged. Find a way to check the description field of configs and variables, to make sure we won't miss it. Generate docs for BE config.	2023-05-05 14:42:43 +08:00
Calvin Kirs	5459cd9c30	[Improve](fe)Upgrade dependencies and optimize jar package management (#18882 ) bind netty-version to 4.1.89-final bind jettison to 1.5.4 upgrade hadoop version to 3.3.5 upgrade range-plugins-common to 2.4.0 bind bcprov-jdk15on to 2.4.0 upgrade and bind woodstox to 6.5.1 upgrade and bind kerby to 2.0.3 upgrade hudi to 0.13.0 upgrade parquet to 1.13.0 upgrade maven-source-plugin to 3.2.1 upgrade maven-assembly-plugin to 3.3.0 upgrade maven-javadoc-plugin to 3.3.2 upgrade maven-shade-plugin to 3.3.4 upgrade maven-clean-plugin to 3.1.0 Remove meaningless plugins Optimize doris maven path Unify the Java modules for management in fe	2023-05-04 10:07:37 +08:00
Zhengguo Yang	43e70ab252	[chore](recover) add a config to recover remaining data in emergency (#18986 )	2023-04-28 17:42:00 +08:00
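
Note on commit a041f8eabe (SimpleDateFormat replaced by DateTimeFormatter): the reason given in that message is that SimpleDateFormat is not thread-safe. A minimal Java sketch of the replacement pattern follows. The class name `TimeFormatSketch`, its methods, and the pattern string are hypothetical and are not taken from the Doris code base; the only point shown is that a single `java.time.format.DateTimeFormatter` instance can be shared across threads because it is immutable.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not the actual Doris FE code.
final class TimeFormatSketch {
    // DateTimeFormatter is immutable and thread-safe, so one shared
    // static instance is fine. A shared SimpleDateFormat would not be.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Format a timestamp with the shared formatter.
    static String format(LocalDateTime time) {
        return FORMAT.format(time);
    }

    // Parse a timestamp with the same shared formatter.
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMAT);
    }

    public static void main(String[] args) {
        LocalDateTime time = LocalDateTime.of(2023, 5, 11, 22, 50, 24);
        System.out.println(format(time)); // prints 2023-05-11 22:50:24
    }
}
```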


183 Commits
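
Note on commit d20b5f90d8 (automatic instance_num): the rule described in that message can be sketched in Java as below. This is only an illustration under stated assumptions, not the Doris implementation: the class `InstanceNumSketch`, the method `resolve`, and its parameters are hypothetical, and the floor of 1 for single-core machines and the rejection of negative values are assumptions that the commit message does not state.

```java
// Hypothetical sketch of the rule in the commit message; not Doris code.
final class InstanceNumSketch {
    /**
     * @param requested      value given for parallel_fragment_exec_instance_num
     * @param cpuCores       number of CPU cores reported by the BE (assumed input)
     * @param maxInstanceNum upper limit, e.g. max_instance_num (default 128)
     */
    static int resolve(int requested, int cpuCores, int maxInstanceNum) {
        if (requested < 0) {
            // Assumption: negative values are rejected.
            throw new IllegalArgumentException("instance num must be >= 0");
        }
        if (requested > maxInstanceNum) {
            // Mirrors the error shown in the commit message.
            throw new IllegalArgumentException(
                    "Should not be set to more than " + maxInstanceNum);
        }
        if (requested == 0) {
            // 0 means "automatic": half of the CPU cores.
            // Assumption: never go below 1.
            return Math.max(1, cpuCores / 2);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(resolve(0, 16, 128)); // prints 8
        System.out.println(resolve(4, 16, 128)); // prints 4
    }
}
```

With these assumptions, `resolve(514, 16, 128)` throws, matching the `Should not be set to more than 128` error quoted in the commit message.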