doris

Author	SHA1	Message	Date
Pxl	4eb2604789	[Bug](function) fix inconsistent function definition of Retention and change some static_cast to assert_cast (#19455) 1. Fix the inconsistent definition of `Retention`: the function returned tinyint on `FE` but uint8 on `BE`. 2. Make assert_cast support casting to derived types. 3. Change some static_cast calls to assert_cast. 4. Support sum(bool)/avg(bool).	2023-05-15 11:50:02 +08:00
Zhengguo Yang	6748ae4a57	[Feature] Collect the information statistics of the query hit (#18805 ) 1. Show the query hit statistics for `baseall` ```sql MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 0 \| 0 \| \| k1 \| 0 \| 0 \| \| k2 \| 0 \| 0 \| \| k3 \| 0 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 0 \| 0 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.002 sec) MySQL [test_query_db]> select k0, k1,k2, sum(k3) from baseall where k9 > 1 group by k0,k1,k2; +------+------+--------+-------------+ \| k0 \| k1 \| k2 \| sum(`k3`) \| +------+------+--------+-------------+ \| 0 \| 6 \| 32767 \| 3021 \| \| 1 \| 12 \| 32767 \| -2147483647 \| \| 0 \| 3 \| 1989 \| 1002 \| \| 0 \| 7 \| -32767 \| 1002 \| \| 1 \| 8 \| 255 \| 2147483647 \| \| 1 \| 9 \| 1991 \| -2147483647 \| \| 1 \| 11 \| 1989 \| 25699 \| \| 1 \| 13 \| -32767 \| 2147483647 \| \| 1 \| 14 \| 255 \| 103 \| \| 0 \| 1 \| 1989 \| 1001 \| \| 0 \| 2 \| 1986 \| 1001 \| \| 1 \| 15 \| 1992 \| 3021 \| +------+------+--------+-------------+ 12 rows in set (0.050 sec) MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 1 \| 0 \| \| k1 \| 1 \| 0 \| \| k2 \| 1 \| 0 \| \| k3 \| 1 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 1 \| 1 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.001 sec) ``` 2. Show the query hit statistics summary for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all; +-----------+------------+ \| IndexName \| QueryCount \| +-----------+------------+ \| baseall \| 1 \| +-----------+------------+ 1 row in set (0.005 sec) ``` 3. Show the query hit statistics detail info for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all verbose; +-----------+-------+------------+-------------+ \| IndexName \| Field \| QueryCount \| FilterCount \| +-----------+-------+------------+-------------+ \| baseall \| k0 \| 1 \| 0 \| \| \| k1 \| 1 \| 0 \| \| \| k2 \| 1 \| 0 \| \| \| k3 \| 1 \| 0 \| \| \| k4 \| 0 \| 0 \| \| \| k5 \| 0 \| 0 \| \| \| k6 \| 0 \| 0 \| \| \| k10 \| 0 \| 0 \| \| \| k11 \| 0 \| 0 \| \| \| k7 \| 0 \| 0 \| \| \| k8 \| 0 \| 0 \| \| \| k9 \| 1 \| 1 \| \| \| k12 \| 0 \| 0 \| \| \| k13 \| 0 \| 0 \| +-----------+-------+------------+-------------+ 14 rows in set (0.017 sec) ``` 4. Show the query hit for a database ```sql MySQL [test_query_db]> show query stats for test_query_db; +----------------------------+------------+ \| TableName \| QueryCount \| +----------------------------+------------+ \| compaction_tbl \| 0 \| \| bigtable \| 0 \| \| empty \| 0 \| \| tempbaseall \| 0 \| \| test \| 0 \| \| test_data_type \| 0 \| \| test_string_function_field \| 0 \| \| baseall \| 1 \| \| nullable \| 0 \| +----------------------------+------------+ 9 rows in set (0.005 sec) ``` 5. Show query hit statistics for all the databases ```sql MySQL [(none)]> show query stats; +-----------------+------------+ \| Database \| QueryCount \| +-----------------+------------+ \| test_query_db \| 1 \| +-----------------+------------+ 1 rows in set (0.005 sec) ```	2023-05-15 10:56:34 +08:00
Mingyu Chen	26e930eed1	[Fix](multi-catalog) Make BE selection policy works fine when enable prefer_compute_node_for_external_table (#19346 )	2023-05-12 15:32:50 +08:00
Chuang Li	a041f8eabe	[fix](fe) Fix SimpleDateFormat thread-safety issue by replacing it with DateTimeFormatter. (#19265) Replace SimpleDateFormat with DateTimeFormatter in the fe module, because SimpleDateFormat is not thread-safe.	2023-05-11 22:50:24 +08:00
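A minimal sketch of the pattern this fix describes, not the actual Doris code: `DateTimeFormatter` is immutable, so a single shared instance can be used from many threads, whereas a shared `SimpleDateFormat` cannot. The class name `SafeTimeFormat` and the pattern string are assumptions chosen for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the name and pattern are illustrative only.
final class SafeTimeFormat {
    // DateTimeFormatter is immutable and thread-safe, so one static
    // instance can be shared without synchronization.
    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private SafeTimeFormat() {
    }

    // Format a timestamp as text using the shared formatter.
    static String format(LocalDateTime time) {
        return FORMATTER.format(time);
    }

    // Parse text produced by format() back into a timestamp.
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }
}
```

With `SimpleDateFormat`, the equivalent shared static field would need external locking or a per-thread copy, because the formatter mutates internal state while formatting and parsing.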
AKIRA	6d2070c59d	[enhancement](stats) Make stats cache item size configurable (#19205 )	2023-05-11 13:59:37 +08:00
Mryange	d20b5f90d8	[feature](executor) Automatically set the instance_num using the info from be. (#19345) 1. Fix some failing regression cases (wrong results with a large instance_num due to an incorrect order by). 2. If parallel_fragment_exec_instance_num is set to 0, the concurrency in the Pipeline execution engine is automatically set to half the number of CPU cores. 3. Add a limit so that parallel_fragment_exec_instance_num cannot be set to more than fe.conf::max_instance_num (default: 128). ``` mysql [(none)]>set parallel_fragment_exec_instance_num = 514; ERROR 1231 (42000): errCode = 2, detailMessage = Variable 'parallel_fragment_exec_instance_num' can't be set to the value of '514(Should not be set to more than 128)' ```	2023-05-10 17:07:41 +08:00
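The two rules in this commit (0 means "half the CPU cores", and values above max_instance_num are rejected) can be sketched as below. This is an illustration under stated assumptions, not the Doris implementation: `InstanceNumResolver` and `resolve` are hypothetical names, the core count is passed in rather than read from BE, and rounding up to at least 1 instance is an assumption the commit message does not state.

```java
// Hypothetical helper; names and the "at least 1" floor are assumptions.
final class InstanceNumResolver {
    private InstanceNumResolver() {
    }

    // requested:      value given for parallel_fragment_exec_instance_num
    // cpuCores:       number of CPU cores reported for the backend
    // maxInstanceNum: upper limit, e.g. 128 by default per the commit
    static int resolve(int requested, int cpuCores, int maxInstanceNum) {
        if (requested < 0) {
            throw new IllegalArgumentException("instance num must not be negative");
        }
        if (requested > maxInstanceNum) {
            throw new IllegalArgumentException(
                    "Should not be set to more than " + maxInstanceNum);
        }
        if (requested == 0) {
            // Automatic mode: half of the CPU cores, never less than 1.
            return Math.max(1, cpuCores / 2);
        }
        return requested;
    }
}
```

For example, `resolve(0, 16, 128)` yields 8, while `resolve(514, 16, 128)` is rejected, matching the error shown in the commit message.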
Jibing-Li	78435823b6	[Fix](multi catalog)Return all partition values while reading hive table. (#19434 ) Return all partition values while reading hive table. Add a config item for the max value of hive table to partition list cache. Default value is 100.	2023-05-10 10:55:33 +08:00
ZashJie	4302ceaee8	[Improvement](data types) enhance show data types stmt (#18831 )	2023-05-09 09:42:44 +08:00
Tiewei Fang	e78149cb65	[Enhancement](Export) add property for outfile/export and add test (#18997) This PR does three things: 1. Add the `delete_existing_files` property for outfile/export. If `delete_existing_files = true`, export/outfile will first delete all files under file_path. 2. Add a p2 test for export. 3. Modify the docs.	2023-05-08 14:02:20 +08:00
Mingyu Chen	abc73ac1eb	[refactor](cluster)(step-1) remove cluster related stmt (#19355 ) * [refactor](cluster)(step-1) remove cluster stmt	2023-05-07 18:44:42 +08:00
ElvinWei	3f6e5118e6	[enhancement](statistics) support periodic collection of statistics (#19247) This PR enables periodic collection of statistics and is a precursor to automatic statistics collection. It mainly includes the following: 1. Support periodic collection of statistics. 2. Change the type of Date in statistics p0 to DateV2 (see [Enhancement](data-type) add FE config to prohibit create date and decimalv2 type #19077) for local testing. 3. Complement the cases (remove Chinese characters, optimize code, etc.) and improve stability. 4. Support setting whether to keep records of statistics synchronization job info, which is convenient for p0 testing. 5. The statistics job table was modified, and some auxiliary judgments were added so that users do not perceive the modification. This logic will be removed once the table schema is stable.	2023-05-06 14:53:06 +08:00
Luwei	3287f350de	[feature](table) implement the round robin selection be when create tablet (#19167 )	2023-05-06 14:46:48 +08:00
Mingyu Chen	70236adc1f	[Refactor](doc)(config)(variable) use script to generate doc for FE config and session variables (#19246) The documentation of configs (FE and BE) and session variables is hard to maintain, because developers need to modify both the code and the documentation, and the documentation of some configs is missing. So I plan to write the documentation of configs and variables directly in the code and use a script to generate the documents automatically. How To: This CL mainly changes: 1. Add fields to the annotations of Config and session variables: description: the description of the config or variable item. It is a String array whose first element is in Chinese and whose second is in English. options: the valid options if the config or variable is an enum. 2. Add a script docs/generate-config-and-variable-doc.sh. Simply run sh docs/generate-config-and-variable-doc.sh and it will generate the docs of FE config and variables and save them under docs/admin-manual/config/fe-config.md and docs/advanced/variables.md, in both Chinese and English. There are template markdown files for this script to read and fill with the real doc content. TODO: 1. Too many descriptions need to be filled in; I will finish them in the next PR. For now the original docs remain unchanged. 2. Find a way to check the description field of configs and variables, to make sure we won't miss it. 3. Generate docs for BE config.	2023-05-05 14:42:43 +08:00
Zhengguo Yang	43e70ab252	[chore](recover) add a config to recover remaining data in emergency (#18986 )	2023-04-28 17:42:00 +08:00
WenYao	5e9c0c3500	[Enhancement](data-type) add FE config to prohibit create date and decimalv2 type (#19077 ) * prohibits date and decimal type * add config in test	2023-04-28 11:31:51 +08:00
xueweizhang	f9f5bbde6f	[feature-wip](duplicate_no_keys) add create duplicate table without keys (#18758 )	2023-04-27 09:59:56 +08:00
AKIRA	270be55c4c	[feat](stats) Add option to config file to enable or disable analyze function (#19062) Add this option to the config: /** * If set false, user couldn't submit analyze SQL and FE won't allocate any related resources. */ @ConfField public static boolean enable_stats = true; It is checked when analyzing analyze-related statements and when initializing the analyze manager.	2023-04-26 13:37:08 +08:00
Qi Chen	61b7a52444	[Enhancement](multi-catalogs) Use decimal V3 type in multi-catalogs module. (#18926 ) 1. Use decimal V3 type in JDBC and Iceberg tables. 2. Fix hdfs TVF decimal V3 type and regression test.	2023-04-25 14:49:40 +08:00
WenYao	fd4576e420	[Fix](auth) fix some problem of skip_localhost_auth_check in FE config #18996	2023-04-25 09:10:01 +08:00
Xiangyu Wang	2d7903e2bd	[Feature](multi-catalog) support query hive views. (#18815) A very simple implementation for querying Hive views; it is an EXPERIMENTAL feature. We parse the DDL of the Hive view and try to execute the query, relying on the fact that HiveQL is very similar to Doris SQL. If the DDL of the Hive view uses complicated or incompatible grammar, the query might fail.	2023-04-24 08:49:26 +08:00
WenYao	166bed11d4	[Enhancement](auth) Forbid logging in to Doris from 127.0.0.1 without password (#18816) * forbid login from 127.0.0.1 without password * add localhost limit * rename	2023-04-23 13:56:31 +08:00
Xiaocc	3007cd49f2	[enhancement](mysql) enable two-way ssl authentication (#18530 ) According to the mysql-ssl, enable two-way SSL authentication.	2023-04-21 14:39:14 +08:00
Tiewei Fang	8e2146f48c	[Enhancement](Export) support export with outfile syntax (#18325) `Export` syntax provides an asynchronous export function, but `Export` is not vectorized. `Outfile` syntax provides a synchronous export function. So we can reimplement the export syntax with the outfile syntax.	2023-04-20 17:27:04 +08:00
zclllyybb	fb377a9da9	[Improvement](functions)Optimized some datetime function's return value (#18369 )	2023-04-19 15:51:11 +08:00
luozenglin	5c076b738b	[improvement](resource-group) add test for resource group (#18575 ) Co-authored-by: wangbo <youseebiggirl_t_t@qq.com>	2023-04-18 20:20:50 +08:00
Gabriel	5300b21db7	[Bug](DECIMALV3) report failure if a decimal value is overflow (#18336 )	2023-04-17 13:18:14 +08:00
Mingyu Chen	1cbbc60822	[feature](config) support "experimental" prefix for FE config (#18699) Each release of Doris contains some experimental features. These features may not be stable or mature enough, and users have to enable them by setting a config or session variable, e.g. set enable_mtmv = true; otherwise, they are disabled by default. We should explicitly tell users which features are experimental, so that they notice this and can decide whether to use them. Changes: In this PR, I support the experimental_ prefix for FE configs and session variables. Session Variable: Take enable_nereids_planner as an example. The Nereids planner is an experimental feature in Doris, so there is an EXPERIMENTAL annotation for it: @VariableMgr.VarAttr(..., expType = ExperimentalType.EXPERIMENTAL) private boolean enableNereidsPlanner = false; For compatibility, users can set it with either name: set enable_nereids_planner = true; set experimental_enable_nereids_planner = true; show variables only shows the experimental_enable_nereids_planner entry. You can also see all experimental session variables with: show variables like "%experimental%" Config: Same as session variables; take enable_mtmv as an example. @ConfField(..., expType = ExperimentalType.EXPERIMENTAL) public static boolean enable_mtmv = false; Users can set it in fe.conf or with the ADMIN SET FRONTEND CONFIG stmt using either name: enable_mtmv experimental_enable_mtmv And users can see all experimental FE configs with: ADMIN SHOW FRONTEND CONFIG LIKE "%experimental%"; TODO: 1. Support this feature for BE config. 2. Only add experimental for: enable_pipeline_engine, enable_nereids_planner, enable_single_replica_insert, and for the FE configs: enable_mtmv, enable_ssl, enable_fqdn_mode. Other configs and session variables should be modified as well.	2023-04-16 18:32:10 +08:00
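The naming rule described in this commit (a setting can be written with or without the experimental_ prefix, but is displayed with the prefix when it is experimental) can be sketched as follows. This is only an illustration of the rule, not the Doris code: `ExperimentalNames`, `canonical` and `displayName` are hypothetical names.

```java
// Hypothetical helper illustrating the experimental_ prefix rule.
final class ExperimentalNames {
    static final String PREFIX = "experimental_";

    private ExperimentalNames() {
    }

    // Map either spelling to the underlying name, so that
    // "experimental_enable_mtmv" and "enable_mtmv" hit the same entry.
    static String canonical(String name) {
        return name.startsWith(PREFIX) ? name.substring(PREFIX.length()) : name;
    }

    // Name to show in listings: experimental items carry the prefix.
    static String displayName(String canonicalName, boolean experimental) {
        return experimental ? PREFIX + canonicalName : canonicalName;
    }
}
```

Under this sketch, a lookup table keyed by the canonical name accepts both spellings, while a listing built from `displayName` is what a filter such as LIKE "%experimental%" would match.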
huangzhaowei	5d1abe4507	[Bugfix](Mtmv)Fix mtmv meta load failed (#18605 ) MTMV meta load fail since meta was public to the CI System	2023-04-14 16:29:18 +08:00
Mingyu Chen	9634d21a28	[fix](info_db) avoid infodb query timeout when external catalog info is too large or is not reachable (#18662) When querying tables in the information_schema database, the query may time out because: 1. There are external catalogs with too many tables. 2. The external catalog is unreachable. So I add a new FE config infodb_support_ext_catalog. The default is false, which means that when selecting from tables in the information_schema database, the result will not contain information about tables in external catalogs.	2023-04-14 14:40:31 +08:00
amory	db5ec6f6b0	[FIX](thrift)Fix with 1.2 version for thrift #18658	2023-04-14 14:07:42 +08:00
AlexYue	33eec9096f	[Enhancement](FE) use customized grpc threadpool to get better metric for grpc from FE to BE (#13983 ) Previously in Doris FE, there is no specific thread pool for grpc-client-channel, by default the underlying netty logic would use one dynamic unbounded cache threadpool. The workload for this grpc threadpool is unseen. Use ThreadpoolMgr to create one customized threadpool to get Prometheus-compatible metric data.	2023-04-13 20:09:26 +08:00
xy720	cb644d5bc3	[feature](function) support any type in SQL function (#18392 ) Add AnyType to Doris. Support Inference function in fe SQL function.	2023-04-11 19:45:02 +08:00
Jibing-Li	c13f806e53	[Refactor](multi catalog)Split ExternalFileScanNode into FileQueryScanNode and FileLoadScanNode (#18342 ) Split ExternalFileScanNode into FileQueryScanNode and FileLoadScanNode. Remove some useless code in FileLoadScanNode. Remove unused config item: enable_vectorized_load and enable_new_load_scan_node	2023-04-11 10:30:38 +08:00
morrySnow	512718f629	[enhancement](Nereids)(planner) fix some problem in Nereids and legacy planner (#18280) 1. remove TypeCoercion and CharacterLiteralTypeCoercion 2. Nereids Cast does not rely on the legacy planner's analyze() 3. fix the problems below in the legacy planner; after this PR: a. BOOLEAN can be cast to DECIMALV2 explicitly b. a comparison between BOOLEAN and DATE casts both sides to DOUBLE c. HLL cannot be implicitly cast to any other type	2023-04-10 18:25:33 +08:00
luozenglin	9700721982	[feature-wip](resource-group) Support create and show resource groups (#18184 )	2023-04-10 15:18:48 +08:00
huangzhaowei	09d98c1663	[BugFix](MTMV)Set enable_mtmv_scheduler_framework master only to avoid regression fail (#18473 ) Set enable_mtmv_scheduler_framework master only to avoid regression fail	2023-04-09 08:47:18 +08:00
Jibing-Li	ea60d65384	[Improvement](multi catalog)Move split size config to session variable (#18355) Move the split size config to a session variable. Previously it was in the Config class, so users needed to restart FE after changing it.	2023-04-05 01:02:47 +08:00
huangzhaowei	7c36bef6bc	[Feature-Wip](MySQL Load)Show load warning for MySQL load (#18224) 1. Support show load warnings for MySQL load to get the detailed error message. 2. Fix fillByteBufferAsync not marking the load as finished in same data load 3. Fix drain data only in client mode.	2023-04-04 22:44:48 +08:00
Ashin Gau	66bfd18601	[opt](file_reader) add prefetch buffer to read csv&json file (#18301) Co-authored-by: ByteYue <yj976240184@gmail.com> This PR is an optimization for https://github.com/apache/doris/pull/17478: 1. Change the buffer size of `LineReader` to 4MB to align with the size of prefetch buffer. 2. Lazily prefetch data in the first read to prevent wasted reading. 3. S3 block size is 32MB only, which is too small for a file split. Set 128MB as default file split size. 4. Add `_end_offset` for prefetch buffer to prevent wasted reading. The query performance of reading data on object storage is improved by more than 3x.	2023-04-04 19:05:22 +08:00
yongjinhou	aff260c06f	[Enhancement](HttpServer) Support https interface (#16834 ) 1. Organize http documents 2. Add http interface authentication for FE 3. Support https interface for FE 4. Provide authentication interface 5. Add http interface authentication for BE 6. Support https interface for BE	2023-04-03 14:18:17 +08:00
Mingyu Chen	ecd3fd07f6	[feature](colocate) support cross database colocate join (#18152 )	2023-04-03 14:03:42 +08:00
Pxl	e77833bfa1	[Bug](materialized-view) fix where clause persistence replay incorrect (#18228 ) fix where clause persistence replay incorrect	2023-04-03 12:49:01 +08:00
abmdocrt	365867a867	[feature](SSL) default enable SSL MySQL connection to FE (#18285 )	2023-03-31 21:31:23 +08:00
amory	ea41d94582	[Improve](complex-type) Support Count(complexType) (#17868 ) Support count function for ARRAY/MAP/STRUCT type	2023-03-30 15:43:32 +08:00
Xiangyu Wang	6bd2609294	[Enhancement](multi-catalog) add config for external meta cache loade… (#18117 ) Add config for external cache-loader's max thread-pool size.	2023-03-28 15:10:19 +08:00
xy720	daeaa91dd6	[feature](function) support variadic template type in SQL function (#17985) Inspired by the C++ function `std::vector::emplace_back()`, we can use a variadic template for this issue. e.g. ``` [['struct'], 'STRUCT<TYPES>', ['TYPES'], 'ALWAYS_NOT_NULLABLE', ['TYPES...']] ``` `TYPES...` in template_types defines a variadic template `TYPES`. The variadic template is then expanded into multiple normal templates based on the actual input arguments at runtime in FE. Make sure `TYPES...` is placed in the last position among all template type arguments. BTW, the original template function logic is not affected.	2023-03-28 11:08:24 +08:00
huanghaibin	304064653c	[feature](log)check and log holding lock time when it exceeds threshold (#17965) Sometimes the competition for locks is fierce in DatabaseTransactionMgr, which may lead to publish timeouts. I think we should have a log to hint at this lock competition.	2023-03-26 20:11:40 +08:00
Gabriel	2408ca5da8	[Bug](DECIMALV3) Fix wrong precision for plus/minus (#18052) The result type for DECIMAL(x, y) plus/minus DECIMAL(m, n) should be DECIMAL(max(x - y, m - n) + max(y, n) + 1, max(y, n))	2023-03-25 09:42:39 +08:00
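A worked sketch of this rule, taking the result scale as max(y, n): the integer part needs max(x - y, m - n) digits, the fractional part needs max(y, n) digits, and one extra digit covers a possible carry. `DecimalAddType` is a hypothetical name, and the sketch does not model any cap on the maximum precision, which the commit message does not mention.

```java
// Hypothetical helper; illustrates the precision/scale arithmetic only.
final class DecimalAddType {
    private DecimalAddType() {
    }

    // Returns {precision, scale} for DECIMAL(x, y) +/- DECIMAL(m, n).
    static int[] resultType(int x, int y, int m, int n) {
        int scale = Math.max(y, n);               // fractional digits
        int integerDigits = Math.max(x - y, m - n); // digits before the point
        int precision = integerDigits + scale + 1;  // +1 for a carry
        return new int[] {precision, scale};
    }
}
```

For example, DECIMAL(10, 2) plus DECIMAL(8, 4) has max(8, 4) = 8 integer digits and max(2, 4) = 4 fractional digits, giving DECIMAL(13, 4).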
starocean999	7bdd854fdc	[fix](nereids) bucket shuffle and colocate join is not correctly recognized (#17807) 1. close (https://github.com/apache/doris/issues/16458) for nereids 2. varchar and string types should be treated as the same type in the bucket shuffle join scenario. ``` create table shuffle_join_t1 ( a varchar(10) not null ) create table shuffle_join_t2 ( a varchar(5) not null, b string not null, c char(3) not null ) ``` the two SQL statements below can use bucket shuffle join ``` select * from shuffle_join_t1 t1 left join shuffle_join_t2 t2 on t1.a = t2.a; select * from shuffle_join_t1 t1 left join shuffle_join_t2 t2 on t1.a = t2.b; ``` 3. PushdownExpressionsInHashCondition should consider both hash and other conjuncts 4. visitPhysicalProject should handle MarkJoinSlotReference	2023-03-24 19:21:41 +08:00
Mingyu Chen	6c8ed9135d	[fix](truncate) fix unable to truncate table due to wrong storage medium (#17917) When the FE config default_storage_medium is set to SSD and all BE storage paths are SSD, tables are stored with storage medium SSD. But there is an FE config storage_cooldown_second whose default value is 30 days, so after 30 days the storage medium of the table is changed to HDD, which is unexpected. This PR removes storage_cooldown_second and uses a max value as the cooldown time of the SSD storage medium when default_storage_medium is SSD.	2023-03-21 10:04:47 +08:00

136 Commits