Commit Graph

230 Commits

Author SHA1 Message Date
7cfb3cc0aa [fix](functions) fix function substitute for datetimeV1/V2 (#23344)
* fix

* function fe
2023-08-25 09:59:38 +08:00
6a4976921d [fix](auth)Disable column auth temporarily (#23295)
- add config `enable_col_auth` to temporarily disable column permissions (because the old/new planner has a bug when selecting from a view) — see the sketch after this list
- Restore the old optimizer to the previous authentication method
- Support authentication for the new optimizer (legacy issue: when querying a view, the permissions of the base table are checked; the view's own permissions should be checked and processed after the new optimizer is improved)
- fix: show grants for non-existent users
- fix: role `admin` cannot grant/revoke to/from user
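A minimal sketch of toggling the new switch, assuming `enable_col_auth` is a runtime-mutable FE config (otherwise it belongs in fe.conf); setting it back to "true" would re-enable column permission checks:
```
ADMIN SET FRONTEND CONFIG ("enable_col_auth" = "true");
```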
2023-08-24 23:37:06 +08:00
35d0c9e71e [refactor](nereids) Refactor stats collection framework (#22963)
* remove auto analyze grammar
* refactor ResultRow
2023-08-23 10:05:57 +08:00
a4e041ea55 [improve](alter-job) Add a config for forbidding alter jobs (#23294) 2023-08-22 16:28:36 +08:00
b670dd0db7 [feature](Nereids) support array type (#22851)
FEATURE:
1. enable array type in Nereids
2. support generics in function signatures
3. support array and map types in type coercion and type checking
4. add element_at and element_slice syntax in Nereids parser (see the sketch after the TODO list below)

REFACTOR:
1. remove AbstractDataType

BUG FIX:
1. remove FROM from nonReserved keyword list

TODO:
1. support lambda expression
2. use Nereids' way to do function type coercion
3. use castIfnotSame when doing implicit casts on BoundFunction
4. let AnyDataType type coercion do the same thing as function type coercion
5. add the following array functions:
- array_apply
- array_concat
- array_filter
- array_sortby
- array_exists
- array_first_index
- array_last_index
- array_count
- array_shuffle / shuffle
- array_pushfront
- array_pushback
- array_repeat
- array_zip
- reverse
- concat_ws
- split_by_string
- explode
- bitmap_from_array
- bitmap_to_array
- multi_search_all_positions
- multi_match_any
- tokenize
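A hedged sketch of the newly enabled array syntax (results assume the usual 1-based array indexing):
```
SELECT element_at(array(1, 2, 3), 2);  -- expected: 2
SELECT array(10, 20, 30)[1];           -- subscript form, assumed equivalent to element_at
```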
2023-08-22 09:47:55 +08:00
ae9f04f969 [fix](array) fix typeExtactMatch for array() type (#23264)
If we write SQL like `select cast(array() as array<varchar(10)>)`,
CastExpr in FE will call analyze() with `Type.matchExactType(childType, type, true);`.
Here the array type only checks contains_null, but it should also check the inner type so that matchExactType works correctly for arrays.
2023-08-21 19:41:09 +08:00
0967d7ec04 [improvement](agg) Do not serialize bitmap to string (#23172) 2023-08-21 10:10:15 +08:00
10abbd2b62 [Feature](Export) support parallel export jobs using Job Schedule (#22854) 2023-08-18 22:24:42 +08:00
1f19d0db3e [improvement](tablet clone) improve tablet balance, scaling speed etc (#22317) 2023-08-17 22:30:49 +08:00
3efa06e63e [Fix](View) varchar type conversion error (#22987) 2023-08-16 11:49:04 +08:00
d7a5c37672 [improvement](tablet clone) update the capacity coefficient for calculating backend load score (#22857)
Update the capacity coefficient for calculating the backend load score:
1. Add FE config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually (example below);
2. Adjust the calculation of the capacity coefficient as below.

We emphasize disk usage when calculating the load score.
If a BE has a high used-capacity percentage, we should increase its load score,
so we increase the capacity coefficient along with a BE's used-capacity percentage.

But this is not enough. For example, if tablets differ greatly in data size,
then the two BEs below may end up with the same load score:
BE A:  disk usage = 60%,  replica number = 2000  (it contains the big tablets)
BE B:  disk usage = 30%,  replica number = 4000  (it contains the small tablets)

But what we want is: first move some big tablets from A to B; after their disk usages are close,
move some small tablets from B to A, so that finally both their disk usages and replica numbers
are close.

To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to avoid the effect of replica number. After the disk usage difference decreases, we decrease the capacity coefficient to make replica number effective again.
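A hedged example of pinning the coefficient through the new config (the value is illustrative; a manually set value bypasses the automatic adjustment described above):
```
ADMIN SET FRONTEND CONFIG ("backend_load_capacity_coeficient" = "0.5");
```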
2023-08-15 17:27:31 +08:00
94a7b44540 [Improvement](log) add config to control compression of fe log & fe audit log (#22865)
FE logs are large on a busy Doris cluster, and preserving historical logs costs a lot of disk space.
Enabling compression is a good way to save space,
and a gzip-compressed text file can be viewed without decompression.
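A minimal fe.conf sketch; the switch names `sys_log_enable_compress` and `audit_log_enable_compress` are not stated in this message, so treat them as assumptions:
```
sys_log_enable_compress = true
audit_log_enable_compress = true
```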
2023-08-11 14:08:08 +08:00
b9b9071c9b [improvement](create partition) creating a partition requires a quorum of replicas to succeed (#22554) 2023-08-11 11:59:05 +08:00
8e5b4005dc [enhancement](data type) add use_mysql_bigint_for_largeint config to tell Doris to use bigint when returning the largeint type to MySQL JDBC (#22835) 2023-08-10 18:53:31 +08:00
f2658dc7bd [Feature](multi-catalog) Truncate char or varchar columns if their size is smaller than the file columns' or they are not found in the file column schema. (#22318)
Truncate char or varchar columns if their size is smaller than the file columns' or they are not found in the file column schema, controlled by the session variable `truncate_char_or_varchar_columns`.
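Since the switch is a session variable, enabling it should be as simple as:
```
SET truncate_char_or_varchar_columns = true;
```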
2023-08-10 14:37:20 +08:00
77d3d4e324 [fix](cache) add sql cache conf cache_result_max_data_size (#22645)
Limiting only the maximum row count of the SQL cache (cache_result_max_row_count) is not enough: if a single row is too large, the FE may OOM.
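A hedged example of setting the new byte limit (the value shown is illustrative only):
```
ADMIN SET FRONTEND CONFIG ("cache_result_max_data_size" = "31457280");
```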
2023-08-09 14:46:23 +08:00
7bfcee6e71 [improvement](variable) add annotations for variables (#22292) 2023-08-08 22:16:42 +08:00
97adbaadb9 fix full auto analyze (#22650) 2023-08-07 11:41:38 +08:00
95aa4d8631 [Feature](Export) Supports concurrent export of table data (#21911) 2023-08-04 18:50:17 +08:00
672acb8784 [fix](show-table-status) fix hive view NPE and external meta cache refresh issue (#22377) 2023-08-04 16:55:10 +08:00
4f9969ce1e [feature](show-frontends-disk) Add Show frontend disks (#22040)
Co-authored-by: yuxianbing <yuxianbing@yy.com>
Co-authored-by: yuxianbing <iloveqaz123>
2023-08-03 14:04:48 +08:00
e670d84b72 [feature](executor) using max_instance_num to limit the automatically determined instance count (#22521) 2023-08-03 13:12:32 +08:00
e5028314bc [Feature](Job)Support scheduler job (#21916) 2023-08-02 21:34:43 +08:00
afb6a57aa8 [enhancement](nereids) Improve stats preload performance (#21970) 2023-07-31 17:32:01 +08:00
ad080c691f [chore](log)Move non-user-friendly error message to be.WARNING (#22315)
Move non-user-friendly error message to be.WARNING
2023-07-28 13:15:25 +08:00
e87174dd6b [feature](planner) modify multi partition prefix value (#22098)
Change the auto-generated multi-partition name prefix to 'p_'.
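For context, a hedged sketch of a multi-partition table whose generated partitions would carry the prefix (table and column names are hypothetical, and the generated name format is assumed):
```
CREATE TABLE demo_multi_part (dt DATE, v INT)
DUPLICATE KEY (dt)
PARTITION BY RANGE (dt) (
    FROM ("2023-01-01") TO ("2023-07-01") INTERVAL 1 MONTH
)
DISTRIBUTED BY HASH (dt) BUCKETS 1
PROPERTIES ("replication_num" = "1");
```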
2023-07-28 10:21:32 +08:00
b51fcbd9c7 [opt](stats) Scale replicas of the stats table to 3 when possible (#22227)
This improves the availability of stats.
2023-07-27 17:36:54 +08:00
31c856351a [enhancement](default_config) change default values of rpc-related configs (#22149)

Bdbje elect timeout is 30 seconds, so we enlarge thrift_rpc_timeout_ms
and txn_commit_rpc_timeout_ms to 60s.

BTW: enlarge bdbje_lock_timeout_second from 1 to 5.
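A hedged example of tuning these further at runtime (assuming the configs are runtime-mutable; otherwise set them in fe.conf):
```
ADMIN SET FRONTEND CONFIG ("thrift_rpc_timeout_ms" = "60000");
ADMIN SET FRONTEND CONFIG ("txn_commit_rpc_timeout_ms" = "60000");
```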
2023-07-27 11:12:26 +08:00
582acad8a1 [feature](stats) Enable period time with cron expr (#22095)
Support the following grammar:

ANALYZE TABLE test WITH CRON "* * * * * ?"

Such a job is scheduled as the cron expression specifies, but it natively supports minute-level scheduling only.
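For example, a daily 02:00 schedule might look like the following (assuming Quartz-style second/minute/hour/day/month/weekday fields, as the six-field example above suggests):
```
ANALYZE TABLE test WITH CRON "0 0 2 * * ?"
```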
2023-07-26 17:25:57 +08:00
964ac4e601 [opt](nereids) Retry when async analyze task fails (#21889)
Retry at most 5 times when an async analyze task execution fails.
2023-07-26 17:16:56 +08:00
3b6702a1e3 [Bug](point query) cancel future when meeting timeout in PointQueryExec (#21573)
1. Cancel the future on timeout and add a config to modify the RPC timeout.
2. Add a config to modify the number of BackendServiceProxy instances, since under a highly concurrent workload the gRPC channel will be blocked.
2023-07-25 18:18:09 +08:00
0f439bb1ca [vectorized](udf) java udf support map type (#22059) 2023-07-25 11:56:20 +08:00
0205f540ac [enhancement](config) Enlarge broker scanner bytes conf to 500G; 5G is still not enough (#22126) 2023-07-24 19:49:39 +08:00
22aa54e335 [enhancement](config) enlarge max_bytes_per_broker_scanner to 5G #22099 2023-07-23 12:00:32 +08:00
3d0f952934 [FIX](complex-type) delete enable_map/struct_type switch #21957 2023-07-22 15:29:32 +08:00
85cc044aaa [feature](create-table) support setting replication num for create table operations globally (#21848)
Add a new FE config `force_olap_table_replication_num`.
If this config is larger than 0, the replication number of a table will
be forced to this value when creating the table.
The default is 0, which has no effect.
This config only affects the create-olap-table operation; other operations such as `add partition`
and `modify table properties` are not affected.

The motivation for this config is that most regression test cases create tables with a single replica,
which lets the regression tests run well in the p0 and p1 pipelines.
But we also need to run these cases on a multi-backend Doris cluster, so the test cases need multiple replicas.
Since it is hard to modify each test case, I added this config so that we can simply set it to create all tables with
the specified replication number.
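A hedged one-liner for the multi-replica test scenario described above (assuming the config is runtime-mutable):
```
ADMIN SET FRONTEND CONFIG ("force_olap_table_replication_num" = "3");
```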
2023-07-21 19:36:04 +08:00
367ad9164a [feature-wip](auto-inc)(step-2) support auto-increment column for duplicate table (#19917) 2023-07-20 18:03:39 +08:00
0f116ce148 Revert "[Enhancement](Nereids)enable nereids DML by default. (#21539)" (#22013)
This reverts commit f668b3965effbd5df4902f20b496cb6b6642414c.
2023-07-20 11:32:54 +08:00
f668b3965e [Enhancement](Nereids)enable nereids DML by default. (#21539)
TODO: fix cast agg_state type when do insert
2023-07-19 13:52:15 +08:00
d349c955f0 [fix](nereids) Disable auto analyze temporarily #21919 2023-07-19 09:27:24 +08:00
beec0e9169 [Improvement](tablet clone) improve tablet sched speed and fix tablet sched failing too many times (#21856) 2023-07-18 23:25:22 +08:00
05cf095506 [feature](stats) Support full auto analyze (#21192)
1. Auto analyze all tables except for internal tables
2. Make the resources used by analyze configurable
2023-07-17 20:42:57 +08:00
352a0c2e17 [Improvement](multi catalog) Cache file systems to improve list-remote-files performance (#21700)
Use the file system type and Conf as the key to cache remote file systems.
This avoids creating a new file system for each external table partition's location.
The time cost of fetching 100,000 partitions with 1 file each is reduced from about 15 minutes to 22s.
2023-07-14 09:59:46 +08:00
8ffa21a157 [fix](config) set FE header size limit to 1MB from 10k (#21719)
Enlarge jetty_server_max_http_header_size to avoid a "Request Header Fields
Too Large" error when stream loading to FE.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-07-11 19:52:14 +08:00
5a15967b65 [fix](sparkdpp) Change spark dpp default version to 1.2-SNAPSHOT (#21698) 2023-07-11 10:49:53 +08:00
8973610543 [feature](datetime) "timediff" supports calculating microseconds (#21371) 2023-07-10 19:21:32 +08:00
f2fb23e98f [pipeline](exec) disable pipeline load in the current version (#21632) 2023-07-09 01:00:06 +08:00
53c10a2389 (chore) Disable ssl connection to FE by default for compatibility reasons (#20230)
Older MySQL clients (< 5.7.28) try to connect to the server with TLS 1.1,
which is insecure and not supported by Doris FE. The connection will
fail.

We disable SSL connection support on Doris FE to keep users' applications
unaffected. To enable SSL support explicitly, just put
the following in fe.conf:
```
enable_ssl = true
```
2023-07-07 12:24:55 +08:00
9bcf79178e [Improvement](statistics, multi catalog) Support iceberg table stats collection (#21481)
Fetch Iceberg table stats automatically while querying a table.
Collect accurate statistics for Iceberg tables by running analyze SQL in Doris (the collect-by-meta option is removed).
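A hedged example of the analyze path for an Iceberg table (catalog, database, and table names are hypothetical):
```
ANALYZE TABLE iceberg_catalog.iceberg_db.sample_tbl;
```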
2023-07-07 09:18:37 +08:00
bb3b6770b5 [Enhancement](multi-catalog) Make meta cache batch loading concurrent. (#21471)
I will enhance the performance of querying the meta cache of HMS tables in 2 steps:
**Step 1**: use concurrent batch loading for the meta cache
**Step 2**: execute some other tasks concurrently as soon as possible

**This PR is mainly for step 1, and it does the following:**
- Create a `CacheBulkLoader` for batch loading
- Remove the executor of the previous async cache loader and change the loader's type to `CacheBulkLoader` (we do not set any refresh strategies for LoadingCache, so the previous executor is not useful)
- Use a `FixedCacheThreadPool` to replace the `CacheThreadPool` (the previous `CacheThreadPool` just logs warnings and does not throw any exceptions when the pool is full)
- Remove parallel streams and use the `CacheBulkLoader` to do batch loading
- Change the value of `max_external_cache_loader_thread_pool_size` to 64, and set the pool size of the hms client pool to `max_external_cache_loader_thread_pool_size`
- Fix the spelling mistake in `max_hive_table_catch_num`
2023-07-06 15:18:30 +08:00