Commit Graph

476 Commits

Author SHA1 Message Date
6f37e483f8 [improve](config)del useless creation config for inverted index (#39005)
Delete useless config: `enable_create_inverted_index_for_array`
backport: https://github.com/apache/doris/pull/39006
2024-08-07 17:13:05 +08:00
3b9394a8c7 [improvement](tablet scheduler) Adjust tablet sched priority to help load data succ #38528 (#38884)
cherry pick from #38528
2024-08-06 02:13:47 +08:00
2425730609 [enhance](auth)support cache ranger datamask and row filter (#37723) (#38575)
pick: https://github.com/apache/doris/pull/37723
2024-08-02 14:59:32 +08:00
b0943064e0 [fix](kerberos)fix and refactor ugi login for kerberos and simple authentication (#38607)
pick from (#37301)
2024-08-01 14:01:32 +08:00
6bd93b119f [pick](cast)Feature cast complexttype2 json (#38632)
backport: https://github.com/apache/doris/pull/36548
2024-08-01 09:18:15 +08:00
Pxl
b4e82d2322 [Improvement](rpc) set grpc channel's keepAliveTime and remove proxy on InterruptedException (#37304) (#38381)

1. set grpc channel's keepAliveTime
2. remove proxy on InterruptedException/TimeoutException to avoid the
channel becoming unavailable
pick from #37304
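
For reference, a minimal grpc-java sketch of the keepAliveTime setting described above; the endpoint and interval values are illustrative, not Doris's actual configuration:

```
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

public class KeepAliveChannelExample {
    public static void main(String[] args) {
        // Enable client-side keepalive pings so idle channels are probed
        // instead of silently going stale and becoming unavailable.
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("127.0.0.1", 9090)        // placeholder endpoint
                .usePlaintext()
                .keepAliveTime(60, TimeUnit.SECONDS)  // ping after 60s of inactivity (illustrative value)
                .keepAliveWithoutCalls(true)          // also ping when no RPCs are active
                .build();
        // ... issue RPCs over the channel, then shut it down when finished
        channel.shutdown();
    }
}
```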
2024-07-25 22:11:23 +08:00
e396f853a0 Pick "[enhance](Cooldown) Use config to control whether use cooldown replica for scanning first" (#38322)
Same as master #37492
2024-07-25 12:17:38 +08:00
81a7542cae [pick]Add audit log event queue size limit (#37914)
pick #37786
2024-07-16 19:00:22 +08:00
63c2d22513 [cherry-pick](branch-2.1) Pick "[Fix](delete command) Mark delete sign when do delete command in MoW table (#35917)" (#37594)
Pick #35917 and #37151
2024-07-15 18:54:01 +08:00
16de141743 [regression](kerberos)add hive kerberos docker regression env (#37657)
pick:
[regression](kerberos)fix regression pipeline env when write hosts 
(#37057)
[regression](kerberos)add hive kerberos docker regression env (#36430)
2024-07-15 09:35:39 +08:00
259d28407e [improvement](statistics)Enable estimate hive table row count using file size. (#37218) (#37694)
backport: https://github.com/apache/doris/pull/37218
2024-07-12 13:47:27 +08:00
6214d6421f [Fix](planner) fix bug of char(255) toSql (#37340) (#37671)
cherry-pick #37340 from master
2024-07-12 10:33:24 +08:00
dd18652861 [branch-2.1](routine-load) make get Kafka meta timeout configurable (#37399)
pick #36619
2024-07-08 10:39:17 +08:00
d08a418dd8 [branch-2.1](routine-load) optimize routine load job auto resume policy (#37373)
pick #35266
2024-07-07 18:16:56 +08:00
b3eaf0e4d2 [bugfix](hive)Prevent multiple fs from being generated for 2.1 (#37142)
pick #36954
2024-07-02 22:54:40 +08:00
e25717458e [opt](catalog) add some profile for parquet reader and change meta cache config (#37040) (#37146)
bp #37040
2024-07-02 20:58:43 +08:00
3f382b797a [branch-2.1][improvement](sqlserver catalog) Configurable whether to use encrypt when connecting to SQL Server using the catalog (#36971)
pick (#36659)
pick #37015
In previous versions, we used Druid as the default JDBC connection pool,
which could use custom decryption to parse the certificate when SQL
Server encryption is turned on. However, after switching the default
connection pool to HikariCP in the new version, the SQL Server
certificate can no longer be parsed, so encryption needs to be turned
off for normal use. Therefore, a parameter is added to decide whether to
disable SQL Server encryption. It is not disabled by default.
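
To illustrate the connection-level setting involved (the new catalog parameter itself is not named here), a minimal JDBC sketch with SQL Server encryption disabled; host, database, and credentials are placeholders, and the Microsoft JDBC driver is assumed to be on the classpath:

```
import java.sql.Connection;
import java.sql.DriverManager;

public class SqlServerNoEncryptExample {
    public static void main(String[] args) throws Exception {
        // With encrypt=false the driver skips TLS, so the connection pool
        // never needs to parse the server certificate.
        String url = "jdbc:sqlserver://127.0.0.1:1433;databaseName=test;encrypt=false";
        try (Connection conn = DriverManager.getConnection(url, "sa", "example_password")) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}
```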
2024-07-02 10:14:43 +08:00
22cb7b8fcb [improvement](compaction) be do not compact invisible version to avoid query error -230 #28082 (#36222)
cherry pick from #28082
2024-06-27 13:45:21 +08:00
58cc1dca7f [improve](fe) Support to config max msg/frame size of the thrift server (#36594)
Cherry-pick #35845
2024-06-21 00:15:15 +08:00
74162a1b7e [enhancement](prepared statement) Handle unsigned numeric type in prepare statement (#36388)
Issue Number: bp #36133
2024-06-18 19:33:12 +08:00
7c0ec4ea2e [fix](autobucket) fix autobucket config masterOnly=true #36116 (#36286)
cherry pick from #36116
2024-06-14 14:26:23 +08:00
9708ca8fcb [Feature](Prepared Statment) Implement in nereids planner (#35318) (#36172) 2024-06-12 19:54:17 +08:00
b5a35b9cef [FIX] Pick array inverted index bugfix (#35837)
This picks several bug fixes for arrays with inverted indexes. See also:
https://github.com/apache/doris/pull/34766
https://github.com/apache/doris/pull/35086
https://github.com/apache/doris/pull/34683
https://github.com/apache/doris/pull/34076
2024-06-06 09:54:14 +08:00
5c8f87e01e [opt](log) refine the FE logger (#35679)
Previously, FE logs were written to files. The main FE logs include
fe.log, fe.warn.log, fe.audit.log, fe.out, and fe.gc.log.
In a K8s deployment environment, logs usually need to be output to
standard output, and then other components process the log stream.

This PR made the following changes:

1. Modified the log4j configuration template

- When started with `--daemon`, logs are still written to various files,
and the format remains unchanged.
- When started with `--console`, all logs are output to standard output
and marked with different prefixes:

		- `StdoutLogger`: logs for standard output
		- `StderrLogger`: logs for standard error output
		- `RuntimeLogger`: logs for fe.log or fe.warn.log
		- `AuditLogger:` logs for fe.audit.log
		- No prefix: logs for fe.gc.log

Examples are as follows:

```
RuntimeLogger 2024-06-03 14:54:51,229 INFO (binlog-gcer|62) [BinlogManager.gc():359] begin gc binlog
```

2. Added a new FE config: `enable_file_logger`

Defaults to `true`, meaning logs are written to files regardless of the
startup method. For example, if FE is started with `--console`, logs go
to both the file and standard output. If set to `false`, logs are never
written to files, regardless of the startup method.

3. Optimized the log format of standard output

The byte streams of stdout and stderr are captured, so logs previously
written with `System.out` are now also captured in fe.log for unified
management.
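
A minimal sketch of the stdout-capture idea in point 3, assuming plain java.util.logging instead of Doris's actual log4j setup:

```
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.logging.Logger;

public class StdoutCaptureExample {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("StdoutLogger");
        // Swap the stdout byte stream for one that forwards each completed
        // line to the logger, so legacy System.out.println output ends up in
        // the unified log (ASCII-only handling, for illustration).
        System.setOut(new PrintStream(new OutputStream() {
            private final StringBuilder buf = new StringBuilder();
            @Override
            public void write(int b) {
                if (b == '\n') {
                    log.info(buf.toString());
                    buf.setLength(0);
                } else {
                    buf.append((char) b);
                }
            }
        }, true));
        System.out.println("this line is captured by the logger");
    }
}
```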
2024-06-04 18:20:30 +08:00
f94222a04e [fix](log) Support fe log rollover size strategy (#34446) 2024-06-04 18:18:16 +08:00
db3bbc2437 [feature](merge-cloud) Change fe log rolling max size (#32777) 2024-06-04 18:17:33 +08:00
bc6b316e87 [chore](index) add config enable_create_bitmap_index_as_inverted_index default true #33434 (#35521) 2024-06-04 12:07:03 +08:00
4f0365e0bf [fix](s3) move s3 providers to fe-common to be accessible for jni reader (#35779)
backport: #35690

`PropertyConverter.setS3FsAccess` has added customized S3 providers:
```
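// The listed providers are tried in order by the S3A credential provider chain until one of them supplies credentials.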
public static final List<String> AWS_CREDENTIALS_PROVIDERS = Arrays.asList(
            DataLakeAWSCredentialsProvider.class.getName(),
            TemporaryAWSCredentialsProvider.class.getName(),
            SimpleAWSCredentialsProvider.class.getName(),
            EnvironmentVariableCredentialsProvider.class.getName(),
            IAMInstanceCredentialsProvider.class.getName());
```
These providers are set as the value of
`fs.s3a.aws.credentials.provider`, which is used as configuration to
build the S3 reader in the JNI readers. However,
`DataLakeAWSCredentialsProvider` lives in `fe-core`, which the JNI
readers do not depend on, so the S3 providers have to be moved to
`fe-common`.
2024-06-03 14:04:39 +08:00
d83c714824 [branch-2.1](routine-load) adjusting the default configuration of routing load (#35753)
#34898
2024-06-01 11:22:21 +08:00
fd23386ec5 [fix](auth)fix simple auth check and default username (#35620)
We should mark simple auth as valid by default, and check in
loginWithUGI whether to set the default username.
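
For reference, a rough sketch of a UGI-based login helper using standard Hadoop security APIs; the configuration handling, method shape, and the fallback username "hadoop" are illustrative assumptions, not Doris's exact loginWithUGI logic:

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiLoginExample {
    static UserGroupInformation loginWithUgi(String authType, String principal,
                                             String keytab, String defaultUser) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", authType); // "kerberos" or "simple"
        UserGroupInformation.setConfiguration(conf);
        if ("kerberos".equalsIgnoreCase(authType)) {
            // Kerberos: log in from the keytab.
            return UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab);
        }
        // Simple auth is treated as valid by default; fall back to a default
        // username when none is given ("hadoop" is a placeholder here).
        return UserGroupInformation.createRemoteUser(defaultUser != null ? defaultUser : "hadoop");
    }
}
```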
2024-05-30 19:59:37 +08:00
b0e2461181 [branch-2.1][improvement](JdbcScan) Change the mysql function that does not support pushdown in JdbcScan to Config (#35631)
pick #35196
2024-05-30 15:40:08 +08:00
bddaeb9261 [Fix](JobSchedual) Modify the default value of async_task_consumer_thread_num (#35456)
When `Export` statements are executed concurrently, the background uses
`Job schedule` to manage export tasks. Previously, the default value of
`async_task_consumer_thread_num` was 5, meaning that regardless of the
concurrency setting, a maximum of only 5 threads could execute
concurrently.

In addition, `Export` is not the only user of `Job schedule`; other
scheduled tasks may use it as well, leading to a shortage of thread
resources.

We have found that in many scenarios `Export` needs to run with a high
concurrency value. Clearly, `async_task_consumer_thread_num = 5` is no
longer sufficient, so we have changed the default value of
`async_task_consumer_thread_num` to 64.
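
To make the bottleneck concrete, a generic java.util.concurrent sketch (not Doris's Job schedule implementation):

```
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConsumerThreadExample {
    public static void main(String[] args) {
        // With a 5-thread pool, 20 submitted export tasks run at most 5 at a
        // time, no matter how high the per-job concurrency is configured.
        ExecutorService pool = Executors.newFixedThreadPool(5); // old default: 5; new default: 64
        for (int i = 0; i < 20; i++) {
            final int taskId = i;
            pool.submit(() -> System.out.println("export task " + taskId
                    + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
    }
}
```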
2024-05-28 18:54:06 +08:00
af7b16f213 [optimize](desc) display the correct data type of aggStateType (#34968)
If a table column is of AGG_STATE type, the `desc tbl` statement does not show its clearly defined data type.

create table a_table(
    k1 int null,
    k2 agg_state<max_by(int not null,int)> generic,
    k3 agg_state<group_concat(string)> generic
)
aggregate key (k1)
distributed BY hash(k1) buckets 3
properties("replication_num" = "1");

Before the optimization:

mysql> desc a_table;
+-------+------------------------------------------------+------+-------+---------+---------+
| Field | Type                                           | Null | Key   | Default | Extra   |
+-------+------------------------------------------------+------+-------+---------+---------+
| k1    | INT                                            | Yes  | true  | NULL    |         |
| k2    | org.apache.doris.catalog.AggStateType@239f771c | No   | false | NULL    | GENERIC |
| k3    | org.apache.doris.catalog.AggStateType@2e535f50 | No   | false | NULL    | GENERIC |
+-------+------------------------------------------------+------+-------+---------+---------+
3 rows in set (0.00 sec)


After the optimization:

mysql> desc a_table;
+-------+------------------------------------+------+-------+---------+---------+
| Field | Type                               | Null | Key   | Default | Extra   |
+-------+------------------------------------+------+-------+---------+---------+
| k1    | INT                                | Yes  | true  | NULL    |         |
| k2    | AGG_STATE<max_by(INT, INT NULL)>   | No   | false | NULL    | GENERIC |
| k3    | AGG_STATE<group_concat(TEXT NULL)> | No   | false | NULL    | GENERIC |
+-------+------------------------------------+------+-------+---------+---------+


Co-authored-by: duanxujian <duanxujian@jd.com>
2024-05-22 10:03:31 +08:00
5012ddd87a [fix](Nereids) fix sql cache return old value when truncate partition (#34698)
1. Fix the SQL cache returning stale values after a partition is truncated.
2. Use `expire_sql_cache_in_fe_second` to control the expiration time of the SQL cache in the NereidsSqlCacheManager.
2024-05-18 18:05:31 +08:00
1a24895257 [opt](routine-load) optimize routine load task thread pool and related param(#32282) (#34896) 2024-05-15 12:42:02 +08:00
f9c42f34dd [fix](auth)Compatible with previously enabled ldap configuration (#34891) 2024-05-15 12:36:47 +08:00
cadbbdd2c0 [fix](config) for compatibility issue of log dir config (#34734)
2024-05-12 09:44:50 +08:00
ec34bc0386 [bug](config) Fix modifying label_num_threshold does not take effect (#34575) 2024-05-10 22:12:17 +08:00
9a94681b29 [refactor](type) AggStateType should not extends ScalarType (#34463)
1. Let AggStateType extend Type.
2. Remove the useless interfaces isFixedLengthType and supportsTablePartitioning.
3. Let MapType implement the isSupported interface.
4. Let VariantType extend ScalarType.
2024-05-10 22:10:42 +08:00
853dbdcb00 [Feature](PreparedStatement) implement general server side prepared (#33807) 2024-05-10 22:10:11 +08:00
6c11dd2231 [Fix](planner) fix ScalarType.getAssignmentCompatibleType() when deal boolean and decimal (#34435)
The legacy planner encounters issues when handling filters such as c1 (boolean type) = 0.0 (decimalv3).
The literal 0.0 is interpreted as decimalv3(1,1), and the boolean column c1 is coerced to decimalv3(1,1).
decimalv3(1,1) can only hold values in the range [0,1), while boolean true is represented as 1, exceeding the upper bound and causing an overflow.
This pull request addresses the issue by treating the boolean type as decimalv3(1,0), so that both c1 and 0.0 are cast to decimal(2,1).
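
To make the range argument concrete, a small standalone Java check (illustrative only, not planner code) of whether a value fits DECIMAL(precision, scale):

```
import java.math.BigDecimal;

public class DecimalFitExample {
    // True if value fits a DECIMAL(precision, scale) column: the number of
    // integer digits must not exceed precision - scale.
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal v = value.stripTrailingZeros();
        int integerDigits = Math.max(v.precision() - v.scale(), 0);
        return integerDigits <= precision - scale;
    }

    public static void main(String[] args) {
        // decimalv3(1,1) holds only [0,1): boolean true (= 1) overflows.
        System.out.println(fits(BigDecimal.ONE, 1, 1));          // false
        // decimal(2,1) has one integer digit, so both 1 and 0.0 fit.
        System.out.println(fits(BigDecimal.ONE, 2, 1));          // true
        System.out.println(fits(new BigDecimal("0.0"), 2, 1));   // true
    }
}
```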


Co-authored-by: feiniaofeiafei <moailing@selectdb.com>
2024-05-10 22:07:16 +08:00
07207b7b51 [feature](shuffle) enable strict consistency dml by default (#32958) (#34641) 2024-05-10 14:31:50 +08:00
3ae3f9d6e1 [opt](catalog) support using loading cache for db/table list in external catalog (#33610) (#34596)
bp #33610
2024-05-09 17:50:39 +08:00
8fa1b78d7b Revert "[feature](shuffle) enable strict consistency dml by default (#32958)"
This reverts commit 400105a92182755bdd95a58a7d378d67c6b27f51.
2024-05-08 23:00:46 +08:00
400105a921 [feature](shuffle) enable strict consistency dml by default (#32958) 2024-05-08 11:00:14 +08:00
182177def0 [Improve](config)The stream_load label length is changed to a configurable (#34459)
pick from #33745
2024-05-07 20:43:16 +08:00
8fdfbcb3c4 Revert "[Opt](func) opt the percentile func performance (#34373) (#34416)"
This reverts commit 509ae425e416b4779ae94eab9c2b21f9850e03c3.
2024-05-07 07:23:48 +08:00
2d4da7d177 [fix](kerberos)enable hadoop auto renew tgt (#34439) 2024-05-07 00:36:20 +08:00
509ae425e4 [Opt](func) opt the percentile func performance (#34373) (#34416) 2024-05-06 20:10:35 +08:00
91887a285e Implement HLL with 128 buckets to support statistics cache. (#34124) 2024-04-26 15:05:36 +08:00