doris

Author	SHA1	Message	Date
jiafeng.zhang	c3348b8023	[docs] fix config enable_force_drop_redundant_replica name not correct in docs (#8154 )	2022-02-21 09:40:21 +08:00
jiafeng.zhang	289aacb78c	[improvement] enable check_java_version (#8034 ) Check the Java version when Doris starts, to prevent problems caused by a mismatch between the Java version used for compilation and the one used at runtime. If the two versions are inconsistent, Doris will not start, and a prompt message will be given.	2022-02-17 11:16:45 +08:00
Mingyu Chen	26289c28b0	[fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072 ) 1. Fix the BE crash caused by the destruction order. (close #8058) 2. Add a new BE config `compaction_task_num_per_fast_disk`. This config specifies the maximum number of concurrent compaction tasks on a fast disk (typically SSD), so that high-speed disks can run more compaction tasks at the same time and compact the data as soon as possible. 3. Avoid frequently selecting unqualified tablets for compaction. 4. Change some log levels to reduce the log size of BE. 5. Modify some clone logic to handle errors correctly.	2022-02-17 10:52:08 +08:00
jiafeng.zhang	79fd81f035	[doc] Added be -238 error code description (#8048 ) Added be -238 error code description	2022-02-17 10:47:52 +08:00
caiconghui	e6fedff68f	[Refactor][heartbeat] Make get fe heart response by thrift (#8035 ) * [Refactor] Make get fe heart response by thrift Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-02-17 10:25:51 +08:00
weizuo93	a6bf8c13eb	[Feature](Transaction) Support two phase commit (2PC) for stream load (#7473 ) Two-phase commit means: during stream load, after the data is written, a message is returned to the client; the data is invisible at this point and the transaction status is PRECOMMITTED. The data becomes visible only after COMMIT is triggered by the client. 1. The user can invoke the following interface to commit the transaction: curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:commit" http://fe_host:http_port/api/{db}/_stream_load_2pc or curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:commit" http://be_host:webserver_port/api/{db}/_stream_load_2pc 2. The user can invoke the following interface to abort the transaction: curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:abort" http://fe_host:http_port/api/{db}/_stream_load_2pc or curl -X PUT --location-trusted -u user:passwd -H "txn_id:txnId" -H "txn_operation:abort" http://be_host:webserver_port/api/{db}/_stream_load_2pc	2022-02-16 11:55:04 +08:00
Mingyu Chen	884fddbf33	[fix](compatibility) Fix compatibility issue of PRowBatch and some tablet sink bugs (#8000 ) 1. Set both `tuple_offsets` and `new_tuple_offsets` in PRowBatch for compatibility. 2. Set the default of FE config `repair_slow_replica` to false, to avoid impacting the load process after upgrading. E.g., if there are only 2 replicas and one has a high version count, that replica would be set to bad after the upgrade, and the load process would stop because only 1 replica is alive. 3. Fix a bug that NodeChannel may be blocked at `close_wait()`: the `add_batch_finish` flag was not set after the last RPC finished. 4. Fix an NPE in RoutineLoadScheduler.	2022-02-15 11:23:19 +08:00
yiguolei	aea3e4e59b	[refactor] Remove version hash from BE and related test in BE (#8027 )	2022-02-14 09:29:27 +08:00
sodamnsure	8d7a0d9747	[docs](routine-load)Update routine-load-manual.md (#8006 )	2022-02-14 09:28:08 +08:00
dataroaring	0b2b328c7b	[doc] remove useless word 'To' in materialized view (#7985 )	2022-02-10 15:08:58 +08:00
smallhibiscus	2e27827c73	[doc] Added http interface return example to obtain the specified table structure information (#7955 ) 1. Added http interface return example in table-schema-action.md. 2. Correct typos in the document in error.md. 3. Modify the content of the code comments in the text_converter.hpp file.	2022-02-10 15:07:28 +08:00
924060929	c1fef37399	[improvement](runtime-filter) Support adaptive runtime filter(#7546 ) (#7645 ) Change 1: Support an adaptive runtime filter: IN_OR_BLOOM_FILTER. The processing logic is: if the number of rows in the right table < runtime_filter_max_in_num, the IN predicate is used; if the number of rows in the right table >= runtime_filter_max_in_num, the Bloom filter is used. Change 2: The default runtime filter type is changed to IN_OR_BLOOM_FILTER.	2022-01-30 16:46:52 +08:00
steadyBoy	0d3fe8f07b	[typo](docs) fix document backup-restore.md typo (#7868 ) Refactor the document format and improve the content of the English version	2022-01-27 10:34:34 +08:00
caiconghui	d69b7bff2e	[feature](meta) Support show compactionTooSlowTablets and oversizeTablets (#7821 ) Add more columns in `show proc "/statistic"`	2022-01-27 10:26:41 +08:00
Zhengguo Yang	4bdeef3b64	[chore][fix][doc](fe-plugin)(mysqldump) fix build auditlog plugin error (#7804 ) 1. fix problems when build fe_plugins 2. format 3. add docs about dump data using mysql dump	2022-01-26 09:11:23 +08:00
wudi	60c6bb4f92	[Feature][flink-connector] support flink delete option (#7457 ) * Flink Connector supports delete option on Unique models Co-authored-by: wudi <wud3@shuhaisc.com>	2022-01-23 20:24:41 +08:00
Mingyu Chen	3494c8973b	[improvement](colocation) Add a new config to delay the relocation of colocation group (#7656 ) 1. Add a new FE config `colocate_group_relocate_delay_second`. The relocation of a colocation group may involve a large number of tablets moving within the cluster. Therefore, we should use a more conservative strategy to avoid relocation of colocation groups as much as possible. Relocation usually occurs after a BE node goes offline or goes down. This config is used to delay the determination of BE node unavailability. The default is 30 minutes, i.e., if a BE node recovers within 30 minutes, relocation of the colocation group will not be triggered. 2. Change the priority of colocate tablet repair and balance tasks from HIGH to NORMAL. 3. Add a new FE config `allow_replica_on_same_host`. If set to true, Doris will allow replicas of a tablet to be placed on the same host when creating a table, and tablet repair and balance will be disabled. This is only for local testing, so that multiple BEs can be deployed on the same host and tables can be created with multiple replicas.	2022-01-18 10:26:36 +08:00
xy720	e80c34b6fe	[docs][typo] fix some typos in documents (#7769 )	2022-01-16 10:43:42 +08:00
Mingyu Chen	5f8d91257b	[improvement](routine-load) Reduce the probability that the routine load task rpc timeout (#7754 ) If a load task has a relatively short timeout, we need to ensure that each RPC of this task does not get blocked for a long time. An RPC is usually blocked for two reasons. 1. Handling "memory exceeds limit" in the RPC: if the system finds that the memory occupied by the load exceeds the threshold, it selects the load channel that occupies the most memory and flushes the memtable in it. This operation is done in the RPC and may be time consuming. 2. Closing the load channel: when the load channel receives the last batch, it ends the task and waits synchronously for all memtable flushes to finish. This process is also time consuming. This PR addresses the problem as follows. 1. Use the timeout to determine whether a load task is high priority: if the timeout of a load task is relatively short, it is marked as a high-priority task. 2. Do not process "memory exceeds limit" for high-priority tasks. 3. Use a separate flush thread to flush memtables for high-priority tasks.	2022-01-16 10:41:31 +08:00
Adonis Ling	2cf574dc01	[docs] Improve instructions for the configuration of BE. (#7620 )	2022-01-11 15:06:05 +08:00
caiconghui	83f6eef506	[improvement](routine-load) Make routine load work with old kafka version (#7630 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-01-10 17:30:24 +08:00
924060929	563545475e	[Optimize](Runtime Filter) Support merge in runtime filter(#7546 ) (#7547 ) Support merging the IN predicate when a remote target exists (e.g. shuffle hash join). Remove the code that implicitly converts the IN predicate to a Bloom filter when a remote target exists. Close related #7546	2022-01-06 19:08:35 +08:00
Zhengguo Yang	bf4a867e85	[improvement](tablet-repair) add a config repair_slow_replica (#7423 ) Add a new FE config `repair_slow_replica`. When this config is true, Doris will try to delete the replica with the largest number of versions, and then rebalance the replica. Usually, when the number of versions of a certain replica is much higher than that of other replicas, there is some problem with compaction on the current BE. Migrating to other machines can typically solve this problem.	2022-01-04 10:28:14 +08:00
Henry2SS	6657524c51	[feature](sql-block-rule) add partition_num, tablet_num, cardinality in SqlBlockRule to block big/slow sql (#7403 ) Add partitionNum, tabletNum, cardinality in SqlBlockRule to block large/slow sql. 1. set partitionNum, tabletNum, cardinality as limitations to block sqls 2. compatible with lower version 3. add unit tests 4. add docs	2022-01-04 09:59:41 +08:00
zhengshiJ	723ee84a66	[feature] (planner) InferPredicate (#7096 ) This PR is for #7096 and adds a rewrite rule for predicate inference. For example: origin stmt: select * from t1, t2, t3 where t1.id=t2.id and t2.i=t3.id and t2.id = 1 rewrite stmt: select * from t1, t2, t3 where t1.id=t2.id and t2.i=t3.id and t2.id = 1 and t1.id=1 and t3.id=1 + Add a switch enable_infer_predicate to control whether to perform predicate expansion. + Register a new rule InferFiltersrule and add it to GlobalState. + Traverse Conjunct to construct on/where equivalence connection, numerical connection and isNullPredicate. + Infer all equivalence connections + Construct additional numerical connections and isNullPredicate	2021-12-30 13:24:30 +08:00
pengxiangyu	dc9cd34047	[docs] Add user manual for hdfs load and transaction. (#7497 )	2021-12-30 10:22:48 +08:00
Zhengguo Yang	07e2acb2f3	[feature] Support national secret (national commercial password) algorithm SM3/SM4 (#7464 ) SM3 is a cryptographic hash algorithm. SM4 is a block cipher used to replace DES / AES and other international algorithms.	2021-12-28 10:39:54 +08:00
Lijia Liu	ca97535491	[docs](executor) correct some be error code (#7460 ) correct some be error code in doc.	2021-12-26 11:06:54 +08:00
Heng Zhao	43ed54faa1	[docs] The name of hidden column is incorrect in batch-delete-manual.md(#7465 ) (#7466 )	2021-12-24 21:30:57 +08:00
jiafeng.zhang	695eca8cbc	[docs] add bloomfilter index doc (#7318 ) * add bloomfilter index doc	2021-12-21 11:05:20 +08:00
Henry2SS	998489ac50	[fix](sql-block-rule) move sql block rule check from ConnectProcessor to StmtExecutor (#7407 ) SqlBlockRule should block only query stmt. And exclude explain stmt.	2021-12-21 10:25:09 +08:00
caiconghui	06c38ce46e	[enhancement] Make concurrent_number for routine load task can be larger than be num (#7386 ) * [enhancement] Make concurrent_number for routine load task can be larger than be num Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2021-12-17 11:04:29 +08:00
Mingyu Chen	2b90967c4c	[fix][refactor](broker load) refactor the scheduling logic of broker load (#7371 ) 1. Refactor the scheduling logic of broker load. For details see #7367 2. Fix a bug that loadedBytes in the SHOW LOAD result is wrong. 3. Cancel the thread of LoadTimeoutChecker. Now there is no timeout for PENDING load jobs, and the timeout of a load job starts when the pending load task is scheduled. 4. Fix a bug that the loading task is never submitted to the pool. The logic of BlockedPolicy was wrong. We should make sure the task is submitted to the pool, or the RejectedExecutionException should be thrown. 5. The transaction of a load job now begins in the pending task, instead of when the job is submitted.	2021-12-16 10:39:22 +08:00
jiafeng.zhang	2e334d06da	[docs](sql-block-rule) modify document of sql block rule (#7370 )	2021-12-16 10:38:54 +08:00
Zhengguo Yang	926540c561	[feature] Support return bitmap/hll data in select statement (#7276 ) Support returning bitmap/hll data in select statements; this can be used when show_object_data=true is set.	2021-12-15 09:48:27 +08:00
Mingyu Chen	db57c42c83	[improvement](compaction)(tablet repair) Add missing rowsets in compaction status url and support force dropping redundant replica (#7283 ) 1. Add missing rowsets in compaction status url 2. Add a new config `force_drop_redundant_replica` to force drop redundant replicas. 3. Fix FE ut	2021-12-09 22:34:57 +08:00
Zhengguo Yang	62d12067aa	[feature](udf) make orthogonal bitmap udaf as built-in functions (#7211 ) Move the orthogonal bitmap UDAFs into built-in functions and add three built-in bitmap functions: - orthogonal_bitmap_intersect - orthogonal_bitmap_intersect_count - orthogonal_bitmap_union_count	2021-12-07 09:57:26 +08:00
renzhimin7	27f494dad3	[docs][typo] Update fe_config.md (#7252 ) Int type should be 4 bytes and decimal should be 16 bytes	2021-12-06 10:25:28 +08:00
EmmyMiao87	845f931098	[fix](select outfile) Remove optional properties check of hdfs storage (#7272 )	2021-12-03 13:42:56 +08:00
Xinyi Zou	fc9e502b51	[improvement](brpc)(config) Support transfer RowBatch in Controller Attachment (#7164 ) Transfer RowBatch in Protobuf Request to Controller Attachment, when the maximum length of the RowBatch in the Protobuf Request is exceeded. This can avoid reaching the upper limit of the Protobuf Request length (2G), and it is expected that performance can be improved.	2021-12-02 11:41:38 +08:00
caiconghui	fbab8afe24	[feature] Support disable query and load for backend to make Doris more robust and set default value to 1 for max_query_retry_time (#7155 ) ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_query" = "true"); ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_load" = "true");	2021-11-30 22:08:32 +08:00
xu20160924	3b988204fc	[doc] Modify the wrong comment of the ScanTime (#7109 ) Modify the wrong comment of the ScanTime.	2021-11-24 10:40:00 +08:00
tianhui5	d3c020b3cb	[feat-opt](fe-config) Add tablets number limit to avoid wrong usage (#7025 ) 1. Add new FE config `default_db_replica_quota_size` 2. Check replica quota after create table/partition	2021-11-24 10:37:54 +08:00
renzhimin7	ce7fa5d6d9	[typo] Update multi-tenant.md (#7162 ) A double quote is missing	2021-11-22 14:47:00 +08:00
tianhui5	143d3769b1	[feat](config) add FE config to limit the replica num per tablet (#7087 )	2021-11-20 21:40:23 +08:00
lihuigang	e9282205f1	[feat-opt](spark-load) support bitmap binary data from hive in spark load (#6883 ) Support to load the binary data of bitmap value from Hive into Doris. fix #6461	2021-11-20 21:38:38 +08:00
EmmyMiao87	11cca0b15d	[JoinReorder] Add session variable to close join order (#7076 ) The new session variable 'close_join_reorder' is used to turn off all automatic join reorder algorithms. If close_join_reorder is true, Doris will execute the query in the join order written in the original query.	2021-11-13 17:10:44 +08:00
曹建华	93ccef4ec7	[Feature] Add degradation strategy for local_replica_selection. (#7064 ) When local_replica_selection is turned on, support selecting a non-local BE to serve the query when the local BE is unavailable.	2021-11-13 17:09:25 +08:00
Mingyu Chen	58804d3570	[Colocate] Fix bug that colocate group can not be redistributed after dropping a backend (#7020 ) Main changes: 1. Fix [Bug] Colocate group can not redistributed after dropping a backend #7019 2. Add a detailed message about why a colocate group is unstable. 3. Add more suggestions for upgrading a Doris cluster.	2021-11-11 15:41:49 +08:00
EmmyMiao87	5d946ccd5e	[Docs] Add hdfs outfile example (#7052 )	2021-11-09 10:02:28 +08:00
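
Note on commit c1fef37399 (adaptive runtime filter): the selection rule described in that message can be illustrated with a small sketch. This is not Doris code; the function name `choose_runtime_filter` and the sample threshold are hypothetical and only restate the rule "rows < runtime_filter_max_in_num uses IN, otherwise Bloom filter".

```python
def choose_runtime_filter(build_rows: int, runtime_filter_max_in_num: int) -> str:
    """Illustrative only: pick the filter kind for IN_OR_BLOOM_FILTER.

    build_rows is the number of rows in the right (build-side) table.
    The function name and return strings are hypothetical, not Doris APIs.
    """
    if build_rows < 0 or runtime_filter_max_in_num < 0:
        raise ValueError("row counts and thresholds must be non-negative")
    # Below the threshold the IN predicate is used; at or above it,
    # the Bloom filter takes effect.
    if build_rows < runtime_filter_max_in_num:
        return "IN"
    return "BLOOM_FILTER"


# Sample threshold chosen for illustration only.
small = choose_runtime_filter(10, 1024)
large = choose_runtime_filter(1024, 1024)
```

With the sample threshold, a 10-row right table selects the IN predicate, while a table with exactly as many rows as the threshold already falls back to the Bloom filter.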


267 Commits