doris

Author	SHA1	Message	Date
thinker	8cf7ff78df	[Bug] big_int * big_int product overflow (#6788 ) A query with multiple where conditions, such as `where dt in (20210926,20210919) and hour<=13`, causes an int * int product to overflow. As a result, the function extend_scan_key mistakenly calls `range.convert_to_fixed_value()`. For a big `range[_low_value, _high_value)`, a mass of values is then inserted into _fixed_values, which finally results in OOM.	2021-10-03 12:17:03 +08:00
Zhengguo Yang	7297b275f1	[Optimize] Optimize cpu consumption when importing parquet files (#6782 ) Remove part of dynamic_cast, reduce the overhead caused by type conversion, and probably reduce the cpu consumption of parquet file import by about 10%	2021-10-03 12:14:35 +08:00
Mingyu Chen	ad3c9390a2	[Bug] Fix bdbje getDatabaseNames() bug and scan node close bug (#6769 ) 1. This bug is introduced from #6582 2. Optimize the error log of the Address used error msg. 3. Add some documents about compilation: (a) add a custom thirdparty download url; (b) add a custom com.alibaba maven jar package for DataX. 4. Fix bug that BE crashes when closing scan node, introduced from #6622.	2021-09-29 11:11:28 +08:00
Mingyu Chen	982b76c3c0	[Bug] Fix resource tag bug, add documents and some other bug fix (#6708 ) 1. Fix bug of UNKNOWN Operation Type 91 2. Support using resource_tag property of user to limit the usage of BE 3. Add new FE config `disable_tablet_scheduler` to disable tablet scheduler. 4. Add documents for resource tag. 5. Modify the default value of FE config `default_db_data_quota_bytes` to 1PB. 6. Add a new BE config `disable_compaction_trace_log` to disable the trace log of compaction time cost. 7. Modify the default value of BE config `remote_storage_read_buffer_mb` to 16MB 8. Fix `show backends` results error 9. Add new BE config `external_table_connect_timeout_sec` to set the timeout when connecting to odbc and mysql table. 10. Modify issue template to enable blank issue, for release note or other specific usage. 11. Fix a bug in alpha_row_set split_range() function.	2021-09-28 10:37:42 +08:00
Mingyu Chen	42c7d39faa	[Revert] "[Enhancement] Modify the method of calculating compaction score (#6252 )" (#6748 ) This reverts commit dedb57f87e31305db3e2a13e374ba4fd58043fca. Reverts #6252 This commit may cause tablets whose segments are all empty to never be compacted, which results in a -235 error. I will revert this commit, and the problem will be solved in #6671	2021-09-27 10:35:19 +08:00
thinker	850cf10991	[Refactor] refactor olap_scan_node: discard boost, remove dynamic_cast (#6622 ) 1. refactor olap_scan_node: discard boost, remove dynamic_cast 2. use move instead of copy version for push_back	2021-09-27 10:32:57 +08:00
Mingyu Chen	36d6788bc3	[Optimize] Use compact mode to send query plan thrift data structure. (#6702 ) In some cases, the query plan thrift structure of a query may be very large (for example, when there are many columns in SQL), resulting in a large number of "send fragment timeout" errors. This PR adds an FE config to control whether to transmit the query plan in a compressed format. Using compressed format transmission can reduce the size by ~50%, but it may reduce the concurrency by ~10%. Therefore, in the high concurrency small query scenario, you can choose to turn off compact mode.	2021-09-25 12:13:29 +08:00
EmmyMiao87	bdc8c98008	[Outfile] Support hdfs in select outfile clause (#6644 ) Support hdfs in select outfile clause without broker. This PR implements an HDFS writer in BE which is used to write HDFS files directly without using broker. The hdfs outfile clause syntax check has also been added in FE. The syntax: ``` select * from xx into outfile "hdfs://user/outfile_" format as csv properties ("hdfs.fs.dafultFS" = "xxx", "hdfs.hdfs_user" = "xxx"); ``` Note that all hdfs configurations need to carry the prefix `hdfs.`.	2021-09-24 10:07:11 +08:00
Zhengguo Yang	5c45e26644	Fixed zone map init error for string type (#6667 ) 1. Fixed the problem that the StringValue memory generated by Expr may be released before use. 2. Fixed from_string for String type may overflow.	2021-09-23 09:44:22 +08:00
Mingyu Chen	521fb15a9b	[Bug] Fix some memory bugs (#6699 ) 1. Fix a memory leak in `collect_iterator.cpp` (Fix #6700) 2. Add a new BE config `max_segment_num_per_rowset` to limit the num of segment in new rowset.(Fix #6701) 3. Make the error msg of stream load more friendly.	2021-09-22 12:30:14 +08:00
Mingyu Chen	fee8e6afc5	[Bug] Fix some bugs (#6665 ) 1.Fix a potential BE coredump of sending batch when loading data. (Fix [Bug] BE crash when loading data #6656) 2.Fix a potential BE coredump when doing schema change. (Fix [Bug] BE crash when doing alter task #6657) 3.Optimize the metric of base_compaction_request_failed. 4.Add Order column in show tablet result. (Fix [Feature] Add order column in SHOW TABLET stmt result #6658) 5.Fix bug that tablet repair slot not being released. (Fix [Bug] Tablet scheduler stop working #6659) 6.Fix bug that REPLICA_MISSING error can not be handled. (Fix [Bug] REPLICA_MISSING error can not be handled. #6660) 7.Modify column name of SHOW PROC "/cluster_balance/cluster_load_stat" 8.Optimize the result of SHOW PROC "/statistic" to show COLOCATE_MISMATCH tablets (Fix [Feature] the health status of colocate table's tablet is not shown in show proc statistic #6663) 9.Fix bug that show load where state='pending' can not be executed. (Fix [Bug] show load where state='pending' can not be executed. #6664)	2021-09-17 10:11:37 +08:00
Zhengguo Yang	332ba4cded	[config] use thrift_rpc_timeout_ms config replace hard code value (#6637 ) use thrift_rpc_timeout_ms config to replace hard code value	2021-09-16 10:22:57 +08:00
Zhengguo Yang	61c9d11fdb	support change column type from decimal to string (#6643 )	2021-09-14 15:56:44 +08:00
Cui Kaifeng	020282e885	[Bug] Fix aes_decrypt to handle null input correctly. (#6636 )	2021-09-14 11:19:55 +08:00
qiye	225bdb1fda	[Bug] fix `replace` function bug (#6605 ) * fix replace function bug * fix replace docs	2021-09-14 09:59:13 +08:00
Zhengguo Yang	794d4e7ace	fix insert null as string type may coredump (#6615 )	2021-09-13 12:30:34 +08:00
Yunfeng,Wu	b3ae607fe9	[Spark-Doris-Connector] support boolean data type (#6601 ) 1. Support boolean data type for spark-doris-connector, because Doris already supports the boolean data type 2. Bug fix for the Doris BE core when Spark requests data from BE	2021-09-12 10:07:23 +08:00
stdpain	a4fbad3736	[BUG][Profile] Fixed the problem that BE's profile could not add chil… (#6268 ) * [BUG][Profile] Fixed the problem that BE's profile could not add child profile in the specified correct location bug: runtime_profile()->add_child(build_phase_profile, false, nullptr); child profile will add to second location * Update runtime_profile.cpp	2021-09-10 09:53:51 +08:00
Mingyu Chen	b2f1e21a3b	[Bugs] Fix some bugs (#6586 ) * fix regex lazy * fix result file core * fix dynamic partition replica and table name length bug * fix replicanum 0 * fix delete bug * renew proxy Co-authored-by: morningman <chenmingyu@baidu.com>	2021-09-10 09:53:30 +08:00
stdpain	39bb669dcb	[BUG] fix extra memory copy in bitmap value (#6599 )	2021-09-10 09:52:41 +08:00
Zhengguo Yang	4f744333c2	fix some core in local test: (#6594 ) 1. insert very large string value may coredump 2. some analytic function and agg function results may be incorrect 3. string compare may coredump when string type is too large 4. string type in delete condition can not be processed correctly 5. add text/blob as alias of string to be compatible with mysql 6. fix string type min/max agg may process incorrectly	2021-09-10 09:52:03 +08:00
Mingyu Chen	de66312a1a	[Job Clean] Sort out the cleanup logic of historical jobs (#6553 ) There are many historical job records in Doris, such as load jobs, alter jobs, export jobs and so on. These historical jobs are generally cleaned up periodically by the cleanup thread, to avoid taking too much memory. This PR reorganized the cleanup logic of historical jobs and optimized the cleanup logic of some historical jobs to reduce the memory usage of historical jobs. The following FE configuration items are related to historical job cleaning: 1. label_keep_max_second Used to determine whether LoadJob, LoadJobV2, RoutineLoadJob or TransactionState are expired. 2. streaming_label_keep_max_second Used to determine whether InsertJob, DeleteJob or TransactionState are expired. Different from label_keep_max_second, this config is used to clean up these frequently submitted jobs or load transactions. 3. history_job_keep_max_second Used to determine whether AlterJob, ExportJob are expired	2021-09-07 16:57:45 +08:00
Mingyu Chen	74ddea8d83	[Optimize] Remove some unused code to reduce lock contention (#6566 ) 1. Remove global runtime profile counter 2. Remove unused thread token register	2021-09-07 11:56:12 +08:00
Pxl	577ff01094	[Bug][Function] Fix pad function wrong result when len.val==str_char_size (#6564 ) like #6563 and #6562	2021-09-07 11:55:49 +08:00
EmmyMiao87	9469b2ce1a	[Outfile] Support concurrent export of query results (#6539 ) This PR mainly supports: 1. Exporting query result sets concurrently 2. Exporting query result sets via the s3 protocol. There are several preconditions for concurrently exporting query result sets: 1. The concurrent export variable is enabled 2. The query itself can be exported concurrently (some queries containing sort nodes at the top level cannot be exported concurrently) 3. The export uses the s3 protocol instead of the broker. After exporting the result set concurrently, the file prefix is changed to outfile_{query_instance_id}_filenumber.{file_format}	2021-09-07 11:53:32 +08:00
stdpain	1206f9a17f	[RuntimeFilter] support TYPE_STRING in runtime filter (#6573 )	2021-09-06 11:55:11 +08:00
weizuo93	57199955d6	[Compaction][ThreadPool]Support adjust compaction threads num at runtime (#5781 ) * adjust thread number of compaction thread pool dynamically Co-authored-by: weizuo <weizuo@xiaomi.com>	2021-09-02 10:01:44 +08:00
wangbo	d8cde8c044	(#6454 ) Remove useless code for Segment V2 (#6455 )	2021-09-02 09:59:21 +08:00
zhangstar333	7a15e583a7	[Feature]Support functions of json_array, json_object, json_quote (#6504 )	2021-09-02 09:59:02 +08:00
Zhengguo Yang	9f7d4cf741	[BUG] fix bugs with string type (#6538 ) * fix bugs with string type 1. not support string with agg type min/max 2. agg_update with large string may coredump 3. stringval with large string may coredump 4. not support string as partition key	2021-09-01 15:59:55 +08:00
caiconghui	0393c9b3b9	[Optimize] Support send batch parallelism for olap table sink (#6397 ) * Support send batch parallelism for olap table sink Co-authored-by: caiconghui <caiconghui@xiaomi.com>	2021-08-30 11:03:09 +08:00
weizuo93	dedb57f87e	[Enhancement] Modify the method of calculating compaction score (#6252 ) * optimize calculation method of compaction score to lower the priority of rowset with 0 segments Co-authored-by: weizuo <weizuo@xiaomi.com>	2021-08-27 11:10:41 +08:00
Mingyu Chen	3f2fdd236f	Add scan thread token (#6443 )	2021-08-27 10:56:17 +08:00
stdpain	bfb2252175	[RuntimeFilter] provide no simd block bloom filter implement to support arm (#6511 )	2021-08-27 10:22:36 +08:00
Zhengguo Yang	ca3eb6490e	push down conditions on unique table value columns to base rowset (#6457 )	2021-08-26 09:14:49 +08:00
Zhengguo Yang	acc5fd2f21	[BUG] Fix string type cast bug and runtime filter may core when not support avx2 (#6495 ) * fix string type cast bug and runtime filter instructions may not support * add arm support	2021-08-26 09:14:31 +08:00
Mingyu Chen	fa290383dc	[Doc] Modify README to add some statistical indicators (#6486 ) 1. Add license/total line/release badges. 2. Add monthly active contributor and contributor growth graph 3. fix a pom.xml bug 4. Modify some routine load log on BE side	2021-08-25 09:36:26 +08:00
caiconghui	7e30b28f3a	[Optimize] Speed up converting the data of other types to string in mysql_result_writer (#6384 ) Co-authored-by: caiconghui <caiconghui@xiaomi.com>	2021-08-24 22:30:58 +08:00
Zhengguo Yang	146060dfc0	[Bug]Fix result_writer may coredump (#6482 ) fix result_writer may coredump, let BufferControlBlock owns the memory	2021-08-22 22:04:00 +08:00
Mingyu Chen	fa382f8602	[Bug][MemLimit] Modify the memory limit of storage page cache (#6451 ) This CL mainly changes: 1. the `storage_page_cache_limit` is based on config `mem_limit`; the default is 20% of `mem_limit`. 2. the `buffer_pool_limit` is based on config `mem_limit`; the default is 20% of `mem_limit`. 3. the `buffer_pool_clean_pages_limit` is based on config `buffer_pool_limit`; the default is 50% of `buffer_pool_limit`. 4. Fix some show bugs of lru cache hit ratio and usage ratio 5. Fix a create view bug that `notEvalNondeterministicFunction` should be reset after analyze.	2021-08-19 14:16:53 +08:00
Xiang Wei	c65ec3136b	[Improvement] spark load without agg and de/serialization (#6270 ) fix #6269 The outline of our changes is to reduce memory usage to avoid OOM in BE and to speed up the calculation. 1. We do not need to do aggregation in load, which has already been done in the ETL spark job. 2. Based on 1, we do not need to serialize/deserialize bitmap/HLL objects.	2021-08-19 14:15:01 +08:00
Hao Tan	66a7a4b294	[Feature] Support exact percentile aggregate function (#6410 ) Support to calculate the exact percentile value array of numeric column `col` at the given percentage(s).	2021-08-18 15:56:06 +08:00
Zhengguo Yang	0c5c3f7d87	Fixed the problem that there may be redundant retries when the query result export fails (#6436 )	2021-08-18 09:06:02 +08:00
Zhengguo Yang	8738ce380b	Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391 )	2021-08-18 09:05:40 +08:00
caiconghui	285d44cd48	[BUG] Fix potential overflow exception when do money format for double (#6408 ) * [BUG] Fix potential overflow bug when do money format for double Co-authored-by: caiconghui <caiconghui@xiaomi.com>	2021-08-15 18:40:26 +08:00
Mingyu Chen	2030c44dba	[Log] Modify some log level on BE side (#6381 )	2021-08-14 10:25:45 +08:00
stdpain	34af66bf1d	[BUG][Memory] fix memory tracker DCHECK fail in debug mode and Fix Process Memory limit fail (#6438 )	2021-08-14 10:24:33 +08:00
Pxl	8a267f1ac5	[Feature] Support for cleaning the trash actively (#6323 )	2021-08-12 10:07:51 +08:00
HappenLee	9216735cfa	[New Feature] Support Vectorization Execution Engine Interface For Doris (#6329 ) 1. FE vectorized plan code 2. Function register vec function 3. Diff function nullable type 4. New thirdparty code and new thrift struct	2021-08-11 14:54:06 +08:00
luozenglin	0930e89452	[http][manager] Add manager related http interface. (#6396 ) Encapsulate some http interfaces for better management and maintenance of doris clusters. The http interface includes getting cluster connection information, node information, node configuration information, batch modifying node configuration, and getting query profile. For details, please refer to the document: `docs/zh-CN/administrator-guide/http-actions/fe/manager/`	2021-08-10 10:58:31 +08:00
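
The overflow described in commit 8cf7ff78df can be illustrated with a small sketch. It is not the Doris code: it assumes, hypothetically, that the scan-key logic multiplies per-column value counts and compares the product with a limit before expanding a range into fixed values. The names `wrap_int32`, `fits_budget_unsafe`, `fits_budget_safe` and the limit 1024 are made up for illustration, and 32-bit wraparound is emulated because Python integers do not overflow.

```python
INT32_MIN = -2**31

def wrap_int32(value: int) -> int:
    """Emulate two's-complement 32-bit wraparound of an integer result."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN

def fits_budget_unsafe(counts, max_keys):
    """Multiply the counts in (emulated) 32-bit arithmetic, then compare.

    A wrapped product can look small, so the check may pass by mistake.
    """
    product = 1
    for count in counts:
        product = wrap_int32(product * count)
    return product <= max_keys

def fits_budget_safe(counts, max_keys):
    """Reject as soon as the next multiplication would exceed max_keys.

    Assumes every count is a non-negative integer.
    """
    product = 1
    for count in counts:
        if count != 0 and product > max_keys // count:
            return False
        product *= count
    return True

# Hypothetical numbers: `dt in (...)` gives 2 fixed values, while `hour<=13`
# on a 32-bit column spans 2**31 + 14 values (from the minimum int up to 13).
counts = [2, 2**31 + 14]
print(wrap_int32(2 * (2**31 + 14)))      # wrapped product
print(fits_budget_unsafe(counts, 1024))  # passes although the true product is huge
print(fits_budget_safe(counts, 1024))    # rejected before any expansion
```

Dividing the limit instead of multiplying the counts keeps every intermediate value bounded, which is one common way to avoid this class of bug; the actual fix in #6788 may differ.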


1446 Commits