doris

Author	SHA1	Message	Date
HappenLee	53fed2d35e	[BUG] Fix the bug of query in expr (#6767 ) (#6768 )	2021-10-05 12:26:10 +08:00
xy720	5f3559a94c	[Bug][Binlog] Fix bug that multiple sync jobs can connect to the same canal instance (#6756 ) When creating sync jobs, different jobs must not be allowed to connect to the same canal instance; otherwise these jobs will compete with each other for the data produced by that canal instance, which may cause data inconsistency.	2021-10-03 12:21:06 +08:00
thinker	8cf7ff78df	[Bug] big_int * big_int product overflow (#6788 ) A query with multiple where conditions, such as `where dt in (20210926,20210919) and hour<=13`, can cause the int * int product to overflow. The function extend_scan_key then mistakenly calls `range.convert_to_fixed_value()`, and for a big `range[_low_value, _high_value)` a massive number of values is inserted into _fixed_values, which finally results in OOM.	2021-10-03 12:17:03 +08:00
Zhengguo Yang	7297b275f1	[Optimize] Optimize cpu consumption when importing parquet files (#6782 ) Remove some of the dynamic_cast calls to reduce the overhead caused by type conversion, which probably reduces the cpu consumption of parquet file import by about 10%.	2021-10-03 12:14:35 +08:00
EmmyMiao87	fb7fc27a0a	[Bug] Fix duplicate result in colocated agg node (#6727 ) Fixed #6726. If the plan fragment contains a colocated agg plan node, it will be a colocated fragment. The scan range and backend id of a colocated fragment instance should be different from the ordinary scheduler logic. Tablets in the same bucket must fall on the same BE. For example, for the same bucket in different partitions, even though the tablet id is different, they must be scheduled to the same BE for the scan node.	2021-10-03 11:59:38 +08:00
pengxiangyu	83003cc372	[Thirdparty] Change libhdfs3 download url to a stable one(#6744 )	2021-10-03 11:56:36 +08:00
Mingyu Chen	7a20d6d4c2	[Doc] Modify document of resource tag (#6778 ) Fix typo	2021-10-03 11:37:45 +08:00
shee	e7707c8180	[FOLLOWUP] create table like clause support copy rollup (#6580 ) * Remove `ALL` key word to make grammar more clear. Co-authored-by: qzsee <shizhiqiang03@meituan.com>	2021-09-30 18:26:21 +08:00
Mingyu Chen	ad3c9390a2	[Bug] Fix bdbje getDatabaseNames() bug and scan node close bug (#6769 ) 1. This bug is introduced from #6582 2. Optimize the error log of the Address used error msg. 3. Add some documents about compilation: (1) add a custom thirdparty download url; (2) add a custom com.alibaba maven jar package for DataX. 4. Fix bug that BE crashes when closing scan node, introduced from #6622.	2021-09-29 11:11:28 +08:00
chovy	8d471007a6	[Feature] support spark connector sink stream data to doris (#6761 ) * [Feature] support spark connector sink stream data to doris * [Doc] Add spark-connector batch/stream writing instructions * add license and remove meaningless blanks code Co-authored-by: wei.zhao <wei.zhao@aispeech.com>	2021-09-28 17:46:19 +08:00
wudi	df5ba6b5a2	[Fix] Flink connector support json import and use httpclient to streamload (#6740 ) * [Bug]:fix NullPointerException thrown when data is null * [Bug]:Distinguish between null and empty string * [Feature]:flink-connector supports streamload parameters * [Fix]:code style * [Fix]: support json format import and use httpclient to streamload * [Fix]:remove System out * [Fix]:upgrade httpclient version * [Doc]: add json format import doc Co-authored-by: wudi <wud3@shuhaisc.com>	2021-09-28 17:37:03 +08:00
Henry2SS	cdf9f9e980	[Dynamic Partition] reserve specific history periods by dynamic partition. (#6554 ) Add RESERVED_HISTORY_STARTS and RESERVED_HISTORY_ENDS. Fixes #6514	2021-09-28 11:39:35 +08:00
Jennifer Huang	adf6510050	[docs] Update README.md (#6711 )	2021-09-28 10:38:23 +08:00
Mingyu Chen	982b76c3c0	[Bug] Fix resource tag bug, add documents and some other bug fix (#6708 ) 1. Fix bug of UNKNOWN Operation Type 91 2. Support using resource_tag property of user to limit the usage of BE 3. Add new FE config `disable_tablet_scheduler` to disable tablet scheduler. 4. Add documents for resource tag. 5. Modify the default value of FE config `default_db_data_quota_bytes` to 1PB. 6. Add a new BE config `disable_compaction_trace_log` to disable the trace log of compaction time cost. 7. Modify the default value of BE config `remote_storage_read_buffer_mb` to 16MB 8. Fix `show backends` results error 9. Add new BE config `external_table_connect_timeout_sec` to set the timeout when connecting to odbc and mysql table. 10. Modify issue template to enable blank issue, for release note or other specific usage. 11. Fix a bug in alpha_row_set split_range() function.	2021-09-28 10:37:42 +08:00
Mingyu Chen	42c7d39faa	[Revert] "[Enhancement] Modify the method of calculating compaction score (#6252 )" (#6748 ) This reverts commit dedb57f87e31305db3e2a13e374ba4fd58043fca. Reverts #6252 This commit may cause tablets whose segments are all empty to never be compacted, which results in a -235 error. I will revert this commit, and the problem will be solved in #6671	2021-09-27 10:35:19 +08:00
shee	e4d999274f	[BUG] Fix a bug when modify table's colocate group with same name (#6695 ) If the new group name is the same as the old group name when modifying a table's colocate group name, the group is left in an unstable state	2021-09-27 10:34:41 +08:00
thinker	850cf10991	[Refactor] refactor olap_scan_node: discard boost, remove dynamic_cast (#6622 ) 1. refactor olap_scan_node: discard boost, remove dynamic_cast 2. use move instead of copy version for push_back	2021-09-27 10:32:57 +08:00
xy720	3db8160400	[Bug] Fix bug that Tuple is null predicate may cause BE cores (#6466 )	2021-09-27 10:31:48 +08:00
Zeno Yang	11ec38dd6f	[Bug] When using view, make toSql method generate the final sql (#6736 ) 1. Fix the problem that the WITH statement cannot be printed when `UNION` is included in SQL 2. In the `toSql` method, convert the normal VIEW into the final statement 3. Replace `selectStmt.originSql` with `selectStmt.toSql`	2021-09-26 11:44:23 +08:00
dh-cloud	ce7f9bef91	[Bug][bdbje] handle bdb rollbackexception (#6582 ) When using 3 FE followers and restarting the FEs, regardless of order, there is a chance that an FE cannot start successfully and bdb throws RollbackException. In this scenario, bdb suggests catching the exception, closing all ReplicatedEnvironment handles, and then reopening them. So we catch the RollbackException and reopen the ReplicatedEnvironment	2021-09-26 11:43:58 +08:00
Ming King	a121124fb2	[Doc] Update doris-on-es.md (#6734 ) Typo	2021-09-25 12:28:03 +08:00
jiafeng.zhang	f3d4c475b1	[DOC] Add connection reset exception solution (#6733 ) Add solution for connection reset exception when doing stream load.	2021-09-25 12:27:35 +08:00
Gabriel	ec777aa122	[DOCS] improve docs (#6718 )	2021-09-25 12:26:41 +08:00
weizuo93	e5a4172b27	[Bug][Docs]Fix outfile docs for parquet (#6709 ) Update outfile documents for parquet.	2021-09-25 12:24:52 +08:00
xy720	537a542dba	[Bugs] Fix the bugs list of sync job (#6705 ) 1. Fix bug that the sync jobs are not cancelled after deleting the database. 2. The MySQL and Doris tables should have a one-to-one correspondence. If they do not, creating the task should fail. 3. When the cluster has multiple FE, the non-master will core when replaying the creation of the sync job. 4. Inconsistent data when updating key column. 5. Failed to synchronize data when there are multiple tables in a single sync job. 6. After restarting the master, resuming the paused syncjob will fail.	2021-09-25 12:24:29 +08:00
Mingyu Chen	36d6788bc3	[Optimize] Use compact mode to send query plan thrift data structure. (#6702 ) In some cases, the query plan thrift structure of a query may be very large (for example, when there are many columns in SQL), resulting in a large number of "send fragment timeout" errors. This PR adds an FE config to control whether to transmit the query plan in a compressed format. Using compressed format transmission can reduce the size by ~50%. But it may reduce the concurrency by ~10%. Therefore, in the high concurrency small query scenario, you can choose to turn off compact mode.	2021-09-25 12:13:29 +08:00
zhoubintao	56031cbbe1	[Doc] Change CN/EN sql-functions single quote in markdown (#6698 )	2021-09-24 21:42:52 +08:00
zhuixun	f73af475ce	[HTTP API] Add aggregation type information in table schema api (#6686 ) ``` { "msg": "success", "code": 0, "data": { "properties": [{ "type": "INT", "name": "k1", "comment": "", "aggregation_type":"" }, { "type": "INT", "name": "k2", "comment": "", "aggregation_type":"MAX" }], "status": 200 }, "count": 0 } ```	2021-09-24 21:42:24 +08:00
jiafeng.zhang	e03b74ebc1	[Doc] Add the error code document of returned by the OLAP function on the BE side (#6666 )	2021-09-24 21:40:20 +08:00
caiconghui	af771bee5a	[Improvement] Try to finish transaction if all backends of unfinished tasks have been dead (#6662 )	2021-09-24 21:39:20 +08:00
xhmz	68529d20f3	[Flink] Fix bug of flink doris connector (#6655 ) Flink-Doris-Connector does not support flink 1.13. Refactor the doris sink format to use RowData::FieldGetter instead of GenericRowData.	2021-09-24 21:38:35 +08:00
ChPi	39fd839cd1	[Bug] fix backup bug when comparisons are case sensitive (#6648 ) #6633	2021-09-24 21:35:50 +08:00
smallhibiscus	f49362b0d7	[Demo] Add Spark-Doris-Sink demo (#6570 ) This demo includes reading hdfs files and writing to doris through streaming load, reading kafka message queues and writing to doris through streaming load, and reading doris tables through the spark doris connector to build a DataFrame dataset.	2021-09-24 21:35:08 +08:00
Mingyu Chen	a52104fe40	[Bug] Fix bug that DROP SCHEMA will forcibly drop database (#6729 )	2021-09-24 10:35:40 +08:00
EmmyMiao87	bdc8c98008	[Outfile] Support hdfs in select outfile clause (#6644 ) Support hdfs in select outfile clause without broker. This PR implements an HDFS writer in BE which is used to write HDFS files directly without using broker. The hdfs outfile clause syntax check has also been added in FE. The syntax: ``` select * from xx into outfile "hdfs://user/outfile_" format as csv properties ("hdfs.fs.defaultFS" = "xxx", "hdfs.hdfs_user" = "xxx"); ``` Note that all hdfs configurations need to carry the prefix `hdfs.`.	2021-09-24 10:07:11 +08:00
pierre xiong	840a7ef3a8	Fix a typo (#6688 ) Fix a typo	2021-09-23 09:44:46 +08:00
Zhengguo Yang	5c45e26644	Fixed zone map init error for string type (#6667 ) Fixed the problem that the StringValue memory generated by Expr may be released before use. Fixed the problem that from_string for String type may overflow.	2021-09-23 09:44:22 +08:00
Mingyu Chen	521fb15a9b	[Bug] Fix some memory bugs (#6699 ) 1. Fix a memory leak in `collect_iterator.cpp` (Fix #6700) 2. Add a new BE config `max_segment_num_per_rowset` to limit the num of segment in new rowset.(Fix #6701) 3. Make the error msg of stream load more friendly.	2021-09-22 12:30:14 +08:00
Mingyu Chen	fee8e6afc5	[Bug] Fix some bugs (#6665 ) 1. Fix a potential BE coredump of sending batch when loading data. (Fix [Bug] BE crash when loading data #6656) 2. Fix a potential BE coredump when doing schema change. (Fix [Bug] BE crash when doing alter task #6657) 3. Optimize the metric of base_compaction_request_failed. 4. Add Order column in show tablet result. (Fix [Feature] Add order column in SHOW TABLET stmt result #6658) 5. Fix bug that tablet repair slot is not released. (Fix [Bug] Tablet scheduler stop working #6659) 6. Fix bug that REPLICA_MISSING error can not be handled. (Fix [Bug] REPLICA_MISSING error can not be handled. #6660) 7. Modify column name of SHOW PROC "/cluster_balance/cluster_load_stat" 8. Optimize the result of SHOW PROC "/statistic" to show COLOCATE_MISMATCH tablets (Fix [Feature] the health status of colocate table's tablet is not shown in show proc statistic #6663) 9. Fix bug that show load where state='pending' can not be executed. (Fix [Bug] show load where state='pending' can not be executed. #6664)	2021-09-17 10:11:37 +08:00
xy720	95cdb7cc0c	[Enhance] [Binlog] Reduce thread number of SyncJob to save resources (#6418 ) This commit reduces the thread number of SyncJob. 1. Submit send task to thread pool to send data. 2. Submit eof task to thread pool to block and wake up client to commit transactions. 3. Use SerialExecutorService to ensure correct order of sent data in every channel. Besides, some bugs have been fixed in this commit: 1. Failed to resume syncJob. 2. Failed to sync data when setting multiple tables in a syncJob. 3. In a cluster with multiple FE, master may hang after creating syncJob.	2021-09-17 10:01:27 +08:00
EmmyMiao87	085942b30f	[Doc] Download hashes and signatures use "downloads.apache.org" (#6677 ) The latest release should use https://www.apache.org/dyn/closer.lua The latest hashes and signatures should use https://downloads.apache.org/ The old release should use http://archive.apache.org/dist	2021-09-16 18:09:08 +08:00
ccoffline	dd8a1da159	[Performance] Improve performance for showing proc statistic(#6567 ) Co-authored-by: 迟成 <chicheng@meituan.com>	2021-09-16 10:43:57 +08:00
Zhengguo Yang	e45b487504	[PLUGIN] optimize output msg when auditlog plugin error (#6653 ) * Optimize output msg when auditlog plugin error	2021-09-16 10:29:47 +08:00
xy720	67472f3518	[Meta][Refactor] Use lambda expressions to save Image (#6646 ) The original code in MetaWriter is tedious, so I try to use lambda expressions to make the code cleaner.	2021-09-16 10:28:40 +08:00
Zeno Yang	7f5631717e	[Bug] Fix the bug of generating sql_key in SqlCache (#6668 ) The current SqlCache sql_key is generated by taking the md5 value of selectStmt.toSql(), but selectStmt.toSql() is spliced together from the operator tree, and sometimes some specific parameters cannot be displayed. As a result, SQL statements with different parameters hit the same cache, and the query results are inconsistent with expectations. For example, our user has a sql with more than 300 lines, which contains a lot of parameters, including partitions. But the result of selectStmt.toSql() is: SELECT `tb`.`type` AS `type`, `tb`.`name` AS `name`, `tb`.`name1` AS `name1`, `tb`.`name2` AS `name2`, `tb`.`name3` AS `name3` FROM ( SELECT 3 AS `type`, `cc`.`name` AS `name`, `cc`.`name1` AS `name1` , coalesce(`bb`.`name`, '请联系您的品牌业务经理进行咨询。') AS `name2`, `bb`.`name1` AS `name3` FROM `cc` LEFT JOIN `bb` ON `cc`.`id` = `bb`.`id1` UNION ALL SELECT `dd`.`type` AS `type`, `dd`.`name` AS `name`, `dd`.`name1` AS `name1`, `dd`.`name2` AS `name2`, `dd`.`name3` AS `name3` FROM `dd` UNION ALL SELECT `ee`.`type` AS `type`, `ee`.`name` AS `name`, `ee`.`name1` AS `name1`, `ee`.`name2` AS `name2`, `ee`.`name3` AS `name3` FROM `ee` ) tb LIMIT 10 In this way, when the user specified different partitions for the query, the same cache entry was hit, which was inconsistent with the expected result. Therefore, it is recommended to use originStmt instead of selectStmt.toSql() to generate sql_key.	2021-09-16 10:26:15 +08:00
GeoffreyStark	7ee39743de	[Doc] Fix tabletScore expression in be_config.md (#6638 ) Co-authored-by: Geoffrey <gaofeng01@rd.netease.com>	2021-09-16 10:24:46 +08:00
Zhengguo Yang	332ba4cded	[config] use thrift_rpc_timeout_ms config replace hard code value (#6637 ) use thrift_rpc_timeout_ms config to replace hard code value	2021-09-16 10:22:57 +08:00
Zhengguo Yang	61c9d11fdb	support change column type from decimal to string (#6643 )	2021-09-14 15:56:44 +08:00
Cui Kaifeng	020282e885	[Bug] Fix aes_decrypt to handle null input correctly. (#6636 )	2021-09-14 11:19:55 +08:00
qiye	225bdb1fda	[Bug] fix `replace` function bug (#6605 ) * fix replace function bug * fix replace docs	2021-09-14 09:59:13 +08:00
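
The SqlCache fix listed above (#6668) describes a cache key collision: the key was the md5 of a re-rendered statement that could lose parameters such as partitions. A minimal sketch of that failure mode follows, in Python rather than the project's own code. The helper `lossy_to_sql` is hypothetical and only mimics a rendering that drops partition clauses; the table and partition names are made up for illustration.

```python
import hashlib
import re


def lossy_to_sql(sql: str) -> str:
    """Hypothetical stand-in for a re-rendered statement that loses detail.

    It drops PARTITION (...) clauses to mimic parameters that do not
    survive the round trip through an operator tree.
    """
    return re.sub(r"\s+PARTITION\s*\([^)]*\)", "", sql, flags=re.IGNORECASE)


def sql_key(text: str) -> str:
    """MD5 hex digest used as the cache key in this sketch."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two queries that differ only in the partition they read (made-up names).
q1 = "SELECT name FROM tb PARTITION (p20210926) LIMIT 10"
q2 = "SELECT name FROM tb PARTITION (p20210919) LIMIT 10"

# Keys built from the lossy rendering collide, so q2 would be served
# the result cached for q1.
collide = sql_key(lossy_to_sql(q1)) == sql_key(lossy_to_sql(q2))

# Keys built from the original statement text stay distinct.
distinct = sql_key(q1) != sql_key(q2)
```

Hashing the original text trades some cache hits (queries that differ only in whitespace get separate entries) for correctness, which matches the direction the commit message recommends.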

3356 Commits