At present, when some RPC errors occur, the client cannot obtain useful error information.
This CL changes the RPC error returned to the client to look like this:
```
ERROR 1064 (HY000): errCode = 2, detailMessage = there is no scanNode Backend. [10002: in black list(A error
occurred: errorCode=2001 errorMessage:Channel inactive error!)]
ERROR 1064 (HY000): failed to send brpc batch, error=The server is overcrowded, error_text=[E1011]The server is
overcrowded @xx.xx.xx.xx:8060 [R1][E1011]The server is overcrowded @xx.xx.xx.xx:8060 [R2][E1011]The server is
overcrowded @xx.xx.xx.xx:8060 [R3][E1011]The server is overcrowded @xx.xx.xx.xx:8060, client: yy.yy.yy.yy
```
1. Support limit clause push down for both ODBC tables and MySQL tables.
2. Refactor the ODBC Scan Node: move `build_connect_string` and `query_string` from BE to FE to make them easier to modify.
Currently, there are M threads doing base compaction and N threads doing cumulative compaction for each disk.
Too many compaction tasks may run out of memory, so the max concurrency of running compaction tasks
is limited by a semaphore.
But if the running threads consume too much memory, we cannot defend against it. In addition, reducing concurrency to avoid OOM
means some compaction tasks cannot be executed in time, and we may then face heavier compactions.
Therefore, a concurrency limit alone is not enough.
The strategy proposed in #3624 may be effective in solving the OOM.
A CompactionPermitLimiter is used to limit compaction, following a single-producer/multi-consumer model.
The producer tries to generate compaction tasks and acquires `permits` for each task.
A compaction task that holds its `permits` is executed in the thread pool, and each finished task
releases its `permits`.
`permits` must be acquired before a compaction task can execute. When the sum of `permits`
held by executing compaction tasks reaches a threshold, subsequent compaction tasks are no longer allowed
until some `permits` are released. The tablet compaction score is used as the `permits` of a compaction task here.
To some extent, memory consumption can be limited by setting an appropriate `permits` threshold.
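As a rough illustration of this mechanism (class and member names are made up for the sketch, not the actual Doris implementation), the limiter is a counting gate over the total compaction score:
```
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Sketch of a permit limiter: a task must acquire `permits` (its tablet
// compaction score) before running, and releases them when it finishes.
class CompactionPermitLimiter {
public:
    explicit CompactionPermitLimiter(int64_t threshold) : _threshold(threshold) {}

    // Producer side: block until this task's permits can be granted.
    // (A real implementation must also handle a single task whose
    // permits exceed the threshold.)
    void request(int64_t permits) {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [&] { return _used + permits <= _threshold; });
        _used += permits;
    }

    // Consumer side: a finished task returns its permits.
    void release(int64_t permits) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _used -= permits;
        }
        _cv.notify_all();
    }

private:
    const int64_t _threshold;  // max sum of permits held by running tasks
    int64_t _used = 0;         // permits currently held by running tasks
    std::mutex _mutex;
    std::condition_variable _cv;
};
```
The producer blocks in `request()` until enough `permits` are free, which provides exactly the back-pressure described above.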
#4619
Add time_round functions that provide `time_floor` & `time_ceil` at each time unit.
Fix two related bugs:
- #4618
- Fix `struct TimeInterval` to use `int64_t` instead of `int32_t`, for the case where the second diff overflows (see the sketch below)
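For context on the overflow: a signed 32-bit second count wraps at 2^31 seconds, roughly 68 years, so a second diff between distant datetimes can overflow. A minimal sketch of the widened struct (the field list is illustrative, not the exact Doris definition):
```
#include <cstdint>

// Sketch only; the real struct has more members. With int32_t, `second`
// overflows once the diff exceeds ~68 years (2^31 seconds).
struct TimeInterval {
    int64_t year = 0;
    int64_t month = 0;
    int64_t day = 0;
    int64_t second = 0;  // was int32_t; widened to avoid overflow
};
```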
Use a static local variable instead of creating the object on every call.
The time cost of the newly added unit benchmark test drops
from about 60 seconds to about 10 seconds.
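The pattern looks roughly like this (names are illustrative; the actual object is whatever the code previously rebuilt per call):
```
#include <string>
#include <vector>

// Illustrative: an expensive-to-build, read-only lookup table.
std::vector<std::string> build_unit_table() {
    return {"second", "minute", "hour", "day", "month", "year"};
}

const std::vector<std::string>& unit_table() {
    // Constructed once on first use (thread-safe since C++11)
    // instead of being rebuilt on every call.
    static const std::vector<std::string> table = build_unit_table();
    return table;
}
```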
* When different partitions of a table are updated frequently, the partition key list in the cache becomes discontinuous,
and the partition key in the request cannot hit the key list in the cache, resulting in out-of-bounds access that crashes the BE.
* Add some unit test cases, including cases where a cache boundary value fails to hit.
Describe the bug
DATA_TYPE in information_schema.columns is not compatible with MySQL meta.
To Reproduce
Steps to reproduce the behavior:
`select * from information_schema.columns`
Expected behavior
The result of data_type should be (int, decimal, char, varchar, ...), but Doris's data_type is (int(11), varchar(20), ...).
The excess length suffix affects some BI systems, and upper-layer systems cannot get the right type.
Suppose the currently available bytes on a disk is `_available_bytes`. After data of size `incoming_data_size` is written to the disk, the space left on the disk should be calculated as follows:
`int64_t left_bytes = _available_bytes - incoming_data_size;`
rather than:
`int64_t left_bytes = _disk_capacity_bytes - _available_bytes - incoming_data_size;`
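For illustration, a hedged sketch of the corrected check (the minimum-free-space threshold is an assumed parameter, not a real config name):
```
#include <cstdint>

// Decide whether writing `incoming_data_size` bytes still leaves enough
// free space on the disk.
bool has_enough_capacity(int64_t available_bytes, int64_t incoming_data_size,
                         int64_t min_left_bytes /* assumed threshold */) {
    // Correct: remaining space is what is free now minus what we write.
    int64_t left_bytes = available_bytes - incoming_data_size;
    // The buggy formula, capacity - available - incoming, actually computes
    // used bytes minus the write, not the remaining free space.
    return left_bytes >= min_left_bytes;
}
```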
The tablet and disk information reporting threads need to report to the FE periodically.
At the same time, these two reporting threads can also be triggered by certain events.
The modification in PR #4440 caused these two threads to be triggered only by events,
so they could no longer report periodically.
1. Find the cache node by SQL Key, then find the corresponding partition data by Partition Key, and then decide whether the Cache is hit based on LastVersion and LastVersionTime.
2. Follow the classic LRU (least recently used) cache algorithm, implemented with a three-layer data structure, as sketched below.
3. The Cache eviction algorithm keeps partition ranges as contiguous as possible, because partition discontinuity would reduce the hit rate of the Cache partitions.
4. Use two thresholds, maximum memory and elastic memory, to avoid frequent eviction of data.
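A rough sketch of how the three layers in points 1 and 2 can fit together (types and fields are illustrative, not the actual Doris classes):
```
#include <cstdint>
#include <list>
#include <map>
#include <string>
#include <unordered_map>

// Layer 1: SQL key -> cache node. Layer 2: partition key -> partition data,
// kept ordered so partition ranges stay contiguous. Layer 3: LRU list.
struct PartitionData {
    int64_t last_version = 0;
    int64_t last_version_time = 0;
    std::string rows;  // cached result bytes (illustrative)
};

struct CacheNode {
    std::map<int64_t, PartitionData> partitions;  // partition key -> data
    std::list<std::string>::iterator lru_pos;     // position in LRU list
};

struct ResultCache {
    std::unordered_map<std::string, CacheNode> nodes;  // SQL key -> node
    std::list<std::string> lru;                        // most recent at front

    // A hit requires the cached partition to still match the requested
    // version and version time.
    bool hit(const std::string& sql_key, int64_t partition_key,
             int64_t version, int64_t version_time) {
        auto it = nodes.find(sql_key);
        if (it == nodes.end()) return false;
        auto pit = it->second.partitions.find(partition_key);
        if (pit == it->second.partitions.end()) return false;
        lru.splice(lru.begin(), lru, it->second.lru_pos);  // touch LRU
        return pit->second.last_version == version &&
               pit->second.last_version_time == version_time;
    }
};
```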
When the keys type of the mv and the base table differ, Doris should execute
a sorting schema change instead of a linked schema change.
If it doesn't, the data size of the mv is actually the same as the base table.
This causes the mv to have no pre-aggregation effect at all,
so queries will not choose the mv.
This commit fixes this problem. Fixed #4586
After version 0.12, Doris removed the format conversion function that converted the hdr_ format
to the tabletmeta_ format when loading metas; the commit link: 3bca253
When we upgrade Doris and old format meta exists in storage,
the BE will not read the old format tablets, which can lead to data loss.
So we add a meta format detection function to prevent data loss:
when old format meta exists in olap_meta, the BE can find it and print a log or exit.
For [Spark Load]
1. Support decimal and largeint
2. Add validation logic for char/varchar/decimal
3. Check data loaded from Hive in strict mode
4. Support decimal/date/datetime aggregators
Fix stale path delete checking logic.
When the current main path has missing versions, the delete checking logic always core dumps. So we fix the checking logic to tolerate missing versions in the current main path.
After tablet level metrics were supported, the http metrics API may respond with
a very large body when a BE holds a large number of tablets, causing heavy
network traffic.
This patch introduces HTTP content compression to reduce network traffic.
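Not the patch itself, but a sketch of the idea: a response body can be gzip-compressed with zlib before it is sent, assuming the client advertised `Accept-Encoding: gzip`:
```
#include <string>
#include <zlib.h>

// Gzip-compress `in` into `out`. Returns false on any zlib error.
bool gzip_compress(const std::string& in, std::string* out) {
    z_stream zs{};
    // windowBits = 15 | 16 asks zlib for a gzip (not raw deflate) stream.
    if (deflateInit2(&zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 | 16,
                     8, Z_DEFAULT_STRATEGY) != Z_OK) {
        return false;
    }
    zs.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(in.data()));
    zs.avail_in = static_cast<uInt>(in.size());
    out->resize(deflateBound(&zs, in.size()));
    zs.next_out = reinterpret_cast<Bytef*>(&(*out)[0]);
    zs.avail_out = static_cast<uInt>(out->size());
    int ret = deflate(&zs, Z_FINISH);
    deflateEnd(&zs);
    if (ret != Z_STREAM_END) return false;
    out->resize(zs.total_out);
    return true;
}
```
Metrics text is highly repetitive, so it compresses well.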
BE cannot exit gracefully because some threads run in endless
loops. This patch makes the following optimizations:
- Use the well encapsulated Thread and ThreadPool instead of std::thread
and std::vector<std::thread>
- Use CountDownLatch in the thread's loop condition to avoid an endless loop (see the sketch after this list)
- Introduce a new class Daemon for daemon work, like tcmalloc_gc,
memory_maintenance and calculate_metrics
- Decouple the statistics-type TaskWorkerPool from StorageEngine notification
by submitting tasks to TaskWorkerPool's queue
- Reorder objects' stop and destruction in main(), i.e. stop network
services first, then internal services
- Use libevent in pthreads mode, by calling evthread_use_pthreads(),
then EvHttpServer can exit gracefully in multi-threads
- Call brpc::Server's Stop() and ClearServices() explicitly
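As an illustration of the CountDownLatch point above (a simplified latch for the sketch, not Doris's own class):
```
#include <chrono>
#include <condition_variable>
#include <mutex>

// Minimal latch: count_down() wakes waiters; wait_for() returns true
// once the count reaches zero (i.e. shutdown was requested).
class CountDownLatch {
public:
    explicit CountDownLatch(int count) : _count(count) {}

    void count_down() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_count > 0 && --_count == 0) _cv.notify_all();
    }

    template <class Rep, class Period>
    bool wait_for(const std::chrono::duration<Rep, Period>& timeout) {
        std::unique_lock<std::mutex> lock(_mutex);
        return _cv.wait_for(lock, timeout, [&] { return _count == 0; });
    }

private:
    int _count;
    std::mutex _mutex;
    std::condition_variable _cv;
};

// Daemon loop: the sleep between rounds doubles as the exit check, so the
// thread wakes up promptly on shutdown instead of finishing a fixed sleep.
void daemon_loop(CountDownLatch& stop_latch) {
    do {
        // ... one round of work, e.g. tcmalloc_gc ...
    } while (!stop_latch.wait_for(std::chrono::seconds(10)));
}
```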
Fix a segment group add zone map bug during schema change.
(1) WrapperField null pointer check
(2) In DUP_KEYS, keep the _zone_maps index consistent with the _schema column index
Main CL:
1. Copy the code from BE to implement the `str_to_date()` function in FE.
2. `str_to_date("2020-08-08", "%Y-%m-%d %H:%i:%s")` will return `2020-08-08 00:00:00` instead of `2020-08-08`.
The 'part' parameter of the parse_url function does not support lower case, and the protocol is not parsed correctly.
Also, this function does not support parsing 'port'.
This PR makes the parse_url function case-insensitive and adds support for parsing 'port'.
The issue: #4451
(1) Fix the bug of recovering persisted stale rowsets from multiple single-version rowsets among the stale rowsets
(2) Convert the consistent version check in delete_expired_inc_rowsets to [0, max_version]
Sometimes we want to detect the hotspots of a cluster, for example hot scanned tablets or hot written tablets,
but we have no insight into the tablets in the cluster.
This patch introduces tablet level metrics to help achieve this; it currently supports 4 metrics per tablet: `query_scan_bytes`, `query_scan_rows`, `flush_bytes`, `flush_count`.
However, one BE may hold hundreds of thousands of tablets, so I add a parameter to the metrics HTTP request,
and do not return tablet level metrics by default.
1. When WITH_MYSQL is off, the load error hub does not support the MySQL load error hub,
so we should check its return value.
2. The return value of `change_row_block` in schema_change.cpp was misjudged.
The segment index file content is not zeroed when it is constructed in the write procedure,
so when the index is loaded from this file and a null VARCHAR cell is met,
the null field of the cell is 0, but the uninitialized length field may be a large random number,
and the subsequent memory copy may overflow.
This patch fixes this bug, and also skips the useless memory copy to gain a bit of performance.
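A sketch of the fix (names are illustrative): zero-filling the buffer when it is constructed makes the length field of a NULL cell 0, which both prevents the overflow and lets the copy be skipped:
```
#include <cstddef>
#include <cstring>
#include <vector>

// Write side: zero the index buffer on construction, so any field that
// is never written (e.g. the length of a NULL cell) reads back as 0.
struct IndexBuffer {
    std::vector<char> data;
    explicit IndexBuffer(size_t size) : data(size, 0) {}  // zero-initialized
};

// Read side: with a zeroed buffer, a NULL cell's length is 0 and we can
// skip the memcpy entirely instead of copying a huge random count.
void copy_cell(const char* cell, bool is_null, size_t length, char* out) {
    if (is_null || length == 0) return;  // skip the useless memory copy
    memcpy(out, cell, length);
}
```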
Persist stale rowsets meta. When BE reboots, the stale rowsets meta
can be resumed, and the stale versions remain readable until the stale gc time.
ISSUE: #4453
During the historical data transformation of materialized views, the transformation may fail due to data quality.
Add an error status code, OLAP_ERR_DATE_QUALITY_ERR, to determine whether a data problem caused the failure.
#3344
1. Support the convert(expr, target_type) function, which is the same as CastExpr.
2. Support cast(expr as signed/unsigned int).
This is just for compatibility; the signed/unsigned specification is meaningless.