When LRUCache inserts and evicts a large number of entries, HandleTable::remove(e->key, e->hash) is called frequently, and each call looks up the entry in the hash table. Since we already know the entry 'e' to remove, we can unlink it directly from the hash table's collision list if that list is doubly linked.
This patch refactors the collision list into a doubly linked list. The simple benchmark CacheTest.SimpleBenchmark shows that the time cost is reduced by about 18% in my test environment.
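A minimal sketch of the unlink that a doubly linked collision list enables (the struct and field names are illustrative, not the exact Doris code):
```
// With a back pointer in each node, removing a known entry 'e' needs no
// bucket walk to find its predecessor.
struct LRUHandle {
    LRUHandle* next_hash;
    LRUHandle* prev_hash;
    // ... key, hash, value ...
};

void remove_from_bucket(LRUHandle** bucket_head, LRUHandle* e) {
    if (e->prev_hash != nullptr) {
        e->prev_hash->next_hash = e->next_hash;
    } else {
        *bucket_head = e->next_hash;  // 'e' was the head of the bucket
    }
    if (e->next_hash != nullptr) {
        e->next_hash->prev_hash = e->prev_hash;
    }
}
```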
This config name is so close to another config name that the two are hard to tell apart, so this PR renames it.
The documentation has been updated accordingly.
(1) The current implementation logic of the `tablets web page` is:
Firstly, get all the tablets of the BE;
Secondly, return the first tablets to the web page according to the value of the request parameter `limit`.
This patch optimizes the logic so that the BE fetches only the number of tablets given by the request parameter `limit` and then returns them to the web page.
(2) The http interface `http://be_host:webserver_port/tablets_page` returns `1000` tablets by default.
If the number of tablets in the BE is less than 1000, there will be a `coredump` in the following code:
be/src/http/action/tablets_info_action.cpp
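A hedged sketch of the out-of-bounds pattern behind the `coredump` and its fix (hypothetical names, not the actual code in tablets_info_action.cpp): the number of returned tablets must be clamped to the number actually present.
```
#include <algorithm>
#include <cstddef>
#include <vector>

struct TabletInfo { long tablet_id; };

std::vector<TabletInfo> take_front(const std::vector<TabletInfo>& all, std::size_t limit) {
    // If limit (default 1000) exceeds all.size(), indexing up to limit-1
    // reads past the end; clamping avoids that.
    std::size_t n = std::min(limit, all.size());
    return std::vector<TabletInfo>(all.begin(), all.begin() + n);
}
```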
Support persistence of configuration items modified at runtime via HTTP API.
```
FE:
GET /api/_set_config?key=value&persist=true
BE:
POST /api/update_config?key=value&persist=true
```
The modified config will be saved in `fe_custom.conf` or `be_custom.conf`.
And when the process starts, it loads `fe.conf`/`be.conf` first, then `fe_custom.conf`/`be_custom.conf`.
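A minimal sketch of the load order on the BE side (hypothetical helper, not the actual Doris config code): values read from `be_custom.conf` override those read earlier from `be.conf`.
```
#include <fstream>
#include <string>
#include <unordered_map>

void load_conf(const std::string& path,
               std::unordered_map<std::string, std::string>* conf) {
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::size_t pos = line.find('=');
        if (pos == std::string::npos) continue;  // skip non key=value lines
        (*conf)[line.substr(0, pos)] = line.substr(pos + 1);  // later file wins
    }
}

int main() {
    std::unordered_map<std::string, std::string> conf;
    load_conf("be.conf", &conf);         // defaults
    load_conf("be_custom.conf", &conf);  // persisted runtime overrides
}
```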
When selecting candidate rowsets for cumulative compaction,
some rowsets may not be selected because their protection time has not yet expired.
Therefore, we need to find the current longest continuous version path among the candidate rowsets.
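A hedged sketch of that selection step (illustrative types; real rowsets carry more state): among candidates sorted by start version, keep the longest run in which each rowset starts exactly one past where the previous one ends.
```
#include <cstdint>
#include <vector>

struct Version { int64_t first; int64_t second; };  // rowset covers [first, second]

std::vector<Version> longest_continuous_path(const std::vector<Version>& sorted) {
    std::vector<Version> best, cur;
    for (const Version& v : sorted) {
        if (!cur.empty() && v.first != cur.back().second + 1) {
            if (cur.size() > best.size()) best = cur;  // run broken, keep the best
            cur.clear();
        }
        cur.push_back(v);
    }
    if (cur.size() > best.size()) best = cur;
    return best;
}
```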
We create some objects that are only used for unit tests. This is unnecessary,
and it may create duplicate instances of some classes.
This patch removes the unnecessary instances of BlockManager and StoragePageCache.
At present, when some RPC errors occur, the client cannot obtain useful error information.
This CL changes the RPC error returned to the client to look like this:
```
ERROR 1064 (HY000): errCode = 2, detailMessage = there is no scanNode Backend. [10002: in black list(A error
occurred: errorCode=2001 errorMessage:Channel inactive error!)]
ERROR 1064 (HY000): failed to send brpc batch, error=The server is overcrowded, error_text=[E1011]The server is
overcrowded @xx.xx.xx.xx:8060 [R1][E1011]The server is overcrowded @xx.xx.xx.xx:8060 [R2][E1011]The server is
overcrowded @xx.xx.xx.xx:8060 [R3][E1011]The server is overcrowded @xx.xx.xx.xx:8060, client: yy.yy.yy.yy
```
1. Support limit clause push-down for both ODBC tables and MySQL tables.
2. Refactor the ODBC Scan Node: move `build_connect_string` and `query_string` generation from the BE to the FE to make them easier to modify.
Currently, there are M threads doing base compaction and N threads doing cumulative compaction for each disk.
Too many compaction tasks may run out of memory, so the maximum concurrency of running compaction tasks
is limited by a semaphore.
But if the running tasks cost too much memory, we cannot defend against it. In addition, reducing concurrency
to avoid OOM leaves some compaction tasks unable to execute in time, and we may then face heavier compactions.
Therefore, a concurrency limit alone is not enough.
The strategy proposed in #3624 may be effective in solving the OOM.
A CompactionPermitLimiter is used to limit compaction, following a single-producer/multi-consumer model.
The producer tries to generate compaction tasks and acquires `permits` for each task.
A compaction task that holds its `permits` is executed in the thread pool, and each finished task
releases its `permits`.
`permits` must be acquired before a compaction task can execute. When the sum of `permits`
held by executing compaction tasks reaches a threshold, subsequent compaction tasks are no longer allowed
until some `permits` are released. The tablet compaction score is used as a compaction task's `permits` here.
To some extent, memory consumption can be limited by setting an appropriate `permits` threshold.
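A minimal sketch of the permit-limiter idea (a hypothetical class, not the actual CompactionPermitLimiter): acquisition blocks once the permits held by running tasks reach the threshold, and each release wakes the waiters.
```
#include <condition_variable>
#include <cstdint>
#include <mutex>

class PermitLimiter {
public:
    explicit PermitLimiter(int64_t threshold) : _threshold(threshold) {}

    // Producer side: block until this task's permits can be granted.
    // (A real limiter would also need to admit a single task whose
    // permits exceed the threshold, otherwise it would wait forever.)
    void request(int64_t permits) {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [&] { return _used + permits <= _threshold; });
        _used += permits;
    }

    // Consumer side: a finished task returns its permits and wakes waiters.
    void release(int64_t permits) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _used -= permits;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    const int64_t _threshold;
    int64_t _used = 0;
};
```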
#4619
Add time_round functions that provide `time_floor` & `time_ceil` at each time unit.
Fix two related bugs:
- #4618
- Fix `struct TimeInterval` to use `int64_t` instead of `int32_t`, in case the second diff overflows
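A hedged sketch of the flooring/ceiling arithmetic (illustrative only, using epoch seconds instead of Doris's datetime type; note the `int64_t` math that motivates the TimeInterval fix):
```
#include <cstdint>

int64_t time_floor(int64_t ts, int64_t period_seconds, int64_t origin = 0) {
    int64_t delta = ts - origin;  // int64_t so a large second diff cannot overflow
    int64_t floored = (delta / period_seconds) * period_seconds;
    if (delta < 0 && delta % period_seconds != 0) floored -= period_seconds;
    return origin + floored;
}

int64_t time_ceil(int64_t ts, int64_t period_seconds, int64_t origin = 0) {
    int64_t f = time_floor(ts, period_seconds, origin);
    return f == ts ? f : f + period_seconds;
}
```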
Use a static local variable instead of creating the object on every call.
The time cost of the newly added unit benchmark test drops
from about 60 seconds to about 10 seconds.
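A minimal illustration of the change (hypothetical function; the real patch applies the same pattern): a function-local static is constructed once, on first use, instead of being rebuilt on every invocation.
```
#include <string>
#include <vector>

const std::vector<std::string>& supported_units() {
    // Initialized exactly once on first call (thread-safe since C++11),
    // rather than constructing a new vector on each call.
    static const std::vector<std::string> units = {
        "second", "minute", "hour", "day", "week", "month", "year"};
    return units;
}
```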
* When different partitions of a table are updated frequently, the partition key list in the cache becomes discontinuous,
and the partition key in the request can miss the key list in the cache, resulting in an out-of-bounds access that crashes the BE.
* Add unit test cases, including cases that fail to hit the boundary values of the cache.
Describe the bug
DATA_TYPE in information_schema.columns is not compatible with MySQL metadata
To Reproduce
Steps to reproduce the behavior:
select * from information_schema.columns
Expected behavior
The data_type result should be (int, decimal, char, varchar, ...), but Doris's data_type is (int(11), varchar(20), ...).
The extra length suffix means some BI systems or other upper-layer systems cannot get the right type.
Suppose that the currently available bytes on a disk are `_available_bytes`. After data of size `incoming_data_size` is written to the disk, the space left on the disk should be calculated as follows:
`int64_t left_bytes = _available_bytes - incoming_data_size;`
rather than:
`int64_t left_bytes = _disk_capacity_bytes - _available_bytes - incoming_data_size;`
The tablet and disk information reporting threads need to report to the FE periodically.
At the same time, these two reporting threads should also be triggered by certain events.
The modification in PR #4440 caused these two threads to be triggered only by events,
so they could no longer report periodically.
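A hedged sketch of the intended loop shape (illustrative names): the worker reports either when the interval elapses or when an event notifies it early, so neither trigger path starves the other.
```
#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cv;
bool event_triggered = false;

void report_loop(std::chrono::seconds interval) {
    std::unique_lock<std::mutex> lock(mtx);
    for (;;) {
        // Wakes on notify (event trigger) OR on timeout (periodic report).
        cv.wait_for(lock, interval, [] { return event_triggered; });
        event_triggered = false;
        // ... build and send the tablet/disk report to the FE here ...
    }
}
```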
1. Find the cache node by SQL key, then find the corresponding partition data by partition key, and then decide whether the cache is hit based on LastVersion and LastVersionTime.
2. Follows the classic LRU (least recently used) cache algorithm, implemented with a three-layer data structure (sketched below).
3. The cache eviction algorithm keeps partition ranges intact as much as possible, since partition discontinuity would reduce the partition hit rate of the cache.
4. Use two thresholds, maximum memory and elastic memory, to avoid frequent eviction of data.
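A hedged sketch of that three-layer layout (hypothetical types, not the actual Doris code): SQL key to cache node, partition key to partition data, and version metadata for the hit decision.
```
#include <cstdint>
#include <map>
#include <unordered_map>

struct PartitionData {
    int64_t last_version;
    int64_t last_version_time;
    // ... cached result bytes ...
};

struct CacheNode {
    // Ordered by partition key, so contiguous partition ranges stay
    // together, which the range-friendly eviction above relies on.
    std::map<int64_t, PartitionData> partitions;
};

// Layer 1: SQL key -> cache node. (The real cache also threads nodes
// through an LRU list and enforces the max/elastic memory thresholds.)
std::unordered_map<uint64_t, CacheNode> cache;

bool is_hit(const PartitionData& p, int64_t version, int64_t version_time) {
    return p.last_version == version && p.last_version_time == version_time;
}
```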
When the keys type of the MV and the base table differ, Doris should execute
a sorting schema change instead of a linked schema change.
If it doesn't, the data size of the MV is actually the same as the base table's.
This causes the MV to have no pre-aggregation effect at all,
so queries will not choose the MV.
This commit fixes this problem. Fixed #4586
After version 0.12, Doris removed the format conversion function that converted the hdr_ format
to the tabletmeta_ format when loading metas (see commit 3bca253).
When we upgrade Doris and old-format metas exist in storage,
the BE will not read the old-format tablets, which can lead to data loss.
So we add a meta format detection function to prevent data loss:
when old-format metas exist in olap_meta, the BE can find them and print a log or exit.
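A hedged sketch of the detection idea (illustrative interface; the real code scans the BE's meta store): flag any meta key still carrying the old-format prefix so the process can log or exit instead of silently skipping those tablets.
```
#include <cstdio>
#include <string>
#include <vector>

bool check_no_old_format_meta(const std::vector<std::string>& meta_keys) {
    bool ok = true;
    for (const std::string& key : meta_keys) {
        if (key.rfind("hdr_", 0) == 0) {  // key starts with the old-format prefix
            std::fprintf(stderr, "found old format meta: %s\n", key.c_str());
            ok = false;
        }
    }
    return ok;  // the caller may choose to exit when this returns false
}
```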
For [Spark Load]:
1. Support decimal and largeint
2. Add validation logic for char/varchar/decimal
3. Check data loaded from Hive with strict mode
4. Support decimal/date/datetime aggregators