Commit Graph

1227 Commits

Author SHA1 Message Date
fe1ca824cc [Config] change some static config to dynamic config and delete some unused config (#5158)
* change some BE static config to dynamic config

Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-01-06 09:55:09 +08:00
03e36056eb [Bug] Fix bug that the min/max function has an error in handling string null values. (#5189)
Null values should be ignored by the min/max functions.
If there is no data, null should be returned.

Co-authored-by: morningman <chenmingyu@baidu.com>
2021-01-05 09:48:38 +08:00
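The null-handling rule described in this commit can be sketched as follows. This is a minimal Python illustration, not the BE implementation: `min_ignoring_null` and `max_ignoring_null` are hypothetical names, and `None` stands in for SQL NULL.

```python
from typing import Iterable, Optional


def min_ignoring_null(values: Iterable[Optional[str]]) -> Optional[str]:
    """Return the smallest non-null value, or None when there is no data."""
    result = None
    for value in values:
        if value is None:
            continue  # null inputs are skipped, not compared
        if result is None or value < result:
            result = value
    return result


def max_ignoring_null(values: Iterable[Optional[str]]) -> Optional[str]:
    """Return the largest non-null value, or None when there is no data."""
    result = None
    for value in values:
        if value is None:
            continue  # null inputs are skipped, not compared
        if result is None or value > result:
            result = value
    return result
```

With these definitions, `min_ignoring_null(["b", None, "a"])` yields `"a"`, and an empty or all-null input yields `None`.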
e536823f92 [Thirdparty] Fix build thirdparty may be failed (#5187)
1. Fix third-party build failures on some OSes, caused by the default lib path being `lib` or `lib64`, or by the `arrow` build failing because of `brotli` and `zstd`
2. Fix failure to extract `.tar.bz2` files
2021-01-04 15:21:18 +08:00
6c098e45fc [Optimize][Cache]Implementation of Separated Page Cache (#5008)
#4995
**Implementation of Separated Page Cache**
- Add config "index_page_cache_ratio" to set the ratio of capacity of index page cache
- Change the members of StoragePageCache to maintain two types of cache
- Change the interface of StoragePageCache for selecting type of cache
- Change the usage of page cache in read_and_decompress_page in page_io.cpp
  - add page type as argument
  - check if current page type is available in StoragePageCache (cover the situation of ratio == 0 or 1)
- Add type as argument in superior call of read_and_decompress_page
- Change Unit Test
2021-01-04 12:19:24 +08:00
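The idea of splitting one page cache by a ratio can be sketched as below. This is a simplified Python model under stated assumptions, not the code in StoragePageCache: capacity is counted in entries rather than bytes, and the class and method names (`LruCache`, `SeparatedPageCache`, `is_cache_available`) are hypothetical.

```python
from collections import OrderedDict

DATA_PAGE = "data"
INDEX_PAGE = "index"


class LruCache:
    """Tiny LRU cache whose capacity is a number of entries (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def lookup(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def insert(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


class SeparatedPageCache:
    """Two caches sharing one capacity, split by index_page_cache_ratio."""

    def __init__(self, capacity, index_page_cache_ratio):
        index_capacity = int(capacity * index_page_cache_ratio)
        self._caches = {
            INDEX_PAGE: LruCache(index_capacity),
            DATA_PAGE: LruCache(capacity - index_capacity),
        }

    def is_cache_available(self, page_type):
        # Covers ratio == 0 or 1, where one of the two caches has no room.
        return self._caches[page_type].capacity > 0

    def lookup(self, key, page_type):
        return self._caches[page_type].lookup(key)

    def insert(self, key, value, page_type):
        self._caches[page_type].insert(key, value)
```

A caller would check `is_cache_available(page_type)` before a lookup or insert, which mirrors the availability check the commit describes for `read_and_decompress_page`.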
0d3564c2e1 [Feature] Implementation of histogram metric (#5148)
#5146
Add histogram metrics to util/metrics.h. The histogram data structure is implemented in util/histogram.h,
and can also be used wherever a histogram is needed. Unit tests are added as well.
2021-01-04 09:32:46 +08:00
5807413ad0 [UT] Add ut for column predicate of ColumnBlock (#5123)
Add ut for column predicate of ColumnBlock
2021-01-04 09:29:30 +08:00
17d939b789 [Bug] Fix scanner threads heap-use-after-free (#5111)
Scanner threads may still be running and using the member variables of OlapScanNode
after the OlapScanNode has already been destroyed.

We can make `_running_thread` the last member variable accessed.
`transfer_thread` needs to wait for `_running_thread==0`.
After `transfer_thread` is joined, `OlapScanNode::close()` can continue.
2021-01-04 09:28:51 +08:00
05ac7fcd4a [Function] Add BE udf bitmap_xor (#5098)
This function returns the XOR of two input bitmaps.
2021-01-04 09:27:46 +08:00
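For readers unfamiliar with bitmap XOR: the result holds the values that appear in exactly one of the two inputs. A minimal Python illustration, using a `set` of integers as a stand-in for a bitmap (the real UDF operates on Doris bitmap values, and this helper is hypothetical):

```python
def bitmap_xor(left: set, right: set) -> set:
    """Return the elements present in exactly one of the two inputs."""
    return left ^ right  # symmetric difference
```

For example, `bitmap_xor({1, 2, 3}, {3, 4})` gives `{1, 2, 4}`.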
f2cf8d2c5e [Bug-Fix] Fix the bug of PERCENTILE_APPROX return error result nan and add PERCENTILE_APPROX UT (#5172) 2021-01-03 15:45:22 +08:00
9e19b6b133 [Performance Improve] Push Down _conjunct of 'A is NULL' and 'B is not NULL' to Storage Engine. (#5092)
This patch mainly does the following:
- Support #5086
- Refactor ColumnRangeValue to support containing null
2021-01-03 15:45:07 +08:00
5e1a80bb22 [UT][Bug] fix LOOP_LESS_OR_MORE (#5157)
This bug was introduced by #5131. When AllowSlowTests() is true, we should loop more.
2020-12-29 09:48:19 +08:00
11c0aafa5c [UT] Speed up BE unit test (#5131)
Some unit tests contain long loops and sleeps, so running all unit tests takes a
very long time, especially in TSAN mode.
This patch speeds up the unit tests by shortening long loops and sleeps;
in my environment all unit tests finish in 1 minute. This is useful
for basic functional unit testing.
You can switch this mode with the new environment variable
'DORIS_ALLOW_SLOW_TESTS'. For example, you can set:
export DORIS_ALLOW_SLOW_TESTS=1
and you can disable it by setting:
export DORIS_ALLOW_SLOW_TESTS=0
2020-12-27 22:19:56 +08:00
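The environment switch can be sketched like this. It is a Python illustration of the idea only: the actual helpers are C++ in the BE test utilities, and both function names below, as well as the rule that only the value `1` enables slow tests, are assumptions made for the example.

```python
import os


def allow_slow_tests() -> bool:
    """Assumption: only DORIS_ALLOW_SLOW_TESTS=1 enables the slow variants."""
    return os.environ.get("DORIS_ALLOW_SLOW_TESTS", "0") == "1"


def loop_less_or_more(less: int, more: int) -> int:
    """Pick the long iteration count only when slow tests are allowed."""
    return more if allow_slow_tests() else less
```

A test would then write `for _ in range(loop_less_or_more(10, 100000))`, so the default run stays short while the full run remains available.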
85076b5678 [UT] fix test_env & add a sample (#5085)
Easily create tests.
2020-12-27 22:14:30 +08:00
d9f1ffe9a0 (#5151) An already merged rowset should skip window check (#5152) 2020-12-26 11:40:44 +08:00
279ae1cb75 Add fuzzy_parse option to speed up json import (#5114)
Add a fuzzy_parse flag. If all objects in the JSON file have the same keys in the same order, we only need to parse the keys of the first row, and can then use the index instead of the key to parse each value.
2020-12-25 09:19:42 +08:00
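The positional trick behind fuzzy_parse can be sketched as follows. This Python version only models the logic (Python's `json` module still parses every key, so it shows no speedup), and `parse_rows_fuzzy` is a hypothetical name, not the BE function.

```python
import json


def parse_rows_fuzzy(lines):
    """Resolve key names from the first row only, then map values by position."""
    rows = []
    keys = None
    for line in lines:
        obj = json.loads(line)
        if keys is None:
            keys = list(obj.keys())  # key order is learned once
        values = list(obj.values())  # later rows contribute values by index
        rows.append(dict(zip(keys, values)))
    return rows
```

The precondition matters: if a later row lists its keys in a different order, its values are silently assigned to the wrong columns, which is why the option should only be used when every object has the same keys in the same order.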
86e40dd3e5 Fix old tablet inserting bug (#5113)
#4996
This happens when BE is restarting and an older tablet has been added to the garbage collection queue but not yet deleted.
In this case, since the data_dirs are loaded in parallel, a later loaded tablet may be older than a previously loaded one, which should not be treated as a failure.

It should be noted that the _add_tablet_unlocked() method is also called when creating a new tablet. In that case, the changes in this pull request are not reached, so there is no effect on the tablet creation process.
2020-12-24 15:20:54 +08:00
c57145b4c2 [Bug] Fix bug that routine load may lost some data (#5093)
In the previous implementation, whether a subtask is in the commit or abort state,
we try to update the job progress, such as the consumed Kafka offset.
Under normal circumstances, an aborted transaction does not consume any data,
and all progress is 0, so even if we update the progress, it remains
unchanged.
However, under high cluster load, the subtask may fail halfway through execution on the BE side.
In that case, although the task is aborted, part of the progress is updated.
This causes the next subtask to skip this data, resulting in data loss.
2020-12-23 09:33:52 +08:00
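The commit message implies that progress should advance only for committed subtasks. A minimal Python sketch of that rule follows; `RoutineLoadJob`, `on_task_finished`, and the partition-to-offset mapping are hypothetical names for illustration, not the FE classes.

```python
from dataclasses import dataclass, field


@dataclass
class RoutineLoadJob:
    # Hypothetical job state: Kafka partition -> next offset to consume.
    progress: dict = field(default_factory=dict)

    def on_task_finished(self, committed: bool, task_progress: dict) -> None:
        """Advance the job progress only when the subtask transaction committed."""
        if not committed:
            return  # aborted: discard any partial progress the task reported
        self.progress.update(task_progress)
```

With this rule, an aborted subtask that reports partial offsets leaves the job progress untouched, so the next subtask re-reads the same data instead of skipping it.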
176dcf8bd9 [Trace] Add trace for create tablet tasks (#5091)
Add trace for create tablet tasks. It is a useful tool for admins to find
the bottleneck when tablet creation times out.
For example, an admin could enlarge 'tablet_map_shard_size' when the
'got tablets shard lock' procedure takes too much time.
2020-12-19 11:18:12 +08:00
9ddf434f6b [Bug-Fix] Fix partition cache match bug (#5060)
When the cached partitions are not contiguous, a range query may fail.
For example, partition keys 20201011 and 20201013 are cached,
but the range query is between 20201011 and 20201013, so the query will not hit the cache.
issue:#5059
2020-12-19 11:17:44 +08:00
984807910f [Bug] Fix bug when delete condition is null but zonemap is not null (#5109)
If a column does not have any null values and a delete operation
with "where k1 is null" is executed on it, BE may crash.

This bug was introduced by #5030.
2020-12-18 21:39:52 +08:00
3d4b2cb1ae [Bug] Fix tablet shared ptr circular reference causing the tablet not to be cleared (#5100)
Regardless of whether the tablet is submitted for compaction or not,
we need to call 'reset_compaction' to clean up the base_compaction or cumulative_compaction objects
in the tablet, because these two objects store the tablet's own shared_ptr.
If they are not cleaned up, the reference count of the tablet will always be greater than 1,
and the tablet cannot be collected by the garbage collector. (TabletManager::start_trash_sweep)

This bug was introduced by #4891.
2020-12-18 21:17:18 +08:00
f6881d2f7b [Bug] Fix coredump bug when create new tablets (#5089)
A bug may cause a BE coredump when creating a tablet:
access to the tablet_set of a data dir should be protected by a lock.
2020-12-17 00:34:31 +08:00
650536d53e [Feature] Add Topn udaf (#4803)
For #4674 
This is a udaf for approximate topn using Space-Saving algorithm.  At present, we can only calculate
the frequent items and their frequencies in a certain column, based on which we can implement similar
topN functions supported by Kylin in the future. 

I have also added a test to calculate the accuracy of this algorithm. The following is a rough running result.
The data set has 1 million rows and follows a Zipfian distribution. Element cardinality
is the data cardinality, and 20X, 50X, ... mean that space_expand_rate is 20, 50, ..., which is
used to set the number of counters in the Space-Saving algorithm.

```
zf exponent = 0.5
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                  94%     98%     99%

zf exponent = 0.6,1
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                 100%    100%    100%
```
2020-12-16 21:58:34 +08:00
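For reference, the Space-Saving algorithm named in this commit can be sketched in a few lines of Python. This is a textbook version, not the udaf code: `space_saving` and `top_n` are hypothetical names, and sizing the counter table as `n * space_expand_rate` is an assumption about how space_expand_rate is applied.

```python
def space_saving(stream, num_counters):
    """Approximate item frequencies using at most num_counters counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < num_counters:
            counters[item] = 1
        else:
            # Replace the item with the smallest count; the newcomer inherits
            # that count plus one, so counts may overestimate true frequency.
            victim = min(counters, key=counters.get)
            count = counters.pop(victim)
            counters[item] = count + 1
    return counters


def top_n(stream, n, space_expand_rate):
    """Return the n most frequent items with their (approximate) counts."""
    # Assumption: the counter table holds n * space_expand_rate entries.
    counters = space_saving(stream, n * space_expand_rate)
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

When the number of distinct items fits within the counter table, the counts are exact; accuracy degrades only once evictions start, which is consistent with the results above dropping below 100% only at the highest cardinality.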
6afa14cda7 [Bug] Fix Memory Leak in Json Load (#5073)
fix json load memory leak #5069
2020-12-15 22:55:47 +08:00
81c7c0360e [Bug] Fix a core dump of counter in BE (#5078)
Introduced by PR #5051.
As @liutang123 said, when PlanFragmentExecutor is destructed, it will call
`close -> ExecNode::close -> OlapScanNode::close`. OlapScanNode will wait for `_transfer_thread`.
`_transfer_thread` will wait for all OlapScanner processing to complete.
OlapScanner is processed by the scanner thread. When the last scanner processing is completed,
`_transfer_thread` will break out of the loop, and PlanFragmentExecutor will continue to destruct.
And if it is completed, its RuntimeProfile::Counter will also be destructed.
At this time, the ScopedTimer in the Scan thread may still use this Counter when it is destructed.

So we must make sure that the timer is destructed before the runtime profile is destructed.
2020-12-15 09:33:38 +08:00
49f26f4413 [UT] cleanup storage engine creation in tablet_mgr_test etc (#5077)
The tests mistakenly used the string '_engine_data_path' as the path. The storage engine is not actually opened,
so the option/path is unnecessary. Clean it up to avoid any doubt about file path management.
2020-12-15 09:30:32 +08:00
0a0e46fd53 [Bug] Fix the bug of where condition a in ('A', 'B', 'V') and a in ('A') return error result (#5072)
And Refactor ColumnRangeValue and OlapScanNode

This patch mainly does the following:
- Fix issue #5071
- Change type_min in ColumnRangeValue to static
- Add a type_limit class to make the code clearer
- Refactor the normalize_in_and_eq_predicate function
2020-12-15 09:29:10 +08:00
90e7f7005e [Bug] Fix bug that query multi mysql external table with union will get incomplete result (#5067)
The `eos` flag should be reset to false after opening next child of union node.
2020-12-15 09:28:39 +08:00
193db4207e [enhancement]improve performance of json load (#5055)
* improve performance of json load
2020-12-15 09:27:51 +08:00
ff4bd1223f [Profile] Add cpu time cost in query audit (#5051) 2020-12-13 22:22:15 +08:00
115d4332aa [ODBC] Support ODBC Sink for insert into data to ODBC external table (#5033)
issue:#5031

1. Support ODBC Sink for insert into data to ODBC external table.
2. Support transactions for the ODBC sink to make sure that inserting data is atomic.
3. The document about ODBC sink has been modified
2020-12-13 21:53:27 +08:00
e278e0b3db [Load] Support full StreamLoad feature in multiload (#4717) 2020-12-10 09:37:18 +08:00
ca9e5c4785 [Bug] Add a flag to prevent repeated close operation of OlapTabletSink (#5034)
The close method of OlapTabletSink may be called twice.
In the open_internal() method of plan_fragment_executor, close is called once.
If an error occurs in this call, it will be called again in fragment_mgr.
So here we use a flag to prevent repeated close operations.

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-12-09 09:30:09 +08:00
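The guard described in this commit is the usual close-once pattern. A minimal Python sketch, with a hypothetical `TabletSink` class and a counter standing in for the real cleanup work:

```python
class TabletSink:
    """Sink whose close() is safe to call more than once."""

    def __init__(self):
        self._is_closed = False
        self.close_calls = 0  # counts how often real cleanup ran

    def close(self):
        if self._is_closed:
            return  # a second call (e.g. from an error path) is a no-op
        self._is_closed = True
        self.close_calls += 1  # stands in for releasing resources
```

Calling `close()` twice on the same object runs the cleanup only once.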
f2d69a51d4 [Docs]Remove some unused variables and update BE config documents (#4987)
Remove some unused variables and update BE config documents about compaction.
2020-12-09 09:28:56 +08:00
49f7eb69bf [Refactor] Refactor DeleteHandler and Cond module (2nd) (#5030)
* [Refactor] Refactor DeleteHandler and Cond module (#4925)

This patch mainly does the following refactors:
- Use int64_t instead of int32_t for 'version' in DeleteHandler
- Move some comments from .cpp to .h file, add some new comments in .h files, and also remove some meaningless comments
- Use switch...case... instead of multiple if..else.. for DeleteConditionHandler::is_condition_value_valid
- Use range loop to simplify code
- Reduce some compare operations in Cond::del_eval
- Improve some branch predictions in Reader
- Fix and improve some unit tests
2020-12-08 10:01:18 +08:00
eb0cb04a70 Fix a core dump introduced by pr #5022 (#5032)
* fix a core dump caused by pr #5022
2020-12-08 10:00:07 +08:00
b9dabc3b5b [Enhance] Push down predicate on value column of unique table to base rowset (#5022) 2020-12-06 08:50:37 +08:00
6021d6fc7f [Performance Optimization] Remove push down conjuncts in olap scan node (#4999)
Push conjuncts down to the storage engine as much as possible.

The OLAP scan node does not need to filter data with the pushed-down conjuncts again.

fix #4986
2020-12-06 08:50:08 +08:00
b954dfd82d [Bug] Fix the bug of Largeint and Decimal json load failed. (#4983)
Use the JSON load parameter "num_as_string" to parse JSON data with the kParseNumbersAsStringsFlag flag.
2020-12-06 08:49:30 +08:00
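Why parsing numbers as strings helps can be shown with an analogy in Python's standard `json` module, whose `parse_int` and `parse_float` hooks receive the original number text. This is only an analogy to the flag mentioned above, not the BE code path, and `load_numbers_as_strings` is a hypothetical helper.

```python
import json


def load_numbers_as_strings(text):
    """Parse JSON while keeping every number as its original text."""
    return json.loads(text, parse_int=str, parse_float=str)


doc = (
    '{"big": 170141183460469231731687303715884105727, '
    '"price": 12345678901234567.123456789}'
)
as_text = load_numbers_as_strings(doc)  # digits preserved exactly
as_float = json.loads(doc)              # "price" becomes a lossy float
```

Here `as_text["price"]` keeps all digits, while `as_float["price"]` is a binary float that can no longer represent them, which is the kind of loss a Decimal or Largeint column cannot tolerate.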
b1b99ae884 [Function] Support Decimal to calculate variance and standard deviation (#4959) 2020-12-06 08:49:01 +08:00
c440aa07d1 Revert "[Refactor] Refactor DeleteHandler and Cond module (#4925)" (#5028)
This reverts commit 9c9992e0aa28ee85364eebf86a6675f1073e08fb.

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-12-05 21:39:49 +08:00
9c9992e0aa [Refactor] Refactor DeleteHandler and Cond module (#4925)
This patch mainly does the following refactors:
- Use int64_t instead of int32_t for 'version' in DeleteHandler
- Move some comments from .cpp to .h file, add some new comments in .h files, and also remove some meaningless comments
- Use switch...case... instead of multiple if..else.. for DeleteConditionHandler::is_condition_value_valid
- Use range loop to simplify code
- Reduce some compare operations in Cond::del_eval
- Improve some branch predictions in Reader
- Fix and improve some unit tests
2020-12-04 12:13:30 +08:00
1f236a5339 [BUG] Fix core when schema change (#5018) 2020-12-04 09:53:19 +08:00
8823f2d928 [Bug] Fix incorrect name of TaskWorkerPool (#5015)
'_task_worker_type' is not yet initialized when it is used to init '_name',
so '_name' is always 'TaskWorkerPool.CREATE_TABLE'. This patch fixes
this bug.
2020-12-04 09:30:23 +08:00
1ae6de7117 [Enhance] Add "statistics" meta table and fix some mysql compatibility problem (#4991)
1. Add metadata table 'statistics' to store index information;
2. In the header information returned by mysql, the data type length is returned according to the actual type.
2020-12-03 09:38:18 +08:00
5215727b45 [Function] Let "str_to_date" return correct type (#5004)
The return type of str_to_date depends on whether the time part is included in the format.
If included, it is DATETIME, otherwise it is DATE.
If the format parameter is not constant, the return type will be DATETIME.
The above judgment has been completed in the FE query planning stage,
so here we directly set the value type to the return type set in the query plan.

For example:
A table with one column k1 varchar, and has 2 lines:
    "%Y-%m-%d"
    "%Y-%m-%d %H:%i:%s"
Query:
    SELECT str_to_date("2020-09-01", k1) from tbl;
Result will be:
    2020-09-01 00:00:00
    2020-09-01 00:00:00

Query:
     SELECT str_to_date("2020-09-01", "%Y-%m-%d");
Return type is DATE

Query:
     SELECT str_to_date("2020-09-01", "%Y-%m-%d %H:%i:%s");
Return type is DATETIME
2020-12-03 09:33:26 +08:00
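The type rule in this commit can be written down as a small decision function. This Python sketch is illustrative only: the real decision is made in the FE query planner, `str_to_date_return_type` is a hypothetical name, the list of time specifiers is an assumption based on MySQL-style format strings, and escaped percent signs are not handled.

```python
# Assumption: MySQL-style specifiers that refer to a time-of-day component.
TIME_SPECIFIERS = (
    "%H", "%h", "%I", "%i", "%k", "%l", "%p", "%r", "%S", "%s", "%T", "%f",
)


def str_to_date_return_type(fmt=None, is_constant=True):
    """Return 'DATE' or 'DATETIME' for a str_to_date call."""
    if not is_constant or fmt is None:
        return "DATETIME"  # format unknown at planning time
    if any(spec in fmt for spec in TIME_SPECIFIERS):
        return "DATETIME"  # format includes a time part
    return "DATE"
```

This matches the three cases above: a column-valued format gives DATETIME, `"%Y-%m-%d"` gives DATE, and `"%Y-%m-%d %H:%i:%s"` gives DATETIME.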
92db00bd86 [Bug] Fix concurrent access of _tablets_under_clone in TabletManager (#5000)
_tablets_under_clone in TabletManager is not sharded, but the lock
used to prevent concurrent access is sharded, so when the shard size
is not 1, it will cause a coredump.
This patch fixes this bug, and also does some refactoring to make shard
locks more convenient to use.
2020-12-03 09:32:44 +08:00
af06adb57f [Doris On ES][Bug-fix] fix boolean predicate pushdown manner (#4990)
Correctly handle `boolean` field predicates by setting the predicate value to `true`, `false`, or `empty set` for DOE
2020-12-02 10:13:13 +08:00
df1f06e60b Optimized the read performance of the table when have multi versions (#4958)
* Optimized the read performance of tables with multiple versions:
changed the merge method of the unique table
to merge the cumulative version data first, and then merge with the base version.
Data with only one base version is read directly without merging
2020-12-01 12:25:11 +08:00
99404df8b2 [Bug][Compaction] Fix bug that output rowset is not deleted after compaction failure (#4964)
This CL fixes 2 bugs:

1. When the compaction fails, we must explicitly delete the output rowset,
otherwise the GC logic cannot process these rows.

2. Base compaction failed if the compaction process included some delete versions in SegmentV2,
because the number of filtered rows was wrong.
2020-11-30 22:02:03 +08:00