doris

Author	SHA1	Message	Date
HangyuanLiu	60f93b2142	Fix bitmap type (#3749 )	2020-06-03 10:07:58 +08:00
HuangWei	fdf66b8102	[MemTracker] add log depth & auto unregister (#3701 )	2020-06-01 23:16:25 +08:00
Yingchun Lai	43d25afa2c	[compaction] Update cumulative point calculate algorithm (#3690 ) The current cumulative point calculation may skip a singleton rowset when the rowset has only one segment and carries the NONOVERLAPPING flag. When a tablet is newly created and accumulates many singleton rowsets, the cumulative point is calculated as the max version + 1, so cumulative compaction cannot pick any rowsets and fails. The next base compaction on this tablet then runs over all rowsets, which can also cause memory consumption problems when there are thousands of rowsets. All singleton rowsets must have been newly written by the delta writer and have not gone through any compaction, so the cumulative point should be placed before any of these rowsets.	2020-05-30 10:34:53 +08:00
Binglin Chang	7524c5ef63	[Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637 )	2020-05-30 10:33:10 +08:00
Binglin Chang	c967eaf496	[Memory Engine] Add TabletType to PartitionInfo and TabletMeta (#3668 )	2020-05-29 20:20:44 +08:00
lichaoyong	93aae6bdff	[Bug] Fix mixed use of counters (#3720 ) In MysqlResultWriter, _sent_rows_counter and _result_send_timer were mixed up, which results in a core dump when checking counter->type().	2020-05-29 15:36:21 +08:00
Dayue Gao	9c85d05e41	[Bug] RuntimeState should be destructed after DataSink (#3709 ) Fixes #3706 DataSink uses instance and query MemTracker from RuntimeState, therefore it should be destructed before RuntimeState. Otherwise memory corruption and segfault could happen.	2020-05-28 17:31:01 +08:00
worker24h	e76f712bb3	[Bug] Load data is error in json load	2020-05-28 17:28:33 +08:00
lichaoyong	8f71c7a331	Duplicate Key table core when predicate on metric column (#3699 ) ``` CREATE TABLE `query_detail` ( `query_id` varchar(100) NULL COMMENT "", `start_time` datetime NULL COMMENT "", `end_time` datetime NULL COMMENT "", `latency` int(11) NULL COMMENT "unit is milliseconds", `state` varchar(20) NULL COMMENT "RUNNING/FINISHED/FAILED", `sql` varchar(1024) NULL COMMENT "" ) DUPLICATE KEY(`query_id`) SELECT COUNT(*) FROM query_detail WHERE start_time >= '2020-05-27 14:52:16' AND start_time < '2020-05-27 14:52:31'; ``` The above query core dumps because the ZoneMap exists only for query_id. Using start_time to match the ZoneMap causes this core dump.	2020-05-28 14:35:40 +08:00
Mingyu Chen	f89d970cfd	[Bug][Metrics] Fix bug that some metrics cannot be retrieved (#3708 ) The metrics in a metric collector need to have the same type, but do not need to have the same unit.	2020-05-28 09:09:14 +08:00
lichaoyong	1cc78fe69b	[Enhancement] Convert metric to Json format (#3635 ) Add a JSON format for existing metrics like this. ``` { "tags": { "metric":"thread_pool", "name":"thrift-server-pool", "type":"active_thread_num" }, "unit":"number", "value":3 } ``` I add a new JsonMetricVisitor to handle the transformation. It does not modify the existing PrometheusMetricVisitor and SimpleCoreMetricVisitor. This change also: 1. Adds a unit item to describe the metric better 2. Divides cloning tablet statistics by database 3. Uses white space to replace newlines in audit.log	2020-05-27 08:49:30 +08:00
worker24h	fb66bac5fe	[Bug] Fix null pointer access in json-load (#3692 ) Add check for null pointer to avoid core dump	2020-05-26 22:41:30 +08:00
Mingyu Chen	dcd5e5df12	[AuditPlugin] Modify load label of audit plugin to avoid load conflicts (#3681 ) Change the load label of the audit plugin to: `audit_yyyyMMdd_HHmmss_feIdentity`. The `feIdentity` is obtained from the FE that runs this plugin; currently it just uses the FE's IP_editlog_port.	2020-05-26 18:23:07 +08:00
HappenLee	f4c03fe8e2	1. Delete the code of Sort Node we do not use now. (#3666 ) Optimize the quick sort by find_the_median and try to reduce recursion level of quick sort.	2020-05-26 10:20:57 +08:00
Mingyu Chen	3ffc447b38	[OUTFILE] Support `INTO OUTFILE` to export query result (#3584 ) This CL mainly changes: 1. Support `SELECT INTO OUTFILE` command. 2. Support export query result to a file via Broker. 3. Support CSV export format with specified column separator and line delimiter.	2020-05-25 21:24:56 +08:00
yangzhg	ba7d2dbf7b	[Function] Support utf-8 encoding in instr, locate, locate_pos, lpad, rpad (#3638 ) Support utf-8 encoding for string function `instr`, `locate`, `locate_pos`, `lpad`, `rpad` and add unit test for them	2020-05-22 14:34:26 +08:00
HuangWei	fb02bb5cd9	[Load] Fix mem limit in NodeChannel (#3643 )	2020-05-22 09:11:59 +08:00
worker24h	4f79036a7e	Add error code into error message (#3645 )	2020-05-21 19:14:35 +08:00
Mingyu Chen	f6b5c8839b	[Bug] Ignore loading DELETE status tablet error when restarting BE (#3641 ) Fix: #3640 Also add a `batch delete meta` feature for `meta tool` Fix #3639	2020-05-21 19:08:28 +08:00
worker24h	ef8fd1fcbe	[Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553 ) Doris supports loading JSON data via RoutineLoad or StreamLoad	2020-05-21 13:00:49 +08:00
EmmyMiao87	0d66e6bd15	Support bitmap_intersect (#3571 ) * Support bitmap_intersect Support the aggregate function Bitmap Intersect, which is mainly used to take the intersection of grouped data. The function 'bitmap_intersect(expr)' calculates the intersection of bitmap columns and returns a bitmap object. The definition is as follows: FunctionName: bitmap_intersect, InputType: bitmap, OutputType: bitmap The scenario is as follows: Query which users satisfy the three tags a, b, and c at the same time. ``` select bitmap_to_string(bitmap_intersect(user_id)) from ( select bitmap_union(user_id) user_id from bitmap_intersect_test where tag in ('a', 'b', 'c') group by tag ) a ``` Closed #3552. * Add docs of bitmap_union and bitmap_intersect * Support null of bitmap_intersect	2020-05-20 21:12:02 +08:00
Binglin Chang	c54cb4b14e	[Memory Engine] Add column reader/writer (#3580 )	2020-05-20 11:09:30 +08:00
yangzhg	6be7a6232f	[Config] Add ignore config to determine whether to continue to start be when load tablet from header failed. (#3632 ) Add config ignore_load_tablet_failure to determine whether to continue to start be when load tablet from header failed.	2020-05-20 09:40:50 +08:00
yangzhg	58a6628af2	[Bug] Fix first start error after upgrading Doris to support deleting duplicate table value columns (#3628 )	2020-05-20 09:39:24 +08:00
Dayue Gao	9425f17d28	[Bug] instance mem tracker should have no limit (#3592 )	2020-05-19 19:49:39 +08:00
令狐少侠	8018b1c348	[Doris on ES] Fix bug of LIKE not being translated correctly (#3602 ) Why this case happened: In the current implementation, translation into DSL happens only if the wildcard is not the first character. Thus, when the SQL is written like '%abc', the translation would not run. How it is fixed: Now translation is triggered by the character '?' or '*'. If it is the first character, translate directly; otherwise, check whether the preceding character is escaped to determine whether to translate.	2020-05-19 17:06:46 +08:00
Mingyu Chen	7fb74db0a1	[Trace] Introduce trace util to BE Ref https://github.com/apache/incubator-doris/issues/3566 Introduce the trace utility from Kudu to BE. This utility is widely used in Kudu, and Impala also imports it. The trace util traces each phase in a thread and can be dumped to a string to show each phase's time cost and diagnose which phase costs the most time. The util stores a Trace object as a thread-local variable. Trace entries that record the current file name, line number, user-specified symbols and timestamp can be added to this object, and counters can be added to it as well. The object can then be dumped to a human-readable string. There are some helpful macros defined in trace.h; here is a simple usage example: ``` scoped_refptr<Trace> t1(new Trace); // New 2 traces scoped_refptr<Trace> t2(new Trace); t1->AddChildTrace("child_trace", t2.get()); // t1 add t2 as a child named "child_trace" TRACE_TO(t1, "step $0", 1); // Explicitly trace to t1 usleep(10); // ... do some work ADOPT_TRACE(t1.get()); // Explicitly adopt to trace to t1 TRACE("step $0", 2); // Implicitly trace to t1 { // The time spent in this scope is added to counter t1.scope_time_cost TRACE_COUNTER_SCOPE_LATENCY_US("scope_time_cost"); ADOPT_TRACE(t2.get()); // Adopt to trace to t2 for the duration of the current scope TRACE("sub start"); // Implicitly trace to t2 usleep(10); // ... do some work TRACE("sub before loop"); for (int i = 0; i < 10; ++i) { TRACE_COUNTER_INCREMENT("iterate_count", 1); // Increase counter t2.iterate_count MicrosecondsInt64 start_time = GetMonoTimeMicros(); usleep(10); // ... do some work MicrosecondsInt64 end_time = GetMonoTimeMicros(); int64_t dur = end_time - start_time; // t2's simple histogram metric with name prefixed with "lbm_writes" const char* counter = BUCKETED_COUNTER_NAME("lbm_writes", dur); TRACE_COUNTER_INCREMENT(counter, 1); } TRACE("sub after loop"); } TRACE("goodbye $0", "cruel world"); // Automatically restore to trace to t1 std::cout << t1->DumpToString(Trace::INCLUDE_ALL) << std::endl; ``` The output looks like: ``` 0514 02:16:07.988054 (+ 0us) trace_test.cpp:76] step 1 0514 02:16:07.988112 (+ 58us) trace_test.cpp:80] step 2 0514 02:16:07.988863 (+ 751us) trace_test.cpp:103] goodbye cruel world Related trace 'child_trace': 0514 02:16:07.988120 (+ 0us) trace_test.cpp:85] sub start 0514 02:16:07.988188 (+ 68us) trace_test.cpp:88] sub before loop 0514 02:16:07.988850 (+ 662us) trace_test.cpp:101] sub after loop Metrics: {"scope_time_cost":744,"child_traces":[["child_trace",{"iterate_count":10,"lbm_writes_lt_1ms":10}]]} ``` Apart from importing the original source code, this patch does the following work to adapt it to Doris: - Rename "kudu" namespace to "doris" - Update some names to the existing function names in Doris, e.g. strings::internal::SubstituteArg::kNoArg -> strings::internal::SubstituteArg::NoArg - Use doris::SpinLock instead of kudu::simple_spinlock, which hasn't been imported - Use manual malloc() and free() instead of kudu::Arena, which hasn't been imported - Use manual rapidjson::Writer instead of kudu::JsonWriter, which hasn't been imported - Remove all TRACE_EVENT related unit tests since TRACE_EVENT is not imported this time - Update CMakeLists.txt NOTICE(#3622): This is a "revert of revert pull request". This PR is mainly used to combine the PRs whose commits were scattered due to the wrong merge method into a single complete commit.	2020-05-18 14:55:11 +08:00
Mingyu Chen	69a63f6f53	Revert "[trace] Introduce trace util to BE" (#3614 ) This revert is used to correct the mess of the commit timeline caused by the wrong merge method.	2020-05-18 13:16:39 +08:00
Mingyu Chen	bb7ae97845	[trace] Introduce trace util to BE Ref https://github.com/apache/incubator-doris/issues/3566 Introduce the trace utility from Kudu to BE. This utility is widely used in Kudu, and Impala also imports it. The trace util traces each phase in a thread and can be dumped to a string to show each phase's time cost and diagnose which phase costs the most time. The util stores a Trace object as a thread-local variable. Trace entries that record the current file name, line number, user-specified symbols and timestamp can be added to this object, and counters can be added to it as well. The object can then be dumped to a human-readable string. There are some helpful macros defined in trace.h; here is a simple usage example: ``` scoped_refptr<Trace> t1(new Trace); // New 2 traces scoped_refptr<Trace> t2(new Trace); t1->AddChildTrace("child_trace", t2.get()); // t1 add t2 as a child named "child_trace" TRACE_TO(t1, "step $0", 1); // Explicitly trace to t1 usleep(10); // ... do some work ADOPT_TRACE(t1.get()); // Explicitly adopt to trace to t1 TRACE("step $0", 2); // Implicitly trace to t1 { // The time spent in this scope is added to counter t1.scope_time_cost TRACE_COUNTER_SCOPE_LATENCY_US("scope_time_cost"); ADOPT_TRACE(t2.get()); // Adopt to trace to t2 for the duration of the current scope TRACE("sub start"); // Implicitly trace to t2 usleep(10); // ... do some work TRACE("sub before loop"); for (int i = 0; i < 10; ++i) { TRACE_COUNTER_INCREMENT("iterate_count", 1); // Increase counter t2.iterate_count MicrosecondsInt64 start_time = GetMonoTimeMicros(); usleep(10); // ... do some work MicrosecondsInt64 end_time = GetMonoTimeMicros(); int64_t dur = end_time - start_time; // t2's simple histogram metric with name prefixed with "lbm_writes" const char* counter = BUCKETED_COUNTER_NAME("lbm_writes", dur); TRACE_COUNTER_INCREMENT(counter, 1); } TRACE("sub after loop"); } TRACE("goodbye $0", "cruel world"); // Automatically restore to trace to t1 std::cout << t1->DumpToString(Trace::INCLUDE_ALL) << std::endl; ``` The output looks like: ``` 0514 02:16:07.988054 (+ 0us) trace_test.cpp:76] step 1 0514 02:16:07.988112 (+ 58us) trace_test.cpp:80] step 2 0514 02:16:07.988863 (+ 751us) trace_test.cpp:103] goodbye cruel world Related trace 'child_trace': 0514 02:16:07.988120 (+ 0us) trace_test.cpp:85] sub start 0514 02:16:07.988188 (+ 68us) trace_test.cpp:88] sub before loop 0514 02:16:07.988850 (+ 662us) trace_test.cpp:101] sub after loop Metrics: {"scope_time_cost":744,"child_traces":[["child_trace",{"iterate_count":10,"lbm_writes_lt_1ms":10}]]} ``` Apart from importing the original source code, this patch does the following work to adapt it to Doris: - Rename "kudu" namespace to "doris" - Update some names to the existing function names in Doris, e.g. strings::internal::SubstituteArg::kNoArg -> strings::internal::SubstituteArg::NoArg - Use doris::SpinLock instead of kudu::simple_spinlock, which hasn't been imported - Use manual malloc() and free() instead of kudu::Arena, which hasn't been imported - Use manual rapidjson::Writer instead of kudu::JsonWriter, which hasn't been imported - Remove all TRACE_EVENT related unit tests since TRACE_EVENT is not imported this time - Update CMakeLists.txt	2020-05-18 11:10:25 +08:00
HappenLee	7bf926eba8	[Profile] Improve the running profile 1. Delete invalid counter in Data_Stream_Sender. (#3598) 2. Add counters for PartitionHashTable of PartitionAggregationNode: * Hash Probe Method * Row processed by Aggregation * HashFilledBuckets: count of filled buckets in aggregation * HTResize: count of HashTable resizes * HashProbe: count of HashTable probes * HashFailedProbe: count of failed HashTable probes * HashTravelLength: total travel length for probes * HashCollisions: count of hash collisions 3. Remove some unnecessary code in PartitionHashTable by template	2020-05-16 21:35:30 +08:00
Dayue Gao	273aad6cf4	[Bug] Restore tablet action not working because tablet status is shutdown (#3551 )	2020-05-15 10:11:17 +08:00
yangzhg	123e1394b1	[Delete] Allow delete duplicated non-key column using delete from (#3424 )	2020-05-15 09:26:36 +08:00
Yingchun Lai	9fc2554e6c	indentation	2020-05-14 14:45:22 +00:00
Yingchun Lai	8406723912	adapt to Doris	2020-05-13 12:13:47 +00:00
Yingchun Lai	e066791e47	import original files	2020-05-13 19:03:20 +08:00
Binglin Chang	a7cfafe076	[Memory Engine] add core column related classes (#3508 ) add core column related classes	2020-05-13 16:30:32 +08:00
令狐少侠	5a57ecca15	[Doris On ES] Fix bug of query failing in doc_value_mode when fields have no value (#3513 ) #3479 Here I try to explain the cause of the problem and how to fix it. The Cause of The Problem Take the case in issue (#3479 ) as an example. The general results are as follows: ``` GET table/_doc/_search {"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100} { "took": 6, "timed_out": false, "_shards": { …… }, "hits": { "total": 3, "max_score": null, "hits": [ { "_index": "table", "_score": null, "sort": [ 0 ] }, { "_index": "table", "_score": null, "fields": { "k1": [ "kkk1" ] }, "sort": [ 0 ] }, { "_index": "table", "_score": null, "sort": [ 0 ] } ] } } ``` But in Doris on ES, BE fetches data in parallel on all shards and uses `filter_path` to reduce the network cost. The process is as follows: ``` GET table/_doc/_search?preference=_shards:1&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields {"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100} { "hits": { "total": 0 } } GET table/_doc/_search?preference=_shards:2&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields {"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100} { "hits": { "total": 1 } } GET table/_doc/_search?preference=_shards:3&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields {"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100} { "hits": { "total": 1, "hits": [ { "fields": { "k1": [ "kkk1" ] } } ] } } ``` The scan worker on BE that processes the result of shard 2 will fail. The reasons are as follows: 1. "filter_path" causes the hits.hits object not to exist. 2. In the current implementation, if there are some data rows (total > 0), the hits.hits object must be an array. How To Fix It Two methods: 1. Modify "filter_path" to contain the hits. Pros: the fix is very simple. Cons: more network cost. 2. Deal with the case where fields are missing in a batch. Pros: no loss of performance. Cons: the code is more complex. Performance comes first, so I use method 2. Design 1. Add a variable "_doc_value_mode" to class "EsScrollParser" to indicate whether the data processed by this parser is in doc_value_mode or not. 2. "_doc_value_mode" is passed from ESScollReader <- ESScanner <- ScrollQueryBuilder::build(), which determines whether the DSL enables doc_value_mode. 3. When hits.hits in the response from ES is empty and total > 0, we know there are data rows but the corresponding fields do not exist. EsScrollParser will use "_doc_value_mode" and _total to construct _total rows whose fields are assigned 'NULL'.	2020-05-11 15:34:12 +08:00
HuangWei	57cbfb772d	Add -Werror when gcc<=7.3.0 & udf fix (#3533 )	2020-05-11 10:31:38 +08:00
Yingchun Lai	b576e54fe6	[ASAN] Fix some address problems detected by ASAN (#3495 ) Errors detected by LSAN have been fixed by a prior patch (#3326), but there are still some errors detected by ASAN. This patch tries to fix these errors to make Doris BE more robust. We can then add a CI run in LSAN/ASAN mode to detect memory errors as early as possible.	2020-05-11 10:30:45 +08:00
Dayue Gao	56db6e7a35	[Config] allow user to config BRPC socket_max_unwritten_bytes (#3488 ) Add new BE config `brpc_socket_max_unwritten_bytes`	2020-05-10 17:56:14 +08:00
Dayue Gao	b62b310864	[Bug] Fix BE crash when input to hll_merge is null (#3521 )	2020-05-09 11:01:48 +08:00
Yingchun Lai	e2c3c84e8d	[ut] Disable background scan context gc to speed up unit test (#3524 ) Each test case in ExternalScanContextMgrTest may cost 1 minute, which is too long, so we'd better disable background scan context gc to speed up the unit test.	2020-05-09 09:01:05 +08:00
Youngwb	a656a7ddd4	Support append_trailing_char_if_absent function (#3439 )	2020-05-09 08:59:34 +08:00
Dayue Gao	2f7d2c7e1a	[BUG] Fix a bug that ignore_broken_disk may not work (#3486 ) When BE sets `ignore_broken_disk` to true, it's expected that a non-existent path in storage_root_path won't prevent BE from launching, but in 0.12 BE fails to launch in this scenario. ``` W0506 14:46:11.039953 17040 options.cpp:64] path can not be canonicalized. may be not exist. path=/data11/olap W0506 14:46:11.040014 17040 options.cpp:141] failed to parse store path /data11/olap, res=-203 ``` The reason is that #2861 adds a path existence check in `parse_root_path`, which precedes the usage of `ignore_broken_disk` in the main method.	2020-05-08 12:53:44 +08:00
yangzhg	94b3a2bd50	[Bug] Fix string functions not supporting multibyte strings (#3345 ) Let string functions support utf8 encoding	2020-05-08 12:52:46 +08:00
yangzhg	f90da72078	[Planner]Enhance AssertNumRowsNode (#3485 ) Enhance AssertNumRowsNode to support equal, less than, greater than,... assert conditions	2020-05-08 12:49:48 +08:00
yangzhg	c85d847b1e	[CompileBug] fix a compile error (#3502 ) NodeChannel::mark_close() missing `return`	2020-05-07 23:01:46 +08:00
HuangWei	94539e7120	Non blocking OlapTableSink (#3143 ) Implementation Notes NodeChannel _cur_batch -> _pending_batches: when _cur_batch is filled up, move it to _pending_batches. add_row() just produces batches. try_send_and_fetch_status() tries to consume one pending batch. If there is an in-flight packet, skip sending in this round. So we can add one sender thread to be in charge of try_send for all node channels. IndexChannel init(), open() stay the same. Use for_each_node_channel() to expose the detailed changes of NodeChannel (it's easier to read and modify). Sender thread See func OlapTableSink::_send_batch_process() Why use polling? If we used wait/notify, it would notify when a new batch is generated. We can't skip sending this batch, because it won't notify for the same batch again. So wait/notify can't simply avoid blocking, and I chose polling. Continuously calling try_send() is wasteful, but it's difficult to set a suitable polling interval. Thus, I add std::this_thread::yield() to give up the time slice and give priority to other processes/threads (if there are other processes/threads waiting in the queue).	2020-05-07 10:43:41 +08:00
sduzh	9e8a060e5b	Replace std::tr1::unordered_map with std::unordered_map (#3478 )	2020-05-07 10:38:27 +08:00
sduzh	36f2863574	fix mismatched tags (#3489 ) RandomAccessFileOptions, WritableFileOptions and RandomRWFileOptions are defined as a struct but were previously declared as a class; this is valid, but results in a compile warning or error under the clang compiler	2020-05-07 09:37:26 +08:00
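
The mismatched-tags issue described in the last entry above (#3489) can be illustrated with a minimal sketch. The type `FileOptions`, its fields, and the helper `describe` are hypothetical stand-ins, not the actual Doris definitions; the point is only that the forward declaration should use the same class-key (`struct`) as the definition.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for option types such as RandomAccessFileOptions.
// The forward declaration uses `struct`, matching the definition below.
// Writing `class FileOptions;` here would still be valid C++, but clang
// would report -Wmismatched-tags, which turns into an error with -Werror.
struct FileOptions;

// A function that only needs the forward declaration in its signature.
std::string describe(const FileOptions& opts);

// The definition uses the same class-key as the forward declaration.
struct FileOptions {
    bool sync_on_close = false;           // hypothetical field
    std::string mode = "CREATE_OR_OPEN";  // hypothetical field
};

std::string describe(const FileOptions& opts) {
    return opts.mode + (opts.sync_on_close ? ":sync" : ":nosync");
}

// Small self-check exercising the default and a modified value.
void self_check() {
    FileOptions opts;
    assert(describe(opts) == "CREATE_OR_OPEN:nosync");
    opts.sync_on_close = true;
    assert(describe(opts) == "CREATE_OR_OPEN:sync");
}
```

Keeping the class-key consistent costs nothing at runtime; it only avoids the diagnostic on compilers that check for it.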


893 Commits