962 Commits

Author SHA1 Message Date
0224d49842 [Fix][Bug] Fix compile bug (#3888)
Co-authored-by: chenmingyu <chenmingyu@baidu.com>
2020-06-16 18:42:04 +08:00
6c4d7c60dd [Feature] Add QueryDetail to store query statistics. (#3744)
1. Store the query statistics in memory.
2. Support a RESTful interface to get the statistics.
2020-06-15 18:16:54 +08:00
2211cb0ee0 [Metrics] Add metrics document and 2 new metrics of TCP (#3835) 2020-06-15 09:48:09 +08:00
3c09e1e1d8 [trace] Adapt trace util to compaction module (#3814)
The trace util is helpful for diagnosing compaction performance problems.
The trace log for a base compaction looks like:
```
W0610 11:26:33.804431 56452 storage_engine.cpp:552] Trace:
0610 11:23:03.727535 (+     0us) storage_engine.cpp:554] start to perform base compaction
0610 11:23:03.728961 (+  1426us) storage_engine.cpp:560] found best tablet 546859
0610 11:23:03.728963 (+     2us) base_compaction.cpp:40] got base compaction lock
0610 11:23:03.729029 (+    66us) base_compaction.cpp:44] rowsets picked
0610 11:24:51.784439 (+108055410us) compaction.cpp:46] got concurrency lock and start to do compaction
0610 11:24:51.784818 (+   379us) compaction.cpp:74] prepare finished
0610 11:26:33.359265 (+101574447us) compaction.cpp:87] merge rowsets finished
0610 11:26:33.484481 (+125216us) compaction.cpp:102] output rowset built
0610 11:26:33.484482 (+     1us) compaction.cpp:106] check correctness finished
0610 11:26:33.513197 (+ 28715us) compaction.cpp:110] modify rowsets finished
0610 11:26:33.513300 (+   103us) base_compaction.cpp:49] compaction finished
0610 11:26:33.513441 (+   141us) base_compaction.cpp:56] unused rowsets have been moved to GC queue
Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"input_rowsets_data_size":1256413170,"input_segments_num":44,"merge_rowsets_latency_us":101574444,"merged_rows":0,"output_row_num":3346807,"output_rowset_data_size":1228439659,"output_segments_num":6}
```
The trace log for a cumulative compaction looks like:
```
W0610 11:14:18.714366 56468 storage_engine.cpp:518] Trace:
0610 11:14:08.068484 (+     0us) storage_engine.cpp:520] start to perform cumulative compaction
0610 11:14:08.069844 (+  1360us) storage_engine.cpp:526] found best tablet 547083
0610 11:14:08.069846 (+     2us) cumulative_compaction.cpp:42] got cumulative compaction lock
0610 11:14:08.069947 (+   101us) cumulative_compaction.cpp:46] calculated cumulative point
0610 11:14:08.070141 (+   194us) cumulative_compaction.cpp:50] rowsets picked
0610 11:14:08.070143 (+     2us) compaction.cpp:46] got concurrency lock and start to do compaction
0610 11:14:08.070518 (+   375us) compaction.cpp:74] prepare finished
0610 11:14:15.389893 (+7319375us) compaction.cpp:87] merge rowsets finished
0610 11:14:15.390916 (+  1023us) compaction.cpp:102] output rowset built
0610 11:14:15.390917 (+     1us) compaction.cpp:106] check correctness finished
0610 11:14:15.409460 (+ 18543us) compaction.cpp:110] modify rowsets finished
0610 11:14:15.409496 (+    36us) cumulative_compaction.cpp:55] compaction finished
0610 11:14:15.410138 (+   642us) cumulative_compaction.cpp:65] unused rowsets have been moved to GC queue
Metrics: {"filtered_rows":0,"input_row_num":136707,"input_rowsets_count":302,"input_rowsets_data_size":76617836,"input_segments_num":302,"merge_rowsets_latency_us":7319372,"merged_rows":0,"output_row_num":136707,"output_rowset_data_size":53893280,"output_segments_num":1}
```
2020-06-13 19:31:51 +08:00
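The trace format shown in the commit above is regular enough to post-process. The following is a minimal Python sketch, not part of Doris, that ranks trace steps by their elapsed time; the function name `slowest_steps` and the regular expression are assumptions derived only from the sample lines above.

```python
import re

# One trace step looks like:
#   0610 11:24:51.784818 (+   379us) compaction.cpp:74] prepare finished
STEP_RE = re.compile(r"^\d{4} [\d:.]+ \(\+\s*(\d+)us\) (\S+)\] (.*)$")


def slowest_steps(trace_text, top=3):
    """Return the `top` steps with the largest elapsed time, as
    (elapsed_us, source_location, message) tuples."""
    steps = []
    for line in trace_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:  # the header and Metrics lines do not match and are skipped
            steps.append((int(match.group(1)), match.group(2), match.group(3)))
    return sorted(steps, key=lambda step: step[0], reverse=True)[:top]


# A few lines copied from the base compaction trace above.
SAMPLE = """\
W0610 11:26:33.804431 56452 storage_engine.cpp:552] Trace:
0610 11:23:03.729029 (+    66us) base_compaction.cpp:44] rowsets picked
0610 11:24:51.784439 (+108055410us) compaction.cpp:46] got concurrency lock and start to do compaction
0610 11:24:51.784818 (+   379us) compaction.cpp:74] prepare finished
0610 11:26:33.359265 (+101574447us) compaction.cpp:87] merge rowsets finished
"""

for elapsed_us, location, message in slowest_steps(SAMPLE, top=2):
    print(f"{elapsed_us / 1e6:8.1f}s  {location}  {message}")
```

On this sample the two dominant steps are waiting for the concurrency lock and merging rowsets, each taking more than 100 seconds.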
b8ee84a120 [Doc] Add docs to OLAP_SCAN_NODE query profile (#3808) 2020-06-13 16:25:40 +08:00
61be7132a9 fix for be server crash which throwing syntax error when parse json … (#3846)
Fix a BE server crash caused by a syntax error thrown when parsing JSON from a Kafka message.
2020-06-13 12:45:16 +08:00
38b6d291f1 [Bug] fix uninitialized member vars (#3848)
This fix is based on the UBSAN unit tests. If class objects are created and used in a different way, `runtime error: load of value XX, which is not a valid value for type 'YYY'` warnings may appear again.

Unit tests should be built in DEBUG or XXSAN mode (at least DEBUG). RELEASE mode adds -DNDEBUG, which turns off dchecks/asserts/debug.
2020-06-13 12:44:49 +08:00
83d39ff9c9 Avoid pass NULL to memcmp() (#3844)
If we evaluate "StringVal(len=0, ptr="") == StringVal(len=0,ptr=NULL)", a NULL pointer is passed to memcmp(). This should be avoided.
2020-06-13 12:43:41 +08:00
dac156b6b1 [Spill To Disk] Analytic_Eval_Node Support Spill Disk and Del Some Unless Code (#3820)
* 1. Add an enable-spilling query option and support spilling to disk in Analytic_Eval_Node. FE can enable spilling with

         set enable_spilling = true;

Now, Sort Node and Analytic_Eval_Node can spill to disk.
2. Delete the merge_sorter code that is no longer used.
3. Replace buffered_tuple_stream with buffered_tuple_stream2 in Analytic_Eval_Node and support spilling to disk. Delete the useless code of buffered_block_mgr and buffered_tuple_stream.
4. Add a DataStreamRecvr profile. Move the counters that belong to DataStreamRecvr from the fragment profile to the DataStreamRecvr profile to make the running profile clearer.

* change some hint in code

* replace disable_spill with enable_spill, which is more compatible with FE
2020-06-13 10:19:02 +08:00
7591527977 [Bug] Fix a bug that insert null bitmap crashes BE (#3830)
INSERT INTO VALUES to_bitmap('xx') may insert null into bitmap column, which may cause dirty data to be written.
2020-06-12 18:03:02 +08:00
8caedadb67 use scoped_refptr to new HashIndex (#3818) 2020-06-10 23:47:10 +08:00
ef94c25773 [Bug]fix the crash of checksum task #3735 (#3738)
1. The table includes a key column of double/float type.
2. When the checksum task runs, it uses all key columns for comparison.
3. schema.column(idx) for a double/float column is NULL.

#3735
2020-06-10 22:59:15 +08:00
8c608bbad5 [Doris On ES] Skip function_call expr when process predicate (#3813)
Fixed #3801
Do not push down function_call expressions such as split_xxx when processing predicates; Doris BE is responsible for evaluating these predicates.

All rows in table:

```
+------+------+------+------------+------------+
| k1   | k2   | k3   | UpdateTime | ArriveTime |
+------+------+------+------------+------------+
| NULL | NULL | kkk1 |  123456789 |       NULL |
| kkk1 | NULL | NULL |  123456789 |       NULL |
| NULL | kkk2 | NULL |  123456789 |       NULL |
+------+------+------+------------+------------+
```

The following predicates cannot be pushed down to ES.

```
SQL 1:
mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk is not null;
+------+
| kk   |
+------+
| kkk  |
+------+
1 row in set (0.02 sec)

SQL 2:
mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk > 'a';      
+------+
| kk   |
+------+
| kkk  |
+------+

SQL 3:
mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk > '2';
+------+
| kk   |
+------+
| kkk  |
+------+
1 row in set (0.03 sec)
```
2020-06-10 11:22:53 +08:00
559714f3d4 Fix largeint max min bug (#3793) 2020-06-08 21:01:30 +08:00
e4dc2ec440 [StorageEngine] Make StorageEngine::open return more detailed info (#3761)
StorageEngine::open returns only a very vague status when it fails, so we
have to check the logs to find the root cause, which is inconvenient when
unit tests run in CI Docker containers.
It would be better to return more detailed failure info that points to the
root cause. For example, it may return an error status with the message
"file descriptors limit is too small".
2020-06-07 10:21:33 +08:00
3b6a781862 [Bug] Fix a bug that tablet's _preferred_rowset_type may be modified to BETA_ROWSET after cloned (#3750)
TabletMeta's _preferred_rowset_type is not initialized when the object is constructed, so it
may hold a random value. The field is not updated when an ALPHA_ROWSET tablet is created,
and in that case it is not serialized into pb. So when an ALPHA_ROWSET tablet is cloned
from another BE, the newly created local tablet's _preferred_rowset_type field
may randomly be BETA_ROWSET and cannot be overwritten after the clone. New input
rows are then written in BETA_ROWSET format, which is not what we expect.
This patch fixes the bug by giving _preferred_rowset_type a default value and updating
the field when any type of tablet is created. It also adds a unit test and the related
overloaded equality operator functions.
2020-06-06 11:36:28 +08:00
0f6e74f3f9 [BUG] Fix location url in agg_fn_evaluator (#3780) 2020-06-06 11:34:12 +08:00
ed9022a908 Ignore broken disk when BE starts up (#3741) 2020-06-05 10:26:07 +08:00
484e7de3c5 [Doirs On ES] fix bug for sparse docvalue context and remove the mistake usage of total (#3751)
An earlier PR, https://github.com/apache/incubator-doris/pull/3513 (issue https://github.com/apache/incubator-doris/issues/3479), tried to resolve the `inner hits node is not an array` error. The error appears when a query (batch-size) runs against a new segment without this field: because filter_path only takes `hits.hits.fields` and `hits.hits._source` into account, the response contains no inner hits node:
```
{
   "_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAHaUWY1ExUVd0ZWlRY2",
   "hits": {
      "total": 1
   }
}
```

Unfortunately, that PR introduced another serious problem: results are inconsistent across different batch_size values because `total` is misused.

To avoid both problems, we add `hits.hits._score` to filter_path when `docvalue_mode` is true. `_score` is always `null`, but it populates the inner hits node:

```
{
   "_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAHaUWY1ExUVd0ZWlRY2",
   "hits": {
      "total": 1,
      "hits": [
         {
            "_score": null
         }
      ]
   }
}
```

related issue: https://github.com/apache/incubator-doris/issues/3752
2020-06-04 16:31:18 +08:00
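The two response shapes in the commit above suggest that a reader should tolerate a missing inner hits node instead of relying on `total`. The following is a minimal Python sketch of that idea, not the BE's actual parsing code; the function name `inner_hits` is hypothetical.

```python
import json


def inner_hits(response_text):
    """Return the list under hits.hits, or [] when the node is absent."""
    body = json.loads(response_text)
    hits = body.get("hits", {}).get("hits")
    if hits is None:
        # Shape 1: filter_path removed the inner hits node entirely.
        return []
    if not isinstance(hits, list):
        raise ValueError("inner hits node is not an array")
    return hits


WITHOUT_INNER_HITS = '{"_scroll_id": "abc", "hits": {"total": 1}}'
WITH_SCORE_ONLY = '{"_scroll_id": "abc", "hits": {"total": 1, "hits": [{"_score": null}]}}'

print(len(inner_hits(WITHOUT_INNER_HITS)))
print(len(inner_hits(WITH_SCORE_ONLY)))
```

The number of rows in a batch is taken from the length of the inner array, never from `total`, which counts all matching documents rather than the documents in the current batch.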
01c1de1870 [Load] Add more metric to trace the time cost in stream load and make brpc_num_threads configurable (#3703) 2020-06-04 13:37:28 +08:00
27046c5b61 [Enhancement] Improve the performance of query with IN predicate (#3694)
This CL mainly changes:
1. Add a new BE config `max_pushdown_conditions_per_column` to limit the number of conditions of a single column that can be pushed down to storage engine.

2. Add 2 new session variables `max_scan_key_num` and `doris_max_scan_key_num`, which can be set at the session level and override the config values in BE.
2020-06-04 11:39:00 +08:00
791f8fee49 [Bug][Outfile] Fix bug that column separater is missing in output file. (#3765)
When writing the result of a query using the `OUTFILE` statement, if some
output column is null, the column separator that should follow it is missing.
2020-06-04 10:35:32 +08:00
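The class of bug described in the commit above can be illustrated with a small Python sketch. This is not the Doris code; both functions and the `\N` NULL marker are assumptions made for the example.

```python
NULL_TEXT = "\\N"  # assumed NULL marker, for illustration only


def format_row_buggy(row, sep):
    """Writes a row, but forgets the separator after a NULL column."""
    out = ""
    for i, value in enumerate(row):
        if value is None:
            out += NULL_TEXT
            continue  # bug: skips the separator handling below
        out += str(value)
        if i < len(row) - 1:
            out += sep
    return out


def format_row(row, sep):
    """Writes a row with a separator between every pair of columns."""
    return sep.join(NULL_TEXT if value is None else str(value) for value in row)


row = ["a", None, "c"]
print(format_row_buggy(row, ","))
print(format_row(row, ","))
```

Joining already-rendered columns keeps the separator logic independent of whether a column is NULL.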
2ad1b20b24 [Config] Add new BE config for tcmalloc (#3732)
Add a new BE config tc_max_total_thread_cache_bytes
2020-06-03 21:58:13 +08:00
73c3de4313 [refactor] Simple refactor on class Reader (#3691)
This is a simple refactor patch on class Reader without any functional changes.
Main refactor points:
- Remove some useless return values
- Use range loops
- Use empty() instead of size() to check whether STL containers are empty
- Use in-class initialization instead of initializing in the constructor
- Some other small refactors
2020-06-03 19:55:53 +08:00
ed886a485d [HttpServer] capture convert exception (#3736)
If the parameter str is an empty string, it throws an exception too. Maybe we can add a UT for parsing parameters in the HTTP server.
2020-06-03 19:54:41 +08:00
e16873a6c1 Fix large string val allocation failure (#3724)
* Fix large string val allocation failure

A large bitmap needs StringVal to allocate a large block of memory, which can be larger than MAX_INT.
The overflow causes bitmap serialization to fail.

Fixed #3600
2020-06-03 17:07:54 +08:00
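The overflow mentioned in the commit above can be reproduced in isolation. The Python sketch below assumes the length is held in a signed 32-bit integer (an assumption, not confirmed by the commit message) and uses `ctypes.c_int32` to show how a size above MAX_INT wraps to a negative value; the helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(n):
    """Value of n after being stored in a signed 32-bit integer."""
    return ctypes.c_int32(n).value


def checked_len(n):
    """Reject sizes that a signed 32-bit length cannot represent."""
    if not 0 <= n <= INT32_MAX:
        raise OverflowError(f"size {n} does not fit in a 32-bit length")
    return n


three_gib = 3 * 2**30
print(as_int32(INT32_MAX))  # still representable
print(as_int32(three_gib))  # wraps to a negative length
```

A negative length is what makes a later allocation or copy fail, so a size of this magnitude has to be either rejected up front, as `checked_len` does, or carried in a wider integer type.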
70aa9d6ca8 [Memory Engine] Add MemTabletScan (#3734) 2020-06-03 15:42:38 +08:00
60f93b2142 Fix bitmap type (#3749) 2020-06-03 10:07:58 +08:00
fdf66b8102 [MemTracker] add log depth & auto unregister (#3701) 2020-06-01 23:16:25 +08:00
43d25afa2c [compaction] Update cumulative point calculate algorithm (#3690)
The current cumulative point calculation algorithm may skip a singleton rowset when the rowset has only one segment and has the NONOVERLAPPING flag. When a tablet is newly created and accumulates many singleton rowsets, the cumulative point is calculated as the max version + 1. Cumulative compaction then cannot pick any rowsets and fails, which leads the next base compaction on this tablet to run over all rowsets. That can also cause a memory consumption problem if there are thousands of rowsets.
All singleton rowsets must have been newly written by the delta writer and have not gone through any compaction, so we should place the cumulative point before any of these rowsets.
2020-05-30 10:34:53 +08:00
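The rule stated in the commit above, placing the cumulative point before any singleton rowset, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the storage engine's algorithm: a rowset is modelled only as a `(start_version, end_version)` pair, a singleton is taken to be a rowset whose start and end versions are equal, and segment overlap flags are ignored.

```python
def cumulative_point(rowsets):
    """Return the version at which cumulative compaction may start.

    `rowsets` is a list of (start_version, end_version) pairs sorted by
    version. Returns None for an empty list.
    """
    if not rowsets:
        return None
    for start, end in rowsets:
        if start == end:
            # First singleton rowset: it has never been compacted, so the
            # cumulative point must not move past it.
            return start
    # No singleton rowsets: everything has already been compacted.
    return rowsets[-1][1] + 1


print(cumulative_point([(0, 5), (6, 6), (7, 7), (8, 8)]))
print(cumulative_point([(0, 5), (6, 9)]))
```

With the point placed at the first singleton, cumulative compaction still has rowsets to pick on a freshly loaded tablet.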
7524c5ef63 [Memory Engine] Add MemSubTablet, MemTablet, WriteTx, PartialRowBatch (#3637) 2020-05-30 10:33:10 +08:00
c967eaf496 [Memory Engine] Add TabletType to PartitionInfo and TabletMeta (#3668) 2020-05-29 20:20:44 +08:00
93aae6bdff [Bug] fix mixed used of counter (#3720)
MysqlResultWriter's _sent_rows_counter and _result_send_timer are mixed up.
This results in a core dump when checking counter->type().
2020-05-29 15:36:21 +08:00
9c85d05e41 [Bug] RuntimeState should be destructed after DataSink (#3709)
Fixes #3706 

DataSink uses instance and query MemTracker from RuntimeState, therefore it should be destructed before RuntimeState. Otherwise memory corruption and segfault could happen.
2020-05-28 17:31:01 +08:00
e76f712bb3 [Bug] Load data is error in json load 2020-05-28 17:28:33 +08:00
8f71c7a331 Duplicate Key table core when predicate on metric column (#3699)
```
CREATE TABLE `query_detail` (
  `query_id` varchar(100) NULL COMMENT "",
  `start_time` datetime NULL COMMENT "",
  `end_time` datetime NULL COMMENT "",
  `latency` int(11) NULL COMMENT "unit is milliseconds",
  `state` varchar(20) NULL COMMENT "RUNNING/FINISHED/FAILED",
  `sql` varchar(1024) NULL COMMENT ""
)
DUPLICATE KEY(`query_id`)

SELECT COUNT(*) FROM query_detail WHERE start_time >= '2020-05-27 14:52:16' AND start_time < '2020-05-27 14:52:31';
```
The above query will core dump because the ZoneMap exists only for query_id.
Using start_time to match against the ZoneMap causes the core dump.
2020-05-28 14:35:40 +08:00
f89d970cfd [Bug][Metrics] Fix bug that some of metrics can not be got (#3708)
The metrics in a metric collector need to have the same type, but they do
not need to have the same unit.
2020-05-28 09:09:14 +08:00
1cc78fe69b [Enhancement] Convert metric to Json format (#3635)
Add a JSON format for existing metrics like this.
```
{
    "tags":
    {
        "metric":"thread_pool",
        "name":"thrift-server-pool",
        "type":"active_thread_num"
    },
    "unit":"number",
    "value":3
}
```
I added a new JsonMetricVisitor to handle the transformation.
It does not modify the existing PrometheusMetricVisitor and SimpleCoreMetricVisitor.
I also added:
1. A unit item to describe the metric better.
2. Tablet cloning statistics, divided by database.
3. White space in place of newlines in audit.log.
2020-05-27 08:49:30 +08:00
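The JSON shape shown in the commit above can be produced with a few lines of Python. This sketch only mirrors the sample object; the function name `metric_to_json` and its parameters are hypothetical and are not the JsonMetricVisitor API.

```python
import json


def metric_to_json(metric, name, metric_type, unit, value):
    """Serialize one metric in the tags/unit/value layout shown above."""
    return json.dumps(
        {
            "tags": {"metric": metric, "name": name, "type": metric_type},
            "unit": unit,
            "value": value,
        }
    )


print(metric_to_json("thread_pool", "thrift-server-pool", "active_thread_num", "number", 3))
```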
fb66bac5fe [Bug] Fix null pointer access in json-load (#3692)
Add check for null pointer to avoid core dump
2020-05-26 22:41:30 +08:00
dcd5e5df12 [AuditPlugin] Modify load label of audit plugin to avoid load confliction (#3681)
Change the load label of audit plugin as:

`audit_yyyyMMdd_HHmmss_feIdentity`.

The `feIdentity` is obtained from the FE that runs this plugin; currently it is just the FE's IP_editlog_port.
2020-05-26 18:23:07 +08:00
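The label pattern `audit_yyyyMMdd_HHmmss_feIdentity` from the commit above can be sketched in Python as below. The function name `audit_label` and the sample identity `fe1_9010` are hypothetical; how the plugin actually renders the FE's IP and edit log port is not shown in the commit message, so the identity is passed in as an opaque string.

```python
from datetime import datetime


def audit_label(now, fe_identity):
    """Build a load label that is unique per second and per FE."""
    return f"audit_{now.strftime('%Y%m%d_%H%M%S')}_{fe_identity}"


print(audit_label(datetime(2020, 5, 26, 18, 23, 7), "fe1_9010"))
```

Including the FE identity means two FEs that flush audit logs in the same second no longer produce the same label.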
f4c03fe8e2 1. Delete the code of Sort Node we do not use now. (#3666)
Optimize the quick sort with find_the_median and try to reduce its recursion depth.
2020-05-26 10:20:57 +08:00
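The two ideas named in the commit above, choosing a median as pivot and limiting recursion depth, are standard quick sort refinements. The Python sketch below assumes that `find_the_median` means median-of-three pivot selection, which the commit message does not confirm, and bounds the recursion depth by recursing only into the smaller partition. It illustrates the technique and is not the Sort Node code.

```python
def median_of_three(a, lo, hi):
    """Order a[lo], a[mid], a[hi] in place and return the median value."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    return a[mid]


def quick_sort(a, lo=0, hi=None):
    """Sort list `a` in place between indexes lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        pivot = median_of_three(a, lo, hi)
        i, j = lo, hi
        while i <= j:  # Hoare-style partition around the pivot value
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Recurse into the smaller side and loop on the larger one, which
        # keeps the recursion depth logarithmic in the input size.
        if j - lo < hi - i:
            quick_sort(a, lo, j)
            lo = i
        else:
            quick_sort(a, i, hi)
            hi = j


data = [5, 3, 9, 1, 5, 8, 2, 7, 0, 5]
quick_sort(data)
print(data)
```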
3ffc447b38 [OUTFILE] Support INTO OUTFILE to export query result (#3584)
This CL mainly changes:

1. Support `SELECT INTO OUTFILE` command.
2. Support exporting query results to a file via Broker.
3. Support CSV export format with specified column separator and line delimiter.
2020-05-25 21:24:56 +08:00
6788cacb94 Fix unit test failed (#3642)
Fix some unit tests that failed due to glog. This may be because we changed the UT build dir and the log path does not exist in the new build dir, so we switch the log output from file to stdout.
2020-05-25 18:55:19 +08:00
12ebd5d82b Remove some outdate test (#3672) 2020-05-25 09:23:56 +08:00
ba7d2dbf7b [Function] Support utf-8 encoding in instr, locate, locate_pos, lpad, rpad (#3638)
Support utf-8 encoding for the string functions `instr`, `locate`, `locate_pos`, `lpad`, `rpad`
and add unit tests for them.
2020-05-22 14:34:26 +08:00
16deac96a9 [UT][Bug] Fix the ut error of bitmap_intersect (#3664)
Change-Id: Id32fd9381119f30786acae9b4ac61b0d5ef9df48
2020-05-22 10:29:12 +08:00
fb02bb5cd9 [Load] Fix mem limit in NodeChannel (#3643) 2020-05-22 09:11:59 +08:00
4f79036a7e Add error code into error message (#3645) 2020-05-21 19:14:35 +08:00
f6b5c8839b [Bug] Ignore loading DELETE status tablet error when restarting BE (#3641)
Fix: #3640 

Also add a `batch delete meta` feature for `meta tool`
Fix #3639
2020-05-21 19:08:28 +08:00
ef8fd1fcbe [Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553)
Doris supports loading JSON data via RoutineLoad or StreamLoad.
2020-05-21 13:00:49 +08:00