Commit Graph

145 Commits

Author SHA1 Message Date
d79cdc829f [Bug] Filter out unavaliable backends when getting tablet location (#6204)
* [Bug] Filter out unavaiable backends when getting scan range location

In the previous implementation, we will eliminate non-surviving BEs in the Coordinator phase.
But for Spark or Flink Connector, there is no such logic, so when a BE node is down,
it will cause the problem of querying errors through the Connector.

* fix ut

* fix compiule
2021-07-13 11:17:49 +08:00
d57c2344e1 [MemTracker] Refactored the hierarchical structure of memtracker (#5956)
To avoid showing too many memtracker on BE web pages.
The MemTracker level now has 3 levels: OVERVIEW, TASK and VERBOSE.

OVERVIEW Mainly used for main memory consumption module such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task.
VERBOSE is used for other more detailed memtrackers.
2021-06-16 09:44:24 +08:00
1ec615c562 [BUG] Fixed some uninitialized variables (#5850)
Fixed some potential bugs caused by uninitialized variables
2021-05-25 10:34:35 +08:00
add8c4bb74 [Load] Support reading multi-line json objects for JsonScanner (#5774)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-05-18 15:44:45 +08:00
98e80aa65e [refactor] Replace boost::function with std::function (#5700)
Replace boost::function with std::function
2021-05-09 22:00:48 +08:00
8850cfe2ad [Compaction] Modify compaction logic (#5737)
1. Add /api/compaction/run_status to show the running compaction tasks.
2. Support do base and cumulative compaction for one tablet at same time.
3. Modify some log level.
4. Add a feedback document.
2021-05-07 11:18:47 +08:00
eb1fdd019d [Bug-fix] Fix the replay error in stream load manager (#5722)
The previous replay logic does not record the size of the map, which eventually resulted in EOF when reading the log.
This pr replaces the replay logic directly with json.

At the same time, the replay logic of image is supplemented.
The pr ensure that the attributes 'lastStreamLoadTime' of backend can be correctly recorded in the image.
2021-05-05 10:18:51 +08:00
8ff87409cb [Bugfix] Fix the incorrect compaction type of compaction API has no corresponding response (#5710) 2021-04-30 10:12:04 +08:00
3de42999e0 [Config]Add uri param to demonstrate tablet meta content of byte type for debug (#5693)
* add switch to demonstrate tablet meta of byte type for debug

Co-authored-by: wangxixu <wangxixu@xiaomi.com>
2021-04-30 10:11:35 +08:00
e519a24c9a dynamic adjust compaction policy (#5651)
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-04-26 12:39:13 +08:00
f5cf008bcc [Bug] Fix stream load UT failed (#5692)
Also move the stream load rocksdb dir to the first of storage root paths
2021-04-23 09:33:42 +08:00
a803ceea86 [refactor] Remove boost mutex, use std::mutex instead (#5684)
* Remove boost mutex, use std::mutex instead

* replace shared_mutex
2021-04-22 11:29:36 +08:00
a4f8194111 [Audit][Stream Load] Support audit function for stream load (#5452)
Record finished stream load job (both successful job and failed job) into audit log
so that we can see when the stream load job was executed and check the details of stream load jobs.
2021-04-21 16:36:12 +08:00
c4cc681d14 remove boost_foreach, using c++ foreach instead (#5611) 2021-04-15 10:52:29 +08:00
75db273b93 [Doris On ES][WIP] Support external ES table with SSL secured and configurable node sniffing (#5325)
Support external ES  table with `SSL` secured and configurable node sniffing
2021-04-12 11:23:49 +08:00
b423274f17 [Enhance] Make MemTracker more accurate (#5515) (#5516)
* [Enhance] Make MemTracker more accurate (#5515)
 This PR main about:
 1. Improve the readability of MemTrackers' name
 2. Add the MemTracker of:
    * Load
    * Compaction
    * SchemaChange
    * StoragePageCache
    * TabletManager
 3. Change SchemaChange to a Singleon

* revise some code for Code Review

* change the name of mem_tracker

* keep reader_context have the same lifetime of rowset_reader in schema change.

* change vlog notice to log(warning) in schema change
2021-04-08 09:14:55 +08:00
d641a26490 [Refactor] Remove boost filesystem (#5579)
* use std::filesystem instead of boost
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2021-04-08 09:11:59 +08:00
ad67dd34a0 update gcc to gcc 10 and support c++17 (#5394)
* update gcc to gcc 10 and support c++17
    update brpc to 0.9.7
    update boost to 1.73
    remove third-party boost 1.54 for mysql

* update cmake version

* ignore jdk version

* remove unused patch

* avoid use SYS_getrandom call
2021-03-25 09:30:38 +08:00
a1bce25677 [BUG] Fix Memory Leak in SchemaChange And Fix some DCHECK error (#5491) 2021-03-17 09:27:05 +08:00
0131c33966 [Enhance] Improve the readability of memtrackers' name (#5455)
Improve the readability of memtrackers' name, then you will be happy to read website be_ip:port/mem_tracker
2021-03-11 22:33:31 +08:00
7a8fbe5db8 [internal] [doris-1084] support compressed csv file in stream load (#5463) 2021-03-11 10:53:05 +08:00
e023ef5404 [Load] Support multi bytes LineDelimiter and ColumnSeparator (#5462)
* [Internal][Support Multibytes Separator] doris-1079
support multi bytes LineDelimiter and ColumnSeparator
2021-03-09 09:35:39 +08:00
a6e2c3e3f1 [Bug][Clone] Fix the bug that incremental clone is not triggered (#5230)
In version 0.13, we support a more efficient compaction logic. 
This logic will maintain multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.

But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`.
At present, the incremental rowset meta recorded in `incr_rs_meta` and the records
in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the
new multi-version path, resulting in many cases not triggering incremental clone.

This CL mainly modified:

1. Removed `incr_rs_meta` metadata
2. Modified the clone logic. When the clone is incremented, it will try to read the rowset in `stale_rs_meta`.
3. Delete a lot of code that was previously used for version compatibility.
2021-02-06 22:04:48 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
2021-01-21 12:09:09 +08:00
99b22c92f8 [Feature] Add a http interface for single tablet migration between different disks (#5101)
Based on PR #4475, this patch add a new feature for single tablet migration between different disks by http.
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-01-16 21:35:20 +08:00
58e58c94d8 [TSAN] Fix tsan bugs (part 1) (#5162)
ThreadSanitizer, aka TSAN, is a useful tool to detect multi-thread
problems, such as data race, mutex problems, etc.
We should detect TSAN problems for Doris BE, both unit tests and
server should pass through TSAN mode, to make Doris more robustness.
This is the very beginning patch to fix TSAN problems, and some
difficult problems are suppressed in file 'tsan_suppressions', you
can suppress these problems by setting:
export TSAN_OPTIONS="suppressions=tsan_suppressions"

before running:
`BUILD_TYPE=tsan ./run-be-ut.sh --run`
2021-01-15 09:45:11 +08:00
07eaf50084 [Feature] Add a http interface to acquire the tablets distribution between different disks (#5096)
For the task of rebalancing tablet among different disks on the same BE,
It might be an effective strategy to ensure all tablets under the same partition
evenly distribute on the different disks. Thus, it is necessary to obtain the
distribution of tablets under the same partition between different disks on a BE.

This patch add a new http interface for BE to acquire the distribution of tablets
under a partition between different disks on the same BE.
2021-01-15 09:32:27 +08:00
279ae1cb75 Add fuzzy_parse option to speed up json import (#5114)
add a flag of fuzzy_parse, if the json file all object keys are the same and has same order, we only need to parse the first row, and then use index instead key to parse value
2020-12-25 09:19:42 +08:00
e278e0b3db [Load] Support full StreamLoad feature in multiload (#4717) 2020-12-10 09:37:18 +08:00
b954dfd82d [Bug] Fix the bug of Largetint and Decimal json load failed. (#4983)
Use param of json load "num_as_string" to use flag kParseNumbersAsStringsFlag to parse json data.
2020-12-06 08:49:30 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
10e1e29711 Remove header file common/names.h (#4945) 2020-11-26 17:00:48 +08:00
74bc25ffe5 [Metrics] Add metric to monitor timeout canceled fragment count (#4862)
It would be helpful to monitor the count of timeout canceled fragments
when there is any issuse cause fragments execute failed or queued too
long time.
2020-11-11 21:21:48 +08:00
9aa3d61dc0 [refactor] change http server log level (#4853)
* change some log level

* change some log level
2020-11-08 20:53:36 +08:00
c53dd949c9 [Feature] Add CPU and Heap profile in BE webserver (#4632)
Add CPU Profile and Heap Profile in BE webserver.
This way we can more easily diagnose system performance bottlenecks through perf tools.
2020-11-05 20:25:07 +08:00
d6497fedc4 [Config] Change config name 'streaming_load_max_batch_size_mb' to 'streaming_load_json_max_mb' (#4791)
The name and another config name are close to each other and are indistinguishable.
So this pr modify the name.
The document description has also been changed
2020-10-28 23:27:33 +08:00
6790254b97 [Bug] Fix bug and optimize implementation logic of tablets web page (#4775)
(1) The implementation logic of `tablets web page` is that:
Firstly, get all the tablets of the BE;
Secondly, return specific number tablets in front to web page according to the value of request parameter `limit`. 
This patch optimize the implementation logic of `tablets web page` that getting specific number tablets in BE 
according to the value of request parameter `limit` and then return them to web page.

(2) It will return default `1000` tablets through http interface `http://be_host:webserver_port/tablets_page`.
If the number of tablet in the BE is less than 1000, there will be `coredump` in the following code:

be/src/http/action/tablets_info_action.cpp
2020-10-28 23:25:43 +08:00
0213e93ea1 [Feature][Config] Support persistence of configuration items modified at runtime (#4704)
Support persistence of configuration items modified at runtime via HTTP API.

```
FE:
GET /api/_set_config?key=value&persist=true

BE
POST /api/update_config?key=value&persist=true
```

The modified config will be saved in `fe_custom.conf` or `be_custom.conf`.
And when process starts, it will load `fe.conf/be.conf` first, then `fe_custom.conf/be_custom.conf`.
2020-10-22 21:39:02 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
3438a746ac [Typo] Fix typo in metrics macros (#4739)
Just fix typo.
Rename DEFINE_GAUGE_METRIC_PROTOTYPE_5ARG(name, unit) to DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(name, unit)
Rename DEFINE_GAUGE_METRIC_PROTOTYPE_2ARG(name, unit) witch define core metrics to DEFINE_GAUGE_CORE_METRIC_PROTOTYPE_2ARG(name, unit)
2020-10-15 19:56:43 +08:00
75e0ba32a1 Fixes some be typo (#4714) 2020-10-13 09:37:15 +08:00
0475aa9b93 [Bug]Fix delete on clause may not work in routineLoad (#4683)
fix delete on may not work in some cases, this is describe in #4682
2020-09-30 09:56:19 +08:00
e26d5d0da0 [MemTracker] show all MemTrackers on BE's website (#4580)
We can show all MemTrackers on BE's website by calling MemTracker::ListTrackers().
2020-09-12 11:18:50 +08:00
64ebea2e43 [Feature] Support gzip compression for http response (#4533)
After tablet level metrics is supported, the http metrics API may response
a very large body when a BE holds a large number of tablets, and cause heavy
network traffic.
This patch introduce http content compression to reduce network traffic.
2020-09-06 20:30:12 +08:00
b780df697a [refactor] Optimize threads usage mode in BE (#4440)
BE can not graceful exit because some threads are running in endless
loop. This patch do the following optimization:
- Use the well encapsulated Thread and ThreadPool instead of std::thread
  and std::vector<std::thread>
- Use CountDownLatch in thread's loop condition to avoid endless loop
- Introduce a new class Daemon for daemon works, like tcmalloc_gc,
  memory_maintenance and calculate_metrics
- Decouple statistics type TaskWorkerPool and StorageEngine notification
  by submit tasks to TaskWorkerPool's queue
- Reorder objects' stop and deconstruct in main(), i.e. stop network
  services at first, then internal services
- Use libevent in pthreads mode, by calling evthread_use_pthreads(),
  then EvHttpServer can exit gracefully in multi-threads
- Call brpc::Server's Stop() and ClearServices() explicitly
2020-09-06 20:19:14 +08:00
068707484d Support sequence column for UNIQUE_KEYS Table (#4256)
* add sequence  col

Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-09-04 10:10:17 +08:00
498b06fbe2 [Metrics] Support tablet level metrics (#4428)
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. 
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
2020-09-02 10:39:41 +08:00
67b842ce04 [License] Organize and modify the license of the code (#4371)
1. Disable the MySQL client and LZO library by default when building the Doris.

    MySQL client library is used for MySQL external table feature.
    This feature will be replaced by the new ODBC external table soon.

    LZO library is used to compress/decompress data of some old data format of Doris,
    which is no longer used anymore.

2. Add missing license to some files.

3. For all non-Apache-License code, all are explained in NOTICE file and the corresponding license is declared.

4. Remove the js source code from webroot, it will be downloaded as thirdparty
2020-08-24 21:51:55 +08:00
d61c10b761 [Delete] Support batch delete [part 1] (#4310)
* Implements the grammar of the batch delete #4051 
* Process create, alter table when table has delete sign column
* Support the syntax for enabling the delete column
* Automatically filtered deleted data in the select statement.
* Automatically add delete sign when create  rollup table
TODO:
 * Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction
2020-08-21 22:57:16 +08:00
4c571cb6f5 Revert "[Metrics] Support tablet level metrics (#4327)" (#4397)
This reverts commit 56260a65c87830ffe34109195ee4d6f1d543e630.

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-08-19 22:37:52 +08:00