Commit Graph

103 Commits

Author SHA1 Message Date
e26d5d0da0 [MemTracker] show all MemTrackers on BE's website (#4580)
We can show all MemTrackers on BE's website by calling MemTracker::ListTrackers().
2020-09-12 11:18:50 +08:00
64ebea2e43 [Feature] Support gzip compression for http response (#4533)
After tablet level metrics is supported, the http metrics API may response
a very large body when a BE holds a large number of tablets, and cause heavy
network traffic.
This patch introduce http content compression to reduce network traffic.
2020-09-06 20:30:12 +08:00
b780df697a [refactor] Optimize threads usage mode in BE (#4440)
BE can not graceful exit because some threads are running in endless
loop. This patch do the following optimization:
- Use the well encapsulated Thread and ThreadPool instead of std::thread
  and std::vector<std::thread>
- Use CountDownLatch in thread's loop condition to avoid endless loop
- Introduce a new class Daemon for daemon works, like tcmalloc_gc,
  memory_maintenance and calculate_metrics
- Decouple statistics type TaskWorkerPool and StorageEngine notification
  by submit tasks to TaskWorkerPool's queue
- Reorder objects' stop and deconstruct in main(), i.e. stop network
  services at first, then internal services
- Use libevent in pthreads mode, by calling evthread_use_pthreads(),
  then EvHttpServer can exit gracefully in multi-threads
- Call brpc::Server's Stop() and ClearServices() explicitly
2020-09-06 20:19:14 +08:00
068707484d Support sequence column for UNIQUE_KEYS Table (#4256)
* add sequence  col

Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-09-04 10:10:17 +08:00
498b06fbe2 [Metrics] Support tablet level metrics (#4428)
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. 
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
2020-09-02 10:39:41 +08:00
67b842ce04 [License] Organize and modify the license of the code (#4371)
1. Disable the MySQL client and LZO library by default when building the Doris.

    MySQL client library is used for MySQL external table feature.
    This feature will be replaced by the new ODBC external table soon.

    LZO library is used to compress/decompress data of some old data format of Doris,
    which is no longer used anymore.

2. Add missing license to some files.

3. For all non-Apache-License code, all are explained in NOTICE file and the corresponding license is declared.

4. Remove the js source code from webroot, it will be downloaded as thirdparty
2020-08-24 21:51:55 +08:00
d61c10b761 [Delete] Support batch delete [part 1] (#4310)
* Implements the grammar of the batch delete #4051 
* Process create, alter table when table has delete sign column
* Support the syntax for enabling the delete column
* Automatically filtered deleted data in the select statement.
* Automatically add delete sign when create  rollup table
TODO:
 * Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction
2020-08-21 22:57:16 +08:00
4c571cb6f5 Revert "[Metrics] Support tablet level metrics (#4327)" (#4397)
This reverts commit 56260a65c87830ffe34109195ee4d6f1d543e630.

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-08-19 22:37:52 +08:00
56260a65c8 [Metrics] Support tablet level metrics (#4327)
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `.
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
2020-08-18 16:56:12 +08:00
d6028863f3 [Compaction] Manually trigger compaction RESTapi interface (#4312)
Add restapi to be which do compaction task by manual trigger. The detail design in #4311 .
2020-08-13 23:41:46 +08:00
d655b271b8 [Feature][Web] Add new feature to list all tablets on a particular BE (#4268)
A new feature has been added to acquire tablet id and schema hash of all the tablets on a particular BE node
via Web page,so that more detailed information of each tablet can be obtained according to these
tablet id and schema hash. In accordance with different web request, there are two ways 
(table and json)to show these acquired tablet id and schema hash on Web page.
2020-08-12 20:55:19 +08:00
e71152132c [metrics] Redesign metrics to 3 layers (#4115)
Redesign metrics to 3 layers:
    MetricRegistry - MetricEntity - Metrics
    MetricRegistry : the register center
    MetricEntity : the entity registered on MetricRegistry. Generally a MetricRegistry can be registered on several 
        MetricEntities, each of MetricEntity is an independent entity, such as server, disk_devices, data_directories, thrift 
        clients and servers, and so on. 
    Metric : metrics of an entity. Such as fragment_requests_total on server entity, disk_bytes_read on a disk_device entity, 
        thrift_opened_clients on a thrift_client entity.
    MetricPrototype: the type of a metric. MetricPrototype is a global variable, can be shared by the same metrics across 
        different MetricEntities.
2020-08-08 11:23:01 +08:00
3f31866169 [Bug][Load][Json] #4124 Load json format with stream load failed (#4217)
Stream load should read all the data completely before parsing the json.
And also add a new BE config streaming_load_max_batch_read_mb
to limit the data size when loading json data.

Fix the bug of loading empty json array []

Add doc to explain some certain case of loading json format data.

Fix: #4124
2020-08-04 12:55:53 +08:00
10f822eb43 [MemTracker] make all MemTrackers shared (#4135)
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec 
     before MemTracker's destructor. So we don't change these code.
4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared 
    too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile.
Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.
2020-07-31 21:57:21 +08:00
fdcc223ad2 [Bug][Json] Refactor the json load logic to fix some bug
1. Add `json_root` for nest json data.
2. Remove `_jmap` to make the logic reasonable.
2020-07-30 10:36:34 +08:00
443b8f100b [Feature][ThreadPool]Add Web Page to display thread's stats (#4110)
This CL mainly includes:
- add some methods to get thread's stats from Linux's system file in
env.
- support get thread's stats by http method.
- register page handle in BE to show thread's stats to help developer
position some thread relate problem.
2020-07-23 21:08:36 +08:00
d07a23ece3 [webserver] Introduce mustache to simplify BE's website render (#4062)
cpp-mustache is a C++ implementation of a Mustache template engine
with support for RapidJSON, and in order to simplify RapidJSON object
building, we introduce class EasyJson from Apache Kudu.
2020-07-16 22:39:51 +08:00
2e460f581c [Bug] Support get all rowset meta info in memory from tablet meta url (#4061)
This PR is to fix bug that we cannot get the newest tablet meta info from tablet meta url.
2020-07-13 20:53:51 +08:00
42cb11901b [webserver] Make BE webserver more pretty (#4050)
Add some CSS and js files, and use boost-table framework to make BE's website more pretty
2020-07-09 21:50:52 +08:00
ab8851f7aa [webserver] Make BE webserver handle static files (#4021)
Make BE webserver handle static files, e.g. css, js, ico, then we can make
BE website more pretty.
2020-07-07 23:08:29 +08:00
48d947edf4 Support rpc_timeout property in stream load request to cancel request in fe in time when stream load request is timeout (#3948)
This PR is to enable cancel stream load request in FE in time
when stream load request is timeout to make stream load more robust.
2020-06-29 19:16:16 +08:00
93a0b47d22 Revert "[Memory Engine] MemTablet creation and compatibility handling in BE (#3762)" (#3931)
This reverts commit ca96ea30560c9e9837c28cfd2cdd8ed24196f787.
2020-06-24 10:13:45 +08:00
ca96ea3056 [Memory Engine] MemTablet creation and compatibility handling in BE (#3762) 2020-06-18 09:56:07 +08:00
e9f7576b9d [Enhancement] make metrics api more clear (#3891) 2020-06-17 12:17:54 +08:00
6c4d7c60dd [Feature] Add QueryDetail to store query statistics. (#3744)
1. Store the query statistics in memory.
2. Supporting RESTFUL interface to get the statistics.
2020-06-15 18:16:54 +08:00
01c1de1870 [Load] Add more metric to trace the time cost in stream load and make brpc_num_threads configurable (#3703) 2020-06-04 13:37:28 +08:00
ed886a485d [HttpServer] capture convert exception (#3736)
If parameter str is an empty string, it will throw exception too. Maybe we can add an ut for parsing parameters in http server.
2020-06-03 19:54:41 +08:00
1cc78fe69b [Enhancement] Convert metric to Json format (#3635)
Add a JSON format for existing metrics like this.
```
{
    "tags":
    {
        "metric":"thread_pool",
        "name":"thrift-server-pool",
        "type":"active_thread_num"
    },
    "unit":"number",
    "value":3
}
```
I add a new JsonMetricVisitor to handle the transformation.
It's not to modify existing PrometheusMetricVisitor and SimpleCoreMetricVisitor.
Also I add
1.  A unit item to indicate the metric better 
2. Cloning tablet statistics divided by database.
3. Use white space to replace newline in audit.log
2020-05-27 08:49:30 +08:00
ef8fd1fcbe [Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553)
Doris support load json-data by RoutineLoad or StreamLoad
2020-05-21 13:00:49 +08:00
273aad6cf4 [Bug] Restore tablet action not working because tablet status is shutdown (#3551) 2020-05-15 10:11:17 +08:00
b576e54fe6 [ASAN] Fix some address problems detected by ASAN (#3495)
LSAN detected errors have been fixed by a prior pathch (#3326), but
there are still some ASAN detected errors.
This patch try to fix these errors to make Doris BE more robustness.
And then we can add CI run in LSAN/ASAN mode to detect memory errors
as early as possible.
2020-05-11 10:30:45 +08:00
b58b1b3953 [metrics] Make DorisMetrics to be a real singleton (#3417) 2020-05-04 09:20:53 +08:00
37fccd53c4 [Tablet] A small refactor on class Tablet (#3339)
There is no functional changes in this patch.
Key refactor points are:
- Remove meaningless return value of functions in class Tablet, and
  also some related functions in other classes
- Allow RowsetGraph::capture_consistent_versions to pass a nullptr
  to the output parameter
- Use CHECK instead of LOG(FATAL) to simplify code
2020-04-24 22:22:26 +08:00
4a7a88ede1 [LSAN] Fix some memory leak detected by LSAN (#3326) 2020-04-22 22:59:44 +08:00
f39c8b156d [refactor] A small refactor on class DataDir (#3276)
main refactor points are:
- Use a single get_absolute_tablet_path function instead of 3
  independent functions
- Remove meaningless return value of register_tablet and deregister_tablet
- Some typo and format
2020-04-10 00:32:22 +08:00
8fc284d593 [config] Support to modify configs when BE is running without restarting (#3264)
In the past, when we want to modify some BE configs, we have to modify be.conf and then restart BE.
This patch provides a way to modify configs in the type of 'threshold', 'interval', 'enable flag'
when BE is running without restarting it.
You can update a single config once by BE's http API: `be_host:be_http_port/api/update_config?config_name=new_value`
2020-04-08 11:17:47 +08:00
e4682398bd [web] Dump configs on BE's website '/varz' (#3220)
Dump configs on BE's website '/varz'
Change NAVIGATION_BAR_PREFIX from 'Impala' to 'Doris'
Format the related files by clang-format
2020-03-28 16:26:38 +08:00
0f14408f13 [Temp Partition] Support loading data into temp partitions (#3120)
Related issue: #2663, #2828.

This CL support loading data into specified temporary partitions.

```
INSERT INTO tbl TEMPORARY PARTITIONS(tp1, tp2, ..) ....;

curl .... -H "temporary_partition: tp1, tp, .. "  ....

LOAD LABEL db1.label1 (
DATA INFILE("xxxx") 
INTO TABLE `tbl2`
TEMPORARY PARTITION(tp1, tp2, ...)
...
```

NOTICE: this CL change the FE meta version to 77.

There 3 major changes in this CL

## Syntax reorganization

Reorganized the syntax related to the `specify-partitions`. Removed some redundant syntax
 definitions, and unified the syntax related to the `specify-partitions` under one syntax entry.

## Meta refactor

In order to be able to support specifying temporary partitions, 
I made some changes to the way the partition information in the table is stored.

Partition information is now organized as follows:

The following two maps are reserved in OlapTable for storing formal partitions:

    ```
    idToPartition
    nameToPartition
    ```

Use the `TempPartitions` class for storing temporary partitions.

All the partition attributes of the formal partition and the temporary partition,
such as the range, the number of replicas, and the storage medium, are all stored
in the `partitionInfo` of the OlapTable.

In `partitionInfo`, we use two maps to store the range of formal partition
and temporary partition:

    ```
    idToRange
    idToTempRange
    ```

Use separate map is because the partition ranges of the formal partition and
the temporary partition may overlap. Separate map can more easily check the partition range.

All partition attributes except the partition range are stored using the same map,
and the partition id is used as the map key.

## Method to get partition

A table may contain both formal and temporary partitions.
There are several methods to get the partition of a table.
Typically divided into two categories:

1. Get partition by id
2. Get partition by name

According to different requirements, the caller may want to obtain
a formal partition or a temporary partition. These methods are
described below in order to obtain the partition by using the correct method.

1. Get by name

This type of request usually comes from a user with partition names. Such as
`select * from tbl partition(p1);`.
This type of request has clear information to indicate whether to obtain a
formal or temporary partition.
Therefore, we need to get the partition through this method:

`getPartition(String partitionName, boolean isTemp)`

To avoid modifying too much code, we leave the `getPartition(String
partitionName)`, which is same as:

`getPartition(partitionName, false)`

2. Get by id

This type of request usually means that the previous step has obtained
certain partition ids in some way,
so we only need to get the corresponding partition through this method:

`getPartition(long partitionId)`.

This method will try to get both formal partitions and temporary partitions.

3. Get all partition instances

Depending on the requirements, the caller may want to obtain all formal
partitions,
all temporary partitions, or all partitions. Therefore we provide 3 methods,
the caller chooses according to needs.

`getPartitions()`
`getTempPartitions()`
`getAllPartitions()`
2020-03-19 15:07:01 +08:00
3e6dfa31c4 [UnitTest] Fix BE unit test randomly failed (#2970)
* fix http server related unit test failed due to http port has been used
* fix unit test failed in DEBUG build type
2020-02-21 22:21:02 +08:00
ed299d5d8b Create pprof_profile_dir before heap profiling (#2944) 2020-02-20 10:41:04 +08:00
3c539aac54 [Refactor] Some tiny refactor on streaming-load related code (#2891)
Mainly contains the following modifications:
1. Use `std::unique_ptr` to replace some naked pointers
2. Modify some methods from member-method to local-static-function
3. Modify some methods do not need to be public to private
4. Some formatting changes: such as wrapping lines that are too long
5. Remove some useless variables
6. Add or modify some comments for easier understanding

No functional changes in this patch.
2020-02-13 10:42:52 +08:00
4e151b1551 Remove boost exception when parse store path (#2861) 2020-02-10 17:50:52 +08:00
e1ba0efbc7 Optimize compaction strategy of tablet on BE (#2473)
The current compaction selection strategy and cumulative point update logic
will cause the cumulative compaction to not work, and all compaction tasks
will be completed only by the base compaction. This can cause a large number
of data versions to pile up.

In the current cumulative point update logic, when a cumulative cannot select
enough number of rowsets, it will directly increase the cumulative point.
Therefore, when the data version generates the same speed as the cumulative
compaction polling, it will cause the cumulative point to continuously increase
without triggering the cumulative compaction.

The new strategy mainly modifies the update logic of cumulative point to ensure
that the above problems do not occur. At the same time, the new strategy also
takes into account the problem that compaction cannot be performed if cumulative
points stagnate for a long time. Cumulative points will be forced to increase
through threshold settings to ensure that compaction has a chance to execute.

Also add a new HTTP API to view the compaction status of specified tablet.
See `compaction-action.md` for details.
2019-12-17 10:30:43 +08:00
83b5455be5 [Load] Fix several races in stream load that could cause BE crash (#2414)
This CL fixes the following problems
1. check whether TabletsChannel has been closed/cancelled in `reduce_mem_usage` to avoid using a closed DeltaWriter
2. make `FlushHandle.wait` wait for all submitted tasks to finish so that memtable is deallocated before its delta writer
3. make `~MemTracker()` release its consumption bytes to accommodate situations in aggregate_func.h that bitmap and hll call `MemTracker::consume` without corresponding `MemTracker::release`, which cause the consumption of root tracker never drops to zero
2019-12-10 21:59:05 +08:00
42a4fff562 Replace boost canonicalize (#2209) 2019-11-19 17:57:37 +08:00
59e9027f76 Fix bug that timeout is not taken effect in streamload (#2217) 2019-11-16 22:29:55 +08:00
c3b5046940 Fix bug of invalid stream load task rollback (#1999)
If stream load be committed with result PUBLISH_TIMEOUT, it should not rollback
this transaction, but only return this message to user.
2019-10-17 21:08:29 +08:00
62acf5d098 Limit the memory usage of Loading process (#1954) 2019-10-15 09:26:20 +08:00
f130bd3e7b Use Env function to operate directory (#1980)
Now Env has unify all environment operation, such as file operation.
However some of our old functions don't leverage it. This change unify
FileUtils::scan_dir to use Env's function.
2019-10-15 09:25:12 +08:00
2f0808137a Refactor FrontendHelper (#1888) 2019-09-27 13:21:14 +08:00