Commit Graph

61 Commits

Author SHA1 Message Date
22cb7b8fcb [improvement](compaction) be do not compact invisible version to avoid query error -230 #28082 (#36222)
cherry pick from #28082
2024-06-27 13:45:21 +08:00
0d2ab9d5c3 [fix](clean trash) Fix clean trash lost submit task (#35271) 2024-05-23 16:27:20 +08:00
a050513c91 [Fix](clean trash) Fix clean trash use agent task (#33912) (#33972)
* [Fix](clean trash) Fix clean trash use agent task (#33912)

* add .h
2024-04-22 17:14:21 +08:00
31b3be456c add workload scheduler in be (#29116) 2023-12-28 15:04:22 +08:00
1afdbfe723 [enhance](BE) Refactor TaskWorkerPool (#27555) 2023-12-04 21:46:10 +08:00
1ba8a9bae4 [feature-wip](executor)Fe send topic info to be (#25798) 2023-10-26 15:52:48 +08:00
9d41edd9eb [Feature](binlog) Add binlog gc && Auth master_token (#20854)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-06-16 11:25:11 +08:00
c45da40ed7 [refactor-WIP](TaskWorkerPool) add specific classes for ALTER_TABLE, CLONE, STORAGE_MEDIUM_MIGRATE task (#20140) 2023-05-28 19:27:08 +08:00
0434c6a738 [refactor-WIP](TaskWorkerPool) add specific classes for PUSH, PUBLIC_VERION, CLEAR_TRANSACTION tasks (#19822) 2023-05-27 22:47:45 +08:00
e242d7dfcc [refactor-WIP](TaskWorkerPool) add DropTableTaskPool for DROP_TABLE task (#19793) 2023-05-18 18:25:13 +08:00
6a5b590873 [refactor-WIP](TaskWorkerPool) add CreateTableTaskPool class for CREATE_TABLE task (#19734) 2023-05-18 11:43:09 +08:00
9e960f4c4f [chore](build) Use include-what-you-use to optimize includes (#18681)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-17 11:44:58 +08:00
5014ad03e7 [feature](cooldown) Auto delete unused remote files (#16588) 2023-02-13 23:59:39 +08:00
43eca4f209 [Feature-WIP](inverted index) Implementation for alter inverted index. (#16371)
implementation for add/drop inverted index.
2023-02-10 17:56:17 +08:00
00a598a839 [feature](cooldown) Decouple storage policy and resource (#15873) 2023-01-31 14:13:47 +08:00
d257059e6b [refactor](remove hadoop dpp) remove hadoop dpp code since it is not used (#16009) 2023-01-18 15:01:04 +08:00
58c520dbfd [Feature](remote) Cooldown cold data to object storage only one replica (#15832) 2023-01-14 23:58:00 +08:00
f3aea7f0f0 [Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744) 2022-12-11 23:33:18 +08:00
32fea672b0 [chore](gutil) remove some gutil macros and solve some macro conflict with brpc (#13954)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-07 13:39:52 +08:00
125def5102 [enhancement](macOS M1) Support building from source on macOS (M1) (#13195)
# Proposed changes

This PR fixed lots of issues when building from source on macOS with Apple M1 chip.

## ATTENTION

The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...

Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.

This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues.

## Use case

```shell
./build.sh -j 8 --be --clean

cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```

## Something else

It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the  development experience on macOS greatly when we finish the adaptation job.
2022-10-18 13:10:13 +08:00
db07e51cd3 [refactor](status) Refactor status handling in agent task (#11940)
Refactor TaggableLogger
Refactor status handling in agent task:
Unify log format in TaskWorkerPool
Pass Status to the top caller, and replace some OLAPInternalError with more detailed error message Status
Premature return with the opposite condition to reduce indention
2022-08-29 12:06:01 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet,
and there is no necessary to prohibit loading new data to cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
c9f86bc7e2 [refactor] Refactoring Status static methods to format message using fmt(#9533) 2022-07-02 18:58:23 +08:00
aab7dc956f [refactor](load) Remove mini load (#10520) 2022-06-30 23:21:41 +08:00
b8d2c96842 [refactor]Remove load_delete job (#10353) 2022-06-24 00:04:38 +08:00
f377c26bf7 [refactor][be] Optimize headers (#9708) 2022-05-30 16:12:10 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
e157c2c254 [feature-wip](remote-storage) step3: Support remote storage, only for be, add migration_task_v2 (#8806)
1. Add TStorageMigrationReqV2 and EngineStorageMigrationTask to support migration action
2. Change TabletManager::create_tablet() for remote storage
3. Change TabletManager::try_delete_unused_tablet_path() for remote storage
2022-04-22 22:38:10 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
2022-04-14 11:43:49 +08:00
aeee738af0 Revert "[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635)" (#8666)
This reverts commit 6bc982c37436acf288f566cf10e084731b80fa44.
2022-03-25 18:32:50 +08:00
6bc982c374 [Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635) 2022-03-25 15:17:39 +08:00
ed39ff1500 [feature](compaction) Support triggering compaction for a specific partition manually (#7521)
Add statement to trigger cumulative or base compaction for a specified partition.
2022-01-21 09:27:06 +08:00
58d0c8971e [Bugfix] Fix BE metrics http API dead lock bug (#5730) 2021-04-30 10:15:33 +08:00
d641a26490 [Refactor] Remove boost filesystem (#5579)
* use std::filesystem instead of boost
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2021-04-08 09:11:59 +08:00
a6e2c3e3f1 [Bug][Clone] Fix the bug that incremental clone is not triggered (#5230)
In version 0.13, we support a more efficient compaction logic. 
This logic will maintain multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.

But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`.
At present, the incremental rowset meta recorded in `incr_rs_meta` and the records
in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the
new multi-version path, resulting in many cases not triggering incremental clone.

This CL mainly modified:

1. Removed `incr_rs_meta` metadata
2. Modified the clone logic. When the clone is incremented, it will try to read the rowset in `stale_rs_meta`.
3. Delete a lot of code that was previously used for version compatibility.
2021-02-06 22:04:48 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
10e1e29711 Remove header file common/names.h (#4945) 2020-11-26 17:00:48 +08:00
75e0ba32a1 Fixes some be typo (#4714) 2020-10-13 09:37:15 +08:00
ea6d7c281d [Bug] Remove RECOVER_TABLET worker pool to make ASAN compile happy (#4392)
Co-authored-by: morningman <chenmingyu@baidu.com>
2020-08-19 17:39:46 +08:00
f189a2e7b8 [Spark load][Be 1/1] Be handle push task (#3742)
1、Add a PushBrokerReader in push_handle.cpp.
2、PushBrokerReader wraps the ParquetScanner to support reading data from parquet format file through broker.
2020-06-22 19:57:58 +08:00
e991b1300f [Code Refactor] Refactor AgentServer to make it less error-prone and more readable (#2831)
In `AgentServer`, each task type needs to be processed separately,
which leads to very long code, hard to read, and not easy to detect
errors (for example, some task type processing may be missed,
corresponding relationship may be error)

Fortunately, the code for each task_type is very similar, so this
is a good case to use `MACRO`, which can greatly reduce the repeated
code and solve above problems.

This patch also fix two small bugs:
1. The `_topic_subscriber` member has not been released in dtor
2. in `submit_tasks()`, the `status_code` is not reset before
   each task is processed, resulting in wrong judgment.

No functional changes in this patch.
2020-02-06 09:56:00 +08:00
9e54751098 [Snapshot] Modify the prefer snapshot version (#2748)
In this CL, prefer snapshot version in snapshot request is defined
in thrift. So that both FE and BE can use this version value.
2020-01-15 15:10:14 +08:00
5ab6739429 Add rowset convert (#2212) 2019-12-02 10:00:19 +08:00
0bcfddab92 Remove clear_alter_task (#2056)
Alter task has been refactored and clear_alter_task is not necessary.
2019-10-24 18:57:14 +08:00
58c882fa2a Remove SchemaChangeV1 (#2014) 2019-10-21 15:07:28 +08:00
9aa2045987 Refactor alter job (#1695) 2019-09-12 16:31:29 +08:00
6f4feca3dc Add rowset id generator to FE and BE (#1678) 2019-09-02 18:51:31 +08:00
c34b35e6c4 Add ALTER_TABLET task in be (#1497)
This a for the new implementation of alter table process.
2019-07-23 15:16:21 +08:00
755b12cd75 Add partition id to tablet meta in be (#1490)
FE uses partition_id to publish version. BE should check whether all tablets related with this partition have the version. But Tablet in BE does not have partition id in its metadata. So that BE could not check it.

This patch will add partition id to tablet meta during report task.
Sync at most 10k tablets during set tablet meta.
2019-07-17 14:07:55 +08:00
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.

1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
   Nameing rowset with tablet_id and version will lead to
   many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
   with different format.
2019-07-15 21:18:22 +08:00