This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet,
and there is no necessary to prohibit loading new data to cooled tablets.
Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.
The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.
To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
1. Add TStorageMigrationReqV2 and EngineStorageMigrationTask to support migration action
2. Change TabletManager::create_tablet() for remote storage
3. Change TabletManager::try_delete_unused_tablet_path() for remote storage
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
In version 0.13, we support a more efficient compaction logic.
This logic will maintain multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.
But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`.
At present, the incremental rowset meta recorded in `incr_rs_meta` and the records
in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the
new multi-version path, resulting in many cases not triggering incremental clone.
This CL mainly modified:
1. Removed `incr_rs_meta` metadata
2. Modified the clone logic. When the clone is incremented, it will try to read the rowset in `stale_rs_meta`.
3. Delete a lot of code that was previously used for version compatibility.
1、Add a PushBrokerReader in push_handle.cpp.
2、PushBrokerReader wraps the ParquetScanner to support reading data from parquet format file through broker.
In `AgentServer`, each task type needs to be processed separately,
which leads to very long code, hard to read, and not easy to detect
errors (for example, some task type processing may be missed,
corresponding relationship may be error)
Fortunately, the code for each task_type is very similar, so this
is a good case to use `MACRO`, which can greatly reduce the repeated
code and solve above problems.
This patch also fix two small bugs:
1. The `_topic_subscriber` member has not been released in dtor
2. in `submit_tasks()`, the `status_code` is not reset before
each task is processed, resulting in wrong judgment.
No functional changes in this patch.
FE uses partition_id to publish version. BE should check whether all tablets related with this partition have the version. But Tablet in BE does not have partition id in its metadata. So that BE could not check it.
This patch will add partition id to tablet meta during report task.
Sync at most 10k tablets during set tablet meta.
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.
1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
Nameing rowset with tablet_id and version will lead to
many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
with different format.
Added: Support getting column size and precision info of table or view using JDBC.
Updated: Change the promethues type name GAUGE to lowercase, to fit the latest promethues version.
Updated: Backend ip saved in FE will be compared with BE's local ip when doing heartbeat, to avoid false positive heartbeat response.
Updated: Using version_num of tablet instead of calculating nice value to select cumulative compaction candicates.
Fixed: Predicates should not be pushed down to subquery which contains limit clause.
Fixed: Fix the formula of calculating BE load score.
Fixed: Fix a bug that in some edge cases, non-master Fontend may wait for a unnecessary long timeout after forwarding cmd to Master FE.
Fixed: A bug that granting privs on more than one table does not work.
Fixed: Support 'Insert into' table which contains HLL columns.
Fixed: ExportStmt' toSql() method may throw NullPointer Exception if table does not exist.
Fixed: Remove unnecessary 'get capacity' operation to avoid IO impact.
Internal commit id: merge to c16bd603a53dfe2089ff95704c698a738c317792
1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication.
2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS.
3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user.
4. A lot of bugs fixed.
5. Performance improvement.
1. ix bugs in query layer.
2. remove some redundant code in BE
3. support specify multi helper node when starting FE
4. add proc 'cluster_load_statistic' to show load balance situation of Palo