The problem I want to solve is described in #6355.
This CL mainly changes:
1. Support compacting tablets under alter operations
On BE side, the compaction logic will select tablets whose state is "TABLET_NOTREADY" for cumulative compaction.
2. Remove the "alter_task" field in the tablet's meta on BE side.
The "alter_task" field has not been used for a long time.
3. Support delete operations while a table is under an alter operation.
Previously, when a table was under an alter operation, executing a delete returned the error: Table's state is not NORMAL.
Now, a delete executes successfully as long as the condition column is not under schema change.
The delete condition is applied to all materialized indexes.
This is part of the array type support and has not been fully completed.
The following functions are implemented:
1. FE array type support and implementation of array functions, including array syntax analysis and planning
2. Support importing array type data through insert into
3. Support selecting array type data
4. The array type is supported only on the value columns of the duplicate table
This PR merges some code from #4655, #4650, #4644, #4643, #4623 and #2979.
* [Enhance] Make MemTracker more accurate (#5515)
This PR mainly:
1. Improve the readability of MemTrackers' name
2. Add the MemTracker of:
* Load
* Compaction
* SchemaChange
* StoragePageCache
* TabletManager
3. Change SchemaChange to a singleton
* revise some code for Code Review
* change the name of mem_tracker
* keep reader_context alive for the same lifetime as rowset_reader in schema change.
* change vlog notice to log(warning) in schema change
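The tracker changes listed above can be illustrated with a minimal sketch. It assumes a tree of trackers in which each tracker carries a readable label and forwards its consumption to its parent, plus a singleton that owns the "SchemaChange" tracker. `SimpleMemTracker` and `SchemaChangeHandler` are hypothetical names used only for illustration; they are not the classes touched by this PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical tracker: a readable label plus an optional parent.
class SimpleMemTracker {
public:
    explicit SimpleMemTracker(std::string label,
                              std::shared_ptr<SimpleMemTracker> parent = nullptr)
            : _label(std::move(label)), _parent(std::move(parent)) {}

    // Adds `bytes` to this tracker and to every ancestor.
    void consume(int64_t bytes) {
        for (SimpleMemTracker* t = this; t != nullptr; t = t->_parent.get()) {
            t->_consumption.fetch_add(bytes);
        }
    }

    // Subtracts `bytes` from this tracker and from every ancestor.
    void release(int64_t bytes) {
        assert(bytes >= 0);
        consume(-bytes);
    }

    int64_t consumption() const { return _consumption.load(); }
    const std::string& label() const { return _label; }

private:
    std::string _label;
    std::shared_ptr<SimpleMemTracker> _parent;
    std::atomic<int64_t> _consumption{0};
};

// Hypothetical singleton that owns the "SchemaChange" tracker.
class SchemaChangeHandler {
public:
    static SchemaChangeHandler& instance() {
        static SchemaChangeHandler handler;  // initialized once (C++11 and later)
        return handler;
    }

    SimpleMemTracker& mem_tracker() { return _tracker; }

    SchemaChangeHandler(const SchemaChangeHandler&) = delete;
    SchemaChangeHandler& operator=(const SchemaChangeHandler&) = delete;

private:
    SchemaChangeHandler() : _tracker("SchemaChange") {}
    SimpleMemTracker _tracker;
};
```

With a layout like this, a "Compaction" tracker created under a process-level tracker reports its own usage and is also counted in the parent's total.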
1. If cumulative compaction compacts only one rowset, the old rowset will not be put into `stale_rowset_meta_map`
2. Show rowset id in `/api/compaction/show`
Co-authored-by: xxiao2018 <benghua3_1@sina.com>
In version 0.13, we support a more efficient compaction logic.
This logic will maintain multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.
But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`.
At present, the incremental rowset meta recorded in `incr_rs_meta` and the records
in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the
new multi-version path, resulting in many cases not triggering incremental clone.
This CL mainly modifies:
1. Remove the `incr_rs_meta` metadata
2. Modify the clone logic. During an incremental clone, it will try to read the rowsets in `stale_rs_meta`.
3. Delete a lot of code that was previously used for version compatibility.
This CL refactors the storage medium migration task process in BE.
I did not modify the execution logic. Just extract part of the logic
in the migration task and put it in task_work_pool.
In this way, the migration task is only used to process the migration
from the specified tablet to the specified data dir.
Later, we can use this task to migrate tablets between different disks. #4476
Persist stale rowsets meta. When BE reboots, the stale rowsets meta
is restored, so stale versions remain readable until the stale GC time.
ISSUE: #4453
Since Segment V2 has been released for a long time, we should make it the default storage format for newly created tables.
This CL mainly changes:
1. For all newly created tables, their default storage format is Segment V2.
2. For all existing tablets, their storage format remains unchanged.
3. Fix bugs described in #4384 and #4385
* Implements the grammar of the batch delete #4051
* Process create, alter table when table has delete sign column
* Support the syntax for enabling the delete column
* Automatically filter deleted data in the select statement.
* Automatically add the delete sign when creating a rollup table
TODO:
* Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction
This PR adds InPredicate support to the delete statement,
and adds the max_allowed_in_element_num_of_delete variable to
limit the number of elements in an InPredicate in a delete statement.
Queries like DELETE FROM tbl WHERE decimal_key <= "123.456";
may fail randomly when the type of decimal_key is DECIMALV2,
because the precision and scale are not initialized.
Related issue #4017, main changes as follows:
1. Add expired_snapshot_rs_version_map and _expired_snapshot_rs_metas
2. Add VersionedRowsetTracker to record the compacted path versions
3. Record path version when rowsets compact
4. In gc process, add expired snapshot rowsets to unused set to remove.
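As a rough illustration of items 2 to 4, the sketch below records the versions replaced by one compaction as a path and, during GC, moves the versions of expired paths into an unused set. `StalePathTracker`, `VersionRange` and the second-based timestamps are assumptions made for this example; they do not describe the actual VersionedRowsetTracker interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical closed version range [start, end].
using VersionRange = std::pair<int64_t, int64_t>;

// Hypothetical tracker for the paths left behind by compaction.
class StalePathTracker {
public:
    // Records the versions replaced by one compaction as a single path
    // and returns the id assigned to that path.
    int64_t add_stale_path(std::vector<VersionRange> versions, int64_t created_at_sec) {
        assert(!versions.empty());
        int64_t id = _next_path_id++;
        _paths[id] = Path{std::move(versions), created_at_sec};
        return id;
    }

    // Removes every path at least `expire_sec` old and returns its versions,
    // which the caller can then add to the unused set for deletion.
    std::vector<VersionRange> collect_expired(int64_t now_sec, int64_t expire_sec) {
        std::vector<VersionRange> unused;
        for (auto it = _paths.begin(); it != _paths.end();) {
            if (now_sec - it->second.created_at_sec >= expire_sec) {
                unused.insert(unused.end(), it->second.versions.begin(),
                              it->second.versions.end());
                it = _paths.erase(it);
            } else {
                ++it;
            }
        }
        return unused;
    }

    std::size_t path_count() const { return _paths.size(); }

private:
    struct Path {
        std::vector<VersionRange> versions;
        int64_t created_at_sec;
    };

    std::map<int64_t, Path> _paths;
    int64_t _next_path_id = 1;
};
```

Grouping versions by path means a whole compaction input is expired together, so a reader never sees only part of an old path.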
TabletMeta's _preferred_rowset_type is not initialized when the object is constructed and
may hold a random value. This field is not updated when an ALPHA_ROWSET tablet is created,
and in that case it is not serialized into pb. So when an ALPHA_ROWSET tablet is cloned
from another BE, the newly created local tablet's _preferred_rowset_type field
may randomly be BETA_ROWSET and cannot be overwritten after the clone. New input
rows are then written in BETA_ROWSET format, which is not what we expect.
This patch fixes the bug by giving _preferred_rowset_type a default value and updating
this field when creating any type of tablet. It also adds a unit test and the related
overloaded equality operator functions.
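The shape of the fix can be sketched as follows, assuming a trimmed-down stand-in for TabletMeta. The class layout, the choice of ALPHA_ROWSET as the default, and the `tablet_id` member are assumptions for illustration only; only the enum value names and `_preferred_rowset_type` come from the description above.

```cpp
#include <cassert>
#include <cstdint>

enum RowsetType { ALPHA_ROWSET = 0, BETA_ROWSET = 1 };

// Hypothetical, trimmed-down stand-in for TabletMeta.
class TabletMeta {
public:
    TabletMeta() = default;
    TabletMeta(int64_t tablet_id, RowsetType type)
            : _tablet_id(tablet_id), _preferred_rowset_type(type) {}

    RowsetType preferred_rowset_type() const { return _preferred_rowset_type; }
    void set_preferred_rowset_type(RowsetType type) { _preferred_rowset_type = type; }

    // Equality operators make it easy for a unit test to compare two metas.
    friend bool operator==(const TabletMeta& a, const TabletMeta& b) {
        return a._tablet_id == b._tablet_id &&
               a._preferred_rowset_type == b._preferred_rowset_type;
    }
    friend bool operator!=(const TabletMeta& a, const TabletMeta& b) { return !(a == b); }

private:
    int64_t _tablet_id = 0;
    // Default member initializer: the field never holds an indeterminate value.
    // Using ALPHA_ROWSET as the default is an assumption of this sketch.
    RowsetType _preferred_rowset_type = ALPHA_ROWSET;
};
```

Because the member has a default initializer, a meta built from a pb that lacks the field still ends up with a well-defined rowset type.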
This CL tries to fix a potential bug described in ISSUE: #3097. But I'm not sure this is the root cause.
It also removes lots of verbose logs and fixes a memory leak.
1. Add some comments to make the code easier to understand;
2. Make the metric `create_tablet_requests_failed` accurate;
3. Some internal methods use naked pointers directly instead of `shared_ptr`;
4. The `using` in `.h` files are contagious when included by other files,
so we should only use it in `.cpp` files;
5. Some formatting changes: such as wrapping lines that are too long
6. Parameters that are modified by a function are passed as pointers instead of references
No functional changes in this patch.
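Items 4 and 6 can be shown with a small sketch. The function below is hypothetical and exists only to illustrate the conventions: names are fully qualified as they would be in a `.h` file (no `using`), the input is a const reference, and the parameter that gets modified is a pointer, so the call site makes the mutation visible with `&`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper written as it would appear in a header:
// no `using` declarations, input by const reference, output by pointer.
inline void collect_failed_indexes(const std::vector<int>& statuses,
                                   std::vector<std::size_t>* failed_indexes) {
    assert(failed_indexes != nullptr);
    failed_indexes->clear();
    for (std::size_t i = 0; i < statuses.size(); ++i) {
        if (statuses[i] != 0) {  // assumption: 0 means success
            failed_indexes->push_back(i);
        }
    }
}
```

A call then reads `collect_failed_indexes(statuses, &failed);`, which shows at a glance that `failed` is written to.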
* Improve comparison and printing of Version
There are two members in `Version`: `first` and `second`.
There are many places where we need to print one `Version` object and
compare two `Version` objects, but in the current code, these two members
are accessed directly, which makes the code very tedious.
This patch mainly:
1. Adds an overload of `operator<<()` for `Version`, so
we can directly print a Version object;
2. Adds the `contains()` method to determine whether one `Version` contains
another;
3. Uses `operator==()` to determine if two `Version` objects are equal.
Because too many places need to be modified, some direct member
accesses remain; they will be modified later.
This patch also removes some unnecessary header file references.
No functional changes in this patch.
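A minimal sketch of what these helpers could look like is given below. The struct is a stand-in for the real `Version` type; the containment method is named `contains` here, and the printed format `[first-second]` and the `first <= second` invariant are assumptions of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the `Version` type described above.
struct Version {
    int64_t first;
    int64_t second;

    Version(int64_t f, int64_t s) : first(f), second(s) {
        assert(f <= s);  // assumed invariant: a version is a closed range
    }

    // True when `other` lies entirely inside this version range.
    bool contains(const Version& other) const {
        return first <= other.first && second >= other.second;
    }

    bool operator==(const Version& rhs) const {
        return first == rhs.first && second == rhs.second;
    }
    bool operator!=(const Version& rhs) const { return !(*this == rhs); }
};

// Prints a version as "[first-second]"; the exact format is an assumption.
inline std::ostream& operator<<(std::ostream& os, const Version& v) {
    return os << "[" << v.first << "-" << v.second << "]";
}

// Convenience helper for logging and tests.
inline std::string to_string(const Version& v) {
    std::ostringstream ss;
    ss << v;
    return ss.str();
}
```

Call sites can then write `LOG(INFO) << version` or `a.contains(b)` instead of spelling out `first` and `second` each time.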
This solves issue #2246.
The scheme is as follows:
Add an optional preferred_rowset_type in TabletMeta for V2 format rollup index tablets.
Add a boolean session variable use_v2_rollup. If set to true, the query will use the V2 storage format rollup index to process the query.
Test queries will be sent to the online service to verify the correctness of segment-v2: the same queries are sent to FE with and without use_v2_rollup set, to check whether the returned results are the same.
Remove the default constructor for UniqueID
Add a gen_uid method in UniqueId. If need to generate a new uid, users should call this api explicitly.
Reuse the boost random generator instead of creating a new one every time.
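The three points above can be sketched as follows. This example swaps in the standard library's `std::mt19937_64` for the boost generator mentioned in the description, and the two-field layout of the id is an assumption; it is a sketch of the idea, not the patched class.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <type_traits>

// Hypothetical sketch of a UniqueId without a default constructor.
struct UniqueId {
    int64_t hi;
    int64_t lo;

    UniqueId() = delete;  // callers must say explicitly where the id comes from
    UniqueId(int64_t h, int64_t l) : hi(h), lo(l) {}

    // Explicit factory for a new random id.
    static UniqueId gen_uid() {
        // One generator per thread, seeded once and then reused on every call.
        thread_local std::mt19937_64 rng{std::random_device{}()};
        uint64_t h = rng();
        uint64_t l = rng();
        return UniqueId(static_cast<int64_t>(h), static_cast<int64_t>(l));
    }

    bool operator==(const UniqueId& rhs) const { return hi == rhs.hi && lo == rhs.lo; }
    bool operator!=(const UniqueId& rhs) const { return !(*this == rhs); }
};

static_assert(!std::is_default_constructible<UniqueId>::value,
              "UniqueId must be created explicitly");
```

Deleting the default constructor turns an accidental `UniqueId id;` into a compile error, so a new id is only ever produced by an explicit `UniqueId::gen_uid()` call.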
NOTE: This patch modifies all Backend data,
which makes restarting BE take a very long time.
So to avoid disrupting your production environment,
you should upgrade backends one by one.
1. Refactor BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
Naming rowsets with tablet_id and version leads to
many conflicts among compaction, clone and restore.
3. Extract a rowset interface to encapsulate rowsets
with different formats.
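Items 2 and 3 can be pictured with the sketch below: an abstract rowset that is identified by its own id rather than by tablet_id and version, with one subclass per on-disk format. The class names, the string-typed id and the `format_name()` method are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical rowset interface; callers depend only on this type.
class Rowset {
public:
    explicit Rowset(std::string rowset_id) : _rowset_id(std::move(rowset_id)) {
        assert(!_rowset_id.empty());  // every rowset carries its own unique id
    }
    virtual ~Rowset() = default;

    const std::string& rowset_id() const { return _rowset_id; }

    // Each format supplies its own implementation.
    virtual std::string format_name() const = 0;

private:
    std::string _rowset_id;
};

// Hypothetical implementation for one storage format.
class AlphaRowset : public Rowset {
public:
    using Rowset::Rowset;
    std::string format_name() const override { return "alpha"; }
};

// Hypothetical implementation for another storage format.
class BetaRowset : public Rowset {
public:
    using Rowset::Rowset;
    std::string format_name() const override { return "beta"; }
};
```

Since the id no longer depends on tablet_id and version, two rowsets that cover the same version range (for example one from compaction and one from clone) can coexist without a naming conflict.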