Commit Graph

160 Commits

Author SHA1 Message Date
3dde97bff1 (compaction) opt compaction task producer and quick compaction (#13495) (#14535)
1.remove quick_compaction's rowset pick policy, call cu compaction when trigger
quick compaction
2. skip tablet's compaction task when compaction score is too small

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-12-02 10:07:44 +08:00
94a6ffb906 [feature](compaction) support vertical_compaction & ordered_data_compaction (#14524) 2022-12-01 22:15:41 +08:00
1f9fb4dc8b [Bugfix] Fix upgrade from 1.1 coredump (#14163)
When upgrade from 1.1 to master, and then rollback to 1.1, and upgrade to master again, BE will coredump because some rowsets has schema and some rowsets has no schema. In the first time upgrade from 1.1, BE will flush schema in all rowsets and after rollback to 1.1, BE do compaction, and create some new rowset without schema. And the second time upgrade from 1.1, BE coredump because some conditions depend on having all or none of the rowsets.
2022-11-11 10:29:34 +08:00
942611c185 Revert "[enhancement](compaction) opt compaction task producer and quick compaction (#13495)" (#13833)
This reverts commit 4f2ea0776ca3fe5315ab5ef7e00eefabfb5771a0.
2022-11-01 14:22:12 +08:00
4f2ea0776c [enhancement](compaction) opt compaction task producer and quick compaction (#13495)
1.remove quick_compaction's rowset pick policy, call cu compaction when trigger
quick compaction
2. skip tablet's compaction task when compaction score is too small

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-31 12:24:05 +08:00
eab8876abc [Feature](remote) Using heavy schema change if the table is not enable light weight schema change (#13487) 2022-10-28 15:48:22 +08:00
87864e40bf [doc](random_sink) Add some doc content about random sink (#13577)
1. Add some doc content about random sink
2. Fix bug of showing missing rowsets info
2022-10-23 22:51:56 +08:00
6d322f85ac [improvement](compaction) delete num based compaction policy (#13409)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-18 16:13:28 +08:00
125def5102 [enhancement](macOS M1) Support building from source on macOS (M1) (#13195)
# Proposed changes

This PR fixed lots of issues when building from source on macOS with Apple M1 chip.

## ATTENTION

The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...

Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.

This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues.

## Use case

```shell
./build.sh -j 8 --be --clean

cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```

## Something else

It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the  development experience on macOS greatly when we finish the adaptation job.
2022-10-18 13:10:13 +08:00
9e42804298 [feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change (#12886) 2022-10-09 11:31:53 +08:00
Pxl
8731eea26e [Chore](clang) fix some build fail on clang15 (#12882)
remove unused variables
2022-09-26 23:13:28 +08:00
a5643822de [feature-wip](unique-key-merge-on-write) fix calculate delete bitmap when has sequence column (#12789)
when the rowset has multiple segments with sequence column, we should compare sequence id with previous segment.
2022-09-21 09:21:07 +08:00
41cf94498d [feature-wip](unique-key-merge-on-write) fix that incremental clone may lead to loss of delete bitmap (#12721) 2022-09-20 09:08:06 +08:00
e01986b8b9 [feature](light-schema-change) fix light-schema-change and add more cases (#12160)
Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load.
Tablet schema cache support recycle when schema sptr use count equals 1.
Add a http interface for flink-connector to sync ddl.
Improve tablet->tablet_schema() by max_version_schema.
2022-09-17 11:29:36 +08:00
554ba40b13 [feature-wip](unique-key-merge-on-write) update delete bitmap when increamental clone (#12364) 2022-09-09 17:03:27 +08:00
018b4b7e1e [bugfix](report) fix continuous version miss check (#12415)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-09-08 08:39:22 +08:00
2f192019d3 [bugfix](delete hanlder) delete predicate is merged and could not find schema cause core dump (#12161)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-30 09:18:21 +08:00
ccff3f5711 [bugfix](light weight schema change) support delete condition in schema change (#11869)
* [bugfix](light weight schema change) support delete condition in schema change


Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-26 11:45:55 +08:00
ba11d8dc67 [feature-wip](unique-key-merge-on-write) fix bugs on tablet clone #12067 2022-08-26 10:37:00 +08:00
e5bfbbe761 [feature-wip](unique-key-merge-on-write) support alter table column for MoW (#12052) 2022-08-26 09:40:11 +08:00
60fddd56e7 [feature-wip](unique-key-merge-on-write) opt lock and only save valid delete_bitmap (#11953)
1. use rlock in most logic instead of wrlock
2. filter stale rowset's delete bitmap in save meta
3. add a delete_bitmap lock to handle compaction and publish_txn confict

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-08-23 14:43:40 +08:00
11dc5cad83 [feature-wip](unique-key-merge-on-write) add min/max key in segment (#11830)
some feature:
1. add min max key in segment footer to speed up get_row_ranges_by_keys
2. do not load pk bloom filter in query

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-08-17 18:11:39 +08:00
12c4d1f4dd [feature-wip](unique-key-merge-on-write) unique key table with MOW supports sequence column (#11808) 2022-08-17 10:56:14 +08:00
c3e6a841c1 [feature-wip](unique-key-merge-on-write) fix that sort segments by segment id in descending order (#11811) 2022-08-17 10:54:30 +08:00
3e13b7d2c2 [Bugfix](light-shema-change) fix _finish_clone dead lock (#11823)
In engine_clone_task.cpp, it use tablet->tablet_schema() to create rowset, but in the method, it need a lock that already locked in engine_clone_task.cpp:514. It use cloned_tablet_meta->tablet_schema() originally, but modified in #11131. It need to revert to use cloned_tablet_meta->tablet_schema().
2022-08-17 09:10:08 +08:00
7d836cf0c7 [fix](memtracker) Fix flush memtable to reduce load channel mem not executed (#11771)
The memory value automatically tracked by the tcmalloc hook in the DeltaWriter is smaller than the value recorded manually in the memtable, because the first 4096-byte Chunk requested by each MemPool when the memtable is initialized is not tracked to the DeltaWriter by the hook.

The values ​​of the two are not equal, causing the mem_consumption() == _mem_table->memory_usage branch judgment to fail.
2022-08-16 14:30:45 +08:00
0ab43c51e8 [Feature](unique-key-merge-on-write) some fix on delete bitmap usage (#11623) 2022-08-12 11:54:31 +08:00
0a5fd99d02 [feature-wip](unique-key-merge-on-write) speed up publish_txn (#11557)
In our origin design, we calc delete bitmap in publish txn, and this operation
will cost too much time as it will load segment data and lookup row key in pre
rowset and segments.And publish version task should run in order, so it'll lead
to timeout in publish_txn.

In this pr, we seperate delete_bitmap calculation to tow part, one of it will be
done in flush mem table, so this work can run parallel. And we calc final
delete_bitmap in publish_txn, get a rowset_id set that should be included and
remove rowsets that has been compacted, the rowset difference between memtable_flush
and publish_txn is really small so publish_txn become very fast.In our test,
publish_txn cost about 10ms.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-08-08 18:57:55 +08:00
321107cb40 [refactor](schema change) Using tablet schema shared ptr instead of raw ptr (#11475)
* Using tabletschema shared ptr instead of raw ptrs


Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-05 11:04:38 +08:00
5c1cd058f2 [Feature] Add interface to check tablet segment lost (#10711)
Co-authored-by: weizuo <weizuo@xiaomi.com>
2022-08-02 09:40:04 +08:00
b35daf0a04 [improvement](light-schema-change) Support tablet schema cache (#11131) 2022-08-01 12:18:00 +08:00
018665aba2 [feature-wip](unique-key-merge-on-write) some followup of #11057 (#11290) 2022-07-29 14:44:48 +08:00
a6537a90cd [Enhancement] Garbage collection of unused data on remote storage backend (#10731)
* [Feature](cold_on_s3) support unused remote rowset gc

* return aborted when skip drop tablet

* perform unused remote rowset gc
2022-07-29 14:38:39 +08:00
eab8382b4a [feature-wip](unique-key-merge-on-write) add the implementation of primary key index update, DSIP-018 (#11057) 2022-07-27 14:17:56 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
ec5471f048 [feature-wip](unique-key-merge-on-write) Implement tablet lookup interface, using rowset-tree, DSIP-018[3/5] (#10938) 2022-07-20 14:52:14 +08:00
3bc6655069 [refactor] remove BlockManager (#10913)
* remove BlockManager
* remove deprecated field in tablet meta
2022-07-17 14:10:06 +08:00
b1711e94b7 (unique-key-merge-on-write) Add tablet lookup interface, DSIP-18[3/4] (#10820)
Add lookup_row_key interface for tablet and segment.
2022-07-15 20:49:28 +08:00
364c8733fa fix light schema change coredump (#10828) 2022-07-14 15:43:15 +08:00
486cf0ebd4 [Feature] Lightweight schema change of add/drop column (#10136)
* [Schema Change] support fast add/drop column  (#49)

* [feature](schema-change) support fast schema change. coauthor: yixiutt

* [schema change] Using columns desc from fe to read data. coauthor: Lchangliang

* [feature](schema change) schema change optimize for add/drop columns.

1.add uniqueId field for class column.
2.schema change for add/drop columns directly update schema meta

Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com>

[Feature](schema change) fix write and add regression test (#69)

Co-authored-by: yixiutt <yixiu@selectdb.com>

[schema change] be ssupport that delete use newest schema

add delete regression test

fix regression case (#107)

tmp

[feature](schema change) light schema change exclude rollup and agg/uniq/dup key type.

[feature](schema change) fe olapTable maxUniqueId write in disk.

[feature](schema change) add rpc iface for sc add column.

[feature](schema change) add columnsDesc to TPushReq for ligtht sc.

resolve the deadlock when schema change (#124)

fix columns from fe don't has bitmap_index flag (#134)

add update/delete case

construct MATERIALIZED schema from origin schema when insert

fix not vectorized compaction coredump

use segment cache

choose newest schema by schema version when compaction (#182)

[bugfix](schema change) fix ligth schema change problem.

[feature](schema change) light schema change add alter job. (#1)

fix be ut

[bug] (schema change) unique drop key column should not light schema
change

[feature](schema change) add schema change regression-test.

fix regression test

[bugfix](schema change) fix multi alter clauses for light schema change. (#2)

[bugfix](schema change) fix multi clauses calculate column unique id (#3)

modify PushTask process (#217)

[Bugfix](schema change) fix jobId replay cause bdbje exception.

[bug](schema change) fix max col unique id repeatitive. (#232)

[optimize](schema change) modify pendingMaxColUniqueId generate rule.

fix compaction error
* fix be ut

* fix snapshot load core

fix unique_id error (#278)

[refact](fe) remove redundant code for light schema change. (#4)

[refact](fe) remove redundant code for light schema change. (#4)

format fe core

format be core

fix be ut

modify fe meta version

fix rebase error

flush schema into rowset_meta in old table

[refactor](schema change) refact fe light schema change. (#5)

delete the change of schemahash and support get max version schema

* modify for review

* fix be ut

* fix schema change test
2022-07-12 19:41:06 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet,
and there is no necessary to prohibit loading new data to cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
89e56ea67f [refactor] remove alpha rowset related code and vectorized row batch related code (#10584) 2022-07-05 20:33:34 +08:00
c9f86bc7e2 [refactor] Refactoring Status static methods to format message using fmt(#9533) 2022-07-02 18:58:23 +08:00
f35b235c3b [opt](compaction) optimize compaction in concurrent load (#10153)
add some logic to opt compaction:
1.seperate base&cumu compaction in case base compaction runs too long and
affect cumu compaction
2.fix level size in cu compaction so that file size below 64M have a right level
size, when choose rowsets to do compaction, the policy will ignore big rowset,
this will reduce about 25% cpu in high frequency concurrent load
3.remove skip window restriction so rowset can do compaction right after
generated, cause we'll not delete rowset after compaction. This will highly
reduce compaction score in concurrent log.
4.remove version consistence check in can_do_compaction, we'll choose a
consecutive rowset to do compaction, so this logic is useless

after add logic above, compaction score and cpu cost will have a substantial
optimize in concurrent load.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-06-17 17:49:45 +08:00
Pxl
5805f8077f [Feature] [Vectorized] Some pre-refactorings or interface additions for schema change part2 (#10003) 2022-06-16 10:50:08 +08:00
4dfebb9852 [Feature] compaction quickly for small data import (#9804)
* compaction quickly for small data import #9791
1.merge small versions of rowset as soon as possible to increase the import frequency of small version data
2.small version means that the number of rows is less than config::small_compaction_rowset_rows  default 1000
2022-06-15 21:48:34 +08:00
f4e2f78a1a [fix] Fix the bug that data balance causes tablet loss (#9971)
1. Provide a FE conf to test the reliability in single replica case when tablet scheduling are frequent.
2. According to #6063, almost apply this fix on current code.
2022-06-15 09:52:56 +08:00
3363b3aa19 [fix](load) fix streamload failure due to false unhealthy replica in concurrent stream load (#10007)
in concurrent stream load, fe will run publish version task concurrently,
which cause publish task disorder in be.
For example:
fe publish task with version 1 2 3 4
be may handle task with sequence 1 2 4 3
In case above, when report tablet info, be found that version 4
published but version 3 not visible, it'll report version miss to fe,
and fe will set replica lastFailedVersion, and finally makes transaction
commits fail while no quorum health replicas。

Add a time condition if a version miss for 60 seconds then report version miss.
2022-06-10 09:15:14 +08:00
3743f19369 [feature] support convert alpha rowset (#9890)
Add alpha rowset to beta rowset convert to convert rowset automatically. We will remove alpha rowset's code after 1.1.
2022-06-04 12:29:03 +08:00
47dfdd8e09 [fix](storage) Disable compaction before schema change is actually executed(#9032) (#9065)
As in issue, the combination and schema change at the same time may lead to version intersection.
Describe the overview of changes.
1. Do not do compaction before schema change is actually executed.
2. Set tablet as bad when it has version intersection.
3. Do not do schema change when it can not find appropriate versions to delete in new tablet.
4. Do not change rowsets after compaction if the rowsets of the tablet has changed.
2022-06-01 23:29:18 +08:00