doris

Author	SHA1	Message	Date
yixiutt	3dde97bff1	(compaction) opt compaction task producer and quick compaction (#13495 ) (#14535 ) 1.remove quick_compaction's rowset pick policy, call cu compaction when trigger quick compaction 2. skip tablet's compaction task when compaction score is too small Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-12-02 10:07:44 +08:00
yixiutt	94a6ffb906	[feature](compaction) support vertical_compaction & ordered_data_compaction (#14524 )	2022-12-01 22:15:41 +08:00
Lightman	1f9fb4dc8b	[Bugfix] Fix upgrade from 1.1 coredump (#14163 ) When upgrading from 1.1 to master, rolling back to 1.1, and then upgrading to master again, BE coredumps because some rowsets have a schema and some do not. During the first upgrade from 1.1, BE flushes the schema into all rowsets; after the rollback to 1.1, BE runs compaction and creates new rowsets without a schema. On the second upgrade from 1.1, BE coredumps because some conditions depend on either all or none of the rowsets having a schema.	2022-11-11 10:29:34 +08:00
Mingyu Chen	942611c185	Revert "[enhancement](compaction) opt compaction task producer and quick compaction (#13495 )" (#13833 ) This reverts commit 4f2ea0776ca3fe5315ab5ef7e00eefabfb5771a0.	2022-11-01 14:22:12 +08:00
yixiutt	4f2ea0776c	[enhancement](compaction) opt compaction task producer and quick compaction (#13495 ) 1.remove quick_compaction's rowset pick policy, call cu compaction when trigger quick compaction 2. skip tablet's compaction task when compaction score is too small Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-31 12:24:05 +08:00
pengxiangyu	eab8876abc	[Feature](remote) Using heavy schema change if the table is not enable light weight schema change (#13487 )	2022-10-28 15:48:22 +08:00
caiconghui	87864e40bf	[doc](random_sink) Add some doc content about random sink (#13577 ) 1. Add some doc content about random sink 2. Fix bug of showing missing rowsets info	2022-10-23 22:51:56 +08:00
yixiutt	6d322f85ac	[improvement](compaction) delete num based compaction policy (#13409 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-18 16:13:28 +08:00
Adonis Ling	125def5102	[enhancement](macOS M1) Support building from source on macOS (M1) (#13195 ) # Proposed changes This PR fixed lots of issues when building from source on macOS with Apple M1 chip. ## ATTENTION The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime: 1. Some errors with memory tracker occur when BE (RELEASE) starts. 2. Some UT cases fail. ... Temporarily, the following changes are made on macOS to start BE successfully. 1. Disable memory tracker. 2. Use tcmalloc instead of jemalloc. This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues. ## Use case ```shell ./build.sh -j 8 --be --clean cd output/be/bin ulimit -n 60000 ./start_be.sh --daemon ``` ## Something else It takes around _10+_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.	2022-10-18 13:10:13 +08:00
Xin Liao	9e42804298	[feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change (#12886 )	2022-10-09 11:31:53 +08:00
Pxl	8731eea26e	[Chore](clang) fix some build fail on clang15 (#12882 ) remove unused variables	2022-09-26 23:13:28 +08:00
Xin Liao	a5643822de	[feature-wip](unique-key-merge-on-write) fix calculate delete bitmap when has sequence column (#12789 ) when the rowset has multiple segments with sequence column, we should compare sequence id with previous segment.	2022-09-21 09:21:07 +08:00
Xin Liao	41cf94498d	[feature-wip](unique-key-merge-on-write) fix that incremental clone may lead to loss of delete bitmap (#12721 )	2022-09-20 09:08:06 +08:00
Lightman	e01986b8b9	[feature](light-schema-change) fix light-schema-change and add more cases (#12160 ) Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load. Tablet schema cache support recycle when schema sptr use count equals 1. Add a http interface for flink-connector to sync ddl. Improve tablet->tablet_schema() by max_version_schema.	2022-09-17 11:29:36 +08:00
Xin Liao	554ba40b13	[feature-wip](unique-key-merge-on-write) update delete bitmap during incremental clone (#12364 )	2022-09-09 17:03:27 +08:00
yixiutt	018b4b7e1e	[bugfix](report) fix continuous version miss check (#12415 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-08 08:39:22 +08:00
yiguolei	2f192019d3	[bugfix](delete handler) delete predicate is merged and could not find schema, causing core dump (#12161 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-30 09:18:21 +08:00
yiguolei	ccff3f5711	[bugfix](light weight schema change) support delete condition in schema change (#11869 ) * [bugfix](light weight schema change) support delete condition in schema change Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-26 11:45:55 +08:00
zhannngchen	ba11d8dc67	[feature-wip](unique-key-merge-on-write) fix bugs on tablet clone #12067	2022-08-26 10:37:00 +08:00
zhannngchen	e5bfbbe761	[feature-wip](unique-key-merge-on-write) support alter table column for MoW (#12052 )	2022-08-26 09:40:11 +08:00
yixiutt	60fddd56e7	[feature-wip](unique-key-merge-on-write) opt lock and only save valid delete_bitmap (#11953 ) 1. use rlock in most logic instead of wrlock 2. filter stale rowset's delete bitmap in save meta 3. add a delete_bitmap lock to handle compaction and publish_txn confict Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-23 14:43:40 +08:00
yixiutt	11dc5cad83	[feature-wip](unique-key-merge-on-write) add min/max key in segment (#11830 ) some feature: 1. add min max key in segment footer to speed up get_row_ranges_by_keys 2. do not load pk bloom filter in query Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-17 18:11:39 +08:00
Xin Liao	12c4d1f4dd	[feature-wip](unique-key-merge-on-write) unique key table with MOW supports sequence column (#11808 )	2022-08-17 10:56:14 +08:00
Xin Liao	c3e6a841c1	[feature-wip](unique-key-merge-on-write) fix that sort segments by segment id in descending order (#11811 )	2022-08-17 10:54:30 +08:00
Lightman	3e13b7d2c2	[Bugfix](light-schema-change) fix _finish_clone deadlock (#11823 ) engine_clone_task.cpp uses tablet->tablet_schema() to create the rowset, but that method needs a lock that is already held at engine_clone_task.cpp:514. The code originally used cloned_tablet_meta->tablet_schema() and was modified in #11131. This change reverts to using cloned_tablet_meta->tablet_schema().	2022-08-17 09:10:08 +08:00
Xinyi Zou	7d836cf0c7	[fix](memtracker) Fix flush memtable to reduce load channel mem not executed (#11771 ) The memory value automatically tracked by the tcmalloc hook in the DeltaWriter is smaller than the value recorded manually in the memtable, because the first 4096-byte Chunk requested by each MemPool when the memtable is initialized is not tracked to the DeltaWriter by the hook. The values of the two are not equal, causing the mem_consumption() == _mem_table->memory_usage branch judgment to fail.	2022-08-16 14:30:45 +08:00
zhannngchen	0ab43c51e8	[Feature](unique-key-merge-on-write) some fix on delete bitmap usage (#11623 )	2022-08-12 11:54:31 +08:00
yixiutt	0a5fd99d02	[feature-wip](unique-key-merge-on-write) speed up publish_txn (#11557 ) In our original design, we calculate the delete bitmap in publish_txn. This operation costs too much time because it loads segment data and looks up row keys in previous rowsets and segments. Publish version tasks must also run in order, so this leads to timeouts in publish_txn. In this PR, we separate the delete_bitmap calculation into two parts. One part is done when flushing the memtable, so this work can run in parallel. The final delete_bitmap is calculated in publish_txn: we get the set of rowset_ids that should be included and remove rowsets that have been compacted. The rowset difference between memtable_flush and publish_txn is very small, so publish_txn becomes very fast. In our test, publish_txn costs about 10ms. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-08 18:57:55 +08:00
yiguolei	321107cb40	[refactor](schema change) Using tablet schema shared ptr instead of raw ptr (#11475 ) * Using tabletschema shared ptr instead of raw ptrs Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-08-05 11:04:38 +08:00
weizuo93	5c1cd058f2	[Feature] Add interface to check tablet segment lost (#10711 ) Co-authored-by: weizuo <weizuo@xiaomi.com>	2022-08-02 09:40:04 +08:00
Lightman	b35daf0a04	[improvement](light-schema-change) Support tablet schema cache (#11131 )	2022-08-01 12:18:00 +08:00
zhannngchen	018665aba2	[feature-wip](unique-key-merge-on-write) some followup of #11057 (#11290 )	2022-07-29 14:44:48 +08:00
plat1ko	a6537a90cd	[Enhancement] Garbage collection of unused data on remote storage backend (#10731 ) * [Feature](cold_on_s3) support unused remote rowset gc * return aborted when skip drop tablet * perform unused remote rowset gc	2022-07-29 14:38:39 +08:00
Xin Liao	eab8382b4a	[feature-wip](unique-key-merge-on-write) add the implementation of primary key index update, DSIP-018 (#11057 )	2022-07-27 14:17:56 +08:00
Xinyi Zou	4960043f5e	[enhancement] Refactor to improve the usability of MemTracker (step2) (#10823 )	2022-07-21 17:11:28 +08:00
zhannngchen	ec5471f048	[feature-wip](unique-key-merge-on-write) Implement tablet lookup interface, using rowset-tree, DSIP-018[3/5] (#10938 )	2022-07-20 14:52:14 +08:00
plat1ko	3bc6655069	[refactor] remove BlockManager (#10913 ) * remove BlockManager * remove deprecated field in tablet meta	2022-07-17 14:10:06 +08:00
zhannngchen	b1711e94b7	(unique-key-merge-on-write) Add tablet lookup interface, DSIP-18[3/4] (#10820 ) Add lookup_row_key interface for tablet and segment.	2022-07-15 20:49:28 +08:00
Lightman	364c8733fa	fix light schema change coredump (#10828 )	2022-07-14 15:43:15 +08:00
Lightman	486cf0ebd4	[Feature] Lightweight schema change of add/drop column (#10136 ) * [Schema Change] support fast add/drop column (#49) * [feature](schema-change) support fast schema change. coauthor: yixiutt * [schema change] Using columns desc from fe to read data. coauthor: Lchangliang * [feature](schema change) schema change optimize for add/drop columns. 1.add uniqueId field for class column. 2.schema change for add/drop columns directly update schema meta Co-authored-by: yixiutt <yixiu@selectdb.com> Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com> [Feature](schema change) fix write and add regression test (#69) Co-authored-by: yixiutt <yixiu@selectdb.com> [schema change] be ssupport that delete use newest schema add delete regression test fix regression case (#107) tmp [feature](schema change) light schema change exclude rollup and agg/uniq/dup key type. [feature](schema change) fe olapTable maxUniqueId write in disk. [feature](schema change) add rpc iface for sc add column. [feature](schema change) add columnsDesc to TPushReq for ligtht sc. resolve the deadlock when schema change (#124) fix columns from fe don't has bitmap_index flag (#134) add update/delete case construct MATERIALIZED schema from origin schema when insert fix not vectorized compaction coredump use segment cache choose newest schema by schema version when compaction (#182) [bugfix](schema change) fix ligth schema change problem. [feature](schema change) light schema change add alter job. (#1) fix be ut [bug] (schema change) unique drop key column should not light schema change [feature](schema change) add schema change regression-test. fix regression test [bugfix](schema change) fix multi alter clauses for light schema change. (#2) [bugfix](schema change) fix multi clauses calculate column unique id (#3) modify PushTask process (#217) [Bugfix](schema change) fix jobId replay cause bdbje exception. [bug](schema change) fix max col unique id repeatitive. (#232) [optimize](schema change) modify pendingMaxColUniqueId generate rule. fix compaction error * fix be ut * fix snapshot load core fix unique_id error (#278) [refact](fe) remove redundant code for light schema change. (#4) [refact](fe) remove redundant code for light schema change. (#4) format fe core format be core fix be ut modify fe meta version fix rebase error flush schema into rowset_meta in old table [refactor](schema change) refact fe light schema change. (#5) delete the change of schemahash and support get max version schema * modify for review * fix be ut * fix schema change test	2022-07-12 19:41:06 +08:00
plat1ko	331fa50501	[feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280 ) This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet, and there is no necessary to prohibit loading new data to cooled tablets. Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without perceiving the underlying filesystem. The abstracted `RemoteFileSystem` can try local caching strategies with different granularity, instead of caching segment files as before. To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory. In the future, `FileReader`s and `FileWriter`s should be unified.	2022-07-08 12:18:39 +08:00
yiguolei	89e56ea67f	[refactor] remove alpha rowset related code and vectorized row batch related code (#10584 )	2022-07-05 20:33:34 +08:00
Tiewei Fang	c9f86bc7e2	[refactor] Refactoring Status static methods to format message using fmt(#9533 )	2022-07-02 18:58:23 +08:00
yixiutt	f35b235c3b	[opt](compaction) optimize compaction in concurrent load (#10153 ) add some logic to opt compaction: 1. separate base & cumu compaction in case base compaction runs too long and affects cumu compaction 2. fix level size in cu compaction so that files below 64M have the right level size; when choosing rowsets to compact, the policy will ignore big rowsets, which reduces CPU by about 25% in high-frequency concurrent load 3. remove the skip window restriction so a rowset can be compacted right after it is generated, because we will not delete rowsets after compaction. This will highly reduce compaction score in concurrent log. 4. remove the version consistency check in can_do_compaction; we choose consecutive rowsets to compact, so this logic is useless. After adding the logic above, compaction score and CPU cost are substantially improved in concurrent load. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-06-17 17:49:45 +08:00
Pxl	5805f8077f	[Feature] [Vectorized] Some pre-refactorings or interface additions for schema change part2 (#10003 )	2022-06-16 10:50:08 +08:00
chenlinzhong	4dfebb9852	[Feature] compaction quickly for small data import (#9804 ) * compaction quickly for small data import #9791 1.merge small versions of rowset as soon as possible to increase the import frequency of small version data 2.small version means that the number of rows is less than config::small_compaction_rowset_rows default 1000	2022-06-15 21:48:34 +08:00
plat1ko	f4e2f78a1a	[fix] Fix the bug that data balance causes tablet loss (#9971 ) 1. Provide a FE conf to test the reliability in single replica case when tablet scheduling are frequent. 2. According to #6063, almost apply this fix on current code.	2022-06-15 09:52:56 +08:00
yixiutt	3363b3aa19	[fix](load) fix streamload failure due to false unhealthy replica in concurrent stream load (#10007 ) In concurrent stream load, FE runs publish version tasks concurrently, which can cause publish tasks to be handled out of order in BE. For example, FE publishes tasks with versions 1 2 3 4, but BE may handle them in the sequence 1 2 4 3. In that case, when reporting tablet info, BE finds that version 4 is published but version 3 is not visible, so it reports a version miss to FE. FE then sets the replica's lastFailedVersion, and transaction commits eventually fail because there is no quorum of healthy replicas. Add a time condition: report a version miss only if the version has been missing for 60 seconds.	2022-06-10 09:15:14 +08:00
jacktengg	3743f19369	[feature] support convert alpha rowset (#9890 ) Add alpha rowset to beta rowset convert to convert rowset automatically. We will remove alpha rowset's code after 1.1.	2022-06-04 12:29:03 +08:00
Lijia Liu	47dfdd8e09	[fix](storage) Disable compaction before schema change is actually executed(#9032 ) (#9065 ) As described in the issue, running compaction and schema change at the same time may lead to version intersection. Overview of changes: 1. Do not do compaction before schema change is actually executed. 2. Set tablet as bad when it has version intersection. 3. Do not do schema change when it cannot find appropriate versions to delete in the new tablet. 4. Do not change rowsets after compaction if the rowsets of the tablet have changed.	2022-06-01 23:29:18 +08:00
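
Commit 3363b3aa19 above describes reporting a version miss only after the gap has persisted for 60 seconds, so that out-of-order publish tasks (for example 1 2 4 3) do not mark a replica unhealthy. The following is a minimal Python sketch of that idea, not the actual BE code (which is C++). The class `VersionMissTracker`, the helper `has_version_gap`, and the injectable `clock` parameter are all hypothetical names introduced for illustration; only the 60-second threshold comes from the commit message.

```python
import time
from typing import Callable, Dict, Iterable

GRACE_SECONDS = 60.0  # threshold taken from the commit message


def has_version_gap(versions: Iterable[int]) -> bool:
    """Return True if the published versions are not consecutive."""
    ordered = sorted(set(versions))
    return any(b - a > 1 for a, b in zip(ordered, ordered[1:]))


class VersionMissTracker:
    """Hypothetical helper: report a gap only once it has lasted long enough."""

    def __init__(self, grace_seconds: float = GRACE_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._grace = grace_seconds
        self._clock = clock
        # tablet_id -> time at which the current gap was first observed
        self._first_seen: Dict[int, float] = {}

    def should_report(self, tablet_id: int, versions: Iterable[int]) -> bool:
        if not has_version_gap(versions):
            # Gap closed (e.g. the late version arrived): reset the timer.
            self._first_seen.pop(tablet_id, None)
            return False
        now = self._clock()
        first = self._first_seen.setdefault(tablet_id, now)
        return now - first >= self._grace
```

With this shape, a tablet that sees versions 1 2 4 and then receives version 3 a moment later never produces a report, while a gap that is still present after the grace period does.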

160 Commits