Commit Graph

15 Commits

SHA1 Message Date
aea3e4e59b [refactor] Remove version hash and related tests from BE (#8027) 2022-02-14 09:29:27 +08:00
dd36ccc3bf [feature](storage-format) Z-Order Implement (#7149)
Support sorting data by Z-Order:

```
CREATE TABLE table2 (
    siteid int(11) NULL DEFAULT "10" COMMENT "",
    citycode int(11) NULL COMMENT "",
    username varchar(32) NULL DEFAULT "" COMMENT "",
    pv bigint(20) NULL DEFAULT "0" COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(siteid, citycode)
COMMENT "OLAP"
DISTRIBUTED BY HASH(siteid) BUCKETS 1
PROPERTIES (
    "replication_allocation" = "tag.location.default: 1",
    "data_sort.sort_type" = "ZORDER",
    "data_sort.col_num" = "2",
    "in_memory" = "false",
    "storage_format" = "V2"
);
```
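With "data_sort.col_num" = "2", presumably the first two key columns (siteid, citycode) participate in the Z-order interleaving. A hypothetical query against the table above (the filter ranges are made up) of the shape this layout is meant to speed up:

```
SELECT siteid, citycode, SUM(pv)
FROM table2
WHERE siteid BETWEEN 10 AND 20
  AND citycode BETWEEN 100 AND 200
GROUP BY siteid, citycode;
```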
2021-12-02 11:39:51 +08:00
63dbcbc4e1 [UT] Fix ut bugs (#6862)
Co-authored-by: morningman <chenmingyu@baidu.com>
2021-10-18 10:12:55 +08:00
a6e2c3e3f1 [Bug][Clone] Fix the bug that incremental clone is not triggered (#5230)
In version 0.13, we introduced a more efficient compaction logic which
maintains multiple version paths for a tablet. This avoids -230 errors
and also makes incremental clone possible.

However, the previous incremental clone used the incremental rowset meta recorded in `incr_rs_meta`.
At present, the records in `incr_rs_meta` duplicate those in `stale_rs_meta`,
and the current clone logic does not adapt to the new multi-version paths,
so in many cases an incremental clone is not triggered.

This CL mainly includes the following changes:

1. Removed the `incr_rs_meta` metadata.
2. Modified the clone logic: when performing an incremental clone, it will try to read the rowsets in `stale_rs_meta`.
3. Deleted a lot of code that was previously used for version compatibility.
2021-02-06 22:04:48 +08:00
068707484d Support sequence column for UNIQUE_KEYS Table (#4256)
* Add sequence column (see the sketch below)

Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
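A minimal sketch of how a sequence column might be declared on a UNIQUE KEY table. The table and column names are hypothetical, and the property name "function_column.sequence_type" plus the replace-on-larger-sequence behavior described in the comments are assumptions that may differ from the exact syntax introduced in this PR:

```
-- Hypothetical sketch: UNIQUE KEY table with an int sequence column.
-- Assumed behavior: when two loads carry the same key, the row with the
-- larger sequence value wins, rather than simply the later-loaded row.
CREATE TABLE test_seq (
    user_id bigint NOT NULL,
    city varchar(32) NULL
) ENGINE=OLAP
UNIQUE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 1
PROPERTIES (
    "replication_num" = "1",
    "function_column.sequence_type" = "int"
);
```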
2020-09-04 10:10:17 +08:00
d61c10b761 [Delete] Support batch delete [part 1] (#4310)
* Implement the syntax for batch delete #4051
* Process CREATE and ALTER TABLE when the table has a delete sign column
* Support the syntax for enabling the delete sign column
* Automatically filter deleted data in SELECT statements (see the sketch after this list)
* Automatically add the delete sign when creating a rollup table
TODO:
 * Optimize the read and compaction logic on the BE side, so that data marked as deleted is completely removed during base compaction
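A minimal sketch of how the delete sign might be used after this change. The table and column names are hypothetical, and the ENABLE FEATURE statement and the implicit filtering described in the comments are assumptions that may not match the exact syntax landed in this part 1 PR:

```
-- Hypothetical: enable the delete sign column on an existing table
ALTER TABLE test_tbl ENABLE FEATURE "BATCH_DELETE";

-- Assumed behavior after this change: a plain SELECT implicitly filters out
-- rows whose delete sign is set, so deleted rows do not appear in the result.
SELECT siteid, citycode FROM test_tbl;
```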
2020-08-21 22:57:16 +08:00
f189a2e7b8 [Spark load][Be 1/1] Be handle push task (#3742)
1. Add a PushBrokerReader in push_handle.cpp.
2. PushBrokerReader wraps the ParquetScanner to support reading data from Parquet format files through the broker.
2020-06-22 19:57:58 +08:00
c967eaf496 [Memory Engine] Add TabletType to PartitionInfo and TabletMeta (#3668) 2020-05-29 20:20:44 +08:00
625411bd28 Doris support in memory olap table (#2847) 2020-02-18 10:45:54 +08:00
13e5fdd512 [AlphaRowset] set num_segments field in rowset meta if missing (#2658)
The number of segments should be read from the rowset meta PB,
but a bug in the previous code caused this value not to be set in some cases.
So when initializing the rowset meta, if we find that num_segments is 0 (not set),
we will try to calculate the number of segments from AlphaRowsetExtraMetaPB
and then set the num_segments field.
This should only happen for some rowsets converted from an old version;
for all newly created rowsets, the num_segments field must be set.
2020-01-07 21:46:02 +08:00
6f4feca3dc Add rowset id generator to FE and BE (#1678) 2019-09-02 18:51:31 +08:00
e8561d71a6 Add dict page (#1409)
Add a dict encoding page for binary/string type data.
Construct a dict for the original data, and save the encoded id instead of
the original data to save space. If the dict is too big, it will automatically
fall back to plain encoding.
2019-07-26 09:47:11 +08:00
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch will modify all Backends' data,
and restarting a BE will take a very long time.
So if you do not want to interfere with your production environment,
you should upgrade the Backends one by one.

1. Refactor the BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
   Naming a rowset with tablet_id and version leads to
   many conflicts among compaction, clone, and restore.
3. Extract a rowset interface to encapsulate rowsets
   with different formats.
2019-07-15 21:18:22 +08:00
5d3fc80067 Added:
* Add streaming load feature. You can execute 'help stream load;' to see more information.

Changed:
* The loading phase of a table can be parallelized, to reduce load job execution time when multiple load jobs target a single table.
* Use RocksDB to save the header info of tablets in Backends, to reduce IO operations and speed up restarting.

Fixed:
* A lot of bugs fixed.
2018-10-31 14:46:22 +08:00
e2311f656e baidu palo 2017-08-11 17:51:21 +08:00