19 Commits

Author SHA1 Message Date
08adf914f9 [improvement](vec) avoid creating a new column while filtering mutable columns (#16850)
Currently, filtering a column creates a new column to store the result, which costs some performance. With this change, ssb-flat without pushdown expr improves from 19s to 15s.
2023-02-21 09:47:21 +08:00
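The idea behind #16850 can be illustrated with a minimal sketch. It works on a plain `std::vector` rather than the real column classes, and the helper name `filter_in_place` is hypothetical, not taken from the commit: surviving values are compacted toward the front of the same buffer, so no second buffer is allocated for the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the actual Doris code): keeps data[i] only where
// filter[i] != 0 and compacts the survivors into the same buffer.
// Returns the number of rows kept.
template <typename T>
std::size_t filter_in_place(std::vector<T>& data, const std::vector<std::uint8_t>& filter) {
    assert(data.size() == filter.size());
    std::size_t kept = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (filter[i] != 0) {
            // kept <= i always holds, so no unread value is overwritten.
            data[kept++] = data[i];
        }
    }
    data.resize(kept); // drop the tail that was filtered out
    return kept;
}
```

Because the write position never passes the read position, a single forward pass is enough.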
37d1519316 [WIP](dynamic-table) support dynamic schema table (#16335)
Issue Number: close #16351

A dynamic schema table is a special type of table whose schema changes during the loading procedure. This feature is currently implemented mainly for semi-structured data such as JSON: because JSON is self-describing, schema information can be extracted from the original documents and the final type information inferred. This special table reduces manual schema change operations, makes it easy to import semi-structured data, and extends its schema automatically.
2023-02-11 13:37:50 +08:00
1f7829e099 [Fix](array-type) bugfix for array column with delete condition (#13361)
Fix for SQL with array column:
delete from tbl where c_array is null;

For more information, see #13360.

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-21 09:29:02 +08:00
cd3450bd9d [Improvement](join) optimize join probing phase (#13357) 2022-10-18 12:37:17 +08:00
4e4f8afa28 [fix](array-type) fix get_data_at for zero element array #13225
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-11 15:41:34 +08:00
35076431ab [fix](column)fix get_shrinked_column misspell (#12961)
Fix misspell
2022-09-26 17:32:03 +08:00
e413a2b8e9 [Opt](vectorized) Use new way to do hash shffle to speed up query (#12586) 2022-09-15 11:08:04 +08:00
Pxl
0ead048b93 [Enhancement](column) remove ColumnString terminating zero and add a data_version for pblock (#12456)
1. remove ColumnString terminating zero
2. add a data_version for pblock
3. change EncryptionMode to enum class
2022-09-14 21:25:22 +08:00
56b2fc43d4 [enhancement](array-type) shrink column suffix zero for type ARRAY<CHAR> (#12443)
At the compute level, the CHAR type shrinks suffix zeros.
To keep the logic consistent with the CHAR type, we also shrink them for ARRAY<CHAR> and nested ARRAY<ARRAY<CHAR>> types.

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-09-13 23:24:48 +08:00
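As a rough illustration of the shrinking described in #12443, the sketch below strips trailing zero bytes from a CHAR value and applies the same step to every element of an array and of a nested array. It uses `std::string` and `std::vector` as stand-ins for the real column types, and the function name `shrink_trailing_zeros` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: removes trailing '\0' padding bytes from one CHAR value.
void shrink_trailing_zeros(std::string& value) {
    std::size_t len = value.size();
    while (len > 0 && value[len - 1] == '\0') {
        --len;
    }
    value.resize(len);
    assert(value.empty() || value.back() != '\0');
}

// Stand-in for ARRAY<CHAR>: shrink every element.
void shrink_trailing_zeros(std::vector<std::string>& values) {
    for (auto& value : values) {
        shrink_trailing_zeros(value);
    }
}

// Stand-in for ARRAY<ARRAY<CHAR>>: recurse into each inner array.
void shrink_trailing_zeros(std::vector<std::vector<std::string>>& arrays) {
    for (auto& values : arrays) {
        shrink_trailing_zeros(values);
    }
}
```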
26cf2d3742 [enhancement](array-type) avoid abuse of Offset and Offset64 #12378
We already separated Array Offset64 and String Offset (32-bit) in PR #12341.

We now restrict Offset to IColumn and Offset64 to ColumnArray, to avoid misuse.
Using the wrong one causes a compilation failure.
2022-09-08 14:53:07 +08:00
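One way to see why mixing the two offset types in #12378 fails at compile time: containers of 32-bit and 64-bit offsets are unrelated types, so a function that takes one cannot be handed the other. The sketch below uses `std::vector` as a stand-in container; the aliases and the helper `array_size_at` are assumptions for illustration, not the actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Illustrative aliases (assumed, using std::vector as a stand-in container).
using Offset = std::uint32_t;   // string offsets, 32-bit
using Offset64 = std::uint64_t; // array offsets, 64-bit
using Offsets = std::vector<Offset>;
using Offsets64 = std::vector<Offset64>;

// Hypothetical helper: number of elements in the array at `row`, where
// offsets[row] is the end position of that row in the nested data.
std::uint64_t array_size_at(const Offsets64& offsets, std::size_t row) {
    assert(row < offsets.size());
    return row == 0 ? offsets[0] : offsets[row] - offsets[row - 1];
}

// Passing 32-bit offsets where 64-bit offsets are expected does not compile,
// because there is no implicit conversion between the two container types.
static_assert(!std::is_convertible<const Offsets&, const Offsets64&>::value,
              "32-bit offsets must not silently convert to 64-bit offsets");
```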
54d1630c42 [Opt](vectorized) speed up hash function compute in hash partition (#12334)
After optimizing the hash function, the time spent computing siphash for HASH_PARTITION in vdata_stream_sender changed as follows:

Before: 1s800ms
After: 800ms
2022-09-07 10:11:40 +08:00
cf5d194fe1 [enhancement](array-type) Split Array Offsets and String Offsets (#12341)
In older Doris versions, string offsets are 32-bit, which is not enough for the Array type.
If we changed string offsets from 32-bit to 64-bit, upgrading BEs one by one would be a problem, because 32-bit and 64-bit string offsets would exist at the same time.
As a result, we separate the code for Array offsets.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-09-06 11:18:27 +08:00
fba2658a1d [fix](array-type) fix the be core dump when use collect_list result to insert (#12045)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-26 18:00:43 +08:00
667689e9ba [Fix](array) fix array permute (#11389) 2022-08-01 22:46:03 +08:00
Pxl
a9d23ce337 [refactor] remove collator (#10518) 2022-07-01 10:35:32 +08:00
a2edc6fd8b [feature-wip](array-type) replicate impl for ColumnArray to support join with array column (#9070)
SQL with a JOIN over ARRAY columns calls the function ColumnArray::replicate. This PR
implements replicate for the ARRAY type, to support SQL like this:
`SELECT count(lo_array),count(d_array),SUM(lo_extendedprice*lo_discount) AS REVENUE FROM  lineorder, date WHERE  lo_orderdate = d_datekey AND d_year = 1993 AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25;`
2022-04-20 14:50:34 +08:00
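To make the replicate operation from #9070 concrete, here is a minimal sketch under stated assumptions: the array column is modeled as a flat `data` vector plus cumulative end `offsets`, and each row is repeated according to a per-row count, as a join would do when one row matches several rows on the other side. The struct `ArrayColumnSketch` and this `replicate` signature are hypothetical, not the actual ColumnArray::replicate interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an array column of ints:
// row i holds data[offsets[i - 1] .. offsets[i]), with offsets[-1] taken as 0.
struct ArrayColumnSketch {
    std::vector<int> data;
    std::vector<std::uint64_t> offsets;
};

// Repeats row i of `src` exactly counts[i] times in the result.
ArrayColumnSketch replicate(const ArrayColumnSketch& src,
                            const std::vector<std::uint32_t>& counts) {
    assert(counts.size() == src.offsets.size());
    ArrayColumnSketch out;
    std::uint64_t begin = 0;
    for (std::size_t row = 0; row < src.offsets.size(); ++row) {
        const std::uint64_t end = src.offsets[row];
        for (std::uint32_t c = 0; c < counts[row]; ++c) {
            // Copy the row's elements, then record the new cumulative end offset.
            out.data.insert(out.data.end(),
                            src.data.begin() + static_cast<std::ptrdiff_t>(begin),
                            src.data.begin() + static_cast<std::ptrdiff_t>(end));
            out.offsets.push_back(out.data.size());
        }
        begin = end;
    }
    return out;
}
```

Empty arrays still produce an offset entry per repetition, so the row count of the result always equals the sum of the counts.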
52d18aa83c permute impl for column array; and codes format (#8949)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-04-13 09:47:54 +08:00
2580da4f72 [feature-wip](array-type) Support insertion for vectorized engine. (#8494) (#8590)
Please refer to #8493
2022-03-22 15:48:13 +08:00
a498463ab5 [feature-wip](array-type)support select ARRAY data type on vectorized engine (#8217) (#8584)
Usage Example:
1. Create a table for testing:
```
CREATE TABLE `array_test` (
  `k1` tinyint(4) NOT NULL COMMENT "",
  `k2` smallint(6) NULL COMMENT "",
  `k3` ARRAY<int(11)> NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`k1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`k1`) BUCKETS 5
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2"
);
```

2. Insert some data:
```
insert into array_test values(1, 2, [1, 2]);
insert into array_test values(2, 3, null);
insert into array_test values(3, null, null);
insert into array_test values(4, null, []);
```

3. Enable the vectorized engine:
`set enable_vectorized_engine=true;`

4. Query the array data:
`select * from array_test;`
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    4 | NULL | []     |
|    2 |    3 | NULL   |
|    1 |    2 | [1, 2] |
|    3 | NULL | NULL   |
+------+------+--------+
4 rows in set (0.061 sec)

Code changes include:
1. add column_array and data_type_array code;
2. move the code that creates a data_type from Field, TabletColumn, TypeDescriptor, and PColumnMeta into DataTypeFactory;
3. support creating a data_type for the ARRAY data type;
4. RowBlockV2::convert_to_vec_block supports the ARRAY data type;
5. VMysqlResultWriter::append_block supports the ARRAY data type;
6. vectorized::Block serialization and deserialization support the ARRAY data type.
2022-03-22 15:21:44 +08:00