Read the predicate columns first, and use VExprContext (the push-down predicates)
to generate the select vector, which is then applied when reading the non-predicate columns.
Rows filtered out by the select vector can be skipped in the non-predicate columns, which reduces value-decoding time.
If a whole page can be skipped, decompression time is saved as well.
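
The lazy-read idea above can be sketched roughly as follows. This is a minimal illustration in Python, not the reader's actual code: `build_select_vector`, `read_lazy`, the list-of-lists page layout, and the `decode` callback are all hypothetical names introduced here.

```python
def build_select_vector(predicate_column, predicate):
    """Evaluate the pushed-down predicate on the already-decoded predicate column."""
    return [bool(predicate(v)) for v in predicate_column]


def read_lazy(pages, select_vector, decode):
    """Read a non-predicate column, decoding only the selected rows.

    `pages` is a hypothetical layout: a list of pages, each a list of
    encoded values. A page with no selected row is skipped entirely,
    standing in for skipped decompression.
    """
    out = []
    stats = {"pages_skipped": 0, "values_decoded": 0}
    row = 0
    for page in pages:
        selected = select_vector[row:row + len(page)]
        if not any(selected):
            # No row in this page survives the predicate: skip the whole page.
            stats["pages_skipped"] += 1
        else:
            for encoded, keep in zip(page, selected):
                if keep:
                    out.append(decode(encoded))
                    stats["values_decoded"] += 1
        row += len(page)
    return out, stats


# Predicate column is read first; the predicate is `value > 4`.
select_vector = build_select_vector([1, 5, 7, 2, 0, 1], lambda v: v > 4)
# The non-predicate column is split into two pages of three values each.
values, stats = read_lazy([["a", "b", "c"], ["d", "e", "f"]], select_vector, str.upper)
```

In this sketch the second page is never touched because none of its rows are selected, and only two of the six values are decoded.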
When executing ANALYZE TABLE, Doris fails on decimal columns.
The root cause is that the scale in decimalV2 is 9, while the scale in the schema is 2.
There is no need to check the scale for decimalV2, since it is not a floating-point type.
1. add RemainedDownPredicates
2. fix core dump when _scan_ranges is empty
3. fix invalid memory access on vLiteral's debug_string()
4. enlarge mv test wait time
Refactor DataTypeArray from_string to make it clearer;
support ',' and ']' inside string elements, for example: ['hello,,,', 'world][]'];
support empty elements, such as [,] ==> [0,0].
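
The parsing behavior described above can be illustrated with a small quote-aware splitter. This is only a sketch in Python under stated assumptions, not the DataTypeArray implementation: the function names are hypothetical, escape sequences inside quotes are not handled, and empty integer elements are mapped to 0 to mirror the `[,] ==> [0,0]` example.

```python
def split_array_elements(text):
    """Split '[a, b, c]' into raw element strings, ignoring ',' and ']' inside quotes."""
    text = text.strip()
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError("array literal must be wrapped in [ ]")
    body = text[1:-1]
    if body.strip() == "":
        return []
    elements, current, quote = [], [], None
    for ch in body:
        if quote is not None:
            # Inside a quoted element: everything is literal until the closing quote.
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == ",":
            elements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if quote is not None:
        raise ValueError("unterminated quoted element")
    elements.append("".join(current).strip())
    return elements


def parse_int_array(text):
    """Empty elements fall back to 0 (assumed default value for this sketch)."""
    return [int(e) if e else 0 for e in split_array_elements(text)]


def parse_string_array(text):
    """Strip one pair of matching surrounding quotes from each element."""
    result = []
    for e in split_array_elements(text):
        if len(e) >= 2 and e[0] == e[-1] and e[0] in "'\"":
            e = e[1:-1]
        result.append(e)
    return result
```

For instance, `parse_string_array("['hello,,,', 'world][]']")` keeps the commas and brackets inside each element, and `parse_int_array("[,]")` yields two default values.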
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
1. bind slots in ORDER BY that do not appear in the project list, such as:
SELECT c1 FROM t WHERE c2 > 0 ORDER BY c3
2. do not check for unbound slots when binding slot references; instead, do it in the analysis check.
* [enhancement](load) shrink reserved buffer for page builder (#14012)
For a table with hundreds of text-type columns, flushing its memtable may consume a huge amount of memory.
This memory is consumed when initializing the page builder, as 1MB is reserved for each column.
So memory consumption grows in proportion to the number of columns. Shrinking the reservation may
reduce memory usage substantially during the load process.
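
As a rough back-of-the-envelope illustration of that proportionality (Python; the 500-column table width and the 64 KiB reservation are hypothetical numbers chosen for the example, not values taken from this patch):

```python
MIB = 1024 * 1024
KIB = 1024


def reserved_bytes(num_columns, reserve_per_column):
    """Total bytes reserved up front when every column reserves the same amount."""
    return num_columns * reserve_per_column


# 500 columns is a hypothetical table width; 1 MiB per column is the
# reservation described in the commit message.
before = reserved_bytes(500, 1 * MIB)
# 64 KiB is a hypothetical smaller reservation, not the value chosen by the patch.
after = reserved_bytes(500, 64 * KIB)
print(before // MIB, "MiB reserved before;", after / MIB, "MiB reserved after")
```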
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
* response to the review
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
* Update binary_plain_page.h
* Update binary_dict_page.cpp
* Update binary_plain_page.h
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Introduce SQL syntax for creating inverted indexes, along with the related metadata changes.
```
-- create table with INVERTED index
CREATE TABLE httplogs (
  ts datetime,
  clientip varchar(20),
  request string,
  status smallint,
  size int,
  INDEX idx_size (size) USING INVERTED,
  INDEX idx_status (status) USING INVERTED,
  INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none")
)
DUPLICATE KEY(ts)
DISTRIBUTED BY RANDOM BUCKETS 10;
-- add an INVERTED index to a table
CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english");
```
This PR implements predicate inference.
For example:
``` sql
select * from student left join score on student.id = score.sid where score.sid > 1
```
transformed logical plan tree:

                 left join
              /             \
    filter(sid > 1)     filter(id > 1) <---- inferred predicate
          |                   |
        scan                scan

See `InferPredicatesTest` for more cases
The logic is as follows:
1. pull up the bottom predicates, then infer additional predicates.
   For example:
   select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id
   1. pull up the bottom predicate:
      select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1
   2. infer:
      select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1
   finally, the transformed SQL is:
      select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1
2. put these predicates into `otherJoinConjuncts`; they are processed in the next
   round of predicate push-down.
Currently only inference of `ComparisonPredicate` is supported.
TODO: We should determine whether `expression` satisfies the condition for replacement,
e.g. check whether `expression` is non-deterministic.
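
The inference step can be sketched as propagating comparison predicates across columns that the join conditions declare equal. The Python below is a simplified illustration, not the Nereids implementation: it uses a union-find over column names as a stand-in technique, and `infer_predicates` together with the tuple encodings of predicates are hypothetical.

```python
from collections import defaultdict


def infer_predicates(equalities, comparisons):
    """Return comparison predicates implied by column equalities.

    equalities:  iterable of (col_a, col_b) pairs, e.g. from join conditions.
    comparisons: iterable of (col, op, literal) triples already known.
    """
    comparisons = list(comparisons)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Build equivalence classes of columns that must hold equal values.
    for a, b in equalities:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for col in list(parent):
        groups[find(col)].add(col)

    known = set(comparisons)
    inferred = set()
    for col, op, literal in comparisons:
        if col not in parent:
            continue  # column takes part in no equality, nothing to infer
        for other in groups[find(col)]:
            candidate = (other, op, literal)
            if candidate not in known:
                inferred.add(candidate)
    return inferred


# student.id = score.sid and score.sid > 1 together imply student.id > 1.
left_join_case = infer_predicates(
    [("student.id", "score.sid")], [("score.sid", ">", 1)]
)
# t.id = t2.id and t.id = 1 together imply t2.id = 1.
pull_up_case = infer_predicates([("t.id", "t2.id")], [("t.id", "=", 1)])
```

The sketch only handles column-versus-literal comparisons and treats every expression as deterministic, which is exactly the gap the TODO above points at.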