Due to the current architecture, predicate derivation in the rewrite phase cannot cover all cases:
the rewrite is performed on the ON clause first and then on the WHERE clause, so when there are subqueries, some predicates cannot be derived.
The predicate pushdown method is therefore kept here.
e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;
InferFiltersRule cannot infer t2 = 1, because this case is outside its specification.
However, the expression (t2 = 1) can be derived and pushed down to the scan node.
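The derivation in the example above can be sketched as follows. This is a minimal illustration in Python, not the Doris implementation: `infer_constant_predicates` and its arguments are hypothetical names, and the sketch ignores the outer-join direction rules a real planner must check before pushing a derived filter down.

```python
def infer_constant_predicates(join_equalities, constant_filters):
    """Derive extra `column = constant` filters from join equalities.

    join_equalities: (left_col, right_col) pairs taken from the ON clause.
    constant_filters: dict mapping a column to the constant it must equal
                      (taken from the WHERE clause).
    Returns only the newly derived filters.
    """
    derived = {}
    known = dict(constant_filters)
    changed = True
    while changed:  # repeat so that chains like a = b, b = c are followed
        changed = False
        for left, right in join_equalities:
            for src, dst in ((left, right), (right, left)):
                if src in known and dst not in known:
                    derived[dst] = known[src]
                    known[dst] = known[src]
                    changed = True
    return derived


# ON t1 = t2, WHERE t1 = 1  (names as written in the example query)
print(infer_constant_predicates([("t1", "t2")], {"t1": 1}))
```

For the example query, the derived filter on t2 is the one that could be attached to the scan of the right-hand table.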
Currently, the libhdfs3 library integrated into Doris BE does not support accessing clusters with Kerberos authentication
enabled, because the Kerberos-related dependencies (gsasl and krb5) were not added when libhdfs3 was built.
This PR enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:
- gsasl version: 1.8.0
- krb5 version: 1.19
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. Fix bug: the vjson scanner does not support `range_from_file_path`.
2. Fix bug: the vjson/vbroker scanner core dumps when the src and dest slots differ in nullability.
3. Fix bug: in vparquet filter_block, the column reference count is not 1.
4. Refactor to simplify the code.
This only changes the vectorized load, not the original row-based load.
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Hive and Trino/Presto automatically trim trailing spaces, but Doris does not.
This causes query results to differ from Hive's.
Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, the trailing spaces of each column are trimmed when reading CSV from a broker scan node.
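The trimming behavior can be illustrated with a small Python sketch. `split_csv_line` and its `trim_trailing_spaces` flag are hypothetical stand-ins for the scanner and the session variable; the real scan node is implemented in the BE and may differ in detail.

```python
def split_csv_line(line, column_separator=",", trim_trailing_spaces=False):
    """Split one CSV line into columns, optionally trimming trailing spaces.

    Only trailing spaces are removed; leading spaces are preserved.
    """
    columns = line.rstrip("\n").split(column_separator)
    if trim_trailing_spaces:
        columns = [column.rstrip(" ") for column in columns]
    return columns


line = "abc  , de ,f\n"
print(split_csv_line(line))                             # flag off
print(split_csv_line(line, trim_trailing_spaces=True))  # flag on
```

With the flag off, the columns keep their trailing spaces, which is where the results diverge from Hive.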
An unnecessary cast is added to the children of CaseExpr because the symbolized equality is used to compare against the `Expr`'s type.
This leads to incorrect expression comparison, which in turn makes expression substitution fail when `ExprSubstitutionMap` is used.
Issue Number: close #9555
Make the last value of the dictionary null. When ColumnDict inserts a null value,
it adds the encoding corresponding to the last value of the dictionary.
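A minimal Python sketch of the idea, assuming a simple dictionary-encoded column: the `DictColumn` class and its methods are hypothetical and only illustrate reserving the last dictionary slot for null, not the actual ColumnDict code.

```python
class DictColumn:
    """Toy dictionary-encoded column whose last dictionary entry is null."""

    def __init__(self, dictionary):
        values = list(dictionary)
        # Reserve the final dictionary slot for null (None in this sketch).
        self.dictionary = values + [None]
        self.index = {value: code for code, value in enumerate(values)}
        self.null_code = len(self.dictionary) - 1
        self.codes = []

    def insert(self, value):
        # A null value is stored as the code of the last dictionary entry.
        if value is None:
            self.codes.append(self.null_code)
        else:
            self.codes.append(self.index[value])

    def decode(self):
        return [self.dictionary[code] for code in self.codes]


column = DictColumn(["a", "b"])
for value in ["a", None, "b"]:
    column.insert(value)
print(column.codes)
print(column.decode())
```

Because null has its own code, inserting a null needs no special marker outside the dictionary.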
This bug was introduced in stream-load-vec #9280. It causes multiple threads
to operate on the same segment writer, because BetaRowset enables multi-threaded
memtable flush: the memtable flush calls rowset_writer.add_block, which
writes through the member variable _segment_writer, so segment writes
end up running on multiple threads.
Co-authored-by: yixiutt <yixiu@selectdb.com>
The "push down predicate past aggregate" optimization cannot push predicates past a two-phase aggregate.
The original plan looks like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
It should be optimized to:
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```
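The transformation between the two plans above can be sketched in Python. `PlanNode` and `push_conjuncts_to_scan` are hypothetical names for illustration; the sketch assumes the caller has already verified that the conjuncts reference only olap scan node tuples, as in the diagrams.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    name: str
    conjuncts: List[str] = field(default_factory=list)
    child: Optional["PlanNode"] = None


def push_conjuncts_to_scan(root):
    """Move the top aggregate's conjuncts to the scan below a chain of aggs.

    Assumption: every conjunct on `root` refers only to scan tuples.
    """
    node = root
    # Walk down through any number of aggregate phases.
    while node.child is not None and node.name.endswith("agg"):
        node = node.child
    if node is not root and node.name == "olap scan node":
        node.conjuncts.extend(root.conjuncts)
        root.conjuncts = []
    return root


scan = PlanNode("olap scan node")
first = PlanNode("first phase agg", child=scan)
second = PlanNode("second phase agg", conjuncts=["k1 = 1"], child=first)

push_conjuncts_to_scan(second)
print(second.conjuncts, scan.conjuncts)
```

Walking the whole chain of aggregate nodes, rather than a single level, is what lets the predicate reach the scan in the two-phase case.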