Inspired by the C++ function `std::vector::emplace_back()`, we can use a variadic template to solve this issue.
For example:
```
[['struct'], 'STRUCT<TYPES>', ['TYPES'], 'ALWAYS_NOT_NULLABLE', ['TYPES...']]
```
`TYPES...` in template_types defines a variadic template `TYPES`. The variadic template is then expanded into multiple normal templates, based on the actual input arguments, at runtime in FE.
Make sure `TYPES...` is placed in the last position among all template type arguments.
The original template function logic is not affected.
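The expansion step can be illustrated with a minimal sketch. It is not the FE implementation: the class name `VariadicTemplateSketch`, the method `expand`, and the `TYPES_0`, `TYPES_1`, ... naming are all hypothetical, and it assumes each non-variadic template binds exactly one argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: expands a trailing variadic template ("TYPES...")
// into one normal template per remaining actual argument.
final class VariadicTemplateSketch {
    static List<String> expand(List<String> templateTypes, int actualArgCount) {
        List<String> expanded = new ArrayList<>();
        for (int i = 0; i < templateTypes.size(); i++) {
            String type = templateTypes.get(i);
            if (!type.endsWith("...")) {
                // A normal template is kept as-is (assumed to bind one argument).
                expanded.add(type);
                continue;
            }
            if (i != templateTypes.size() - 1) {
                throw new IllegalArgumentException(
                        "variadic template must be the last template type: " + type);
            }
            String base = type.substring(0, type.length() - 3);
            // The variadic template takes every argument that is left over.
            int variadicCount = actualArgCount - i;
            for (int k = 0; k < variadicCount; k++) {
                expanded.add(base + "_" + k);
            }
        }
        return expanded;
    }
}
```

With this sketch, a call with three arguments against `['TYPES...']` yields three normal templates, while a variadic template in any other position is rejected.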
Add some limits for creating a materialized view on an aggregate table.
```sql
CREATE TABLE t1 (
p1 INT,
p2 INT,
p3 INT,
v1 INT SUM,
v2 INT MAX,
v3 INT MIN ) AGGREGATE KEY (p1, p2, p3) DISTRIBUTED BY HASH (p1) BUCKETS 1 PROPERTIES ('replication_num' = '1');
CREATE MATERIALIZED VIEW mv_1 AS SELECT p1, SUM(v3) FROM t1 GROUP BY p1; -- invalid: aggregate type does not match
CREATE MATERIALIZED VIEW mv_2 AS SELECT p1, MIN(v3+v3) FROM t1 GROUP BY p1; -- invalid: expression calculated on an aggregate column
CREATE MATERIALIZED VIEW mv_3 AS SELECT p1, SUM(v1) FROM t1 GROUP BY p1; -- ok: v1 is cast to bigint
CREATE MATERIALIZED VIEW mv_4 AS SELECT p1, SUM(abs(v1)) FROM t1 GROUP BY p1; -- invalid: expression calculated on an aggregate column
```
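The two limits shown above can be sketched as a small check. This is an illustration only, not the FE analyzer: `AggregateMvCheckSketch` and `check` are hypothetical names, and the sketch assumes the function argument is passed as text and that anything other than a bare value-column name counts as an expression.

```java
import java.util.Map;

// Hypothetical sketch of the two limits for a materialized view
// on an aggregate table.
final class AggregateMvCheckSketch {
    // columnAggTypes maps each value column to its aggregate type,
    // e.g. v1 -> SUM, v2 -> MAX, v3 -> MIN.
    static String check(Map<String, String> columnAggTypes, String function, String argument) {
        String aggType = columnAggTypes.get(argument);
        if (aggType == null) {
            // Not a bare value column, e.g. "v3+v3" or "abs(v1)".
            return "invalid expression calculated on aggregate column";
        }
        if (!aggType.equalsIgnoreCase(function)) {
            // e.g. SUM over a column declared as MIN.
            return "invalid aggregate type";
        }
        return "ok";
    }
}
```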
When rowsets do not overlap and the sort order is descending, the logic of VCollectIterator::Level0Iterator::init_for_union is followed. In this function, the row ref pos of the first level0 iterator is set to 0, and the row ref pos of every other level0 iterator is set to -1.
But in the Level1Iterator, when rowsets are non-overlapping and the ordering is descending, the list of rowset iterators is reversed, so the row ref pos of the first level0 iterator in the list is -1, which makes the block reader think that the entire tablet has no data.
The UDAF case is unstable for the following reason:
when fuzzing sets enable_pipeline_engine=true, the agg function runs with only 1 instance,
so the default value is not merged; with more than 1 instance, the default value is merged.
Refer to the paper: WinMagic - Subquery Elimination Using Window Aggregation
For SQL like TPC-H Q2 and Q17, which contain a correlated sub-query with only one aggregate function output, we can eliminate the sub-query and transform it into a window function. For example, TPC-H Q17 is:
```sql
select
sum(l_extendedprice) / 7.0 as avg_yearly
from
lineitem,
part
where
p_partkey = l_partkey
and p_brand = 'Brand#23'
and p_container = 'MED BOX'
and l_quantity < (
select
0.2 * avg(l_quantity)
from
lineitem
where
l_partkey = p_partkey
);
```
We rewrite it to:
```sql
select
sum(l_extendedprice) / 7.0 as avg_yearly
from (
select
l_extendedprice, l_quantity, avg(l_quantity) over(partition by l_partkey) avg_l_quantity
from
lineitem,
part
where
p_partkey = l_partkey
and p_brand = 'Brand#23'
and p_container = 'MED BOX' )
where l_quantity < 0.2 * avg_l_quantity
```
For now the rule can only handle the case where the conjuncts in the outer scope contain exactly one sub-query and the conjunct containing the sub-query is a comparison predicate. Support for compound predicates and for more than one conjunct containing a sub-query will come later.
Due to concurrent loads, the delete bitmaps of historical data and of incremental calculations may contain duplicates, which causes missed rows to be calculated more than once.
Currently, if a user modifies a file on HDFS directly rather than through Hive, Doris does not notice the change and the user
gets wrong data. Support a TTL (time-to-live) config for the file cache, so that stale file info is invalidated automatically after it expires.
1. Add a configuration parameter, "file.meta.cache.ttl-second", to set the file cache TTL.
2. Pass the configured value to Guava's expireAfterAccess.
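The expire-after-access behavior behind these two steps can be sketched with the standard library alone. This is a stand-in for illustration, not the Guava-based code in the PR: `TtlFileCacheSketch`, its injectable clock, and the loader function are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical stdlib sketch of expire-after-access semantics.
final class TtlFileCacheSketch<K, V> {
    private static final class Entry<T> {
        final T value;
        long lastAccessSeconds;

        Entry(T value, long lastAccessSeconds) {
            this.value = value;
            this.lastAccessSeconds = lastAccessSeconds;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlSeconds;          // value of the TTL config
    private final LongSupplier clockSeconds; // injectable clock, for testing
    private final Function<K, V> loader;     // reloads file info on a miss

    TtlFileCacheSketch(long ttlSeconds, LongSupplier clockSeconds, Function<K, V> loader) {
        this.ttlSeconds = ttlSeconds;
        this.clockSeconds = clockSeconds;
        this.loader = loader;
    }

    V get(K key) {
        long now = clockSeconds.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.lastAccessSeconds >= ttlSeconds) {
            // Missing or stale: reload and restart the TTL window.
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        } else {
            // A read refreshes the last-access time.
            entry.lastAccessSeconds = now;
        }
        return entry.value;
    }
}
```

Note that with expire-after-access semantics every read resets the timer, so an entry is only invalidated once it has gone unread for the full TTL.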
Co-authored-by: lexluo <lexluo@tencent.com>
SQL like:
select k5, k6, SUM(k3) AS k3
from (
select
k5,
date_format(k6, '%Y-%m-%d') as k6,
count(distinct k3) as k3
from t
group by k5, k6
) AS temp where 1=1
group by k5, k6;
throws an exception because the conjuncts are planned on the exchange node, which cannot handle conjuncts. We now skip the exchange node when planning conjuncts, which fixes the bug.
Note: the bug occurs if and only if the conjunct is always true, like `1=1` above.
SQL like:
select * from (select *, null as top from v1)t where top = 5;
select * from (select *, null as top from v1)t where top is not null;
causes an NPE because targetTypeDef is null when the value is null. Now we use the cast target type as the targetTypeDef.
The optimization that skips reading column data when the column matches an inverted index and is only used in the WHERE clause may return wrong results for complex SQL.
This PR temporarily disables the optimization; later PRs will resolve the problem fundamentally.
After memory exceeds the limit, the query previously waited in the mem hook for memory to be freed. It now waits in the Allocator,
which is more controllable and safer.
Add the `<optional>` header to fix the compilation issue.
Use 3.12.9 as the version of protoc.artifact, because 3.12.21 does not exist.
See: https://repo.maven.apache.org/maven2/com/google/protobuf/protoc/
Remove the `--show-progress` argument of wget because older versions of wget do not support it.
Lock contention in DatabaseTransactionMgr is sometimes fierce, which may cause publish to time out. We should add a log that surfaces this lock contention.
When using JDBC Catalog to query Doris data, Doris sends all of the data to the client at once because it does not provide a cursor-based read method (that is, fetchBatchSize has no effect), which results in client OOM.
The MySQL protocol provides a streaming read method, and Doris can use it to avoid the OOM. Streaming requires setting fetchbatchsize = Integer.MIN_VALUE and creating the statement with ResultSet.TYPE_FORWARD_ONLY and ResultSet.CONCUR_READ_ONLY.
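On the client side, the streaming requirements above look roughly like the following plain JDBC sketch. It is not the Doris JDBC Catalog code: `StreamingReadSketch` and `countRows` are hypothetical names, and it assumes a MySQL-protocol driver such as MySQL Connector/J, which treats a fetch size of `Integer.MIN_VALUE` as a request to stream rows one at a time.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch: read a result set in streaming mode instead of
// buffering every row in client memory.
final class StreamingReadSketch {
    static long countRows(Connection conn, String sql) throws SQLException {
        // Streaming requires a forward-only, read-only statement.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Integer.MIN_VALUE is the driver's signal for row-by-row streaming.
            stmt.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                long rows = 0;
                while (rs.next()) {
                    rows++; // process one row at a time here
                }
                return rows;
            }
        }
    }
}
```

The result set should be read to the end (or closed) before the same connection issues another statement, since the rows are still in flight while streaming.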