Introduce SQL syntax for creating inverted indexes, along with the related metadata changes.
```sql
-- create a table with INVERTED indexes
CREATE TABLE httplogs (
  ts datetime,
  clientip varchar(20),
  request string,
  status smallint,
  size int,
  INDEX idx_size (size) USING INVERTED,
  INDEX idx_status (status) USING INVERTED,
  INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none")
)
DUPLICATE KEY(ts)
DISTRIBUTED BY RANDOM BUCKETS 10;

-- add an INVERTED index to an existing table
CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english");
```
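A minimal sketch of queries that could exercise these indexes, assuming Doris' `MATCH_ANY` fulltext operator for parsed columns (the concrete predicates and values are illustrative, not from this PR):
```sql
-- equality/range predicates can hit the non-parsed indexes
SELECT count(*) FROM httplogs WHERE status = 404 AND size > 10240;

-- assuming the MATCH_ANY fulltext operator, keyword lookups on request
-- can hit the english-parsed idx_request index
SELECT count(*) FROM httplogs WHERE request MATCH_ANY 'login error';
```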
This PR implements predicate inference.
For example:
```sql
select * from student left join score on student.id = score.sid where score.sid > 1
```
The transformed logical plan tree:
```
          left join
         /         \
  filter(sid > 1)   filter(id > 1)   <---- inferred predicate
        |                 |
      scan              scan
```
See `InferPredicatesTest` for more cases.
The logic is as follows:
1. Pull up the bottom predicate, then infer additional predicates. For example:
   `select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id`
   1. pull up the bottom predicate:
      `select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1`
   2. infer:
      `select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1`

   The finally transformed SQL is:
   `select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1`
2. Put these predicates into `otherJoinConjuncts`; they are processed in the next round of predicate push-down.
Currently only `ComparisonPredicate` inference is supported.
TODO: we should determine whether an `expression` satisfies the conditions for replacement, e.g. an `expression` that is non-deterministic must not be replicated.
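A hedged illustration of both the supported case and the TODO above (the `random()` example is hypothetical, not taken from `InferPredicatesTest`):
```sql
-- inferable: score.sid > 1 is a deterministic ComparisonPredicate,
-- so student.id > 1 can be derived via the join condition
select * from student left join score on student.id = score.sid where score.sid > 1;

-- must not be inferred: random() is non-deterministic, so copying this
-- predicate to the student side could change the query result
select * from student left join score on student.id = score.sid where score.sid > random();
```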
We previously set a heap limit for tcmalloc to avoid OOM caused by tcmalloc allocating memory for its cache even when a machine has little free memory. However, Doris sometimes allocates large amounts of memory that go unused, so tcmalloc could throw an OOM exception even when the machine still has plenty of free memory.
We can set the limit again after that problem is fixed.
1. Add back the TPC-H regression test cases.
2. Fix the decimal problem on the aggregate functions sum and agg introduced by #13764.
3. Fix the memo merge group NPE introduced by #13900.
Support Aliyun DLF.
Support data on S3-compatible object storage, such as Aliyun OSS.
Refactor some catalog interfaces to make them tidier.
Fix a bug where the default text-format field delimiter of Hive should be \x01.
Add a new class PooledHiveMetaStoreClient to wrap the IMetaStoreClient.
In GraphSimplifier, we can use a simple cost to calculate the benefit,
and we only need to update recursively when the best neighbor of the applied step is the edge being processed.
The mem tracker can be logically divided into 4 layers: 1) process; 2) type; 3) query/load/compaction task, etc.; 4) exec node, etc.
The types are:
```
enum Type {
    GLOBAL = 0,        // Life cycle is the same as the process, e.g. Cache and default Orphan
    QUERY = 1,         // Count the memory consumption of all Query tasks.
    LOAD = 2,          // Count the memory consumption of all Load tasks.
    COMPACTION = 3,    // Count the memory consumption of all Base and Cumulative tasks.
    SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
    CLONE = 5,         // Count the memory consumption of all EngineCloneTask. Note: does not include the memory of make/release snapshots.
    BATCHLOAD = 6,     // Count the memory consumption of all EngineBatchLoadTask.
    CONSISTENCY = 7    // Count the memory consumption of all EngineChecksumTask.
};
```
Object pointers are no longer saved between layers; the values of the process and of each type are periodically aggregated.
Other fix:
In #13528 ([fix](memtracker) Fix transmit_tracker null pointer because phmap is not thread safe), I tried to separate the memory that was manually released in queries from the Orphan mem tracker. But in actual tests, the accuracy of this part of the memory could not be guaranteed, so it is put back into the Orphan mem tracker.
The to_bitmap function previously supported only a string parameter. This adds a to_bitmap() overload for the int type, which avoids converting an int to a string and then converting the string back to an int.
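A quick before/after sketch (using `bitmap_count` only to make the result printable; the literal values are illustrative):
```sql
-- before: only the string overload existed, so an int value had to
-- take a string round-trip
SELECT bitmap_count(to_bitmap('1234'));

-- after: the int overload avoids the round-trip
SELECT bitmap_count(to_bitmap(1234));
```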
When the load memory hard limit is reached, all load channels should wait on the lock of LoadChannelMgr until the current memory-reduction work is finished. In the current implementation, a bug might cause some threads to be woken up before that work finishes:
1. Thread A finds that the soft limit is reached, picks a load channel, and waits for the memory-reduction work to finish.
2. Memory keeps increasing.
3. Thread B finds that the hard limit is reached (either the load memory hard limit or the process soft limit), picks a load channel to reduce memory, and sets the variable _should_wait_flush to true.
4. Thread C finds that _should_wait_flush is true and waits on _wait_flush_cond.
5. Thread A finishes its memory-reduction work, finds that _should_wait_flush is true, sets it to false, and notifies all threads.
6. Thread C is woken up and picks a load channel to do memory-reduction work, while thread B's work is not yet finished.

As a result, two threads are doing memory-reduction work after the hard limit is reached, which is quite confusing.
1. Add a feature supporting statements with an aggregate function in the ORDER BY list, such as:
   ```sql
   SELECT COUNT(*) FROM t GROUP BY c1 ORDER BY COUNT(*) DESC;
   ```
2. Add ClickBench analyze unit tests.
This PR does three things:
1. Modified the framework of table-valued functions (tvf).
2. BE now supports the `fetch_table_schema` rpc.
3. Implemented the `S3(path, AK, SK, format)` table-valued function (see the sketch after this list).
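A hedged sketch of how the tvf might be invoked; the property names and values below are illustrative assumptions, since the PR only specifies the `S3(path, AK, SK, format)` shape:
```sql
-- read a CSV file on S3-compatible storage via the table-valued function
SELECT * FROM S3(
    "uri" = "https://your-bucket.s3.example.com/path/to/file.csv",
    "access_key" = "your_ak",
    "secret_key" = "your_sk",
    "format" = "csv"
);
```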
[What is DLF](https://www.alibabacloud.com/product/datalake-formation)
This PR is a preparation for supporting DLF, with some changes to multi catalog:
1. Add RuntimeException to most Hive metastore and ES client visit operations.
2. Add DLF-related dependencies.
3. Move the checks of ES catalog properties to the analysis phase of creating an ES catalog.
TODO (in next PR):
1. Refactor the `getSplit` method to support not only HDFS but also S3-compatible object storage.
2. Finish the implementation of DLF support.