Commit Graph

9538 Commits

Author SHA1 Message Date
c8e4684578 [enhancement](nereids)support topN opt in nereids (#17741)
1. support topN opt in nereids
2. pushdown limit->proj->sort
2023-03-27 18:57:56 +08:00
2785202816 [Bug](regression-test) be coredump in pipeline when grace exit in regression test (#18131) 2023-03-27 18:36:27 +08:00
894f38a517 [fix](planner) fix conjunct planned on exchange node (#18042)
A query such as:
select k5, k6, SUM(k3) AS k3 
from ( 
    select
        k5,
        date_format(k6, '%Y-%m-%d') as k6,
        count(distinct k3) as k3 
    from t 
    group by k5, k6
) AS temp where 1=1
group by k5, k6;

throws an exception because the conjunct is planned on an exchange node, which cannot handle conjuncts. The planner now skips exchange nodes when planning conjuncts, which fixes the bug.
Note: the bug occurs only if the conjunct is always true, like 1=1 above.
2023-03-27 17:50:52 +08:00
902629adb6 [fix](planner) fix targetTypeDef NPE when value is null (#18072)
Queries such as:
select * from (select *, null as top from v1)t where top = 5;
select * from (select *, null as top from v1)t where top is not null;
cause an NPE because targetTypeDef is null when the value is null. The cast target type is now used as the targetTypeDef.
2023-03-27 17:29:14 +08:00
cd85b5b262 [conf](nereids) disable new cost model since it hurts performance (#18127) 2023-03-27 16:12:15 +08:00
da8c53a831 [feat](Nereids): pushdown semijoin through agg. (#18105) 2023-03-27 15:27:44 +08:00
642c378fc7 [feature](table-valued-function) add Backends table-valued-function (#17667)
This PR implements a new metadata TVF called backends. A tutorial on the implementation process is in #17974.
2023-03-27 15:18:31 +08:00
1576130094 [ehancement](stats) Tune stats framework (#18118) 2023-03-27 14:38:10 +08:00
8b07021f5f [enhancement](regression-test) add hint to disable nereids planner for some cases (#18066) 2023-03-27 14:06:50 +08:00
f03598f214 [enhance](cooldown) no snapshot or migration action for cooldown tablet (#17658) 2023-03-27 13:35:32 +08:00
d1f34a3be4 [bugfix](inverted index)temporary disable skip read column data if it match inverted index (#18065)
The optimization that skips reading a column's data when the column matches an inverted index and is used only in the WHERE clause may return wrong results for complex SQL.

This PR temporarily disables the optimization; later PRs will resolve the problem fundamentally.
2023-03-27 11:29:42 +08:00
dc7b2015f5 eh (#18122) 2023-03-27 11:09:35 +08:00
2929a96224 [Refactor](inverted index cache) Use asc set instead of priority queue at the lru cache (#18033)
Use an ascending set instead of a priority queue in the LRU cache, to keep the lifecycle of the LRUHandle consistent between the sorted set and the LRU free list.
2023-03-27 10:27:37 +08:00
bcf95cd920 [feature](function)Add ST_Angle_Sphere function (#17919) 2023-03-27 10:14:46 +08:00
fd5dd9a391 [Opt](Pipeline) opt pipeline code in mult tablet (#17999) 2023-03-27 10:02:48 +08:00
990479e177 [refactor](memory) Query waits for memory free in Allocator, after memory exceed limit. (#18075)
Previously, after memory exceeded the limit, a query waited in the mem hook for memory to be freed. It now waits in the Allocator, which is more controllable and safer.
2023-03-27 09:06:03 +08:00
78abb40fdc [improvement](string) throw exception instead of log fatal if string column exceed total size limit (#17989)
Throw an exception instead of logging a fatal error if a string column exceeds the total size limit, so that the error can be caught and the query fails instead of causing the BE to exit.
2023-03-27 08:55:26 +08:00
c2dd005efb [fix](chore) fix BE compile and FE protoc artifact issue (#18120)
Add the <optional> header to solve the compilation issue.
Use 3.12.9 as the protoc.artifact version, because there is no 3.12.21.
See: https://repo.maven.apache.org/maven2/com/google/protobuf/protoc/
Remove the --show-progress argument of wget because older versions of wget do not support it.
2023-03-27 08:53:42 +08:00
5463ba6267 [doc](fqdn)fqdn and k8s doc (#17318) 2023-03-26 22:04:21 +08:00
1027dd52ba [feature](Nereids): Pushdown SemiJoin in RBO. (#18099) 2023-03-26 20:58:43 +08:00
304064653c [feature](log)check and log holding lock time when it exceeds threshold (#17965)
Lock contention in DatabaseTransactionMgr is sometimes fierce, which may lead to publish timeouts. I think we should log it when a lock is held longer than a threshold, to flag this contention.
2023-03-26 20:11:40 +08:00
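The idea behind #17965 (log when a lock is held longer than a threshold) can be sketched as below. This is only a minimal Python illustration, not the Doris FE code, which is Java; the names timed_lock, name, and threshold_s, and the default threshold value, are assumptions made for the example.

```python
import logging
import threading
import time
from contextlib import contextmanager

logger = logging.getLogger("lock_monitor")


@contextmanager
def timed_lock(lock, name, threshold_s=1.0):
    """Hold `lock` for the body of the `with` block and warn if it was held too long.

    All names here are hypothetical; they do not come from the Doris code base.
    """
    lock.acquire()
    start = time.monotonic()
    try:
        yield
    finally:
        held = time.monotonic() - start
        lock.release()
        # Log after releasing so that logging itself does not extend the hold time.
        if held > threshold_s:
            logger.warning(
                "lock %s held for %.3fs (threshold %.3fs)", name, held, threshold_s
            )
```

A caller would wrap a critical section as `with timed_lock(txn_lock, "DatabaseTransactionMgr"): ...`, so slow holders show up in the log without changing the locking behavior.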
e06c613f9a [fix](meta)Fix FE trying to repair a tablet which cannot be repaired. #17959 2023-03-26 20:11:14 +08:00
3e8b3d68fc [BugFix](jdbc catalog) fix OOM when jdbc catalog queries large data from doris #18067
When JDBC Catalog queries data from Doris, Doris sends the whole result to the client at once because it does not provide a cursor-based read method (that is, fetchBatchSize has no effect), which causes OOM on the client.

The MySQL protocol provides a streaming read method that Doris can use to avoid the OOM. Streaming requires setting fetchBatchSize = Integer.MIN_VALUE and using ResultSet.TYPE_FORWARD_ONLY and ResultSet.CONCUR_READ_ONLY.
2023-03-26 20:02:03 +08:00
a0b100d38e [enhancement](regression-test) prove setting default value to session var will be detected #18113 2023-03-26 12:56:15 +08:00
2a0890d803 [feature](datatype) add show data types stmt (#18111) 2023-03-26 12:37:06 +08:00
Pxl
45ad297a1d [Enchancement](function) change aggregate function creator to return AggregateFunctionPtr (#18025)
Change creator_type to return AggregateFunctionPtr.
Remove some functions and use the creator directly.
2023-03-26 11:41:34 +08:00
5df011cd43 [typo](doc)Add cancel create materialized view grammar #18084 2023-03-26 11:39:25 +08:00
0347ae4dbd [Enhancement](proc) sort result by backend id when show backends (#18112) 2023-03-26 11:30:47 +08:00
c63807ccfe [chore](be) reduce log when trying to do async write cooldown meta (#18107) 2023-03-26 11:10:21 +08:00
5846b3fc54 [fix](memory) Remove PODArray peak allocated memory tracking #18010
#11740 solved the problem that query memory statistics were higher than actual physical memory, which happens because PODArray does not memset 0 when allocating memory and the query mem tracker tracks virtual memory.

But in extreme cases, such as CSV load, frequent PODArray inserts cause performance problems, so this reverts part of #11740 and part of #12820.

There is currently no feedback on the accuracy of the query mem tracker, so it gets no further attention for now.
2023-03-26 09:45:10 +08:00
c5dcb633e9 [fix](hive)throw exception if complex type in text format table (#18013)
For the Hive text input format, the column types ARRAY/MAP/STRUCT are not supported yet.
They will be supported in later versions.

Co-authored-by: jinzhe <jinzhe@selectdb.com>
2023-03-25 23:26:52 +08:00
7c0bcbdca1 [enhance](parquet-reader) cache file meta of parquet to speed up query (#18074)
Problem:
1. FE divides a parquet file into splits, so one file can have several splits.
2. BE scans each split and reads the footer of the parquet file.
3. If 2 splits belong to the same parquet file, the footer of this file is read twice.

This PR mainly changes:
1. Use a KV cache to cache the footer of the parquet file.
2. The KV cache belongs to a scan node, so all parquet readers belonging to this scan node share the same KV cache.
3. In the cache, the key is "meta_file_path" and the value is the parsed thrift footer.

The KV cache is sharded into multiple sub-caches, so that different files can use different sub-caches and avoid blocking each other.

In my test, a query with 26 splits reduced the footer parse time from 4s to 1s.
2023-03-25 23:22:57 +08:00
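The sharded footer cache described in #18074 can be sketched roughly as follows. This is a hedged Python illustration and not the BE implementation, which is C++; ShardedKVCache, get_or_load, read_footer, and parse_footer are hypothetical names, and building the key as "meta_" plus the file path is one assumed reading of the "meta_file_path" key mentioned in the commit message.

```python
import threading
import zlib


class ShardedKVCache:
    """Cache split into independently locked sub-caches (illustrative only)."""

    def __init__(self, num_shards=8):
        # Each shard has its own dict and lock, so different shards never block each other.
        self._shards = [({}, threading.Lock()) for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash picks the shard, so one file always maps to the same sub-cache.
        index = zlib.crc32(key.encode("utf-8")) % len(self._shards)
        return self._shards[index]

    def get_or_load(self, key, loader):
        """Return the cached value for `key`, calling `loader()` only on a miss."""
        entries, lock = self._shard_for(key)
        with lock:
            if key not in entries:
                entries[key] = loader()
            return entries[key]


def read_footer(cache, file_path, parse_footer):
    # `parse_footer` stands in for reading and parsing the thrift footer of the file.
    return cache.get_or_load("meta_" + file_path, lambda: parse_footer(file_path))
```

With one cache instance per scan node, every split of the same file hits the same key, so the footer is parsed once. In this sketch the loader runs while the shard lock is held, which keeps it simple at the cost of blocking other keys in the same shard during a parse.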
96f274b8f3 [fix](global-variable) fix bug that set default value for global variable will cause NullPointerException (#18004) 2023-03-25 22:45:26 +08:00
df0eca4003 [improvement] (schema change) Lightweight schema change of modify column with varchar length (#17207)
Signed-off-by: Yisong Han <yisong8686@gmail.com>
2023-03-25 22:38:19 +08:00
74fdb6c116 [refactor](regression-test) refactor ssl test from p0 to p2 (#17847) 2023-03-25 22:37:26 +08:00
cb6fca95b2 [fix](lambda-func) fix lambda functions exception message errors (#18068) 2023-03-25 22:36:55 +08:00
360d3050bc [Feature](array-function) Support array_reverse_sort function (#17754)
Co-authored-by: zhangyu209 <zhangyu209@meituan.com>
2023-03-25 21:58:11 +08:00
50eeb2d9a4 [fix](json) change int to bigint for json function (#17769) 2023-03-25 21:57:29 +08:00
855852d582 [enhancement](timeout) fix set timeout failure and simplify timeout logic (#17837) 2023-03-25 21:56:06 +08:00
193ae352e4 [fix](coalesce) fix problem that coalesce function may cause problem of block mem reuse (#17940) 2023-03-25 21:50:37 +08:00
Pxl
a8753faeb1 [Bug](function) fix column complex not resize after filter (#18043) 2023-03-25 21:48:13 +08:00
77c9550420 [fix](bitmapfilter) fix bitmap filter timeout unit error (#18110) 2023-03-25 21:46:32 +08:00
f9013f2668 [feature](Nereids): pullup all semijoin through join. (#18106) 2023-03-25 20:25:28 +08:00
f36465e76e [enhancement](memory) optimize jemalloc heap profile doc (#18094) 2023-03-25 13:04:45 +08:00
7ae51c856e [refactor](unify exception) unify exception definition and error code (#18006)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-25 12:41:07 +08:00
f84481886b [feature](string_functions) The 'split_part' function supports non-constant parameters (#18029) 2023-03-25 12:03:11 +08:00
e0518fd19d [fix](nereids)remove redundant visit call in Validator (#18103) 2023-03-25 11:41:34 +08:00
1164611393 [enhancement](planner) fix unclear exception msg when create mv (#17537)
A materialized view's FROM clause can only be a single table, not a subquery, but the exception message was an NPE. This PR changes it to a clear message.
2023-03-25 11:36:40 +08:00
2408ca5da8 [Bug](DECIMALV3) Fix wrong precision for plus/minus (#18052)
Result type for DECIMAL(x, y) plus/minus DECIMAL(m, n) should be DECIMAL(max(x - y, m - n) + max(y, n) + 1, max(y, n))
2023-03-25 09:42:39 +08:00
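The result-type rule from #18052 can be checked with a small worked example. The sketch below assumes the scale term is max(y, n), that is, the larger of the two operand scales, and it ignores any upper bound on precision that the engine may enforce; add_sub_result_type is a hypothetical helper name, not a Doris function.

```python
def add_sub_result_type(x, y, m, n):
    """Return (precision, scale) for DECIMAL(x, y) +/- DECIMAL(m, n).

    The scale is the larger operand scale; the integer part is the larger
    operand integer part plus one extra digit for a possible carry.
    """
    scale = max(y, n)
    integer_digits = max(x - y, m - n)
    return integer_digits + scale + 1, scale


# DECIMAL(10, 2) + DECIMAL(8, 4): 8 integer digits, scale 4, plus 1 carry digit.
print(add_sub_result_type(10, 2, 8, 4))  # (13, 4)
```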
b2c70b51cc [refactor](vectorized) delete row-based AnyVal and DateTimeVal (#18093) 2023-03-25 09:40:04 +08:00