doris

Author	SHA1	Message	Date
seawinde	e6a6b82201	[nereids](mtmv) Support rewrite by mv nested materialized view (#33362 ) Support query rewriting by nested materialized views. For example, `inner_mv` is defined as: `select l_linenumber, o_custkey, o_orderkey, o_orderstatus, l_partkey, l_suppkey, l_orderkey from lineitem inner join orders on lineitem.l_orderkey = orders.o_orderkey;` and `mv1_0` is defined as: `select l_linenumber, o_custkey, o_orderkey, o_orderstatus, l_partkey, l_suppkey, l_orderkey, ps_availqty from inner_mv inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey;` For the following query, both inner_mv and mv1_0 can be used successfully for materialized-view rewriting, and the CBO finally chooses `mv1_0`: `select lineitem.l_linenumber from lineitem inner join orders on l_orderkey = o_orderkey inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey where o_orderstatus = 'o' AND l_linenumber in (1, 2, 3, 4, 5)`	2024-04-21 09:55:34 +08:00
Chester	687951202f	[refactor](opt) move BE code of hll scalar functions together, optimize head files (#33757 ) In this PR, we moved the BE code of the hll scalar functions into one place so it is easier to manage, as the bitmap functions file already does. We also cleaned up the header includes by removing the unused "vec/aggregate_functions/aggregate_function.h" and "boost/iterator/iterator_facade.hpp", and by using cstddef and cstdint instead of stddef.h and stdint.h.	2024-04-21 09:55:19 +08:00
yujun	89441b0cb0	[fix](tablet invert index) fix tablet invert index leaky caused by auto partition (#33714 )	2024-04-21 09:54:50 +08:00
minghong	60253c827c	[fix](nereids) do not push RF into nested cte (#33769 )	2024-04-20 20:08:00 +08:00
Mingyu Chen	27662c3d62	[fix](row-count-cache) use cached row count for show (#33911 )	2024-04-20 20:06:58 +08:00
Tiewei Fang	36a70ba1e7	[Fix](Csv-Reader)Fix the issue of BE core dump caused by improper configuration of column_seperator and line_delimiter. (#33693 )	2024-04-20 20:06:48 +08:00
wangbo	03c3419265	[Refactor](executor)Add workload schedule policy table (#33729 )	2024-04-20 20:06:34 +08:00
Mingyu Chen	0e3ad5cd9d	[fix](parquet) fix time zone error(isAdjustedToUTC=true) in parquet reader (#33675 ) (#33924 ) bp (#33675) Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>	2024-04-20 19:06:54 +08:00
wuwenchi	f38b00b64a	[bugfix](hive)Modify the method used to obtain the txnId #33883	2024-04-20 11:43:58 +08:00
HHoflittlefish777	1ca96a1611	[fix](stream-load) fix stream load and http stream metric error #33899	2024-04-20 11:43:49 +08:00
谢健	80307354b2	[fix](Nereids): add whole tree rewriter when root is not CTEAnchor (#33591 ) (#33906 )	2024-04-20 01:05:57 +08:00
Mingyu Chen	cb5a94d4fc	[opt](coord) use single instance only with small limit (#33888 )	2024-04-20 00:45:20 +08:00
Mingyu Chen	00adad8e2d	[fix](variable) modify @@auto_commit column type to BIGINT in Nereids. #33887	2024-04-20 00:45:11 +08:00
yiguolei	365fcec473	Revert "[Improvementation](join) empty_block shall be set true when build block only one row (#33721 )" This reverts commit f17ac173b4e8052cb130119bdec649169f66ac4e.	2024-04-19 23:52:24 +08:00
Pxl	09b973db49	[Chore](runtime-filter) adjust need_local_merge setting conditions (#33886 )	2024-04-19 23:50:04 +08:00
zclllyybb	5bddbcd933	[chore](errmsg) Fix confusing error message and clang tidy hints (#33893 )	2024-04-19 23:41:50 +08:00
zclllyybb	42e91149e4	[enhancement](auto-partition) Forbid use Auto and Dynamic partition at the same time (#33736 )	2024-04-19 23:41:46 +08:00
Xinyi Zou	bec7c36c46	[fix](stacktrace) Fix dwarf_location_info_mode is passed as parameter to stack trace (#33863 ) dwarf_location_info_mode is passed as parameter to stack trace	2024-04-19 23:41:46 +08:00
Xinyi Zou	ee687a43fd	[fix](plsql) Fix regression test for routine select (#33860 ) fix #33608, more comprehensive test	2024-04-19 23:41:46 +08:00
Uniqueyou	f2a0ac8ff2	[feature] (partition) Dynamic partition behavior changes (#33712 )	2024-04-19 23:41:46 +08:00
zclllyybb	25358564ca	[Fix](compile) Fix gcc compile on master (#33864 ) The breakage was introduced by #33511, which wrongly used ColumnStr<T> (); this violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still accepted by clang for now (see llvm/llvm-project#58112).	2024-04-19 23:41:37 +08:00
Mryange	74590e4836	[refine](node) Remove the cse DCHECK from the constructor (#33856 ) It's possible that a failure in the fe caused the check to fail, and at that moment, it may not be possible to retrieve the corresponding query ID from be.out.	2024-04-19 23:41:37 +08:00
Sun Chenyang	7e91e69eb9	[fix](compaction) fix single compaction (#33907 ) * [fix](compaction)Fix single compaction to get all local versions #33849 add test and comment * remove single replica compaction prepare input rowsets revised	2024-04-19 23:30:25 +08:00
Luwei	439027119e	[fix](schema change) fix schema change check does not calculate reader merged rows (#33825 ) (#33908 )	2024-04-19 22:57:25 +08:00
HappenLee	0ac7849a9d	[exec](table_fun) opt bitmap/split vexplode table func performance (#33876 )	2024-04-19 15:22:14 +08:00
924060929	15f8014e4e	[enhancement](Nereids) Enable parse sql from sql cache and fix some bugs (#33867 ) * [enhancement](Nereids) Enable parse sql from sql cache (#33262) Before this pr, the query must pass through parser, analyzer, rewriter, optimizer and translator, then we can check whether this query can use sql cache, if the query is too long, or the number of join tables too big, the plan time usually >= 500ms. This pr reduce this time by skip the fashion plan path, because we can reuse the previous physical plan and query result if no any changed. In some cases we should not parse sql from sql cache, e.g. table structure changed, data changed, user policies changed, privileges changed, contains non-deterministic functions, and user variables changed. In my test case: query a view which has lots of join and union, and the tables has empty partition, the query latency is about 3ms. if not parse sql from sql cache, the plan time is about 550ms ## Features 1. use Config.sql_cache_manage_num to control how many sql cache be reused in on fe 2. if explain plan appear some plans contains `LogicalSqlCache` or `PhysicalSqlCache`, it means the query can use sql cache, like this: ```sql mysql> set enable_sql_cache=true; Query OK, 0 rows affected (0.00 sec) mysql> explain physical plan select * from test.t; +----------------------------------------------------------------------------------+ \| Explain String(Nereids Planner) \| +----------------------------------------------------------------------------------+ \| cost = 3.135 \| \| PhysicalResultSink[53] ( outputExprs=[c1#0, c2#1] ) \| \| +--PhysicalDistribute[50]@0 ( stats=3, distributionSpec=DistributionSpecGather ) \| \| +--PhysicalOlapScan[t]@0 ( stats=3 ) \| +----------------------------------------------------------------------------------+ 4 rows in set (0.02 sec) mysql> select * from test.t; +------+------+ \| c1 \| c2 \| +------+------+ \| 1 \| 2 \| \| -2 \| -2 \| \| NULL \| 30 \| +------+------+ 3 rows in set (0.05 sec) mysql> explain physical plan select * from test.t; +-------------------------------------------------------------------------------------------+ \| Explain String(Nereids Planner) \| +-------------------------------------------------------------------------------------------+ \| cost = 0.0 \| \| PhysicalSqlCache[2] ( queryId=78511f515cda466b-95385d892d6c68d0, backend=127.0.0.1:9050 ) \| \| +--PhysicalResultSink[52] ( outputExprs=[c1#0, c2#1] ) \| \| +--PhysicalDistribute[49]@0 ( stats=3, distributionSpec=DistributionSpecGather ) \| \| +--PhysicalOlapScan[t]@0 ( stats=3 ) \| +-------------------------------------------------------------------------------------------+ 5 rows in set (0.01 sec) ``` (cherry picked from commit 03bd2a337d4a56ea9c91673b3bd4ae518ed10f20) * fix * [fix](Nereids) fix some sql cache consistence bug between multiple frontends (#33722) fix some sql cache consistence bug between multiple frontends which introduced by [enhancement](Nereids) Enable parse sql from sql cache #33262, fix by use row policy as the part of sql cache key. support dynamic update the num of fe manage sql cache key (cherry picked from commit 90abd76f71e73702e49794d375ace4f27f834a30) * [fix](Nereids) fix bug of dry run query with sql cache (#33799) 1. dry run query should not use sql cache 2. fix test sql cache in cloud mode 3. enable cache OneRowRelation and EmptyRelation in frontend to skip parse sql (cherry picked from commit dc80ecf7f33da7b8c04832dee88abd09f7db9ffe) * remove cloud mode * remove @NotNull	2024-04-19 15:22:14 +08:00
Xinyi Zou	c747714c18	[fix](memory) Fix ExecEnv destroy memory tracking (#33781 ) disable memory tracking when ExecEnv destroy. fix memory tracker label convert to query id	2024-04-19 15:03:10 +08:00
On-Work-Song	f4704b3821	[improvement](storage) support glibc <2.21 for system call eventfd (#33218 ) support glibc <2.21 for system call eventfd	2024-04-19 15:03:10 +08:00
Pxl	175e85d616	[Bug](runtime-filter) fix coredump on no null string type rf (#33869 ) fix coredump on no null string type rf	2024-04-19 15:03:06 +08:00
abmdocrt	8b061c7055	[Enhancement](group commit) Add fault injection case for group commit	2024-04-19 15:03:06 +08:00
Kang	ad75b9b142	[opt](auto bucket) add fe config autobucket_max_buckets (#33842 )	2024-04-19 15:03:06 +08:00
HHoflittlefish777	e38d844d40	[fix](multi-table-load) fix single stream multi table load cannot finish (#33816 )	2024-04-19 15:03:06 +08:00
airborne12	659900040f	[Fix](inverted index) fix wrong need read data opt when encounters columnA > columnB predicate (#33855 )	2024-04-19 15:03:06 +08:00
wuwenchi	1a6f8c443e	[bugfix](paimon) Create paimon catalog with hadoop user (#33833 ) When creating a catalog, paimon will create a warehouse on HDFS, so we need to use the corresponding user with permissions to create it.	2024-04-19 15:02:56 +08:00
feiniaofeiafei	6776a3ad1b	[Fix](planner) fix create view star except and modify cast to sql (#33726 )	2024-04-19 15:02:49 +08:00
feiniaofeiafei	a8ba933947	[Fix](nereids) fix bind order by expression logic (#33843 )	2024-04-19 15:02:49 +08:00
Kaijie Chen	ffd9da44a2	[fix](move-memtable) fix commit may fail due to duplicated reports (#32403 )	2024-04-19 15:02:49 +08:00
Mingyu Chen	2675e94a93	[feature](variable) add read_only and super_read_only (#33795 )	2024-04-19 15:02:21 +08:00
zhannngchen	56eb5ea00c	[enhancement](partial-update) print more log while missed some rowsets (#33711 )	2024-04-19 15:01:57 +08:00
meiyi	5abc84af71	[fix](txn insert) Fix txn insert commit failed when schema change (#33706 )	2024-04-19 15:01:57 +08:00
Tiewei Fang	315f6e44c2	[Branch-2.1](Outfile) Fixed the problem that the concurrent Outfile wrote multiple Success files (#33870 ) backport: #33016	2024-04-19 12:09:53 +08:00
slothever	561afde0c4	[feature](insert)support default value when create hive table (#33666 ) Issue Number: #31442 Hive 3 supports creating tables with column default values; when using Hive 3, default values can be written to the table.	2024-04-19 11:31:33 +08:00
wuwenchi	734520a77b	[bugfix](hive)delete write path after hive insert (#33798 ) Issue #31442 1. delete file according query id 2. delete write path after insert	2024-04-19 11:31:25 +08:00
Pxl	ba05ef4405	[Chore](runtime-filter) add tmp debug info to investigate unknown filter error #33857	2024-04-18 21:03:09 +08:00
HappenLee	1300317723	[Exec](join) Support column string64 to avoid join failed in string size overflow the uint32 (#33511 ) (#33850 )	2024-04-18 19:43:08 +08:00
lihangyu	8f6f4cf0eb	[Pick](Variant) pick #33734 #33766 #33707 to branch-2.1 (#33848 ) * [Fix](Variant Type) forbid distribution info containing variant columns (#33707) * [Fix](Variant) VariantRootColumnIterator::read_by_rowids with wrong null map size (#33734) insert_range_from should start from `size` with `count` elements for null map * [Fix](Variant) check column index validation for extracted columns (#33766)	2024-04-18 19:42:44 +08:00
walter	c8a92b82cc	[fix](restore) Reset index id for MaterializedIndexMeta (#33831 )	2024-04-18 19:05:24 +08:00
jakevin	46fa64f34b	[minor](Nereids): remove useless getFilterConjuncts() filter() in Translator (#33801 )	2024-04-18 19:05:24 +08:00
wuwenchi	3eca9da0dd	[refactor](filesystem)refactor `filesystem` interface (#33361 ) 1. Rename `list` to `globList`. The path passed to `list` must contain a wildcard character, and the corresponding hdfs interface is `globStatus`, hence the new name `globList`. 2. If you only need to list files under a path, use the `listFiles` operation. 3. Merge the `listLocatedFiles` function into the `listFiles` function.	2024-04-18 19:05:24 +08:00
谢健	34a97d5e8b	[fix](Nereids)fix unstable plan shape in limit_push_down case	2024-04-18 19:05:24 +08:00

18429 Commits