Fix a DCHECK failure in the JSON reader caused by a missing TYPE_STRING case.
Fix a bug where the TVF throws an NPE if no file is found.
Predicate conjuncts cannot be pushed down to the parquet reader for a load task, because the predicates apply to columns of the destination table, not to columns of the source file.
Add a temporary broker load property, "use_new_load_scan_node", to make the regression tests happy, so that the new load scan node can be enabled for a specific job without setting a global FE config.
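A minimal Broker Load sketch illustrating both points; the paths, broker, table, and column names are hypothetical, and placing the temp property in the job-level PROPERTIES clause is an assumption:

```sql
-- Hypothetical job: the WHERE predicate references dest-table column k1,
-- which only exists after the SET mapping, so it cannot be pushed down
-- into the parquet reader.
LOAD LABEL example_db.label_1
(
    DATA INFILE("hdfs://host:8020/path/file.parquet")
    INTO TABLE dest_tbl
    FORMAT AS "parquet"
    (c1, c2)
    SET (k1 = c1 + 1, k2 = c2)
    WHERE k1 > 10
)
WITH BROKER "hdfs_broker"
PROPERTIES
(
    -- assumed placement of the temp property described above
    "use_new_load_scan_node" = "true"
);
```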
Fix a bug where ArithmeticExpr's write method was not implemented, causing an FE crash when creating a function like:
CREATE ALIAS FUNCTION IF NOT EXISTS mesh_udf_test1(INT,INT) WITH PARAMETER(n,d) AS ROUND(1+floor(n/d));
Add IF EXISTS and IF NOT EXISTS support to DROP FUNCTION and CREATE FUNCTION.
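For example (function name hypothetical, mirroring the statement above):

```sql
-- Create only if absent; drop only if present.
CREATE ALIAS FUNCTION IF NOT EXISTS my_div_round(INT, INT)
    WITH PARAMETER(n, d) AS ROUND(1 + FLOOR(n / d));
DROP FUNCTION IF EXISTS my_div_round(INT, INT);
```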
Fix a minor bug where the hdfs() table valued function throws an NPE if the file does not exist.
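A hedged repro sketch, assuming the documented hdfs() TVF parameters; the host and path are hypothetical:

```sql
-- Before the fix, pointing the TVF at a missing file raised an NPE
-- instead of a proper error.
SELECT * FROM hdfs(
    "uri" = "hdfs://host:8020/path/does_not_exist.csv",
    "format" = "csv"
);
```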
Remove constant keys from GROUP BY. For example, before applying the rule:
select 1, k1, min(k2), max(k3) from t1 group by 1, 2;
After applying the rule:
select 1, k1, min(k2), max(k3) from t1 group by k1;
Fix the error below, raised when applying an aggregate to elements extracted from a retention() result:

MySQL [db]> SELECT SUM(a.r[1]) as active_user_num,
       SUM(a.r[2]) as active_user_num_1day,
       SUM(a.r[3]) as active_user_num_3day,
       SUM(a.r[4]) as active_user_num_7day
FROM (
    SELECT user_id,
           retention(day = '2022-11-01', day = '2022-11-02',
                     day = '2022-11-04', day = '2022-11-07') as r
    FROM login_event
    WHERE (day >= '2022-11-01') AND (day <= '2022-11-21')
    GROUP BY user_id
) a;
ERROR 1105 (HY000): errCode = 2, detailMessage = sum requires a numeric parameter: sum(%element_extract%(a.r, 1))
Remove projections that are used only for column pruning and do not perform any expression calculation, so that we can avoid some redundant data copies in `do_projection` on the BE side.
1. When choosing a broadcast join, we only consider transferring less data. This may lead to OOM if the hash table is big enough.
2. Fix a bug in `Stats.computeSize()`: `ColumnStats.dataSize` is the total size of the column, but we need the size in bytes of a single cell.
BE crashes when querying a partitioned Hive table in text format with a partition column placed first in the select list.
1. FE should use file slots to set the column mapping index of the csv file.
2. BE should use the block's `get_by_name` to get the right column from a block in the csv reader.
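A hedged repro sketch of the crashing query; catalog, table, and column names are hypothetical:

```sql
-- Selecting the partition column first from a text-format Hive table
-- triggered the crash before this fix.
SELECT part_col, c1, c2
FROM hive_catalog.db1.text_tbl
WHERE part_col = '2022-11-01';
```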
This PR does two things:
1. 【new logical plan】add **LogicalCheckPolicy** before UnboundRelation in LogicalPlanBuilder.
2. 【new rule】turn **LogicalCheckPolicy** into a LogicalFilter if a row policy exists, otherwise remove it.
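A minimal sketch, assuming Doris's row policy DDL; policy, table, and user names are hypothetical:

```sql
-- With such a policy in place, LogicalCheckPolicy is rewritten to a
-- LogicalFilter over the USING predicate for test_user; without any
-- policy, the node is simply removed.
CREATE ROW POLICY test_policy ON db1.tbl1
    AS RESTRICTIVE TO test_user USING (c1 = 'a');
```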
Use a cache to store external table columns and stop persisting unique IDs for external columns; use -1 as the column ID for ES external tables. This avoids the problem of a non-master FE trying to get a unique ID, which would cause the non-master FE to fail to write bdbje.
1. Support IN bitmap syntax, like `where k1 in (select bitmap_column from tbl)`.
2. Support bitmap runtime filter: generate a bitmap filter from the right table's bitmap and push it down to the left table's storage layer for filtering.
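A hedged end-to-end example of the new syntax; table and column names are hypothetical:

```sql
-- The subquery produces a bitmap; it is turned into a bitmap runtime
-- filter and pushed down to t1's storage layer.
SELECT count(*) FROM t1
WHERE t1.k1 IN (SELECT bitmap_col FROM t2);
```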
This PR contributes:
- support EXPLAIN for CTEs;
- refine CTEs and fix a bug where the same analyzed plan was reused when LogicalOlapScan had the same relationId;
- change EliminateAliasNode to LogicalSubQueryAliasToLogicalProject and move it to the top of the rewrite stage, so we can easily observe the analyzed plan via LogicalSubQueryAlias with its alias;
- jobs traverse the left child first, so ExprIds grow from the left child to the right child.
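A sketch of the newly supported EXPLAIN over a CTE; table and column names are hypothetical:

```sql
-- Referencing the same CTE twice is the kind of plan the relationId
-- fix above is concerned with.
EXPLAIN
WITH cte1 AS (SELECT k1, k2 FROM t1)
SELECT a.k1 FROM cte1 a JOIN cte1 b ON a.k1 = b.k1;
```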
In #14482, we implemented a feature to keep a specific number of metas with the same name in the catalog recycle bin.
But it caused a meta replay bug: every time we drop a db/table/partition, it tries to erase a certain number of metas with the same name, and replaying the "drop" edit log does the same thing. However, the number of metas to erase is based on the current config value and is not persisted in the edit log, which causes inconsistency between "drop" and "replay drop".
In this PR, I move the "erase meta with same name" logic to the daemon thread of the catalog recycle bin.
This PR makes hash table sharing for broadcast join more robust:
- Add a session variable to enable/disable this feature.
- Do not block the hash join node's close function.
- Use a shared pointer to share the hash table and runtime filter among broadcast join nodes.
- A hash join node that does not need to build the hash table closes its right child without reading any data (the child then closes the corresponding sender).
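A sketch of toggling the feature; the PR does not name the session variable, so the name below is an assumption for illustration:

```sql
-- Assumed variable name; enables sharing one hash table across
-- broadcast join instances in the same BE.
SET enable_share_hash_table_for_broadcast_join = true;
```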
Issue Number: close #13615
The main work:
- implement grouping sets / cube / rollup;
- fix an infinite loop problem in the `if` function;
- support translating isNull to the legacy optimizer.
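A minimal sketch of the now-supported forms; table and column names are hypothetical:

```sql
-- Standard GROUPING SETS / ROLLUP / CUBE syntax.
SELECT k1, k2, SUM(v) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), ());
SELECT k1, k2, SUM(v) FROM t GROUP BY ROLLUP (k1, k2);
SELECT k1, k2, SUM(v) FROM t GROUP BY CUBE (k1, k2);
```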
1. In dateV2, adjust the directory structure to avoid creating a tpch-1G database.
2. Use `drop table XXX` instead of `delete * from XXX where key>0`.
3. Remove the explain cases, because:
- the explain string itself is variable, making the cases hard to maintain;
- they cover the original planner's explain output, not Nereids'.
Previously, we used the "Date" type for cooldownTime in StoragePolicy.
But Gson serializes the Date type differently on Java 8 and Java 11, which may cause inconsistent meta errors.
This PR uses Long to store cooldownTime.
Note that in FE the cooldownTime is stored in milliseconds, while in BE it is stored in seconds.
The original code used a Set to store the partition columns of an HMS external table, which could not guarantee column order, so a column name could be paired with the wrong column type.
Use a List instead of a Set to fix the problem.
Nereids assigns fragment IDs in its own way, so the fragment ID in explain differs from the fragment ID in the profile. This difference makes profiles hard to understand.
This PR prints the same fragment ID in explain as in the profile.