doris

Author	SHA1	Message	Date
camby	c5b6056b7a	[fix](lateral_view) fix lateral view explode_split with temp table (#12643 ) Problem description: the following SQL returns a wrong result: WITH example1 AS ( select 6 AS k1 ,'a,b,c' AS k2) select k1, e1 from example1 lateral view explode_split(k2, ',') tmp as e1; Wrong result: +------+------+ \| k1 \| e1 \| +------+------+ \| 0 \| a \| \| 0 \| b \| \| 0 \| c \| +------+------+ Correct result should be: +------+------+ \| k1 \| e1 \| +------+------+ \| 6 \| a \| \| 6 \| b \| \| 6 \| c \| +------+------+ Cause: TableFunctionNode::outputSlotIds does not include column k1. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-21 09:19:18 +08:00
Gabriel	d5486726de	[Bug](date) Fix wrong result produced by date function (#12720 )	2022-09-20 21:09:26 +08:00
Gabriel	cc072d35b7	[Bug](date) Fix wrong type in TimestampArithmeticExpr (#12727 )	2022-09-20 21:08:48 +08:00
caiconghui	bb7206d461	[refactor](SimpleScheduler) refactor code for getting available backend in SimpleScheduler (#12710 )	2022-09-20 18:08:29 +08:00
mch_ucchi	47797ad7e8	[feature](Nereids) Push down not slot references expression of on clause (#11805 ) pushdown not slotreferences expr of on clause. select * from t1 join t2 on t1.a + 1 = t2.b + 2 and t1.a + 1 > 2 project() +---join(t1.a + 1 = t2.b + 2 && t1.a + 1 > 2) \|---scan(t1) +---scan(t2) transform to project() +---join(c = d && c > 2) \|---project(t1.a -> t1.a + 1) \| +---scan(t1) +---project(t2.b -> t2.b + 2) +---scan(t2)	2022-09-20 13:41:54 +08:00
minghong	d83eb13ac5	[enhancement](nereids) use Literal promotion to avoid unnecessary cast (#12663 ) Instead of adding a cast function on a literal, we directly change the literal's type. This change saves cast execution time and memory. For example, in the SQL "CASE WHEN l_orderkey > 0 THEN ...", 0 is a TinyIntLiteral. Before this PR: "CASE WHEN l_orderkey > CAST (TinyIntLiteral(0) AS INT)". With this PR: "CASE WHEN l_orderkey > IntegerLiteral(0)".	2022-09-20 11:15:47 +08:00
morrySnow	954c44db39	[enhancement](Nereids) compare LogicalProperties with output set instead of output list (#12743 ) We previously used the output list to compare two LogicalProperties. Join reorder changes the children order of a join plan and therefore changes the output list, so the two join plans are no longer equal in the memo although they should be. To keep the LogicalProperties the same, we had to add a project on top of the new join. This PR changes the equals and hashCode functions of LogicalProperties to compare a set of outputs instead. We then no longer need to add the top project, which helps keep the memo simple and efficient.	2022-09-20 10:55:29 +08:00
starocean999	4f27692898	[fix](inlineview)the inlineview's slots' nullability property is not set correctly (#12681 ) The output slots of an inline view may come from the nullable-side table of an outer join, so they should be nullable.	2022-09-20 09:29:15 +08:00
ElvinWei	e1d2f82d8e	[feature](statistics) template for building internal query SQL statements (#12714 ) Template for building internal query SQL statements; it is mainly used by the statistics module. After the template is defined, the executable statement is built from the given parameters. For example, template and parameters: - template: `SELECT ${col} FROM ${table} WHERE id = ${id};`, - parameters: `{col=colName, table=tableName, id=1}` - result sql: `SELECT colName FROM tableName WHERE id = 1;` usage: ``` String template = "SELECT * FROM ${table} WHERE id = ${id};"; Map<String, String> params = new HashMap<>(); params.put("table", "table0"); params.put("id", "123"); // result: SELECT * FROM table0 WHERE id = 123; String result = InternalSqlTemplate.processTemplate(template, params); ```	2022-09-19 22:10:28 +08:00
mch_ucchi	94d73abf2a	[test](Nereids) runtime filter unit cases not rely on NereidPlanner to generate PhysicalPlan anymore (#12740 ) This PR: 1. add rewrite and implement method to PlanChecker 2. improve unit tests of runtime filter	2022-09-19 19:53:55 +08:00
ElvinWei	1339eef33c	[fix](statistics) remove statistical task multiple times in one loop cycle (#12741 ) There is a problem with StatisticsTaskScheduler. The peek() method obtains a reference to the same task object, but the for-loop executes multiple removes.	2022-09-19 19:28:51 +08:00
jakevin	4b5cc62348	[refactor](Nereids) rename transform to applyExploration UT helper class PlanChecker (#12725 )	2022-09-19 16:49:56 +08:00
ElvinWei	08a71236a9	[feature](statistics) Internal-query, execute SQL query statement internally in FE (#9983 ) Execute SQL query statements internally (in FE). Internal-query is mainly used by the statistics module: FE obtains statistics from BE via SQL, such as a column's maximum value, minimum value, etc. This is a utility module for statistics; it does not affect the original code or how users use the system. The simple usage process is as follows (the following code does no exception handling): ``` String dbName = "test"; String sql = "SELECT * FROM table0"; InternalQuery query = new InternalQuery(dbName, sql); InternalQueryResult result = query.query(); List<ResultRow> resultRows = result.getResultRows(); for (ResultRow resultRow : resultRows) { List<String> columns = resultRow.getColumns(); for (int i = 0; i < resultRow.getColumns().size(); i++) { resultRow.getColumnIndex(columns.get(i)); resultRow.getColumnName(i); resultRow.getColumnType(columns.get(i)); resultRow.getColumnType(i); resultRow.getColumnValue(columns.get(i)); resultRow.getColumnValue(i); } } ```	2022-09-19 16:26:54 +08:00
jakevin	399af4572a	[improve](Nereids) improve join cost model (#12657 )	2022-09-19 16:25:30 +08:00
Jibing-Li	5978fd9647	[refactor](file scanner)Refactor file scanner. (#12602 ) Refactor the scanners for hms external catalog, work in progress. Use VFileScanner, will remove NewFileParquetScanner, NewFileOrcScanner and NewFileTextScanner after fully tested. Query for parquet file has been tested, still need to add readers for orc file, text file and load logic as well.	2022-09-19 15:23:51 +08:00
jakevin	75d7de89a5	[improve](Nereids) Add all slots used by onClause to project when reorder and fix reorder mark (#12701 ) 1. Add all slots used by onClause in project ``` (A & B) & C like join(hash conjuncts: C.t2 = A.t2) \|---project(A.t2) \| +---join(hash conjuncts: A.t1 = B.t1) \| +---A \| +---B +---C transform to (A & C) & B join(hash conjuncts: A.t1 = B.t1) \|---project(A.t2) \| +---join(hash conjuncts: C.t2 = A.t2) \| +---A \| +---C +---B ``` But the projection includes only `A.t2`, so `A.t1` cannot be found; we should add the slots used by onClause when a projection exists. 2. fix join reorder mark Add mark `LAsscom` when applying `LAsscom` 3. remove slotReference use `Slot` instead of `SlotReference` to avoid cast.	2022-09-19 11:01:25 +08:00
Mingyu Chen	a4ed023bad	[fix](colocation) fix decommission failure with 2 BEs and colocation table (#12644 ) This PR fix: 2 Backends. Create tables with colocation group, 1 replica. Decommission one of Backends. The tablet on decommissioned Backend is not reduced. This is a bug of ColocateTableCheckerAndBalancer.	2022-09-19 08:34:50 +08:00
abmdocrt	4f98146e83	[enhancement](tracing) Support forward to master tracing (#12290 )	2022-09-18 17:39:04 +08:00
Lightman	e01986b8b9	[feature](light-schema-change) fix light-schema-change and add more cases (#12160 ) Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load. Tablet schema cache support recycle when schema sptr use count equals 1. Add a http interface for flink-connector to sync ddl. Improve tablet->tablet_schema() by max_version_schema.	2022-09-17 11:29:36 +08:00
924060929	0a95ebf602	[feature](Nereids) Add scalar function code generator and some function trait (#12671 ) This PR does the following: 1. Change the nullable mode of 'from_unixtime' and 'parse_url' from DEPEND_ON_ARGUMENT to ALWAYS_NULLABLE; this nullable configuration was previously missing. 2. Add some new interfaces for the original NullableMode. This change is inspired by the grammar of Scala's mix-in traits. It helps us quickly understand the traits of a function without reading lengthy procedural code and saves the work of writing template code, like `class Substring extends ScalarFunction implements ImplicitCastInputTypes, PropagateNullable`. These are the interfaces: - PropagateNullable: equals to NullableMode.DEPEND_ON_ARGUMENT - AlwaysNullable: equals to NullableMode.ALWAYS_NULLABLE - AlwaysNotNullable: equals to NullableMode.ALWAYS_NOT_NULLABLE - others ComputeNullable: equals to NullableMode.CUSTOM 3. Add `GenerateScalarFunction` to generate nereids-style function code from legacy functions, but do not actually generate any new function class yet, because the function traits are not ready for use. Some traits still need to be added for the legacy functions' CompareMode and NonDeterministic; the idea is the same as ComputeNullable.	2022-09-16 21:27:30 +08:00
yongjinhou	6fc74def02	[fix](Broker load): fix bug for broker label has already been used (#12630 )	2022-09-16 20:46:01 +08:00
morrySnow	378acfa28f	[enhancement](Nereids) eliminate all unessential cross join in TPC-H benchmark (#12651 ) To eliminate all unessential cross joins in the TPC-H benchmark, this PR: 1. pushes down all predicates that can be pushed through a join before applying the ReorderJoin rule. The ReorderJoin rule can then eliminate every cross join that is eliminable, since this rule needs to match a LogicalFilter as its root pattern. (Q2, Q15, Q16, Q17, Q18) 2. enables the expression optimization rule that extracts common expressions. (Q19) 3. fixes a cast translation failure. (Q19)	2022-09-16 19:09:58 +08:00
minghong	21319e6db4	[fix](nereids) generate invalid slot when translate predicates in filter on hash join (#12475 ) test sql: TPC-H q21 ``` select count(*) from lineitem l3 right anti join lineitem l1 on l3.l_orderkey = l1.l_orderkey and l3.l_suppkey <> l1.l_suppkey; ``` if we have other join conjuncts, we have to put all slots from left and right into `slotReferenceMap` instead of `hashjoin.getOutput()` After splitting intermediate tuple and output tuple, we meet several issues in regression test. And hence, we make following changes: 1. since translating project will replace underlying hash-join node's output tuple, we add PhysicalHashJoin.shouldTranslateOutput 2. because PhysicalPlanTranslator will merge filter and hashJoin, we add PhysicalHashJoin.filterConjuncts and translate filter conjuncts in physicalHashJoin 3. In this pr, we set HashJoinNode.hashOutputSlotIds properly when using nereids planner. 4. in order to be compatible with BE, in substring function, nullable() returns true	2022-09-16 16:51:04 +08:00
Shuo Wang	131f2a42d2	[Improvement](Nereids) Restrict the condition to apply MergeConsecutiveLimits rule (#12624 ) This PR added a condition check for MergeConsecutiveLimits rule: the input upper limit should not have valid offset info.	2022-09-16 13:05:39 +08:00
jakevin	0f6dbb5769	[fix](Nereids): split INNER and OUTER into different rules. (#12646 )	2022-09-16 10:34:42 +08:00
Henry2SS	d4f8e0c754	[Bug](spark load) fix spark load clearSparkLauncherLog NPE #12619 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-16 10:30:57 +08:00
minghong	98dad6158b	[fix](Nereids) type coercion on case-when is not correct (#12650 ) When we do type coercion on CaseWhen expression, such as sql like this: ``` CASE WHEN n_nationkey > 1 THEN n_regionkey ELSE 0 END ``` The ELSE part 0 need do type coercion as CAST (0 AS INT). But we miss it in PR #11802	2022-09-16 02:26:11 +08:00
mch_ucchi	a63cdc8a7c	[feature](Nereids) support basic runtime filter (#12182 ) This PR adds runtime filters to the Nereids planner. For now it only supports pushing through join nodes and scan nodes. TODO: 1. currently supports inner join, cross join, and right outer join; other join types will be supported in the future. 2. translate left outer join to inner join if there are inner join ancestors. 3. some complex situations cannot be handled yet; see the test case testPushDownThroughJoin for details. 4. support the case where the src key is an aggregate group key.	2022-09-16 02:21:01 +08:00
Adonis Ling	0daa25d9a9	[fix](nereids) UT failed when test cases in package (#12622 ) NamedExpressionUtil::clear should reset the nextId rather than create a new IdGenerator<ExprId>, because the old one may be referenced by other objects, which may cause some cases to start in a dirty environment when we run test cases in a package.	2022-09-15 22:25:40 +08:00
jakevin	db8bc80c36	[feature](Nereids): semi join transpose (#12590 ) * [feature](Nereids): semi join transpose and enable ZIG_ZAG join reorder.	2022-09-15 21:32:50 +08:00
morrySnow	858e8234d7	[feature](Nereids) add predicates push down on all join type (#12571 ) * [feature](Nereids) add predicates push down on all join type	2022-09-15 15:18:42 +08:00
yinzhijian	5b6d48ed5b	[feature](nereids) support distinct count (#12159 ) support distinct count with group by clause. for example: SELECT count(distinct c_custkey + 1) FROM customer group by c_nation; TODO: support distinct count without group by clause.	2022-09-15 13:01:47 +08:00
Shuo Wang	b11791b9a8	[Feature](Nereids) Limit pushdown. (#12518 ) This PR adds rewrite rules to push the limit down. Following two cases would be handled: ``` limit -> join limit -> project -> join ```	2022-09-15 12:12:10 +08:00
Shuo Wang	d2d5c19d51	[Improvement](Nereids) Avoid unsafe cast. (#12603 ) This PR changed some interfaces to avoid unsafe cast. - Modify `Plan.getExpressions()`'s return type from `List<Expression>` to `List<? extends Expression>` Return projects (type is a list of named expression) in `getExpressions` can avoid unsafe cast. See `LogicalProject.getExpression()` as an example. - Modify `EmptyRelation.getProjects()`'s return type from `List<NamedExpression>` to `List<? extends NamedExpression>` Creating empty relation with a list of slots can avoid unsafe cast. See the `EliminateLimit` rule for example.	2022-09-15 12:02:35 +08:00
mch_ucchi	5e0dc11f87	[feature](Nereids)add RelationId as a unique identifier of relations (#12461 ) In Nereids, we could not distinguish two relations from the same table in one PlanTree. This led to some tricky code to handle them during planning, such as a separate branch for equals in GroupExpression. This PR adds RelationId to LogicalRelation and PhysicalRelation. The equals function of every relation now compares RelationId, which lets us distinguish two relations from the same table. TODO: add relation id to UnboundRelation, UnboundOneRowRelation, LogicalOneRowRelation, PhysicalOneRowRelation.	2022-09-15 11:56:56 +08:00
Gabriel	fc4298e85e	[feature](outfile) support parquet writer (#12492 )	2022-09-15 11:09:12 +08:00
zhangstar333	22a8d35999	[Feature](vectorized) support jdbc sink for insert into data to table (#12534 )	2022-09-15 11:08:41 +08:00
carlvinhust2012	33f5a86e69	[fix](array-type) forbid to create materialized view for array column (#12543 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-09-15 11:08:23 +08:00
HappenLee	e413a2b8e9	[Opt](vectorized) Use new way to do hash shuffle to speed up query (#12586 )	2022-09-15 11:08:04 +08:00
Henry2SS	2ac790bf31	[enhancement](statistic) the calculation of routine load statistics are not accurate (#12594 ) Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-15 11:00:57 +08:00
jakevin	6543924790	[fix](Nereids): avoid commute cause dead-loop. (#12616 ) * [fix](Nereids): avoid commute cause dead-loop. * update best plan	2022-09-15 10:47:11 +08:00
Mingyu Chen	8aa5899484	[fix](load) add scan tuple for stream load scan node only when vectorization is enable (#12578 )	2022-09-15 08:44:39 +08:00
Gabriel	beeb0ef3eb	[Bug](lead) fix wrong child expression of `lead` function (#12587 )	2022-09-15 08:44:18 +08:00
Kikyou1997	d4cb0bbdd5	[test](nereids) Add TPC-H regression test cases for nereids (#12600 ) Disabled some test cases that cannot run successfully; they will be re-enabled once the corresponding bugs are fixed.	2022-09-14 22:37:56 +08:00
Yongqiang YANG	be0a0200cf	[fix](grpc-java) use pooled stub to call rpc on be instead of one stub (#10439 ) A channel is closed when a timeout or exception happens; if only one stub is used, all queries would then fail. If we don't close the channel, grpc-java sometimes gets stuck without sending any RPC.	2022-09-14 22:30:45 +08:00
Kikyou1997	3543f85ae5	[feature](nereids) merge push down and remove redundant operator rules into one batch (#12569 ) 1. For some related rules, we need to execute them together to get the expected plan. 2. Add session variables to avoid fallback to stale planner when running regression tests of nereids for piggyback.	2022-09-14 14:37:36 +08:00
jakevin	fd0cf78aa7	[fix](Nereids): fix StatsCalculator compute project and correct commute join type. (#12539 )	2022-09-14 10:32:05 +08:00
ChPi	ead016e0d2	[Enhancement](execute) add timeout for executing fragment rpc (#12512 ) Co-authored-by: chenjie <chenjie@cecdat.com>	2022-09-14 09:12:33 +08:00
HappenLee	d913ca5731	[Opt](vectorized) Speed up bucket shuffle join hash compute (#12407 ) * [Opt](vectorized) Speed up bucket shuffle join hash compute	2022-09-13 20:19:22 +08:00
jakevin	9a5be4bab5	[feature](Nereids): Eliminate redundant filter and limit. (#12511 )	2022-09-13 20:08:13 +08:00
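
The `${name}` template substitution described in commit e1d2f82d8e (#12714) can be illustrated with a minimal sketch. The class `SqlTemplateSketch` and its `render` method are hypothetical names used only for this illustration; this is not the `InternalSqlTemplate` code from the commit, and throwing on a missing parameter is an assumption rather than confirmed behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the InternalSqlTemplate class from the commit.
final class SqlTemplateSketch {
    // Matches ${name}, where name consists of letters, digits, or underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every ${name} in the template with params.get(name).
    // Assumption: a placeholder without a value is treated as an error.
    static String render(String template, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("table", "table0");
        params.put("id", "123");
        // prints: SELECT * FROM table0 WHERE id = 123;
        System.out.println(render("SELECT * FROM ${table} WHERE id = ${id};", params));
    }
}
```

The sketch performs plain text substitution with no SQL escaping, so it is only suitable for trusted, internally generated parameters such as table and column names.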

2788 Commits