doris

Author	SHA1	Message	Date
Pxl	0755fd16d8	remove create hot partition failed check (#22093 )	2023-07-22 17:47:46 +08:00
amory	3d0f952934	[FIX](complex-type)delete enable_map/struct_type switch #21957	2023-07-22 15:29:32 +08:00
Pxl	ae809fbeba	[Bug](storage )fix dead lock when create_tablet need lock two tablet && update mv_p0… (#21969 ) fix dead lock when create_tablet need lock two tablet && update mv_p0/ssb case	2023-07-22 15:27:05 +08:00
zhannngchen	50c8563f35	[fix](partial update) fix some bugs of sequence column (#21896 )	2023-07-22 15:26:48 +08:00
Jibing-Li	355ac18363	[Fix](jdbc catalog) Pass conjuncts to JdbcScanNode and FileScanNode before doing finalize. (#21998 ) JdbcScanNode needs the conjuncts to generate SQL in its finalize function, but the conjuncts had not yet been passed to JdbcScanNode when finalize was called. This PR passes the conjuncts to the scan node before they are used, to avoid scanning the whole table.	2023-07-22 14:08:44 +08:00
amory	f7e3cc1553	[FIX](map)fix map proto contains_null #22107 When we select a map column in a query with ORDER BY and LIMIT, the BE node will core dump.	2023-07-22 10:41:55 +08:00
starocean999	93f9a8cbf5	[fix](nereids)PredicatePropagation only support integer types for now (#22096 )	2023-07-21 23:40:08 +08:00
YueW	ef01988ae1	[opt](inverted index) support the same column create different type index (#21972 )	2023-07-21 23:02:39 +08:00
starocean999	acf4aa2818	[fix](planner)shouldn't force push down conjuncts for union statement (#22079 )	2023-07-21 21:12:56 +08:00
Siyang Tang	e489b60ea3	[feature](load) support line delimiter for old broker load (#22030 )	2023-07-21 19:31:19 +08:00
Dongyang Li	37f230ee3e	[pipeline](regression) do not run build if only modified regression conf (#22075 ) This makes it possible to quickly exclude cases that block the regression pipeline.	2023-07-21 17:13:28 +08:00
lihangyu	40299d280d	[Fix](json reader) fix rapidjson `array->PushBack` may take ownership… (#21988 ) With the JSON paths `["$.data","$.data.datatimestamp"]`, after `array_obj->PushBack` the ownership of the `data` field is taken over by array_obj, which leads to null values for the JSON path `$.data.datatimestamp`. Rapidjson doc: ``` //! Append a GenericValue at the end of the array. \note The ownership of \c value will be transferred to this array on success. */ GenericValue& PushBack(GenericValue& value, Allocator& allocator); ```	2023-07-21 17:02:01 +08:00
bobhan1	2b2ac10e93	[feature](partial update) add failure tolerance for strict mode partial update stream load	2023-07-21 16:46:44 +08:00
bobhan1	732e0d14ff	[Enhancement](window-funnel)add different modes for window_funnel() function (#20563 )	2023-07-21 13:57:27 +08:00
bobhan1	74313c7d54	[feature-wip](autoinc)(step-3) add auto increment support for unique table (#22036 )	2023-07-21 13:24:41 +08:00
starocean999	fb5b412698	[fix](planner)fix bug of pushing conjuncts into inlineview (#21962 ) 1. markConstantConjunct method shouldn't change the input conjunct 2. Use Expr's comeFrom method to check if the pushed expr is one of the group by exprs, this is the correct way to check if the conjunct can be pushed down through the agg node. 3. migrateConstantConjuncts should substitute the conjuncts using inlineViewRef's analyzer to make the analyzer recognize the column in the conjuncts in the following analyze phase	2023-07-21 11:34:56 +08:00
Jibing-Li	eabd5d386b	[Fix](multi catalog)Fix nereids context table always use internal catalog bug (#21953 ) The getTable function in CascadesContext only handles the internal catalog case (it tries to find the table only in the internal catalog and its dbs). However, it should take all the external catalogs into consideration; otherwise, it will fail to find a table or get the wrong table when querying an external table. This PR fixes this bug.	2023-07-20 20:32:01 +08:00
morrySnow	ee65e0a6b1	[fix](Nereids) should not remove any limit from uncorrelated subquery (#21976 ) We should not remove any limit from uncorrelated subquery. For Example ```sql -- should return nothing, but return all tuple of t if we remove limit from exists SELECT * FROM t WHERE EXISTS (SELECT * FROM t limit 0); -- should return the tuple with smallest c1 in t, -- but report error if we remove limit from scalar subquery SELECT * FROM t WHERE c1 = (SELECT * FROM t ORDER BY c1 LIMIT 1); ```	2023-07-20 18:37:04 +08:00
bobhan1	367ad9164a	[feature-wip](auto-inc)(step-2) support auto-increment column for duplicate table (#19917 )	2023-07-20 18:03:39 +08:00
starocean999	86d7233b06	[fix](nereids) ExtractAndNormalizeWindowExpression rule should push down correct exprs to child (#21827 ) Consider the window function: ```sql substr( ref_1.cp_type, sum(CASE WHEN ref_1.cp_type = 0 THEN 3 ELSE 2 END) OVER (), 1) ``` Before this PR, only "CASE WHEN ref_1.cp_type = 0 THEN 3 ELSE 2 END" was pushed down, but both "ref_1.cp_type" and "CASE WHEN ref_1.cp_type = 0 THEN 3 ELSE 2 END" should be pushed down. This PR fixes it.	2023-07-20 11:47:55 +08:00
Kaijie Chen	0f116ce148	Revert "[Enhancement](Nereids)enable nereids DML by default. (#21539 )" (#22013 ) This reverts commit f668b3965effbd5df4902f20b496cb6b6642414c.	2023-07-20 11:32:54 +08:00
lihangyu	20242d9a0e	[Improve](simdjson) put unescaped string value after parsed (#21866 ) In some cases, it is necessary to unescape the original value, such as when converting a string to JSONB. If the value is not unescaped, the later JSONB parse will fail.	2023-07-20 10:33:17 +08:00
zy-kkk	2daad2151d	[enhancement](jdbc catalog) Add mysql jdbc catalog function to filter push-down identification (#21745 )	2023-07-19 23:48:23 +08:00
LiBinfeng	58f2593ba1	[Fix](Nereids) Add cast comparison with slot reference when inferring predicate (#21171 ) Problem: When inferring predicates, we assume that slot references need to be inferred. But in this case: create table tb1(l1 smallint) ...; create table tb2(l2 int) ...; select * from tb1 inner join tb2 where tb1.l1 = tb2.l2 and tb2.l2 = 1; we cannot get the filter tb1.l1 = 1, because a cast is added to l1: (Cast smallint to int l1) = l2. Solved: Take casts into consideration when inferring predicates, and adjust the equality check to handle both slot references and cast expressions. But inferring a predicate through a cast from a bigger type to a smaller type is a logical error. For example: select * from tb1 inner join tb2 where tb1.l1 = cast(tb2.l2 as smallint) and tb2.l2 = (number between smallint max and int max); The tb2.l2 value cannot be inferred to the left side because the predicate on tb1.l1 would be false, and if we add one more condition like tb1.l1 = tb3.l3(smallint), it would make that predicate false as well.	2023-07-19 23:14:26 +08:00
zhangy5	56c67a442a	[regression-test] add p0/p1 case about partition table (#21777 )	2023-07-19 14:05:56 +08:00
mch_ucchi	f668b3965e	[Enhancement](Nereids)enable nereids DML by default. (#21539 ) TODO: fix cast agg_state type when do insert	2023-07-19 13:52:15 +08:00
morrySnow	d987f782d2	[refactor](Nereids) refactor cte analyze, rewrite and reuse code (#21727 ) REFACTOR: 1. Generate CTEAnchor, CTEProducer, CTEConsumer when analyzing. For example, take the statement `WITH cte1 AS (SELECT * FROM t) SELECT * FROM cte1`. Before this PR, we got an analyzed plan like this: ``` logicalCTE(LogicalSubQueryAlias(cte1)) +-- logicalProject() +-- logicalCteConsumer() ``` We only have LogicalCteConsumer in the plan, but not LogicalCteProducer. This is not a valid plan and should not be the final result of analysis. After this PR, we get an analyzed plan like this: ``` logicalCteAnchor() |-- logicalCteProducer() +-- logicalProject() +-- logicalCteConsumer() ``` This is a valid plan with LogicalCteProducer and LogicalCteConsumer. 2. Replace re-analyzing the unbound plan with deep-copying the plan when doing CTEInline. Because we generate LogicalCteAnchor and LogicalCteProducer when analyzing, we can no longer re-analyze to generate the CTE inline plan. Another reason is that we reuse the relation id between unbound and bound relations. So, if we re-analyze an unresolved CTE plan, we will get two relations with the same RelationId. This is wrong, because we use RelationId to distinguish two different relations. This PR implements two helper classes to deep copy a new plan from CTEProducer: `LogicalPlanDeepCopier` and `ExpressionDeepCopier`. 3. New rewrite framework to ensure CTEInline is done in the right way. Before this PR, we did CTEInline before applying any rewrite rule. But sometimes, some CteConsumers can be eliminated after rewrite. After this PR, we do CTEInline after the plans relying on CTEProducer have been rewritten. So we can do CTEInline if the count of CTEConsumers decreases below the CTEInline threshold. 4. Add relation id to all relation plan nodes. 5. Let all relations generated from a table implement the trait CatalogRelation. 6. Reuse the relation id between an unbound relation and the relation after bind. ENHANCEMENT: 1. Pull up CTEAnchor before RBO to avoid breaking other rules' patterns. Before this PR, we generated CTEAnchor and LogicalCTE in the middle of the plan. So all rules had to process LogicalCTEAnchor, otherwise they would generate an unexpected plan. For example, push down filter and push down project had to add patterns like: ``` logicalProject(logicalCTE) ... logicalFilter(logicalCteAnchor) ... ``` Project and filter must be pushed through these virtual plan nodes to ensure all projects and filters can be merged together and kept in the right order. For example: ``` logicalProject +-- logicalFilter +-- logicalCteAnchor +-- logicalProject +-- logicalFilter +-- logicalOlapScan ``` The plan above will lead to a translation error, because we cannot apply filter and project twice on the bottom logicalOlapScan. BUGFIX: 1. Recursively analyze LogicalCTE to avoid binding an outer relation to an inner CTE. For example ```sql SELECT * FROM (WITH cte1 AS (SELECT * FROM t1) SELECT * FROM cte1)v1, cte1 v2; ``` Before this PR, we used the nested cte name to bind the outer plan. So the outer cte1 with alias v2 would be bound to the inner cte1. After this PR, the sql throws a Table not exists exception when binding. 2. Do withChildren the right way in CTEProducer and remove projects from it. Before this PR, we added an attr named projects in CTEProducer to represent its output. This is because we could not get its right output by calling the `getOutput` method on it. The root cause is the wrong implementation of computeOutput in LogicalCteProducer. This PR fixes this problem and removes the projects attr of CTEProducer. 3. The adjust nullable rule updates CTEConsumer's output by CTEProducer's output. This PR processes nullable on LogicalCteConsumer to ensure CteConsumer's output has the right nullable info if the CteProducer's output nullable has been adjusted. 4. Binding a set operation expression should not change the nullable of its children's output. This PR fixes a problem introduced by previous PR #21168. The nullable info of SetOperation's children should not be changed after binding SetOperation.	2023-07-19 11:41:41 +08:00
starocean999	5b043a980e	[fix](planner)only forbid literal value in AnalyticExpr's order by list (#21819 )	2023-07-19 09:40:55 +08:00
Pxl	0de94e857f	[Bug](materialized view) fix wrong match mv when mv have where clause (#21797 )	2023-07-19 01:11:39 +08:00
starocean999	fff1983f40	[fix](planner)use tupleId of agg node to get its unsigned conjuncts (#21949 )	2023-07-19 00:46:49 +08:00
AKIRA	28dfcd8785	[fix](pipeline) Fix pipeline that cause plenty timeout of p0 cases #21917	2023-07-18 23:15:49 +08:00
TengJianPing	a9ea138caf	[fix](two level hash table) fix dead loop when converting to two level hash table for zero value (#21899 ) When the two-level hash table is enabled, if there is a zero value in the existing one-level hash table, converting to a two-level hash table causes a dead loop, because the PartitionedHashTable::_is_partitioned flag is not set correctly during the conversion.	2023-07-18 19:50:30 +08:00
HHoflittlefish777	c6063ed92f	[Revert](lazy open) revert lazy open and add case (#21821 )	2023-07-18 19:41:33 +08:00
zhangstar333	87556b5741	[bug](test) fix regression test case failed with curdate (#21922 )	2023-07-18 19:10:55 +08:00
morrySnow	d6d27ef428	[fix](Nereids) join other conjuncts should get slot from join output (#21840 )	2023-07-18 18:22:40 +08:00
starocean999	ec12a4159a	[fix](planner) push conjuncts into SetOperationStmt inline view (#21718 )	2023-07-18 14:17:07 +08:00
Pxl	417e3e5616	[Feature](delete) support fold constant on delete stmt (#21833 )	2023-07-18 12:56:28 +08:00
Pxl	19492b06c1	[Bug](decimalv3) fix failed on test_dup_tab_decimalv3 due to wrong precision (#21890 )	2023-07-18 12:53:09 +08:00
starocean999	07e720e65d	[fix](planner)need recalculate nullable info of output slots for join node (#21650 )	2023-07-18 12:10:27 +08:00
Jibing-Li	489171e4c1	[Fix](multi catalog)Fix hive partition value contains special character such as / bug (#21876 ) Hive escapes some special characters in partition values to %XX; for example, / is escaped to %2F. Doris didn't handle this case, which caused Doris to fail to list the files under partitions with special characters. This PR fixes this bug.	2023-07-18 11:20:38 +08:00
yujun	ebd2a4b707	[fix](dynamic partition) fix create hot partition failed without error response (#20996 )	2023-07-18 10:56:37 +08:00
Mryange	b656f31cf2	[Enhancement](compatible) show decimalv3 to decimal (#21782 )	2023-07-18 09:17:14 +08:00
Tiewei Fang	12784f863d	[fix](Export) Fixed the bug that would be core when exporting large amounts of data (#21761 ) A heap-buffer-overflow error occurs when exporting large amounts of data to orc format. Reserve 50B for buffer to avoid this problem.	2023-07-18 00:06:38 +08:00
Jibing-Li	a92508c3f9	[Fix](statistics) Fix analyze db always use internal catalog bug (#21850 ) The `Analyze database db_name` command couldn't use the current catalog; it always used the internal catalog. This caused the command to fail to find the db. This PR fixes this bug.	2023-07-17 15:28:54 +08:00
Mingyu Chen	5fc0a84735	[improvement](catalog) reduce the size thrift params for external table query (#21771 ) ### 1 In the previous implementation, for each FileSplit there was a `TFileScanRange`, and each `TFileScanRange` contained a list of `TFileRangeDesc` and a `TFileScanRangeParams`. So if there were thousands of FileSplits, there would be thousands of `TFileScanRange`, which made the thrift data sent to BE too large, resulting in: 1. the RPC that sends the fragment may fail due to timeout 2. FE will OOM. For a given query request, the `TFileScanRangeParams` is the common part and is the same for all `TFileScanRange`, so I moved it to the `TExecPlanFragmentParams`. After that, for each FileSplit, there is only a list of `TFileRangeDesc`. In my test, querying a hive table with 100000 partitions, the size of the thrift data was reduced from 151MB to 15MB, and the above 2 issues are gone. ### 2 When `max_external_file_meta_cache_num` is set to <= 0, the file meta cache for the parquet footer is not used. I found that for some wide tables the footer is too large (1MB after compact, and much more after being deserialized to thrift), so it consumes too much BE memory when there are many files. This will be optimized later; for now I just support disabling this cache.	2023-07-17 13:37:02 +08:00
zy-kkk	03b575842d	[Feature](table function) support explode_json_array_json (#21795 )	2023-07-17 11:40:02 +08:00
zclllyybb	d0775f8209	[log](profile) add doris version info to query profile (#21501 )	2023-07-17 11:18:05 +08:00
Pxl	86841d8653	[Bug](materialized-view) fix some problems of mv and make ssb mv work on nereids (#21559 )	2023-07-17 10:08:25 +08:00
abmdocrt	c409fa0f58	[Feature](Compaction)Support full compaction (#21177 )	2023-07-16 13:21:15 +08:00
starocean999	7a61953d17	[fix](nereids)SimplifyComparisonPredicate rule need special care for decimalv3 and datetimev2 literal (#21575 )	2023-07-14 23:05:14 +08:00
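
Note on 489171e4c1 (Hive partition values with special characters): the commit message says Hive escapes characters such as / to %XX in partition values. The following is only a rough sketch of that idea in Python, not the Doris implementation. The function name `parse_partition_path` and the `key=value/key=value` directory layout are assumptions made for illustration. The path is split on the literal `/` first and each piece is decoded afterwards, so an escaped `%2F` inside a value is not mistaken for a directory separator.

```python
from urllib.parse import unquote


def parse_partition_path(path: str) -> dict:
    """Split a Hive-style partition path and decode %XX escapes.

    Hypothetical helper for illustration only; assumes segments of the
    form key=value separated by "/".
    """
    parts = {}
    for segment in path.strip("/").split("/"):
        if "=" not in segment:
            # Not a partition directory (e.g. a table root); skip it.
            continue
        key, _, raw_value = segment.partition("=")
        # Decode only after splitting, so "%2F" stays inside the value.
        parts[unquote(key)] = unquote(raw_value)
    return parts


if __name__ == "__main__":
    print(parse_partition_path("dt=2023%2F07%2F18/region=us"))
```

Decoding before splitting would turn `dt=2023%2F07%2F18` into three path segments, which is the kind of mismatch that would make a partition directory impossible to locate.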


2073 Commits