BE will crash when querying a partitioned hive table in text format
with the partition column placed first in the select list.
1. FE should use file slots to set the column mapping index of the csv file.
2. BE should use `get_by_name` of the block to get the right column in the csv reader.
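A minimal query shape that reproduces the crash described above; the catalog, table, and column names are hypothetical:
```
-- hypothetical names: `dt` is the partition column of a text-format
-- hive table, placed first in the select list before the data columns.
SELECT dt, c1, c2
FROM hive_catalog.db.text_tbl
WHERE dt = '2022-01-01';
```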
1. Support `in bitmap` syntax, like `where k1 in (select bitmap_column from tbl)` (sketched below);
2. Support bitmap runtime filter: generate a bitmap filter from the right table's bitmap and push it down to the left table's storage layer for filtering.
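A hedged sketch of the feature in use; the table and column names are hypothetical:
```
-- hypothetical names: the subquery yields a bitmap from right_tbl;
-- a bitmap runtime filter built from it is pushed down to the
-- storage layer of left_tbl.
SELECT k1, v1
FROM left_tbl
WHERE k1 IN (SELECT user_id_bitmap FROM right_tbl);
```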
Previously, the result was:
```
mysql> select array_position([1, null], null);
+--------------------------------------+
| array_position(ARRAY(1, NULL), NULL) |
+--------------------------------------+
|                                 NULL |
+--------------------------------------+
1 row in set (0.02 sec)
```
After this commit, the result becomes:
```
mysql> select array_position([1, null], null);
+--------------------------------------+
| array_position(ARRAY(1, NULL), NULL) |
+--------------------------------------+
|                                    2 |
+--------------------------------------+
1 row in set (0.02 sec)
```
This PR contributes:
- support explain CTE (illustrated below);
- refine CTE and fix a bug: reusing the same analyzed plan caused LogicalOlapScan nodes to share the same relationId;
- change EliminateAliasNode to LogicalSubQueryAliasToLogicalProject and move it to the top of the rewrite stage, so we can easily observe the analyzed plan via the LogicalSubQueryAlias with its alias;
- the job traverses the left child first, so ExprIds grow from the left child to the right child.
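For example, a CTE query that can now be explained, and whose twice-referenced alias exercises the relationId fix; the table and column names are hypothetical:
```
-- hypothetical names: cte1 is referenced twice, so its analyzed plan
-- must not be blindly reused with the same relationId.
EXPLAIN
WITH cte1 AS (SELECT k1, SUM(v1) AS s FROM t GROUP BY k1)
SELECT a.k1, a.s, b.s
FROM cte1 a JOIN cte1 b ON a.k1 = b.k1;
```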
Issue Number: close #13615
The main work:
1. Implement grouping sets / cube / rollup (see the sketch after this list).
2. Fix the infinite loop problem of the `if` function.
3. Support isNull falling back to the legacy optimizer.
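A sketch of the grouping sets / cube / rollup syntax; the table and column names are hypothetical:
```
-- hypothetical names: GROUPING SETS lists the groupings explicitly;
-- ROLLUP and CUBE are shorthands that expand into grouping sets.
SELECT k1, k2, SUM(v1) FROM t GROUP BY GROUPING SETS ((k1, k2), (k1), ());
SELECT k1, k2, SUM(v1) FROM t GROUP BY ROLLUP (k1, k2);
SELECT k1, k2, SUM(v1) FROM t GROUP BY CUBE (k1, k2);
```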
1. In dateV2, adjust the directory structure to avoid creating a tpch-1G database.
2. Use `drop table XXX` to replace `delete * from XXX where key>0`.
3. Remove explain cases, because:
- the explain string itself is variable, and the cases are hard to maintain;
- it is the original planner's explain output, not Nereids'.
When a unique key table with MOW (merge-on-write) has a sequence column, the query result may be wrong with predicates. There are two problems:
1. The sequence column needs to be removed from the primary key index when comparing keys.
2. The sequence column needs to be removed from the min/max key.
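A sketch of the affected shape, assuming the documented merge-on-write and sequence-column table properties; the table and column names are hypothetical:
```
-- hypothetical names: a unique-key MOW table with a hidden sequence
-- column. Before the fix, key comparisons and min/max keys wrongly
-- included the sequence column, so a predicate query like the SELECT
-- below could return a wrong row.
CREATE TABLE t (k1 INT, v1 INT)
UNIQUE KEY(k1)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES (
    "enable_unique_key_merge_on_write" = "true",
    "function_column.sequence_type" = "int"
);
SELECT v1 FROM t WHERE k1 = 10;
```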
Before:
```
MySQL [test]> select cast('135.759999999' as DECIMAL(10,3));
+----------------------------------------+
| CAST('135.759999999' AS DECIMAL(10,3)) |
+----------------------------------------+
|                          135.759999999 |
+----------------------------------------+
1 row in set (0.00 sec)
```
Now:
```
MySQL [stage]> select cast('135.759999999' as DECIMAL(10,3));
+----------------------------------------+
| CAST('135.759999999' AS DECIMAL(10,3)) |
+----------------------------------------+
|                                135.759 |
+----------------------------------------+
1 row in set (0.01 sec)
```
When executing a create table as select (CTAS) statement,
the varchar/char/string columns in the created table will be unified to string type.
This is because when selecting from an external table (mysql/pg, etc.), the length of a varchar
in the external database is counted in "char" length, not "byte" length.
So if there is a varchar(10) column in the external table, the created table would get the same varchar(10),
but the byte length of the data in the external table may be larger than 10, causing CTAS to fail.
Changing to string will not impact performance or the capacity of disk storage.
Note that if a string type column is the first column, it will be changed to varchar(65535),
because we do not allow string type columns as sort key columns.
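A hedged example of the behavior; the catalog and table names are hypothetical:
```
-- hypothetical names: `name` is varchar(10) (char length) in the
-- external MySQL table; in the created table it becomes string, so
-- rows whose byte length exceeds 10 no longer make the CTAS fail.
CREATE TABLE ctas_tbl
DISTRIBUTED BY HASH(id) BUCKETS 1
AS SELECT id, name FROM mysql_catalog.db.src_tbl;
```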
When you create a Uniq table, you can specify the mapping of the sequence column to another column.
You no longer need to specify the mapping column when importing.
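A sketch of the mapping using the `function_column.sequence_col` property; the column names are hypothetical:
```
-- hypothetical names: map the sequence column to the existing
-- `modify_time` column at creation time, so imports no longer need
-- to specify the mapping column explicitly.
CREATE TABLE uniq_tbl (
    k1 INT,
    modify_time DATETIME,
    v1 INT
)
UNIQUE KEY(k1)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES (
    "function_column.sequence_col" = "modify_time"
);
```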
Add a regression test for external hive ORC tables. This PR generates all basic types supported by hive ORC, and creates a hive external table to test them in the docker environment.
Functions to be tested:
1. Ensure that all types are parsed correctly.
2. Ensure that the null maps of all types are parsed correctly.
3. Ensure that the `SearchArgument` of `OrcReader` works well.
4. Only select partition columns (example below).
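For instance, test 4 can be exercised with a query that touches only partition columns; the table and column names are hypothetical:
```
-- hypothetical names: `dt` is a partition column of the hive ORC table.
SELECT dt FROM hive_orc_tbl WHERE dt = '2022-01-01' LIMIT 10;
```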
In the previous implementation, when doing list partition prune, we needed to generate `rangeToId`
every time we did the pruning.
But `rangeToId` is actually static data that should be created once and used everywhere.
So for hive partitions, I created `rangeToId` and all the other necessary data structures for partition pruning
in the partition cache, so that we can use them directly.
In my test, the cost of partition pruning for 10000 partitions was reduced from 8s to 0.2s.
Also add "partition" info to the explain string for hive tables:
```
| 0:VEXTERNAL_FILE_SCAN_NODE |
| predicates: `nation` = '0024c95b' |
| inputSplitNum=1, totalFileSize=4750, scanRanges=1 |
| partition=1/10000 |
| numNodes=1 |
| limit: 10 |
```
Bug fixes:
1. Fix a bug that the es scan node can not filter data.
2. Fix a bug that querying es with a predicate like `where substring(test2,2) = "ext2";` fails at the planner phase with:
`Unexpected exception: org.apache.doris.analysis.FunctionCallExpr cannot be cast to org.apache.doris.analysis.SlotRef`
TODO:
1. Some problems when querying es version 8: `Unexpected exception: Index: 0, Size: 0`; will be fixed later.
Before this PR, if we tried to load an ORC file with native list (or array) type data, the BE would crash.
Because complex types in an ORC file consist of multiple real columns, we need to filter columns by column names;
otherwise we could not read all the columns we need.
Currently arrow release-7.0.0 only supports creating a stripe reader by column index, so we patched it to support creating a stripe reader by column names.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>