doris

Author	SHA1	Message	Date
Gabriel	1ef85ae1f2	[Improvement](join) Support nested loop outer join (#13965 )	2022-11-10 19:50:46 +08:00
morrySnow	6c13126e5c	[enhancement](Nereids) analyze check input slots must in child's output (#14107 )	2022-11-10 19:28:01 +08:00
minghong	ae4f2aead7	[fix](nereids) column stats min/max missing (#14091 ) In the result of SHOW COLUMN STATS tbl, the min/max values are not displayed.	2022-11-10 17:08:44 +08:00
Ashin Gau	6bd5378f66	[feature-wip](multi-catalog) lazy read for ParquetReader (#13917 ) Read the predicate columns first, and use VExprContext (the pushed-down predicates) to generate the select vector, which is then applied when reading the non-predicate columns. Data in non-predicate columns may be skipped according to the select vector, so value-decoding time can be reduced. If a whole page can be skipped, decompression time is reduced as well.	2022-11-10 16:56:14 +08:00
Zhengguo Yang	724cf1cdb8	[chore][build] add instructions to build version string (#14067 )	2022-11-10 16:23:34 +08:00
shee	9b5b411112	[fix](schemeChange) fe oom because replicas too many when schema change (#12850 )	2022-11-10 16:17:25 +08:00
谢健	151a72d158	[feature](Nereids) support circle graph (#14082 )	2022-11-10 15:54:21 +08:00
Pxl	0e26f28bf2	[Enhancement](runtime-filter) enlarge runtime filter in predicate threshold (#13581 )	2022-11-10 15:48:46 +08:00
Xinyi Zou	a73f4dfdc1	[fix](memtracker) Fix scanner thread ending after fragment thread causing mem tracker null pointer #14143	2022-11-10 15:42:53 +08:00
jakevin	4cde9c4765	[enhance](Nereids): add missing hypergraph rule. (#14087 )	2022-11-10 15:23:31 +08:00
xueweizhang	90bfd87660	[feature](function) add new function uuid() (#14092 )	2022-11-10 14:55:41 +08:00
jakevin	0dfdbe4508	[feature](Nereids): InnerJoinLeftAssociate, InnerJoinRightAssociate and JoinExchange. (#14051 )	2022-11-10 12:21:06 +08:00
Mingyu Chen	8c5c6d9d7f	[fix](ctas) fix wrong string column length after executing ctas from external table (#14090 )	2022-11-10 11:36:56 +08:00
minghong	17867e446f	[feature](nereids) let user define right deep tree penalty by session variable (#14040 ) It is hard to find a single factor that suits all queries, so the penalty is configurable. The default is 0.7.	2022-11-10 11:25:02 +08:00
shee	57225d69f3	[Fix] add hll param for if function (#12366 ) Also adds a unit test. Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>	2022-11-10 11:20:58 +08:00
starocean999	84b969a25c	[fix](grouping)the grouping expr should check col name from base table first, then alias (#14077 ) Also fixes the FE unit test; the behavior would be the same as MySQL.	2022-11-10 11:10:42 +08:00
minghong	994d563f52	[fix](nereids) cannot collect decimal column stats (#13961 ) When executing ANALYZE TABLE, Doris fails on decimal columns. The root cause is that the scale in decimalV2 is 9, but 2 in the schema. There is no need to check the scale for decimalV2, since it is not a floating-point type.	2022-11-10 11:06:38 +08:00
Gabriel	184cee2d2b	[Bug](outfile) Fix wrong decimal format for ORC (#14124 )	2022-11-10 11:01:30 +08:00
Tiewei Fang	43eb946543	[feature](table-valued-function)S3 table valued function supports parquet/orc/json file format #14130 For example: parquet format	2022-11-10 10:33:12 +08:00
Jerry Hu	10df61b5bf	[improvement](join) Share hash table in fragments for broadcast join (#13921 )	2022-11-10 09:48:34 +08:00
zhangstar333	df622d8b7d	[Bug](udf) fix java-udaf process string type error and add some tests (#14106 )	2022-11-10 09:30:57 +08:00
Liqf	55cae6202f	[typo](docs)add udf doc and optimize udf regression test (#14000 )	2022-11-10 09:24:45 +08:00
Xin Liao	3690c4dbe7	[fix](load) fix that load channel failed to be released in time (#14119 )	2022-11-09 22:38:08 +08:00
Pxl	794a551b0f	[Enhancement][fix](profile)() modify some profiles (#14074 ) (1) Add RemainedDownPredicates; (2) fix core dump when _scan_ranges is empty; (3) fix invalid memory access in vLiteral's debug_string(); (4) enlarge the MV test wait time.	2022-11-09 21:59:28 +08:00
camby	322ac5cf89	[refractor](array) refractor DataTypeArray from_string (#13905 ) Refactor DataTypeArray from_string to make it clearer. Support ',' and ']' inside string elements, for example: ['hello,,,', 'world][]']. Support empty elements, such as [,] ==> [0,0]. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-11-09 16:58:08 +08:00
mch_ucchi	3117ac9289	[enhancement](Nereids) use post-order to generate runtime filter in RuntimeFilterGenerator (#13949 ) Change the runtime filter generator from pre-order to post-order traversal. This may change the number of generated runtime filters, and the unit tests are corrected accordingly.	2022-11-09 14:28:49 +08:00
Tiewei Fang	b74d0a4747	[feature](table-valued-function) Support `desc from s3()` and modify the syntax of tvf (#14047 ) This PR does two things: (1) supports desc function s3(); (2) modifies the syntax of TVFs.	2022-11-09 14:12:43 +08:00
camby	f912d4e392	[fix](compile) fix compile error #14103 Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-11-09 14:10:06 +08:00
WenYao	e692636b4f	[performance-wip] (vectorization) Opt HashJoin Performance (#12390 )	2022-11-09 14:07:49 +08:00
morrySnow	84bb82acc0	[fix](Nereids) aggregate disassemble generate error output list on GLOBAL phase aggregate (#14079 ) localAggregateFunction must be used as the key of globalOutputSMap, because the local output exprs are used to generate the global output in disassembleDistinct.	2022-11-09 13:43:12 +08:00
jakevin	b144d2b4f4	[improve](Nereids): remove redundant code, add annotation in Memo. (#14083 )	2022-11-09 13:39:20 +08:00
morrySnow	aff62655c4	[feature](Nereids) binding slot in order by that not show in project (#14042 ) (1) Bind slots in ORDER BY that do not appear in the projection, such as: SELECT c1 FROM t WHERE c2 > 0 ORDER BY c3. (2) Do not check for unbound slots when binding slot references; do it in the analysis check instead.	2022-11-09 13:25:41 +08:00
carlvinhust2012	7362460525	[docs](array-type) update the docs to specify how to use array function when import data (#13995 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-11-09 12:21:26 +08:00
Gabriel	a3c5fa8c01	[Compile](join) Boost compiling and linking (#14081 )	2022-11-09 11:27:46 +08:00
ChPi	55ca810445	[fix](Vectorized)fix json_object and json_array function return wrong result on vectorized engine (#13775 ) Issue Number: close #13598	2022-11-09 11:26:55 +08:00
Kang	aec214b4b0	[bug](ColumnDecimal)call set_decimalv2_type when cloning ColumnDecimal (#14061 ) Also applies clang format.	2022-11-09 11:23:43 +08:00
xueweizhang	572f491756	[fix](ctas) text column type len = 1 when create table as select (#13906 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2022-11-09 09:09:34 +08:00
Adonis Ling	291fa499e9	[fix](JSON) Fail to parse JSONPath (libc++) (#13941 )	2022-11-09 08:58:01 +08:00
Liqf	287c3893b9	[typo](docs)update array type doc #14057	2022-11-09 08:40:38 +08:00
zhengyu	6a1c7fac9d	[enhancement](load) shrink reserved buffer for page builder (#14012 ) (#14014 ) For a table with hundreds of text-type columns, flushing its memtable may consume a huge amount of memory. This memory is consumed when initializing the page builder, as it reserves 1MB for each column, so memory consumption grows in proportion to the number of columns. Shrinking the reservation may reduce memory usage substantially during load. Follow-up changes respond to the review and update binary_plain_page.h and binary_dict_page.cpp. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2022-11-09 08:40:07 +08:00
xueweizhang	a0f136a0bc	[docs](odbc) fix docs for sqlserver odbc table (#14017 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2022-11-09 08:39:39 +08:00
Mingyu Chen	cd8f0713ea	[refactor](new-scan) remove old vectorized scan node (#14029 )	2022-11-09 08:39:20 +08:00
HappenLee	75b6b267ea	[opt](ssb) Add query hint for the SSB queries (#14089 )	2022-11-09 08:37:31 +08:00
Kang	151842a1fe	[feature](inverted index)WIP inverted index api: SQL syntax and metadata (#13430 ) Introduce a SQL syntax for creating inverted index and related metadata changes. ``` -- create table with INVERTED index CREATE TABLE httplogs ( ts datetime, clientip varchar(20), request string, status smallint, size int, INDEX idx_size (size) USING INVERTED, INDEX idx_status (status) USING INVERTED, INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none") ) DUPLICATE KEY(ts) DISTRIBUTED BY RANDOM BUCKETS 10 -- add an INVERTED index to a table CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english"); ```	2022-11-08 23:46:53 +08:00
Tiewei Fang	826cfdaf93	[feature](information_schema) add `backends` information_schema table (#13086 )	2022-11-08 22:15:10 +08:00
Pxl	ae3c513d74	use extern template to date_time_add (#13970 )	2022-11-08 22:11:41 +08:00
luozenglin	115c6bd411	[fix](keyranges) fix the split error of keyranges (#14049 )	2022-11-08 22:09:16 +08:00
shee	3f3f2eb098	[Nereids][Improve] infer predicate after push down predicate (#12996 ) This PR implements predicate inference. For example: ``` sql select * from student left join score on student.id = score.sid where score.sid > 1 ``` The transformed logical plan tree is a left join whose two inputs are filtered scans: filter(sid > 1) and filter(id > 1), where filter(id > 1) is the inferred predicate. See `InferPredicatesTest` for more cases. The logic is as follows: 1. Pull up the bottom predicates, then infer additional predicates. For example, given: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id (a) pull up the bottom predicate: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 (b) infer: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1 The final transformed SQL is: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1 2. Put these predicates into `otherJoinConjuncts`; they are processed in the next round of predicate push-down. Currently only `ComparisonPredicate` inference is supported. TODO: determine whether an `expression` satisfies the conditions for replacement, e.g. when `expression` is non-deterministic.	2022-11-08 21:36:17 +08:00
Mingyu Chen	b6f91b6eff	[improvement](profile) support ordinary user to get query profile via http api (#14016 )	2022-11-08 20:39:01 +08:00
Kikyou1997	ecfdf0320d	[fix](statistics) ColumnStatistics was changed unexpectedly when show stats (#14068 ) The SHOW STATS logic unexpectedly modified the internally collected ColumnStat, which could lead to inaccurate cost estimates and inefficient plans.	2022-11-08 20:26:37 +08:00

7106 Commits