doris

Author	SHA1	Message	Date
Jibing-Li	73ee352705	[fix](multi catalog)Fix convert_to_doris_type missing break for some cases (#14992 )	2022-12-13 13:34:55 +08:00
slothever	e7a84e4a16	[fix](multi-catalog)fix page index thrift deserialize (#15001 ) Fix the error when parsing the page index: Couldn't deserialize thrift msg. Use two buffers to store the column index and offset index messages, to avoid parsing them from a single buffer.	2022-12-13 13:33:19 +08:00
Jibing-Li	8fe0729835	[fix](multi catalog)Check orc file reader is not null before using it. (#14988 ) The external table file path cache may be out of date, which will cause the orc reader to visit nonexistent files. In this case, the orc file reader is nullptr. This PR checks the reader before using it to avoid a core dump from dereferencing nullptr.	2022-12-13 11:27:51 +08:00
Pxl	c25a7235f9	[Pipeline](load) support pipeline broker load (#14940 ) support pipeline broker load	2022-12-13 00:28:36 +08:00
plat1ko	f3aea7f0f0	[Enhancement](status) Unify error code and enable custom err msg for BE internal errors (#14744 )	2022-12-11 23:33:18 +08:00
Gabriel	7fb695b51d	[Pipeline](select node) Support select node on pipeline engine (#14928 )	2022-12-11 21:31:32 +08:00
zhangstar333	ef46b580d0	[Vectorized](operator) support analytic eval operator (#14774 )	2022-12-10 19:32:11 +08:00
HappenLee	68092fe514	[pipeline](NLJ) support nested loop join for pipeline (#14966 )	2022-12-10 00:20:16 +08:00
wxy	af50461211	[fix](statistics) fix CpuTimeMS in audit log when enable_vectorized_engine=true. (#14853 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2022-12-09 21:13:05 +08:00
zhangstar333	0c8fdc90fb	[pipeline](union) support union operator (#14963 )	2022-12-09 19:55:40 +08:00
shee	4611947920	[Pipeline](Operator) Add schema scan node operator (#14955 )	2022-12-09 19:55:03 +08:00
Jerry Hu	873b128fde	[feature](pipeline) add intersect/except operators (#14868 )	2022-12-09 14:13:48 +08:00
TengJianPing	5a1c7f6314	[improvement](analytic) improve memory counter (#14890 )	2022-12-09 14:13:17 +08:00
HappenLee	9d36931038	[Refactor](NLJ) refactor the nested loop join node (#14911 ) * [Refactor](NLJ) refactor the nested loop join node * change the logic of alloc/release resource	2022-12-09 14:10:26 +08:00
Tiewei Fang	00f44257e2	[feature-wip](file-reader) Merge hdfs reader to the new file reader (#14875 )	2022-12-09 13:21:59 +08:00
zhangstar333	20f2abb3d4	[vectorized](pipeline) support assert num rows operator (#14923 )	2022-12-09 09:39:29 +08:00
Gabriel	0c817e6b3a	[Pipeline](hashjoin) Support hash join on pipeline engine (#14898 )	2022-12-08 15:43:02 +08:00
zhangstar333	962810b973	[Vectorized](jdbc) add check type for jdbc table (#14501 )	2022-12-08 10:27:47 +08:00
Pxl	48a9166aa4	[Pipeline](sink) support olap table sink operator (#14872 ) * support olap table sink operator * update config	2022-12-07 15:29:56 +08:00
TengJianPing	fcea89bcf4	[fix](const_expr) fix coredump caused by unsupported cast const expr (#14825 )	2022-12-06 10:31:15 +08:00
HappenLee	b30cd86e9e	[Refactor](pipeline) Refactor operator and builder code of pipeline (#14787 )	2022-12-05 18:35:00 +08:00
Gabriel	1190fd4cd6	[Pipeline](regression) Add ssb flat for pipeline (#14763 )	2022-12-05 15:05:23 +08:00
TengJianPing	8c0e13ab51	[improvement](profile) add detail memory counter for exec nodes (#14806 ) * [improvement](profile) improve accuracy of memory usage and add detail memory counter * fix	2022-12-05 11:51:52 +08:00
wxy	e141664339	[fix](statistics) fix missing scanBytes and scanRows in query statist… (#14750 ) * [fix](statistics) fix missing scanBytes and scanRows in query statistics when enable_vectorized_engine=true. Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2022-12-05 09:17:51 +08:00
HappenLee	12304bc0ee	[Pipeline](exec) Support pipeline exec engine (#14736 ) Co-authored-by: Lijia Liu <liutang123@yeah.net> Co-authored-by: HappenLee <happenlee@hotmail.com> Co-authored-by: Jerry Hu <mrhhsg@gmail.com> Co-authored-by: Pxl <952130278@qq.com> Co-authored-by: shee <13843187+qzsee@users.noreply.github.com> Co-authored-by: Gabriel <gabrielleebuaa@gmail.com> ## Problem Summary: ### 1. Design DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-027%3A+Support+Pipeline+Exec+Engine ### 2. How to use: Set the session variable: `set enable_pipeline_engine = true;`	2022-12-02 17:11:34 +08:00
Gabriel	9dd1d989e8	[test](decimalv3) add regression test cases for decimalv3 (#14672 )	2022-12-01 15:18:40 +08:00
Xinyi Zou	176f519fa1	[enhancement](memtracker) Optimize exec node memory tracking (#14711 )	2022-12-01 14:52:21 +08:00
Jerry Hu	b4d32a0c44	[fix](join) runtime filter shared from other instance wasn't published (#14717 )	2022-12-01 14:17:23 +08:00
Pxl	bba77fa9dd	[Enhancement](profile) enhance column predicates display on profile (#14664 )	2022-12-01 13:07:12 +08:00
luozenglin	7873bc95a6	[Enhancement](bitmapfilter) Support bitmap filter to apply zone_map index to filter pages (#14635 )	2022-12-01 10:41:09 +08:00
luozenglin	6c70d794f6	[fix](bitmapfilter) fix core dump caused by bitmap filter (#14702 )	2022-12-01 09:56:22 +08:00
Tiewei Fang	9272680d00	[feature](multi-catalog) support Jdbc catalog (#14527 ) Issue Number: close #xxx Adds a JDBC catalog to the Doris multi-catalog feature. Currently, the JDBC catalog only supports MySQL. TODO: support PostgreSQL and other databases. Problem summary: a JDBC catalog can be created like this: CREATE CATALOG jdbc4 PROPERTIES ( "type"="jdbc", "jdbc.user"="root", "jdbc.password"="123456", "jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false", "jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar", "jdbc.driver_class" = "com.mysql.jdbc.Driver" ); Note: yearIsDateType is a JDBC parameter. If the yearIsDateType configuration property is set to false, the returned object type is java.sql.Short. If set to true (the default), the returned object is of type java.sql.Date with the date set to January 1st, at midnight. For compatibility with MySQL, FE forces yearIsDateType=false: if the user sets yearIsDateType=true, Doris FE changes it to yearIsDateType=false.	2022-11-30 11:28:08 +08:00
Gabriel	3e8b3658c7	[feature-wip](decimalv3) Support basic agg and arithmetic operations for decimal v3 (#14513 )	2022-11-29 15:12:41 +08:00
lsy3993	f7a827c06b	[fix](new-scan) fix some bugs about new scan node and readers (#14504 ) 1. Fix the JSON reader DCHECK failure caused by missing TYPE_STRING. 2. Fix the bug where the tvf throws NPE if no file is found. 3. The predicate conjuncts cannot be pushed down to the parquet reader if this is a load task, because the predicate should be applied to columns of the dest table, not columns of the source file. 4. Add a temp property "use_new_load_scan_node" to broker load to make the regression test happy, so that we can use the new load scan node for a certain job and avoid setting a global FE config.	2022-11-29 10:21:41 +08:00
Gabriel	7513c82431	[NLJoin](conjuncts) separate join conjuncts and general conjuncts (#14608 )	2022-11-29 08:55:54 +08:00
starocean999	78adecac1b	[enhancement](be)optimize mem usage in join and set node (#14602 )	2022-11-27 13:38:49 +08:00
Tiewei Fang	36419fae48	[fix](JdbcExecutor) fix that JdbcExecutor did not load the class jar (#14598 ) JdbcExecutor did not load jdbc driver jar, so add classloader to load jdbc jar.	2022-11-26 23:53:05 +08:00
Mingyu Chen	064b8d2aa6	[fix](multi-catalog) fix coredump when querying partitioned hive table with text format (#14604 ) BE will crash when querying a partitioned hive table with text format if the partition column is placed first in the select items. 1. FE should use file slots to set the column mapping index of the csv file. 2. BE should use `get_by_name` of block to get the right column in a block in the csv reader.	2022-11-26 11:42:40 +08:00
luozenglin	4728e75079	[feature](bitmap) Support in bitmap syntax and bitmap runtime filter (#14340 ) 1.Support in bitmap syntax, like 'where k1 in (select bitmap_column from tbl)'; 2.Support bitmap runtime filter. Generate a bitmap filter using the right table bitmap and push it down to the left table storage layer for filtering.	2022-11-25 15:22:44 +08:00
Ashin Gau	25de068a05	[fix](parquet-reader) the value of null map will overflow when LazyRead merges too many empty batches (#14558 ) The run length of null map is saved as `uint16_t`. Previously, the run length of null map was limited by `batch_size` in the `ParquetReader`, by setting `batch_size = std::min(batch_size, (size_t)USHRT_MAX)`. It works well when the batch size is less than `USHRT_MAX`. However, [Lazy read](https://github.com/apache/doris/pull/13917) will merge empty batches until reading a non-empty batch or reaching the EOF of a row group, so the `batch_size` may be greater than `USHRT_MAX` in non-predicate columns. In addition, even if the `batch_size` does not exceed `USHRT_MAX`, the adjacent batches may also make the run length exceed the `USHRT_MAX` in `ColumnSelectVector::get_next_run`.	2022-11-25 12:22:18 +08:00
Jerry Hu	9103ded1dd	[improvement](join)optimize sharing hash table for broadcast join (#14371 ) This PR makes sharing the hash table for broadcast join more robust: 1. Add a session variable to enable/disable this function. 2. Do not block the hash join node's close function. 3. Use shared pointers to share the hash table and runtime filter among broadcast join nodes. 4. A hash join node that doesn't need to build the hash table will close its right child without reading any data (the child will close the corresponding sender).	2022-11-24 21:06:44 +08:00
TengJianPing	6c7f758ef7	[improvement](hashjoin) support partitioned hash table in hash join (#14480 )	2022-11-24 14:16:47 +08:00
Gabriel	d14e1d25ff	[Bug](vectorized) Fix wrong column type (#14387 )	2022-11-23 18:07:33 +08:00
starocean999	1520e5c88a	[enhancement](agg)use new method to serialize keys in batch if the key is too large (#14484 ) * [enhancement](agg)use new method to serialize keys in batch if the key is too large * fix compile error	2022-11-23 17:35:39 +08:00
luozenglin	30e1818724	[fix](tracing) fix tracing in the new scan node does not meet expectations (#14155 ) Issue Number: close #14149 - Remove unexpected tracing, like 'vscanner::scan' - Merge span vscannode::get_next	2022-11-22 16:44:02 +08:00
Gabriel	1ec7f45fb6	[Bug](avg) Fix `avg` for bigint (#14433 )	2022-11-22 10:29:59 +08:00
Xin Liao	fea9966728	[fix](parquet-orc) fix that be core dump when some columns specified are not in the parquet or orc file (#14440 ) When some specified columns are not in the parquet or orc file in broker load, _batch->num_columns() will be less than _num_of_columns_from_file, which leads to a BE core dump. To prevent the core dump, just return an error in this case.	2022-11-22 09:10:38 +08:00
Pxl	bcd641877f	[Enhancement](scan) disable build key range and filters when push down agg work (#14248 ) disable build key range and filters when push down agg work	2022-11-21 12:47:57 +08:00
Gabriel	2c42f0a905	[refactor](decimalv3) Refine code for DecimalV3 (#14394 )	2022-11-19 16:57:17 +08:00
Mingyu Chen	512b787559	[fix](parquet-reader) fix stack-use-after-return error (#14411 )	2022-11-19 10:52:50 +08:00

423 Commits