doris

Author	SHA1	Message	Date
minghong	b9381570d6	[feature](nereids) semi and anti join estimation (#18129 ) In this PR, we add a new algorithm to estimate the semi/anti join row count. The original algorithm reduces the row count from the cross join, which usually gives a poor estimate. For example, for `L left semi join R on L.a=R.a`, suppose L is larger than R and ndv(L.a) < ndv(R.a); the estimated row count is rowcount(R) * rowcount(L) / ndv(R.a), which in most cases is larger than rowcount(L). The new algorithm uses ndv(R.a)/originalNdv(R.a) to estimate the result row count. The basic idea is as follows: 1. Suppose ndv(R.a) is reduced from m to n. 2. Assume that the value space of L.a is the same as that of R.a when R.a is not filtered (the original algorithm makes this assumption too). Regard `L left semi join R` as a filter applied on L: if L.a is in R.a, the tuple stays in the result. R.a shrinks to n/m of its original size, so L.a also shrinks to n/m.	2023-04-03 09:11:10 +08:00
Xinyi Zou	4b914c196a	[fix](expr pushdown) Fix VRuntimeFilterWrapper cannot get children #18289	2023-04-03 09:09:52 +08:00
ZhangYu0123	03e49b986d	[fix](conf) fix be JAVA_OPTS conf #18305 Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-04-03 09:07:13 +08:00
Mingyu Chen	03fc41ea51	[doc](catalog) add faq for hive catalog (#18298 )	2023-04-03 09:01:49 +08:00
Yongqiang YANG	8011bdb30d	[improvement](test) print exception when streamload fails (#18315 )	2023-04-03 08:56:54 +08:00
Mingyu Chen	7131c60e05	[fix](audit-log) fix slow query missing in audit log (#18317 ) #17738 changed the column name in the audit log, so "slow_query" was no longer recorded in fe.audit.log	2023-04-03 08:52:14 +08:00
mch_ucchi	4fcd93ac00	[Enhancement](Nereids) add datelikev2 type support for fold constant. #18275 Add datelikev2 type support for fold constant: date_add / years_add / months_add / days_add / hours_add / minutes_add / seconds_add and the corresponding xxx_sub functions.	2023-04-03 08:47:47 +08:00
Yongqiang YANG	ff66efd7d0	[improvement](test) print response of streamload (#18313 ) We need the response text to diagnose streamload failures.	2023-04-02 20:08:28 +08:00
Yongqiang YANG	419aa4f12a	[fix](thrift_server) do not check started state in ThriftServer::join (#18314 ) started may be set to false when server thread is stopped.	2023-04-02 19:24:41 +08:00
morrySnow	04929ff6d4	[fix](doc) suggest use window function to replace running_difference (#18281 )	2023-04-02 16:35:10 +08:00
Jack Drogon	7d49d9cf99	[improvement](dynamic partition) Fix dynamic partition no bucket (#18300 ) Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-04-02 15:51:21 +08:00
slothever	97aab138aa	[fix](parquet-reader) reset value idx in bool rle decoder and support iceberg datetime(3) (#18245 ) 1. Fix value idx in bool rle decoder 2. Iceberg tables support datetimev2(3). In the previous version, we converted hive timestamp to datetimev2(0) by default.	2023-04-01 21:00:01 +08:00
jakevin	9e087622ab	[fix](Nereids): fix JoinReorderContext in withXXX() of LogicalJoin. (#18299 )	2023-04-01 16:51:27 +08:00
abmdocrt	365867a867	[feature](SSL) default enable SSL MySQL connection to FE (#18285 )	2023-03-31 21:31:23 +08:00
Xinyi Zou	5e7ea5e305	[fix](memory) Fix `bthread_setspecific` log fatal on UBSAN build (#18274 )	2023-03-31 19:46:53 +08:00
Pxl	236ee1411e	[Chore](case) add p0 testSubQuery,testJoinOnLeftProjectToJoin,testAggregateMVCalcAg	2023-03-31 19:46:05 +08:00
Mingyu Chen	7e61a85331	[refactor](libhdfs) introduce hadoop libhdfs (#18204 ) 1. Introduce hadoop libhdfs 2. For the Linux-X86 platform, use the hadoop libhdfs 3. For other platforms, use libhdfs3, because currently we don't have a hadoop libhdfs binary for them Co-authored-by: adonis0147 <adonis0147@gmail.com>	2023-03-31 18:41:39 +08:00
yiguolei	a77921d767	[refactor](typesystem) remove unused rpc common file and use function rpc (#18270 ) rpc common is a duplicate: all of its methods are included in function rpc, so it is removed. get_field_type is never used, so it is removed as well. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-31 18:13:25 +08:00
mch_ucchi	3ea98b65df	[Fix](Nereids) fix nereids failed to parse set operation with query in parenthesis (#18062 ) SQL of the following form (where q1, q2, q3 are queries): ``` sql (q1) UNION ALL (q2) UNION ALL (q3) ORDER BY keys ``` cannot be parsed by nereids, because ORDER is recognized as an alias of the query; we add queryOrganization to avoid this.	2023-03-31 15:55:52 +08:00
Jerry Hu	22a705543b	[fix](string_ref) Incorrect result caused by improper comparison of StringRef on macOS with Apple silicon or when not using AVX2 #18264 On macOS systems with Apple silicon, the '==' operator of StringRef uses string_compare, which treats a StringRef as a null-terminated C string.	2023-03-31 15:11:11 +08:00
Xin Liao	c3e2269c4c	[fix](merge-on-write) fix that missed rows don't match merged rows for base compaction (#18262 )	2023-03-31 15:06:51 +08:00
yiguolei	1027abe0d3	[enhancement](query exec) should print error status when query meets an error (#18247 ) If BE is under heavy load, the query may fail, and BE will then try to connect to FE over thrift; if FE is also under heavy load, the thrift connection fails too. The status is then overwritten at line 342, and the actual failure reason for the query is lost. The error status should be printed on every update. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-31 14:08:24 +08:00
morrySnow	1a56c56e90	[fix](planner) lateral view do not support lower case table name config (#18165 ) TableFunctionNode lower_case_table_names set to 1 and 2	2023-03-31 13:42:24 +08:00
yongkang.zhong	1c2f95b887	[improve](clickhouse jdbc) support clickhouse jdbc 4.x version (#18258 ) In clickhouse's 4.x version of jdbc, some UInt types use special Java types, so I adapted Doris's ClickHouse JDBC External ``` com.clickhouse.data.value.UnsignedByte; com.clickhouse.data.value.UnsignedInteger; com.clickhouse.data.value.UnsignedLong; com.clickhouse.data.value.UnsignedShort; ```	2023-03-31 13:40:10 +08:00
gitccl	20b3bdb000	[vectorized](function) support array_first_index function (#18175 ) mysql> select array_first_index(x->x+1>3, [2, 3, 4]); +-------------------------------------------------------------------+ \| array_first_index(array_map([x] -> x(0) + 1 > 3, ARRAY(2, 3, 4))) \| +-------------------------------------------------------------------+ \| 2 \| +-------------------------------------------------------------------+ mysql> select array_first_index(x -> x is null, [null, 1, 2]); +----------------------------------------------------------------------+ \| array_first_index(array_map([x] -> x(0) IS NULL, ARRAY(NULL, 1, 2))) \| +----------------------------------------------------------------------+ \| 1 \| +----------------------------------------------------------------------+ mysql> select array_first_index(x->power(x,2)>10, [1, 2, 3, 4]); +---------------------------------------------------------------------------------+ \| array_first_index(array_map([x] -> power(x(0), 2.0) > 10.0, ARRAY(1, 2, 3, 4))) \| +---------------------------------------------------------------------------------+ \| 4 \| +---------------------------------------------------------------------------------+	2023-03-31 12:51:29 +08:00
Pxl	307170030c	[Bug](materialized-view) fix core dump when creating an mv whose case differs from the base table (#18206 ) fix core dump when creating an mv whose case differs from the base table	2023-03-31 12:32:09 +08:00
zhangstar333	1b2aaab2f2	[vectorized](bug) fix some case in enable fold constant (#17997 ) fix some case in enable fold constant	2023-03-31 11:41:31 +08:00
zclllyybb	f800ba8f4c	[Exec](opt) Optimize function call for const columns (#18212 )	2023-03-31 11:36:21 +08:00
Pxl	e7bcd970f5	[Bug](materialized-view) fix isDisableTuplesMVRewriter returning true when expr is literal (#18246 ) fix isDisableTuplesMVRewriter returning true when expr is literal	2023-03-31 11:30:47 +08:00
jakevin	8e15388074	[fix](Nereids): use CBO rule instead of using rewrite rule. (#18256 )	2023-03-31 11:23:26 +08:00
zhangy5	1938899aa3	[regression-test] add grouping sets test case (#18194 )	2023-03-31 11:00:38 +08:00
lihangyu	35bae25568	[Improve](row store) add more profile info in log for point query and make row column page size more configurable (#18181 ) Saves about 20% of FE CPU cost for point queries with prepared statements on a table containing 100 columns	2023-03-31 10:58:59 +08:00
camby	7d92bf095a	[fix](expr) refactor create_tree_from_thrift to avoid stack overflow (#18214 )	2023-03-31 10:38:20 +08:00
Kang	479272f86f	[bugfix](topn) fix topn optimization wrong result for NULL values (#18121 ) 1. add PassNullPredicate to fix topn wrong result for NULL values 2. refactor RuntimePredicate to avoid using TCondition 3. refactor using ordering_exprs in fe and vsort_node	2023-03-31 10:02:07 +08:00
Kang	4e1e0ce06d	[bugfix](topn) fix topn optimization wrong result for NULL values (#18121 ) 1. add PassNullPredicate to fix topn wrong result for NULL values 2. refactor RuntimePredicate to avoid using TCondition 3. refactor using ordering_exprs in fe and vsort_node	2023-03-31 10:01:34 +08:00
HappenLee	8be43857ef	[feature](executor) Add memory limit for pip_scanner_context (#18238 ) Co-authored-by: wangbo <506340561@qq.com>	2023-03-31 09:36:57 +08:00
Xinyi Zou	e5793249cd	[opt](hashtable) Modify default filled strategy to 75% (#18242 )	2023-03-31 09:28:11 +08:00
lihangyu	e0f6083e73	[refactor](dynamic table) add `get_type_as_tprimitive_type` and `get_type_as_primitive_type` in IDataType to get `PrimitiveType` and `TPrimitiveType` (#18260 )	2023-03-31 09:03:06 +08:00
minghong	1abb19d0fd	filter estimation refactor (#18170 )	2023-03-31 08:49:38 +08:00
abmdocrt	a88e80f8ee	[fix](ssl)refactor some SSL info logs to debug logs (#18234 )	2023-03-31 08:41:02 +08:00
AKIRA	b5ea299697	[fix](planner) Fix agg on inline view with constant slot (#18201 ) Since a slot that references a constant is also marked as a constant expr, add a condition check to make sure such a slot is not eliminated as a constant from the group exprs	2023-03-30 23:54:37 +08:00
jakevin	28793b6441	[fix](Nereids): fix copyIn() in Memo when useless project with groupplan (#18223 )	2023-03-30 23:49:21 +08:00
Ashin Gau	d6b0fe9072	[feature](jni) jni table scanner framework (#17960 ) A framework that reads data through a jni scanner, which can support data sources from the java ecosystem (java API). ## Java Interface A Java scanner should extend `org.apache.doris.jni.JniScanner` and implement the following methods: ``` // Initialize JniScanner public abstract void open() throws IOException; // Close JniScanner and release resources public abstract void close() throws IOException; // Scan data and save as vector table public abstract int getNext() throws IOException; ``` See demo usage in `org.apache.doris.jni.MockJniScanner` ## C++ interface A C++ reader should use `doris::JniConnector` to get data from `org.apache.doris.jni.JniScanner`. See demo usage in `doris::MockJniReader`. ## Pushed-down predicates A Java scanner can get pushed-down predicates via `org.apache.doris.jni.vec.ScanPredicate`. ## Remaining work: 1. Implement complex nested types. 2. Read hudi MOR table as the end-to-end demo usage.	2023-03-30 23:47:45 +08:00
Xiangyu Wang	0c2ff09fcf	[Fix](multi-catalog) fix hms automatic update error. (#18252 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2023-03-30 23:09:07 +08:00
HappenLee	1d2dbe7898	[Bug][Pipeline] Fix deadlock when running clickbench in pipeline exec engine (#18211 ) Running clickbench in the pipeline exec engine may deadlock on some queries	2023-03-30 21:41:57 +08:00
Mingyu Chen	1050df7076	[fix](fs) fix local file system copy bug (#18243 ) `copy_dirs` has a bug that causes infinite iteration	2023-03-30 21:36:07 +08:00
mch_ucchi	322b51180f	[Fix](regression-test) replace now with a fixed datetime. (#18253 )	2023-03-30 21:04:09 +08:00
jakevin	3d2c70f75d	[fix](Nereids): fix merge_group(). (#18250 )	2023-03-30 20:34:47 +08:00
mch_ucchi	fefc0d6814	[Fix](planner)fix create view ignore order by info bug. (#18197 )	2023-03-30 20:17:46 +08:00
liujinhui	ce79ff947a	[Enhancement](spark load)Support spark time out config (#17108 )	2023-03-30 20:12:46 +08:00
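
The estimation change described in b9381570d6 can be illustrated with a small sketch. This is not the Nereids code: the function names and the sample statistics are hypothetical, and the clamp to 1.0 is an added assumption. It only restates the two formulas from the commit message, rowcount(R) * rowcount(L) / ndv(R.a) for the original algorithm and rowcount(L) * ndv(R.a) / originalNdv(R.a) for the new one.

```python
def old_semi_join_estimate(rows_l: float, rows_r: float, ndv_r: float) -> float:
    """Original approach: shrink the cross join by ndv(R.a)."""
    return rows_l * rows_r / ndv_r


def new_semi_join_estimate(rows_l: float, ndv_r: float, original_ndv_r: float) -> float:
    """New approach: treat the semi join as a filter on L.

    ndv_r is ndv(R.a) after R was filtered (n in the commit message),
    original_ndv_r is ndv(R.a) before filtering (m).
    """
    if original_ndv_r <= 0:
        return 0.0
    # Assumption of this sketch: the ratio is clamped so that the
    # estimate never exceeds rowcount(L).
    ratio = min(1.0, ndv_r / original_ndv_r)
    return rows_l * ratio


# Hypothetical statistics: L has 1,000,000 rows, R has 10,000 rows,
# and a filter on R reduced ndv(R.a) from 1,000 to 500.
old = old_semi_join_estimate(1_000_000, 10_000, 500)
new = new_semi_join_estimate(1_000_000, 500, 1_000)
print(old, new)
```

With these numbers the original formula yields an estimate far above rowcount(L), which a semi join can never produce, while the new formula stays bounded by rowcount(L).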

18263 Commits
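
The `array_first_index` examples in 20b3bdb000 suggest the following semantics, modelled here in Python rather than in Doris SQL: return the 1-based position of the first element for which the lambda is true. The Python function below is a hypothetical stand-in, and returning 0 when no element matches is an assumption that the commit message does not show.

```python
from typing import Any, Callable, Sequence


def array_first_index(pred: Callable[[Any], bool], values: Sequence[Any]) -> int:
    """Return the 1-based index of the first element satisfying pred.

    Returns 0 when nothing matches (an assumption of this sketch).
    """
    for position, value in enumerate(values, start=1):
        if pred(value):
            return position
    return 0


# The three queries from the commit message, rewritten as Python calls.
print(array_first_index(lambda x: x + 1 > 3, [2, 3, 4]))
print(array_first_index(lambda x: x is None, [None, 1, 2]))
print(array_first_index(lambda x: x ** 2 > 10, [1, 2, 3, 4]))
```

The three calls mirror the results shown in the commit message (2, 1 and 4).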