doris

Author	SHA1	Message	Date
Yulei-Yang	6ffc26858a	[Improvement](meta) add default_value column & is changed column for result of show_variables stmt (#23017 ) * [Improvement](meta) add default_value column for result of show_variables stmt * add Changed column to show whether value is modified * fix code style issue	2023-08-20 20:48:45 +08:00
slothever	97fa840324	[feature](multi-catalog)support iceberg hadoop catalog external table query (#22949 )	2023-08-20 19:29:25 +08:00
slothever	5ba505ebf4	[fix](multi-catalog)fix avro and jdbc scanner dependency (#23015 ) add preload-extensions module, put all conflicting dependencies into pom.xml in preload-extensions	2023-08-20 19:28:17 +08:00
jakevin	6bf65253d0	[fix](Nereids): unstable test when run single UT. (#23189 )	2023-08-18 23:14:56 +08:00
Tiewei Fang	10abbd2b62	[Feature](Export) support parallel export job using Job Schedule (#22854 )	2023-08-18 22:24:42 +08:00
Calvin Kirs	6847592137	[Fix](RoutineLoad)Fix when Unique (MoW) RoutineLoad imports unspecified Sequence column (#23167 )	2023-08-18 21:49:09 +08:00
slothever	b6dd56fee0	[fix](multi-catalog)fix compatibility issue for s3 endpoint (#23175 )	2023-08-18 18:37:21 +08:00
jakevin	345eaab00b	[refactor](Nereids): remove useless equals()/hashcode() about Id (#23162 )	2023-08-18 18:31:31 +08:00
Mingyu Chen	7c4870c371	[fix](catalog) fix hive partition prune bug on nereids (#23026 )	2023-08-18 18:31:01 +08:00
Mingyu Chen	9cee0ecccc	[fix](show-table-status) fix priv error on show table status stmt (#22918 )	2023-08-18 18:30:09 +08:00
jakevin	f71b78c415	[enhancement](Nereids): remove override child(int index) (#23124 ) Overriding `child(int index)` only to call `super.child(index)` causes the pointer to jump twice.	2023-08-18 17:34:49 +08:00
minghong	609d20de8c	[refactor](nereids)remove ColumnStatistics.selectivity (#23039 )	2023-08-18 16:45:54 +08:00
ZenoYang	1c3cc77a54	[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236 ) * [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty * add ut * fix nereids * fix regression-test	2023-08-18 14:37:49 +08:00
Siyang Tang	a7771ea507	[fix](planner) fix current_timestamp param type mismatch when doing stream load (#23092 ) FileLoadScanNode did not analyze the default value expr, so the target param type int32 became int8, the original IntLiteral type.	2023-08-18 14:28:45 +08:00
caiconghui	635349a015	[fix](log4j) fix audit_log_roll_num not work for fe audit log file (#23157 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-08-18 14:13:45 +08:00
jakevin	441032c3d8	[fix](Nereids): LogicalSink equals() shouldn't invoke super.equals() (#23145 )	2023-08-18 14:05:48 +08:00
amory	2d96d19030	[FIX](array-func) fix array() with decimal type (#23117 ) For a query such as select array(1.0,2.0,null, null,2.0), an argument of type uint8 is passed to BE, which does not match the decimal signature of array() and makes BE core. So the cast is done on the BE side, and the null arguments are cast to the decimal type.	2023-08-18 12:12:50 +08:00
Pxl	59c6139aa5	[Chore](parser) fix create view failed when view contained cast as varchar (#23043 )	2023-08-18 11:50:18 +08:00
Siyang Tang	df8e7f7f09	[enhancement](msg) add disk root path in message (#23000 )	2023-08-18 11:21:59 +08:00
luozenglin	d018ac8fb7	fix show grants throw NullPointerException (#22943 )	2023-08-18 10:48:56 +08:00
wuwenchi	a5ca6cadd6	[Improvement] Optimize count operation for iceberg (#22923 ) Iceberg has its own metadata information, which includes count statistics for table data. If the table does not contain equality deletes, we can get the count of the current table directly from the count statistics.	2023-08-18 09:57:51 +08:00
Xiangyu Wang	03d59ba81e	[Fix](Nereids) fix sql-cache for nereids. (#22808 ) 1. should not use ((LogicalPlanAdapter)parsedStmt).getStatementContext().getOriginStatement().originStmt.toLowerCase() as the cache key (do not invoke toLowerCase()), for example: select * from tbl1 where k1 = 'a' is different from select * from tbl1 where k1 = 'A', so the cache should be missed. 2. according to issue 6735 , the cache key should contain the ddl sql of all views (including nested views)	2023-08-18 09:36:07 +08:00
hzq	38c182100a	[refactor](mysql compatibility) An abstract class for all databases created for mysql compatibility (#23087 ) Better code structure for mysql compatibility databases.	2023-08-18 09:16:23 +08:00
yujun	1f19d0db3e	[improvement](tablet clone) improve tablet balance, scaling speed etc (#22317 )	2023-08-17 22:30:49 +08:00
Chenyang Sun	b91bb9f503	[fix](alter table property) fix alter property if rpc failed (#22845 ) * fix alter property * add regression case * do not repeat	2023-08-17 18:02:34 +08:00
morrySnow	11d76d0ebe	[fix](Nereids) non-inner join should not merge dist info (#22979 ) 1. left join should use left dist info. 2. right join should use right dist info. 3. full outer join should return ANY dist info.	2023-08-17 17:48:50 +08:00
LiBinfeng	d7a6b64a65	[Fix](Planner) fix case function with null cast to array null (#22947 )	2023-08-17 16:37:07 +08:00
zy-kkk	a248cb720c	[fix](jdbc catalog) fix DefaultValueExpr in Jdbc table column when CTAS (#22978 )	2023-08-17 15:52:20 +08:00
Jibing-Li	3fe419eafa	[Fix](statistics)Fix update cached column stats bug (#23049 ) `show column cached stats` sometimes shows wrong min/max values: ``` mysql> show column cached stats hive.tpch100.region; +-------------+-------+------+----------+-----------+---------------+------+------+--------------+ \| column_name \| count \| ndv \| num_null \| data_size \| avg_size_byte \| min \| max \| updated_time \| +-------------+-------+------+----------+-----------+---------------+------+------+--------------+ \| r_regionkey \| 5.0 \| 5.0 \| 0.0 \| 24.0 \| 4.0 \| N/A \| N/A \| null \| \| r_comment \| 5.0 \| 5.0 \| 0.0 \| 396.0 \| 66.0 \| N/A \| N/A \| null \| \| r_name \| 5.0 \| 5.0 \| 0.0 \| 40.8 \| 6.8 \| N/A \| N/A \| null \| +-------------+-------+------+----------+-----------+---------------+------+------+--------------+ ``` This PR fixes the bug: when a ColumnStatistic object is converted to JSON, the minExpr and maxExpr attributes are not included.	2023-08-17 15:20:02 +08:00
jakevin	bf2b92f5e8	[fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation (#23066 ) * [fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation. * fix test	2023-08-17 14:50:34 +08:00
slothever	f5da9f4ccc	[fix](multi-catalog)convert to s3 path when use aws endpoint (#22784 ) For compatibility, the s3 client can also be used to access other clouds by setting the s3 endpoint properties.	2023-08-17 14:28:00 +08:00
starocean999	92c8f842f7	[fix](nereids) dphyper join reorder use wrong method to get hash and other conjuncts (#22966 ) should use getHashJoinConjuncts() and getOtherJoinConjuncts() to get hash and other conjuncts of hash join node instead of categorizing them by checking if it's 'EqualTo' expression	2023-08-17 11:03:45 +08:00
TengJianPing	343a6dc29d	[improvement](hash join) Return result early if probe side has no data (#23044 )	2023-08-17 09:17:09 +08:00
HappenLee	814acbf331	[pipeline](exec) disable pipeline load in master code (#23061 )	2023-08-16 21:53:58 +08:00
morrySnow	0594acfcf1	[fix](Nereids) scan should output all invisible columns (#23003 )	2023-08-16 18:07:59 +08:00
minghong	f1880d32d9	[fix](nereids)bind slot failed because of "default_cluster" #23008 slot bind failed for the following queries: select tpch.lineitem.* from lineitem select tpch.lineitem.l_partkey from lineitem The unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match. We need to ignore default_cluster: when comparing dbName	2023-08-16 17:22:44 +08:00
谢健	92f443b3b8	[enhancement](Nereids): count(1) to count() #22999 add a rule to transform count(1) to count()	2023-08-16 17:19:23 +08:00
LiBinfeng	2dbca7a688	[Fix](Planner) fix multi phase analysis failed in multi instance environment substitution (#22840 ) Problem: When executing group_concat with order by inside a view, the column cannot be found during analysis. Example: create view if not exists test_view as select group_concat(c1,',' order by c1 asc) from table_group_concat; select * from test_view; It returns an error like: "can not find c1 in table_list" Reason: When this sql is executed in a multi-instance environment, the Planner tries to create the plan with multi-phase aggregation. Because test_view is analyzed independently of the tables outside the view, the table information inside the view is not available. Solution: Substitute the order by expressions of the merge aggregation expressions.	2023-08-16 16:46:26 +08:00
mch_ucchi	7adb2be360	[Fix](Nereids) fix insert into return npe from follower node. (#22734 ) When an insert into table command runs on a follower node, it is forwarded to the master node. The parsed statement is not set on the cascades context but on executor::parsedStmt, so we use the latter to get the user info.	2023-08-16 16:37:17 +08:00
bobhan1	5148bc6fa7	[fix](partial update)allow delete sign column in partial update in planForPipeline (#23034 )	2023-08-16 14:20:39 +08:00
bobhan1	4510e16845	[improvement](delete) support delete predicate on value column for merge-on-write unique table (#21933 ) Previously, delete statements with conditions on value columns were only supported on duplicate tables. After we introduced the delete sign mechanism to do batch delete, a delete statement with conditions on value columns on unique tables is transformed into the corresponding insert into ..., __DELETE_SIGN__ select ... statement. However, for unique tables with merge-on-write enabled, the overhead of inserting these data can be eliminated. So this PR adds the ability to allow delete predicates on value columns for merge-on-write unique tables.	2023-08-16 12:18:05 +08:00
Calvin Kirs	3efa06e63e	[Fix](View)varchar type conversion error (#22987 )	2023-08-16 11:49:04 +08:00
zy-kkk	221e7bdd17	[test](jdbc external) fix mysql and pg external regression test (#22998 )	2023-08-16 10:44:47 +08:00
minghong	423002b20a	[fix](nereids) partitionTopN & Window estimation (#22953 ) * partitionTopN & winExpr estimation * tpcds 44/47/57	2023-08-15 20:19:03 +08:00
minghong	80566f7fed	[stats](nereids)support partition stats (#22606 )	2023-08-15 17:52:25 +08:00
HappenLee	9b2323b7fd	[Pipeline](exec) support async writer in pipeline query engine (#22901 )	2023-08-15 17:32:53 +08:00
yujun	d7a5c37672	[improvement](tablet clone) update the capacity coefficient for calculating backend load score (#22857 ) Update the capacity coefficient for calculating the backend load score: 1. Add fe config entry `backend_load_capacity_coeficient` to allow setting the capacity coefficient manually; 2. Adjust the calculation of the capacity coefficient as below. We emphasize disk usage when calculating the load score. If a BE has a high used capacity percent, we should increase its load score, so we increase the capacity coefficient with a BE's used capacity percent. But this is not enough. For example, if the tablets differ greatly in data size, the load scores of the two BEs below may be the same: BE A: disk usage = 60%, replica number = 2000 (it contains the big tablets) BE B: disk usage = 30%, replica number = 4000 (it contains the small tablets) But what we want is: first move some big tablets from A to B; after their disk usages are close, move some small tablets from B to A; finally both their disk usages and replica numbers are close. To achieve this, when the max difference between all BEs' disk usages is >= 30%, we set the capacity coefficient to 1.0 to avoid the effect of replica num. After the disk usage difference decreases, we decrease the capacity coefficient to make replica num effective.	2023-08-15 17:27:31 +08:00
谢健	7de362f646	[fix](Nereids): expand other join which has or condition (#22809 )	2023-08-15 16:49:19 +08:00
jakevin	dd09e42ca9	[enhancement](Nereids): expression unify constructor by using List (#22985 )	2023-08-15 16:47:58 +08:00
Xiangyu Wang	140ab60a74	[Enhancement](multi-catalog) add a BE selection strategy for hdfs short-circuit-read. (#22697 ) Sometimes the BEs will be deployed on the same node with DataNode, so we can use a more reasonable BE selection policy to use the hdfs short-circuit-read as much as possible.	2023-08-15 15:34:39 +08:00

5593 Commits