doris

Author	SHA1	Message	Date
xzj7019	eb2db1bfb0	[enhance](Tools) update tpch tools (#24291 ) update tpch tools: 1) extend data scale to sf1/sf100/sf1000/sf10000 2) add table schema, sql, opt config for all different scale. 3) refine result output	2023-09-14 09:47:50 +08:00
airborne12	5dede120db	[Enhancement](inverted index) fix compound predicates error (#24300 )	2023-09-14 09:21:41 +08:00
zhangstar333	9b7f041bea	[Bug](function) fix explode_json_array_int can't handle min/max values (#24284 ) the json str get value maybe beyond max/min of Int64, so add some check to limit the value, and return the max/min of Int64	2023-09-14 09:20:59 +08:00
Jerry Hu	539a7c2975	[fix](agg) Add the unimplemented functions in 'AggregateFunctionCountNotNullUnaryOld' (#24310 ) Similar to #24211	2023-09-14 09:19:02 +08:00
jakevin	93a9f1007c	[fix](Nereids): fix regression test (#24336 ) fix failed regression test by #23842	2023-09-14 01:55:09 +08:00
qiye	11afd321cb	[fix](es catalog) fix issue with select and insert from es catalog core (#24318 ) Issue Number: close #24315 The root cause of this issue is that Elasticsearch's long type allows inserting floats and strings. Doris did not handle these cases when doing type conversion. The current strategy is to take the integer before the decimal point if a float or string is found.	2023-09-13 23:07:31 +08:00
Ashin Gau	d5b490b2e7	[test](regression) add file cache regression test (#24192 ) Add file cache regression test in tpch 1g on orc&parquet format. tpch will run 3 times: 1. running without file cache 2. running with file cache for the first time 3. running with file cache for the second time The file cache configuration is already added in `be/conf/be.conf` on the regression test environment, and the available capacity is 100MB. After running the tpch 1g test, the metrics introduced by https://github.com/apache/doris/pull/19177 is like: ``` doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/file_cache"} 92808933 doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 59 doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 102400 doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/file_cache"} 89128960 doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 2132 doris_be_file_cache_segment_reader_cache_size{path="/mnt/datadisk1/gaoxin/file_cache"} 54 ```	2023-09-13 22:59:01 +08:00
Tiewei Fang	9847f7789f	[Feature](Export) `Export` sql supports to export data of `view` and `exrernal table` (#24070 ) Previously, EXPORT only supported the export of the olap table, This pr supports the export of view table and external table.	2023-09-13 22:55:19 +08:00
jakevin	d7e5f97b74	[feature](Nereids): eliminate AssertNumRows (#23842 )	2023-09-13 22:24:02 +08:00
zy-kkk	dbfacdc4af	[improvement](jdbc catalog) Optimize Loop Performance by Caching `isNebula` Method Result (#24260 )	2023-09-13 21:40:28 +08:00
Mryange	07dd6830e8	[pipelineX](refactor) add union node in pipelineX (#24286 )	2023-09-13 20:39:58 +08:00
zy-kkk	5238be24a2	[fix](jdbc catalog) Ensure Thread Safety by Refactoring isDoris&convertDateToNull Static Variable in JdbcMySQLClient (#24253 )	2023-09-13 20:19:44 +08:00
minghong	dad671af8e	[feature](nereids)prune runtime filter (tpch part) #19312 A rf is effective if it could filter target data. In this pr, a rf is effective if any one of following conditions is satisfied: A filter is applied on rf src, like T.A =1 A effective rf applied on this rf's src, denote X as src and target insertsection range. src.ndv with respect to X is smaller than target.ndv explaination of condition 2 Supplier join Nation on s_nationkey = n_nationkey join Region on n_regionkey = r_regionkey RF(nation->supplier) is effective because nation is filtered by an effective rf: RF(region->nation)	2023-09-13 20:12:08 +08:00
shuke	db9ed626da	[fix](ut) update submodule in run-be-ut.sh (#24239 ) There is no "git submodule update" when run ut, which causing compile errors in be ut in pipelines.	2023-09-13 19:43:28 +08:00
AKIRA	786a721e03	[feat](stats) Support analyze with sample automatically (#23978 ) 1. Analyze with sample automatically when table size is greater than huge_table_lower_bound_size_in_bytes(5G by default). User can disable this feature by fe option enable_auto_sample 2. Support grammer like `ANALYZE TABLE test WITH FULL` to force do full analyze whatever table size is 3. Fix bugs that tables stats doesn't get updated properly when stats is dropped, or only few column is analyzed	2023-09-13 19:42:10 +08:00
jakevin	05722b4cfd	[feature](Nereids): date/datetime parser support many complex case (#24287 ) - feature: normalize date/datetime with leading 0 - feature: support 'HH' offset in date/datetime - feature: normalize() add missing Minute/Second in Time part - feature: normalize offset HH to HH:MM - correct DateTimeFormatterUtilsTest	2023-09-13 17:30:58 +08:00
starocean999	231038f050	[fix](planner)allow infer predicate for external table (#24227 ) CREATE EXTERNAL TABLE `dim_server` ( `col1` varchar(50) NOT NULL, `col2` varchar(50) NOT NULL ) create view ads_oreo_sid_report ( `col1` , `col2` ) AS select tmp.col1,tmp.col2 from ( select 'abc' as col1,'def' as col2 ) tmp inner join dim_server ds on tmp.col1 = ds.col1 and tmp.col2 = ds.col2; select * from ads_oreo_sid_report where col1='abc' and col2='def'; before this pr, col1='abc' and col2='def' can't be pushed to dim_server. now the 2 predicates can be pushed to odbc table.	2023-09-13 17:22:39 +08:00
zclllyybb	86aa3802cf	[log](config) set streamload record default to enable	2023-09-13 16:32:30 +08:00
bobhan1	ccfc912ec0	[Fix](merge-on-write) Check the returned filtered rows from different replicas (#24191 )	2023-09-13 16:03:17 +08:00
Kaijie Chen	563c3f75ff	[feature](move-memtable) share delta writer v2 among sinks (#24066 )	2023-09-13 14:39:29 +08:00
Siyang Tang	d87b852e18	[enhancement](delete-handler) split Deletehandler#commitJob and add preconditions to intercept NPE(#24086 )	2023-09-13 14:34:12 +08:00
谢健	335064f897	[feature](Nereids) add lambda argument and array_map function (#23598 ) add array_map function SELECT ARRAY_MAP(x->x+1, ARRAY(87, 33, -49)) +----------------------------------------------------------------------+ \| array_map([x] -> (x + 1), x#1 of array(87, 33, -49)) \| +----------------------------------------------------------------------+ \| [88, 34, -48] \| +----------------------------------------------------------------------+	2023-09-13 14:24:16 +08:00
airborne12	edd711105a	[Feature](inverted index) add disjunction for inverted index query (#24263 )	2023-09-13 14:19:02 +08:00
daidai	e30c3f3a65	[fix](csv_reader)fix bug that Read garbled files caused be crash. (#24164 ) fix bug that read garbled files caused be crash.	2023-09-13 14:12:55 +08:00
Ashin Gau	9916324e9c	[fix](FileCache) the logic of selecting the cache path is reversed (#24277 ) Bug was introduced by https://github.com/apache/doris/pull/23881/files The logic of selecting the cache path is reversed, and BE will be crashed when enable file cache.	2023-09-13 13:25:07 +08:00
mch_ucchi	f985b28ac6	[fix](Nereids) default partition be prunned by mistake (#24186 ) ```sql CREATE TABLE IF NOT EXISTS t ( k1 tinyint NOT NULL, k2 smallint NOT NULL, k3 int NOT NULL, k4 bigint NOT NULL, k5 decimal(9, 3) NOT NULL, k8 double max NOT NULL, k9 float sum NOT NULL ) AGGREGATE KEY(k1,k2,k3,k4,k5) PARTITION BY LIST(k1) ( PARTITION p1 VALUES IN ("1","2","3","4"), PARTITION p2 VALUES IN ("5","6","7","8"), PARTITION p3 ) DISTRIBUTED BY HASH(k1) BUCKETS 5 properties("replication_num" = "1") select * from t where k1=10 ``` The query will return 0 rows because p3 is pruned, we fix it by skip prune default partitions. TODO: prune default partition if filter do not hit it	2023-09-13 12:04:20 +08:00
Mingyu Chen	a6f05e89f5	[doc](catalog) update cache refresh doc (#24183 ) Some cache refresh doc is missing	2023-09-13 11:45:26 +08:00
amory	2f74936382	[FIX](decimalv3) fix decimalv3 with precision cast (#24241 ) now we use cast to decimalv3 may has error , because decimalv3 use type precision for translate string mysql [test]>select cast("9999e-1" as decimal(2,1)); +------------------------------------+ \| cast('9999e-1' as DECIMALV3(2, 1)) \| +------------------------------------+ \| 999.9 \| +------------------------------------+ 1 row in set (0.01 sec) this pr will fix this just keep reaction same with mysql mysql> select cast('9999e-1' as decimalv3(2, 1)); +------------------------------------+ \| cast('9999e-1' as DECIMALV3(2, 1)) \| +------------------------------------+ \| 9.9 \| +------------------------------------+ 1 row in set (0.07 sec)	2023-09-13 11:35:33 +08:00
jakevin	7025293e17	[refactor](Nereids): new Date/Datetime parser to support more condition (#24224 ) * unify all Date/Datetime use one string-parser * support microsecond & ZoneOffset both exist * add many UT case * add determineScale() to get scale of datetime, original code just get length of part after . * reject more bad condition like 2022-01-01 00:00:00., we don't allow . without microsecond. * .....	2023-09-13 11:20:27 +08:00
AKIRA	f205473426	[feat](stats) enable set auto analyze time by set global session variable (#24026 )	2023-09-13 10:59:25 +08:00
zhangguoqiang	026cefbfbc	[Regresstion](external)open case test_doris_jdbc_catalog (#24093 ) open case test_doris_jdbc_catalog	2023-09-13 10:09:58 +08:00
zzzxl	1f769291b5	[doc](invert index) add invert index char_filter doc (#24205 )	2023-09-13 10:02:45 +08:00
zhiqqqq	c7ae2a7d22	[Refactor & Bugfix](static variables) move some static vairables to exec_env (#24029 )	2023-09-13 09:27:03 +08:00
morrySnow	1a3b70bf4a	[fix](Nereids) fix ctas bugs (#24267 ) 1. ctas should support without distribution desc 2. ctas should support column name list 3. ctas should throw exception when excution failed 4. ctas should convert null type to tinyint 5. ctas should support type conversion 6. ctas should convert first column from string to varchar	2023-09-13 09:17:57 +08:00
daidai	ebe3749996	[fix](tvf)support s3,local compress_type and append regression test (#24055 ) support s3,local compress_type and append regression test.	2023-09-13 00:32:59 +08:00
Qi Chen	9df72a96f3	[Feature](multi-catalog) Support hadoop viewfs. (#24168 ) ### Feature Support hadoop viewfs. ### Test - Regression tests: - hive viewfs test. - tvf viewfs test. - Broker load with broker and with hdfs tests manually.	2023-09-13 00:20:12 +08:00
Mingyu Chen	c402d48f97	[fix](query-cache) fix query cache with empty set (#24147 ) If the query result set is empty, the query cache will not cache the result. This PR fix it.	2023-09-12 20:11:20 +08:00
bobhan1	c926e8ff9d	[Enhancement](Status) use `Status` to expose the error info more explicitly in `FlushToken` (#24240 )	2023-09-12 19:30:16 +08:00
plat1ko	d8ef9dda59	[feature](merge-cloud) Rewrite FS interface (#23953 )	2023-09-12 19:20:25 +08:00
amory	4d107bd5dc	[FIX](regresstest) fix local tvf for doris cluter mode which here may not has fe log in #24187	2023-09-12 18:42:12 +08:00
Jerry Hu	3f7a612e76	[fix](agg) Add the unimplemented functions in AggregateFunctionOldSum. (#24211 )	2023-09-12 18:21:33 +08:00
zhiqqqq	6efeb12237	[chore](log) fix error log not aligned #24233	2023-09-12 18:18:07 +08:00
XuJianxu	7e467c91d3	[test](regression) add routine load cases (#24194 ) add routine load cases	2023-09-12 18:00:01 +08:00
airborne12	8d777e64e4	[Improvement](inverted index) return status error when create inverted index reader throwing error (#24223 ) Doris will be core when index file error when initialize column reader's index reader, try to catch throwing error and return error status.	2023-09-12 16:34:44 +08:00
zclllyybb	d3f1388717	[Feature](partitions) Support auto-partition (#24153 ) Co-authored-by: zhangstar333 <2561612514@qq.com>	2023-09-12 15:23:15 +08:00
TengJianPing	4bb9a12038	[function](bitmap) support bitmap_remove (#24190 )	2023-09-12 14:52:04 +08:00
yujun	9e0d843501	[fix](publish) publish go ahead even if quorum is not met (#23806 ) Co-authored-by: Yongqiang YANG <dataroaring@gmail.com>	2023-09-12 14:29:01 +08:00
Jibing-Li	2e2e174804	[fix](forward master op)Set default catalog and db only when they exist in master FE while executing forwarded stmt (#24212 ) In this case, forward to master will throw catalog or db not found exception: Connect to a follower: 1. create database test 2. use test 3. drop database test 4. create database test This is because after step 2, the default db in follower has been set to `test`, drop database will not change the default db. In step 4, the default db `test` is set and forwarded to master, and master will fail to find it because it is already dropped. This pr is to set the default catalog and db only when they exist. The actual reason is that, when Follower handle the `drop db` stmt, it will forward to master to execute it, but can not unset its own "current db"	2023-09-12 14:12:18 +08:00
Calvin Kirs	232f120edc	[Improve](Job)Support other types of Job query interfaces (#24172 ) - Support MTMV job - Task info add create time and sql - Optimize scheduling logic	2023-09-12 13:55:56 +08:00
HappenLee	dbf509edc0	[Debug](scan) Add debug log for find p0 scan coredump in pipeline (#24202 )	2023-09-12 12:17:44 +08:00

1 2 3 4 5 ...

13374 Commits