Originally, a query with ORDER BY and no explicit LIMIT returned at most 65535 rows by default,
but many businesses do not want this cap.
To retrieve the full result set, users had to append a large LIMIT value to the query statement,
which is extremely inconvenient, so this behavior has been adjusted.
At the same time, the variable DEFAULT_ORDER_BY_LIMIT was added to SessionVariable,
with a default value of -1. If the user does not use the LIMIT keyword, or the LIMIT value is a negative integer,
the query returns up to Long.MAX_VALUE rows by default. If a maximum query value is set,
the number of rows returned is bounded by that maximum, or by the value that follows the
LIMIT keyword.
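A minimal sketch of the resolution rule described above; the class, method, and parameter names are hypothetical, not the actual FE code:

```java
// Sketch of the LIMIT resolution rule (illustrative names only).
public class LimitResolver {
    /**
     * @param userLimit           value after the LIMIT keyword, or null if absent
     * @param defaultOrderByLimit value of the session variable DEFAULT_ORDER_BY_LIMIT (-1 by default)
     * @return the maximum number of rows the query may return
     */
    static long resolveLimit(Long userLimit, long defaultOrderByLimit) {
        // An explicit, non-negative LIMIT wins outright.
        if (userLimit != null && userLimit >= 0) {
            return userLimit;
        }
        // No LIMIT (or a negative one): fall back to the session variable,
        // or to Long.MAX_VALUE when the variable keeps its default of -1.
        return defaultOrderByLimit < 0 ? Long.MAX_VALUE : defaultOrderByLimit;
    }

    public static void main(String[] args) {
        System.out.println(resolveLimit(null, -1)); // 9223372036854775807
        System.out.println(resolveLimit(-5L, 100)); // 100
        System.out.println(resolveLimit(10L, 100)); // 10
    }
}
```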
When executing a CREATE TABLE AS SELECT statement,
varchar/char/string columns in the created table are now unified to the string type.
This is because when selecting from an external table (MySQL, PostgreSQL, etc.), the varchar length in the external database
is measured in "char" length, not "byte" length.
So a varchar(10) column in the external table used to produce an identical varchar(10)
column in the created table, but the byte length of the external data may be larger than 10, causing the CTAS to fail.
Changing these columns to string impacts neither performance nor disk storage capacity.
Note that if a string-type column is the first column, it is changed to varchar(65535) instead,
because we do not allow a string-type column as a sort key column.
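A minimal sketch of this column-type rewrite, assuming types are represented as plain strings (the real frontend works on its own type objects; all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the CTAS column-type rewrite described above.
// Hypothetical representation: types as plain strings.
public class CtasTypeMapping {
    /** Rewrites varchar/char/string source types for the created table. */
    static List<String> mapColumnTypes(List<String> sourceTypes) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < sourceTypes.size(); i++) {
            String t = sourceTypes.get(i).toLowerCase();
            boolean isCharLike = t.startsWith("varchar") || t.startsWith("char") || t.equals("string");
            if (!isCharLike) {
                result.add(sourceTypes.get(i));
            } else if (i == 0) {
                // string is not allowed as a sort key column, so the
                // first column falls back to the widest varchar.
                result.add("varchar(65535)");
            } else {
                result.add("string");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mapColumnTypes(List.of("varchar(10)", "int", "char(5)")));
        // [varchar(65535), int, string]
    }
}
```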
When creating a Unique Key table, you can now specify the mapping from the sequence column to another column
at table-creation time, so you no longer need to specify the mapping column on every import.
Invalidate the catalog/db/table cache when executing
refresh catalog/db/table.
Tested on a table with 10000 partitions, the refresh operation costs about 10-20 ms.
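A minimal sketch of the invalidate-on-refresh idea, using a plain concurrent map as a stand-in for the FE's metadata cache; class and method names here are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only (hypothetical names, not the actual FE cache):
// a refresh drops the cached entry so the next access reloads fresh
// metadata, instead of eagerly reloading everything during the refresh.
public class MetaCache {
    private final ConcurrentMap<String, Object> tableCache = new ConcurrentHashMap<>();

    Object getTable(String name) {
        return tableCache.computeIfAbsent(name, this::loadTableMeta);
    }

    void refreshTable(String name) {
        // Dropping the entry is cheap regardless of partition count,
        // which keeps the refresh itself fast.
        tableCache.remove(name);
    }

    private Object loadTableMeta(String name) {
        return new Object(); // placeholder for a remote metastore call
    }

    public static void main(String[] args) {
        MetaCache cache = new MetaCache();
        Object before = cache.getTable("t");
        cache.refreshTable("t");             // invalidate on refresh
        Object after = cache.getTable("t");  // reloaded on next access
        System.out.println(before != after); // true
    }
}
```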
Support Aliyun DLF.
Support data on S3-compatible object storage, such as Aliyun OSS.
Refactor some catalog interfaces to make them tidier.
Fix a bug where the default text-format field delimiter for Hive should be \x01.
Add a new class, PooledHiveMetaStoreClient, to wrap IMetaStoreClient.
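A minimal sketch of what such a pooled wrapper can look like. IMetaStoreClient is the real Hive interface (from the hive-metastore dependency), but the pooling scheme, class name, and factory below are illustrative assumptions, not the actual Doris implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;

// Hypothetical sketch of a pooled wrapper: clients are borrowed for a call
// and returned afterwards, so concurrent metadata requests do not each pay
// the connection setup cost.
public class PooledHiveMetaStoreClientSketch {
    /** Hypothetical factory for creating new metastore connections. */
    public interface ClientFactory {
        IMetaStoreClient create() throws Exception;
    }

    private final Deque<IMetaStoreClient> idle = new ArrayDeque<>();
    private final ClientFactory factory;
    private final int maxIdle;

    public PooledHiveMetaStoreClientSketch(ClientFactory factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    /** Borrow an idle client, or create a new one if the pool is empty. */
    public synchronized IMetaStoreClient borrow() throws Exception {
        IMetaStoreClient c = idle.pollFirst();
        return c != null ? c : factory.create();
    }

    /** Return a client to the pool, closing it if the pool is already full. */
    public synchronized void giveBack(IMetaStoreClient c) {
        if (idle.size() < maxIdle) {
            idle.offerFirst(c);
        } else {
            c.close();
        }
    }
}
```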
## Design
### Trigger
Every time a rowset writer produces more than N (e.g. 10) segments, we trigger segment compaction. Note that at most one segment compaction job runs for a single rowset at a time, to avoid a recursing/queuing nightmare.
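The trigger rule can be illustrated with a small sketch. The real implementation lives in the C++ backend; the Java below is only an illustration with hypothetical names:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration of the trigger rule: compaction fires once more than N
// segments accumulate, and at most one compaction runs per rowset.
// Hypothetical names, not the actual backend code.
public class SegCompactionTrigger {
    static final int N = 10; // segments per compaction trigger

    private final AtomicBoolean compacting = new AtomicBoolean(false);
    private int segmentsSinceLastTrigger = 0;

    /** Called by the rowset writer after each segment is flushed. */
    synchronized void onSegmentProduced() {
        segmentsSinceLastTrigger++;
        if (segmentsSinceLastTrigger > N && compacting.compareAndSet(false, true)) {
            segmentsSinceLastTrigger = 0;
            submitCompactionTask(); // runs on the compaction thread pool
        }
    }

    /** Called by the worker when a compaction task finishes. */
    void onCompactionFinished() {
        compacting.set(false);
    }

    private void submitCompactionTask() { /* placeholder */ }
}
```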
### Target Selection
We collect candidate segments on every trigger. We skip big segments whose row count exceeds M (e.g. 10000), because compacting them yields little benefit compared to the effort. Hence, we only pick the "longest consecutive small" segment group for actual compaction.
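A sketch of that selection scan under the stated assumptions (per-segment row counts, threshold M); the names are hypothetical and the real code is in the C++ backend:

```java
import java.util.List;

// Sketch of the "longest consecutive small" selection: scan the segment
// row counts, track runs of segments with rowCount <= M, and keep the
// longest run. Returns {startIndex, length}. Illustrative names only.
public class SegmentSelector {
    static final int M = 10000; // a segment with more rows is "big" and skipped

    static int[] longestConsecutiveSmall(List<Integer> rowCounts) {
        int bestStart = 0, bestLen = 0, curStart = 0, curLen = 0;
        for (int i = 0; i < rowCounts.size(); i++) {
            if (rowCounts.get(i) <= M) {
                if (curLen == 0) curStart = i;
                curLen++;
                if (curLen > bestLen) { bestStart = curStart; bestLen = curLen; }
            } else {
                curLen = 0; // a big segment breaks the run
            }
        }
        return new int[] { bestStart, bestLen };
    }

    public static void main(String[] args) {
        // small, small, BIG, small, small, small -> run of 3 starting at index 3
        int[] r = longestConsecutiveSmall(List.of(100, 200, 50000, 10, 20, 30));
        System.out.println(r[0] + ", " + r[1]); // 3, 3
    }
}
```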
### Compaction Process
A new thread pool is introduced to do the job. We submit the above-mentioned "longest consecutive small" segment group to the pool. The worker thread then does the following (see the sketch after this list):
- build a MergeIterator from the target segments
- create a new segment writer
- for each block read from the MergeIterator, append it with the writer
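A sketch of the worker-loop shape, with hypothetical interfaces standing in for the backend's merge iterator and segment writer (the real code is C++):

```java
// Shape of the worker-thread loop. The nested interfaces are hypothetical
// stand-ins for the backend's block, merge iterator, and segment writer.
public class CompactionWorker {
    interface Block {}
    interface MergeIterator { Block next(); /* returns null when exhausted */ }
    interface SegmentWriter { void append(Block b); void finish(); }

    static void compact(MergeIterator merged, SegmentWriter writer) {
        // Read merged blocks from all target segments in order and
        // append them to the single output segment.
        for (Block b = merged.next(); b != null; b = merged.next()) {
            writer.append(b);
        }
        writer.finish();
    }
}
```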
### SegID handling
SegID must remain consecutive after segment compaction.
If a rowset has small segments named seg_0, seg_1, seg_2, seg_3 and a big segment seg_4 (see the sketch after this list):
- we create a segment named "seg_0-3" to save compacted data for seg_0, seg_1, seg_2 and seg_3
- delete seg_0, seg_1, seg_2 and seg_3
- rename seg_0-3 to seg_0
- rename seg_4 to seg_1
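A small sketch of this renumbering rule; the renumber helper and its naming scheme are hypothetical, used only to walk through the example above:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the renumbering in the example above: the compacted output
// "seg_0-3" replaces seg_0..seg_3, and trailing segments shift down so
// segment ids stay consecutive. Hypothetical names only.
public class SegIdRenamer {
    static List<String> renumber(int compactBegin, int compactEnd, int totalSegs) {
        List<String> finalNames = new ArrayList<>();
        // seg_<begin>-<end> is renamed to seg_<begin>.
        finalNames.add("seg_" + compactBegin + " (was seg_" + compactBegin + "-" + compactEnd + ")");
        // Every segment after the compacted range slides down by
        // (end - begin) so ids remain consecutive.
        int shift = compactEnd - compactBegin;
        for (int i = compactEnd + 1; i < totalSegs; i++) {
            finalNames.add("seg_" + (i - shift) + " (was seg_" + i + ")");
        }
        return finalNames;
    }

    public static void main(String[] args) {
        // seg_0..seg_3 compacted, seg_4 untouched, as in the example above.
        renumber(0, 3, 5).forEach(System.out::println);
        // seg_0 (was seg_0-3)
        // seg_1 (was seg_4)
    }
}
```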
It is worth noting that we must wait for in-flight segment compaction tasks to finish before building the rowset meta and committing this txn.