doris

Author	SHA1	Message	Date
yongkang.zhong	25e8c71943	[test](fix) fix postgresql test (#18900 ) * [test](fix) fix postgresql test * fix	2023-04-23 18:41:41 +08:00
Luzhijing	2c776584e5	[doc](releasenote)release 1.2.4 (#18934 ) * release 1.2.4 * Update README.md * Update sidebars.json	2023-04-23 16:04:25 +08:00
ZenoYang	0da2cf270a	[improvement](fetch data) Merge result into batch to reduce rpc times (#17828 )	2023-04-23 15:07:28 +08:00
Jerry Hu	63e8fb7300	[chore](regression) Add 'sync' after stream_load in some cases (#18945 )	2023-04-23 14:39:33 +08:00
WenYao	166bed11d4	[Enchancement](auth) Forbid to login doris from 127.0.0.1 without password (#18816 ) * forbid to login from 127.0.0.1 without password * add localhost limit * rename	2023-04-23 13:56:31 +08:00
yiguolei	61b44108e2	[bugfix](asan) fix possible asan check bug in exception to string (#18936 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-23 12:26:36 +08:00
Ashin Gau	29f502380c	[opt](FileReader) merge small IO to optimize read performace (#18796 ) Add `MergeRangeFileReader` to merge small IO to optimize parquet&orc read performance. `MergeRangeFileReader` is a FileReader that efficiently supports random access in formats like parquet and orc. In order to merge small IO in parquet and orc, the random access ranges should be generated when creating the reader. The random access ranges are a list of ranges ordered by offset. The ranges should be read sequentially; a range can be skipped, but can't be read repeatedly. When calling read_at, if the start offset is located in the random access ranges, the slice size should not span two ranges. For example, in parquet, the random access ranges are the column offsets in a row group. When reading at offset, if [offset, offset + 8MB) contains many random access ranges, the reader will read the data in [offset, offset + 8MB) as a whole, and copy the data in the random access ranges into small buffers (named boxes, default 1MB each, 64MB in total). A box can be occupied by many ranges, and uses a reference counter to record how many ranges are cached in the box. If the reference counter equals zero, the box can be released or reused by other ranges. When there is no empty box for a new read operation, the read operation is performed directly. ## Effects The runtime of ClickBench reduces from 102s to 77s, and the runtime of Query 24 reduces from 24.74s to 9.45s. The profile of Query 24: ``` VFILE_SCAN_NODE (id=0):(Active: 8s344ms, % non-child: 83.06%) - FileReadBytes: 534.46 MB - FileReadCalls: 1.031K (1031) - FileReadTime: 28s801ms - GetNextTime: 8s304ms - MaxScannerThreadNum: 12 - MergedSmallIO: 0ns - CopyTime: 157.774ms - MergedBytes: 549.91 MB - MergedIO: 94 - ReadTime: 28s642ms - RequestBytes: 507.96 MB - RequestIO: 1.001K (1001) - NumScanners: 18 ``` 1001 request IOs have been merged into 94 IOs. ## Remaining problems 1. Add p2 regression test in the next PR 2. Profiles are scattered in various codes and will be refactored in the next PR 3. Support ORC reader	2023-04-23 10:51:38 +08:00
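The merging idea described in the commit above can be illustrated with a small sketch. This is not the Doris `MergeRangeFileReader` implementation: the function name `plan_merged_reads`, the tuple layout, and the grouping rule are assumptions made for illustration, and the box cache and reference counting are left out. It only shows how offset-ordered ranges that fall inside one 8MB window could be served by a single larger read.

```python
from typing import List, Tuple

# 8MB merge window, taken from the commit message.
MERGE_WINDOW = 8 * 1024 * 1024

# A range is (start_offset, end_offset) with an exclusive end.
Range = Tuple[int, int]


def plan_merged_reads(
    ranges: List[Range], window: int = MERGE_WINDOW
) -> List[Tuple[int, int, List[Range]]]:
    """Group offset-ordered, non-overlapping ranges into merged reads.

    Hypothetical helper: each merged read starts at the first range not yet
    covered and absorbs every following range that ends inside
    [start, start + window). Returns (read_start, read_end, covered_ranges).
    """
    merged = []
    i = 0
    while i < len(ranges):
        start, end = ranges[i]
        group = [ranges[i]]
        i += 1
        # Absorb following ranges while they still fit in the window.
        while i < len(ranges) and ranges[i][1] <= start + window:
            group.append(ranges[i])
            end = max(end, ranges[i][1])
            i += 1
        merged.append((start, end, group))
    return merged


if __name__ == "__main__":
    column_ranges = [(0, 100), (200, 300), (9_000_000, 9_000_100), (9_000_200, 9_000_400)]
    for read_start, read_end, covered in plan_merged_reads(column_ranges):
        print(f"read [{read_start}, {read_end}) serves {len(covered)} ranges")
```

In this sketch a merged read also fetches the gaps between ranges, so the merged byte count can exceed the requested byte count. That is in line with the profile above, where `MergedBytes` is somewhat larger than `RequestBytes`.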
Mryange	b81b470d4f	[fix](planner) fix pr "using crchash replace murmurhash in the runtime filter" (#18759 )	2023-04-23 10:33:35 +08:00
huanghaibin	9756be6bf0	[improvement](stream-load) use vector instead of skiplist when insert dup keys (#18686 )	2023-04-23 09:40:09 +08:00
Mingyu Chen	e7ad536a71	[scirpte](download) add 1.2.4 download script (#18932 )	2023-04-23 07:40:19 +08:00
Hong Liu	bc379eebed	[doc](show-rollup)delete SHOW-ROLLUP doc. (#18924 ) Co-authored-by: smallhibiscus <844981280>	2023-04-22 23:39:24 +08:00
lsy3993	e44aad2b86	[typo](docs)add new attention of doris flink connector (#18930 )	2023-04-22 23:38:48 +08:00
TengJianPing	34ce946f5b	[tools](profile) add script file to get all tree profiles off a query (#18587 ) Add a tool script that output query profiles of all fragment instances in tree form.	2023-04-22 22:10:57 +08:00
zhangstar333	fd905b66b0	[refactor](jdbc) close datasource if no need to maintain the cache (#18724 ) After PR #18670, JVM parameters can be used to initialize the JDBC datasource. When JDBC_MIN_POOL=0 is set, the datasource can be closed immediately; there is no need to wait for the recycling timer.	2023-04-22 22:07:34 +08:00
Qi Chen	1ff2ccc6c5	[Fix](docker) Fix regression test docker issues. (#18928 ) 1. Fix not reset data after pg restarted. 2. 'docker-compose' to 'docker compose'.	2023-04-22 18:03:50 +08:00
amory	1ffd34f6f1	[Refact](type system)refact interconversion for jsonb with column (#18819 ) * refact jsonb to column * update * fix format * fixed * fix file head for compile	2023-04-22 14:01:05 +08:00
jakevin	814f12981d	[feat](Nereids): validate Project list. (#18868 )	2023-04-22 12:32:51 +08:00
yiguolei	c80dc91a78	[bugfix](memleak) UserFunctionCache may have memory leak during close (#18913 ) * [bugfix](memleak) UserFunctionCache may have memory leak during close * [bugfix](memleak) UserFunctionCache may have memory leak during close --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-22 10:15:51 +08:00
zhangdong	04d18eec59	[Improve](be)check max open file #18888	2023-04-22 08:42:43 +08:00
zzzzzzzs	a49311b48e	[typo](doc) Fixed typos in DROP-CATALOG.md (#18909 )	2023-04-22 08:39:42 +08:00
Tiewei Fang	13894ae790	[fix](jdbc catalog) Use default value if the user does not set the pool parameter in be.conf #18919	2023-04-22 08:39:26 +08:00
gitccl	a1c05b5c13	[fix](compaction) fix potential null pointer dereference (#18915 )	2023-04-22 08:38:32 +08:00
TengJianPing	b75f4c97f3	[function](string) support char function (#18878 ) * [function](string) support char function * fix	2023-04-22 08:36:48 +08:00
Mryange	de0e89d1b4	[feature](function) Modified cast as time to behave more like MySQL (#18565 ) Because the underlying type of time was float64, select cast("19:22:18" as time) would result in a null value in the past. Results in the following:	2023-04-22 06:11:59 +08:00
yiguolei	24ee391a7e	[bugfix](memoryleak) inlist is memory leak if the type is int (#18883 ) * [bugfix](memoryleak) inlist is memory leak if the type is int --------- Co-authored-by: yiguolei <yiguolei@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-04-22 00:34:10 +08:00
wudi	5db0b66bd9	update doc (#18871 ) Co-authored-by: wudi <>	2023-04-21 23:04:27 +08:00
Qi Chen	6eea3d9e2d	[Test](multi-catalog) Fix test_hive_parquet regression test order issue. (#18879 ) l_orderkey cannot guarantee unique order.	2023-04-21 22:59:34 +08:00
AlexYue	d56fed345e	[chore](doc) fix mv doc typo and cold heat separation (#18892 )	2023-04-21 22:30:56 +08:00
Mingyu Chen	313fab0802	[fix](mtmv) fix mtmv thread interruption issue (#18884 )	2023-04-21 22:27:13 +08:00
Jibing-Li	425101bf53	[fix](test)Move broker test to p2. Move test data to cos in Beijing region (#18893 ) Fix broker load p2 test case error. 1. Move test data from cos Hong kong region to Beijing region. 2. Move broker load test to p2 group. 3. Fix error message mismatch error.	2023-04-21 22:15:52 +08:00
xueweizhang	f7651d8dfb	(fix)[olap] not support in_memory=true now (#18731 ) * (fix)[olap] can not set in_memory=true now --------- Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-04-21 21:55:37 +08:00
Lei Zhang	0ae3a6df7e	[bug](bdbje) Add retry for reSetupBdbEnvironment() `restore.execute()` (#18777 ) * In reSetupBdbEnvironment() `restore.execute()` may throw NullPointerException, add retry for `restore.execute()`	2023-04-21 20:58:42 +08:00
jakevin	317d9ee152	[feat](Nereids): Simplify Agg GroupBy (#18887 )	2023-04-21 18:57:15 +08:00
lihangyu	af20b2c95e	[Bug](topn opt) Fix be crash when enable topn opt with larger thresho… (#18858 ) topn opt should be inited when update it	2023-04-21 17:45:00 +08:00
Jack Drogon	5706bef2b3	[feature](common) Add unexpected/result support (#18312 ) * Add unexpected/result support * Rename result.hpp -> result.h && Add NOLINT in expected.hpp * Add NOLINT in result.h to avoid clang-tidy checker * Rename result.h to expected.h * Add Apache License for be/src/util/expected.hpp * Disable clang-format in be util/expected.hpp	2023-04-21 17:07:20 +08:00
luozenglin	c72a46f3df	[Improvement](bitmap-filter) enable bitmap runtime filter in fuzzy mode. (#17621 )	2023-04-21 16:00:13 +08:00
zzzzzzzs	c82964a294	[typo](doc) Fixed typos in lateral-view.md (#18842 )	2023-04-21 15:59:04 +08:00
superche	5570b57e41	[typo](doc)fix invalid url (#18855 ) Co-authored-by: hechao <hechao@selectdb.com>	2023-04-21 15:58:46 +08:00
xu tao	dba27a67bc	[typo](docs) fix docs SHOW-COLUMNS.md (#18875 )	2023-04-21 15:58:29 +08:00
Liqf	ec1ab1a3d2	[Improve](GEO)wkb input and output are represented as hexadecimal strings And delete EWKB (#18721 )	2023-04-21 15:11:18 +08:00
Xiaocc	3007cd49f2	[enhancement](mysql) enable two-way ssl authentication (#18530 ) According to the mysql-ssl, enable two-way SSL authentication.	2023-04-21 14:39:14 +08:00
starocean999	c41b486e7e	[fix](nereids) LogicalProject should always has non-empty project list (#18863 )	2023-04-21 14:28:07 +08:00
jakevin	0c26f8df4d	[refactor](Nereids): move out misunderstanding func from JoinUtils (#18865 )	2023-04-21 14:11:03 +08:00
AKIRA	063dfefd80	[fix](planner) Failed to create table with CTAS when multiple varchar type filed as key (#18814 ) Add a restriction for converting varchar/char to string type: only fields that are of string type and not in the key desc can be converted to string type now.	2023-04-21 13:33:35 +08:00
ElvinWei	1a6401d682	[enchancement](statistics) support sampling collection of statistics (#18880 ) 1. Supports sampling to collect statistics 2. Improved syntax for collecting statistics 3. Supports specifying the number of histogram buckets 4. Tweaked some code structure --- The syntax supports WITH and PROPERTIES, using the same syntax as before. Column statistics collection syntax: ```SQL ANALYZE [ SYNC ] TABLE table_name [ (column_name [, ...]) ] [ [WITH SYNC] | [WITH INCREMENTAL] | [WITH SAMPLE PERCENT | ROWS ] ] [ PROPERTIES ('key' = 'value', ...) ]; ``` Column histogram collection syntax: ```SQL ANALYZE [ SYNC ] TABLE table_name [ (column_name [, ...]) ] UPDATE HISTOGRAM [ [ WITH SYNC ][ WITH INCREMENTAL ][ WITH SAMPLE PERCENT | ROWS ][ WITH BUCKETS ] ] [ PROPERTIES ('key' = 'value', ...) ]; ``` Explanation: - sync: Collect statistics synchronously. Return after collecting. - incremental: Collect statistics incrementally. Incremental collection of histogram statistics is not supported. - sample percent | rows: Collect statistics by sampling. Sampling can be by percentage or by number of rows. - buckets: Specifies the maximum number of buckets generated when collecting histogram statistics. - table_name: The target table for collecting statistics. Can be of the form `db_name.table_name`. - column_name: The specified target column must be a column that exists in `table_name`, and multiple column names are separated by commas. - properties: Properties used to set statistics tasks. Currently only the following configurations are supported (equivalent to the with statement) - 'sync' = 'true' - 'incremental' = 'true' - 'sample.percent' = '50' - 'sample.rows' = '1000' - 'num.buckets' = 10 --- TODO: - Supplement the complete p0 test - `Incremental` statistics see #18653	2023-04-21 13:11:43 +08:00
Jibing-Li	ae76b59f2f	[fix](external table) Use FederationBackendPolicy in Coordinator for ExternalScanNode #18860	2023-04-21 12:35:45 +08:00
Jiwen liu	2cc811bd54	[typo](docs)Fix explode_json_array document error (#18867 )	2023-04-21 12:35:14 +08:00
morrySnow	b84bd156fb	[enhancement](Nereids) two phase read for topn (#18829 ) Add the two phase read topn opt. The legacy planner's PRs are: - #15642 - #16460 - #16848 TODO: we forbid limit(sort(project(scan))) because BE core dumps when the plan has a project on the scan. We need to remove this restriction after we fix the BE bug.	2023-04-21 12:05:22 +08:00
Mingyu Chen	fc63747f59	[improvement](test) remove set global (#18807 )	2023-04-21 11:24:20 +08:00
lihangyu	8cc0af150a	[Fix](dynamic table) fix dynamic table with insert into and column al… (#18808 ) 1. The num_rows should be correctly set 2. insert into has no dynamic column	2023-04-21 11:19:00 +08:00

10071 Commits