doris

Author	SHA1	Message	Date
Xiaocc	3007cd49f2	[enhancement](mysql) enable two-way ssl authentication (#18530 ) According to the mysql-ssl, enable two-way SSL authentication.	2023-04-21 14:39:14 +08:00
starocean999	c41b486e7e	[fix](nereids) LogicalProject should always has non-empty project list (#18863 )	2023-04-21 14:28:07 +08:00
jakevin	0c26f8df4d	[refactor](Nereids): move out misunderstanding func from JoinUtils (#18865 )	2023-04-21 14:11:03 +08:00
AKIRA	063dfefd80	[fix](planner) Failed to create table with CTAS when multiple varchar type filed as key (#18814 ) Add restricton for converting varchar/char to string type, only fields that is string type and not in key desc could be convert to string type now.	2023-04-21 13:33:35 +08:00
ElvinWei	1a6401d682	[enchancement](statistics) support sampling collection of statistics (#18880 ) 1. Supports sampling to collect statistics 2. Improved syntax for collecting statistics 3. Support histogram specifies the number of buckets 4. Tweaked some code structure --- The syntax supports WITH and PROPERTIES, using the same syntax as before. Column Statistics Collection Syntax: ```SQL ANALYZE [ SYNC ] TABLE table_name [ (column_name [, ...]) ] [ [WITH SYNC] \| [WITH INCREMENTAL] \| [WITH SAMPLE PERCENT \| ROWS ] ] [ PROPERTIES ('key' = 'value', ...) ]; ``` Column histogram collection syntax: ```SQL ANALYZE [ SYNC ] TABLE table_name [ (column_name [, ...]) ] UPDATE HISTOGRAM [ [ WITH SYNC ][ WITH INCREMENTAL ][ WITH SAMPLE PERCENT \| ROWS ][ WITH BUCKETS ] ] [ PROPERTIES ('key' = 'value', ...) ]; ``` Illustrate： - sync：Collect statistics synchronously. Return after collecting. - incremental：Collect statistics incrementally. Incremental collection of histogram statistics is not supported. - sample percent \| rows：Collect statistics by sampling. Scale and number of rows can be sampled. - buckets：Specifies the maximum number of buckets generated when collecting histogram statistics. - table_name: The purpose table for collecting statistics. Can be of the form `db_name.table_name`. - column_name: The specified destination column must be a column that exists in `table_name`, and multiple column names are separated by commas. - properties：Properties used to set statistics tasks. Currently only the following configurations are supported (equivalent to the with statement) - 'sync' = 'true' - 'incremental' = 'true' - 'sample.percent' = '50' - 'sample.rows' = '1000' - 'num.buckets' = 10 --- TODO: - Supplement the complete p0 test - `Incremental` statistics see #18653	2023-04-21 13:11:43 +08:00
Jibing-Li	ae76b59f2f	[fix](external table) Use FederationBackendPolicy in Coordinator for ExternalScanNode #18860	2023-04-21 12:35:45 +08:00
morrySnow	b84bd156fb	[enhancement](Nereids) two phase read for topn (#18829 ) add two phase read topn opt, the legacy planner's PR are: - #15642 - #16460 - #16848 TODO: we forbid limit(sort(project(scan))) since be core when plan has a project on the scan. we need to remove this restirction after we fix be bug	2023-04-21 12:05:22 +08:00
Yulei-Yang	c6b1b9de80	[Improvement](broker) support broker load from tencent Goose File System (#18745 ) Including below functions: 1. broker load 2. export 3. select into outfile 4. create repo and backup to gfs after config env, use gfs like other hdfs system.	2023-04-20 23:12:17 +08:00
slothever	097dcf2119	[fix](outfile) unify broker and hdfs path in outfile (#18809 ) unify broker and hdfs path in outfile fix fe ut and add outfile case	2023-04-20 21:01:39 +08:00
奕冷	94509e51af	[fix](editLog) add sufficient replay logic and edit log for altering light schema change (#18746 )	2023-04-20 19:20:03 +08:00
Jerry Hu	c4e469c82c	[feature](agg) Support spill to disk in aggregation (#18051 )	2023-04-20 18:59:08 +08:00
LiBinfeng	668c681fbc	[Fix](Nereids) Check bound status in analyze straight after bounding (#18581 ) Probleam: Dead loop cause of keep pushing analyze tasks into job stack. When doing analyze process and generate new operators, the same analyze rule would be pushed again, so it cause dead loop. And analyze process generate new operators when trying to bound order by key and aggregate function. Solve: We need to make it throw exception before complex analyze and rewrite process, so checking whether all expressions being bound should be done twice. One is done after bounding all expression, another is done after all analyze process in case of generate new expressions and new operators. Example: Cases were put in file: regression-test/suites/nereids_p0/except/test_bound_exception.groovy	2023-04-20 18:50:13 +08:00
Tiewei Fang	8e2146f48c	[Enhencement](Export) support export with outfile syntax (#18325 ) `Export` syntax provides asynchronous export function, but `Export` does not achieve vectorization. `Outfile` syntax provides synchronous export function`. So we can reimplement the export syntax with oufile syntax.	2023-04-20 17:27:04 +08:00
starocean999	ea795b9909	[fix](nereids)disable SelectMaterializedIndexWithAggregate rule (#18380 ) * [fix](nereids)disable SelectMaterializedIndexWithAggregate rule * rebase code * disable related test cases * remove failed test cases for now	2023-04-20 17:02:36 +08:00
Gabriel	c659e0bfc7	[Improvement](bloom filter) adjust bloom filter size (#18846 )	2023-04-20 16:50:22 +08:00
morrySnow	3644dfa9fd	[fix](Nereids) stddev functions not support decimalv3 type arg (#18840 )	2023-04-20 14:54:12 +08:00
jakevin	52d32cccad	[enhance](Nereids): check cycle by getParentGroupExpressions(). (#18687 )	2023-04-20 11:51:58 +08:00
Qi Chen	3328a65b75	[Fix](mutli-catalog) Use decimal v3 type to fix decimal loss issue in multi-catalog module. (#18835 ) Fix decimal v3 precision loss issues in the multi-catalog module. Now it will use decimal v3 to represent decimal type in the multi-catalog module. Regression Test: `test_load_with_decimal.groovy`	2023-04-20 11:02:53 +08:00
HappenLee	33d4c60570	[RegressionTest](fuzzy) enable set global enable_pipeline_engine (#18832 ) enable set global enable_pipeline_engine	2023-04-20 10:38:11 +08:00
zclllyybb	fb377a9da9	[Improvement](functions)Optimized some datetime function's return value (#18369 )	2023-04-19 15:51:11 +08:00
jakevin	1f5f5a12b6	[fix](Nereids): need update parentExpression after replace child. (#18771 )	2023-04-19 15:13:42 +08:00
slothever	93b35bbfbf	[feature](multi-catalog) add catalog comment and create time info (#18778 ) add catalog comment and create time info ``` create catalog hms_ctl comment 'your comment' properties ( 'type'='hms', 'hive.metastore.uris' = 'thrift://xx:1234' ); ``` Create Time will generate when the catalog is created. use show catalogs and show create catalog to get these info.	2023-04-19 15:08:42 +08:00
Jibing-Li	1a25f110ec	[Fix](planner)Fix TupleDescriptor include not materialized slot bug (#18783 ) setOutputSmap function in ScanNode may include not materialized to outputTupleDesc. This PR is to fix this.	2023-04-19 14:08:09 +08:00
minghong	446db3def6	[opt](nereids) estimate broadcast cost by a new formula (#18744 ) estimate broadcast cost by an experience formula: beNumber^0.5 * rowCount 1. sender number and receiver number is not available at RBO stage now, so we use beNumber 2. senders and receivers work in parallel, that why we use square of beNumber	2023-04-19 12:14:55 +08:00
Gabriel	15529afed8	[minor](decimal) forbid to create table with decimal type exceeds 18 (#18763 ) * [minor](decimal) forbid to create table with decimal type exceeds 18 * update	2023-04-19 11:34:27 +08:00
zhangstar333	0b379de602	[refactor](scan) optimize the agg function of count(1) (#18739 )	2023-04-19 09:10:51 +08:00
赵立伟	d24a8a524e	[refactor](fe): Remove resource group which is useless (#18249 )	2023-04-18 21:04:30 +08:00
luozenglin	5c076b738b	[improvement](resource-group) add test for resource group (#18575 ) Co-authored-by: wangbo <youseebiggirl_t_t@qq.com>	2023-04-18 20:20:50 +08:00
Xin Liao	4a16eff16d	[fix](merge-on-write) enable_unique_key_merge_on_write property should only be used for unique table (#18734 )	2023-04-18 18:40:01 +08:00
AKIRA	031d35d4a1	[fix](stats) Stats still in cache after user dropped it (#18720 ) 1. Evict the dropped stats from cache 2. Remove codes for the partition level stats collection 3. Disable analyze whole database directly 4. Fix the potential death loop in the stats cleaner 5. Sleep thread in each loop when scanning stats table to avoid excessive IO usage by this task.	2023-04-18 16:41:10 +08:00
Gabriel	c3f808cc06	Revert "[enhancement](Nereids) optimize bloom filter size reducing strategy (#18596 )" (#18768 ) This reverts commit 3eac53f75d5f3eb05e958403efeb7578ad86e438.	2023-04-18 15:37:19 +08:00
xueweizhang	62e4140d17	[fix](olap) fix lost disable_auto_compaction info when fe restart (#18757 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-04-18 14:11:40 +08:00
zhangstar333	6b351a2818	[vectorzied](function) fix array_map function analyzed failed with order by clause (#18676 ) * [vectorzied](function) fix array_map function analyzed failed with order by clause * add test	2023-04-18 12:01:44 +08:00
jakevin	3a6eae0ec5	[feature](Nereids): infer not null from Agg Count(distinct). (#18599 )	2023-04-18 11:22:36 +08:00
slothever	98b8efc2c2	[fix](multi-catalog)fix old s3 properties check (#18430 ) fix old s3 properties check fix for #18005 (comment)	2023-04-18 09:58:13 +08:00
jakevin	10b252856d	[feature](Nereids): pullup semiJoin through aggregate. (#18669 )	2023-04-18 09:31:07 +08:00
jakevin	86b8e95045	[fix](Nereids): when GroupExpr already exists, we need to remove ParentExpression (#18749 )	2023-04-17 23:12:26 +08:00
Calvin Kirs	575c1620c2	[Improve](fe)Use commons-lang3 uniformly and refactor PatternGenerator#generateTypePattern (#18666 ) `commons-lang`(1and2) is no longer maintained since 2011, and the official recommendation is `commons-lang3`, which can be smoothly upgraded to be compatible with `commons-lang`. We use both dependencies in `fe`, which can be completely unified. `PatternGenerator#generateTypePattern` has many meaningless loops, and IntegerRange is introduced for, which is unnecessary. So I refactored it.	2023-04-17 20:15:17 +08:00
Gabriel	74d424e6d4	[Bug](DECIMAL) Fix bug for arithmatic expr DECIMALV2 / DECIMALV3 (#18723 )	2023-04-17 16:43:36 +08:00
jakevin	d61f52d277	[fix](Nereids): fix sum func in eager agg (#18675 )	2023-04-17 15:06:28 +08:00
Gabriel	5300b21db7	[Bug](DECIMALV3) report failure if a decimal value is overflow (#18336 )	2023-04-17 13:18:14 +08:00
HappenLee	eb128753ac	[Opt](pipeline) opt pipeline shared scan (#18715 )	2023-04-17 13:06:39 +08:00
minghong	a2278dbc6c	[opt](nereids) optimize filter estimation for pattern "col=col" #18716 Tpc-h q10 and q5 benefit from this optimization. For a given hash join condition, A=B, sometimes both A and B are reduced by filters. In this pr, both reductions are counted in join estimation.	2023-04-17 11:44:35 +08:00
jakevin	b5b0148010	[feature](Nereids): when cost time > 5s, throw timeout Exception (#18316 )	2023-04-17 11:21:54 +08:00
morrySnow	3eac53f75d	[enhancement](Nereids) optimize bloom filter size reducing strategy (#18596 )	2023-04-17 10:50:08 +08:00
Lei Zhang	e6884a3768	[log](fe) add more detail log for master transfer (#17350 ) (#17485 )	2023-04-16 18:35:06 +08:00
Mingyu Chen	1cbbc60822	[feature](config) support "experimental" prefix for FE config (#18699 ) For each release of Doris, there are some experimental features. These feature may not stable or qualified enough, and user need to use it by setting config or session variables, eg, set enable_mtmv = true, otherwise, these feature is disable by default. We should explicitly tell user which features are experimental, so that user will notice that and decide whether to use it. Changes In this PR, I support the experimental_ prefix for FE config and session variables. Session Variable Given enable_nereids_planner as an example. The Nereids planner is an experimental feature in Doris, so there is an EXPERIMENTAL annotation for it: @VariableMgr.VarAttr(..., expType = ExperimentalType.EXPERIMENTAL) private boolean enableNereidsPlanner = false; And for compatibility, user can set it by: set enable_nereids_planner = true; set experimental_enable_nereids_planner = true; And for show variables, it will only show experimental_enable_nereids_planner entry. And you can also see all experimental session variables by: show variables like "%experimental%" Config Same as session variable, give enable_mtmv as an example. @ConfField(..., expType = ExperimentalType.EXPERIMENTAL) public static boolean enable_mtmv = false; User can set it in fe.conf or ADMIN SET FRONTEND CONFIG stmt with both names: enable_mtmv experimental_enable_mtmv And user can see all experimental FE configs by: ADMIN SHOW FRONTEND CONFIG LIKE "%experimental%"; TODO Support this feature for BE config Only add experimental for: enable_pipeline_engine enable_nereids_planner enable_single_replica_insert and FE config: enable_mtmv enabel_ssl enable_fqdn_mode Should modify other config and session vars	2023-04-16 18:32:10 +08:00
yongkang.zhong	afdac1204d	[improve](postgresql catalog) support postgresql bytea type to doris string (#18623 ) * [improve](postgresql catalog) support postgresql bytea type to doris string * modify function name * add case	2023-04-16 18:14:42 +08:00
lihangyu	7bc242d665	[regression-test](prepared statement) Fix connection error when test framework used lower jdbc version (#18665 )	2023-04-16 18:13:45 +08:00
Yulei-Yang	c12646382d	[feature](multicatalog) enable doris hive/iceberg catalog to read data on tencent GooseFS (#18685 )	2023-04-16 18:11:57 +08:00

1 2 3 4 5 ...

3161 Commits