doris

Author	SHA1	Message	Date
wucheng	6af1c52e13	[Feature] add support for tencent chdfs (#8963 ) Co-authored-by: chengwu <chengwu@tencent.com>	2022-04-12 16:02:42 +08:00
jiafeng.zhang	51269efbb7	[improvement]Disable mini load (#8955 ) Disable miniload by default	2022-04-12 16:01:03 +08:00
Gabriel	0d761f9909	[feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678 ) This feature is proposed in DSIP-1. This PR support variable-length input and output Java UDF.	2022-04-11 09:36:16 +08:00
Mingyu Chen	936b942e3a	[fix](error-code) replace invalid format specifier (#8940 ) change %lu and %ld to %d	2022-04-10 20:37:32 +08:00
Jiading Guo	1ee8633e5e	[fix](account) use LOG.info instead of LOG.debug (#8911 ) This complements (#8849)	2022-04-09 19:18:13 +08:00
HappenLee	ce6b5169c2	[fix](join) Fix error bucket num get in bucket shuffle join in dynamic partition (#8891 )	2022-04-09 19:11:44 +08:00
Henry2SS	a290104966	[fix](routine load) Routine load task doesn't reallocate when previous BE is down. (#8824 ) if previous be is not alive, should assigned another available BE instead.	2022-04-09 19:02:55 +08:00
morrySnow	ddf7ef9327	[improvement](join) update broadcast join cost algorithm (#8695 ) broadcast join cost is used compressed data size currently. The amount of memory used may be significantly more than estimated. This patch: 1. add a compressed ratio to broadcast join cost and set to 5 according to the experience. 2. add a new session variable `auto_broadcast_join_threshold` to limit memory used by broadcast in bytes, the default value is 1073741824(1GB)	2022-04-09 19:00:27 +08:00
Pxl	453485abfb	[Bug] Fix some bugs(rewrite rule/symbol transport) of `like predicate` (#8770 )	2022-04-08 14:32:09 +08:00
camby	c5718928df	[feature-wip](array-type) support explode and explode_outer table function (#8766 ) explode(ArrayColumn) desc: > Create a row for each element in the array column. explode_outer(ArrayColumn) desc: > Create a row for each element in the array column. Unlike explode, if the array is null or empty, it returns null. Usage example: 1. create a table with array column, and insert some data; 2. open enable_lateral_view and enable_vectorized_engine; ``` set enable_lateral_view = true; set enable_vectorized_engine=true; ``` 3. use explode_outer ``` > select * from array_test; +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 3 \| NULL \| NULL \| \| 1 \| 2 \| [1, 2] \| \| 2 \| 3 \| NULL \| \| 4 \| NULL \| [] \| +------+------+--------+ > select k1,explode_column from array_test LATERAL VIEW explode_outer(k3) TempExplodeView as explode_column; +------+----------------+ \| k1 \| explode_column \| +------+----------------+ \| 1 \| 1 \| \| 1 \| 2 \| \| 2 \| NULL \| \| 4 \| NULL \| \| 3 \| NULL \| +------+----------------+ ``` 4. explode usage example. explode return empty rows while the ARRAY is null or empty ``` > select k1,explode_column from array_test LATERAL VIEW explode(k3) TempExplodeView as explode_column; +------+----------------+ \| k1 \| explode_column \| +------+----------------+ \| 1 \| 1 \| \| 1 \| 2 \| +------+----------------+ ```	2022-04-08 12:11:04 +08:00
HappenLee	fa8e4ec2f0	[fix] Disable cast operation of object type (#8882 ) Disable cast between string and object type(bitmap, hll, quantile_state)	2022-04-08 09:13:56 +08:00
EmmyMiao87	e3daa9580a	[Fix](Lateral View) The Error expr type when exploding a function result of inline view (#8851 ) Fixed #8850 The column in inline view maybe a function instead of slotRef. So when this column is used as the input of explode function, it can't be converted to slotRef. The correct way is to treat it as an Expr and extract the required slotRef for materialization. For example: ``` with d as (select k1+k1 as k1_plus from table) select k1_plus from d explode_split(k1_plus, ",") ``` FnExp: SlorRef<k1_plus> SubstituteFnExpr: functionCallExpr<k1+k1> originSlotRefList: SlotRef<k1>	2022-04-08 09:08:55 +08:00
Jiading Guo	318feb01f3	[improvement](account) support to account management sql (#8849 ) Add [IF EXISTS] support to following statements: - CREATE [IF NOT EXISTS] USER - CREATE [IF NOT EXISTS] ROLE - DROP [IF EXISTS] USER - DROP [IF EXISTS] ROLE	2022-04-08 09:08:08 +08:00
Xin Liao	32bba15e34	[refactor][fix] remove useless import in Config.java (#8878 )	2022-04-07 11:40:05 +08:00
morrySnow	64d18364db	[improvement](restore) set table property 'dynamic_partition.enable' to false after restore (#8852 ) when restore table with dynamic partition properties, 'dynamic_partition.enable' is set to the backup time value. but Doris could not turn on dynamic partition automatically when restore. So we cloud see table never do dynamic partition with dynamic_partition.enable is set to 'true'.	2022-04-07 11:34:01 +08:00
Mingyu Chen	ce50c4d826	[feature](diagnose) support "ADMIN DIAGNOSE TABLET" stmt (#8839 ) `ADMIN DIAGNOSE TABLET tablet_id` This statement makes it easier to quickly diagnose the status of a tablet. See "ADMIN-DIAGNOSE-TABLET.md" for details ``` mysql> admin diagnose tablet 10196; +----------------------------------+------------------------------+------------+ \| Item \| Info \| Suggestion \| +----------------------------------+------------------------------+------------+ \| TabletExist \| Yes \| \| \| TabletId \| 10196 \| \| \| Database \| default_cluster:db1: 10192 \| \| \| Table \| tbl1: 10194 \| \| \| Partition \| tbl1: 10193 \| \| \| MaterializedIndex \| tbl1: 10195 \| \| \| Replicas(ReplicaId -> BackendId) \| {"10197":10002} \| \| \| ReplicasNum \| OK \| \| \| ReplicaBackendStatus \| Backend 10002 is not alive. \| \| \| ReplicaVersionStatus \| OK \| \| \| ReplicaStatus \| OK \| \| \| ReplicaCompactionStatus \| OK \| \| +----------------------------------+------------------------------+------------+ ```	2022-04-07 11:30:03 +08:00
jiafeng.zhang	e72ccfd80c	[Refactor][httpv2]remove http v1 code (#8848 ) http v2 has been actually tested in production, and it is completely replaceable to have http code. In order to simplify code maintenance, remove the previous http part of the code	2022-04-07 08:38:29 +08:00
caiconghui	98cab78320	[refactor](schema_hash) remove schema_hash since every tablet id in be is unique (#8574 )	2022-04-07 08:37:45 +08:00
caiconghui	319f1f634a	[fix](ut) fix fe run CreateTableAsSelectStmtTest ,UserPropertyTest, ProjectPlannerFunctionTest and AggregateTest failed (#8838 )	2022-04-06 15:23:49 +08:00
HappenLee	33736e45fa	[fix](table-function) Fixed unreasonable nullable conversion (#8818 )	2022-04-03 11:02:35 +08:00
GoGoWen	a8417e6c8b	[fix](restore) fix restore issue when meta version is too low (#8816 ) When restore snapshot from 0.13 to master, the restore job is pending for long time. However, we get error "Could not set meta version to 93 since it is lower than minimum required version 100" in log. We should cancel restore job once get that error.	2022-04-03 10:56:23 +08:00
jiafeng.zhang	eed4908790	[chore](deps) upgrade spring to 2.6.2 to 2.6.6 (#8802 )	2022-04-03 10:52:31 +08:00
EmmyMiao87	0e3b15f2d7	[fix](colocate) Fix the error colocate plan when query is (rollup + instance >1) (#8779 ) The Repeat Node will change the fragment data partition. So the output partition of child fragment is different from the data partition of current fragment. When judging whether colocate can be enabled, the current data partition of fragment should be used directly instead of the child's output partition. Before this PR fix, queries with '''rollup + concurrency greater than 1''' may have incorrect results. For example: ``` select t1.tc1,t1.tc2,sum(t1.tc3) as total from t1 join[shuffle] t1 t2 on t1.tc1=t2.tc1 group by rollup(tc1,tc2) order by t1.tc1,t1.tc2,total; ``` Fixed #8778	2022-04-03 10:19:39 +08:00
dataroaring	a75e4a1469	Window funnel (#8485 ) Add new feature window funnel	2022-04-02 22:08:50 +08:00
jiafeng.zhang	13f1f94f86	[chore] upgrade log4j version to 2.17.2 (#8774 ) upgrade log4j version to 2.17.2	2022-04-02 21:29:25 +08:00
camby	4d516bece8	[feature-wip](array-type)Add element_at and subscript functions (#8597 ) Describe the overview of changes. 1. add function element_at; 2. support element_subscript([]) to get element of array, col_array[N] <==> element_at(col_array, N); 3. return error message instead of BE crash while array function execute failed; element_at(array, index) desc: > Returns element of array at given (1-based) index. If index < 0, accesses elements from the last to the first. Returns NULL if the index exceeds the length of the array or the array is NULL. Usage example: 1. create table with ARRAY type column and insert some data: ``` +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| \| 2 \| 3 \| NULL \| \| 4 \| NULL \| [] \| \| 3 \| NULL \| NULL \| +------+------+--------+ ``` 2. enable vectorized: ``` set enable_vectorized_engine=true; ``` 3. element_subscript([]) usage example: ``` > select k1,k3,k3[1] from array_test; +------+--------+----------------------------+ \| k1 \| k3 \| %element_extract%(`k3`, 1) \| +------+--------+----------------------------+ \| 3 \| NULL \| NULL \| \| 1 \| [1, 2] \| 1 \| \| 2 \| NULL \| NULL \| \| 4 \| [] \| NULL \| +------+--------+----------------------------+ ``` 4. element_at function usage example: ``` > select k1,k3 from array_test where element_at(k3, -1) = 2; +------+--------+ \| k1 \| k3 \| +------+--------+ \| 1 \| [1, 2] \| +------+--------+ ```	2022-04-02 12:03:56 +08:00
zhangstar333	6c5bbc6e4c	fix agg functions check failed from empty table (#8785 ) fix agg functions check failed from empty table	2022-04-02 10:44:55 +08:00
EmmyMiao87	9f80f6cf5e	[Improvement](Planner)Enable hash join project (#8618 )	2022-04-01 15:42:25 +08:00
spaces-x	6729d41c93	[improvement] add switch of quantile_state column (#8706 ) Add switch for quantile_state column, default false.	2022-03-31 22:59:27 +08:00
Xujian Duan	d17ee7b476	[fix](sql-block-rule)Fix sql block rule bug (#8738 ) 1. Check properties' effectiveness of sql_block_rule, can't set limitations of sql_block_rule to be negative. 2. Optimize the judgment conditions when a query hits a sql_block_rule 3. Check if sql_block_rule has already exist when exec set property for "user" "sql_block_rule" = "xxx" 4. Add UT ad SqlBlockRuleMgrTest.java	2022-03-31 13:51:13 +08:00
Mingyu Chen	b98da02611	[chore][fix](httpv2) Use mariadb-java-client for http query api (#8716 ) In #8319, I remove mysql-connector-java dependency because of license incompatibility. But we need a mysql compatible driver for http query api. So I choose mariadb-java-client, which is under LGPL.	2022-03-30 09:59:45 +08:00
Mingyu Chen	22cf6ea17c	[chore] Modify build.sh and refactor dependency of FE submodules (#8732 ) This PR fixes the #8731 and refactor the `build.sh` script. The build.sh script is currently responsible for the compilation of the following Doris components. 1. FE - fe-common - fe-core - spark-dpp - hive-udf - java-udf - ui 2. BE - palo_be - meta_tool 3. broker In the FE module. - The 4 submodules `fe-common, fe-core, spark-dpp and ui` together form Frontend. - `spark-dpp, hive-udf and java-udf` can be compiled separately to produce jar packages for individual use. In the BE module. - `palo_be` can start the BE process separately. - `meta_tool` can be compiled separately to produce binaries. The modified build.sh script has the following changes: 1. there is no longer an option to compile `ui` separately, build together with `--fe`. 2. `fe/be/spark-dpp/hive-udf/java-udf/palo_be/meta_tool` can be compiled separately. 3. all components except `java-udf` will be compiled by default (`java-udf` is in development) Remaining issues: Several submodules of FE have messy dependencies. For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`, resulting in a large binary jar of `java-udf`. It needs to be reorganized afterwards.	2022-03-30 00:13:24 +08:00
dataalive	3f5bc5206d	[Improvement] broker load with hdfs support wildcard (#8718 ) broker load with hdfs support wildcard	2022-03-29 18:21:41 +08:00
zhangstar333	66a3c574df	[Vectorized][Bug] fix percentile_approx function to return always nullable (#8572 )	2022-03-29 14:47:39 +08:00
Mingyu Chen	f3659c87c1	[fix][chore](repository)(fe) check reponame when creating repository and modify build.sh (#8671 ) 1. We need to check repo name when creating repository 2. modify build.sh to not install spark-dpp when spark-dpp is not compiled	2022-03-29 11:32:52 +08:00
Mingyu Chen	d82c138a60	[fix](user-property) Fix bug that can not set exec_mem_limit at user level (#8710 )	2022-03-29 10:03:33 +08:00
Mingyu Chen	7cfce63a13	[fix](mini-load) Remove mini load in LOADING and PENDING state (#8649 ) 1. Remove some unused code. 2. handle mini load with wrong state 1. For some historical reasons, some mini load jobs in LOADING state have not been cleared. As a result, new load jobs cannot be committed. 2. If a mini load job is created right before FE restart, the mini load job will be in PENDING state forever. But it should be removed finally.	2022-03-28 10:22:17 +08:00
HB	39717a85a2	[fix](load) Fix null column bug in load's mapping column setting (#8625 )	2022-03-28 10:08:00 +08:00
yinzhijian	f96bc62573	[feature](balance) Support balance between disks on a single BE (#8553 ) Current situation of Doris is that the cluster is balanced, but the disks of a backend may be unbalanced. for example, backend A have two disks: disk1 and disk2, disk1's usage is 98%, but disk2's usage is only 40%. disk1 is unable to take more data, therefore only one disk of backend A can take new data, the available write throughput of backend A is only half of its ability, and we can not resolve this through load or partition rebalance now. So we introduce disk rebalancer, disk rebalancer is different from other rebalancer(load or partition) which take care of cluster-wide data balancing. it takes care about backend-wide data balancing. [For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)	2022-03-28 10:03:21 +08:00
Zhengguo Yang	b2861f36c4	[chore] optimize aws thirdparty package download. (#8637 )	2022-03-28 09:35:51 +08:00
Zhengguo Yang	cfb57be731	[api-change] add soft limit of String type length (#8567 ) 1. add a config string_type_soft_limit to soft limit max length of string type 2. disable using String type in Key column, partition column and distribution column 3. remove String type alias BLOB for futrue use	2022-03-25 09:28:41 +08:00
HappenLee	5f606c9d57	[fix] Fix coredump of stddev function (#8543 ) This is only a temporary fix its performance is not ideal. Finally, we need to reconstruct the functions of `stddev` and delete the interface of `insert_to_null_default ()`.	2022-03-24 11:39:29 +08:00
Mingyu Chen	a58e56f0b4	[fix](load) fix another bug that BE may crash when calling `mark_as_failed` (#8607 ) Same as #8501	2022-03-24 09:13:54 +08:00
spaces-x	bea9a7ba4f	[feature] Support pre-aggregation for quantile type (#8234 ) Add a new column-type to speed up the approximation of quantiles. 1. The new column-type is named `quantile_state` with fixed aggregation function `quantile_union`, which stores the intermediate results of pre-aggregated approximation calculations for quantiles. 2. support pre-aggregation of new column-type and quantile_state related functions.	2022-03-24 09:11:34 +08:00
Lijia Liu	72dfdb9a6c	[fix] Fix Check_time return wrong value when exec show table status (#8578 )	2022-03-23 10:34:23 +08:00
Gabriel	b89e4c7bba	[feature-wip](java-udf) support java UDF with fixed-length input and output (#8516 ) This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR. To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java. To achieve that, I use a UdfExecutor instead. For users, a UDF class must have a public evaluate method.	2022-03-23 10:32:50 +08:00
camby	71ce3c4a6e	[feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401 ) (#8589 ) array_contains function Usage example: 1. create table with ARRAY column, and insert some data: ``` > select * from array_test; +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| \| 2 \| 3 \| NULL \| \| 4 \| NULL \| [] \| \| 3 \| NULL \| NULL \| +------+------+--------+ ``` 2. enable vectorized: ``` > set enable_vectorized_engine=true; ``` 3. select with array_contains: ``` > select k1,array_contains(k3,1) from array_test; +------+-------------------------+ \| k1 \| array_contains(`k3`, 1) \| +------+-------------------------+ \| 3 \| NULL \| \| 1 \| 1 \| \| 2 \| NULL \| \| 4 \| 0 \| +------+-------------------------+ ``` 4. also we can use array_contains in where condition ``` > select * from array_test where array_contains(k3,1); +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| +------+------+--------+ ``` 5. array_position usage example ``` > select k1,k3,array_position(k3,2) from array_test; +------+--------+-------------------------+ \| k1 \| k3 \| array_position(`k3`, 2) \| +------+--------+-------------------------+ \| 3 \| NULL \| NULL \| \| 1 \| [1, 2] \| 2 \| \| 2 \| NULL \| NULL \| \| 4 \| [] \| 0 \| +------+--------+-------------------------+ ```	2022-03-22 15:42:40 +08:00
Adonis Ling	b638c07533	[feature-wip](array-type) Support nested array insertion. (#8305 ) (#8586 ) Please refer to #8304 .	2022-03-22 15:28:26 +08:00
Adonis Ling	e44038caf3	[feature-wip](array-type) Array data can be loaded in stream load. (#8368 ) (#8585 ) Please refer to #8367 .	2022-03-22 15:25:40 +08:00
Adonis Ling	38ec3cbbdf	[feature-wip](array-type) Support ArrayLiteral in SQL. (#8089 ) (#8582 ) Please refer to #8074	2022-03-22 15:07:06 +08:00

1 2 3 4 5 ...

2007 Commits