doris

Author	SHA1	Message	Date
HappenLee	5f606c9d57	[fix] Fix coredump of stddev function (#8543 ) This is only a temporary fix; its performance is not ideal. Eventually, we need to rework the `stddev` functions and delete the `insert_to_null_default()` interface.	2022-03-24 11:39:29 +08:00
Mingyu Chen	a58e56f0b4	[fix](load) fix another bug that BE may crash when calling `mark_as_failed` (#8607 ) Same as #8501	2022-03-24 09:13:54 +08:00
spaces-x	bea9a7ba4f	[feature] Support pre-aggregation for quantile type (#8234 ) Add a new column-type to speed up the approximation of quantiles. 1. The new column-type is named `quantile_state` with fixed aggregation function `quantile_union`, which stores the intermediate results of pre-aggregated approximation calculations for quantiles. 2. support pre-aggregation of new column-type and quantile_state related functions.	2022-03-24 09:11:34 +08:00
Lijia Liu	72dfdb9a6c	[fix] Fix Check_time return wrong value when exec show table status (#8578 )	2022-03-23 10:34:23 +08:00
Gabriel	b89e4c7bba	[feature-wip](java-udf) support java UDF with fixed-length input and output (#8516 ) This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). This PR supports Java UDFs with fixed-length input and output. Phase I in DSIP-1 is done after this PR. To support Java UDFs efficiently, I avoid data copies in JNI calls, and all compute operations are off-heap in Java. To achieve that, I use a UdfExecutor instead. For users, a UDF class must have a public evaluate method.	2022-03-23 10:32:50 +08:00
camby	71ce3c4a6e	[feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401 ) (#8589 ) array_contains function Usage example: 1. create table with ARRAY column, and insert some data: ``` > select * from array_test; +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| \| 2 \| 3 \| NULL \| \| 4 \| NULL \| [] \| \| 3 \| NULL \| NULL \| +------+------+--------+ ``` 2. enable vectorized: ``` > set enable_vectorized_engine=true; ``` 3. select with array_contains: ``` > select k1,array_contains(k3,1) from array_test; +------+-------------------------+ \| k1 \| array_contains(`k3`, 1) \| +------+-------------------------+ \| 3 \| NULL \| \| 1 \| 1 \| \| 2 \| NULL \| \| 4 \| 0 \| +------+-------------------------+ ``` 4. also we can use array_contains in where condition ``` > select * from array_test where array_contains(k3,1); +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| +------+------+--------+ ``` 5. array_position usage example ``` > select k1,k3,array_position(k3,2) from array_test; +------+--------+-------------------------+ \| k1 \| k3 \| array_position(`k3`, 2) \| +------+--------+-------------------------+ \| 3 \| NULL \| NULL \| \| 1 \| [1, 2] \| 2 \| \| 2 \| NULL \| NULL \| \| 4 \| [] \| 0 \| +------+--------+-------------------------+ ```	2022-03-22 15:42:40 +08:00
Adonis Ling	b638c07533	[feature-wip](array-type) Support nested array insertion. (#8305 ) (#8586 ) Please refer to #8304 .	2022-03-22 15:28:26 +08:00
Adonis Ling	e44038caf3	[feature-wip](array-type) Array data can be loaded in stream load. (#8368 ) (#8585 ) Please refer to #8367 .	2022-03-22 15:25:40 +08:00
Adonis Ling	38ec3cbbdf	[feature-wip](array-type) Support ArrayLiteral in SQL. (#8089 ) (#8582 ) Please refer to #8074	2022-03-22 15:07:06 +08:00
Adonis Ling	cf0a9fd177	[feature-wip](array-type) Create table with nested array type. (#8003 ) (#8575 ) ``` create table array_type_table(k1 INT, k2 Array<Array<int>>) duplicate key (k1) distributed by hash(k1) buckets 1 properties('replication_num' = '1'); ```	2022-03-22 15:03:32 +08:00
morrySnow	106d7c2e41	[fix] Wrong conf used for FileSystem in S3Storage (#8568 ) The wrong conf was used to disable the FileSystem cache in S3Storage, which leads to wrong behavior when listing objects in the object store.	2022-03-22 11:42:38 +08:00
Jibing-Li	9a0a1c693e	[fix] fix NPE in thrift when forwarding stmt to master FE	2022-03-22 11:41:13 +08:00
Pxl	be3d203289	[feature][vectorized] support table function explode_numbers() (#8509 )	2022-03-22 11:38:00 +08:00
Zhengguo Yang	f06780249a	fix some fe ut failed (#8547 )	2022-03-21 10:36:06 +08:00
Xinyi Zou	eeae516e37	[Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476 ) Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G Implement a new way of memory statistics based on TCMalloc New/Delete Hook, MemTracker and TLS, and it is expected that all memory new/delete/malloc/free of the BE process can be counted.	2022-03-20 23:06:54 +08:00
caiconghui	8470455e0a	[fix](tablet-report) Fix bug that tabletReport function of ReportHandler in fe may throw NullPointerException due to transaction check logic (#8481 )	2022-03-18 09:31:51 +08:00
Arthur Yang	30d8089b2f	[fix](partition_cache) Fix Partition Cache NullPointerException bug (#8454 ) Filter the partitions in predicate but not in OlapTable.	2022-03-17 10:04:49 +08:00
Mingyu Chen	2252ff81d7	[fix](dynamic-partition) fix bug that can not set dynamic_partition.replication_allocation property (#8471 )	2022-03-15 11:45:18 +08:00
mklzl	30eff9d6e9	[improvement] Update ShowExecutor.java (#8462 ) We have engines such as mysql, olap, es, and hive, so `show engines` should list more details.	2022-03-15 11:44:36 +08:00
Gabriel	7d1d45d6dc	[feature-wip](udf) support java udf in FE (#8437 ) First step to support Java UDF in Doris. After this PR, we can create Java UDFs in Doris. For example, we create a Java UDF with the statement below. ``` CREATE FUNCTION test_udf(int) RETURNS int PROPERTIES ( "file"="file:///root/hive-udf-1.0-SNAPSHOT.jar", "symbol"="udf.Main", "type"="JAVA_UDF" ) ``` 1. `file` indicates where the user's jar file is. 2. `symbol` is the UDF class in this jar. 3. `type` indicates that this function is a Java UDF.	2022-03-15 11:42:39 +08:00
wunan1210	571f0b688d	[improvement] show export support label like (#8202 ) Use `show export where label like 'xxx%'` to list more results.	2022-03-15 11:41:59 +08:00
Mingyu Chen	a4b710cb2d	[chore](dependency) fix build thirdparty errors (#8456 ) 1. the patch for aws-c-cal-0.4.5 does not need anymore 2. remove duplicate bit_length document 3. add some debug log for routine load	2022-03-13 22:11:24 +08:00
HappenLee	2c63fc1d6c	[improvement](vectorized) Support BetweenPredicate enable fold const expr (#8450 )	2022-03-13 09:36:24 +08:00
Zhengguo Yang	f3c44bcd75	[chore][fix](librdkafka) disable librdkafka assert and update some thirdparty (#8425 ) 1. comment librdkafka `rd_assert(thrd_is_current(rkb->rkb_thread));` to avoid core dump 2. upgrade arrow to 7.0.0 3. upgrade aws sdk to 1.9 4. upgrade orc to 1.7.2	2022-03-12 22:09:06 +08:00
dataroaring	a467e7a790	[refactor][fix] small fixes and code cleanups related to schema change (#8328 ) For now, usage of RowBlockAllocator::allocate is a little complicated due to its ambiguous return value. Some callers just test the return value while some test the return value and non-null pointer. This patch let it return success code only when it succeeds, then caller can just test the return value.	2022-03-12 22:05:43 +08:00
Zhengguo Yang	ebbe6f650c	[fix](broker-load) hdfs or bos path parser not support glob exprs (#8390 )	2022-03-12 20:10:05 +08:00
caiconghui	23d0e7b4f9	[Feature](proc) Support proc dir for showing tablet health status (#8324 )	2022-03-11 22:51:14 +08:00
caiconghui	4a38f2d8a1	[fix](transaction) Fix committed transaction couldn't be finished when table is dropped (#8423 ) Issue Number: close #8426	2022-03-11 17:36:23 +08:00
Mingyu Chen	ffddebfd1d	[fix](report) fix bug that tablet may already be deleted when reporting (#8444 ) 1. This bug was introduced by #8209. Error in fe.warn.log: ``` java.lang.IllegalStateException: 560278 at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[spark-dpp-0.15-SNAPSHOT.jar:0.15-SNAPSHOT] at org.apache.doris.catalog.TabletInvertedIndex.getReplica(TabletInvertedIndex.java:462) ~[palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.catalog.Catalog.replayBackendReplicasInfo(Catalog.java:6941) ~[palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:626) [palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2446) [palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.master.Checkpoint.doCheckpoint(Checkpoint.java:116) [palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:74) [palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:0.15-SNAPSHOT] at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:0.15-SNAPSHOT] ``` Since the reporting of a tablet and the deletion of a tablet are two independent events and are not mutually exclusive, it may happen that the tablet is deleted first and the reporting is done later. 2. Change the tablet report info. Now, the version of a tablet reported from BE is the largest continuous version. E.g., for versions [1,2,3,5,7], the reported version of this tablet will be 3.	2022-03-11 17:24:20 +08:00
Mingyu Chen	a76889b319	[improvement] Avoid print large string in error log (#8436 ) 1. Avoid printing large strings in the error log. If a user loads an unqualified large string, the whole string is saved in the error log, making the error log too big to be shown by `show load warnings on "url"`. Err: `Got packet bigger than 'max_allowed_packet' bytes` 2. Remove duplicate help doc. Docs with the same title are not allowed; otherwise an error is thrown when starting FE: `java.lang.IllegalArgumentException: Multiple entries with same key:`	2022-03-11 17:23:47 +08:00
zhangstar333	e0ef9b8f6c	[refactor](vectorized) to_bitmap(-1) return NULL instead of return parse failed error_message (#8373 )	2022-03-11 17:21:47 +08:00
Mingyu Chen	e403dbc38c	[feature](user-property) Support user level exec_mem_limit and load_mem_limit (#8365 ) ``` SET PROPERTY FOR 'jack' 'exec_mem_limit' = '2147483648'; SET PROPERTY FOR 'jack' 'load_mem_limit' = '2147483648'; ``` The user level property will overwrite the value in session variables.	2022-03-11 17:20:09 +08:00
HappenLee	68dd799796	[improvement](vectorized) Support function tuple is null (#8442 )	2022-03-11 16:54:37 +08:00
Pxl	d3d8301a13	[feature](function) support vectorized digital_masking (#8409 )	2022-03-10 09:07:07 +08:00
Pxl	10c3712aa1	[fix](vectorized) fix arithmetic calculate get wrong result(#8226 )	2022-03-09 13:03:57 +08:00
Mingyu Chen	826467e116	[fix](replica) handle replica version missing info to avoid -214 error (#8209 ) In the original tablet reporting information, the version missing information is done by combining two pieces of information as follows: 1. the maximum consecutive version number 2. the `version_miss` field The logic of this approach is confusing and inconsistent with the logic of checking for missing versions when querying. After the change, we directly use the version checking logic used in the query, and set `version_miss` to true if a missing version is found and on the FE processing side. Originally, only the bad replica information was synchronized among FEs, but not the version missing information. As a result, the non-master FE is not aware of the missing version information. In the new design, we deprecate the original log persistence class `BackendTabletsInfo` and use the new `BackendReplicasInfo` to record replica reporting information and write both bad and version missing information to metadata so that other FEs can synchronize this information.	2022-03-09 13:03:22 +08:00
Mingyu Chen	22bafef875	[fix](broker-load) fix bug that a cancelled job's state is LOADING (#8363 ) 1. Before executing LoadLoadingTask of a broker load, we should check if the job is cancelled. 2. Add a new column `runningTransactionNum` for `show proc "/transactions"`. So that we can view all running txns in each db in one command.	2022-03-08 18:53:45 +08:00
Mingyu Chen	1e70f992e7	[improvement][fix](insert)(replay) support SHOW LAST INSERT stmt and fix json replay bug (#8355 ) 1. support SHOW LAST INSERT In the current implementation, the insert operation returns a json string to describe the result information of the insert. But this information is in the session track field of the mysql protocol, and it is difficult to obtain programmatically. Therefore, I provide a new syntax `show last insert` to explicitly obtain the result of the latest insert operation, and return a normal query result set to facilitate the user to obtain the result information of the insert. 2. the `ReturnRows` field in fe.audit.log of insert operation will be set to the loaded row num of the insert. 3. Fix a bug described in #8354	2022-03-08 18:53:11 +08:00
Mingyu Chen	50a59f3f86	[license] Organize third-party dependent licenses for binary releases (#8350 )	2022-03-07 23:18:58 +08:00
HappenLee	477b87cb28	[feature](vec) Support update stmt in vec query engine (#8296 )	2022-03-07 14:03:55 +08:00
Lijia Liu	22a0011403	[fix](planner) Convert format in RewriteFromUnixTimeRule (#8235 ) SQL to reproduce: ``` SELECT * FROM table WHERE FROM_UNIXTIME(d_datekey,'%Y-%m-%d %H:%i:%s') != '1970-08-20 00:11:43'; org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Unexpected exception: Illegal pattern character 'i' at org.apache.doris.qe.StmtExecutor.analyze(StmtExecutor.java:584) ~[palo-fe.jar:3.4.0] at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:345) ~[palo-fe.jar:3.4.0] at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:318) ~[palo-fe.jar:3.4.0] at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:221) ~[palo-fe.jar:3.4.0] at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:361) ~[palo-fe.jar:3.4.0] at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:562) ~[palo-fe.jar:3.4.0] at org.apache.doris.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:50) ~[palo-fe.jar:3.4.0] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:835) [?:?] ``` Only these formats are supported: yyyy-MM-dd HH:mm:ss, yyyy-MM-dd, yyyyMMdd	2022-03-05 15:26:37 +08:00
ChPi	f57f02bbf2	[improvement] Support show tablets stmt (#7970 ) change `show tablet from tbl` to `show tablets from tbl`	2022-03-05 15:25:57 +08:00
Mingyu Chen	9961b2c860	[refactor] Remove mysql-connector and replace org.json with com.googlecode.json-simple (#8319 ) 1. mysql-connector-java mysql-connector-java is under GPLv2 license, which is not compatible with APLv2, and Doris does not use it. 2. org.json org.json is under JSON license, which is not compatible with APLv2. I use `json-simple` to replace it.	2022-03-05 14:41:04 +08:00
GoGoWen	9cf2798fb3	[typo] fix error for PushTask (#8316 )	2022-03-05 14:40:32 +08:00
caiconghui	0383001442	[Enhancement] Support Skipping compaction lower replica where select queryable replica for better scan performance (#8146 )	2022-03-05 09:51:50 +08:00
caiconghui	46ca23f216	[Feature] Support Changing the bucketing mode of the table from Hash Distribution to Random Distribution (#8259 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-03-04 09:05:23 +08:00
GoGoWen	80e88159d9	[improvement](restore) allow query on part of partitions when others are in RESTORE (#8245 )	2022-03-03 22:34:10 +08:00
Zhengguo Yang	c56a372e06	[improvement][fix](grouping-set)(tablet-repair) optimize compaction too slow replica process (#8123 ) 1. Optimize handling of replicas whose compaction is too slow: a replica is set back to OK once its compaction is done, and replicas are not set to bad if more than half of them are too slow. 2. Fix a bug where a field in grouping_id() with a different name causes an error like `select list expression not produced by aggregation output (missing from GROUP BY clause?): id`	2022-03-03 22:30:35 +08:00
Zhengguo Yang	f5ab0553ff	[chore] remove some ut temp files and add some file to .gitignore (#8309 )	2022-03-03 13:23:27 +08:00
Zhengguo Yang	f622ce0497	[refactor] remove types_test (#8289 ) 1. Remove types_test; it causes a core dump with newer versions of GCC or clang because of memory alignment, since some code is vectorized by newer GCC or clang. 2. Change string type length to 2 GB instead of -1. 3. Modify inaccessible code.	2022-03-03 09:31:35 +08:00
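
Commit ffddebfd1d describes the reported tablet version as the "largest continuous version", with the example that versions [1,2,3,5,7] yield 3. The rule can be sketched roughly as follows. This is a minimal illustration only: the class and method names are hypothetical, versions are treated as plain numbers as in the commit's example, and returning -1 for an empty input is an assumption of this sketch, not behavior taken from Doris.

```java
import java.util.Arrays;

// Hypothetical illustration; not the Doris BE implementation.
class LargestContinuousVersion {
    // Returns the last version v such that the sorted input contains every
    // version from its smallest element up to v without a gap.
    // Returns -1 for an empty input (an assumption made for this sketch).
    static long largestContinuousVersion(long[] versions) {
        if (versions.length == 0) {
            return -1;
        }
        long[] sorted = versions.clone();
        Arrays.sort(sorted);
        long last = sorted[0];
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] == last) {
                continue; // ignore duplicate versions
            }
            if (sorted[i] != last + 1) {
                break; // first gap ends the continuous run
            }
            last = sorted[i];
        }
        return last;
    }

    public static void main(String[] args) {
        long[] versions = {1, 2, 3, 5, 7};
        System.out.println(largestContinuousVersion(versions)); // prints 3
    }
}
```

Because version 4 is missing, versions 5 and 7 do not count toward the reported value.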


1966 Commits
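
Commit b89e4c7bba states that a Java UDF class must have a public evaluate method and that this phase covers fixed-length input and output. A minimal sketch of what such a class might look like is shown below. The class names, the add-one logic, and the null handling are all hypothetical and chosen only for illustration. The UDF is nested in a wrapper class here so the example fits in one file; in a real jar it would presumably be a public top-level class referenced by the `symbol` property.

```java
// Hypothetical illustration; names and behavior are not taken from Doris.
class JavaUdfSketch {
    // A UDF-style class: the only stated requirement is a public evaluate method.
    public static class AddOne {
        // Fixed-length input (int) and fixed-length output (int).
        // Returning null for a null input is an assumption of this sketch.
        public Integer evaluate(Integer value) {
            if (value == null) {
                return null;
            }
            return value + 1;
        }
    }

    public static void main(String[] args) {
        AddOne udf = new AddOne();
        System.out.println(udf.evaluate(41)); // prints 42
    }
}
```

Calling the method directly, as in `main`, only checks the logic locally. How Doris loads and invokes the class is handled by the UdfExecutor mentioned in the commit and is not modeled here.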