doris

Author	SHA1	Message	Date
wangbo	726eaa68ea	[fix](vectorization) Vectorization decimal arithmetic inconsistent (#8626 )	2022-03-28 10:12:39 +08:00
HB	39717a85a2	[fix](load) Fix null column bug in load's mapping column setting (#8625 )	2022-03-28 10:08:00 +08:00
yinzhijian	f96bc62573	[feature](balance) Support balance between disks on a single BE (#8553 ) Currently, a Doris cluster can be balanced while the disks of a single backend remain unbalanced. For example, backend A has two disks, disk1 and disk2: disk1's usage is 98%, but disk2's usage is only 40%. disk1 cannot take more data, so only one disk of backend A can take new data, and the available write throughput of backend A is only half of its capacity. This cannot currently be resolved through load or partition rebalance. So we introduce a disk rebalancer. Unlike the other rebalancers (load or partition), which take care of cluster-wide data balancing, the disk rebalancer takes care of backend-wide data balancing. [For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)	2022-03-28 10:03:21 +08:00
Zhengguo Yang	b2861f36c4	[chore] optimize aws thirdparty package download. (#8637 )	2022-03-28 09:35:51 +08:00
Pxl	02612c7ec0	[Refactor] Remove unused file (#8657 )	2022-03-27 01:41:06 +08:00
yiguolei	aeee738af0	Revert "[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635 )" (#8666 ) This reverts commit 6bc982c37436acf288f566cf10e084731b80fa44.	2022-03-25 18:32:50 +08:00
zbtzbtzbt	e285d09157	[Enhancement](load) speed up stream load for duplicate table, use template for faster get_type_info. (#8500 )	2022-03-25 15:18:43 +08:00
yiguolei	6bc982c374	[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635 )	2022-03-25 15:17:39 +08:00
dataroaring	8b4e57287f	row num is more accurate than column num in data_types (#8628 )	2022-03-25 14:38:27 +08:00
Zhengguo Yang	cfb57be731	[api-change] add soft limit of String type length (#8567 ) 1. add a config string_type_soft_limit to soft limit the max length of String type 2. disable using String type in Key column, partition column and distribution column 3. remove String type alias BLOB for future use	2022-03-25 09:28:41 +08:00
Justice Gong	5511d435de	[Doris Manager][Doc]Basic User Documents for Doris Manager (#8609 )	2022-03-24 21:34:49 +08:00
924060929	9db2a96af1	[test] support a lot of actions (#8632 ) Support a lot of actions for regression testing framework. e.g. thread, lazyCheck, onSuccess, connect, selectUnionAll, timer Demo exists in ${DORIS_HOME}/regression-test/suites/demo	2022-03-24 20:22:24 +08:00
caiconghui	c69dd54116	[refactor](mutex) Use std::mutex to replace Mutex and refactor some lock logic (#8452 )	2022-03-24 14:50:02 +08:00
Xinyi Zou	aaaaae53b5	[feature] (memory) Switch TLS mem tracker to separate more detailed memory usage (#8605 ) In pr #8476, all memory usage of a process is recorded in the process mem tracker, and all memory usage of a query is recorded in the query mem tracker, and it is still necessary to manually call `transfer to` to track the cached memory size. We hope to separate out more detailed memory usage based on Hook TCMalloc new/delete + TLS mem tracker. In this pr, the more detailed mem tracker is switched to TLS, which automatically and accurately counts more detailed memory usage than before.	2022-03-24 14:29:34 +08:00
HappenLee	5f606c9d57	[fix] Fix coredump of stddev function (#8543 ) This is only a temporary fix; its performance is not ideal. Eventually, we need to restructure the `stddev` functions and delete the `insert_to_null_default ()` interface.	2022-03-24 11:39:29 +08:00
Pxl	0292b9ad9e	[Enhancement] add build parameters ENABLE_JAVAUDF, BUILD_DOCS (#8612 )	2022-03-24 10:53:52 +08:00
Pxl	2760bcbcc1	[fix] fix core dump on deep_copy_tuple when data is null (#8620 )	2022-03-24 09:15:38 +08:00
qiye	6e1147206e	[doc] fix help module failed (#8617 ) Introduced by #8509. Docs title is duplicate.	2022-03-24 09:15:06 +08:00
GoGoWen	286ee8e1d4	[doc] fix typo for session (#8610 )	2022-03-24 09:14:44 +08:00
Mingyu Chen	a58e56f0b4	[fix](load) fix another bug that BE may crash when calling `mark_as_failed` (#8607 ) Same as #8501	2022-03-24 09:13:54 +08:00
Pxl	7fc22c2456	[fix][vectorized] fix core on get_predicate_column_ptr && fix double copy on _read_columns_by_rowids (#8581 )	2022-03-24 09:12:42 +08:00
spaces-x	bea9a7ba4f	[feature] Support pre-aggregation for quantile type (#8234 ) Add a new column-type to speed up the approximation of quantiles. 1. The new column-type is named `quantile_state` with fixed aggregation function `quantile_union`, which stores the intermediate results of pre-aggregated approximation calculations for quantiles. 2. support pre-aggregation of new column-type and quantile_state related functions.	2022-03-24 09:11:34 +08:00
HappenLee	36c85d2f06	[fix][vectorized] Fix bug of left semi/anti with other join conjunct (#8596 )	2022-03-23 10:34:47 +08:00
Lijia Liu	72dfdb9a6c	[fix] Fix Check_time return wrong value when exec show table status (#8578 )	2022-03-23 10:34:23 +08:00
HappenLee	92feb9c6c8	[fix] Fix error crc32 method to cal uint128 and int128 (#8577 )	2022-03-23 10:33:32 +08:00
Gabriel	b89e4c7bba	[feature-wip](java-udf) support java UDF with fixed-length input and output (#8516 ) This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). This PR supports Java UDFs with fixed-length input and output. Phase I in DSIP-1 is done after this PR. To support Java UDFs efficiently, I use no data copy in the JNI call, and all compute operations are off-heap in Java. To achieve that, I use a UdfExecutor instead. For users, a UDF class must have a public evaluate method.	2022-03-23 10:32:50 +08:00
camby	9f0b93e3c6	[feature-wip](array-type) Fix conflict while merge array-type branch (#8594 )	2022-03-22 16:35:30 +08:00
Adonis Ling	b522de884c	[feature-wip](array-type) Fix compilation error. (#8556 ) (#8591 )	2022-03-22 15:52:34 +08:00
Adonis Ling	2580da4f72	[feature-wip](array-type) Support insertion for vectorized engine. (#8494 ) (#8590 ) Please refer to #8493	2022-03-22 15:48:13 +08:00
camby	71ce3c4a6e	[feature-wip](array-type) Add codes and UT for array_contains and array_position functions (#8401 ) (#8589 ) array_contains function Usage example: 1. create table with ARRAY column, and insert some data: ``` > select * from array_test; +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| \| 2 \| 3 \| NULL \| \| 4 \| NULL \| [] \| \| 3 \| NULL \| NULL \| +------+------+--------+ ``` 2. enable vectorized: ``` > set enable_vectorized_engine=true; ``` 3. select with array_contains: ``` > select k1,array_contains(k3,1) from array_test; +------+-------------------------+ \| k1 \| array_contains(`k3`, 1) \| +------+-------------------------+ \| 3 \| NULL \| \| 1 \| 1 \| \| 2 \| NULL \| \| 4 \| 0 \| +------+-------------------------+ ``` 4. also we can use array_contains in where condition ``` > select * from array_test where array_contains(k3,1); +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 1 \| 2 \| [1, 2] \| +------+------+--------+ ``` 5. array_position usage example ``` > select k1,k3,array_position(k3,2) from array_test; +------+--------+-------------------------+ \| k1 \| k3 \| array_position(`k3`, 2) \| +------+--------+-------------------------+ \| 3 \| NULL \| NULL \| \| 1 \| [1, 2] \| 2 \| \| 2 \| NULL \| NULL \| \| 4 \| [] \| 0 \| +------+--------+-------------------------+ ```	2022-03-22 15:42:40 +08:00
Adonis Ling	a9f51b5b65	[feature-wip](array-type) Fix compilation error. (#8422 ) (#8587 )	2022-03-22 15:31:16 +08:00
Adonis Ling	b638c07533	[feature-wip](array-type) Support nested array insertion. (#8305 ) (#8586 ) Please refer to #8304 .	2022-03-22 15:28:26 +08:00
Adonis Ling	e44038caf3	[feature-wip](array-type) Array data can be loaded in stream load. (#8368 ) (#8585 ) Please refer to #8367 .	2022-03-22 15:25:40 +08:00
camby	a498463ab5	[feature-wip](array-type) support select ARRAY data type on vectorized engine (#8217 ) (#8584 ) Usage Example: 1. create table for test; ``` `CREATE TABLE `array_test` ( `k1` tinyint(4) NOT NULL COMMENT "", `k2` smallint(6) NULL COMMENT "", `k3` ARRAY<int(11)> NULL COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(`k1`) COMMENT "OLAP" DISTRIBUTED BY HASH(`k1`) BUCKETS 5 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2" );` ``` 2. insert some data ``` `insert into array_test values(1, 2, [1, 2]);` `insert into array_test values(2, 3, null);` `insert into array_test values(3, null, null);` `insert into array_test values(4, null, []);` ``` 3. open vectorized `set enable_vectorized_engine=true;` 4. query array data `select * from array_test;` +------+------+--------+ \| k1 \| k2 \| k3 \| +------+------+--------+ \| 4 \| NULL \| [] \| \| 2 \| 3 \| NULL \| \| 1 \| 2 \| [1, 2] \| \| 3 \| NULL \| NULL \| +------+------+--------+ 4 rows in set (0.061 sec) Code Changes include: 1. add column_array, data_type_array codes; 2. codes about data_type creation by Field, TabletColumn, TypeDescriptor, PColumnMeta move to DataTypeFactory; 3. support create data_type for ARRAY data type; 4. RowBlockV2::convert_to_vec_block support ARRAY data type; 5. VMysqlResultWriter::append_block support ARRAY data type; 6. vectorized::Block serialize and deserialize support ARRAY data type;	2022-03-22 15:21:44 +08:00
Adonis Ling	38ec3cbbdf	[feature-wip](array-type) Support ArrayLiteral in SQL. (#8089 ) (#8582 ) Please refer to #8074	2022-03-22 15:07:06 +08:00
Adonis Ling	cf0a9fd177	[feature-wip](array-type) Create table with nested array type. (#8003 ) (#8575 ) ``` create table array_type_table(k1 INT, k2 Array<Array<int>>) duplicate key (k1) distributed by hash(k1) buckets 1 properties('replication_num' = '1'); ```	2022-03-22 15:03:32 +08:00
morrySnow	106d7c2e41	[fix] Wrong conf used for Filesystem in S3Storage (#8568 ) The wrong conf was used for the Filesystem in S3Storage to disable the cache. This leads to wrong behavior when it is used to list objects in the object store.	2022-03-22 11:42:38 +08:00
kangshisen	54aaa8a56a	[doc] update star-schema-benchmark.md (#8565 )	2022-03-22 11:42:10 +08:00
kangshisen	4335c07c35	[doc] update star-schema-benchmark.md (#8564 )	2022-03-22 11:41:45 +08:00
Jibing-Li	9a0a1c693e	[fix] fix NPE in thrift when forwarding stmt to master FE	2022-03-22 11:41:13 +08:00
Pxl	be3d203289	[feature][vectorized] support table function explode_numbers() (#8509 )	2022-03-22 11:38:00 +08:00
yiguolei	989e03ddf9	[improvement] Improve sig handler (#8545 ) * Refactor glog's default signal handler Co-authored-by: Zhengguo Yang <780531911@qq.com>	2022-03-22 10:40:31 +08:00
kangshisen	011985e7e3	fix en broker load (#8566 )	2022-03-21 22:53:51 +08:00
caiconghui	905b9a6289	[fix](lru_cache) fix heap-use-after-free problem for lru cache(#8569 )	2022-03-21 21:23:43 +08:00
Mingyu Chen	04004021b5	[chore] Separate debugging information from BE binaries (#8544 ) Currently, the compiled output of BE mainly consists of two binaries: palo_be and meta_tool, which are both around 1.6G in size. However, the debug information is only needed for debugging purposes. So I separate the debug info from the binaries. After BE is built, the debug info file will be saved in the `be/lib/debug_info/` dir. The sizes of `palo_be` and `meta_tool` decrease to about 100MB. This is optional and disabled by default. To enable it, use: `STRIP_DEBUG_INFO=ON sh build.sh`	2022-03-21 16:33:01 +08:00
Zhengguo Yang	7c1c2b1d17	[chore] fix compile error when use clang as compiler and a be ut problem (#8554 )	2022-03-21 15:38:59 +08:00
yiguolei	337d174c14	[Refactor](schema_change) Remove tablet instances since tablet id is unique between base tablet and new schema change tablet (#8486 )	2022-03-21 12:43:54 +08:00
Zhengguo Yang	f06780249a	fix some fe ut failed (#8547 )	2022-03-21 10:36:06 +08:00
minghong	c772020db4	[fix] fix bug in WindowFunctionLastData::data, it keeps the first data not the last. (#8536 ) WindowFunctionLastData::add should keep the last value, but current implementation keeps the first one. Obviously, this code is copied from WindowFunctionFirstData::add.	2022-03-21 09:51:56 +08:00
Mingyu Chen	dde50fb2bf	[doc] change http to https in download page (#8546 )	2022-03-20 23:36:17 +08:00
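
The NULL and empty-array behavior shown in the query results of commit 71ce3c4a6e (array_contains / array_position) can be sketched in Python. This is only an illustration inferred from the output quoted in that commit message, not the Doris implementation: the Python function names simply mirror the SQL functions, `None` stands in for SQL NULL, and arrays containing NULL elements are not covered because the commit message shows no such case.

```python
from typing import Optional, Sequence


def array_contains(arr: Optional[Sequence[int]], value: int) -> Optional[int]:
    """Return 1 if value is in arr, 0 if not, and None when arr is NULL."""
    if arr is None:  # a NULL array yields NULL, as in the quoted output
        return None
    return 1 if value in arr else 0


def array_position(arr: Optional[Sequence[int]], value: int) -> Optional[int]:
    """Return the 1-based position of value in arr, 0 if absent, None if arr is NULL."""
    if arr is None:
        return None
    for index, element in enumerate(arr, start=1):  # positions are 1-based
        if element == value:
            return index
    return 0


# Rows of array_test from the commit message: (k1, k3)
rows = [(1, [1, 2]), (2, None), (3, None), (4, [])]
contains_result = {k1: array_contains(k3, 1) for k1, k3 in rows}
position_result = {k1: array_position(k3, 2) for k1, k3 in rows}
```

With the four rows quoted in the commit, `contains_result` and `position_result` reproduce the `array_contains(k3, 1)` and `array_position(k3, 2)` columns of the example output.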

4155 Commits