Commit Graph

18429 Commits

Author SHA1 Message Date
66a3c574df [Vectorized][Bug] fix percentile_approx function to return always nullable (#8572) 2022-03-29 14:47:39 +08:00
23b348456b [Bug] Read bitmap/hll column failed for storage layer vectorization (#8560)
* fix bitmap error

* Update be/src/olap/rowset/segment_v2/segment_iterator.cpp

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>

Co-authored-by: Wang Bo <wangbo36@meituan.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2022-03-29 14:18:59 +08:00
0d43f8e130 [refactor] remove atomic.h/cpp use std::atomic instead (#8693) 2022-03-29 12:41:41 +08:00
f3659c87c1 [fix][chore](repository)(fe) check reponame when creating repository and modify build.sh (#8671)
1. We need to check repo name when creating repository
2. modify build.sh to not install spark-dpp when spark-dpp is not compiled
2022-03-29 11:32:52 +08:00
d82c138a60 [fix](user-property) Fix bug that can not set exec_mem_limit at user level (#8710) 2022-03-29 10:03:33 +08:00
365eba0b92 [fix] fix core dump when avg on not null decimal in empty table (#8681) 2022-03-28 12:41:00 +08:00
db5299d63e [fix] fix compile error (#8688)
Introduced from PR #8643
A std::condition_variable can only wait on a std::unique_lock
2022-03-28 11:53:42 +08:00
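
The compile error above comes from a general C++ rule: `std::condition_variable::wait` accepts only a `std::unique_lock<std::mutex>`. The following is a minimal sketch of that rule, not the code changed in the PR; `ReadyFlag` and `run_ready_flag_example` are hypothetical names used only for illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot flag; not taken from the Doris code base.
class ReadyFlag {
public:
    void set() {
        {
            // lock_guard is sufficient here because no wait happens.
            std::lock_guard<std::mutex> guard(_mutex);
            _ready = true;
        }
        _cv.notify_all();
    }

    void wait() {
        // wait() needs a std::unique_lock so it can unlock and relock
        // the mutex; passing a std::lock_guard does not compile.
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return _ready; });
    }

    bool ready() {
        std::lock_guard<std::mutex> guard(_mutex);
        return _ready;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _ready = false;
};

// Returns true once a worker thread has set the flag.
bool run_ready_flag_example() {
    ReadyFlag flag;
    std::thread worker([&flag] { flag.set(); });
    flag.wait();
    worker.join();
    return flag.ready();
}
```

Calling `run_ready_flag_example()` from a `main` function exercises both lock types; the predicate form of `wait` also guards against spurious wakeups.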
b67596ba2a [fix](ut) fix be ut failed (#8682) 2022-03-28 10:50:41 +08:00
Rio
7cf39fe885 [typo] Optimize some code comments (#8673) 2022-03-28 10:38:10 +08:00
aa1592b932 [community] Add more collaborators (#8672)
1. add dataroaring
2. remove qidaye because he has become a committer of Doris

discuss thread: https://lists.apache.org/thread/8bxnj7qw2p120v077nm8gny52m65d22r
2022-03-28 10:37:31 +08:00
e4c0dd97ed [doc] fix buffer pool default value (#8670) 2022-03-28 10:37:12 +08:00
Pxl
8eef5c337a [doc] fix sql-mode document (#8662) 2022-03-28 10:35:27 +08:00
079e35f3d3 [doc] update doc of vec-execution-engine (#8655) 2022-03-28 10:26:28 +08:00
d45026171d [test] regression framework use RollingFileAppender by default (#8654) 2022-03-28 10:25:34 +08:00
79be81a8a4 [chore] Optimize build_lz4 in build-thirdparty.sh (#8653) 2022-03-28 10:24:32 +08:00
727e8842d4 [test] limit memory used by regression test framework (#8651) 2022-03-28 10:24:12 +08:00
6cbc5014b9 [doc] update export.md (#8650)
"where" should be in front of "to".
2022-03-28 10:23:53 +08:00
7cfce63a13 [fix](mini-load) Remove mini load in LOADING and PENDING state (#8649)
1. Remove some unused code.
2. handle mini load with wrong state
    1. For some historical reasons, some mini load jobs in LOADING state have not been cleared.
        As a result, new load jobs cannot be committed.
    2. If a mini load job is created right before FE restart, the mini load job will be in PENDING state forever.
        But it should be removed finally.
2022-03-28 10:22:17 +08:00
57e038120f [chore] add -rtlib=compiler-rt for UBSAN under clang (#8647) 2022-03-28 10:21:55 +08:00
887301474d [doc] Update compilation.md (#8646)
Added solutions to the "fatal error: Killed signal terminated program ..."
problem encountered when compiling with Docker to the FAQ.
2022-03-28 10:21:31 +08:00
70fd5c0735 [doc] optimize some doc expression (#8645) 2022-03-28 10:20:38 +08:00
ea45940ef0 [fix] fix memory leak in VDataStreamRecvr::SenderQueue (#8643)
After `VDataStreamRecvr::SenderQueue::close` clears `_block_queue`, calling 
`VDataStreamRecvr::SenderQueue::add_block` again will cause a memory leak.

So, change the lock position, like the other add_block and add_batch.
2022-03-28 10:19:22 +08:00
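
One plausible reading of "change the lock position" is that the closed check and the enqueue must happen under the same lock that `close` holds. The sketch below illustrates that pattern under this assumption; `SenderQueueSketch` and `Block` are hypothetical stand-ins, not the real `VDataStreamRecvr::SenderQueue`.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical payload type used only for this sketch.
struct Block {
    std::string payload;
};

// Hypothetical queue; it only mirrors the locking pattern described above.
class SenderQueueSketch {
public:
    // Returns false when the queue is already closed and the block is rejected.
    bool add_block(std::unique_ptr<Block> block) {
        // Take the lock before checking _closed, so a concurrent close()
        // cannot clear the queue between the check and the push.
        std::lock_guard<std::mutex> guard(_lock);
        if (_closed) {
            return false;
        }
        _block_queue.push_back(std::move(block));
        return true;
    }

    void close() {
        std::lock_guard<std::mutex> guard(_lock);
        _closed = true;
        _block_queue.clear();
    }

    std::size_t size() {
        std::lock_guard<std::mutex> guard(_lock);
        return _block_queue.size();
    }

private:
    std::mutex _lock;
    bool _closed = false;
    std::deque<std::unique_ptr<Block>> _block_queue;
};
```

With this ordering, a block offered after `close` is rejected instead of sitting in a queue that nobody will drain.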
cdf0a016c3 [fix](vec) fix coredump for aggregate function when delete large_data, due to alloc-dealloc-mismatch (#8641) 2022-03-28 10:17:13 +08:00
11f9f5fe4d [chore][be-test] Link gtest_main to provide default main function definition. (#8631) 2022-03-28 10:14:48 +08:00
726eaa68ea [fix](vectorization) Vectorization decimal arithmetic inconsistent (#8626) 2022-03-28 10:12:39 +08:00
HB
39717a85a2 [fix](load) Fix null column bug in load's mapping column setting (#8625) 2022-03-28 10:08:00 +08:00
f96bc62573 [feature](balance) Support balance between disks on a single BE (#8553)
Currently, a Doris cluster can be balanced as a whole while the disks of a single backend remain unbalanced.
For example, backend A has two disks, disk1 and disk2: disk1's usage is 98%, but disk2's usage is only 40%.
disk1 cannot take more data, so only one disk of backend A can accept new data,
and the available write throughput of backend A is only half of its capacity. This cannot currently be resolved through load or
partition rebalance.

So we introduce a disk rebalancer. Unlike the other rebalancers (load or partition),
which take care of cluster-wide data balancing, it takes care of backend-wide data balancing.

[For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)
2022-03-28 10:03:21 +08:00
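
The disk1/disk2 example above can be turned into a small selection rule: move data from the fullest disk to the emptiest one when their usage differs by more than some threshold. This is only an illustrative sketch; `Disk`, `DiskMove`, `pick_disk_move` and the threshold parameter are assumptions, and the actual rebalancer in the PR may choose tablets and disks differently.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one disk on a backend.
struct Disk {
    std::string path;
    double used_ratio;  // 0.98 means 98% used
};

// Hypothetical result: whether to move data, and between which disks.
struct DiskMove {
    bool needed = false;
    std::string from;
    std::string to;
};

// Picks the fullest disk as source and the emptiest as destination when
// their usage gap exceeds `threshold`; otherwise reports no move.
DiskMove pick_disk_move(const std::vector<Disk>& disks, double threshold) {
    DiskMove move;
    if (disks.size() < 2) {
        return move;
    }
    const Disk* fullest = &disks[0];
    const Disk* emptiest = &disks[0];
    for (const Disk& disk : disks) {
        if (disk.used_ratio > fullest->used_ratio) fullest = &disk;
        if (disk.used_ratio < emptiest->used_ratio) emptiest = &disk;
    }
    if (fullest->used_ratio - emptiest->used_ratio > threshold) {
        move.needed = true;
        move.from = fullest->path;
        move.to = emptiest->path;
    }
    return move;
}
```

For the backend in the example (98% and 40%) and a threshold of 0.10, the rule selects disk1 as the source and disk2 as the destination.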
b2861f36c4 [chore] optimize aws thirdparty package download. (#8637) 2022-03-28 09:35:51 +08:00
Pxl
02612c7ec0 [Refactor] Remove unused file (#8657) 2022-03-27 01:41:06 +08:00
aeee738af0 Revert "[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635)" (#8666)
This reverts commit 6bc982c37436acf288f566cf10e084731b80fa44.
2022-03-25 18:32:50 +08:00
e285d09157 [Enhancement](load) speed up stream load for duplicate table, use template for faster get_type_info. (#8500) 2022-03-25 15:18:43 +08:00
6bc982c374 [Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635) 2022-03-25 15:17:39 +08:00
8b4e57287f row num is more accurate than column num in data_types (#8628) 2022-03-25 14:38:27 +08:00
cfb57be731 [api-change] add soft limit of String type length (#8567)
1. add a config `string_type_soft_limit` that sets a soft limit on the maximum length of the String type
2. disable using the String type in Key columns, partition columns and
   distribution columns
3. remove the String type alias BLOB for future use
2022-03-25 09:28:41 +08:00
5511d435de [Doris Manager][Doc]Basic User Documents for Doris Manager (#8609) 2022-03-24 21:34:49 +08:00
9db2a96af1 [test] support a lot of actions (#8632)
Support a lot of actions for regression testing framework.
e.g. thread, lazyCheck, onSuccess, connect, selectUnionAll, timer

Demo exists in ${DORIS_HOME}/regression-test/suites/demo
2022-03-24 20:22:24 +08:00
c69dd54116 [refactor](mutex) Use std::mutex to replace Mutex and refactor some lock logic (#8452) 2022-03-24 14:50:02 +08:00
aaaaae53b5 [feature] (memory) Switch TLS mem tracker to separate more detailed memory usage (#8605)
In pr #8476, all memory usage of a process is recorded in the process mem tracker,
and all memory usage of a query is recorded in the query mem tracker,
and it is still necessary to manually call `transfer to` to track the cached memory size.

We hope to separate out more detailed memory usage based on Hook TCMalloc new/delete + TLS mem tracker.

In this pr, the more detailed mem tracker is switched to TLS, which automatically and accurately
counts more detailed memory usage than before.
2022-03-24 14:29:34 +08:00
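
The idea of a thread-local (TLS) tracker that is switched for finer-grained accounting can be sketched as follows. This is a simplified illustration, not the Doris implementation: `MemTracker`, `current_tracker`, `ScopedTrackerSwitch`, `on_alloc` and `on_free` are hypothetical names, and `on_alloc`/`on_free` stand in for the TCMalloc new/delete hooks mentioned above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical tracker that just accumulates a byte count.
struct MemTracker {
    std::string label;
    std::int64_t consumed = 0;
};

// Per-thread pointer to the tracker that should receive allocations.
inline MemTracker*& current_tracker() {
    thread_local MemTracker* tracker = nullptr;
    return tracker;
}

// RAII switch: points the thread at a more detailed tracker for a scope
// and restores the previous tracker on exit.
class ScopedTrackerSwitch {
public:
    explicit ScopedTrackerSwitch(MemTracker* tracker)
            : _previous(current_tracker()) {
        current_tracker() = tracker;
    }
    ~ScopedTrackerSwitch() { current_tracker() = _previous; }

    ScopedTrackerSwitch(const ScopedTrackerSwitch&) = delete;
    ScopedTrackerSwitch& operator=(const ScopedTrackerSwitch&) = delete;

private:
    MemTracker* _previous;
};

// Stand-ins for allocator hooks: charge or release bytes on whichever
// tracker is current for this thread.
inline void on_alloc(std::int64_t bytes) {
    if (MemTracker* tracker = current_tracker()) tracker->consumed += bytes;
}

inline void on_free(std::int64_t bytes) {
    if (MemTracker* tracker = current_tracker()) tracker->consumed -= bytes;
}
```

Because the switch is scoped, code that allocates does not need to call anything tracker-specific; the hook charges whichever tracker the thread currently points at.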
5f606c9d57 [fix] Fix coredump of stddev function (#8543)
This is only a temporary fix; its performance is not ideal. Eventually,
we need to rework the `stddev` functions and delete the `insert_to_null_default()` interface.
2022-03-24 11:39:29 +08:00
Pxl
0292b9ad9e [Enhancement] add build parameters ENABLE_JAVAUDF, BUILD_DOCS (#8612)
* add build parameters ENABLE_JAVAUDF, BUILD_DOCS
2022-03-24 10:53:52 +08:00
Pxl
2760bcbcc1 [fix] fix core dump on deep_copy_tuple when data is null (#8620) 2022-03-24 09:15:38 +08:00
6e1147206e [doc] fix help module failed (#8617)
Introduced by #8509.
The docs title is duplicated.
2022-03-24 09:15:06 +08:00
286ee8e1d4 [doc] fix typo for session (#8610) 2022-03-24 09:14:44 +08:00
a58e56f0b4 [fix](load) fix another bug that BE may crash when calling mark_as_failed (#8607)
Same as #8501
2022-03-24 09:13:54 +08:00
Pxl
7fc22c2456 [fix][vectorized] fix core on get_predicate_column_ptr && fix double copy on _read_columns_by_rowids (#8581) 2022-03-24 09:12:42 +08:00
bea9a7ba4f [feature] Support pre-aggregation for quantile type (#8234)
Add a new column type to speed up the approximation of quantiles.
1. The new column type is named `quantile_state`, with the fixed aggregation function `quantile_union`; it stores the intermediate results of pre-aggregated approximation calculations for quantiles.
2. Support pre-aggregation of the new column type and the `quantile_state`-related functions.
2022-03-24 09:11:34 +08:00
36c85d2f06 [fix][vectorized] Fix bug of left semi/anti with other join conjunct (#8596) 2022-03-23 10:34:47 +08:00
72dfdb9a6c [fix] Fix Check_time return wrong value when exec show table status (#8578) 2022-03-23 10:34:23 +08:00
92feb9c6c8 [fix] Fix error crc32 method to cal uint128 and int128 (#8577) 2022-03-23 10:33:32 +08:00
b89e4c7bba [feature-wip](java-udf) support java UDF with fixed-length input and output (#8516)
This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR supports Java UDFs with fixed-length input and output. Phase I in DSIP-1 is done after this PR.

To support Java UDFs efficiently, I avoid data copies in the JNI call, and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead.

For users, a UDF class must have a public evaluate method.
2022-03-23 10:32:50 +08:00