1970 Commits

Author SHA1 Message Date
de7dce4df8 [Refactor] remove some useless code (#8976) 2022-04-18 09:55:54 +08:00
be0ba76dff [Refactor] Use '#pragma once' to replace '#define' and '#endif' (#9062) 2022-04-18 09:54:59 +08:00
c71ffc01de [Refactor] Cleanup some unused include (#9063) 2022-04-18 09:52:31 +08:00
d1d834694f [fix] Fix bug of wrong argument of drop_tablet function (#9031)
introduced from #8574
2022-04-15 15:19:28 +08:00
0bf72caf68 [Bug][Vectorized] Fix UB when doing ORDER BY. (#9023) 2022-04-15 14:02:29 +08:00
Pxl
f7a5ff4f1d [Enhancement] [Storage Vectorize] optimize BitmapRangeIterator.next_range() (#9013) 2022-04-15 11:27:03 +08:00
579aee110a [fix](ut)(compile) Fix BE compile bug and FE unit test (#9027)
1. The compile bug was introduced by #8855
2. The FE ut bug was introduced by #8848 and #8770
2022-04-14 17:37:41 +08:00
9ac6d23a44 [Feature]support stddev/variance agg functions to window function (#8962) 2022-04-14 12:07:26 +08:00
5e95d99925 [fix](load) fix bug of infinite loop in orc scanner (#9007)
When it encounters unqualified data, the ORC scanner may not be able
to quit correctly.
2022-04-14 11:46:48 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are two status types in BE: one is in common/Status.h,
and the other, called OLAPStatus, is in olap/olap_define.h.
OLAPStatus is just an enum type; it is very simple and cannot carry much information,
so I will unify this code on common/Status.
2022-04-14 11:43:49 +08:00
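The difference between an enum-only status and a status object can be sketched as follows. This is a minimal illustration, not the Doris implementation: `ErrorCode` and `SimpleStatus` are hypothetical names introduced here, and the real `Status` class in common/Status.h may look quite different.

```cpp
#include <cassert>
#include <string>
#include <utility>

// An enum-style status can only say *which* error happened.
enum class ErrorCode { OK = 0, IO_ERROR, NOT_FOUND };

// Hypothetical status object: the same code plus a free-form message,
// so the caller can also learn *what* went wrong.
class SimpleStatus {
public:
    static SimpleStatus OK() { return SimpleStatus(ErrorCode::OK, ""); }

    static SimpleStatus Error(ErrorCode code, std::string msg) {
        assert(code != ErrorCode::OK);  // errors must carry a non-OK code
        return SimpleStatus(code, std::move(msg));
    }

    bool ok() const { return _code == ErrorCode::OK; }
    ErrorCode code() const { return _code; }
    const std::string& message() const { return _msg; }

private:
    SimpleStatus(ErrorCode code, std::string msg)
            : _code(code), _msg(std::move(msg)) {}

    ErrorCode _code;
    std::string _msg;
};
```

A caller that used to return a bare enum value could instead return something like `SimpleStatus::Error(ErrorCode::NOT_FOUND, "tablet not found")`, keeping the context next to the code.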
8765881d8b [fix](load) wait _send_batch_thread_pool_token rather than shutdown. (#8970)
We cannot shut down _send_batch_thread_pool_token, because _packet_in_flight
has to be cleared eventually. Otherwise a join on the RPC would never end.

It is difficult to handle concurrency problems if a flag setter is not guaranteed to run.
2022-04-14 10:05:14 +08:00
943b08bcdf fix master compile error (#8992)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-04-13 11:23:37 +08:00
c872793a23 remove rowset converter since it is useless (#8974)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-04-13 10:40:12 +08:00
5881e8fdc6 [refactor] use C++14 deprecated instead of comment; this detects usage of deprecated var or func at compile time (#8439) 2022-04-13 10:19:04 +08:00
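A minimal sketch of the technique named in this commit, the standard C++14 `[[deprecated]]` attribute. The functions `first_row` and `get_first` are hypothetical and are not taken from the Doris code base.

```cpp
#include <cassert>
#include <vector>

// Hypothetical replacement API that callers should migrate to.
inline int first_row(const std::vector<int>& rows) {
    assert(!rows.empty());  // precondition: at least one row
    return rows.front();
}

// Old entry point kept for compatibility. Every call site now produces a
// compile-time diagnostic (for example -Wdeprecated-declarations on GCC and
// Clang) instead of relying on readers to notice a "// deprecated" comment.
[[deprecated("use first_row() instead")]]
inline int get_first(const std::vector<int>& rows) {
    return first_row(rows);
}
```

Declaring the attribute does not warn by itself; the diagnostic appears only where `get_first` is actually called.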
290366787c [refactor] refactor code, replace some file with stl libs (#8759)
1. replace ConditionVariables with std::condition_variable
2. replace Mutex with std::mutex
3. replace MonoTime with std::chrono
2022-04-13 09:55:29 +08:00
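The three standard-library replacements listed in this commit can be sketched together as follows. `ReadyFlag` and `elapsed` are hypothetical helpers written for illustration only; they do not correspond to any class removed or added by the PR.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical one-shot flag built only from standard-library primitives.
class ReadyFlag {
public:
    // Marks the flag as set and wakes every waiter.
    void set() {
        {
            std::lock_guard<std::mutex> guard(_mutex);
            _ready = true;
        }
        _cv.notify_all();
    }

    // Waits up to `timeout` for set(); returns true if the flag was set.
    bool wait_for(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);  // negative timeouts make no sense here
        std::unique_lock<std::mutex> lock(_mutex);
        return _cv.wait_for(lock, timeout, [this] { return _ready; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _ready = false;
};

// Measures how long a callable takes using std::chrono::steady_clock,
// a monotonic clock, which is the role a MonoTime-style helper plays.
template <typename Fn>
std::chrono::nanoseconds elapsed(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start);
}
```

The predicate overload of `wait_for` is used so that spurious wakeups do not cause a false positive.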
Pxl
64cf64d1f8 remove unused code and opt int_div (#8966) 2022-04-13 09:51:01 +08:00
52d18aa83c permute impl for column array; and codes format (#8949)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-04-13 09:47:54 +08:00
0c8ea8ce9f [Vectorized] Let VAssertRowNumNode handle return value of child->get_next (#8969) 2022-04-12 19:56:03 +08:00
5a44eeaf62 [refactor] Unify all unit tests into one binary file (#8958)
1. solve the previous problems that the unit test file size was too large (1.7G+) and the unit test link time was too long
2. unify all unit tests into one file to significantly reduce unit test execution time to less than 3 mins
3. temporarily disable stream_load_test.cpp, metrics_action_test.cpp, load_channel_mgr_test.cpp because they re-implement part of the code and affect other tests
2022-04-12 15:30:40 +08:00
66d2f4e1fd [fix][mem tracker] Fix MemTracker null pointer in vectorized (#8925)
Fix ThreadMemTrackerMgr::update_tracker null pointer and some details.

Issue Number: close #8920
2022-04-12 10:17:10 +08:00
067309c466 [fix](compile) fix compilation bug (#8950) 2022-04-11 13:12:34 +08:00
Pxl
8a066e2586 [fix](vectorized) core dump on ST_AsText (#8870) 2022-04-11 09:39:32 +08:00
8158b05ea0 [fix] Fix bug that tablet data size and row num info fail to be reported. (#8945)
Introduced from #8146
2022-04-11 09:38:28 +08:00
7f7172807f [feature](function)(vectorized) Support all geolocation functions on vectorized engine (#8846) 2022-04-11 09:36:53 +08:00
0d761f9909 [feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678)
This feature is proposed in DSIP-1. This PR supports variable-length input and output for Java UDFs.
2022-04-11 09:36:16 +08:00
6ed59bb98b [refactor](code_style) remove useless inline #8933
1. Member functions defined in a class are inline by default (implicitly), so the keyword does not need to be added
2. inline is a keyword used for implementation, which has no effect when placed before the function declaration
2022-04-10 18:29:55 +08:00
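The first point of this commit can be illustrated with a small sketch. `Counter` is a hypothetical class, not code from the PR.

```cpp
#include <cassert>

class Counter {
public:
    // Defined inside the class body, so it is implicitly inline;
    // writing `inline` in front of it would be redundant.
    int value() const { return _value; }

    // Declared here, defined outside the class body below.
    void add(int delta);

private:
    int _value = 0;
};

// An out-of-class definition that lives in a header needs an explicit
// `inline` to be safely included from several translation units.
inline void Counter::add(int delta) {
    assert(delta >= 0);  // this toy counter only counts up
    _value += delta;
}
```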
1fe4ea4c7c [Refactor-step1] Add OLAPInternalError to status (#8900) 2022-04-10 00:16:43 +08:00
5706679e08 [fix] fix the problem that when using tsan to compile, BE will stack overflow at startup (#8904)
Currently TSAN can only be compiled using Clang, not GCC.
And when compiling with -O0, a stack overflow occurs at startup (issue #8868).
A function definition will be reported missing at compile time; the file provided in PR #8665 is required.
2022-04-09 19:17:28 +08:00
ce6b5169c2 [fix](join) Fix error bucket num get in bucket shuffle join in dynamic partition (#8891) 2022-04-09 19:11:44 +08:00
c5718928df [feature-wip](array-type) support explode and explode_outer table function (#8766)
explode(ArrayColumn) desc:
> Create a row for each element in the array column. 

explode_outer(ArrayColumn) desc:
> Create a row for each element in the array column. Unlike explode, if the array is null or empty, it returns null.

Usage example:
1. create a table with array column, and insert some data;
2. open enable_lateral_view and enable_vectorized_engine;
```
set enable_lateral_view = true;
set enable_vectorized_engine=true;
```
3. use explode_outer
```
> select * from array_test;
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    3 | NULL | NULL   |
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
+------+------+--------+

> select k1,explode_column from array_test LATERAL VIEW explode_outer(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
|    2 |           NULL |
|    4 |           NULL |
|    3 |           NULL |
+------+----------------+
```
4. explode usage example. explode returns no rows when the ARRAY is null or empty
```
> select k1,explode_column from array_test LATERAL VIEW explode(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
+------+----------------+
```
2022-04-08 12:11:04 +08:00
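The row-level difference between explode and explode_outer described in this commit can be modeled with a short sketch. It assumes C++17 and is only a model of the semantics shown in the example output, not the vectorized table function from the PR; `ArrayCell`, `OutRow`, and `explode_rows` are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// A nullable array cell: std::nullopt models SQL NULL.
using ArrayCell = std::optional<std::vector<int>>;
// One output row (k1, explode_column); nullopt models SQL NULL.
using OutRow = std::pair<int, std::optional<int>>;

// Emits one row per array element. With outer == true, a NULL or empty
// array still yields a single row whose exploded value is NULL
// (explode_outer); with outer == false such input rows produce nothing
// (explode).
std::vector<OutRow> explode_rows(
        const std::vector<std::pair<int, ArrayCell>>& input, bool outer) {
    std::vector<OutRow> out;
    for (const auto& [k1, cell] : input) {
        if (!cell.has_value() || cell->empty()) {
            if (outer) out.emplace_back(k1, std::nullopt);
            continue;
        }
        for (int element : *cell) out.emplace_back(k1, element);
    }
    return out;
}
```

Fed the four rows of `array_test`, the sketch yields five rows with `outer == true` and two rows with `outer == false`, matching the row counts in the query results above. Row order follows the input here, whereas the SQL results carry no ordering guarantee.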
bd0a3369b7 [fix] check disk capacity before writing data (#8887)
1. We forgot to check disk capacity when writing data.
2. TODO: the user specified disk capacity is not used now. We need to find a way to use it.
3. Avoid printing too many compaction logs when there is no suitable version for compaction.
2022-04-08 11:29:49 +08:00
f854f0e83e remove unreadable char in comment (#8909) 2022-04-08 09:26:53 +08:00
Pxl
dbbc6549bd [feature](vectorized) support vexplode_bitmap (#8890) 2022-04-08 09:20:26 +08:00
3f04220d49 [typo] Fix typo in function.cpp (#8873) 2022-04-08 09:09:19 +08:00
0b98d78664 [improvement](hll) Optimize Hyperloglog (#8829)
At Meituan, PR #6625 was reverted due to an OOM problem.
Currently, we are trying to modify the old HyperLogLog; based on PR #8555, we did some work.
In some tests, we found it better than the old HLL, and better than the apache:master HLL.

Changes summary:

- use SIMD max to speed up heavy function _merge_registers
- use phmap::flat_hash_set rather than std::set
- replace std::max
- other small changes
2022-04-08 09:06:08 +08:00
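Merging two HyperLogLog sketches keeps the larger value of each register pair, which is why an element-wise max is the hot loop. The following is a minimal scalar sketch of that operation, not the SIMD code from the PR; `merge_registers` here is a hypothetical free function, and the register width of one byte is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Merges `src` into `dst` by keeping the larger value of each register pair.
// The loop body is branch-free and works on contiguous bytes, which is the
// shape of code that can be mapped to SIMD byte-max instructions, either by
// the compiler's auto-vectorizer or by hand-written intrinsics.
void merge_registers(std::vector<std::uint8_t>& dst,
                     const std::vector<std::uint8_t>& src) {
    assert(dst.size() == src.size());  // both sketches need equal precision
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = dst[i] < src[i] ? src[i] : dst[i];
    }
}
```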
519305cb22 [feature-wip] (memory tracker) (step4) Switch TLS mem tracker to separate more detailed memory usage (#8669)
Based on #8605, Separate out the memory usage of each operator from the Query/Load/StorageEngine mem tracker.
2022-04-08 09:02:26 +08:00
7fb4b6a6e2 [chore](tsan) add file mremap_fallback for tsan (#8665) 2022-04-08 09:01:53 +08:00
d51545a952 [fix](ut)(memory-leak) Fix be asan ut failed and hdfs file reader memory leak (#8905) 2022-04-08 00:07:00 +08:00
Pxl
2a25b90cb3 [Test] Fix explode test and build fail (#8885) 2022-04-07 14:23:57 +08:00
02be8176c3 [fix] access parallel_flat_hash_map via thread-safe methods (#8854)
Iterators of parallel_flat_hash_map are not thread-safe, so
we should use if_contains instead.
2022-04-07 11:35:59 +08:00
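The idea behind preferring a callback-style lookup over an iterator can be sketched with standard-library types only. `GuardedMap` is a hypothetical stand-in and is not parallel_flat_hash_map; it only mirrors the shape of an `if_contains(key, callback)` lookup under the assumption of a single internal lock.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical std-only stand-in for a map with internal locking.
class GuardedMap {
public:
    void insert_or_assign(int key, std::string value) {
        std::lock_guard<std::mutex> guard(_mutex);
        _map.insert_or_assign(key, std::move(value));
    }

    // Runs `fn` on the value while the lock is held and returns true if the
    // key exists. No iterator escapes the call, so a concurrent writer
    // cannot invalidate what the caller is reading.
    template <typename Fn>
    bool if_contains(int key, Fn&& fn) const {
        std::lock_guard<std::mutex> guard(_mutex);
        auto it = _map.find(key);
        if (it == _map.end()) return false;
        fn(it->second);
        return true;
    }

private:
    mutable std::mutex _mutex;
    std::unordered_map<int, std::string> _map;
};
```

By contrast, a `find()` that returns an iterator would hand the caller a reference into the table after the lock has been released.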
ca4055244e [fix](storage) Fix core bug of convert to predicate column (#8833)
To reproduce:
When `enable_low_cardinality_optimize = true`, for the TPCH dataset, the following SQL query causes a core dump:
```sql
select count(*) from lineitem where l_comment = 'ously even exc';
```

This SQL triggers the execution of `ColumnDictionary::convert_to_predicate_column_if_dictionary`, and `res->reserve(_codes.size())` is problematic because the current `_codes.size()` is smaller than its reserve value, so inserting a value into `PredicateColumn` causes a core dump.
2022-04-07 11:29:26 +08:00
98cab78320 [refactor](schema_hash) remove schema_hash since every tablet id in be is unique (#8574) 2022-04-07 08:37:45 +08:00
e53c90fbef min and max window function bug fix (#8822)
[Fix bug] min and max window function bug fix #8822
2022-04-07 08:36:33 +08:00
f90a1a1919 [fix](ut)(compile) Fix ut failure at functions_geo and compilation bug (#8843) 2022-04-05 21:30:40 +08:00
Pxl
03c5d5d677 fix some error on build.sh && fix build fail with clang on runtime_profile (#8748) 2022-04-05 15:52:53 +08:00
d07b49247e rm sequential file (#8713)
[refactor]remove sequential file reader from env
2022-04-04 17:49:06 +08:00
fcefed7c1c [Bug][Vectorized] Fix core bug of segment vectorized (#8800)
* [Bug][Vectorized] Fix core bug of segment vectorized
1. Read table with delete condition
2. Read table with default value HLL/Bitmap Column

* refactor some code

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-04-03 19:50:25 +08:00
33736e45fa [fix](table-function) Fixed unreasonable nullable conversion (#8818) 2022-04-03 11:02:35 +08:00
78b85414d6 [fix](debug) get_hash_value_fvn DCHECK failed (#8811)
* fix_get_hash_value_fvn

* fix compile
2022-04-03 10:55:15 +08:00
f3c6ddf651 [feature](function) Support geolocation functions on vectorized engine (#8790) 2022-04-03 10:50:54 +08:00