Commit Graph

249 Commits

Author SHA1 Message Date
65e277e365 [refacotr](node) refactor partition sort node to improve readability (#30511)
* [refacotr](node) refactor partition sort node to improve readability

* update
2024-02-01 19:01:08 +08:00
Pxl
1aa7a914e1 fix wrong profile on distinct agg and pass reference on uint136's compare (#30661) 2024-02-01 19:00:50 +08:00
77b366fc4b [fix](join) incorrect result of mark join (#30543)
incorrect result of mark join
2024-01-31 23:53:40 +08:00
ef8d9ad9a4 [pipelinex](profile) improve memory counter of pipelineX (#30538) 2024-01-31 23:53:39 +08:00
e6fbccd3ed [Feature](Variant) support row store for variant type (#30052) 2024-01-31 23:53:39 +08:00
22322a3864 [refactor](join) remove unused RowRefListType from join (#30442) 2024-01-27 09:10:41 +08:00
713798d549 [feature](nereids)support mark join (#30133)
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-01-27 09:09:53 +08:00
9aaa6ba351 [Fix](Variant) fix variant lost null info after cast_column (#30153)
This could result incorrect output in hirachinal cases

```
 sql """insert into ${table_name} values (-3, '{"a" : 1, "b" : 1.5, "c" : [1, 2, 3]}')"""
    sql """insert into  ${table_name} select -2, '{"a": 11245, "b" : [123, {"xx" : 1}], "c" : {"c" : 456, "d" : "null", "e" : 7.111}}'  as json_str
            union  all select -1, '{"a": 1123}' as json_str union all select *, '{"a" : 1234, "xxxx" : "kaana"}' as json_str from numbers("number" = "4096") limit 4096 ;"""

mysql> select v["c"] from var_rs where k = -3 or k = -2;
+----------------------+
| element_at(`v`, 'c') |
+----------------------+
| [1,2,3]              |
| []                   |
+----------------------+
2 rows in set (0.04 sec)
```
2024-01-27 09:08:29 +08:00
Pxl
1d7d7ee9b6 [Chore](join) split out join hash map from hash map (#30280)
split out join hash map from hash map
2024-01-25 13:24:52 +08:00
ead3b4ac1d [feature](function) support ip function is_ipv4_compat, is_ipv4_mapped (#29954) 2024-01-23 10:07:51 +08:00
Pxl
a5ca8833d7 [Improvement](aggregate) optimize for small string aggregate (#29919) 2024-01-19 15:48:15 +08:00
66513d57f9 [feature](function) support ip function named ipv6_cidr_to_range(addr, cidr) (#29812) 2024-01-16 18:42:09 +08:00
Pxl
068367063f [Improvement](function) optimization for substr with ascii string (#29799) 2024-01-12 11:59:52 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
Pxl
e556536de1 [Refactor](join) split SetHashTableVariants out from HashTableVariants (#29519)
split SetHashTableVariants out from HashTableVariants
2024-01-08 10:37:00 +08:00
7402fee1fc [feature](function) support ip function ipv6_string_to_num(_or_default, _or_null), inet6_aton (#28361) 2024-01-05 19:24:45 +08:00
Pxl
d8a08dad90 [Bug](mark-join) fix wrong result on mark join + other conjunct (#29321)
fix wrong result on mark join + other conjunct
2024-01-04 11:58:39 +08:00
Pxl
358995e4ac [Chore](sort) fix block used after it was moved(#29416)
/mnt/disk2/pengyu/codebase/apache/doris/be/src/vec/common/sort/heap_sorter.cpp:91:23: error: 'tmp_block' used after it was moved [bugprone-use-after-move,-warnings-as-errors]
2024-01-03 10:02:35 +08:00
a525d5c5a3 [refactor](decimal) change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion (#29265)
change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion
2023-12-29 10:11:44 +08:00
3dc3e81734 [Improvement](datatype) Update Parser for IPv4/v6 data types (#29044)
Transforming from parsing std:: string to parsing char * to accelerate the parsing of ipv4/v6 data types.
2023-12-28 11:00:38 +08:00
7081139bdc [fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943) 2023-12-25 20:35:22 +08:00
b7ae7a07c7 [fix](join) incorrect result of left semi/anti join with empty build side (#28898) 2023-12-25 09:07:38 +08:00
Pxl
89d728290d [Chore](execute) remove some unused code and adjust check_row_nums #28576 2023-12-19 09:55:50 +08:00
e75d91c91b [regression-test](Variant) Add more cases related to schema changes (#27958)
* [regression-test](Variant) Add more cases related to schema changes

And fix bugs about schema change for variant:
fix bug schema change crash on doing schema change with tablet schema that contains extracted columns
2023-12-08 10:15:12 +08:00
17016b9797 [improvement](decimal) use new way for decimal arithmetic precision promotion (#27787)
* [DNM](decimal) use new way for decimal arithmetic precision promotion

* [improvement](decimal) [DNM](decimal) use new way for decimal arithmetic precision promotion
1. [DNM](decimal) use new way for decimal arithmetic precision promotion
2. throw exception if it overflows for decimal arithmetics
3. throw exception if it overflows when casting among number types

* fix compile error of gcc

* improvement

---------

Co-authored-by: morrySnow <morrysnow@126.com>
2023-12-05 12:54:40 +08:00
48935c14e2 [Improvement](variant) limit the column size on tablet schema (#27399) (#27785)
1. limit the column count to default 2048
2. fix get_inverted_index return nullptr when variant's unique id is -1, using it's parent unique id instead
3. avoid add same path subcolumn duplicately in tablet schema
4. make extracted column unique id -1
2023-12-04 14:47:36 +08:00
2e1ce758f1 [feature](function) support ip function ipv6numtostring(alias inet6_ntoa) (#27342) 2023-12-02 11:48:19 +08:00
8ca8a0655e [fix](memtracking) require size in Allocator::free (#27795) 2023-11-30 15:57:15 +08:00
e9debca97c [Improve](sort) avoid too may tmp vectors for get_columns (#27734) 2023-11-30 09:47:31 +08:00
7398c3daf1 [Feature-Variant](Variant Type) support variant type query and index (#27676) 2023-11-29 10:37:28 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00
334260dff7 [feature](function) support ip function ipv4stringtonum(ordefault, ornull), inet_aton (#25510) 2023-11-17 10:27:07 +08:00
6183b298e1 [refactor](data_type) remove some unused functions (#26966) 2023-11-15 09:23:53 +08:00
de6ecd2035 [fix](tls) Manually track memory in Allocator instead of mem hook and ThreadContext life cycle to manual control (#26904)
Manually track query/load/compaction/etc. memory in Allocator instead of mem hook.
Can still use Mem Hook when cannot manually track memory code segments and find memory locations during debugging.
This will cause memory tracking loss for Query, loss less than 10% compared to the past, but this is expected to be more controllable.
Similarly, Mem Hook will no longer track unowned memory to the orphan mem tracker by default, so the total memory of all MemTrackers will be less than before.
Not need to get memory size from jemalloc in Mem Hook each memory alloc and free, which would lose performance in the past.
Not require caching bthread local in pthread local for memory hook, in the past this has caused core dumps inside bthread, seems to be a bug in bthread.
ThreadContext life cycle to manual control
In the past, ThreadContext was automatically created when it was used for the first time (this was usually in the Jemalloc Hook when the first malloc memory), and was automatically destroyed when the thread exited.
Now instead of manually controlling the create and destroy of ThreadContext, it is mainly created manually when the task thread start and destroyed before the task thread end.
Run 43 clickbench query tests.
Use MemHook in the past:
2023-11-14 10:30:42 +08:00
7332b1b371 [fix](decimal) fix undefined behaviour of divide by zero when cast string to decimal (#26822)
* [fix](decimal) fix undefined behaviour of divide by zero when cast string to decimal

* fix format
2023-11-13 10:09:06 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
a3666aa87e [feature](decimal) support decimal256 when creating table (#26308) 2023-11-08 15:21:01 +08:00
44b51bf0b9 [Feature](Variant) support variant load (#26572) 2023-11-08 00:37:57 -06:00
a4e415ab09 [feature](hive)Support hive tables after alter type. (#25138)
1.Reconstruct the logic of decode to read parquet. The parquet  reader first reads the data according to the parquet physical type, and then performs a type conversion.

2.Support hive alter table.
2023-11-02 00:24:21 +08:00
Pxl
d7ea4d31fb [Chore](hash-table) remove unused code about HashTableTraits (#26202)
remove unused code about HashTableTraits
2023-11-01 21:36:50 +08:00
973657d163 [fix](compile) be cannot compile on MacOS (#26155)
build on MacOS meet error: reference to 'detail' is ambiguous.
Because there is a detail namespace under std
2023-10-31 17:36:00 +08:00
6761dc4113 [coverage](test) improve test coverage (#26096)
improve test coverage
2023-10-30 18:01:55 +08:00
c1d64a7128 [Feature](datatype) Add IPv4/v6 data type for doris (#24965) 2023-10-26 17:33:28 +08:00
6dd60c6ebb [Enhance](BE) Add -Wshadow-field compile option to avoid unexpected shadowing behavior (#25698)
* Fix `Tablet::_meta_lock` shadows member inherited from `BaseTablet`

* Add -Wshadow-field compile option to avoid unexpected shadowing behavior
2023-10-26 10:00:28 +08:00
693982fd1a [feature](decimal) support decimal256 (#25386) 2023-10-25 15:47:51 +08:00
87b414cdae [Fix](query execution) Fix result sink fragment can't be cancelled in non-pipeline (#25524) 2023-10-24 11:30:29 +08:00
Pxl
6714966df2 [Chore](function) remove bit_cast/bit_helper (#25700)
remove bit_cast/bit_helper
2023-10-23 11:54:14 +08:00
b964ab76b3 [refactor](shuffle) Simplify hash partitioning strategy (#25596) 2023-10-19 19:28:22 +08:00
31a5e072e7 [refactor](pipelineX) Simplify set operation (#25502) 2023-10-17 15:11:46 +08:00