Commit Graph

27 Commits

Author SHA1 Message Date
b38caed808 [Improve](columns)replace fatal with exception #38035 (#38996) 2024-08-12 09:51:30 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This is imported by #33511. wrongly used

ColumnStr<T> ();

which violate C++20 standard(see https://wg21.cmeerw.net/cwg/issue2237) but still supported by clang up until now(see llvm/llvm-project#58112)
2024-04-19 23:41:37 +08:00
1300317723 [Exec](join) Support column string64 to avoid join failed in string size overflow the uint32 (#33511) (#33850) 2024-04-18 19:43:08 +08:00
Pxl
8fd6d4c41b [Chore](build) add -Wconversion and remove some unused code (#33127)
add -Wconversion and remove some unused code
2024-04-10 15:26:08 +08:00
ebbfb06162 [Bug](array) fix array column core dump in get_shrinked_column as not check type (#33295)
* [Bug](array) fix array column core dump in get_shrinked_column as not check type

* add function could_shrinked_column
2024-04-08 07:27:40 +08:00
f2a38e6345 [chore](columns) remove update_hashes_with_value for SipHash (#31224) 2024-02-22 13:01:48 +08:00
Pxl
bb4575a392 [Improvement](join) optimization for build_side_output_column (#30826)
optimization for build_side_output_column
2024-02-19 17:22:03 +08:00
cbcb81f381 [FIX](complextype)fix compare_at base function support nested types (#29297) 2024-01-06 12:05:43 +08:00
Pxl
e3d2425d47 [Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779)
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
8a436d8ecc [FIX](collectiontype) fix shrink char column in map/struct (#25725)
fix shrink char column in map/struct
before we has char with specific length defined in map or struct field
we select map or struct , the char column in which one has been padding if we just insert less than specific length chars
but in mysql here just show inserted chars not padding specific length chars
---------

Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-10-25 17:16:13 +08:00
b964ab76b3 [refactor](shuffle) Simplify hash partitioning strategy (#25596) 2023-10-19 19:28:22 +08:00
Pxl
f4e2eb6564 remove unused code and adjust clang-tidy checks (#25405)
remove unused code and adjust clang-tidy checks
2023-10-13 16:27:37 +08:00
53b46b7e6c [FIX](filter) update for filter_by_select logic (#25007)
this pr is aim to update for filter_by_select logic and change delete limit

only support scala type in delete statement where condition
only support column nullable and predict column support filter_by_select logic, because we can not push down non-scala type to storage layer to pack in predict column but do filter logic
2023-10-09 21:27:40 +08:00
4de3df6a46 [refactor](column) remove unused method and column definitions (#25152)
remove unused method and column definitions
using primitive type in predicate column to check datev1 and datev2
2023-10-09 17:14:35 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
b35cfc5d5e [opt](join) Opt the performance of join probe (#21845) 2023-07-19 01:21:22 +08:00
d0eb4d7da3 [Improve](hash-fun)improve nested hash with range #21699
Issue Number: close #xxx

when cal array hash, elem size is not need to seed hash
hash = HashUtil::zlib_crc_hash(reinterpret_cast<const char*>(&elem_size),
                                                   sizeof(elem_size), hash);
but we need to be care [[], [1]] vs [[1], []], when array nested array , and nested array is empty, we should make hash seed to
make difference
2. use range for one hash value to avoid virtual function call in loop.
which double the performance. I make it in ut

column: array[int64]
50 rows , and single array has 10w elements
2023-07-11 14:40:40 +08:00
b7d6a70868 [FIX](datatype) Implement hash func with array/map/struct type (#21334)
we do not Implement any hash functions in array/map/struct column , so we use sql like this will make be core

select * from (
        select
            bdp.nc_num,
            collect_list(distinct(bd.catalog_name)) as catalog_name,
            material_qty
        from
            dataease.bu_delivery_product bdp
            left join dataease.bu_trans_transfer btt on bdp.delivery_product_id = btt.delivery_product_id
            left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id
        where
            bd.val_status in ('10', '20', '30', '90')
            and bd.delivery_type in (0, 1, 2)
        group by
            nc_num,
            material_qty
        union
        ALL
        select
            bdp.nc_num,
            collect_list(distinct(bd.catalog_name)) as catalog_name,
            material_qty
        from
            dataease.bu_trans_transfer btt
            left join dataease.bu_delivery_product bdp on bdp.delivery_product_id = btt.delivery_product_id
            left join dataease.bu_delivery bd on bdp.delivery_id = bd.delivery_id
        where
            bd.val_status in ('10', '20', '30', '90')
            and bd.delivery_type in (0, 1, 2)
        group by
            nc_num,
            material_qty
) aa;
core :
2023-06-30 17:11:35 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
0b379de602 [refactor](scan) optimize the agg function of count(1) (#18739) 2023-04-19 09:10:51 +08:00
f38e00b4c0 [refactor](typesystem) using typeindex to create column instead of type name because type name is not stable (#18328)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-09 18:08:31 +08:00
66a0c090b8 [fix](column) Add unimplemented replicate function in ColumnStruct (#18368) 2023-04-06 09:50:27 +08:00
48ef61780d [refactor](struct-type) refactor and clean unused code for struct type (#17257)
remove unused code for struct type
2023-03-01 15:49:31 +08:00
91fc9fae8e [Bug](complex-type) Fix is null predicate in delete stmt for array/struct/map type (#17018) 2023-02-23 15:06:49 +08:00
08adf914f9 [improvement](vec) avoid creating a new column while filtering mutable columns (#16850)
Currently, when filtering a column, a new column will be created to store the filtering result, which will cause some performance loss。 ssb-flat without pushdown expr from 19s to 15s.
2023-02-21 09:47:21 +08:00
1b3902baa2 [Feature](Complex-type) Add struct and map type to Doris (#16444)
This commit support:
1、Insert + select for struct/map type
2、Json stream load for struct type
3、m[key] function for map type

How to use:
Set the fe config to create table for struct and map type
1、admin set frontend config("enable_struct_type" = "true");
2、admin set frontend config("enable_map_type" = "true");

#16547

Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2023-02-10 11:00:33 +08:00