Commit Graph

3234 Commits

Author SHA1 Message Date
e3e5f18f26 [Fix](Json type) correct cast result for json type (#34764) 2024-05-18 18:40:17 +08:00
9b5028785d [fix](prepare) fix datetimev2 return err when binary_row_format (#34662)
Fix datetimev2 returning an error when binary_row_format is used. Before this PR, the backend always returned datetimev2 via to_string.
Fix datetimev2 return metadata losing its scale.
2024-05-18 18:37:41 +08:00
eb7eaee386 [fix](function) money format (#34680) 2024-05-18 18:35:29 +08:00
dff6171546 [fix](auto inc) db_id and table_id should be int64_t instead of int32_t (#34912) 2024-05-18 18:29:59 +08:00
4b96f9834f [fix](move-memtable) change brpc connection type to single (#34883) 2024-05-18 18:29:20 +08:00
849eeb39e9 [fix](load) skip sending cancel rpc if VNodeChannel is not inited (#34897) 2024-05-18 18:29:10 +08:00
6b1c441258 [fix](group_commit) Wal reader should check block length to avoid reading empty block (#34792) 2024-05-18 18:17:56 +08:00
6c515e0c76 [fix](group commit) Make compatibility issues on serializing and deserializing wal file more clear (#34793) 2024-05-18 18:12:43 +08:00
80dd027ce2 [opt](join) For left semi/anti join without mark join conjunct and without other conjuncts, stop probing after matching one row (#34703) 2024-05-18 18:08:50 +08:00
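The early exit described in this commit title can be illustrated with a minimal sketch. It is not Doris's implementation: the names `build_hash_table` and `probe`, the dict-based rows, and the `visited` counter are all hypothetical and exist only for illustration. The point is that for a left semi or left anti join with no additional conjuncts, the first matching build row already decides the outcome for a probe row, so the remaining candidates need not be visited.

```python
from collections import defaultdict


def build_hash_table(build_rows, key):
    """Group build-side rows by join key (hypothetical helper)."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    return table


def probe(probe_rows, table, key, anti=False):
    """Left semi join (anti=False) or left anti join (anti=True).

    Assumes there are no other conjuncts, so any build row with an
    equal key is a match. Returns (output_rows, visited), where
    visited counts the build rows actually inspected.
    """
    out = []
    visited = 0
    for row in probe_rows:
        matched = False
        for _candidate in table.get(row[key], ()):
            visited += 1
            matched = True
            break  # the first match decides the result; stop probing
        # semi join keeps matched rows, anti join keeps unmatched rows
        if matched != anti:
            out.append(row)
    return out, visited


build = [{"k": 1}, {"k": 1}, {"k": 1}, {"k": 2}]
probes = [{"k": 1, "v": "a"}, {"k": 3, "v": "b"}]
table = build_hash_table(build, "k")
semi_rows, semi_visited = probe(probes, table, "k")
anti_rows, anti_visited = probe(probes, table, "k", anti=True)
```

Without the `break`, the probe row with key 1 would walk all three duplicate build rows; with it, only one is inspected.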
876248aa4e [fix](function) json_object can not input null value (#34591) 2024-05-18 18:00:48 +08:00
691f3c5ee7 [Performance](Variant) Improve load performance for variant type (#33890)
1. remove phmap for padding rows
2. add SimpleFieldVisitorToScarlarType for short circuit type deducing
3. correct type coercion for conflict types between integers
4. improve nullable column performance
5. remove shared_ptr dependency for DataType; use TypeIndex instead
6. Optimization by caching the order of fields (which is almost always the same)
and a quick check to match the next expected field, instead of searching the hash table.

benchmark:
In clickbench data, load performance:
12m36.799s -> 7m10.934s, about a 43% latency reduction

In variant_p2/performance.groovy:
3min44s20 -> 1min15s80, about a 66% latency reduction
2024-05-18 17:58:33 +08:00
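Item 6 of the commit above (caching the field order and checking the next expected field before falling back to the hash table) can be sketched roughly as follows. This is an illustrative assumption, not the code from the PR: the class `OrderedFieldLookup`, its methods, and the hit counters are hypothetical names introduced here.

```python
class OrderedFieldLookup:
    """Map field names to slots, guessing that each row lists its
    fields in the same order as earlier rows (hypothetical sketch)."""

    def __init__(self):
        self._index = {}   # field name -> slot (fallback hash table)
        self._order = []   # field names in first-seen order; slot == position
        self._next = 0     # slot expected for the next field in this row
        self.fast_hits = 0
        self.slow_lookups = 0

    def start_row(self):
        """Reset the expectation at the beginning of each row."""
        self._next = 0

    def slot_for(self, name):
        # Quick check: is this the field we expected to see next?
        if self._next < len(self._order) and self._order[self._next] == name:
            slot = self._next
            self.fast_hits += 1
        else:
            # Fall back to the hash table, registering unseen fields.
            self.slow_lookups += 1
            slot = self._index.get(name)
            if slot is None:
                slot = len(self._order)
                self._index[name] = slot
                self._order.append(name)
        self._next = slot + 1
        return slot


lookup = OrderedFieldLookup()
rows = [["a", "b", "c"], ["a", "b", "c"], ["c", "a"]]
slots = []
for fields in rows:
    lookup.start_row()
    slots.append([lookup.slot_for(f) for f in fields])
```

In this sketch the first row pays for three hash-table lookups, the second row with the same field order is resolved entirely by the quick check, and the out-of-order third row falls back to the hash table.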
e74b17c761 [Fix](Row store) support decimal256 type (#34887) 2024-05-15 19:01:18 +08:00
2cbe6740a5 [fix](reader) avoid be coredump in block reader in abnormal situation (#34878) 2024-05-15 12:38:40 +08:00
1f0c45204b [fix](iceberg) read the primary key columns if having equality delete (#34884)
backport: #34835
2024-05-15 11:37:25 +08:00
02084fd91f [fix](iceberg_orc)Fixed the bug that the iceberg reader did not perform position delete when reading the orc file without a predicate. (#34814) (#34882)
bp #34814
2024-05-15 11:31:29 +08:00
e13ce905cf [Fix](hive-writer) Fix hive partition update file size and remove redundant column names. (#34651) (#34885)
Backport #34651.
2024-05-15 11:23:32 +08:00
c7134faea9 [Fix](outfile) Fix the timing of setting the _is_closed flag in Parquet/ORC writer (#34668) 2024-05-15 10:28:22 +08:00
d5ab2787ba [Fix](function) fix pad functions behaviour of empty pad string (#34796)
fix pad functions behaviour of empty pad string
2024-05-15 10:28:09 +08:00
0b4d814598 [fix](decimal) Fix wrong result produced by decimal128 multiply (#34825)
* [fix](decimal) Fix wrong result produced by decimal128 multiply

* update
2024-05-14 23:34:11 +08:00
4dd5379951 [bugfix](hive)fix error for writing to hive for 2.1 (#34518)
mirror #34520
2024-05-14 23:27:29 +08:00
9491b7d422 [fix](iceberg) prevent coredump if read position delete file failed (#34802) 2024-05-14 14:03:33 +08:00
0ae1b9c70a [chore](remove code) Remove dragonbox related (#34528)
* Revert "[refactor](mysql result format) use new serde framework to tuple convert (#25006)"

This reverts commit e5ef0aa6d439c3f9b1f1fe5bc89c9ea6a71d4019.

* run buildall

* MORE

* FIX
2024-05-13 22:16:57 +08:00
ca9eb56233 [Fix](functions) fix strcmp return value #34565 2024-05-12 09:49:38 +08:00
e23a89f0da fix compile error 2024-05-11 15:36:06 +08:00
719e50f353 [fix](json function) fix failed when json_exists_path use not null input (#34289) 2024-05-11 15:04:35 +08:00
Pxl
1ff4dc8f85 [Bug](runtime-filter) fix coredump on change_null_to_true when argument column is not null… (#34602)
Fix coredump on change_null_to_true when the argument column is not nullable.
2024-05-11 15:04:35 +08:00
8c237e82a3 [Bug](exec) fix intersections/differences bug (#34675) 2024-05-11 11:45:31 +08:00
58c19e33b3 [fix](round) Fix incorrect decimal scale inference in round functions (#34471)
* FIX NEEDED

* FORMAT

* FORMAT

* FIX TEST
2024-05-11 11:42:12 +08:00
0a79c547ff [Refactor](Sink) Remove is_append mode in table sink (#34684)
Remove the is_append mode from the sink component due to the following reasons:
1. The performance improvement from this mode is relatively minor, approximately 10%, as demonstrated in previous benchmarks.
2. The mode complicates maintenance. It requires a separate data writing path to avoid copying, which increases complexity and poses a risk of potential data loss.

I have already tested compatibility with the previous version.
2024-05-11 11:20:10 +08:00
Pxl
e2ea54c0a7 [Improvement](sink) remove unused check on string's write_column_to_mysql (#34491)
remove unused check on string's write_column_to_mysql
2024-05-10 22:13:05 +08:00
aa684d85d7 [Bug](Variant) fix rapidjson::Allocator may cause mem allocate issue when build with DENABLE_CLANG_COVERAGE (#34150) 2024-05-10 22:12:00 +08:00
853dbdcb00 [Feature](PreparedStatement) implement general server side prepared (#33807) 2024-05-10 22:10:11 +08:00
082216496e [opt](inverted index) opt for log output when matching without an index (#34024)
Matching without an index generates a large volume of log output when scanning large amounts of data.
2024-05-10 14:45:05 +08:00
e2fc231b7b [refactor](move-memtable) simplify LoadStreamStub::open (#34488) 2024-05-10 14:43:31 +08:00
9b712b03b4 [FIX]fix is_ip_address_in_range func with const param (#34266) 2024-05-10 14:37:20 +08:00
520774a24b [fix](serde) fix ipv4/v6 serde functions for arrow, orc, parquet format (#34042)
This PR is based on @sjyango's work in #32326.
The intent was to merge #32326 into the master branch, but it is a draft and has not been maintained for a long time, hence this new PR.
Co-authored-by: sjyango <sjyang2022@zju.edu.cn>
2024-05-10 14:37:04 +08:00
Pxl
804586b342 [Improvement](sort) insert data by batch on VSortedRunMerger::get_next (#34363)
insert data by batch on VSortedRunMerger::get_next
2024-05-10 14:36:53 +08:00
cc00666be6 [opt](inverted index) add inlist condition handling to compound (#34134)
1. Previously, the compound did not support the inlist condition, which could impact performance if an inverted index was created.
2024-05-10 14:35:47 +08:00
cbe8e5c010 [opt](join) For a mark join without other conjuncts, stop probing after matching one row (#34581) 2024-05-10 13:45:34 +08:00
e085f75a43 [opt](file-scanner) print current path when encountering error (#34365) (#34523)
bp #34365
2024-05-08 14:49:03 +08:00
1acd8e9fcb [fix](spill) incorrect result of hash join (#34450) 2024-05-08 10:06:32 +08:00
ac56255f82 [opt](inverted index) the "unicode" tokenizer can be configured to disable stop words. (#34467) 2024-05-07 18:23:43 +08:00
4be589951b Revert "Revert "[fix](csv-reader) fix column split error when there is escape character (#34364)""
This reverts commit d127d67ebe989484bbdf340a4de5b79ded56eecc.
2024-05-07 18:03:56 +08:00
561c6a752d [Bug](RegressionTest) fix regression test failure (#34466) 2024-05-07 16:53:05 +08:00
d127d67ebe Revert "[fix](csv-reader) fix column split error when there is escape character (#34364)"
This reverts commit 971e10a9db782c9986b20e1209468e4d7aeedf71.
2024-05-07 13:36:11 +08:00
9d0d7293f0 [fix](json) fix be crash while load json data (#34283) 2024-05-07 07:42:53 +08:00
971e10a9db [fix](csv-reader) fix column split error when there is escape character (#34364) 2024-05-07 07:38:35 +08:00
8fdfbcb3c4 Revert "[Opt](func) opt the percentile func performance (#34373) (#34416)"
This reverts commit 509ae425e416b4779ae94eab9c2b21f9850e03c3.
2024-05-07 07:23:48 +08:00
f7900b53ce [enhancement](function) floor/ceil/round/round_bankers can use column as scale argument (#34391) 2024-05-06 22:18:36 +08:00
509ae425e4 [Opt](func) opt the percentile func performance (#34373) (#34416) 2024-05-06 20:10:35 +08:00