For a comparison predicate, if either argument is of date type, both arguments must be cast to datetime before the predicate is pushed down to storage. This PR disables predicate push-down for this case.
Use the unified JNI framework to refactor Java UDF.
The unified JNI framework uses VectorTable as the container to transfer data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
Performance for basic types remains unchanged, string types improve by 30%, and complex types improve by an order of magnitude.
`_do_evaluate` adds a temporary result column to the original table block, so to convert only the correct columns to nullable, `convert_block_to_null` must be called before `_do_evaluate`.
This PR:
1. Fixes a heap-use-after-free: calling PODArray `push_back()` with `back()` as the argument when the PODArray has reached capacity can trigger a reallocation that frees the heap memory `back()` refers to.
2. Adds CSV format cases for nested types. The CSV files come in two forms: without quotes, or written like JSON text.
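The use-after-free in item 1 can be illustrated with a minimal sketch. `ToyPodArray` and `duplicate_last` are hypothetical names made up for this example, not the Doris PODArray API; the sketch only assumes a growable array whose `push_back` takes its argument by reference and reallocates when full.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical toy container for illustration only (not Doris's PODArray).
struct ToyPodArray {
    int* data = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;

    ToyPodArray() = default;
    ToyPodArray(const ToyPodArray&) = delete;
    ToyPodArray& operator=(const ToyPodArray&) = delete;
    ~ToyPodArray() { std::free(data); }

    int& back() {
        assert(size > 0);
        return data[size - 1];
    }

    // The argument is a reference. If it refers to an element of `data`,
    // the realloc below may free the memory it points to before it is read.
    void push_back(const int& x) {
        if (size == capacity) {
            std::size_t new_cap = capacity == 0 ? 4 : capacity * 2;
            int* p = static_cast<int*>(std::realloc(data, new_cap * sizeof(int)));
            if (p == nullptr) std::abort();
            data = p;
            capacity = new_cap;
        }
        data[size++] = x;
    }
};

// Unsafe when size == capacity: arr.push_back(arr.back());
// Safe pattern: copy the last element to a local before growing the array.
inline void duplicate_last(ToyPodArray& arr) {
    int last = arr.back();  // copy taken while the buffer is still valid
    arr.push_back(last);    // reallocation can no longer invalidate the argument
}
```

Copying to a local costs one extra move of a trivially copyable value, which is the usual trade-off for containers that do not guard against self-referencing arguments.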
When executing broker load in ASAN mode, BE may crash with error:
```
F20231010 18:18:17.044978 185490 block.cpp:694] Check failed: d.column->use_count() == 1 (3 vs. 1)
*** Check failure stack trace: ***
@ 0x55e9d94c4e46 google::LogMessage::SendToLog()
@ 0x55e9d94c1410 google::LogMessage::Flush()
@ 0x55e9d94c5689 google::LogMessageFatal::~LogMessageFatal()
@ 0x55e9c509f80d doris::vectorized::Block::clear_column_data()
@ 0x55e9b6c170b3 doris::PlanFragmentExecutor::get_vectorized_internal()
@ 0x55e9b6c147e6 doris::PlanFragmentExecutor::open_vectorized_internal()
@ 0x55e9b6c12d9a doris::PlanFragmentExecutor::open()
@ 0x55e9b6c18426 doris::PlanFragmentExecutor::execute()
@ 0x55e9b6945cca doris::FragmentMgr::_exec_actual()
@ 0x55e9b696456c doris::FragmentMgr::exec_plan_fragment()::$_0::operator()()
```
It may happen when there is a column mapping like:
```
(k1,v2,v3,v4,v5,v6,v7,v8)
set (k2=v4,k3=v4,k4=v4)
```
in the load statement.
This case is covered by the Baidu test cases.
Fix workload group GC, add cancel load, and add logs.
Unify the format of GC logs and change them all to lowercase, to avoid unnecessary trouble when using grep or less.
Add logs to help locate the cause of slow GC.
Optimize the performance of reading decimal in parquet reader.
- Static dispatch `DecimalScaleParams`.
- Optimize `memcpy`, static dispatch copy size in fixed length cases.
- Use the right-shift operator to convert decimals.
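The fixed-length copy and right-shift ideas above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `decode_be_fixed` and `decode_be` are hypothetical names, and the sketch only covers decimals that fit in 8 bytes. It relies on Parquet storing fixed-length decimals as big-endian two's complement, and on signed right shift being arithmetic (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decode an N-byte big-endian two's-complement value.
// N is a template parameter, so the loop bound and shift are compile-time constants.
template <int N>
inline int64_t decode_be_fixed(const uint8_t* src) {
    static_assert(N >= 1 && N <= 8, "sketch only handles up to 8 bytes");
    uint64_t v = 0;
    for (int i = 0; i < N; ++i) {
        v = (v << 8) | src[i];
    }
    // Move the value to the top of the word, then shift back down.
    // The arithmetic right shift replicates the sign bit (sign extension).
    constexpr int kShift = (8 - N) * 8;
    v <<= kShift;
    return static_cast<int64_t>(v) >> kShift;
}

// Hypothetical dispatcher: choose the specialization once per column,
// so the per-value path has no runtime length branch.
inline int64_t decode_be(const uint8_t* src, int len) {
    assert(len >= 1 && len <= 8);
    switch (len) {
    case 1: return decode_be_fixed<1>(src);
    case 2: return decode_be_fixed<2>(src);
    case 3: return decode_be_fixed<3>(src);
    case 4: return decode_be_fixed<4>(src);
    case 5: return decode_be_fixed<5>(src);
    case 6: return decode_be_fixed<6>(src);
    case 7: return decode_be_fixed<7>(src);
    default: return decode_be_fixed<8>(src);
    }
}
```

In a real reader the dispatch would presumably happen once per page or column rather than per value; the per-value `switch` here is kept only to make the sketch short.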
Tables can now be created with automatic partition creation, for example:
AUTO PARTITION BY RANGE date_trunc(event_day, 'day')
so each value of event_day is inserted into the partition for date_trunc(event_day, 'day').
For a query such as: select * from partition_range where date_trunc(event_day,"day")= "2023-08-07 11:00:00";
we can prune some partitions by evaluating date_trunc("2023-08-07 11:00:00","day");
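The pruning idea can be sketched in a few lines. This is a simplified model, not the planner code: `trunc_to_day`, `DayPartition`, and `prune_partitions` are hypothetical names, timestamps are assumed to be non-negative UTC epoch seconds, and each partition is assumed to cover a half-open range of truncated values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int64_t kSecondsPerDay = 86400;

// Hypothetical stand-in for date_trunc(ts, 'day') on UTC epoch seconds.
inline int64_t trunc_to_day(int64_t epoch_seconds) {
    assert(epoch_seconds >= 0);
    return epoch_seconds - epoch_seconds % kSecondsPerDay;
}

// Hypothetical partition descriptor: covers truncated values in [lower, upper).
struct DayPartition {
    int64_t lower;
    int64_t upper;
};

// Apply the partition function to the literal, then keep only the
// partitions whose range contains the truncated value.
inline std::vector<std::size_t> prune_partitions(
        const std::vector<DayPartition>& parts, int64_t literal) {
    const int64_t key = trunc_to_day(literal);
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].lower <= key && key < parts[i].upper) {
            kept.push_back(i);
        }
    }
    return kept;
}
```

With three consecutive daily partitions and a literal at 11:00 on the middle day, only the middle partition survives; the other two are never scanned.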