Change function percentile_approx return value from nan to null (like hive.) to ensure that return value of function percentile_approxcan be parsed by JDBC successfully.
Co-authored-by: weizuo <weizuo@xiaomi.com>
1. Change `brpc_socket_max_unwritten_bytes` to 1GB
This can make the system more fault-tolerant.
Especially in the case of high system load, try to reduce EOVERCROWDED errors.
2. Change `brpc_max_body_size` to 3GB
To handle some large object such as bitmap or string.
For the first, we need to make a parameter to discribe the data is local or remote.
At then, we need to support some basic function to support the operation for remote storage.
Reverts apache/incubator-doris#7351
This commit will cause wrong result with agg table.
For example, an agg table `(k1, k2, v1 sum)` with single non-overlapping rowset
`select count(k1) from tbl1;` should using `_direct_agg_key_next_row` instead of `_agg_key_next_row`.
Otherwise it return less rows than expected.(because `_agg_key_next_row` will only do aggregation with `k1`)
1. Refactor the scheduling logic of broker load. Details see #7367
2. Fix bug that loadedBytes in SHOW LOAD result is wrong.
3. Cancel the thread of LoadTimeoutChecker
Now for PENDING load jobs, there will be no timeout. And the timeout of a load job
start when pending load task is scheduled.
4. Fix a bug that the loading task is never submitted to the pool.
The logic of BlockedPolicy is wrong. We should make sure the task is submitted to the pool,
or the RejectedExecutionException should be thrown.
5. Now the transaction of a load job will begin in pending task, instead of when submitting the job.
1. Delete useless variables
2. Add const modifier for read-only function
3. Delete the empty destructor, the compiler will automatically generate it, refer to the 3/5/0 rule:
[https://en.cppreference.com/w/cpp/language/rule_of_three]
4. It is recommended to add the override keyword (instead of the virtual keyword) to the subclass virtual function.
Override will let the compiler help check and improve security. This is also the reason why C++11 introduces override
If the memory exceeds the limit when be generates a materialized view or schema change,
a more detailed log about limit and configuration will be prompted..
1. Fix some memory leaks
2. Remove redundant and invalid code
3. Fix some buggy writes to reduce extra memory copies and return null pointers to string
4. Reframing the naming to make the structure clearer
code refactor: improve code's readability, avoid const_cast
1. make loop simpler and clearer by using range-based loop grammar, it's safer than old loop style
2. iteration for _row_desc.tuple_descriptors() use index replace index and iterator mixed
3. add new function To cast_to(From from), use this union-based casting between two types to replace reinterpret_cast, this new cast is more readable
4. avoid using the same variable name for nested loop, it's dangerous
5. add const keyword for member functions followed CppCoreGuidelines
the tuple String Slot's ptr and len are not assigned appropriately on send side, the receive side may crash in some situation.
detail description:
on send side, when we call RowBatch::serialize(PRowBatch* output_batch) to pack RowBatch, the Tuple::deep_copy()
will be called, for each String Slot, only String Slots that is not null will set ptr and len with proper value, the null String
Slots will keep original status, the ptr member will point randomly and the len member may unexpect.
on recv side, unpack is processed by RowBatch::RowBatch(const RowDescriptor&, const PRowBatch&...), in this
function, each String Slot will transfer offset to valid string_val->ptr whether the String Slot is null or not.
but some business logic depends on string_val->len=0, such as AggregateFuncTraits::init(), HyperLogLog::deserialize()
will return correctly if slice.size<=0. so if string_val->len is set to 0 in send side, everything will be ok, otherwise server
may crash.
by netcomm viewpoint, we should make sure transfer correct data, it's sender's responsibility to set data with proper
value, and do not make any presume which way the recv side will use it.
Transfer RowBatch in Protobuf Request to Controller Attachment,
when the maximum length of the RowBatch in the Protobuf Request is exceeded.
This can avoid reaching the upper limit of the Protobuf Request length (2G),
and it is expected that performance can be improved.
This is beacuse of an const MAX_PHYSICAL_PACKET_LENGTH in fe should be 2^24 -1,
but it is set as 2^24 -2 by mistake.
2. Fix bitmap_to_string may failed when the result is large than 2G
The broker scan node has two tuple descriptors:
One is dest tuple and the other is src tuple.
The src tuple is used to read the lines of the original file,
and the dest tuple is used to save the converted lines.
The preceding filter is executed on the src tuple, so src tuple descriptor should be used
to initialize the filter expression