1. upgrade log4j to 2.12.1
2. Add 2 new FE config:
'sys_log_delete_age' and default is '7d', for sys log.
'audit_log_delete_age' and default is '30d', for audit log.
it means if a log's last modification time is 7/30 days ago, it will be deleted.
Some use has the requirment that only some of columns will be update in
one load operation, and others will retain as original. However, Doris
can't handle this situation, because user must specify value for all
columns. Then if a column aggregation method is REPLACE, use must query
original value to overwrite it. This often needs some work for user to
do.
If this CL is applied, user can use REPLACE_IF_NOT_NULL instead of
REPLACE. Then when load data to table, if user don't intent to change
value of this column, user can specify NULL for this column. Doris will
retain original value for this column.
At present, we do not support SQL MODE which is similar to MySQL. In MySQL, SQL MODE is stored in global session and session with a 64 bit address,and every bit 0 or 1 on this address represents a mode state. Besides, MySQL supports combine mode which is composed of several modes.
We should support SQL MODE to deal with sql dialect problems. We can heuristically use the MySQL way to store SQL MODE in session and parse it into string when we need to return it back to client.
This commit suggests a solution to support SQL MODE. But it's just a sample, and the mode types in SqlModeHelper.java are not really meaningful from now on.
Mainly fix the following issues:
1. A null pointer exception is raised when a database or table is dropped. The expected behavior is that the routine load job is stopped.
2. Memory leaks. Batch routine load task submissions are no longer performed, and modifications are submitted separately for each task.
3. Unreasonable task timeout.
Routine load tasks should not be queued in the BE thread pool for execution. The task sent to the BE should be executed immediately, otherwise the task in the FE will be timeout first. Eventually leads to constant timeout for all subsequent tasks.
4. All routine load job should be scheduled once it being submitted. Not waiting the available BE slot. Otherwise, all later submitted jobs may not be scheduled forever.
Add a new type: Object. Currently, it's mainly for complex aggregate metrics(HLL , Bitmap).
The Object type has the following constraints:
1 Object type could not as key column type
2 Object type doesn't support all indices (BloomFilter, short key, zone map, invert index)
3 Object type doesn't support filter and group by
In the implementation:
The Object type reuse the StringValue and StringVal, because in storage engine, the Object type is binary, it has a pointer and length.
ISSUE-2069: This kind of query could be stuck.
The sender failed to send the last packet to receiver.
Also, the failure does not be reportted to FE , so the query is not cancelled.
The error log sames as "body_size=xxxx from xxx:xxx is too large".
The reason of the socket is that the packet of the query is too big which is more then the max_body_size of brpc.
This commit add a config named brpc_max_body_size whcih is used to change the max_body_size of brpc.
Also, user can change the max_body_size directly on-the-fly by "http://host:brpc_port/flags".
The prepare/close step of scalar function is already supported in execution framework, We only need to do is that support it in syntax and meta in frontend.
In addition, 'Hive' binary type of scalar function NOT supports prepare/close step, we need to make it supports.
Leverage gitattributes to enable auto convert end-of-line to LF when
checking in. Convert already exist CRLF to LF by removing all files and
checking out with new .gitattributes file. Except .gitattributes, all
files are only modified at the end of line.
Random distribution is no longer supported since version 0.9.
And we need a way to convert the random distribution to hash distribution.
ALTER TABLE db.tbl SET ("distribution_type" = "hash");
1. Support specifying label to Insert Into stmt.
INSERT INTO tbl1 WITH LABEL label1 ...;
2. Return job' state corresponding to the existing label in result of stream load.
...
"Status": "Label Already Exists",
"ExistingJobStatus": "FINISHED"
...
3. Return the recent 2000 transactions in SHOW PROC '/transactions'
This config is used to control the max concurrent task num per be.
The cluster max concurrent task num = max_concurrent_task_num_per_be * number of be.