This is only a temporary fix its performance is not ideal. Finally,
we need to reconstruct the functions of `stddev` and delete the interface of `insert_to_null_default ()`.
Add a new column-type to speed up the approximation of quantiles.
1. The new column-type is named `quantile_state` with fixed aggregation function `quantile_union`, which stores the intermediate results of pre-aggregated approximation calculations for quantiles.
2. support pre-aggregation of new column-type and quantile_state related functions.
This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR.
To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead.
For users, a UDF class must have a public evaluate method.
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G
Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
First step to support Java UDF in Doris. After this PR, we can create Java UDF in doris.
For example, we create Java UDF function by code below.
```
CREATE FUNCTION test_udf(int) RETURNS int
PROPERTIES (
"file"="file:///root/hive-udf-1.0-SNAPSHOT.jar",
"symbol"="udf.Main",
"type"="JAVA_UDF"
)
```
1. `file` indicate where user file is.
2. `symbol` for java udf means udf class in this jar.
3. `type` indicate this function is a java udf.
For now, usage of RowBlockAllocator::allocate is a little complicated
due to its ambiguous return value. Some callers just test the return value
while some test the return value and non-null pointer. This patch let
it return success code only when it succeeds, then caller can just
test the return value.
1.
This bug was introduced by #8209.
Error in fe.warn.log:
```
java.lang.IllegalStateException: 560278
at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[spark-dpp-0.15-SNAPSHOT.jar:0.15-SNAPSHOT]
at org.apache.doris.catalog.TabletInvertedIndex.getReplica(TabletInvertedIndex.java:462) ~[palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.catalog.Catalog.replayBackendReplicasInfo(Catalog.java:6941) ~[palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:626) [palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2446) [palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.master.Checkpoint.doCheckpoint(Checkpoint.java:116) [palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:74) [palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:0.15-SNAPSHOT]
at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:0.15-SNAPSHOT]
```
Since the reporting of a tablet and the deletion of a tablet are two independent events
and are not mutually exclusive, it may happen that the tablet is deleted first and the reporting is done later.
2.
Change the tablet report info. Now, the version of a tablet report from BE is the largest continuous version.
Eg, versions: [1,2,3,5,7], the report version of this tablet will be 3.
1. Avoid print large string in error log
If user load a unqualified large string, the all string will be saved in error log,
so the error log is too big that can not be shown be using `show load warnings on "url"`.
Err: `Got packet bigger than 'max_allowed_packet' bytes`
2. Remove duplicate help doc
Do not allow doc with same title, or error thrown when starting FE:
`java.lang.IllegalArgumentException: Multiple entries with same key:`
```
SET PROPERTY FOR 'jack' 'exec_mem_limit' = '2147483648';
SET PROPERTY FOR 'jack' 'load_mem_limit' = '2147483648';
```
The user level property will overwrite the value in session variables.
In the original tablet reporting information, the version missing information is done by combining
two pieces of information as follows:
1. the maximum consecutive version number
2. the `version_miss` field
The logic of this approach is confusing and inconsistent with the logic of checking for missing versions when querying.
After the change, we directly use the version checking logic used in the query, and set `version_miss` to true
if a missing version is found
and on the FE processing side. Originally, only the **bad replica** information was syncronized among FEs,
but not the **version missing** information. As a result, the non-master FE is not aware of the missing version information.
In the new design, we deprecate the original log persistence class `BackendTabletsInfo` and use the new
`BackendReplicasInfo` to record replica reporting information and write both **bad** and **version missing**
information to metadata so that other FEs can synchronize these information.
1.
Before executing LoadLoadingTask of a broker load, we should check if the job is cancelled.
2.
Add a new column `runningTransactionNum` for `show proc "/transactions"`.
So that we can view all running txns in each db in one command.
1. support SHOW LAST INSERT
In the current implementation, the insert operation returns a json string to describe the result information
of the insert. But this information is in the session track field of the mysql protocol,
and it is difficult to obtain programmatically.
Therefore, I provide a new syntax `show last insert` to explicitly obtain the result of the latest insert operation,
and return a normal query result set to facilitate the user to obtain the result information of the insert.
2. the `ReturnRows` field in fe.audit.log of insert operation will be set to the loaded row num of the insert.
3. Fix a bug described in #8354
SQL to reproduce:
```
SELECT * FROM table WHERE where FROM_UNIXTIME(d_datekey,'%Y-%m-%d %H:%i:%s') != '1970-08-20 00:11:43';
org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Unexpected exception: Illegal pattern character 'i'
at org.apache.doris.qe.StmtExecutor.analyze(StmtExecutor.java:584) ~[palo-fe.jar:3.4.0]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:345) ~[palo-fe.jar:3.4.0]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:318) ~[palo-fe.jar:3.4.0]
at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:221) ~[palo-fe.jar:3.4.0]
at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:361) ~[palo-fe.jar:3.4.0]
at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:562) ~[palo-fe.jar:3.4.0]
at org.apache.doris.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:50) ~[palo-fe.jar:3.4.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
```
Describe the overview of changes.
Just support:
yyyy-MM-dd HH:mm:ss
yyyy-MM-dd
yyyyMMdd
1. mysql-connector-java
mysql-connector-java is under GLPv2 license, which is not compatible with APLv2, and Doris does not use it.
2. org.json
org.json is under JSON license, which is not compatible with APLv2. I use `json-simple` to replace it.
1. optimize compaction too slow replica process, will set to ok if the replica compaction is done.
And will not set bad if more than half replica is too slow,
2. fix field in grouping_id() is not the same name will cause error lile `select list expression not produced by
aggregation output (missing from GROUP BY clause?): id`
* [refactor] remove types_test
1. remove types_test, it will cause core dump in higher version GCC or
clang, because of memory align, some code will be vectorized in higher
GCC or clang
2. Change string type length to 2 GB instead of -1
3. modify inaccessible code