Since we use division calculation, when the start time is not specified,
it may have a wrong deviation from our expected time.
For example, if it is the 7th minute now, the cycle is executed every two minutes.
Then it is calculated that the first execution is 8 minutes Because 7/2=3
3+1=4
But ideally we think it should be executed at the 9th minute
when both decimalv2 and float like type in the arithmetic expr, the common type is depend on roundPreciseDecimalV2Value session variable. If it's true, the common type is DecimalV2Type.SYSTEM_DEFAULT, otherwise its double type.
This PR was originally #16940 , but it has not been updated for a long time due to the original author @Cai-Yao . At present, we will merge some of the code into the master first.
thanks @Cai-Yao @yiguolei
when pushing down constant conjunct into set operation node, we should assign the conjunct to agg node if there is one. This is consistant with pushing constant conjunct into inlineview.
For load request, there are 2 tuples on scan node, input tuple and output tuple.
The input tuple is for reading file, and it will be converted to output tuple based on user specified column mappings.
And the broker load support different column mapping in different data description to same table(or partition).
So for each scanner, the output tuples are same but the input tuple can be different.
The previous implements save the input tuple in scan node level, causing different scanner using same input tuple,
which is incorrect.
This PR remove the input tuple from scan node and save them in each scanners.
sort out the test cases of external table.
After modify, there are 2 directories:
1. `external_table_p0`: all p0 cases of external tables: hive, es, jdbc and tvf
2. `external_table_p2`: all p2 cases of external tables: hive, es, mysql, pg, iceberg and tvf
So that we can run it with one line command like:
```
sh run-regression-test.sh --run -d external_table_p0,external_table_p2
```
If there is a core dump here, it may cover up the real stack, if stack trace indicates heap corruption
(which led to invalid jemalloc metadata), like double free or use-after-free in the application.
Try sanitizers such as ASAN, or build jemalloc with --enable-debug to investigate further.
count_by_enum(expr1, expr2, ... , exprN);
Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
In some cases, the high load of HDFS may lead to a long time to read the data on HDFS,
thereby slowing down the overall query efficiency. HDFS Client provides Hedged Read.
This function can start another read thread to read the same data when a read request
exceeds a certain threshold and is not returned, and whichever is returned first will use the result.
eg:
create catalog regression properties (
'type'='hms',
'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
'dfs.client.hedged.read.threadpool.size' = '128',
'dfs.client.hedged.read.threshold.millis' = "500"
);