## Proposed changes
1. send fragment in BE may run for long time.
2. timeout is not related with BE down.
3. cancel logic should not handle blacklist. because if a BE is down,
send fragment will find it.
Co-authored-by: yiguolei <yiguolei@gmail.com>
…cs (#41989)
before this pr: use ! or not in or condition when table has been
analyzed we will meet
```
SELECT
count(1)
FROM
table_30_un_pa_ke_pr_di4
where
col_int_undef_signed_not_null < -128
or not array_contains(col_array_bigint__undef_signed, col_int_undef_signed_not_null);
ERROR 1105 (HY000): errCode = 2, detailMessage = Not-predicate meet unexpected child:
array_contains(col_array_bigint__undef_signed, cast(col_int_undef_signed_not_null as BIGINT))
```
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
pick (#41992)
We initially introduced jdbc connection pool to improve the connection
performance of jdbc catalog, but we always found that connection pool
would bring some unexpected errors, so we chose to add a catalog
property: `enable_connection_pool` to choose whether to enable the jdbc
connection pool of jdbc catalog, and the default false.However, the
created catalog will still open the connection pool when it is upgraded,
and only the newly created catalog will be false
And we conducted performance tests on this, the performance loss is
within the expected range.
- Enable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM mysql.test.test
limit 1;' --create-schema=mysql --delimiter=";" --verbose
Benchmark
Average number of seconds to run all queries: 0.008 seconds
Minimum number of seconds to run all queries: 0.004 seconds
Maximum number of seconds to run all queries: 0.133 seconds
Number of clients running queries: 1
Average number of queries per client: 1
- Disable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM
mysql_no_pool.test.test limit 1;' --create-schema=mysql --delimiter=";"
--verbose
Benchmark
Average number of seconds to run all queries: 0.054 seconds
Minimum number of seconds to run all queries: 0.047 seconds
Maximum number of seconds to run all queries: 0.184 seconds
Number of clients running queries: 1
Average number of queries per client: 1
pick (#41218)
In the previous FileScanNode, some parts that used conjuncts for
predicate conversion were placed in the init phase. However, for the
Nereids planner, pushing the filter down to the scan happens in the
Translator, which means that the ScanNode can only get the complete
conjuncts in the finalized phase. Therefore, in this PR, I have removed
all conjuncts variables in External for the Nereids planner. They no
longer need to store conjuncts themselves or add them to the ScanNode.
Instead, all places in the ScanNode that use conjuncts should be moved
to the finalized phase.
This refactor also fix a performance issue introduced from #40176 After
introducing the change of generating SelectNode for consecutive projects
or filters, FileScan still adds conjuncts too early in the init phase,
resulting in the discovery of consecutive filters when the upper layer
continues to translate, a selectnode was unexpectedly generated on the
scannode, causing the project to be unable to prune the scannode
columns. However, the Project node trims columns of SelectNode and
ScanNode differently, which causes ScanNode to scan unnecessary columns.
My modification removes the addition of conjuncts in the scannode step,
so that we can keep the structure from ScanNode to Project and achieve
correct column trimming.
pick (#42102)
Add a variable `enable_jdbc_cast_predicate_push_down`, the default value
is false, which prohibits the pushdown of non-constant predicates with
type conversion and all predicates with implicit conversion. This change
can prevent the wrong predicates from being pushed down to the Jdbc data
source, resulting in query data errors, because the predicates with cast
were not correctly pushed down to the data source before. If you find
that the data is read correctly and the performance is better before
this change, you can manually set this variable to true
```
| Expression | Can Push Down |
|-----------------------------------------------------|---------------|
| column type equals const type | Yes |
| column type equals cast const type | Yes |
| cast column type equals const type | No |
| cast column type equals cast const type | No |
| column type not equals column type | No |
| column type not equals cast const type | No |
| cast column type not equals const type | No |
| cast column type not equals cast const type | No |
```
* `replay journal cost too much time` is a counter for replaying a batch
editlog it is normal that cost too much time, the warning level can make
confused
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
pick from master #41356
if reference is nullable and simplify result is boolean literal. the
real result should be:
IF(${reference} IS NULL, NULL, ${not_null_result})
## Proposed changes
Some processors have erased the stats information of the nodes, causing
the runtime_filter_pruner to encounter a NullPointerException.
Issue Number: close #xxx
<!--Describe your changes.-->
cherry-pick #41951
load failed where not set database in session, should use label's
database if not set database in session
LOAD LABEL test_db.label_111111 ( DATA
INFILE("hdfs://hdfs01:9000/user/") INTO TABLE `test_load_tb`) WITH
BROKER "broker" ( "username" = "user", "password" = "");
ERROR 1105 (HY000): errCode = 2, detailMessage = Current database is not
set.
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
cherry-pick: #39723#41164#41331#41546 because later problem is intro by prev one, so put them together
when using fold constant by be,
the return type of substring('123456',1, 3) would changed to be text, which we want it to be 3 remove windowframe in window expression to avoid folding constant on be
## Proposed changes
upgrade commons-configuration2 to 2.11.0
upgrade logging-interceptor to 4.12.0
upgrade commons-compress to 1.27.1
upgrade jetty-bom to 9.4.56.v20240826
upgrade azure-sdk to 1.2.27
Iceberg depends on configuration2, and configuration2 relies on a newer
version of commons-lang3. However, there were significant breaking
changes in commons-lang3, which made it
incompatible.https://issues.apache.org/jira/browse/LANG-1705 As a
result, I rewrote the clone method.
(cherry picked from commit 945edf8dbffaa25c987bcefad59b6cde52772d4f)
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
pick: #35925#39715#40167#40958
Add feat of force use/nouse cbo rule hint and fix pr
introduce
when not using this hint, cbo rules like INFER_SET_OPERATOR_DISTINCT
would generate two plans and compare their cost
and nereids optimizer would decide which is better. But when we want to
control the behavior of cbo rules we could use this force cbo rule hint
usage example
explain shape plan
select /*+ USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */
*
from t1
union
select * from t2;
the USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) hint would force rule
INFER_SET_OPERATOR_DISTINCT to be used
and generate plan like, which hashAgg below union is generated by this
rule:
-- !with_hint_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t1]
--------------hashAgg[LOCAL]
----------------PhysicalOlapScan[t2]
Hint log:
Used: INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
When we want to force disable this rule, we could use
explain shape plan select /*+
NO_USE_CBO_RULE(INFER_SET_OPERATOR_DISTINCT) */ * from t1 union select *
from t2;
which would generate plan with this rule:
-- !with_hint_no_union_distinct --
----hashAgg[GLOBAL]
--------hashAgg[LOCAL]
----------PhysicalUnion
--------------PhysicalOlapScan[t1]
--------------PhysicalOlapScan[t2]
Hint log:
Used: NO_INFER_SET_OPERATOR_DISTINCT
UnUsed:
SyntaxError:
change sessionvariable enableNereidsRules to varType.remove
## Proposed changes
The JOB's execution SQL is currently defined by an older CUP file, which
causes some issues with lexical analysis in the new optimizer as it
doesn't pass under the old optimizer. Since the JOB's underlying
execution already uses the new optimizer, we're planning to fully
migrate to ANTLR4 for consistency.
(cherry picked from commit 334b473deb5ff2e5c29c5eedcfac95dd806ae622)
#41391
pick from master #41719
just like previous PR #41548
this PR process union node to ensure not require any column from its
children when it is required by its parent with empty slot set
bp #41659
## Proposed changes
Because ExternalTable will initialize the previously uninitialized table
when `getCachedRowCount()`, which is unnecessary. So for the
uninitialized table, we directly return -1.
This will increase the speed of our query `information_schema.tables`.