When we query an array of datetimev2 column through an inverted index, it returns an error like the one below. Table definition:
```
CREATE TABLE `nested` (
  `qid` bigint(20) NULL,
  `tag` array<text> NULL,
  `creationDate` datetime NULL,
  `title` text NULL,
  `user` text NULL,
  `answers.user` array<text> NULL,
  `answers.date` array<datetimev2(0)> NULL,
  INDEX tag_idx (`tag`) USING INVERTED PROPERTIES("parser" = "english") COMMENT '',
  INDEX creation_date_idx (`creationDate`) USING INVERTED COMMENT '',
  INDEX title_idx (`title`) USING INVERTED COMMENT '',
  INDEX user_idx (`user`) USING INVERTED COMMENT '',
  INDEX answers_user_idx (`answers.user`) USING INVERTED COMMENT '',
  INDEX answers_date_idx (`answers.date`) USING INVERTED COMMENT ''
) ENGINE=OLAP
DUPLICATE KEY(`qid`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`qid`) BUCKETS 18
PROPERTIES (
  "replication_allocation" = "tag.location.default: 1",
  "storage_format" = "V2",
  "compression" = "ZSTD",
  "light_schema_change" = "true",
  "dynamic_schema" = "true",
  "disable_auto_compaction" = "false"
);
```
```
mysql> select * from nested.nested where tag match 'java' and `answers.date` element_le '2012-04-08T21:15:33.873Z' limit 10;
ERROR 1105 (HY000): errCode = 2, detailMessage = no function found for MATCH_ELEMENT_LE,`answers.date` MA
```
1. Add session variable max_execution_time as an alias of query_timeout: if a user sets max_execution_time, query_timeout is updated as well (see the example after this list).
2. Add a setter attribute to session variables, so that extra logic can be placed in the setter method instead of relying on field reflection.
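A minimal sketch of the intended behavior, assuming max_execution_time is expressed in milliseconds while query_timeout is in seconds (the values below are hypothetical):

```
-- Setting the MySQL-compatible alias should also update query_timeout.
SET max_execution_time = 60000;
SHOW VARIABLES LIKE 'query_timeout';
```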
Add a new FE config: show_details_for_unaccessible_tablet.
It defaults to false. When set to true and a query cannot select a healthy replica, the detailed information of all replicas of the tablet, including the specific reason why each one is unqueryable, is printed out.
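A hedged example of enabling it at runtime, assuming the config is mutable and can be set through ADMIN SET FRONTEND CONFIG like other FE configs:

```
ADMIN SET FRONTEND CONFIG ("show_details_for_unaccessible_tablet" = "true");
```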
When forbid_unknown_col_stats is enabled and some column stats are unknown, we check the BE status via StatisticsUtil.statsTblAvailable() and report an error according to that BE status.
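A minimal sketch, assuming forbid_unknown_col_stats is exposed as a session variable:

```
-- Assumed session variable name, taken from the description above.
SET forbid_unknown_col_stats = true;
-- Queries on columns with unknown stats should now report an error that
-- reflects the BE status checked by StatisticsUtil.statsTblAvailable().
```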
Why upgrade? Is anything wrong?
Try to fix the problem with opentelemetry::v1::ext::http::client::curl::HttpOperation::Send(); the PR info has been updated.
A situation similar to #19344: because HMS meta info is sometimes newer than HMS events, if we invoke org.apache.doris.datasource.hive.PooledHiveMetaStoreClient#getTable and the table does not exist, an error is thrown and the event cannot be handled.
To be more compatible with MySQL, rename the JSONB type name and function names to JSON.
The old JSONB type name and jsonb_xx functions can still be used for backward compatibility.
The function jsonb_extract remains for now, since json_extract is used as a JSON string function and changing it needs more work; it will be changed in a follow-up.
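A sketch of the naming after the rename; json_parse as the new spelling of jsonb_parse is an assumption, and the exact set of renamed functions may differ:

```
SELECT json_parse('{"k": 1}');                    -- new JSON-prefixed name
SELECT jsonb_parse('{"k": 1}');                   -- old name kept for backward compatibility
SELECT jsonb_extract('{"k": [1, 2]}', '$.k[0]');  -- extract keeps the jsonb_ prefix for now
```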
1. Add json_unquote and json_extract functions (see the sketch after this list).
2. Remove MV-related code in visitPhysicalOlapScan.
3. Forbid bitmap and hll types in a topn node's sort exprs.
4. HashDistributionInfo of the olap scan node should use the slots from the output, not the full schema.
5. SelectMaterializedIndexWithoutAggregate should use the filter node's output together with the predicate to get the correct MV.
6. Forbid SimplifyArithmeticRule for decimal types.
7. Make DecimalLiteral's type and value consistent with each other if the value is decimalv2.
8. json_array needs to support an empty argument.
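A hedged illustration of items 1 and 8; the outputs shown in the comments are the MySQL-compatible results we would expect, not verified against this build:

```
SELECT json_unquote('"doris"');                -- expected: doris
SELECT json_extract('{"k": [1, 2]}', '$.k');   -- expected: [1, 2]
SELECT json_array();                           -- empty argument, expected: []
```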
The Nereids planner adds conjuncts to the ScanNode after calling finalize. This may cause an external table scan node to fail to filter out useless partitions, because external tables do the partition prune in the finalize method.
This PR fixes the bug: in the rewrite stage, pass the conjuncts to the LogicalFileScan object, and eventually pass them to the ScanNode when creating it, so that the ScanNode can use the conjuncts while doing finalize.
Why not do the partition prune in LogicalFileScan, the way LogicalOlapScan does? Because the Iceberg API doesn't have a partition concept, it just accepts a list of conjuncts, so it is easier to pass the conjuncts to the ScanNode (Hive, Iceberg, Hudi...) and do the partition prune there.
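A hedged example of the behavior being fixed, using hypothetical catalog and table names; with this change the partition predicate should reach the ScanNode before finalize, so only matching partitions are scanned:

```
SELECT count(*)
FROM hive_catalog.db1.events        -- hypothetical external table partitioned by dt
WHERE dt = '2023-03-01';            -- predicate can now prune partitions in finalize
```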
Fix three bugs of timestampv2 precision:
1. The Hive catalog doesn't set the precision of timestampv2 and can't get the precision from the Hive metastore, so set the largest precision for timestampv2.
2. The JDBC catalog uses datetimev1 to parse timestamps and then converts to timestampv2, so the precision is lost.
3. TVF doesn't use the precision from the file format's metadata.
Change the Nereids minidump switch from "Dump_nereids" to "enable_minidump", which is more precise and consistent. Also fix a serialization bug where Double.NaN (not a number) broke column statistic serialization.
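Assuming the switch is exposed as a session variable, enabling it would look like:

```
SET enable_minidump = true;
```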
# Proposed changes
In Nereids, before this PR, when we access a nonexistent table, it reports the exception as follows:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: null
```
After this PR, it returns the following result:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: Table [tt] does not exist in database [default_cluster:test].
```
## Problem summary
This is because in this [function](f5af07f7b2/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java (L328)) we ignore the exception, so the size of `tables` in `CascadesContext` is zero rather than null, and we can only get null after `table = cascadesContext.getTableByName(tableName);`.
There are two problems with minidump:
1. Minidump does not load the connect context into ThreadInfo, so it cannot be retrieved easily.
2. Minidump writes maps without concurrency protection, so the map size can change while we are iterating over the map.
Solution:
1. Load the connect context into the minidump thread.
2. Copy to a new immutable map before actually iterating.