fix problem:
If there is an unfinished schema change job (job-2), and before this time, another schema change job (job-1) of the same table has been finished.
Then restart fe, will replay edit log (pending log and waiting_txn log) for job-2, and the table's state is set to SHCEMA_CHANGE, but when loadAlterJob after replayJournal, will add job-1 to schema change handler, and then run the job-1 will set the table to NORMAL because of job-1 is done, but at this point, the job-2 is doing runWaitingTxnJob, in this function will check table's state, if not normal will throw exception, not change the job's state, and cannot cancel the job because the table is not under schema change.
fix s3 resource check:
ERROR 1105 (HY000): Unexpected exception: org.apache.doris.common.DdlException: errCode = 2, detailMessage = Missing [AWS_ACCESS_KEY] in properties.
we should use new properties to check s3 available
If we have join as the root node, then after some join reorder join, the root Group in Memo will have a GroupExpression including LogicalProject as its plan and the children is its ownerGroup.
This PR add a rewrite rule to ensure we have a Project on the top of the top Join of plan to avoid circle in Memo.
- The data source parameters are sunk into the specific data source class
- Simplify some code logic to reduce code complexity
- Provide a data source factory class to extract public logic
- Code that removes tests from production code. We should not include code for testing purposes in any production code.
Iceberg table partition name may contain upper case characters, for example: City=xxx, Nation=xxx.
But in Doris, all column names are in lower case. Here we transfer the partition name to lower case to keep consist with column name.
Hudi external table is deprecated since 1.2.
We should remove it now.
Recommend to use "multi-catalog" feature to connect to Hudi.
User can not create Hudi external table.
When restarting FE, all hudi external table will still be replayed but can not be read. And when doing checkpoint, all these tables will be discarded.
Sometimes I find that the tablet scheduler can not schedule tablet, and with no more info for debugging.
So I add some debug log for this process.
No logic is changed.
In a scenario where multiple DBs are simultaneously imported with high concurrency, a significant number of transactions will be generated. Without a summary field, we cannot clearly see how many transactions there are in the current cluster. Therefore, I have enhanced this point.
```
mysql> show proc "/transactions";
+-------+-----------------------------------+-----------------------+
| DbId | DbName | RunningTransactionNum |
+-------+-----------------------------------+-----------------------+
| 10002 | default_cluster:xxxx | 0 |
| 14005 | default_cluster:__internal_schema | 0 |
| Total | 2 | 0 |
+-------+-----------------------------------+-----------------------+
3 rows in set (0.02 sec)
```
When doing serialization of minidump input, we can find that when serializing colocate table index, the size and entry get by the hash map always unmatched when concurrent occur. So a write lock be added to ensure concurrency.
When we try to query array of datetimev2 column by inverted index, it returns an error like this:
CREATE TABLE `nested` (
`qid` bigint(20) NULL,
`tag` array<text> NULL,
`creationDate` datetime NULL,
`title` text NULL,
`user` text NULL,
`answers.user` array<text> NULL,
`answers.date` array<datetimev2(0)> NULL,
INDEX tag_idx (`tag`) USING INVERTED PROPERTIES("parser" = "english") COMMENT '',
INDEX creation_date_idx (`creationDate`) USING INVERTED COMMENT '',
INDEX title_idx (`title`) USING INVERTED COMMENT '',
INDEX user_idx (`user`) USING INVERTED COMMENT '',
INDEX answers_user_idx (`answers.user`) USING INVERTED COMMENT '',
INDEX answers_date_idx (`answers.date`) USING INVERTED COMMENT ''
) ENGINE=OLAP
DUPLICATE KEY(`qid`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`qid`) BUCKETS 18
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"storage_format" = "V2",
"compression" = "ZSTD",
"light_schema_change" = "true",
"dynamic_schema" = "true",
"disable_auto_compaction" = "false"
);
mysql> select * from nested.nested where tag match 'java' and `answers.date` element_le '2012-04-08T21:15:33.873Z' limit 10;
ERROR 1105 (HY000): errCode = 2, detailMessage = no function found for MATCH_ELEMENT_LE,`answers.date` MA
1. add session variable max_execution_time to an alias of query timeout, if user set max_execution_time, the query timeout will be modified too.
2. add a setter attribute to session variable, so that we could add some logic in setter method instead of field reflection.
Add a new FE config: show_details_for_unaccessible_tablet.
Default is false, when set to true, if a query is unable to select a healthy replica,
the detailed information of all the replicas of the tablet including the specific reason why they are unqueryable,
will be printed out.
when forbid_unknown_col_stats is open and some column stats is unknown,
we will check the be status by StatisticsUtil.statsTblAvailable(), and report error according to be status.
why upgrade? anything wrong?
Try to fix the problem about opentelemetry::v1::ext::http::client::curl::HttpOperation::Send(), I have updated the pr info.