This PR adds support for lowercase column names in the Oracle JDBC catalog. Previously, we converted all column names to uppercase in the queries sent to Oracle by default, so columns defined with lowercase names could not be handled. This PR fixes that case and improves how all JDBC catalogs work.
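For illustration (schema, table and catalog names are hypothetical), an Oracle table created with quoted, case-sensitive lowercase column names, which previously could not be queried because the SQL generated for Oracle always upper-cased the identifiers:

```sql
-- Oracle side: quoted identifiers keep their lowercase form
CREATE TABLE DORIS_TEST.STUDENT ("id" NUMBER, "name" VARCHAR2(32));

-- Doris side: with this PR, the lowercase columns are resolved correctly
-- through the Oracle JDBC catalog
SELECT id, name FROM oracle_jdbc_catalog.DORIS_TEST.STUDENT;
```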
When a table is undergoing a schema change, the modified columns in the shadow index get a `_doris_shadow` prefix added to their names.
Writes performed during the schema change generate rowset schemas that carry the `_doris_shadow` prefix on the BE.
If the alter task arrives at the BE after such a write request, it uses the rowset schema with the max version, which still carries the `_doris_shadow` prefix,
and an error like the following is thrown:
a shadow column is encountered __doris_shadow_p_retailprice
[INTERNAL_ERROR]failed due to operate on shadow column
This commit disables the shadow prefix in the rowset meta schema.
Fix two bugs:
1. The missing-column check was case sensitive; change the column names to lower case in FE for hive/iceberg/hudi (see the illustrative scenario after this list).
2. Iceberg uses a custom method to encode special characters in column names; decode the column names so the parquet reader matches the right columns.
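An illustrative scenario for bug 1 (all names are hypothetical): the column name case differs between the table schema and the data file, so a case-sensitive comparison treated the column as missing:

```sql
-- The metastore stores the column as `userid`, while the Parquet file
-- was written with `UserID`. Before this fix, the case-sensitive match
-- failed and the column was read as NULL; now the name is lower-cased
-- in FE and the column is matched correctly.
SELECT userid FROM hive_catalog.db1.events;
```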
Since #24830 introduced `broker.name` in the HMS catalog, data scans can run on the specified brokers.
The [doris operator](https://github.com/selectdb/doris-operator) supports deploying a BE and a broker in the same pod, and letting the BE access its local broker is the fastest way to access data.
In the previous logic, every inputSplit selected one BE to execute on and then randomly selected one broker for the actual data access, so the BE and the chosen broker usually ended up in separate K8S pods.
This PR optimizes the broker selection strategy to prioritize the BE-local broker when `broker.name` is specified in the HMS catalog.
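As a hedged example (the metastore URI and broker name are illustrative), an HMS catalog created with `broker.name`; scans through such a catalog now prefer the broker co-located with the selected BE:

```sql
CREATE CATALOG hive_hms PROPERTIES (
    "type" = "hms",
    "hive.metastore.uris" = "thrift://127.0.0.1:9083",
    -- splits scanned via this catalog go through the named brokers;
    -- with this PR, the BE-local broker is chosen first when one exists
    "broker.name" = "hdfs_broker"
);
```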
Consider the SQL:
SELECT * FROM t1 WHERE t1.a <= (SELECT COUNT(t2.a) FROM t2 WHERE (t1.b = t2.b));
When unnesting the correlated subquery, we create a left join node.
Assume the outer query is the left table and the subquery is the right one.
If there is no match, the row from the right table is filled with nulls.
But the COUNT function is marked as always non-nullable, while the original scalar subquery returns 0 over an empty match set.
So wrap COUNT with Nvl to ensure its result is 0 instead of null, which gives the correct result.
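Conceptually (a hedged sketch, not necessarily the exact plan Doris generates), the unnested query behaves like the following; the nvl keeps the comparison correct for outer rows that have no match on the right side:

```sql
SELECT t1.*
FROM t1
LEFT JOIN (
    SELECT t2.b, COUNT(t2.a) AS cnt
    FROM t2
    GROUP BY t2.b
) v ON t1.b = v.b
-- without nvl, cnt is NULL for non-matching rows and the predicate never passes;
-- with nvl, such rows compare against 0, matching the original scalar subquery
WHERE t1.a <= nvl(v.cnt, 0);
```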
Fixed a bug where querying information_schema.rowsets returned incomplete results when there are multiple BEs.
The reason is that the schema scanner sent the scan fragment to only one of the BEs, and that BE queried the information from the FE through RPC. Since the rowsets information covers all BEs, the scan fragment needs to be sent to every BE.
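For example, the query that exposed the problem:

```sql
-- Before this fix, with multiple BEs the result only covered the single BE
-- that happened to execute the scan fragment; now every BE is scanned.
SELECT * FROM information_schema.rowsets;
```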
In backup, when prepareAndSendSnapshotTask() hits an error on some table, the status becomes not ok, but the method did not return; the remaining tables still put their snapshot jobs into batchTask and submitted them to the BEs, and those jobs then needed to be cancelled. So, when the status is not ok, return immediately and do not submit the jobs.
1. Fix a race condition when getting the tablet load index.
2. Change the tablet selection algorithm from random to round-robin for random-distribution tables when load_to_single_tablet is set to false (see the example table after this list).
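A minimal example of the kind of table this applies to (schema and properties are illustrative): a random-distribution table loaded with load_to_single_tablet = false, where incoming batches are now spread over the tablets in round-robin order instead of landing on randomly picked tablets:

```sql
CREATE TABLE sink_random (
    k1 INT,
    v1 STRING
) DUPLICATE KEY (k1)
DISTRIBUTED BY RANDOM BUCKETS 8
PROPERTIES ("replication_num" = "1");
```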
Reproduce:
The DBA does the following operations:
1. create user user1@['domain']; // the domain resolves to 2 IPs: ip1 and ip2
2. create user user1@'ip1';
3. wait at least 10 seconds
4. grant all on *.*.* to user1@'ip1'; // returns error: user1@'ip1' does not exist
This happens because the daemon thread DomainResolver resolves the "domain" and overwrites the `user1@'ip1'` entry that was created by the DBA.
This PR fixes it.
New session variable: runtime_filter_wait_infinitely. If runtime_filter_wait_infinitely is set to true, the consumer of a runtime filter will keep waiting for it until the query times out.
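Usage (assuming the usual Doris session-variable syntax):

```sql
-- per session
SET runtime_filter_wait_infinitely = true;
-- or for all new sessions
SET GLOBAL runtime_filter_wait_infinitely = true;
```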
## Motivation:
In the past, our JOB only supported the Timer JOB, which could only schedule fixed-time tasks. Meanwhile, the JOB was solely responsible for execution, and during execution the states could become inconsistent: a task was executed successfully, but the JOB's recorded task status was not updated.
This inconsistency in the task states recorded by the JOB meant the correctness of the JOB's status could not be guaranteed. As more and more business features were integrated into the JOB, such as the export job and the mtmv job, we found it difficult to extend, and the JOB became particularly bulky. Hence, we have decided to refactor JOB.
## Refactoring Goals:
- Provide a unified external registration interface so that all JOBs can be registered through this interface and be scheduled by the JobScheduler.
- The JobScheduler can schedule instant JOBs, timer JOBs, and manual JOBs.
- JOB should provide a unified external extension class. All JOBs can be extended through this extension class, which can provide special functionalities like JOB status restoration, Task execution, etc.
- Extended JOBs should manage task states on their own to avoid inconsistent state maintenance issues.
- Different JOBs should use their own thread pools for processing to prevent inter-JOB interference.
### Design:
- The JOBManager provides a unified registration interface through which all JOBs can register and then be scheduled by the JobScheduler.
- The TimerJob periodically fetches JOBs that need to be scheduled within a time window and hands them over to the Time Wheel for triggering. To prevent excessive tasks in the Time Wheel, it distributes the tasks to the dispatch thread pool, which then assigns them to corresponding thread pools for execution.
- ManualJob or Instant Job directly assigns tasks to the corresponding thread pool for execution.
- The JOB provides a unified extension class that all JOBs can utilize for extension, providing special functionalities like JOB status restoration, Task execution, etc.
- To implement a new JOB, one only needs to implement AbstractJob.class and AbstractTask.class.
<img width="926" alt="image" src="https://github.com/apache/doris/assets/16631152/3032e05d-133e-425b-b31e-4bb492f06ddc">
## NOTICE
This will cause the master's metadata to be incompatible with previous versions.
The SHOW COLUMN STATS result for external tables showed N/A for the method, type, trigger and query_times columns. This PR fixes that bug so the correct values are shown.
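For example (catalog, database and table names are hypothetical):

```sql
SWITCH hive_catalog;
USE db1;
-- method, type, trigger and query_times now show real values instead of N/A
SHOW COLUMN STATS tbl1;
```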
Invoking ConnectContext.get() in the replayer thread of slave FE nodes may return null, so an NPE will be thrown and the slave nodes will crash.
Co-authored-by: wangxiangyu <wangxiangyu@360shuke.com>