BufferControlBlock may block all fragment handling threads, leaving the BE unable to do any work.
Modifications include:
BufferControlBlock cancels after the max timeout
StmtExecutor notifies the BE to cancel the fragment when an unexpected error occurs
For more details, see issue #16203.
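The timeout-based cancel can be illustrated with a small sketch. This is Python rather than the BE's C++, and `ResultBuffer`, `add_batch`, `get_batch` and `max_wait_s` are hypothetical names chosen for illustration, not the actual BufferControlBlock API. It only shows the idea that a blocked producer gives up and cancels after a maximum wait instead of holding its thread forever.

```python
import threading
from collections import deque


class ResultBuffer:
    """Bounded buffer whose producers give up after a maximum wait.

    Hypothetical stand-in for a result buffer; not the Doris implementation.
    """

    def __init__(self, capacity, max_wait_s):
        self._capacity = capacity
        self._max_wait_s = max_wait_s
        self._queue = deque()
        self._cond = threading.Condition()
        self._cancelled = False

    def add_batch(self, batch):
        """Return True if the batch was queued, False if the buffer is cancelled."""
        with self._cond:
            has_room_or_cancelled = self._cond.wait_for(
                lambda: self._cancelled or len(self._queue) < self._capacity,
                timeout=self._max_wait_s,
            )
            if not has_room_or_cancelled:
                # Nobody drained the buffer in time: cancel instead of
                # blocking the producer thread indefinitely.
                self._cancelled = True
                self._cond.notify_all()
            if self._cancelled:
                return False
            self._queue.append(batch)
            return True

    def get_batch(self):
        """Pop one batch, or return None if the buffer is empty."""
        with self._cond:
            if not self._queue:
                return None
            batch = self._queue.popleft()
            self._cond.notify_all()
            return batch

    def cancel(self):
        """Explicit cancel, e.g. when the coordinator reports an error."""
        with self._cond:
            self._cancelled = True
            self._cond.notify_all()
```

Once the buffer is cancelled, every later `add_batch` call returns immediately, so producer threads are released rather than piling up.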
Column names of external HMS catalog tables in Doris are all lower case,
while Iceberg tables or Hive tables created by spark-sql may contain upper-case column names,
which causes empty query results. This PR fixes this bug.
1. For parquet files, convert all column names to lower case while parsing the parquet metadata.
2. For orc files, store the original column names and the lower-case column names in two vectors, and use whichever fits each case.
3. On the FE side, change the column name back to the original Iceberg column name in convertToIcebergExpr.
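The name matching described above can be sketched roughly as follows. This is an illustrative Python sketch, not the reader code; `build_lower_to_original` and `resolve_columns` are hypothetical helpers. It keeps both the original and the lower-case names so that lookups use lower case while file access uses the original spelling.

```python
def build_lower_to_original(file_columns):
    """Map each lower-case column name to the name stored in the file."""
    return {name.lower(): name for name in file_columns}


def resolve_columns(requested, file_columns):
    """Translate requested (lower-case) names to the file's original names.

    Returns None for a requested column that the file does not contain.
    """
    lower_to_original = build_lower_to_original(file_columns)
    return [lower_to_original.get(col.lower()) for col in requested]
```

For example, a request for `id` and `username` against a file with columns `ID` and `userName` resolves to the original spellings instead of matching nothing.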
remove json functions code
remove string functions code
remove math functions code
move MatchPredicate to olap since it is only used in storage predicate process
remove some code in tuple; the Tuple structure should be removed in the future
remove much of the code in the collection value structure, since it is unused
PR #15836 changed how the parquet reader is used: first call open(), then init_reader().
But we forgot to call open() for Iceberg delete files, which causes a coredump.
This PR fixes two bugs:
_jdbc_scanner may be nullptr in vjdbc_connector.cpp, so we use another method to count JDBC statistics. Closes #15914 ([Enhencement](jdbc scanner) add profile for jdbc scanner).
In the batch insertion scenario, Oracle does not support the syntax `insert into tables values (...),(...);`. What it supports is:
insert all
into table(col1,col2) values(c1v1, c2v1)
into table(col1,col2) values(c1v2, c2v2)
SELECT 1 FROM DUAL;
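As a rough illustration of building such a statement, the sketch below assembles an `INSERT ALL` batch from rows. It is Python, not the connector code; `build_oracle_insert_all` is a hypothetical helper, and the values are assumed to be already-rendered SQL literals (a real implementation would need proper escaping or bind parameters).

```python
def build_oracle_insert_all(table, columns, rows):
    """Build an Oracle multi-row INSERT ALL statement.

    `rows` is a list of sequences of already-rendered SQL literals.
    """
    if not rows:
        raise ValueError("at least one row is required")
    col_list = ", ".join(columns)
    lines = ["INSERT ALL"]
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
        lines.append(f"INTO {table}({col_list}) VALUES ({', '.join(row)})")
    # Oracle requires a subquery to close the INSERT ALL statement.
    lines.append("SELECT 1 FROM DUAL")
    return "\n".join(lines)
```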
1. In vertical compaction, segments are loaded for every column group, so
we should cache the segment pointers to avoid excessive repeated IO.
2. Fix a data size bug in vertical compaction.
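The caching idea in item 1 can be sketched as below. This is a simplified Python illustration under assumed names (`SegmentCache`, `compact_vertically`, and the `loader` callback are hypothetical), not the storage engine code. Each segment is loaded once and reused for every column group.

```python
class SegmentCache:
    """Keeps loaded segments so each one is read from storage only once."""

    def __init__(self, loader):
        self._loader = loader      # function: segment_id -> segment object
        self._segments = {}
        self.loads = 0             # number of real loads, for inspection

    def get(self, segment_id):
        if segment_id not in self._segments:
            self._segments[segment_id] = self._loader(segment_id)
            self.loads += 1
        return self._segments[segment_id]


def compact_vertically(segment_ids, column_groups, cache):
    """Visit every segment once per column group, reusing cached segments."""
    visited = []
    for group in column_groups:
        for segment_id in segment_ids:
            segment = cache.get(segment_id)
            visited.append((tuple(group), segment))
    return visited
```

With three column groups and two segments, the loader runs twice rather than six times.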
1. Support a row format using the jsonb codec
2. Short-path optimization for point queries
3. Support prepared statements for point queries
4. Support the MySQL binary format
Support Iceberg schema evolution for the parquet file format.
Iceberg uses a unique id for each column to support schema evolution.
To support this feature in Doris, the FE side needs to get the current column id for each column and send the ids to the BE side.
The BE reads the column ids from the parquet key_value_metadata and, before reading data, sets the changed column names in the Block to match the names in the parquet file. It sets the names back after reading data.
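The rename-read-rename flow can be sketched as follows. This is an illustrative Python model, not the BE code: `read_columns` is a hypothetical function, the table schema is modelled as `(column_id, current_name)` pairs, and the file is modelled as a plain dict keyed by the names it was written with.

```python
def read_columns(table_columns, file_id_to_name, file_data):
    """Read columns by id so renamed columns still match the old file.

    table_columns:   list of (column_id, current_name) from the table schema
    file_id_to_name: column_id -> name recorded in the file's metadata
    file_data:       name-in-file -> list of values
    """
    # Step 1: translate current names to the names used inside the file.
    current_to_file = {
        name: file_id_to_name[column_id]
        for column_id, name in table_columns
        if column_id in file_id_to_name
    }
    # Step 2: read the data using the file's own column names.
    block = {
        file_name: file_data[file_name]
        for file_name in current_to_file.values()
    }
    # Step 3: set the names back to the current table names.
    file_to_current = {f: c for c, f in current_to_file.items()}
    return {file_to_current[f]: values for f, values in block.items()}
```

Columns whose id is absent from the file (for example, columns added after the file was written) are simply skipped in this sketch.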
This PR optimizes TopN queries like `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`.
TopN is composed of a SortNode and a ScanNode. When the user table is wide (100+ columns), the ORDER BY clause usually covers just a few columns, but the ScanNode needs to scan all data from the storage engine even if the limit is very small. This may lead to a lot of read amplification. So in this PR I divide the TopN query into two phases:
1. In the first phase we only read `columnA`'s data from the storage engine, along with an extra RowId column called `__DORIS_ROWID_COL__`. The other columns are pruned from the ScanNode.
2. The second phase runs in the ExchangeNode because it is the central node for TopN nodes in the cluster. The ExchangeNode issues an RPC to the other nodes using the RowIds (sorted and limited by the SortNode) from the first phase, and reads row by row from the storage engine.
After the second phase read, the Block contains all the data needed for the query.
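The two phases can be modelled in a few lines. This is a single-process Python sketch with a hypothetical `topn_two_phase` function; the list index stands in for the RowId, and the list lookup stands in for the row fetch RPC. It is not the planner or execution code.

```python
def topn_two_phase(rows, order_col, limit, descending=False):
    """Return the top `limit` rows ordered by `order_col`.

    rows: list of dicts; the list index plays the role of the row id.
    """
    # Phase 1: read only the sort column plus the row id, then sort and limit.
    keys = [(row[order_col], row_id) for row_id, row in enumerate(rows)]
    top = sorted(keys, key=lambda k: k[0], reverse=descending)[:limit]

    # Phase 2: fetch the full rows for the surviving row ids only.
    return [rows[row_id] for _, row_id in top]
```

Only `limit` full rows are materialized in phase 2, which is where the reduction in read amplification comes from on wide tables.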