Commit Graph

53 Commits

Author SHA1 Message Date
9e972cb0b9 [bugfix](iceberg)Fix the datafile path error issue for 2.1 (#36066)
bp: #35957
2024-06-08 21:51:46 +08:00
b91d2caab8 [Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. (#35587)
backport #34929
2024-05-29 16:40:54 +08:00
eb49cd839b [refactor](datalake) return the error status instead of static_cast<void> (#34873)
Followup #34797
`static_cast<void>` silently ignored the error status; some of those statuses should finish the query with an error, so replace `static_cast<void>` with `RETURN_IF_ERROR` (a minimal sketch of the change follows the list below).

The following three scenarios need to be handled separately and cannot be simply replaced:
1. The outer function returns void;
2. The status-returning function is called inside a constructor or destructor;
3. The status-returning function is called on a best-effort basis, and the error status should be ignored.
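A minimal sketch of the replacement, assuming a simplified `Status` type and `RETURN_IF_ERROR` macro (the real definitions live in the BE status header; the names below are stand-ins for illustration):

```cpp
#include <iostream>
#include <string>

// Simplified stand-in for the BE Status type.
class Status {
public:
    static Status OK() { return Status(); }
    static Status InternalError(std::string msg) { return Status(std::move(msg)); }
    bool ok() const { return _msg.empty(); }
    const std::string& msg() const { return _msg; }

private:
    Status() = default;
    explicit Status(std::string msg) : _msg(std::move(msg)) {}
    std::string _msg;
};

// Simplified version of the macro: propagate the error instead of discarding it.
#define RETURN_IF_ERROR(stmt)      \
    do {                           \
        Status _st = (stmt);       \
        if (!_st.ok()) return _st; \
    } while (false)

Status open_file(bool fail) {
    return fail ? Status::InternalError("open failed") : Status::OK();
}

Status read_data() {
    // Before: static_cast<void>(open_file(true));  -> the error status was lost.
    RETURN_IF_ERROR(open_file(true));  // Now the error ends the call chain with a status.
    return Status::OK();
}

int main() {
    Status st = read_data();
    std::cout << (st.ok() ? "ok" : st.msg()) << std::endl;
    return 0;
}
```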
2024-05-23 19:06:21 +08:00
adc364a6fd [feature](Paimon) support deletion vector for Paimon naive reader (#34743) (#35241)
bp #34743
Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>
2024-05-23 00:01:30 +08:00
1f0c45204b [fix](iceberg) read the primary key columns if having equality delete (#34884)
backport: #34835
2024-05-15 11:37:25 +08:00
9491b7d422 [fix](iceberg) prevent coredump if read position delete file failed (#34802) 2024-05-14 14:03:33 +08:00
35f8563a75 [feature](iceberg) support iceberg equality delete (#34223) (#34327)
bp #34223

Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
2024-04-30 11:51:29 +08:00
1bfe0f0393 [feature](iceberg) support reading iceberg complex types, iceberg.orc format and position delete. (#33935) (#34256)
master #33935
2024-04-29 14:40:12 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This was introduced by #33511, which wrongly used

ColumnStr<T> ();

This violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still supported by clang up until now (see llvm/llvm-project#58112).
2024-04-19 23:41:37 +08:00
1300317723 [Exec](join) Support column string64 to avoid join failures when the string size overflows uint32 (#33511) (#33850) 2024-04-18 19:43:08 +08:00
2e564036ef [fix](profile) avoid updating profile in destructor (#32131)
Previously, the counters in `profile` might be updated when closing the file reader,
and the file reader might be closed while the object was being destructed.
But at that point the `profile` object may already have been deleted, causing a null-pointer access and a BE crash.

This PR tries to fix the issue (a sketch of the new interface follows the list below):

1. Remove the "profile counter update" logic from all `close()` methods.

2. Add a new interface `ProfileCollector`

	It has 2 methods:
	
	- `collect_profile_at_runtime()`

		It can be called at runtime, eg, in every `get_next_block()` method.
		So that the counter in profile can be updated at runtime.
		
	- `collect_profile_before_close()`

		Should be called before the object calls `close()`, and it will only be called once.

3. Derive from `ProfileCollector`

	All classes that may update profile counters in their `close()` method, such as `GenericReader`, should extend
	`ProfileCollector` and implement `collect_profile_before_close()`.

	`collect_profile_before_close()` is called in `scanner->mark_to_need_to_close()`.
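A minimal sketch of the described interface, with simplified stand-ins for the real profile and reader types; the once-only guard shown here is one possible way to make sure `collect_profile_before_close()` runs a single time:

```cpp
#include <atomic>

// Sketch of the ProfileCollector interface described above (simplified;
// the real profile and counter types are omitted).
class ProfileCollector {
public:
    virtual ~ProfileCollector() = default;

    // May be called repeatedly at runtime, e.g. from get_next_block(),
    // so profile counters stay up to date while the query runs.
    virtual void collect_profile_at_runtime() = 0;

    // Must be called before close(); guarded so the collection runs only once.
    void collect_profile_before_close() {
        bool expected = false;
        if (_profile_collected.compare_exchange_strong(expected, true)) {
            do_collect_profile_before_close();
        }
    }

protected:
    virtual void do_collect_profile_before_close() = 0;

private:
    std::atomic<bool> _profile_collected{false};
};

// A reader updates counters here instead of in close(), so the profile
// object is guaranteed to still be alive when the counters are written.
class GenericReaderSketch : public ProfileCollector {
public:
    void collect_profile_at_runtime() override { /* update runtime counters */ }

protected:
    void do_collect_profile_before_close() override { /* flush final counters */ }
};
```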
2024-03-21 14:07:22 +08:00
0b5b7175d6 [fix](multi-catalog) add max compute custom odps and tunnel url (#31390)
add max compute custom odps and tunnel url
2024-02-29 16:44:40 +08:00
be893d792c [fix](jni) fix jni_reader function name get_nex_block to get_next_block (#29943) 2024-01-16 18:39:00 +08:00
2d2f14bc75 [fix](paimon) use SlotDescriptor to parse the required fields (#28990)
Before this PR, Paimon created the schema of `VectorTable` by accessing meta information. However, if the schema of `VectorTable` in Java was not the same as `Block` in C++, BE would crash, and there was no good way to troubleshoot the error.
2023-12-27 15:45:53 +08:00
7081139bdc [fix](block) fix be core while mutable block merge may cause different row size between columns in origin block (#27943) 2023-12-25 20:35:22 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. Max compute partition prune.
Previously we only supported filtering MC partitions by '=', which can filter only one partition.
To support multiple partition filters and range operators ('>', '<', '>=', ...), partition pruning needs to be supported.

2. add max compute row count cache and partitionValues cache

3. add max compute regression case
2023-12-01 22:28:26 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00
52995c528e [fix](iceberg) iceberg use customer method to encode special characters of field name (#27108)
Fix two bugs:
1. Column matching is case sensitive; change column names to lower case on the FE side for hive/iceberg/hudi.
2. Iceberg uses a custom method to encode special characters in column names. Decode the column name to match the right column in the parquet reader.
2023-11-17 18:38:55 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
b98744ae90 [Bug](iceberg)fix read partitioned iceberg without partition path (#25503)
Iceberg does not require partition values to exist on file paths, so we should get the partition value from `PartitionScanTask.partition`.
2023-10-31 18:09:53 +08:00
47689fd452 [refactor](jni) unified jni framework for java udf (#25302)
Use the unified jni framework to refactor java udf.
The unified jni framework uses `VectorTable` as the container to transfer data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
The performance of basic types remains consistent, with a 30% improvement in string types and an order of magnitude improvement in complex types.
2023-10-18 09:27:54 +08:00
18c2a13e09 [fix](multi-catalog)fix maxcompute partition filter and session creation (#24911)
add maxcompute partition support
fix maxcompute partition filter
modify maxcompute session create method
2023-10-17 22:36:10 +08:00
4e8cde127c [Enhance](catalog)add table cache in paimon jni (#25014)
- fix getting the old schema after refreshing a paimon table
- add a table cache in paimon jni
2023-10-08 10:36:18 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
b9ddcbf729 [feature](merge-cloud) Rewrite code related to IOContext (#24269) 2023-09-15 19:57:58 +08:00
6e28d878b5 [fix](hudi) compatible with hudi spark configuration and support skip merge (#24067)
Fix three bugs:
1. A Hudi slice may have log files only, so `new Path(filePath)` will throw errors.
2. Hive column names are lowercase only, so column names are matched case-insensitively.
3. Compatible with [Spark Datasource Configs](https://hudi.apache.org/docs/configurations/#Read-Options), so users can add `hoodie.datasource.merge.type=skip_merge` in catalog properties to skip merging log files.
2023-09-11 19:54:59 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes, the partitions of a hive table may be on different storage, e.g., some are on HDFS and others on object storage (COS, etc.).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc
    This is because, when accessing a file, the BE gets an hdfs client from the hdfs client cache, and different files in one query
request may have different fs names, e.g., some are `hdfs://` and some are `cosn://`. So we need to specify the fs name
for each file (see the client-cache sketch after this message); otherwise it may return an error like:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`
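A simplified sketch of the idea, with hypothetical types, showing a client cache keyed by the per-file `fs_name`:

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for an HDFS/COSN filesystem client.
struct FsClient {
    explicit FsClient(std::string name) : fs_name(std::move(name)) {}
    std::string fs_name;
};

// Sketch: the client cache is keyed by the fs_name carried per file,
// so "hdfs://..." and "cosn://..." files in the same query resolve to
// different clients instead of one wrong default filesystem.
class FsClientCache {
public:
    std::shared_ptr<FsClient> get_client(const std::string& fs_name) {
        std::lock_guard<std::mutex> lock(_mu);
        auto it = _clients.find(fs_name);
        if (it == _clients.end()) {
            it = _clients.emplace(fs_name, std::make_shared<FsClient>(fs_name)).first;
        }
        return it->second;
    }

private:
    std::mutex _mu;
    std::map<std::string, std::shared_ptr<FsClient>> _clients;
};
```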
2023-08-26 00:16:00 +08:00
a5ca6cadd6 [Improvement] Optimize count operation for iceberg (#22923)
Iceberg has its own metadata, which includes count statistics for table data. If the table does not contain equality deletes, we can get the count of the current table directly from the count statistics.
2023-08-18 09:57:51 +08:00
f2658dc7bd [Feature](multi-catalog) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema. (#22318)
Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema by session var `truncate_char_or_varchar_columns`.
2023-08-10 14:37:20 +08:00
1ed1b69485 [refactor](reader) move reader from vec/exec/scan to vec/exec/format (#22371)
These readers should be in vec/exec/format.
2023-08-04 09:47:20 +08:00
b7a50a09fe [Opt](orc-reader) Optimize orc reader by dict filtering. (#20806)
Optimize orc reader by dict filtering. It is similar to #17594; a rough sketch of the dict-filtering idea follows the benchmark table below.
Test result
**ssb-flat-100**: (3 nodes)
| Query | before opt (s) | after opt (s) |
| ----- | -------------: | ------------: |
| Q1.1 | 1.239 | 1.145 |
| Q1.2 | 1.254 | 1.128 |
| Q1.3 | 1.931 | 1.644 |
| Q2.1 | 1.359 | 1.006 |
| Q2.2 | 1.229 | 0.674 |
| Q2.3 | 0.934 | 0.427 |
| Q3.1 | 2.226 | 1.712 |
| Q3.2 | 2.042 | 1.562 |
| Q3.3 | 1.631 | 1.021 |
| Q3.4 | 1.618 | 0.732 |
| Q4.1 | 2.294 | 1.858 |
| Q4.2 | 2.511 | 1.961 |
| Q4.3 | 1.736 | 1.446 |
| total | 22.004 | 16.316 |
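A rough sketch of the dict-filtering idea under simplifying assumptions (a single equality predicate over a string dictionary), not the actual reader code: the predicate is evaluated once per dictionary entry, and rows are then filtered by their small dictionary codes instead of by decoded strings.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the indexes of rows whose dictionary-encoded value equals `equals_value`.
std::vector<uint32_t> dict_filter(const std::vector<std::string>& dictionary,
                                  const std::vector<uint32_t>& row_codes,
                                  const std::string& equals_value) {
    // Evaluate the predicate once per dictionary entry.
    std::unordered_set<uint32_t> passing_codes;
    for (uint32_t code = 0; code < static_cast<uint32_t>(dictionary.size()); ++code) {
        if (dictionary[code] == equals_value) {
            passing_codes.insert(code);
        }
    }
    // Filter rows by their codes; no per-row string decoding is needed.
    std::vector<uint32_t> selected_rows;
    for (uint32_t row = 0; row < static_cast<uint32_t>(row_codes.size()); ++row) {
        if (passing_codes.count(row_codes[row]) > 0) {
            selected_rows.push_back(row);
        }
    }
    return selected_rows;
}
```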
2023-06-16 13:11:37 +08:00
73ad885e19 [Feature][Fix](multi-catalog) Implements transactional hive full acid tables. (#20679)
After supporting insert-only transactional hive tables in #19518 and #19419, this PR supports transactional hive full acid tables.

Support hive3 transactional hive full acid tables.
Hive2 transactional hive full acid tables need to run major compactions.
2023-06-13 08:55:16 +08:00
9f8de89659 [refactor](exec) replace the single pointer with an array of 'conjuncts' in ExecNode (#19758)
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.

By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.

This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
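A minimal sketch of the conjunct-array approach with hypothetical stand-in types (not the actual ExecNode code): adding a runtime filter is just appending one more element, with no tree-merging needed.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the real block/expression-context types.
struct Block {};
struct VExprContext {
    // Returns true if any rows survive this filter on the given block.
    bool filter_block(Block* /*block*/) const { return true; }
};

class ExecNodeSketch {
public:
    // A runtime filter is added by pushing one more conjunct.
    void add_conjunct(std::shared_ptr<VExprContext> conjunct) {
        _conjuncts.push_back(std::move(conjunct));
    }

    // Each conjunct is applied individually instead of evaluating one merged tree.
    void eval_conjuncts(Block* block) const {
        for (const auto& conjunct : _conjuncts) {
            if (!conjunct->filter_block(block)) {
                break;  // all rows filtered out, stop early
            }
        }
    }

private:
    std::vector<std::shared_ptr<VExprContext>> _conjuncts;
};
```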
2023-05-29 11:47:31 +08:00
339d804ec4 [Refactor](exceptionsafe) add factory creator to some class (#19000) 2023-04-25 14:33:47 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
Pxl
297764b37d [Chore](build) fix some compile fail on gnu20 && remove some unused compatibility codes (#18467) 2023-04-10 18:05:52 +08:00
7c0bcbdca1 [enhance](parquet-reader) cache file meta of parquet to speed up query (#18074)
Problem:
1. FE splits a parquet file into splits, so one file can have several splits.
2. BE scans each split and reads the footer of the parquet file.
3. If 2 splits belong to the same parquet file, the footer of this file will be read twice.

This PR mainly changes:
1. Use a KV cache to cache the footer of the parquet file.
2. The KV cache belongs to a scan node, so all parquet readers belonging to this scan node share the same KV cache.
3. In the cache, the key is "meta_file_path" and the value is the parsed thrift footer.

The KV cache is sharded into multiple sub-caches,
so that different files can use different sub-caches and avoid blocking each other (see the sketch after this message).

In my test, a query with 26 splits reduced the footer parse time from 4s to 1s.
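A simplified sketch of such a sharded footer cache, with hypothetical names and a deliberately simplistic eviction policy (not the actual cache implementation):

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the parsed thrift footer.
struct ParsedFooter {
    std::string file_path;
};

// Keyed by file path and sharded so lookups for different files don't block
// each other; in the real design the cache is scoped to a scan node and
// shared by all of its parquet readers.
class FooterCacheSketch {
public:
    explicit FooterCacheSketch(size_t num_shards = 8, size_t per_shard_capacity = 64)
            : _shards(num_shards), _capacity(per_shard_capacity) {}

    std::shared_ptr<ParsedFooter> get_or_parse(
            const std::string& file_path,
            const std::function<std::shared_ptr<ParsedFooter>()>& parse_footer) {
        Shard& shard = _shards[std::hash<std::string>{}(file_path) % _shards.size()];
        std::lock_guard<std::mutex> lock(shard.mu);
        auto it = shard.map.find(file_path);
        if (it != shard.map.end()) {
            return it->second;  // footer already parsed by a previous split of the same file
        }
        auto footer = parse_footer();  // cache miss: read and parse the footer once
        if (shard.map.size() >= _capacity) {
            shard.map.clear();  // simplistic eviction for the sketch
        }
        shard.map.emplace(file_path, footer);
        return footer;
    }

private:
    struct Shard {
        std::mutex mu;
        std::unordered_map<std::string, std::shared_ptr<ParsedFooter>> map;
    };
    std::vector<Shard> _shards;
    size_t _capacity;
};
```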
2023-03-25 23:22:57 +08:00
cb79e42e5c [refactor](file-system)(step-1) refactor file system on BE and remove storage_backend (#17586)
See #17764 for details
I have tested:
- Unit test for local/s3/hdfs/broker file system: be/test/io/fs/file_system_test.cpp
- Outfile to local/s3/hdfs/broker.
- Load from local/s3/hdfs/broker.
- Query file on local/s3/hdfs/broker file system, with table value function and catalog.
- Backup/Restore with local/s3/hdfs/broker file system

Not test:
- cold & hot data separation case.
2023-03-21 21:08:38 +08:00
b4b126b817 [Feature](parquet-reader) Implements dict filter functionality in parquet reader. (#17594)
Implements dict filter functionality in the parquet reader to improve performance.
2023-03-16 20:29:27 +08:00
ef2fdb79bb [Improvement](parquet-reader) Optimize and refactor parquet reader to improve performance. (#16818)
Optimize and refactor parquet reader to improve performance.
- Improve performance 2x for small dict strings by aligned copying.
- Refactor code to decrease condition (if) checking.
- Don't call skip(0).
- Don't read the page index if there are no conditions.

**ssb-flat-100**: (single-machine, single-thread)
| Query | before opt (s) | after opt (s) |
| ----- | -------------: | ------------: |
| SELECT count(lo_revenue) FROM lineorder_flat | 9.23 | 9.12 |
| SELECT count(lo_linenumber) FROM lineorder_flat | 4.50 | 4.36 |
| SELECT count(c_name) FROM lineorder_flat | 18.22 | 17.88 |
| **SELECT count(lo_shipmode) FROM lineorder_flat** | **10.09** | **6.15** |
2023-02-20 11:42:29 +08:00
de8d884ec3 [Fix](multi catalog)Fix iceberg parquet file doesn't have iceberg.schema meta problem (#16764)
To support schema evolution, Iceberg adds schema information to Parquet file metadata.
But early Iceberg versions didn't write any schema information to the Parquet file.
This PR supports reading parquet files without schema information.
2023-02-16 00:08:59 +08:00
b1b2697cc7 [fix](iceberg) fix iceberg catalog (#16372)
1. Fix iceberg catalog access s3
2. Fix iceberg catalog partition table query
3. Fix persistence
2023-02-05 13:15:28 +08:00
1589d453a3 [fix](multi catalog)Support parquet and orc upper case column name (#16111)
External hms catalog table column names in doris are all lower case,
while an iceberg table or a spark-sql created hive table may contain upper case column names,
which will cause empty query results. This PR fixes the bug.
1. For parquet files, transfer all column names to lower case while parsing parquet metadata.
2. For orc files, store the origin column names and the lower case column names in two vectors, and use the suitable names in different cases.
3. On the FE side, change the column name back to the origin column name in iceberg when doing convertToIcebergExpr.
2023-01-27 23:52:11 +08:00
23edb3de5a [fix](icebergv2) fix bug that delete file reader is not opened (#16133)
PR #15836 changed the way the parquet reader is used: first `open()`, then `init_reader()`.
But we forgot to call `open()` for the iceberg delete file, which caused a coredump.
2023-01-24 10:19:46 +08:00
199d7d3be8 [Refactor]Merged string_value into string_ref (#15925) 2023-01-22 16:39:23 +08:00
3ebc98228d [feature wip](multi catalog)Support iceberg schema evolution. (#15836)
Support iceberg schema evolution for the parquet file format.
Iceberg uses a unique id for each column to support schema evolution.
To support this feature in Doris, the FE side needs to get the current column id for each column and send the ids to the BE side.
BE reads the column ids from parquet key_value_metadata, sets the changed column names in the Block to match the names in the parquet file before reading data, and sets the names back after reading data.
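A simplified sketch of the name-swapping step, using hypothetical types and names (the real code works on `Block` and the parquet metadata structures):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified column container standing in for Block.
struct BlockSketch {
    std::vector<std::string> column_names;
};

// FE sends the current column id per column; the id -> name mapping written by
// Iceberg into parquet key_value_metadata tells us what each column is called
// in this particular file. Rename before reading, restore afterwards.
std::vector<std::string> rename_to_file_schema(
        BlockSketch* block,
        const std::vector<int>& column_ids,                   // one id per Block column, from FE
        const std::map<int, std::string>& file_id_to_name) {  // from parquet metadata
    std::vector<std::string> saved = block->column_names;
    for (size_t i = 0; i < block->column_names.size(); ++i) {
        auto it = file_id_to_name.find(column_ids[i]);
        if (it != file_id_to_name.end()) {
            block->column_names[i] = it->second;  // match the name stored in the file
        }
    }
    return saved;  // caller restores these names after the data is read
}
```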
2023-01-20 12:57:36 +08:00
f17d69e450 [feature](file cache)Import file cache for remote file reader (#15622)
The main purpose of this PR is to import `fileCache` for lakehouse reading of remote files.
Use the local disk as the cache for reading a remote file, so the next time this file is read,
the data can be obtained directly from the local disk.
In addition, this PR includes a few other minor changes.

Import File Cache:
1. The imported `fileCache` is called `block_file_cache`, which uses an LRU replacement policy.
2. Implement a new FileReader `CachedRemoteFileReader`, so that the `file cache` logic is hidden under `CachedRemoteFileReader` (a simplified read-path sketch follows below).

Other changes:
1. Add a new interface `fs()` for `FileReader`.
2. `IOContext` adds some statistical information to record the behavior of `FileCache`.

Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
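A simplified sketch of the cached read path, with in-memory stand-ins for the remote file system and the local-disk block cache (the block size and all names are assumptions for illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical remote reader standing in for the real remote FileReader.
struct RemoteReaderSketch {
    // Pretend remote read: fill the buffer with data for [offset, offset + size).
    void read_at(size_t offset, char* buf, size_t size) {
        (void)offset;
        std::memset(buf, 'x', size);
    }
};

// Reads are served block by block: a block already cached (here an in-memory
// map standing in for the local-disk cache) is returned directly, otherwise it
// is fetched from the remote file system and kept for later reads.
class CachedRemoteReaderSketch {
public:
    explicit CachedRemoteReaderSketch(RemoteReaderSketch remote) : _remote(remote) {}

    void read_at(size_t offset, char* buf, size_t size) {
        const size_t block_size = 1 << 20;  // assumed 1 MB cache blocks
        size_t done = 0;
        while (done < size) {
            size_t block_id = (offset + done) / block_size;
            std::vector<char>& block = get_block(block_id, block_size);
            size_t in_block = (offset + done) % block_size;
            size_t n = std::min(size - done, block_size - in_block);
            std::memcpy(buf + done, block.data() + in_block, n);
            done += n;
        }
    }

private:
    std::vector<char>& get_block(size_t block_id, size_t block_size) {
        auto it = _cache.find(block_id);
        if (it == _cache.end()) {
            std::vector<char> data(block_size);
            _remote.read_at(block_id * block_size, data.data(), block_size);
            it = _cache.emplace(block_id, std::move(data)).first;  // cache miss: fetch and keep
        }
        return it->second;
    }

    RemoteReaderSketch _remote;
    std::map<size_t, std::vector<char>> _cache;  // stands in for the local-disk block cache
};
```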
2023-01-10 12:23:56 +08:00
707eab9a63 [opt](multi-catalog) cache and reuse position delete rows in iceberg v2 (#15670)
A delete file may apply to multiple data files. Each data file reads the full set of delete files,
so a delete file may be read repeatedly. The delete files can be cached, and multiple data files
can reuse the content from the first read.

The performance is improved by 60% in the single-thread case and by 30% in the multi-thread case.
2023-01-07 22:29:11 +08:00
50f1931f96 [fix](multi-catalog) get dictionary-encode from parquet metadata (#15525) 2022-12-31 19:08:10 +08:00
2c8de30cce [optimize](multi-catalog) use dictionary encode&filter to process delete files (#15441)
**Optimize**
PR #14470 used `Expr` to filter delete rows to match the current data file,
but the rows in the delete file are [sorted by file_path then position](https://iceberg.apache.org/spec/#position-delete-files)
to optimize filtering while scanning, so this PR removes `Expr` and uses binary search to filter delete rows (see the sketch after this message).

In addition, delete files are likely to be dictionary encoded, and it is time-consuming to decode the `file_path`
column into `ColumnString`, so this PR uses `ColumnDictionary` to read the `file_path` column.

After testing, the performance of iceberg v2's MOR is improved by 30%+.

**Fix Bug**
A lazy-read block may not have the filter column if the whole group is filtered by `Expr`
and the batch_eof is generated from the next batch.
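A minimal sketch of the binary search over position-delete rows, assuming the rows are already materialized as sorted (file_path, position) pairs:

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Position-delete rows are sorted by (file_path, position) per the Iceberg spec,
// so the rows belonging to one data file form a contiguous range that can be
// located with binary search instead of evaluating an Expr per row.
std::vector<int64_t> positions_for_data_file(
        const std::vector<std::pair<std::string, int64_t>>& delete_rows,  // sorted
        const std::string& data_file_path) {
    auto begin = std::lower_bound(
            delete_rows.begin(), delete_rows.end(), data_file_path,
            [](const auto& row, const std::string& path) { return row.first < path; });
    auto end = std::upper_bound(
            delete_rows.begin(), delete_rows.end(), data_file_path,
            [](const std::string& path, const auto& row) { return path < row.first; });

    std::vector<int64_t> positions;
    for (auto it = begin; it != end; ++it) {
        positions.push_back(it->second);  // already ordered by position
    }
    return positions;
}
```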
2022-12-30 08:57:55 +08:00