Use SIMD stringsearcher and SIMD memcmp optimze split_by_string and substring_index function.
split_by_string function has 32%~540% up
substring_index function has 22%~46% up
Performance difference depends on the needle size and whether the needle is constant param. And the longer the needle, the more performance improvement
When the be_exec_version is less than 2, murmurhash will still be used, otherwise crc32 will be used. When the be_exec_version is upgraded to 2, please remove.
1. If we set hadoop user property along with kerberos info, the authentication will fail.
2. fix some minor issue of local fs, follow up #18397
3. Add KW_HOSTNAME to keywords region, follow up #17329
4. Fix tvf not working with pipeline engine, follow up #18376
Sometimes, `show load profile` will only show part of the insert opertion's profile.
This is because we assume that for all load operation(including insert), there is only one fragment in the plan.
But actually, there will be more than 1 fragment in plan. eg:
`insert into tbl1 select * from tbl1 limit 1` will have 2 fragments.
This PR mainly changes:
1. modify the `show load profile`
Before: `show load profile "/queryid/taskid/instanceid";`
After: `show load profile "/queryid/taskid/fragmentid/instanceid";`
2. Modify the display of `ReadColumns` in OlapScanNode
Because for wide table, the line of `ReadColumns` may be too long for show in profile.
So I wrap it and each line contains at most 10 columns names.
3. Fix tvf not working with pipeline engine, follow up #18376
Optimize concat function 29% up by memcpy_small_allow_read_write_overflow15.
Optimize string functions list: concat, convert_to, mask, initcap, lower, upper.
concat function has 29% up:
Optimize constant empty string compare:
(1) When the constant empy string '' (size is 0), we can compare offsets in SIMD directly.
q10: SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
q11: SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
q12: SELECT SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
q13: SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
q14: SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
Issue Number: close #xxx
BE will core dump while use whole/sub file cache.
Call func CachedRemoteFileReader/WholeFileCache/SubFileCache::read_at_impl() did not pass IOContext when reading segment footer.
In the current implementation of the function of dynamically add and drop inverted index, there is a problem that the inverted index information of historical data is out of date after compaction on the base tablet.
In the future, I will submit PRs to solve this problem. Now, temporarily add or drop inverted index by the directly schema change logic
When processing data in hash table for right join and full outer join, if the output data rows of one hash bucket excceeds batch size, the logic when continue processing this bucket is wrong, it should differentiate between different join types.
Fix tow bugs:
1. Enabling file caching requires both `FE session` and `BE` configurations(enable_file_cache=true) to be enabled.
2. `ParquetReader` has not used `IOContext` previously, but `CachedRemoteFileReader::read_at` needs `IOContext` after PR(#17586).
#18015 enables stream load profile log, however be will encounter rpc fail when loading tpch data(see #18291). This is because when `is_report_success` is true, be will reportExecStatus to fe, but fe cannot find QueryInfo in `coordinatorMap`, thus it will return error to be.
Co-authored-by: ByteYue <[yj976240184@gmail.com](mailto:yj976240184@gmail.com)>
This PR is an optimization for https://github.com/apache/doris/pull/17478:
1. Change the buffer size of `LineReader` to 4MB to align with the size of prefetch buffer.
2. Lazily prefetch data in the first read to prevent wasted reading.
3. S3 block size is 32MB only, which is too small for a file split. Set 128MB as default file split size.
4. Add `_end_offset` for prefetch buffer to prevent wasted reading.
The query performance of reading data on object storage is improved by more than 3x+.
jdbc read array type get result from Doris is string, PG is java.sql.array, CK is java.lang.object
it's difficult to maintain and read the code,
so change all database's array result to string, then add a cast function from string to doris array type
When full clone, if the max version of the local table is less than or equal to the max version of the clone table, there is no need to calculate the delete bitmap again.
For scan node with no vectorized predicate, the input column for the first short-circuit predicate is dense and we don't need to access the selector column.
This PR improve performance by ~30% on TPCH Q3.
We want to use file cache for caching cold data in S3.
When reading them, we want to know where the data come from and the time taken to read the datas.
So we support the metrics in olap scan node.
And for clearing the information, i also update the fields about the metrics.