Add whether the query uses Nereids and the pipeline engine to the profile, for example:
Summary:
- Profile ID: 460e710601674438-9df2d685bdfc20f8
- Task Type: QUERY
...
- Is Nereids: Yes
- Is Pipeline: Yes
- Is Cached: No
This file is used when compiling Doris in the regression pipeline.
We can modify it to control the compile behavior.
I added BUILD_FS_BENCHMARK=ON so that fs_benchmark_tool is built.
1. Fix a bug where the field of s3_file_write_bufferpool is not initialized, causing undefined behavior.
2. Add the fs_s3 benchmark tool. For usage of the tool, see https://github.com/apache/doris/pull/20770
And optimize the output:
`sh bin/run-fs-benchmark.sh --conf=conf/s3.conf --fs_type=s3 --operation=single_read --threads=1 --iterations=1`
```
------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 7366 ms 123 ms 1 ReadRate(B/S)=12.1823M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6163 ms 116 ms 1 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6048 ms 110 ms 1 ReadRate(B/S)=14.8366M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_mean 6526 ms 116 ms 3 ReadRate(B/S)=13.8596M/s ReadTime(S)=6.52556 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_median 6163 ms 116 ms 3 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_stddev 730 ms 6.68 ms 3 ReadRate(B/S)=1.45914M/s ReadTime(S)=0.729876 ReadTotal(B)=0
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_cv 11.18 % 5.75 % 3 ReadRate(B/S)=10.53% ReadTime(S)=11.18% ReadTotal(B)=0.00%
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_max 7366 ms 123 ms 3 ReadRate(B/S)=14.8366M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_min 6048 ms 110 ms 3 ReadRate(B/S)=12.1823M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M
```
* Workaround: when ingesting binlog after backup/restore, local_tablet.partition_id is not correct, so use
req.partition_id instead
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
Agg stats estimation should use the largest NDV among the group-by keys as the base and multiply it by an expansion factor, which is calculated from the NDVs of the other group-by keys.
Before, we used the smallest NDV as the base.
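A minimal sketch of the idea, not the actual Nereids code. The message above does not give the expansion-factor formula, so the `damping` exponent and the function name `estimate_group_count` are assumptions made only for illustration; the part taken from the description is that the largest NDV is the base and the other keys only scale it.

```python
def estimate_group_count(ndvs, row_count, damping=0.5):
    """Estimate how many groups a GROUP BY produces.

    ndvs      -- NDV of each group-by key
    row_count -- rows entering the aggregate; the estimate never exceeds it
    damping   -- hypothetical exponent that weakens each extra key's effect
    """
    if not ndvs:
        return 1.0  # no group-by key: one global group
    ordered = sorted(ndvs, reverse=True)
    base = ordered[0]  # the largest NDV is the base
    expansion = 1.0
    for ndv in ordered[1:]:  # the other keys only expand the base
        expansion *= max(1.0, ndv) ** damping
    return min(float(row_count), base * expansion)
```

With this assumed formula the result does not depend on the order of the keys, and it is capped by the input row count.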
Support binding external relations outside the Doris FE environment, for example, when analyzing SQL in another Java application.
See BindRelationTest.bindExternalRelation.
This PR adds a function for collecting Hive statistics. When the CBO fetches Hive table statistics, the statistics cache
first loads them from the internal stats OLAP table. If they are not found, it uses this PR's function to fetch them from the remote Hive metastore.
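The lookup order described above can be sketched as follows. This is an illustration only: `StatsCache`, `load_internal`, and `fetch_remote` are hypothetical names, not the classes or methods used in Doris.

```python
class StatsCache:
    """Look up table statistics: cache, then internal table, then remote HMS."""

    def __init__(self, load_internal, fetch_remote):
        self._load_internal = load_internal  # reads the internal stats table
        self._fetch_remote = fetch_remote    # asks the remote Hive metastore
        self._cache = {}

    def get(self, table):
        if table in self._cache:
            return self._cache[table]
        stats = self._load_internal(table)
        if stats is None:
            # Not collected internally yet: fall back to the remote metastore.
            stats = self._fetch_remote(table)
        if stats is not None:
            self._cache[table] = stats
        return stats
```

Both loaders are passed in as callables so that the fallback order can be tested without a real metastore.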
Keep hadoop-aliyun version consistent with hadoop main version (3.3.5)
upgrade jackson to 2.14.3
upgrade netty version to 4.1.94.Final
bind check.framework version to 3.32.0
upgrade snappy-java to 1.1.10.1
upgrade hudi version to 0.13.1
upgrade spring version to 2.7.13
upgrade orc version to 1.8.4
revert nonsensical changes
In PR #21168, we refactored physical properties and the translator
to ensure that no useless exchange is generated. An olap scan node
could be gather in Nereids but be translated to hash partitioned.
Since the coordinator cannot process a gather olap scan node,
we remove the candidate distribution spec of olap scan.
When creating a new hive catalog or refreshing a hive catalog, the HiveMetaStore cache is refreshed.
The refresh calls "FileInputFormat.setInputPaths()".
This method creates a new FileSystem instance and stores it in FileSystem's cache.
So if the catalog is refreshed frequently, too many FileSystem instances accumulate in the cache, causing OOM.
This PR disables the FileSystem cache.
Try to reuse an existing ugi in DFSFileSystem. Otherwise, if we query an hms table with more than ten thousand partitions, we will do more than ten thousand login operations, and each login operation costs hundreds of ms in my test.
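A rough sketch of the reuse pattern, assuming the ugi can be keyed by principal and keytab. `UgiCache` and the `login` callable are hypothetical stand-ins and do not reflect the real DFSFileSystem code.

```python
import threading


class UgiCache:
    """Log in once per (principal, keytab) and reuse the result afterwards."""

    def __init__(self, login):
        self._login = login  # the expensive login operation
        self._ugis = {}
        self._lock = threading.Lock()

    def get(self, principal, keytab):
        key = (principal, keytab)
        with self._lock:
            ugi = self._ugis.get(key)
            if ugi is None:
                # First request for this identity: pay the login cost once.
                ugi = self._login(principal, keytab)
                self._ugis[key] = ugi
            return ugi
```

With such a cache, scanning many partitions under the same identity triggers a single login instead of one per partition.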
Co-authored-by: 王翔宇 <wangxiangyu@360shuke.com>
testEliminatingSortNode needs to check whether a SortNode exists in the plan tree, so it should check plan1.contains("order by:") rather than plan1.contains("SORT INFO:") or plan1.contains("SORT LIMIT:").
introduced by #19031
FE could not recover because there is a convert-to-olap-table operation in the code, but many table types are not olap tables, such as view, jdbc table ...
The conversion fails and FE does not start correctly.
Co-authored-by: yiguolei <yiguolei@gmail.com>
1. Prune the hash join output slot id list based on the slot ids in the required project and other conjunctions, to reduce BE-side effort.
2. Support pruning for semi/anti joins as well.
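The pruning in item 1 can be pictured roughly like this. It is a simplified sketch with a hypothetical function name and plain integer slot ids, not the planner's real data structures.

```python
def prune_hash_join_output(join_output_slot_ids, project_slot_ids, other_conjunct_slot_ids):
    """Keep only the join output slots that something above the join still needs.

    A slot survives if the required projection reads it or if one of the
    other conjunctions references it; every other slot is dropped.
    """
    required = set(project_slot_ids) | set(other_conjunct_slot_ids)
    # Preserve the original output order so slot positions stay stable.
    return [slot for slot in join_output_slot_ids if slot in required]
```

For example, with output slots `[1, 2, 3, 4, 5]`, a projection on `[2, 5]`, and another conjunction on `[3]`, only `[2, 3, 5]` remain.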
Jemalloc dirty pages only use madvise MADV_FREE, so memory is not released back to the system and RSS does not drop in time.
So when the process memory exceeds the limit or the system's available memory is insufficient,
manually transfer dirty pages to muzzy pages, which will call MADV_DONTNEED to release the physical memory back to the system.
https://jemalloc.net/jemalloc.3.html#opt.dirty_decay_ms
Currently, when a columnIter is used for seek, the page cache is not set;
when this columnIter is later used to read data, the page cache cannot be used.
This PR
1. refactors physical properties, the property deriver, and the property regulator
to ensure Nereids can generate plans with sufficient PhysicalDistribute.
2. refactors PhysicalPlanTranslator to ensure all ExchangeNodes are generated
by PhysicalDistribute, except for CTEConsumer. We will refactor all CTE-related
nodes later.
the detail changes of this PR:
1. update DistributionSpec of physical properties:
- Any: random distribution, used in output and require
- StorageAny: random distribution but constrained by where the data is stored, used in output
- ExecutionAny: random distribution to represent random shuffle, used in output
- Gather: gather distribution, used in output and require
- StorageGather: gather distribution but constrained by where the data is stored, used in output
- Replicated: broadcast distribution
- Hash: bucket distribution
2. update shuffle type of DistributionSpecHash
- REQUIRE: used in require
- NATURAL: distribution as storage engine hash algorithm, constrained by where the data is stored
- STORAGE_BUCKETED: distribution as storage engine hash algorithm
- EXECUTION_BUCKETED: distribution as execution engine hash algorithm
3. update HideOneRowRelationUnderSetOperation to MergeOneRowRelationIntoSetOperation
4. update property deriver of SetOperation to ensure a suitable PhysicalDistribute is added
above and below SetOperation
5. refactor PhysicalPlanTranslator to ensure no unplanned exchange node will be added