1. stop forcing slots on EmptySetNode to be nullable.
2. order by xxx desc should use nulls last as the default null ordering.
3. don't create runtime filters if the runtime filter mode is OFF.
4. for group by on a constant value, check that the corresponding expr doesn't contain any aggregate functions.
5. fix the reorder bug for two left outer joins (A left join B left join C).
6. fix the reorder bug for semi join and left outer join (A left join B semi join C).
7. fix group by NULL bug.
8. change the ceil and floor functions to the correct signatures.
9. add literal comparison for string and date types.
10. fix the getOnClauseUsedSlots method, which may not return a valid value.
11. the tightest common type of string and date should be date.
12. fix the nullability of the set operation node's result exprs, which was not set correctly.
13. Sort node should remove redundant ordering exprs.
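Items 2 and 13 can be illustrated with a small sketch. This is not the planner code: `OrderKey`, `resolve_nulls_first` and `dedup_order_keys` are hypothetical names, "asc defaults to nulls first" is an assumption (the list above only states the desc case), and "redundant" is taken here to mean an ordering expr that repeats an earlier one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OrderKey:
    """Hypothetical stand-in for one ORDER BY element."""
    expr: str                           # textual form of the ordering expression
    ascending: bool = True
    nulls_first: Optional[bool] = None  # None means the query did not specify it


def resolve_nulls_first(key: OrderKey) -> bool:
    """Return the effective null ordering for a key."""
    if key.nulls_first is not None:
        return key.nulls_first
    # desc -> nulls last (item 2); asc -> nulls first is an assumption here.
    return key.ascending


def dedup_order_keys(keys: List[OrderKey]) -> List[OrderKey]:
    """Drop ordering exprs that repeat an earlier expr.

    Rows that tie on the first occurrence of an expr are equal on that expr,
    so a later occurrence can never change the result order.
    """
    seen = set()
    result = []
    for key in keys:
        if key.expr in seen:
            continue
        seen.add(key.expr)
        result.append(OrderKey(key.expr, key.ascending, resolve_nulls_first(key)))
    return result
```

For `order by a desc, b, a` this sketch keeps only `a desc` (nulls last) and `b`.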
How to reproduce?
Add `export CMAKE_BUILD_TYPE=DEBUG` to custom_env.sh, then build thirdparty on macOS.
There are two problems:
Building vectorscan with the DEBUG type fails with an unused-but-set-variable error:
doris/thirdparty/src/vectorscan-vectorscan-5.4.7/src/nfa/mcclellancompile.cpp:1485:13: error: variable 'total_daddy' set but not used [-Werror,-Wunused-but-set-variable]
u16 total_daddy = 0;
With the DEBUG build type, gflags outputs libgflags_debug.a instead of libgflags.a, so later builds fail because the gflags library cannot be found.
To avoid these errors, we set CMAKE_BUILD_TYPE when building vectorscan and gflags.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
The original scan pools are in exec_env.
But after new_load_scan_node was enabled by default, the scan pool in exec_env is no longer used.
All scan tasks are submitted to the scan pool in scanner_scheduler.
BTW, reorganize the scan pools into 3 kinds:
1. local scan pool: for olap scan node
2. remote scan pool: for file scan node
3. limited scan pool: for queries which set a cpu resource limit or have a small limit clause
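The routing between the three pools can be sketched roughly as follows. This is only an illustration, not the scanner_scheduler code: `ScanTask`, `choose_pool` and `SMALL_LIMIT_THRESHOLD` are hypothetical, and checking the limited pool before the node type is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScanPool(Enum):
    LOCAL = "local"      # olap scan node
    REMOTE = "remote"    # file scan node
    LIMITED = "limited"  # cpu resource limit or small limit clause


@dataclass
class ScanTask:
    """Hypothetical description of a scan task to be scheduled."""
    node_type: str                   # "olap" or "file" (illustrative labels)
    cpu_limit: Optional[int] = None  # set when the query has a cpu resource limit
    row_limit: Optional[int] = None  # value of the LIMIT clause, if any


# Hypothetical cutoff for what counts as a "small" limit clause.
SMALL_LIMIT_THRESHOLD = 1024


def choose_pool(task: ScanTask) -> ScanPool:
    """Pick a pool for a task; the limited pool is checked first (assumption)."""
    if task.cpu_limit is not None:
        return ScanPool.LIMITED
    if task.row_limit is not None and task.row_limit <= SMALL_LIMIT_THRESHOLD:
        return ScanPool.LIMITED
    if task.node_type == "olap":
        return ScanPool.LOCAL
    return ScanPool.REMOTE
```

Keeping limited queries in their own pool means they cannot occupy threads of the local or remote pools.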
TODO:
Use bthread to unify all IO tasks.
Some trivial issues:
fix the bug that the memtable flush size printed in the log is wrong
Add a RuntimeProfile param to VScanner
In #15037, I modified the build script of libgsasl to enable GSSAPI,
but it still does not work, because PATH does not include `thirdparty/installed/bin`.
As a result, building libgsasl reports:
`WARNING: MIT Kerberos krb5-config not found, disabling GSSAPI`
even though `krb5-config` is in `thirdparty/installed/bin`.
Without GSSAPI, libhdfs3 cannot access HDFS with Kerberos authentication.
On macOS, we need some extra libraries to build the codebase,
so two packages were introduced to the project: `binutils` and `gettext`.
Building these packages completely takes a lot of time. This PR introduces a way to build only the needed libraries
and skip everything else, which saves time when building the third-party libraries on macOS.
* [regression-test](mtmv) add mtmv write data regression test
Support a new table-valued function `iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")`.
We can use the SQL `select * from iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")` to get the snapshot info of a table. Other Iceberg metadata will be supported later when needed.
One use case:
Before running a time travel query such as:
`select * from ice_table FOR TIME AS OF "2022-10-10 11:11:11"`;
`select * from ice_table FOR VERSION AS OF "snapshot_id"`;
we can query the snapshots metadata to get the `committed time` or `snapshot_id`,
and then use it as the time or version in the time travel clause.
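The two-step workflow can be sketched as a few query-building helpers. The function names are hypothetical, the SQL text simply mirrors the statements quoted above (including their quoting), and no column names of the snapshots result are assumed; the caller passes in the values it read from that result.

```python
def snapshots_query(table: str) -> str:
    """Step 1: list the snapshots of an Iceberg table via iceberg_meta."""
    return f'select * from iceberg_meta("table" = "{table}", "query_type" = "snapshots")'


def time_travel_by_time(table: str, committed_time: str) -> str:
    """Step 2a: read the table as of a committed time taken from step 1."""
    return f'select * from {table} FOR TIME AS OF "{committed_time}"'


def time_travel_by_version(table: str, snapshot_id: str) -> str:
    """Step 2b: read the table as of a snapshot_id taken from step 1."""
    return f'select * from {table} FOR VERSION AS OF "{snapshot_id}"'
```

The helpers only format strings; a real client should validate or escape the inputs before sending the statement.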
1. fix a bug in the estimation of min/max of Year
2. remove Utils.getLocalDatetimeFromLong(Long). This method throws an exception if the input parameter is too big, and it is no longer used once the above bug is fixed.