Update the broadcast join cost estimate to match the BE implementation.
There is an enhancement on the BE side: in a broadcast join, the BE now builds only one hash table, not instanceNum hash tables.
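A rough sketch of the corrected build-side estimate (illustrative names, not the planner's actual cost variables):

    old: buildCost ≈ buildRows × instanceNum   (assumed one hash table per instance)
    new: buildCost ≈ buildRows                 (one shared hash table per BE)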
1. Support more expression types
2. Support deriving with histograms
3. Use StatisticRange to abstract the logic
4. Use Statistics rather than StatsDeriveResult
After adding a unique ID, unRankTest fails because each plan has a different ID in its string representation.
To avoid the effect of the unique ID, compare the plan with the expected output rather than comparing the strings.
* [Feature](vectorized)(quantile_state): support vectorized quantile state functions
1. Currently the quantile column only supports non-nullable values (see the usage sketch below)
2. Add some regression test cases
3. Set enable_quantile_state_type = true by default
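A minimal usage sketch, assuming the documented quantile state functions TO_QUANTILE_STATE, QUANTILE_UNION, and QUANTILE_PERCENT (table name and values are illustrative):

    CREATE TABLE quantile_demo (
        dt DATE,
        q QUANTILE_STATE QUANTILE_UNION NOT NULL  -- quantile column must be non-nullable
    ) AGGREGATE KEY(dt)
    DISTRIBUTED BY HASH(dt) BUCKETS 1
    PROPERTIES ("replication_num" = "1");

    -- 2048 is the compression parameter of the underlying TDigest
    INSERT INTO quantile_demo VALUES ('2023-01-01', TO_QUANTILE_STATE(10, 2048));
    SELECT QUANTILE_PERCENT(QUANTILE_UNION(q), 0.5) FROM quantile_demo;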
---------
Co-authored-by: spaces-x <weixiang06@meituan.com>
1. introduce a new type `VARIANT` to encapsulate dynamically generated columns, hiding the details of the types and names of newly generated columns (see the sketch below)
2. introduce a new expression `SchemaChangeExpr` for doing schema changes, for extensibility
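A minimal sketch of what this enables, assuming the `VARIANT` type is exposed in DDL (table name, column names, and the JSON subcolumns are illustrative):

    CREATE TABLE var_demo (
        id BIGINT,
        v VARIANT
    ) DUPLICATE KEY(id)
    DISTRIBUTED BY HASH(id) BUCKETS 1
    PROPERTIES ("replication_num" = "1");

    -- subcolumns generated from loaded JSON stay hidden behind the single VARIANT column
    SELECT v['user']['name'] FROM var_demo;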
WITH t0 AS (
    SELECT report.date1 AS date2 FROM (
        SELECT DATE_FORMAT(date, '%Y%m%d') AS date1 FROM cir_1756_t1
    ) report GROUP BY report.date1
),
t3 AS (
    SELECT DATE_FORMAT(date, '%Y%m%d') AS date3
    FROM cir_1756_t2
)
SELECT row_number() OVER (ORDER BY date2)
FROM (
    SELECT t0.date2 FROM t0 LEFT JOIN t3 ON t0.date2 = t3.date3
) tx;
The DATE_FORMAT(date, '%Y%m%d') expression was calculated in the GROUP BY node, which is wrong; this expression should be calculated inside the subquery.
Some HDP/CDH Hive versions use gzip to compress the message body of the HMS NotificationEvent,
so com.qihoo.finance.hms.event.MetastoreEventFactory cannot handle it correctly.
1. If a BE is dead and its IP has not been changed by FQDNManager, after a while the old IP may be reused by another newly alive pod; two BEs would then share the same IP, which is unexpected.
2. When enable_fqdn is false, users can still set a hostname for a BE when adding a backend.
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
This PR introduces a Splitter interface for external tables.
The Splitter interface contains one method, getSplits, which is used by QueryScanProvider to get the external file splits.
For Hive/Iceberg/TVF, a split is a file block; for ES, it is a shard.
This PR also moves the getSplits logic in FileScanProviderIf to the new Splitter interface.
In the future, we may unify internal tables as well.
For some reason, the number of replicas that successfully publish a transaction may be less than the quorum,
so the transaction's status cannot become VISIBLE. When the last publish failed, the publish task for that
replica of that tablet on that backend needs to be retried until publish succeeds, so that the transaction
can become VISIBLE.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
If a table has no partitions, backup reports an error:
2023-03-06 17:35:32,971 ERROR (backupHandler|24) [Daemon.run():118] daemon thread got exception. name: backupHandler
java.util.NoSuchElementException: No value present
at java.util.Optional.get(Optional.java:135) ~[?:1.8.0_152]
at org.apache.doris.catalog.OlapTable.selectiveCopy(OlapTable.java:1259) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.backup.BackupJob.prepareBackupMeta(BackupJob.java:505) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.backup.BackupJob.prepareAndSendSnapshotTask(BackupJob.java:398) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.backup.BackupJob.run(BackupJob.java:301) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.backup.BackupHandler.runAfterCatalogReady(BackupHandler.java:188) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.0-SNAPSHOT]
at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.0-SNAPSHOT]
In the past, only simple predicates (slot = const), AND, LIKE, OR (only with a bitmap index) could be pushed down to the storage layer. Scan process:
1. Read part of the columns first, and calculate the row ids with the simple pushed-down predicates.
2. Use the row ids to read the remaining columns and pass them to the scanner; the scanner filters with the remaining predicates.
This PR also pushes the remaining predicates (functions, nested predicates, ...) in the scanner down to the storage layer for filtering (illustrated below). Scan process:
1. Read part of the columns first, and use the simple pushed-down predicates to calculate the row ids (same as above).
2. Use the row ids to read the columns needed by the remaining predicates, and use the pushed-down remaining predicates to reduce the number of row ids again.
3. Use the row ids to read the remaining columns and pass them to the scanner.
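As an illustration (table and predicates are hypothetical):

    SELECT * FROM example_tbl
    WHERE k1 = 10          -- simple predicate: pushed down before and after this PR
      AND abs(k2) > 100;   -- function predicate: previously filtered in the scanner, now also filtered in the storage layer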
Notable changes:
1. Support canceling queries for MySQL load (the statement form is sketched below)
2. Change the thread pool for the MySQL load manager
3. Fix the secure path check logic
4. Fix some doc errors
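For context, MySQL load is driven by the MySQL client's LOAD DATA syntax; a typical statement looks like this (database, table, and file path are illustrative):

    LOAD DATA LOCAL INFILE '/path/to/client_data.csv'
    INTO TABLE example_db.example_tbl
    COLUMNS TERMINATED BY ','
    LINES TERMINATED BY '\n';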
Before this PR, adding or dropping an inverted index did not change the table state, so multiple alter jobs could be executed at the same time, which may lead to some unexpected problems.
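For reference, the alter statements in question take this form (index, table, and column names are illustrative); after this PR they should hold the table in an altering state while the job runs, so that alter jobs do not overlap:

    CREATE INDEX idx_comment ON example_tbl (comment) USING INVERTED;
    DROP INDEX idx_comment ON example_tbl;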