## Proposed changes
#39015
### Description:
This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:
#### AbstractMonitoredLock:
A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59 [ Thread-2:2006 ] - [ WARN ] Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
at java.lang.Thread.getStackTrace(Thread.java:1564)
at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
at org.example.ExampleService.timeout(ExampleService.java:17)
at org.example.Main.lambda$test2$1(Main.java:39)
at java.lang.Thread.run(Thread.java:750)
```
#### DeadlockCheckerTool:
Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.
**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$3(Main.java:79)
at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
2024-08-07 14:11:28 [ pool-1-thread-1:2001 ] - [ WARN ] Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$4(Main.java:93)
at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
```
##### benchmark
```
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(1)
Benchmark Mode Cnt Score Error Units
LockBenchmark.testMonitoredLock thrpt 2 15889.407 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2 678.061 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2 56.000 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2 668.249 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2 55.080 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2 0.075 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2 0.006 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2 20.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2 6.000 ms
LockBenchmark.testNativeLock thrpt 2 103130.635 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2 ≈ 10⁻⁴ MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2 ≈ 10⁻⁶ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2 ≈ 0 counts
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(100)
Benchmark Mode Cnt Score Error Units
LockBenchmark.testMonitoredLock thrpt 2 10994.606 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2 488.508 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2 56.002 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2 481.390 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2 55.163 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2 0.020 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2 0.002 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2 18.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2 9.000 ms
LockBenchmark.testNativeLock thrpt 2 558652.036 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2 0.016 MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2 ≈ 10⁻⁴ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2 ≈ 0 counts
```
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
## Proposed changes
1. Use column idx of ref block instead of new block to indicate the ref
column.
2. Rename some variables to clarify their meanings.
3. Clarify some log msg.
4. Add a minimal case to verify the change.
…38731)
https://github.com/apache/doris/pull/38731
The stddev function has a separate implementation for the DecimalV2
type, but there are issues with the implementation. Given that there is
almost no existing data for DecimalV2, it will be removed here. For be,
upgrading to this situation will result in an error directly.
```
SELECT STDDEV(data) FROM DECIMALV2_10_0_DATA;
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Agg Function stddev(decimal(10,0)) is not implemented
```
After removing DecimalV2, parameters of type DecimalV2 will be converted
to double for calculations.
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
…(#37760)
MoW will update delete bitmap during load, and the page cache could be
modified by segcompaction. Disable page cache touchs when doing
segcompaction could solve this problem.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>
We should lock table when generate distribute plan, because insert overwrite by async materialized view will drop partitions parallel, and query thread will throw exception:
```
java.lang.RuntimeException: Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
at org.apache.doris.nereids.util.Utils.execWithUncheckedException(Utils.java:76) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.glue.translator.PhysicalPlanTranslator.translatePlan(PhysicalPlanTranslator.java:278) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.NereidsPlanner.splitFragments(NereidsPlanner.java:341) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.NereidsPlanner.distribute(NereidsPlanner.java:400) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.NereidsPlanner.plan(NereidsPlanner.java:147) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:796) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:605) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:558) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:548) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:385) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:237) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:260) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:288) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:342) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
at org.apache.doris.planner.OlapScanNode.mockRowCountInStatistic(OlapScanNode.java:589) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.planner.OlapScanNode.finalizeForNereids(OlapScanNode.java:1733) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.util.Utils.execWithUncheckedException(Utils.java:74) ~[doris-fe.jar:1.2-SNAPSHOT]
... 17 more
2024-07-29 00:46:17,608 WARN (mysql-nio-pool-114|201) Analyze failed. stmt[210035, 49d3041004ba4b6a-b07fe4491d03c5de]
org.apache.doris.common.NereidsException: errCode = 2, detailMessage = Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:803) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:605) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:558) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:548) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:385) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:237) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:260) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:288) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:342) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
```
this exception is too hard to reproduce, so I can not write a test case
## Proposed changes
Now, Agg's child predicates will not spread to agg.
For example:
select a, sum(b)
from (
select a,b from t where a = 1 and b = 2
) t
group by a
`a = 1` in scan can be propagated to `a` of agg.
But `b = 2` in scan can not be propagated to `sum(b)` of agg.
Issue Number: #38905
<!--Describe your changes.-->
Co-authored-by: liutang123 <liulijia@gmail.com>
If a block (>128M) is dequeue by local exchange source operator and it
is the last block, both of source operators and sink operators will be
hang. This PR fixed it.
pick #38885
pick #38514
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1722279016 (unix time) try "date -d @1722279016" if you
are using GNU date ***
*** Current BE git commitID: e9f12fac47e ***
*** SIGSEGV unknown detail explain (@0x0) received by PID 1116227 (TID
1116498 OR 0x7f009ac00640) from PID 0; stack trace: *** 0#
doris::signal::(anonymous namespace)::FailureSignalHandler(int,
siginfo_t*, void*) at
/home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0]
in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 2#
JVM_handle_linux_signal in
/usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F01E49B0520 in /lib/x86_64-linux-gnu/libc.so.6
4# pthread_mutex_lock at ./nptl/pthread_mutex_lock.c:80
5# doris::pipeline::MultiCoreTaskQueue::take(unsigned long) at
/home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/task_queue.cpp:154
6# doris::pipeline::TaskScheduler::_do_work(unsigned long) at
/home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/task_scheduler.cpp:268
7# doris::ThreadPool::dispatch_thread() in
/mnt/disk1/STRESS_ENV/be/lib/doris_be
8# doris::Thread::supervise_thread(void*) at
/home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/thread.cpp:499
9# start_thread at ./nptl/pthread_create.c:442
10# 0x00007F01E4A94850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
## Proposed changes
Issue Number: close #xxx
<!--Describe your changes.-->
enable run match expression outer of filter plan, e.g join conjunct
support eliminate outer join by match expression, if any arguments of match expression is null literal