Files
doris/fe
Calvin Kirs 30e2c3fb11 [feat](lock)add deadlock detection tool and monitored lock implementations #39015 (#39099)
## Proposed changes
#39015
### Description:

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:


#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms 
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```












#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


```
##### benchmark
```
    @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @Threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @Threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-08 21:15:49 +08:00
..

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

# fe-common

This module is used to store some common classes of other modules.

# spark-dpp

This module is Spark DPP program, used for Spark Load function.
Depends: fe-common

# fe-core

This module is the main process module of FE.
Depends: fe-common, spark-dpp