## Proposed changes
#39015
### Description
This PR adds a deadlock detection tool and monitored lock implementations. These features help identify and debug potential deadlocks and monitor lock usage.
#### AbstractMonitoredLock
A monitored version of `Lock` that tracks and logs lock acquisition and release times.

Functionality:
- Overrides the `lock()`, `unlock()`, `tryLock()`, and `tryLock(long timeout, TimeUnit unit)` methods.
- Logs the lock acquisition time, the release time, and any failure to acquire the lock within the specified timeout.

##### Example
```log
2024-08-07 12:02:59 [ Thread-2:2006 ] - [ WARN ] Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
at java.lang.Thread.getStackTrace(Thread.java:1564)
at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
at org.example.ExampleService.timeout(ExampleService.java:17)
at org.example.Main.lambda$test2$1(Main.java:39)
at java.lang.Thread.run(Thread.java:750)
```
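A minimal sketch of such a wrapper (hypothetical code; the real `AbstractMonitoredLock` also instruments `tryLock` and logs the holding thread's stack trace, as in the log above):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a monitored lock: record the acquisition time and warn on unlock
// when the hold time exceeds a configured threshold.
class MonitoredLock extends ReentrantLock {
    private final long holdTimeoutMs;
    // Acquisition timestamp per thread. Note: a reentrant re-acquisition
    // overwrites it; the real implementation would track hold counts.
    private final ThreadLocal<Long> acquiredAt = new ThreadLocal<>();

    MonitoredLock(long holdTimeoutMs) {
        this.holdTimeoutMs = holdTimeoutMs;
    }

    @Override
    public void lock() {
        super.lock();
        acquiredAt.set(System.currentTimeMillis());
    }

    @Override
    public void unlock() {
        Long start = acquiredAt.get();
        if (start != null) {
            long heldMs = System.currentTimeMillis() - start;
            if (heldMs > holdTimeoutMs) {
                // The real implementation logs a WARN with the stack trace.
                System.err.println(Thread.currentThread().getName()
                        + " held lock for " + heldMs + " ms, exceeding "
                        + holdTimeoutMs + " ms");
            }
            acquiredAt.remove();
        }
        super.unlock();
    }
}
```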
#### DeadlockCheckerTool
Uses a `ScheduledExecutorService` to run periodic deadlock checks and logs deadlock information, including thread names, states, lock info, and stack traces.

**`ThreadMXBean` reads thread information from the local JVM, which is already in memory, so accessing it is far cheaper than fetching data from external resources such as disk or network. The JVM also typically maintains a cache of thread states, reducing the need for real-time calculation or additional data processing.**

##### Example
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$3(Main.java:79)
at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
2024-08-07 14:11:28 [ pool-1-thread-1:2001 ] - [ WARN ] Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
at org.example.Main.lambda$testDeadLock$4(Main.java:93)
at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
at java.lang.Thread.run(Thread.java:750)
```
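The core of such a checker can be sketched with the standard `ThreadMXBean` API (a hypothetical sketch; the actual tool's class names and report format differ):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: poll ThreadMXBean periodically and report deadlocked threads.
class DeadlockChecker {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();

    void start(long periodSeconds) {
        scheduler.scheduleAtFixedRate(this::check, periodSeconds, periodSeconds,
                TimeUnit.SECONDS);
    }

    void check() {
        long[] ids = threadBean.findDeadlockedThreads(); // null when none detected
        if (ids == null) {
            return;
        }
        // true, true requests locked-monitor and locked-synchronizer details.
        for (ThreadInfo info : threadBean.getThreadInfo(ids, true, true)) {
            System.err.println("Thread Name: " + info.getThreadName()
                    + "\nThread State: " + info.getThreadState()
                    + "\nLock Name: " + info.getLockName()
                    + "\nLock Owner Name: " + info.getLockOwnerName());
        }
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```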
##### Benchmark
```
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(1)
Benchmark Mode Cnt Score Error Units
LockBenchmark.testMonitoredLock thrpt 2 15889.407 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2 678.061 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2 56.000 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2 668.249 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2 55.080 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2 0.075 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2 0.006 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2 20.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2 6.000 ms
LockBenchmark.testNativeLock thrpt 2 103130.635 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2 ≈ 10⁻⁴ MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2 ≈ 10⁻⁶ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2 ≈ 0 counts
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Threads(100)
Benchmark Mode Cnt Score Error Units
LockBenchmark.testMonitoredLock thrpt 2 10994.606 ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate thrpt 2 488.508 MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm thrpt 2 56.002 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space thrpt 2 481.390 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm thrpt 2 55.163 B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space thrpt 2 0.020 MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm thrpt 2 0.002 B/op
LockBenchmark.testMonitoredLock:·gc.count thrpt 2 18.000 counts
LockBenchmark.testMonitoredLock:·gc.time thrpt 2 9.000 ms
LockBenchmark.testNativeLock thrpt 2 558652.036 ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate thrpt 2 0.016 MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm thrpt 2 ≈ 10⁻⁴ B/op
LockBenchmark.testNativeLock:·gc.count thrpt 2 ≈ 0 counts
```
## Proposed changes
Issue Number: close #xxx
pick (#36659)
pick #37015
In previous versions, we used Druid as the default JDBC connection pool, which could apply custom decryption to parse the certificate when SQL Server encryption was turned on. However, after switching to HikariCP as the default connection pool, the SQL Server certificate can no longer be parsed, so encryption must be turned off for normal use. Therefore, this PR adds a parameter that decides whether to disable SQL Server encryption. Encryption is not disabled by default.
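For context only: the Microsoft SQL Server JDBC driver exposes this switch through its `encrypt` connection property, so a disabled-encryption URL looks roughly like the following (illustrative host and database; the Doris-side parameter name is not shown here):

```
jdbc:sqlserver://host:1433;databaseName=mydb;encrypt=false
```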
Previously, FE logs were written to files. The main FE logs include
fe.log, fe.warn.log, fe.audit.log, fe.out, and fe.gc.log.
In a K8s deployment environment, logs usually need to be output to
standard output, and then other components process the log stream.
This PR makes the following changes:
1. Modified the log4j configuration template
- When started with `--daemon`, logs are still written to various files,
and the format remains unchanged.
- When started with `--console`, all logs are output to standard output
and marked with different prefixes:
- `StdoutLogger`: logs for standard output
- `StderrLogger`: logs for standard error output
- `RuntimeLogger`: logs for fe.log or fe.warn.log
- `AuditLogger`: logs for fe.audit.log
- No prefix: logs for fe.gc.log
Examples are as follows:
```
RuntimeLogger 2024-06-03 14:54:51,229 INFO (binlog-gcer|62)
[BinlogManager.gc():359] begin gc binlog
```
2. Added a new FE config: `enable_file_logger`
Defaults to `true`, meaning logs are written to files regardless of the startup method. For example, if FE is started with `--console`, logs are written to both the files and standard output. If set to `false`, logs are not written to files regardless of the startup method.
3. Optimized the log format of standard output
The byte streams of stdout and stderr are captured, so logs previously written with `System.out` are now captured into fe.log for unified management.
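Capturing the stdout byte stream can be sketched as follows (a hypothetical sketch, not the actual FE code; a `Consumer<String>` stands in for the log4j logger):

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Sketch: replace System.out with a stream that forwards each complete line
// to a log sink, so stray System.out.println calls are captured.
class StdoutCapture {
    static PrintStream redirect(Consumer<String> logSink) {
        PrintStream original = System.out;
        OutputStream lineBuffer = new OutputStream() {
            private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

            @Override
            public void write(int b) {
                if (b == '\n') {
                    // Forward one complete line to the sink (e.g. a logger).
                    logSink.accept(new String(buf.toByteArray(), StandardCharsets.UTF_8));
                    buf.reset();
                } else {
                    buf.write(b);
                }
            }
        };
        System.setOut(new PrintStream(lineBuffer, true));
        return original; // caller can restore the original stream
    }
}
```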
backport: #35690
`PropertyConverter.setS3FsAccess` adds customized S3 credentials providers:
```java
public static final List<String> AWS_CREDENTIALS_PROVIDERS = Arrays.asList(
        DataLakeAWSCredentialsProvider.class.getName(),
        TemporaryAWSCredentialsProvider.class.getName(),
        SimpleAWSCredentialsProvider.class.getName(),
        EnvironmentVariableCredentialsProvider.class.getName(),
        IAMInstanceCredentialsProvider.class.getName());
```
These providers are set as the value of `fs.s3a.aws.credentials.provider`, which is used as configuration to build the S3 reader in JNI readers. However, `DataLakeAWSCredentialsProvider` lives in `fe-core`, which the JNI readers do not depend on, so we have to move the S3 providers to `fe-common`.
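For context, `fs.s3a.aws.credentials.provider` takes a single comma-separated list of provider class names. The wiring can be sketched like this (class names are illustrative placeholders, and a plain `Map` stands in for Hadoop's `Configuration`):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: join provider class names into the comma-separated value that
// fs.s3a.aws.credentials.provider expects.
class S3ProviderConfig {
    static final List<String> AWS_CREDENTIALS_PROVIDERS = Arrays.asList(
            "com.example.DataLakeAWSCredentialsProvider",      // placeholder name
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider");

    // A plain Map stands in for Hadoop's Configuration object here.
    static void apply(Map<String, String> conf) {
        conf.put("fs.s3a.aws.credentials.provider",
                String.join(",", AWS_CREDENTIALS_PROVIDERS));
    }
}
```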
Fix the simple auth check and the default username: simple auth should be set to valid by default, and whether to set the default username should be checked in `loginWithUGI`.
When `Export` statements are executed concurrently, the backend uses `Job schedule` to manage export tasks. Previously, the default value of `async_task_consumer_thread_num` was 5, meaning that regardless of the concurrency setting, at most 5 threads could execute concurrently.
Moreover, `Export` is not the only user of `Job schedule`; other scheduled tasks may use it as well, leading to a shortage of thread resources.
We have found that in many scenarios `Export` needs to be set to, and actually run at, a high concurrency value. Clearly, `async_task_consumer_thread_num = 5` is no longer sufficient, so we have changed its default value to 64.
1. Fix the SQL cache returning stale values after a partition is truncated.
2. Use `expire_sql_cache_in_fe_second` to control the expiration time of the SQL cache in the `NereidsSqlCacheManager`.
1. Let `AggStateType` extend `Type`.
2. Remove the unused interfaces `isFixedLengthType` and `supportsTablePartitioning`.
3. Let `MapType` implement the `isSupported` interface.
4. Let `VariantType` extend `ScalarType`.
The legacy planner encounters issues when handling filters such as `c1 = 0.0`, where `c1` is a boolean column and `0.0` is a decimalv3 literal. The literal `0.0` is interpreted as decimalv3(1,1), and the boolean column `c1` is coerced to decimalv3(1,1). However, decimalv3(1,1) can only hold values in the range [0,1), while boolean true is represented as 1, which exceeds the upper bound and causes an overflow. This pull request addresses the issue by treating the boolean type as decimalv3(1,0), so that both `c1` and `0.0` are cast to decimal(2,1).
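The range argument can be checked with plain `BigDecimal` arithmetic (an illustrative helper, not Doris code):

```java
import java.math.BigDecimal;

// Illustration: the largest value a DECIMAL(p, s) can hold is p nines with
// s digits after the point, e.g. DECIMAL(1,1) -> 0.9, DECIMAL(2,1) -> 9.9.
class DecimalRange {
    static BigDecimal maxValue(int precision, int scale) {
        // p digits of 9, then shift the point left by s places.
        BigDecimal nines = BigDecimal.TEN.pow(precision).subtract(BigDecimal.ONE);
        return nines.movePointLeft(scale);
    }

    static boolean fits(BigDecimal v, int precision, int scale) {
        return v.abs().compareTo(maxValue(precision, scale)) <= 0;
    }
}
```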
Co-authored-by: feiniaofeiafei <moailing@selectdb.com>