86 Commits

Author SHA1 Message Date
0f053789fc branch-2.1: [fix](memory-leak) skip libzip and libjvm memory leak #51628 (#51632)
Cherry-picked from #51628

Co-authored-by: shuke <shuke@selectdb.com>
2025-06-12 09:12:25 +08:00
71deeec294 [conf](fe) Print jvm ClassHistogram in fe gc log after full gc (#44010) (#51007)
* Add `-XX:+PrintClassHistogramAfterFullGC` for JAVA_OPTS
* Add `classhisto*=trace` for JAVA_OPTS_FOR_JDK_17

fe.gc.log will print like this:
```
2024-11-15T11:49:00.316+0800: 11.346: [Class Histogram (after full gc):
 num     #instances         #bytes  class name
----------------------------------------------
   1:          7464        7053464  [B
   2:         37465        3656360  [C
   3:          7076        2909880  [Ljava.lang.Object;
   4:          4915        2306872  [I
   5:          9167        1719552  [S
   6:         16229        1168488  io.grpc.netty.shaded.io.netty.buffer.PoolSubpage
......
```

2025-05-21 12:06:49 +08:00
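
The class histogram printed in `fe.gc.log` is plain text, so it can be post-processed with a few lines of script. The following is a minimal sketch, assuming the row layout shown in the sample above (`num: #instances #bytes class name`); the function names are hypothetical and not part of Doris.

```python
import re

# Matches rows such as "   1:          7464        7053464  [B".
# The header and separator lines do not match and are skipped.
ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)\s*$")


def parse_class_histogram(text):
    """Return (class_name, instances, bytes) tuples for each histogram row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            _, instances, size, name = m.groups()
            rows.append((name, int(instances), int(size)))
    return rows


def top_by_bytes(text, n=1):
    """Return the n rows that hold the most bytes."""
    return sorted(parse_class_histogram(text), key=lambda r: r[2], reverse=True)[:n]


SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:          7464        7053464  [B
   2:         37465        3656360  [C
   3:          7076        2909880  [Ljava.lang.Object;
"""

if __name__ == "__main__":
    for name, instances, size in top_by_bytes(SAMPLE, n=2):
        print(name, instances, size)
```

Comparing the top rows across successive full GCs is one way to spot a class whose footprint keeps growing.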
4b69673e4e [fix](fe.conf) use g1 by default for jdk17 (#46395) 2025-01-06 17:52:35 +08:00
a233349ea2 [enhancement](mc) Optimize reading of MaxCompute partition tables. (#45148) (#45246)
bp #45148

### What problem does this PR solve?
Problem Summary:
Optimize reading of MaxCompute partition tables:
1. Introduce a batch mode for generating splits for MaxCompute partition
tables, to optimize scenarios with a large number of partitions. It is
controlled by the variable `num_partitions_in_batch_mode`.
2. Introduce the catalog parameter `mc.split_cross_partition`. Setting it
to true is friendlier to reading partition tables; setting it to false is
friendlier to debugging.
3. Add `-Darrow.enable_null_check_for_get=false` to the BE JVM options to
improve the efficiency of MaxCompute Arrow data conversion.
2024-12-11 14:56:27 +08:00
02fdf5307c [pick](branch-2.1) pick #42059 (#44938) 2024-12-04 17:49:08 +08:00
80fd76677e branch-2.1: [Improvement](LDAP Auth)Enhance LDAP authentication with a configurable group filter (#43293)
Cherry-picked from #42038

Co-authored-by: nsivarajan <117266407+nsivarajan@users.noreply.github.com>
Co-authored-by: Sivarajan Narayanan <narayanan_sivarajan@apple.com>
2024-11-10 10:06:13 +08:00
8c0f73cb90 [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) (#41610)
bp #40225 , #40888 ,#41386

## Proposed changes
Among them, #40225 introduces the new MaxCompute API,
#40888 fixes a bug in reading null values between the new and old
APIs, and
#41386 provides compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
78b95e7595 [chore](configuration) Change be jvm default max heap size. (#41483)
## Proposed changes

Backport #41281
2024-09-30 15:41:25 +08:00
b192e89ec1 [chore](conf) Specify UTF8 as the default charset. (#40669)
## Proposed changes

pick from  #39521

2024-09-12 14:35:00 +08:00
baf5b71b39 [branch-2.1](memory) Modify the default JEMALLOC_CONF and support flush Jemalloc tcache (#39829)
pick #38185
2024-08-23 17:21:42 +08:00
38c5030f97 [opt](log) refactor the log dir config (#32933)
Refactor the config for log dir of FE and BE

TLDR:
- Use env variable `LOG_DIR` to set root log dir
- Remove `sys_log_dir` for FE and BE

Details:

1. FE

    1. The root log dir is set by env variable `LOG_DIR` in `fe.conf`
    2. The default value of `audit_log_dir` is the same as `${LOG_DIR}/`
    3. The default value of `spark_launcher_log_dir` is `${LOG_DIR}/spark_launcher_log`
    4. The default value of `nereids_trace_log_dir` is `${LOG_DIR}/nereids_trace_log`
    5. The original `sys_log_dir` is deprecated, and its default value is `""`.
        For compatibility, if the user has already set `sys_log_dir`, Doris will still use it as the root log dir.

2. BE

     1. The root log dir is set by env variable `LOG_DIR` in `be.conf`
     2. Remove `pipeline_tracing_log_dir`, use `${LOG_DIR}` directly.
     3. The original `sys_log_dir` is deprecated, and its default value is `""`.
         For compatibility, if the user has already set `sys_log_dir`, Doris will still use it as the root log dir.
2024-04-17 23:41:59 +08:00
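
The FE rules listed in the commit above can be read as a small resolution function. This is a minimal sketch of that reading, not the Doris implementation: the function name and the `env`/`conf` dictionaries are hypothetical, and it assumes the sub-directory defaults are always derived from `${LOG_DIR}` as the message literally states (the message does not say whether they follow a legacy `sys_log_dir`).

```python
def resolve_fe_log_dirs(env, conf):
    """Resolve FE log directories from LOG_DIR and optional fe.conf overrides.

    env  : mapping of environment variables; must contain LOG_DIR
    conf : mapping of values explicitly set in fe.conf (may be empty)
    """
    log_dir = env["LOG_DIR"].rstrip("/")
    return {
        # Compatibility rule: a non-empty legacy sys_log_dir is still used as root.
        "root": conf.get("sys_log_dir") or log_dir,
        # Defaults below follow the commit message; explicit settings win.
        "audit_log_dir": conf.get("audit_log_dir") or log_dir,
        "spark_launcher_log_dir": conf.get("spark_launcher_log_dir")
        or f"{log_dir}/spark_launcher_log",
        "nereids_trace_log_dir": conf.get("nereids_trace_log_dir")
        or f"{log_dir}/nereids_trace_log",
    }


if __name__ == "__main__":
    # Fresh install: only LOG_DIR is set.
    print(resolve_fe_log_dirs({"LOG_DIR": "/opt/doris/fe/log"}, {}))
    # Upgraded install that still carries the deprecated sys_log_dir.
    print(resolve_fe_log_dirs({"LOG_DIR": "/opt/doris/fe/log"},
                              {"sys_log_dir": "/data/doris/log"}))
```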
ec43f65235 [feature](hudi) support hudi incremental read (#32052)
* [feature](hudi) support incremental read for hudi table

* fix jdk17 java options
2024-03-26 15:31:07 +08:00
6ea5218ee8 Revert "[Enhencement](env) Checking Master branch must use JDK17 (#31587)"
This reverts commit fa499cc200344eaaf837fd52211820dc7b7b9296.
2024-03-06 13:13:49 +08:00
fa499cc200 [Enhencement](env) Checking Master branch must use JDK17 (#31587)
Add a JDK version check in `env.sh` and force master to use Java 17.
2024-03-06 13:05:58 +08:00
ea427e8c51 [fix](JDK17) It will report an exception when we start BE with JDK17 and query AVRO table : InaccessibleObjectException (#30541)
* [fix](JDK17) It will report an exception when we start BE with JDK17 and query AVRO table : InaccessibleObjectException (#30003)
2024-01-30 15:33:40 +08:00
589e83561c [Fix](jdk17) support start FE with JDK17 (#29658) (#30485)
Issue Number: close #30484

problem:
gson uses Java's reflection mechanism to generate a default adapter, but JDK 17 prohibits this kind of reflective access.

solution:
gson has provided a way to work around this problem since 2.9.1: Add support for reflection access filter by Marcono1234 · Pull Request #1905 · google/gson

We need to upgrade the gson version and use this solution.
2024-01-30 15:31:22 +08:00
b1a9370004 [fix](glue)support access glue iceberg with credential list (#30473)
merge from #30292
2024-01-28 18:23:07 +08:00
d61974db14 [chore](docs) fix some docs wrong & add important comment & fe start config for old machine (#29742)
Fix some errors in the docs, add important comments, and add an FE start config for old machines.
2024-01-23 13:22:14 +08:00
8fc9c18c85 [improvement](jdbc catalog) Put the jdbc connection pool parameters into catalog properties (#29195) 2024-01-12 11:40:28 +08:00
8c58bb6ade [fix](fe) PrintGCTimeStamps is not applicable in jdk9+ (#28544) 2023-12-19 00:00:05 +08:00
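
Several commits in this history keep separate Java options for JDK 8 and for newer JDKs, because the legacy GC print flags were replaced by unified logging (`-Xlog`) in JDK 9. The following is a minimal sketch of that version switch. It is not the logic Doris ships: the helper names are hypothetical, and the option lists are only illustrative examples of the two flag families.

```python
import re


def java_major_version(version_string):
    """Map a Java version string to its major version number.

    Handles the legacy "1.x" scheme (for example "1.8.0_292" is Java 8)
    as well as the current scheme (for example "17.0.2" is Java 17).
    """
    parts = re.split(r"[._\-+]", version_string.strip())
    first = int(parts[0])
    if first == 1 and len(parts) > 1:
        return int(parts[1])
    return first


def gc_log_options(major, log_path):
    """Pick GC logging options that the given JDK major version accepts."""
    if major >= 9:
        # Unified JVM logging, available since JDK 9.
        return [f"-Xlog:gc*:file={log_path}:time"]
    # Legacy JDK 8 style flags.
    return ["-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps", f"-Xloggc:{log_path}"]


if __name__ == "__main__":
    for version in ("1.8.0_292", "17.0.2"):
        major = java_major_version(version)
        print(version, major, gc_log_options(major, "fe.gc.log"))
```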
74c0a3060f [feature](jdk) Using G1 as default garbage collector in FE (#28263) 2023-12-16 22:40:11 +08:00
f87c807979 [enhancement](jdk) support doris fe running in jvm with jdk16+ (#26889) 2023-11-15 10:27:30 +08:00
867f44d606 [opt](memory) jemalloc conf lg_tcache_max restore default #26362
`tc/jemalloc_free_memory` on the web page beip:8040/mem_tracker is the cache size of Jemalloc.

Previously `lg_tcache_max` was 20, which caches up to 1M Bin in the thread cache and can make the Jemalloc cache too large in some scenarios.

Restore the default `lg_tcache_max:16`, which can cache a maximum of 64K Bins.

If you are doing a performance POC, you can consider increasing it.
2023-11-03 15:35:38 +08:00
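
The 1M and 64K figures in the commit above follow directly from the exponent, assuming `lg_tcache_max` is a base-2 logarithm as its name suggests. A short worked check (the helper name is hypothetical):

```python
def tcache_max(lg_tcache_max):
    """Return 2 ** lg_tcache_max, the limit implied by the exponent."""
    return 1 << lg_tcache_max


if __name__ == "__main__":
    old_limit = tcache_max(20)   # previous setting
    new_limit = tcache_max(16)   # restored setting
    print(old_limit, old_limit // 1024 // 1024, "M")   # 1048576, i.e. 1M
    print(new_limit, new_limit // 1024, "K")           # 65536, i.e. 64K
    print("reduction factor:", old_limit // new_limit)
```

Each step of the exponent doubles the limit, so moving from 20 to 16 lowers it by a factor of 16.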
fc12362a6d [feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314)
This is a POC, the design documentation will be updated soon
2023-09-20 14:42:54 +08:00
698fe55662 remove unused configs in be and broker (#24021) 2023-09-09 08:24:50 +08:00
801ddc0313 [feature-wip](arrow-flight) BE not start Arrow Flight Service by default (#23901) 2023-09-05 14:48:29 +08:00
039c76cbc0 [feature-wip] (arrow-flight) (step1) BE support Arrow Flight server, read data only (#23765) 2023-09-04 19:19:55 +08:00
6e51632ca9 [docs](kerberos)add FAQ cases and enable krb5 debug (#22821) 2023-08-17 14:25:09 +08:00
b9e344617a [typo](kerberos)support read jdk auth creds and add some krb tips in FAQ (#22535)
support read jdk auth creds and add some krb tips in FAQ
1. about the 'javax.security.auth.useSubjectCredsOnly': https://stackoverflow.com/questions/43660265/java-automatically-uses-kerberos-ticketcache-when-it-shouldnt
2. add tips for `No common protection layer between client and server` and yum jdk version.
2023-08-04 14:51:31 +08:00
bc87002028 [opt](conf) remote scanner thread num is changed to core num * 10 (#22427) 2023-08-01 23:09:49 +08:00
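
As a rough illustration of the "core num * 10" rule in the commit above, the value can be computed from the machine's core count. This is only a sketch: the function name is hypothetical, and how Doris itself counts cores is not stated in the source.

```python
import os


def remote_scanner_thread_num(cores=None, factor=10):
    """Return cores * factor, using the local CPU count when cores is omitted."""
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return cores * factor


if __name__ == "__main__":
    print(remote_scanner_thread_num(16))  # a 16-core machine
    print(remote_scanner_thread_num())    # this machine
```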
e8f4323e0f [Fix](jdbcCatalog) fix typo of some variable #22214 2023-07-26 08:34:45 +08:00
1afe090486 [improvement](memory) modify jemalloc conf in be.conf (#21943)
modify jemalloc conf in be.conf
    disable je_purge_all_arena_dirty_pages
2023-07-20 10:34:31 +08:00
fde73b6cc6 [Fix](multi-catalog) Fix hadoop short circuit reading can not enabled in some environments. (#21516)
Fix Hadoop short-circuit reads that cannot be enabled in some environments.
- Revert #21430 because it causes performance degradation.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove the empty `hdfs-site.xml`, because in some environments it prevents Hadoop short-circuit reads from being enabled.
- Copy the Hadoop common native libs (copied from https://github.com/apache/doris-thirdparty/pull/98) and add them to `LD_LIBRARY_PATH`, because in some environments `LD_LIBRARY_PATH` does not contain the Hadoop common native libs, which prevents Hadoop short-circuit reads from being enabled.
2023-07-06 15:00:26 +08:00
274203a59c [typo](storage)Fixed wrong description about Storage_root_path parameter (#20641) 2023-06-28 21:28:50 +08:00
53b2fe5db6 [improvement](jdbc) Set the JDBC connection timeout to be conf (#21000) 2023-06-20 14:23:48 +08:00
c3e6db827c [typo][docs] remove unuse config mysql_service_nio_enabled (#20862) 2023-06-16 09:58:33 +08:00
bcf103e993 [enhancement](log4j) support high performance mode for log4j to escape potential bottleneck for doris read and write (#20759)
log4j2 can become a bottleneck in the Doris FE when many logs are written in sync mode, while asynchronous logging performs better. We also found that capturing the caller location has a similar impact across all logging libraries and slows down asynchronous logging by about 30-100x. This change therefore provides three log modes for log4j2 to meet the needs of different users.
Refer to https://logging.apache.org/log4j/2.x/performance.html
2023-06-14 15:16:04 +08:00
7942bd0bf9 [fix](planner) cast string literal to date like type should not be an implicit cast (#20709)
1. Casting a string literal to a date-like type should not be an implicit cast
2. The string representation of a float-like type should not use scientific notation
3. The data type of the like function's regex expr should be string type even if it's a null literal
4. Add -Xss4m in fe.conf to prevent stack overflow in some cases
2023-06-13 17:57:14 +08:00
198433b131 [typo](config)Remove FE config max_conn_per_user (#20122)
Co-authored-by: Yijia Su <suyijia@selectdb.com>
2023-05-29 17:20:36 +08:00
Pxl
c287e308ab [Chore](java-udf) add some java-udf function name to asan_suppr #20093 2023-05-26 18:05:31 +08:00
Pxl
618961053f [Bug](materialized-view) forbid create mv/rollup on mow table (#20001)
forbid create mv/rollup on mow table
2023-05-25 15:30:12 +08:00
514be8def1 [improvement](conf)Add an example of directly specifying an IP address #19860 2023-05-19 16:43:47 +08:00
Pxl
b927f8cd37 [Chore](asan) change asan_suppr from interceptor_via_lib to interceptor_via_fun (#19636)
change asan_suppr from interceptor_via_lib to interceptor_via_fun
2023-05-16 10:51:43 +08:00
9813406757 [Enhancement](HttpServer) Add http interface authentication for BE (#17753) 2023-05-04 23:46:49 +08:00
7b02fa5cd6 [optimization](conf) optimization JAVA_OPTS for be conf and be bin (#19029) 2023-04-27 13:48:46 +08:00
8864266a42 [fix](Jdbc Catalog) fix Druid Pool parameter and set testWhileIdle = true (#19049)
Set `testWhileIdle` for the druid pool to true
2023-04-26 11:44:45 +08:00
3007cd49f2 [enhancement](mysql) enable two-way ssl authentication (#18530)
According to the mysql-ssl, enable two-way SSL authentication.
2023-04-21 14:39:14 +08:00
Pxl
9e64951721 [Chore](asan) set decrementOutputRecursionDepth to suppressions and remove some unu… (#18845)
18845
2023-04-20 23:33:25 +08:00
Pxl
908fbf92cf [Chore](build) ignore compile warning on orc && fix invalid command curdate on conf (#18810)
ignore compile warning on orc && fix invalid command curdate on conf
2023-04-20 10:03:40 +08:00
e1b3955e05 [refactor](jdbc) using jvm parameters to init jdbc datasource (#18670)
Use JVM parameters to initialize the JDBC datasource connection pool.
If the connections do not need to be kept alive, set JDBC_MIN_POOL=0.
2023-04-14 18:45:29 +08:00
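
Settings passed as JVM parameters, such as `JDBC_MIN_POOL` above, reach the JVM as `-Dkey=value` tokens in the Java options string. The following is a minimal sketch of extracting them from such a string. The helper names and the fallback value of 1 are assumptions for illustration, not Doris defaults.

```python
import shlex


def system_properties(java_opts):
    """Collect -Dkey=value tokens from a Java options string into a dict."""
    props = {}
    for token in shlex.split(java_opts):
        if token.startswith("-D") and len(token) > 2:
            key, _, value = token[2:].partition("=")
            props[key] = value
    return props


def jdbc_min_pool(java_opts, default=1):
    """Return JDBC_MIN_POOL as an int; `default` is a hypothetical fallback."""
    return int(system_properties(java_opts).get("JDBC_MIN_POOL", default))


if __name__ == "__main__":
    opts = "-Xmx1024m -DJDBC_MIN_POOL=0 -Dfile.encoding=UTF-8"
    print(system_properties(opts))
    print(jdbc_min_pool(opts))
```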