45 Commits

Author SHA1 Message Date
8c4f3d4126 [chore](macOS) Fix JAVA_OPTS in start_be.sh (#19267)
We should set -XX:-MaxFDLimit on macOS if we enable Java support for BE; otherwise, BE may fail to start up.
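A minimal sketch of how a start script could add this flag on macOS only. The helper name `append_macos_java_opts` is hypothetical, and the actual change in start_be.sh may be structured differently:

```shell
# Hypothetical helper: append -XX:-MaxFDLimit to a JAVA_OPTS string on macOS only.
# $1 = kernel name as printed by `uname -s`, $2 = current JAVA_OPTS value.
append_macos_java_opts() {
    case "$1" in
        Darwin)
            case "$2" in
                *-XX:-MaxFDLimit*) printf '%s\n' "$2" ;;      # flag already present
                '') printf '%s\n' '-XX:-MaxFDLimit' ;;        # nothing set yet
                *) printf '%s\n' "$2 -XX:-MaxFDLimit" ;;      # append to existing options
            esac
            ;;
        *) printf '%s\n' "$2" ;;                              # other platforms: unchanged
    esac
}

JAVA_OPTS="$(append_macos_java_opts "$(uname -s)" "${JAVA_OPTS:-}")"
export JAVA_OPTS
```

Passing the kernel name as an argument keeps the helper easy to exercise on any platform.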
2023-05-08 14:01:10 +08:00
7b02fa5cd6 [optimization](conf) optimization JAVA_OPTS for be conf and be bin (#19029) 2023-04-27 13:48:46 +08:00
04d18eec59 [Improve](be)check max open file #18888 2023-04-22 08:42:43 +08:00
Pxl
9e64951721 [Chore](asan) set decrementOutputRecursionDepth to suppressions and remove some unu… (#18845)
2023-04-20 23:33:25 +08:00
7e61a85331 [refactor](libhdfs) introduce hadoop libhdfs (#18204)
1. Introduce Hadoop libhdfs.
2. For the Linux-X86 platform, use the Hadoop libhdfs.
3. For other platforms, use libhdfs3, because we currently don't have a Hadoop libhdfs binary for them.

Co-authored-by: adonis0147 <adonis0147@gmail.com>
2023-03-31 18:41:39 +08:00
01d012bab7 [fix](memory) Remove page cache regular clear, disabled jemalloc prof by default (#18218)
Remove page cache regular clear
The page cache is now turned off by default. If a user manually enables the page cache, we can assume the user accepts its memory usage; adding a manual clear command for the cache can be considered later.

fix memory gc cancel top memory query

jemalloc prof is not enabled by default
2023-03-30 09:39:37 +08:00
f36465e76e [enhancement](memory) optimize jemalloc heap profile doc (#18094) 2023-03-25 13:04:45 +08:00
f21508baec [chore](macOS) Disable detect_container_overflow at BE startup (#17514)
BE failed to start up due to container-overflow errors reported by address sanitizer.
2023-03-08 10:21:45 +08:00
30df268c1f [fix](hdfs)(catalog) fix BE crash when hdfs-site.xml not exist in be/conf and fix compute node logic (#17244)
We set the LIBHDFS3_CONF env in start_be.sh, so libhdfs3 will try to read this hdfs-site.xml.
If the file does not exist, it throws an error, but Doris does not handle this error, which causes BE to crash.
This CL mainly changes:

Modify start_be.sh to only set LIBHDFS3_CONF if hdfs-site.xml exists.
Refactor the HDFSCommonBuilder so that it can return error correctly.
Add BE IP info in status, so that we can get ip from error msg like:
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]failed to init reader for file  000.snappy.orc, err: 
[INTERNAL_ERROR][172.21.0.101]failed to init HDFSCommonBuilder, please check check be/conf/hdfs-site.xml
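The start_be.sh change in the first item can be sketched as follows. The function name is hypothetical, and the `conf/hdfs-site.xml` layout under `DORIS_HOME` is an assumption based on the `be/conf/hdfs-site.xml` path in the error message:

```shell
# Sketch: export LIBHDFS3_CONF only when the config file is present.
# $1 = BE home directory (assumed to contain conf/hdfs-site.xml).
set_libhdfs3_conf() {
    if [ -f "$1/conf/hdfs-site.xml" ]; then
        export LIBHDFS3_CONF="$1/conf/hdfs-site.xml"
    fi
}

set_libhdfs3_conf "${DORIS_HOME:-.}"
```

When the file is missing, the variable is simply left untouched, so libhdfs3 has no path to a nonexistent file to read.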
The logic for preferring compute nodes is wrong, which causes external table queries to be assigned to at most 3 backends.
This CL refactors this logic and also changes some FE configs:

prefer_compute_node_for_external_table

If set to true, queries on external tables are preferentially assigned to compute nodes,
and the max number of compute nodes is controlled by min_backend_num_for_external_table.
If set to false, queries on external tables can be assigned to any node.

min_backend_num_for_external_table

Only takes effect when prefer_compute_node_for_external_table is true.
If the number of compute nodes is less than this value, a query on an external table will try to get some mix nodes
to assign, so that the total number of nodes reaches this value.
If the number of compute nodes is larger than this value, a query on an external table is assigned to compute nodes only.
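The rule for min_backend_num_for_external_table can be illustrated with a small arithmetic sketch. The function name and the cap on available mix nodes are assumptions made for illustration; this is not the FE implementation:

```shell
# Sketch of the rule when prefer_compute_node_for_external_table is true.
# $1 = number of compute nodes, $2 = number of available mix nodes,
# $3 = min_backend_num_for_external_table.
# Prints how many mix nodes would be added to the candidate backends.
extra_mix_nodes() (
    compute="$1"; mix="$2"; min_num="$3"
    if [ "${compute}" -ge "${min_num}" ]; then
        echo 0                                # enough compute nodes: use them only
    else
        needed=$((min_num - compute))         # shortfall to reach the minimum
        if [ "${needed}" -gt "${mix}" ]; then
            echo "${mix}"                     # assumed: cannot add more than exist
        else
            echo "${needed}"
        fi
    fi
)
```

For example, with 2 compute nodes, 10 mix nodes, and a minimum of 3, one mix node is added.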
2023-03-02 11:09:55 +08:00
9f9651b2f2 [Enhancement](Jemalloc): correct the varialbe name of malloc_conf & enable prof (#15382)
enable profile and correct the conf name in Jemalloc.
2022-12-28 09:50:59 +08:00
5cf88a5339 [improvement](config) opt the message when missing JAVA_HOME for BE (#15045)
Make the error message easy to understand
2022-12-14 23:17:46 +08:00
44eb1cf1c3 [fix](chore) read max_map_count from proc and make notice much more understandable (#14137)
Some users cannot run sysctl as non-root on Linux, so we read max_map_count from /proc instead.
Notify users that they can change max_map_count as root.
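A minimal sketch of this check, reading the value from `/proc/sys/vm/max_map_count` rather than calling sysctl. The function names and the threshold of 2000000 are illustrative assumptions, not taken from the commit:

```shell
# Read vm.max_map_count from procfs; $1 optionally overrides the file path.
read_max_map_count() {
    cat "${1:-/proc/sys/vm/max_map_count}"
}

# $1 = current value, $2 = required minimum.
# Returns non-zero and prints a hint on stderr when the value is too low.
check_max_map_count() {
    if [ "$1" -lt "$2" ]; then
        echo "vm.max_map_count is $1, expected at least $2;" \
            "run 'sysctl -w vm.max_map_count=$2' as root to change it." >&2
        return 1
    fi
}

# Only probe when procfs exposes the value (for example, not on macOS).
if [ -r /proc/sys/vm/max_map_count ]; then
    check_max_map_count "$(read_max_map_count)" 2000000 || true
fi
```

Reading the file needs no special privileges; only changing the value requires root.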
2022-11-11 23:05:54 +08:00
a58ac48a6e [chore](bin) do not set heap limit for tcmalloc until doris does not allocates large unused memory (#13761)
We set a heap limit for tcmalloc to avoid OOMs caused by tcmalloc allocating memory for its cache even when the machine has little free memory. However, Doris allocates large amounts of unused memory in some cases, so tcmalloc would throw an OOM exception even when there is a lot of free memory on the machine.

We can set the limit again after we fix the problem.
2022-11-08 19:26:30 +08:00
2ef8f3f6f4 [enhancement](java-udf) Support loading libjvm at runtime (#13660) 2022-10-28 08:45:12 +08:00
2cf89c55c2 [chore](macOS) Fix issues found on macOS x86_64 (#13583)
1. Use `brew --prefix` instead of `brew --repo` in scripts.
2. `sprintf` is marked as a deprecated function in MacOSX sdk (13.0).
2022-10-24 20:59:20 +08:00
410e36ef5b [enhancement](macOS) Refine the build scripts for macOS (#13473)
Set the environment up before running the build scripts on macOS.
2022-10-19 22:52:22 +08:00
125def5102 [enhancement](macOS M1) Support building from source on macOS (M1) (#13195)
# Proposed changes

This PR fixed lots of issues when building from source on macOS with Apple M1 chip.

## ATTENTION

The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...

Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.

This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues.

## Use case

```shell
./build.sh -j 8 --be --clean

cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```

## Something else

It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.
2022-10-18 13:10:13 +08:00
55fc55d5e3 [improvement](tcmalloc) increase tcmalloc upper limit to 90% (#13245) 2022-10-11 15:40:24 +08:00
00c672340d [improvement](memory) set TCMALLOC_HEAP_LIMIT_MB to control memory consumption of tcmalloc (#12981) 2022-09-28 15:44:18 +08:00
16bb5cb430 [enhancement](memory) Jemalloc performance optimization and compatibility with MemTracker #12496 2022-09-28 12:04:29 +08:00
7798309807 [improvement](start_script) add ASAN and UBSAN env in start_be.sh #12014
Neither ASAN nor UBSAN generates a core file by default;
however, we need core files to analyze problems detected by ASAN and UBSAN.
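As an illustration, sanitizer runtime options along these lines make the process abort and leave a core file. The exact option set added to start_be.sh is not shown in this log, so the values below are an assumption built from standard sanitizer flags:

```shell
# Assumed option set, for illustration only.
# abort_on_error=1   : call abort() on a report so the kernel can write a core.
# disable_coredump=0 : do not suppress core dumps (sanitizers disable them by default on 64-bit).
# unmap_shadow_on_exit=1 : unmap ASAN shadow memory so the core file stays small.
export ASAN_OPTIONS="abort_on_error=1:disable_coredump=0:unmap_shadow_on_exit=1"
export UBSAN_OPTIONS="print_stacktrace=1:halt_on_error=1:abort_on_error=1:disable_coredump=0"

# Core files also require a non-zero core size limit; this may be refused
# by the environment, so the failure is tolerated here.
ulimit -c unlimited 2>/dev/null || true
```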
2022-08-24 08:43:00 +08:00
e63c83e8e1 [fix](script) Support starting BE without Java environment (#11910) 2022-08-19 17:58:40 +08:00
4fa53b4cdb [chore](workflow) Add shellcheck to check shell scripts (#11744) 2022-08-18 16:07:28 +08:00
6c9d158e9b [fix](script) Fix hdfs-site.xml file name typo. #11653 2022-08-10 21:42:08 +08:00
71d9b383d4 [Enhancement](hdfs) Support loading hdfs config for be from hdfs-site.xml (#11634)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-08-10 14:49:50 +08:00
08ebef2992 [Enhancement] check vm.max_map_count before starting (#11052)
When the vectorized engine is enabled, Doris uses many more VMAs than before,
which leads to core dumps due to memory allocation failures.
2022-07-21 21:16:48 +08:00
7147a7c290 [feature-wip](multi-catalog) Support s3 storage for file scan node (#10977)
This is an example of s3 hms_catalog:
```sql
CREATE CATALOG hms_catalog properties(
"type" = "hms",
"hive.metastore.uris"="thrift://localhost:9083",
"AWS_ACCESS_KEY" = "your access key",
"AWS_SECRET_KEY"="your secret key",
"AWS_ENDPOINT"="s3 endpoint",
"AWS_REGION"="s3-region",
"fs.s3a.paging.maximum"="1000");
```
All these params are necessary.
2022-07-21 17:38:53 +08:00
89bec9b56a [enhancement](be) be asan add asan_suppr.conf to ignore known leak. (#10768) 2022-07-12 19:51:34 +08:00
8a49c7ef04 [chore] Rename Doris binary output format 2022-06-24 15:30:05 +08:00
0d761f9909 [feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678)
This feature is proposed in DSIP-1. This PR supports variable-length input and output for Java UDFs.
2022-04-11 09:36:16 +08:00
b89e4c7bba [feature-wip](java-udf) support java UDF with fixed-length input and output (#8516)
This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR supports fixed-length input and output for Java UDFs. Phase I of DSIP-1 is done after this PR.

To support Java UDFs efficiently, I use no data copy in JNI calls, and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead.

For users, a UDF class must have a public evaluate method.
2022-03-23 10:32:50 +08:00
a390b766d4 [Improvement] BE could print log foreground when not use daemon mode (#8031) 2022-02-14 09:30:12 +08:00
bc4ceeca44 [improvement] optimize java cmd find (#7428)
* Optimize finding the java command: if JAVA_HOME is not set, use the java found in PATH
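A minimal sketch of this lookup order. The function name `find_java_cmd` is hypothetical, and the script in the commit may perform additional checks:

```shell
# Prefer $JAVA_HOME/bin/java; fall back to whatever `java` resolves to on PATH.
# Prints the command path; returns non-zero if no java can be found.
find_java_cmd() {
    if [ -n "${JAVA_HOME:-}" ]; then
        printf '%s\n' "${JAVA_HOME}/bin/java"
    else
        command -v java
    fi
}

JAVA="$(find_java_cmd)" || echo "java not found: set JAVA_HOME or add java to PATH" >&2
```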
2021-12-30 10:16:56 +08:00
6a00c68264 Fix a typo ehco -> echo (#5433) 2021-03-03 14:46:54 +08:00
a61d0de173 [ODBC SCAN NODE] 4/4 Add ODBC_SCAN_NODE and Odbc_Scanner in BE and add ODBC_SCAN_NODE docs (#4438) 2020-09-25 10:19:50 +08:00
7d58aa530f Add --daemon option to start script (#642)
Add --daemon option to start_fe.sh/start_be.sh/start_broker.sh.
If a script is run without --daemon, it runs as a foreground process.
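The behavior can be sketched as below. Both function names are hypothetical, and the command passed to `launch` stands in for the real server binary:

```shell
# Prints 1 if --daemon is among the arguments, 0 otherwise.
parse_daemon_flag() {
    for arg in "$@"; do
        if [ "${arg}" = '--daemon' ]; then
            echo 1
            return 0
        fi
    done
    echo 0
}

# Hypothetical launcher: $1 = daemon flag, $2 = log file, rest = command line.
launch() {
    run_daemon="$1"; log_file="$2"; shift 2
    if [ "${run_daemon}" -eq 1 ]; then
        # Detach from the terminal and keep running after logout.
        nohup "$@" >>"${log_file}" 2>&1 </dev/null &
    else
        # Stay in the foreground with output on the terminal.
        "$@"
    fi
}
```

A start script would call something like `launch "$(parse_daemon_flag "$@")" be.out ./doris_be`, where the log file name and binary path are placeholders.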
2019-02-20 16:38:28 +08:00
603f4e0ca9 Fix a sending signal error when starting Doris BE (#367)
Redirect output message of kill to /dev/null.
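A sketch of the idea, assuming the script probes an existing PID with `kill -0` (the log does not say which signal is sent). The function name is hypothetical:

```shell
# Returns 0 if the process with PID $1 exists and can be signalled.
# kill's own error message (e.g. "No such process") is discarded.
is_running() {
    kill -0 "$1" >/dev/null 2>&1
}
```

Only the exit status is used, so a stale PID file no longer produces a confusing error at startup.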

Co-Authored-By: chalsliu <45041955+chalsliu@users.noreply.github.com>

ISSUE #365
2018-12-03 15:38:33 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
ae9ce81453 Changed: change build.sh to use environment variable to get thirdparty's
path, and change PALO_HOME to DORIS_HOME
2018-10-30 16:29:06 +08:00
bea10e4f06 1. hide password and other sensitive information in log and audit log
2. add 2 new proc '/current_queries' and '/current_backend_instances' to monitor the current running queries.
3. add a manual compaction api on Backend to trigger cumulative or base compaction manually.
4. add Frontend config 'max_bytes_per_broker_scanner' to limit the bytes per broker scanner. This is to limit the memory cost of a single broker load job
5. add Frontend config 'max_unfinished_load_job' to limit the number of load jobs: if the number of running load jobs exceeds the limit, no more load jobs are allowed to be submitted.
6. a lot of bugs fixed
2018-09-19 20:04:01 +08:00
cc74efb3c5 merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224)
1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication.
2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS.
3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user.
4. A lot of bugs fixed.
5. Performance improvement.
2018-08-24 17:12:26 +08:00
585c21fab4 add feature and fix bugs (#148)
Add new features:
1. plugins of Ambari and k8s deploy
2. specified config 'priority_network' to solve some ip problems

Fix bugs:
fix bugs where rebalance does not work in some cases.
fix count(*) from union stmt bug
fix some union stmt bugs
fix bugs when trying to schema change a clone replica
2017-11-30 16:31:12 +08:00
fde6b39539 modify start and stop script. User can now specified PID file dir and log dir (#71) 2017-09-04 20:56:50 +08:00
6486be64c3 fix license statement (#29)
* change picture to word

* change picture to word

* SHOW FULL TABLES WHERE Table_type != VIEW sql can not execute

* change license description
2017-08-18 19:16:23 +08:00
e2311f656e baidu palo 2017-08-11 17:51:21 +08:00