doris

Author	SHA1	Message	Date
Adonis Ling	8c4f3d4126	[chore](macOS) Fix JAVA_OPTS in start_be.sh (#19267 ) We should set -XX:-MaxFDLimit on macOS if we enable java support for BE otherwise BE may fail to start up.	2023-05-08 14:01:10 +08:00
Pxl	ec517a53a8	[Chore](build) upgrade clang-format version to 16 && move thrift to fe-common (#19155 ) upgrade clang-format version to 16; move thrift to fe-common; fix core dump on pipeline engine when operator canceled and not prepared	2023-04-28 14:14:51 +08:00
yongkang.zhong	7b02fa5cd6	[optimization](conf) optimization JAVA_OPTS for be conf and be bin (#19029 )	2023-04-27 13:48:46 +08:00
zhangdong	04d18eec59	[Improve](be)check max open file #18888	2023-04-22 08:42:43 +08:00
Pxl	9e64951721	[Chore](asan) set decrementOutputRecursionDepth to suppressions and remove some unu… (#18845 ) 18845	2023-04-20 23:33:25 +08:00
Mingyu Chen	7e61a85331	[refactor](libhdfs) introduce hadoop libhdfs (#18204 ) 1. Introduce hadoop libhdfs 2. For the Linux-X86 platform, use the hadoop libhdfs 3. For other platforms, use libhdfs3, because currently we don't have a hadoop libhdfs binary for other platforms Co-authored-by: adonis0147 <adonis0147@gmail.com>	2023-03-31 18:41:39 +08:00
Xinyi Zou	01d012bab7	[fix](memory) Remove page cache regular clear, disabled jemalloc prof by default (#18218 ) Remove page cache regular clear: the page cache is now turned off by default. If the user manually enables the page cache, it can be assumed that the user accepts its memory usage; a manual clear command for the cache can be considered later. Fix memory GC cancel of the top memory query. Jemalloc prof is not enabled by default.	2023-03-30 09:39:37 +08:00
WenYao	c3fe113894	rename PaloFe to DorisFE (#18167 )	2023-03-29 00:30:16 +08:00
Xinyi Zou	f36465e76e	[enhancement](memory) optimize jemalloc heap profile doc (#18094 )	2023-03-25 13:04:45 +08:00
Adonis Ling	f21508baec	[chore](macOS) Disable detect_container_overflow at BE startup (#17514 ) BE failed to start up due to container-overflow errors reported by address sanitizer.	2023-03-08 10:21:45 +08:00
Mingyu Chen	30df268c1f	[fix](hdfs)(catalog) fix BE crash when hdfs-site.xml not exist in be/conf and fix compute node logic (#17244 ) We set the LIBHDFS3_CONF env in start_be.sh, so libhdfs3 tries to read this hdfs-site.xml; if the file does not exist, it throws an error. Doris does not handle this error, which causes the BE to crash. This CL mainly changes: modify start_be.sh to only set LIBHDFS3_CONF if hdfs-site.xml exists; refactor the HDFSCommonBuilder so that it can return errors correctly; add BE IP info in status, so that we can get the IP from an error message like: ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]failed to init reader for file 000.snappy.orc, err: [INTERNAL_ERROR][172.21.0.101]failed to init HDFSCommonBuilder, please check check be/conf/hdfs-site.xml The prefer-compute-node logic was wrong, so an external table query could be assigned to at most 3 backends. This CL refactors this logic and also changes some FE configs. prefer_compute_node_for_external_table: if set to true, queries on external tables prefer to be assigned to compute nodes, and the max number of compute nodes is controlled by min_backend_num_for_external_table. If set to false, queries on external tables are assigned to any node. min_backend_num_for_external_table: only takes effect when prefer_compute_node_for_external_table is true. If the number of compute nodes is less than this value, a query on an external table will also take some mix nodes so that the total number of nodes reaches this value. If the number of compute nodes is larger than this value, the query is assigned to compute nodes only.	2023-03-02 11:09:55 +08:00
AlexYue	7f2ff83480	[enhancement](FE)shut down fast throw JVM might do when throwing exception #16146 As discussed in 16107: sometimes the JVM reduces the whole stack trace to just one line, which is confusing for debugging.	2023-01-28 14:18:25 +08:00
Mingyu Chen	726427b795	[refactor](fe) refactor and upgrade dependency tree of FE and support AWS glue catalog (#16046 ) 1. Spark dpp: Move `DppResult` and `EtlJobConfig` to the sparkdpp package in the `fe-common` module, so that `fe-core` no longer depends on the `spark-dpp` module and `spark-dpp.jar` is not moved into `fe/lib`, which reduces the size of the FE output. 2. Modify start_fe.sh: Modify the CLASSPATH to make sure that doris-fe.jar is at the front, so that when loading classes with the same qualified name, they are loaded from doris-fe.jar first. 3. Upgrade hadoop and hive versions: hadoop 2.10.2 -> 3.3.3, hive 2.3.7 -> 3.1.3. 4. Override the IHiveMetastoreClient implementations from dependencies: `ProxyMetaStoreClient.java` for Aliyun DLF and `HiveMetaStoreClient.java` for the original Apache Hive metastore, because some of their methods need to be modified to make them compatible with different versions of Hive. 5. Exclude some unused dependencies to reduce the size of the FE output. Now it is only 370MB (before it was 600MB). 6. Upgrade aws-java-sdk version to 1.12.31. 7. Support AWS Glue Data Catalog. 8. Remove HudiScanNode (no longer supported).	2023-01-20 14:42:16 +08:00
spaces-x	9f9651b2f2	[Enhancement](Jemalloc): correct the variable name of malloc_conf & enable prof (#15382 ) enable profile and correct the conf name in Jemalloc.	2022-12-28 09:50:59 +08:00
Mingyu Chen	5cf88a5339	[improvement](config) opt the message when missing JAVA_HOME for BE (#15045 ) Make the error message easy to understand	2022-12-14 23:17:46 +08:00
Yongqiang YANG	44eb1cf1c3	[fix](chore) read max_map_count from proc and make notice much more understandable (#14137 ) Some users cannot use sysctl as non-root on Linux, so we read max_map_count from proc. Notify users that they can change max_map_count as root.	2022-11-11 23:05:54 +08:00
Yongqiang YANG	a58ac48a6e	[chore](bin) do not set heap limit for tcmalloc until doris does not allocates large unused memory (#13761 ) We set a heap limit for tcmalloc to avoid OOM introduced by tcmalloc, which allocates memory for its cache even when the machine has little free memory. However, doris allocates large amounts of unused memory in some cases, so tcmalloc would throw an OOM exception even when there is a lot of free memory on the machine. We can set the limit again after we fix the problem.	2022-11-08 19:26:30 +08:00
Adonis Ling	2ef8f3f6f4	[enhancement](java-udf) Support loading libjvm at runtime (#13660 )	2022-10-28 08:45:12 +08:00
Gabriel	78278f5943	[chore](be version) Check BE version by script (#13594 )	2022-10-25 16:20:38 +08:00
Adonis Ling	2cf89c55c2	[chore](macOS) Fix issues found on macOS x86_64 (#13583 ) 1. Use `brew --prefix` instead of `brew --repo` in scripts. 2. `sprintf` is marked as a deprecated function in MacOSX sdk (13.0).	2022-10-24 20:59:20 +08:00
Adonis Ling	410e36ef5b	[enhancement](macOS) Refine the build scripts for macOS (#13473 ) Set the environment up before running the build scripts on macOS.	2022-10-19 22:52:22 +08:00
Adonis Ling	125def5102	[enhancement](macOS M1) Support building from source on macOS (M1) (#13195 ) # Proposed changes This PR fixed lots of issues when building from source on macOS with Apple M1 chip. ## ATTENTION The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime: 1. Some errors with memory tracker occur when BE (RELEASE) starts. 2. Some UT cases fail. ... Temporarily, the following changes are made on macOS to start BE successfully. 1. Disable memory tracker. 2. Use tcmalloc instead of jemalloc. This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues. ## Use case ```shell ./build.sh -j 8 --be --clean cd output/be/bin ulimit -n 60000 ./start_be.sh --daemon ``` ## Something else It takes around _10+_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.	2022-10-18 13:10:13 +08:00
Yongqiang YANG	55fc55d5e3	[improvement](tcmalloc) increase tcmalloc upper limit to 90% (#13245 )	2022-10-11 15:40:24 +08:00
Yongqiang YANG	00c672340d	[improvement](memory) set TCMALLOC_HEAP_LIMIT_MB to control memory consumption of tcmalloc (#12981 )	2022-09-28 15:44:18 +08:00
Xinyi Zou	16bb5cb430	[enhancement](memory) Jemalloc performance optimization and compatibility with MemTracker #12496	2022-09-28 12:04:29 +08:00
Yongqiang YANG	7798309807	[improvement](start_script) add ASAN and UBSAN env in start_be.sh #12014 Neither asan nor ubsan generates a core file by default; however, we need a core file to analyze problems detected by asan and ubsan.	2022-08-24 08:43:00 +08:00
Adonis Ling	e63c83e8e1	[fix](script) Support starting BE without Java environment (#11910 )	2022-08-19 17:58:40 +08:00
Adonis Ling	4fa53b4cdb	[chore](workflow) Add shellcheck to check shell scripts (#11744 )	2022-08-18 16:07:28 +08:00
Jibing-Li	6c9d158e9b	[fix](script) Fix hdfs-site.xml file name typo. #11653	2022-08-10 21:42:08 +08:00
caiconghui	71d9b383d4	[Enhancement](hdfs) Support loading hdfs config for be from hdfs-site.xml (#11634 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-08-10 14:49:50 +08:00
Dongyang Li	6c065d3d59	[script](start_fe) support "--version" to show fe build info (#11563 )	2022-08-08 15:55:01 +08:00
caiconghui	411254c128	[Enhancement](hdfs) Support loading hdfs config from hdfs-site.xml (#11571 )	2022-08-08 14:18:28 +08:00
Dongyang Li	52290fed90	[tools](tpch)update queries for better performance (#11523 )	2022-08-05 14:04:26 +08:00
Dongyang Li	8a8e8e8b45	[fix](stop-script) use kill -9 to stop fe as usual (#11387 )	2022-08-01 14:18:10 +08:00
Dongyang Li	8483660fe7	[opt] unify stop script (#11337 )	2022-07-29 21:04:03 +08:00
Dongyang Li	e210db426c	[opt] stop script opt (#11183 )	2022-07-27 11:32:26 +08:00
Yongqiang YANG	08ebef2992	[Enhancement] check vm.max_map_count before starting (#11052 ) When vectorized engine is enabled, doris uses much more vmas than before, and it leads to core dump due to memory allocation failure.	2022-07-21 21:16:48 +08:00
huangzhaowei	7147a7c290	[feature-wip](multi-catalog) Support s3 storage for file scan node (#10977 ) This is an example of s3 hms_catalog: ```sql CREATE CATALOG hms_catalog properties( "type" = "hms", "hive.metastore.uris"="thrift://localhost:9083", "AWS_ACCESS_KEY" = "your access key", "AWS_SECRET_KEY"="your secret key", "AWS_ENDPOINT"="s3 endpoint", "AWS_REGION"="s3-region", "fs.s3a.paging.maximum"="1000"); ``` All these params are necessary;	2022-07-21 17:38:53 +08:00
Yongqiang YANG	f78db1d773	release memory allocated in agg function in vec stream load (#10739 ) When a load is cancelled, memory allocated by agg functions should be freed.	2022-07-16 15:32:53 +08:00
Lei Zhang	89bec9b56a	[enhancement](be) be asan add asan_suppr.conf to ignore known leak. (#10768 )	2022-07-12 19:51:34 +08:00
minghong	6f29a8ac0d	[refactor] update stop_be.sh to avoid error message (#10691 ) * update stop_be.sh to avoid error message * update stop_be.sh	2022-07-08 20:49:00 +08:00
Lei Zhang	bff561c0da	[feature](script) add --grace option for stop_be.sh (#10626 ) The BE ASAN memory leak check needs the app to exit gracefully.	2022-07-06 17:53:01 +08:00
Mingyu Chen	8a49c7ef04	[chore] Rename Doris binary output format	2022-06-24 15:30:05 +08:00
smallhibiscus	4df1106e1e	[improvement](script) Add jvm parameters and the process will automatically stop when oom occurs in fe. (#9765 )	2022-05-30 09:44:12 +08:00
Pxl	e772163b98	[fix](script) meet error on start_fe.sh(#9187 ) start_fe.sh: line 174: [: -eq: unary operator expected	2022-04-26 10:21:03 +08:00
yiguolei	3bdfcde8e8	[Improvement] not print logs to fe.out when fe is running under daemon mode (#9195 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-04-25 18:29:29 +08:00
Henry2SS	4a10b37ca2	[feature](image tool) support image load tool (#8982 )	2022-04-23 21:36:58 +08:00
leo65535	419ec3b96c	[Fix Bug] Fix ehco command not found (#9021 )	2022-04-15 13:43:47 +08:00
Gabriel	0d761f9909	[feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678 ) This feature is proposed in DSIP-1. This PR supports variable-length input and output for Java UDFs.	2022-04-11 09:36:16 +08:00
Gabriel	b89e4c7bba	[feature-wip](java-udf) support java UDF with fixed-length input and output (#8516 ) This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). This PR supports fixed-length input and output for Java UDFs. Phase I in DIP-1 is done after this PR. To support Java UDFs efficiently, I avoid data copies in JNI calls, and all compute operations are off-heap in Java. To achieve that, I use a UdfExecutor instead. For users, a UDF class must have a public evaluate method.	2022-03-23 10:32:50 +08:00


69 Commits
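
Several commits above (#11052, #14137) describe checking vm.max_map_count before BE starts and reading it from proc rather than through sysctl, so that non-root users can run the check. A minimal sketch of that idea follows. It is not the code from start_be.sh: the function names, the overridable path argument, and the caller-supplied threshold are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the check; not copied from start_be.sh.

# Print vm.max_map_count as read from procfs, which needs neither root nor sysctl.
# The path argument exists only so the function can be exercised against a test file.
read_max_map_count() {
    local proc_file="${1:-/proc/sys/vm/max_map_count}"
    if [[ -r "${proc_file}" ]]; then
        cat "${proc_file}"
    else
        return 1
    fi
}

# Return 0 if the current value is at least the required minimum,
# 1 if it is too low, and 2 if the value cannot be read.
# The threshold is supplied by the caller; this sketch assumes no particular value.
check_max_map_count() {
    local required="$1"
    local proc_file="${2:-/proc/sys/vm/max_map_count}"
    local current
    current="$(read_max_map_count "${proc_file}")" || {
        echo "cannot read ${proc_file}" >&2
        return 2
    }
    if (( current < required )); then
        echo "vm.max_map_count is ${current}, need at least ${required};" \
            "change it as root: sysctl -w vm.max_map_count=${required}" >&2
        return 1
    fi
    return 0
}
```

A start script could call `check_max_map_count <threshold> || exit 1` before launching the process, where the threshold is whatever the deployment requires.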