doris

Author	SHA1	Message	Date
Adonis Ling	95f2f43c02	[fix](macOS) Failed to run BE UT due to syscall to map cache into shared region failed (#15641 ) According to the post https://developer.apple.com/forums/thread/676684, the executable whose size is bigger than 2G may fail to start. The size of the executable `doris_be_test` generated by run-be-ut.sh is 2.1G (> 2G) now and we can't run it on macOS (arm64). We can separate the debug info from the executable `doris_be_test` to reduce the size. After that, we can run `doris_be_test` successfully.	2023-01-06 01:23:37 +08:00
Adonis Ling	388f067300	[chore](workflow) Disable memory tracker by default on BE UT (macOS) (#14508 )	2022-11-23 16:25:42 +08:00
Adonis Ling	249b688663	[chore](github) Add a workflow to check BE UT on macOS (#14506 )	2022-11-23 08:38:28 +08:00
Pxl	bcd641877f	[Enhancement](scan) disable build key range and filters when push down agg work (#14248 )	2022-11-21 12:47:57 +08:00
HappenLee	74a1e28af3	[Opt](exec) prevent the scan key split whole range (#14088 )	2022-11-11 15:46:00 +08:00
Adonis Ling	2ef8f3f6f4	[enhancement](java-udf) Support loading libjvm at runtime (#13660 )	2022-10-28 08:45:12 +08:00
Adonis Ling	125def5102	[enhancement](macOS M1) Support building from source on macOS (M1) (#13195 ) # Proposed changes This PR fixed lots of issues when building from source on macOS with Apple M1 chip. ## ATTENTION The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime: 1. Some errors with memory tracker occur when BE (RELEASE) starts. 2. Some UT cases fail. ... Temporarily, the following changes are made on macOS to start BE successfully. 1. Disable memory tracker. 2. Use tcmalloc instead of jemalloc. This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues. ## Use case ```shell ./build.sh -j 8 --be --clean cd output/be/bin ulimit -n 60000 ./start_be.sh --daemon ``` ## Something else It takes around _10+_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.	2022-10-18 13:10:13 +08:00
Yongqiang YANG	de9b9b3e8e	[chore](ut) enable asan core dump when running be ut (#12371 )	2022-09-07 10:09:18 +08:00
Adonis Ling	b4101d46f0	[fix](workflow) Fix the errors when using sh to run shell scripts (#11898 )	2022-08-19 21:28:52 +08:00
Adonis Ling	e63c83e8e1	[fix](script) Support starting BE without Java environment (#11910 )	2022-08-19 17:58:40 +08:00
Adonis Ling	4fa53b4cdb	[chore](workflow) Add shellcheck to check shell scripts (#11744 )	2022-08-18 16:07:28 +08:00
Adonis Ling	573ebf235e	[enhancement](build) Support customizing extra compile flags (#11444 )	2022-08-03 11:02:17 +08:00
Adonis Ling	5215d95064	[enhancement](workflow) Use ccache to speed the BE UT (Clang) up (#11339 )	2022-07-29 21:19:26 +08:00
Xinyi Zou	d9095922d9	[Enhancement] [Memory] add strict memory usage compile option STRICT_MEMORY_USE (#10936 ) In the strict memory usage mode (`STRICT_MEMORY_USE=ON`), when the capacity of the vectorized hash table is greater than 2G, the table starts to grow once 75% of its capacity is reached, and the memory usage of the vectorized join becomes 50% of the previous value. `STRICT_MEMORY_USE=ON` expects BE to use less memory and gives priority to stability when cluster memory is limited.	2022-07-18 16:16:43 +08:00
lihangyu	b04a791895	[Enhancement] support compile with jemalloc (#10542 ) A test feature to use jemalloc as default malloc.	2022-07-11 12:15:35 +08:00
Pxl	4750e94746	set default to not build benchmark-tool and use lld/gold (#10215 )	2022-06-25 22:31:11 +08:00
yiguolei	90f229c038	[refactor] remove useless plugin test code (#10061 ) * remove plugin test code * remove plugin test Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-06-16 10:43:28 +08:00
Zhengguo Yang	39a2785ce2	[enhancement] support simd instructions on arm cpus through sse2neon (#10068 )	2022-06-14 09:17:09 +08:00
Zhengguo Yang	e0cf2677a0	[dependency][enhancement] support build libhdfs in arm cpus (#10018 ) Supports native HDFS functionality on ARM CPUs. This PR mainly upgrades libhdfs3 to support running on ARM, and builds libhdfs3 with Kerberos by default.	2022-06-10 19:40:41 +08:00
Xinyi Zou	ca05d1ee01	[fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661 ) 1. Fix Lru Cache MemTracker consumption value is negative. 2. Fix compaction Cache MemTracker has no track. 3. Add USE_MEM_TRACKER compile option. 4. Make sure the malloc/free hook is not stopped at any time.	2022-05-25 08:56:17 +08:00
gtchaos	b3a2a92bf5	[deps] libhdfs3 build enable kerberos support (#9524 ) Currently, the libhdfs3 library integrated by Doris BE does not support accessing clusters with Kerberos authentication enabled, because the Kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3. This PR enables Kerberos support and rebuilds libhdfs3 with the dependencies gsasl and krb5: - gsasl version: 1.8.0 - krb5 version: 1.19	2022-05-22 20:58:19 +08:00
Adonis Ling	119ff2c02d	[enhancement] Improve debugging experience. (#9677 )	2022-05-19 16:36:37 +08:00
zhannngchen	87fc46f84c	update comments in run-be-ut.sh (#9092 )	2022-04-26 12:48:35 +08:00
Pxl	d161161767	[refactor](script) remove unused parameter on run-be-ut.sh (#9000 ) The parameter -v does not work in run-be-ut.sh now.	2022-04-14 11:45:35 +08:00
Zhengguo Yang	5a44eeaf62	[refactor] Unify all unit tests into one binary file (#8958 ) 1. Solve the previously deferred problems that the unit test files were too large (1.7G+) and that the unit test link time was too long 2. Unify all unit tests into one file to significantly reduce unit test execution time to less than 3 mins 3. Temporarily disable stream_load_test.cpp, metrics_action_test.cpp, load_channel_mgr_test.cpp because they re-implement part of the code and affect other tests	2022-04-12 15:30:40 +08:00
Gabriel	0d761f9909	[feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678 ) This feature is proposed in DSIP-1. This PR support variable-length input and output Java UDF.	2022-04-11 09:36:16 +08:00
plat1ko	76e0634030	[fix](ut) fix be ut in olap (#8739 )	2022-03-30 09:53:21 +08:00
Pxl	a8af8d2981	[fix](vectorized) fix core dump on get_json_string and add some ut (#8496 )	2022-03-17 10:08:31 +08:00
dataroaring	77b21fba03	[chore] make options of build.sh and run-be-ut.sh work (#8271 ) The h option of build.sh and j option of run-be-ut.sh do not work in the docker with image apache/incubator-doris:build-env-ldb-toolchain-latest.	2022-03-02 10:17:50 +08:00
Zhengguo Yang	f8d086d87f	[feature](rpc) (experimental)Support implement UDF through GRPC protocol. (#7519 ) Support implement UDF through GRPC protocol. This brings several benefits: 1. The udf implementation language is not limited to c++, users can use any familiar language to implement udf 2. UDF is decoupled from Doris, udf will not cause doris coredump, udf computing resources are separated from doris, and doris services are not affected But RPC's UDF has a fixed overhead, so its performance is much slower than C++ UDF, especially when the amount of data is large. Create function like ``` CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES ( "SYMBOL"="add_int", "OBJECT_FILE"="127.0.0.1:9999", "TYPE"="RPC" ); ``` Function service need to implement `check_fn` and `fn_call` methods Note: THIS IS AN EXPERIMENTAL FEATURE, THE INTERFACE AND DATA STRUCTURE MAY BE CHANGED IN FUTURE !!!	2022-02-08 09:25:09 +08:00
Adonis Ling	e7103bfd08	[chore] Set the full path of make program to CMake (#7909 )	2022-02-06 08:34:08 +08:00
Pxl	3ee000c13c	[chore] support build with libc++ && add some build config (#7903 ) support LIBCPP/LDD/BUILD_META_TOOL for build.sh	2022-01-30 16:47:22 +08:00
zhangstar333	fb6e22f4ca	[Fix] fix memory leak in be unit test (#7857 ) 1. fix be unit test memory leak 2. ignore minidump test with ASAN test	2022-01-29 01:00:38 +08:00
Pxl	cd73a6b84b	[chore] fix clang compile error (#7883 )	2022-01-26 12:53:35 +08:00
Adonis Ling	1c711705d7	[chore] Use ccache to speed recompiling test code up. (#7811 )	2022-01-22 10:18:52 +08:00
HappenLee	e1d7233e9c	[feature](vectorization) Support Vectorized Exec Engine In Doris (#7785 ) # Proposed changes Issue Number: close #6238 Co-authored-by: HappenLee <happenlee@hotmail.com> Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com> Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com> Co-authored-by: wangbo <506340561@qq.com> Co-authored-by: emmymiao87 <522274284@qq.com> Co-authored-by: Pxl <952130278@qq.com> Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com> Co-authored-by: thinker <zchw100@qq.com> Co-authored-by: Zeno Yang <1521564989@qq.com> Co-authored-by: Wang Shuo <wangshuo128@gmail.com> Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com> Co-authored-by: Gabriel <gabrielleebuaa@gmail.com> Co-authored-by: xinghuayu007 <1450306854@qq.com> Co-authored-by: weizuo93 <weizuo@apache.org> Co-authored-by: yiguolei <guoleiyi@tencent.com> Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com> Co-authored-by: awakeljw <993007281@qq.com> Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com> Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com> ## Problem Summary: ### 1. Some code from clickhouse ClickHouse is an excellent implementation of the vectorized execution engine database, so here we have referenced and learned a lot from its excellent implementation in terms of data structure and function implementation. We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers. The following comment has been added to the code from Clickhouse, eg: // This file is copied from // https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h // and modified by Doris ### 2. Support exec node and query: * vaggregation_node * vanalytic_eval_node * vassert_num_rows_node * vblocking_join_node * vcross_join_node * vempty_set_node * ves_http_scan_node * vexcept_node * vexchange_node * vintersect_node * vmysql_scan_node * vodbc_scan_node * volap_scan_node * vrepeat_node * vschema_scan_node * vselect_node * vset_operation_node * vsort_node * vunion_node * vhash_join_node You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set. ### 3. Data Model Vec Exec Engine Support Dup/Agg/Unq table, Support Block Reader Vectorized. Segment Vec is working in process. ### 4. How to use 1. Set the environment variable `set enable_vectorized_engine = true; `(required) 2. Set the environment variable `set batch_size = 4096; ` (recommended) ### 5. Some diff from origin exec engine https://github.com/doris-vectorized/doris-vectorized/issues/294 ## Checklist(Required) 1. Does it affect the original behavior: (No) 2. Has unit tests been added: (Yes) 3. Has document been added or modified: (No) 4. Does it need to update dependencies: (No) 5. Are there any changes that cannot be rolled back: (Yes)	2022-01-18 10:07:15 +08:00
Zhengguo Yang	760fc02bfe	Added brpc stub cache check and reset api, used to test whether the brpc stub cache is available, and reset the brpc stub cache (#6916 ) Add a config used to automatically check and reset the brpc stub	2021-11-05 09:45:37 +08:00
Pxl	4dd610c28d	[Feature] Support for storage layer benchmark (#6506 ) * add benchmark tool	2021-09-02 09:57:19 +08:00
Mingyu Chen	9148bcb673	[Build] Reduce the parallel of build (#6469 )	2021-08-18 15:24:19 +08:00
HappenLee	9216735cfa	[New Feature] Support Vectorization Execution Engine Interface For Doris (#6329 ) 1. FE vectorized plan code 2. Function register vec function 3. Diff function nullable type 4. New thirdparty code and new thrift struct	2021-08-11 14:54:06 +08:00
Yingchun Lai	109b55ee5f	[Shell] Add build parallel option (#5819 ) Add build parallel option then we can build project with a user specified parallel not a fixed value.	2021-05-19 09:32:58 +08:00
Zhengguo Yang	029a8a046b	[Build] Turn on glibc compatibility by default for upgrading gcc10 (#5528 )	2021-03-21 11:22:53 +08:00
Zhengguo Yang	db2120a7f2	[Build][BE] Fix GLIBC_COMPATIBILITY can not compile in centos6 (#5472 ) Add option to disable glibc_compatibility	2021-03-07 20:47:13 +08:00
Zhengguo Yang	e536823f92	[Thirdparty] Fix build thirdparty may be failed (#5187 ) 1. Fix thirdparty build failures on some OSes caused by the default lib path being `lib` or `lib64`, and `arrow` build failures caused by `brotli` and `zstd` 2. Fix failure to extract `.tar.bz2` files	2021-01-04 15:21:18 +08:00
Zhengguo Yang	279ae1cb75	Add fuzzy_parse option to speed up json import (#5114 ) Add a `fuzzy_parse` flag: if all objects in the JSON file have the same keys in the same order, only the keys of the first row need to be parsed, and later rows are read by index instead of by key	2020-12-25 09:19:42 +08:00
Mingyu Chen	912547260a	[UnitTest] Refactor BE unit test script (#4266 ) 1. Rename run-ut.sh to run-be-ut.sh 2. Find all test files from build dir instead of declaring separately in the script 3. Add gtest output to collect the result of unit test.	2020-08-11 10:23:51 +08:00
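
The idea behind the fuzzy_parse commit (279ae1cb75) in the log above can be sketched in a few lines of Python. This is only an illustration and is not taken from the Doris source: the function name `fuzzy_parse_rows` and the sample rows are hypothetical, and Python's `json` module still decodes every key, so the sketch shows the positional mapping rather than the speed-up.

```python
import json


def fuzzy_parse_rows(lines):
    """Read column names from the first JSON object only.

    Hypothetical helper: assumes every row is a JSON object with the
    same keys in the same order, as the commit message requires.
    """
    columns = None
    table = []
    for line in lines:
        obj = json.loads(line)
        if columns is None:
            # Only the first row contributes key names.
            columns = list(obj.keys())
        # Later rows are addressed by position, not by key.
        values = list(obj.values())
        if len(values) != len(columns):
            raise ValueError("row does not match the first row's layout")
        table.append(values)
    return columns, table


rows = ['{"id": 1, "name": "a"}', '{"id": 2, "name": "b"}']
columns, table = fuzzy_parse_rows(rows)
```

A real implementation would presumably skip key decoding for every row after the first; the sketch only checks that the row width matches.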

46 Commits