Commit Graph

55 Commits

Author SHA1 Message Date
Pxl
7dc7ed97eb [Chore](build) remove some unused code and remove some wno (#20326)
remove some unused code about spinlock
remove some wno and fix warning
remove varadic macro usage
2023-06-05 10:48:07 +08:00
e08de52ee7 [chore](compile) using PCH for compilation acceleration under clang (#19303) 2023-05-08 19:51:06 +08:00
16a394da0e [chore](build) Use include-what-you-use to optimize includes (PART III) (#18958)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-24 14:51:51 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
Pxl
16fc3a0e22 [Chore](compile) remove some unused static on inline function to reduce compile time (#17603)
remove some unused static on inline function to reduce compile time
2023-03-13 11:11:59 +08:00
Pxl
e2ac06d6d6 [Chore](execution) change PipelineTaskState to enum class && remove some row-based code (#17300)
1. change PipelineTaskState to enum class
2. remove some row-based code on FoldConstantExecutor::_get_result
3. reduce memcpy on minmax runtime filter function(Now we can guarantee that the input data is aligned)
4. add Wunused-template check, and remove some unused function, change some static function to inline function.
2023-03-08 12:41:15 +08:00
Pxl
f50edff59d [Chore](build) enable fallthrough check annd fix some fallthrough bug (#16748)
* enable fallthrough check annd fix some fallthrough bug

* fix

* fix
2023-02-15 15:58:43 +08:00
Pxl
b1347f4c38 [Chore](build) make compile option work on C objects && some refactor of cmakelists (#16451)
make compile option work on C objects && some refactor of cmakelists
2023-02-14 13:35:20 +08:00
15d9dd114b [Fix](cpp17) fix gutil unary_function binary_function for cpp17 (#16670) 2023-02-13 20:30:10 +08:00
Pxl
ca73c60442 [Chore](build) enable ignored-qualifiers check (#16196)
enable ignored-qualifiers check
2023-02-01 15:15:59 +08:00
28fcc093a8 [improvement](bitshuffle)Enable avx512 support in bitshuffle for performance boost (#15972)
As AVX512 is available in most modern processors, it is good to use them if have performance boost.
In latest bitshuffle, AVX512 have been added. We could make it integrated in doris for AVX512 case.

Tested with master branch, queries(SSB query q1.1.sql~q4.3.sql total 13 queries) can be boost from 1.4%~3.2%. (use run-ssb-queries.sh 5 times, each time with 100 iterations.)

Signed-off-by: Wu, Kaiqiang <kaiqiang.wu@intel.com>
Co-authored-by: vesslanjin <jun.i.jin@intel.com>
2023-01-30 10:33:01 +08:00
c3da5a687a [fix]fixed dangerous usage of namespace std (#15741)
Co-authored-by: zhaochangle <zhaochangle@selectdb.com>
2023-01-10 16:10:49 +08:00
aa0f38f864 [chore](gutil) remove some gutil files and use c++ stl instead (#15357)
* [chore](gutil) remove some gutil files and use c++ stl instead

* fix

* fix
2022-12-26 21:25:09 +08:00
a98636a970 [bugfix](from_unixtime) fix timezone not work for from_unixtime (#15298)
* [bugfix](from_unixtime) fix timezone not work for from_unixtime
2022-12-23 19:05:09 +08:00
Pxl
1b07e3e18b [Chore](refactor) some modify for pass c++20 standard (#15042)
some modify for pass c++20 standard
2022-12-17 14:41:07 +08:00
035657c5a1 [typo](comment) Fix a lot of spell errors in be comments (#14208)
fix typos.
2022-11-12 16:06:15 +08:00
32fea672b0 [chore](gutil) remove some gutil macros and solve some macro conflict with brpc (#13954)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-07 13:39:52 +08:00
125def5102 [enhancement](macOS M1) Support building from source on macOS (M1) (#13195)
# Proposed changes

This PR fixed lots of issues when building from source on macOS with Apple M1 chip.

## ATTENTION

The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...

Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.

This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues.

## Use case

```shell
./build.sh -j 8 --be --clean

cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```

## Something else

It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the  development experience on macOS greatly when we finish the adaptation job.
2022-10-18 13:10:13 +08:00
Pxl
4e6a59df4c [Improvement][chore] add const to all operator== (#11251) 2022-07-27 21:46:47 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
85362a907e [fix](mem tracker) Fix some memory leaks, inaccurate statistics, core dump, deadlock bugs (#10072)
1. Fix the memory leak. When the load task is canceled, the `IndexChannel` and `NodeChannel` mem trackers cannot be destructed in time.
2. Fix Load task being frequently canceled by oom and inaccurate `LoadChannel` mem tracker limit, and rewrite the variable name of `mem limit` in `LoadChannel`.
3. Fix core dump, when logout task mem tracker, phmap erase fails, resulting in repeated logout of the same tracker.
4. Fix the deadlock, when add_child_tracker mem limit exceeds, calling log_usage causes `_child_trackers_lock` deadlock.
5. Fix frequent log printing when thread mem tracker limit exceeds, which will affect readability and performance.
6. Optimize some details of mem tracker display.
2022-06-14 21:38:37 +08:00
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix Lru Cache MemTracker consumption value is negative.
2. Fix compaction Cache MemTracker has no track.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
73c4ec7167 Fix some typos in be/. (#9681) 2022-05-19 20:55:39 +08:00
e61d296486 [Refactor] Replace '#ifndef' with '#pragma once' (#9456)
* Replace '#ifndef' with '#pragma once'
2022-05-10 09:25:59 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
50864aca7d [refactor] fix warings when compile with clang (#8069) 2022-02-19 11:29:02 +08:00
7a73645eee [refactor] remove some unused code (#8022) 2022-02-12 15:17:28 +08:00
Pxl
3ee000c13c [chore] support build with libc++ && add some build config (#7903)
support LIBCPP/LDD/BUILD_META_TOOL for build.sh
2022-01-30 16:47:22 +08:00
800a36343a [chore] Prolog of hermetic build with GCC 11 and Clang 13. (#7712)
Prepare to generate hermetic build using GCC 11 and Clang 13.
The ideal toolchain would be ldb toolchain generated by [ldb_toolchain_gen.sh](https://github.com/amosbird/ldb_toolchain_gen/releases/download/v0.3/ldb_toolchain_gen.sh)

To kick off a clang build, set `DORIS_TOOLCHAIN=clang` before running any build scripts.
2022-01-21 12:12:04 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
24d38614a0 [Dependency] Upgrade thirdparty libs (#6766)
Upgrade the following dependecies:

libevent -> 2.1.12
OpenSSL 1.0.2k -> 1.1.1l
thrift 0.9.3 -> 0.13.0
protobuf 3.5.1 -> 3.14.0
gflags 2.2.0 -> 2.2.2
glog 0.3.3 -> 0.4.0
googletest 1.8.0 -> 1.10.0
snappy 1.1.7 -> 1.1.8
gperftools 2.7 -> 2.9.1
lz4 1.7.5 -> 1.9.3
curl 7.54.1 -> 7.79.0
re2 2017-05-01 -> 2021-02-02
zstd 1.3.7 -> 1.5.0
brotli 1.0.7 -> 1.0.9
flatbuffers 1.10.0 -> 2.0.0
apache-arrow 0.15.1 -> 5.0.0
CRoaring 0.2.60 -> 0.3.4
orc 1.5.8 -> 1.6.6
libdivide 4.0.0 -> 5.0
brpc 0.97 -> 1.0.0-rc02
librdkafka 1.7.0 -> 1.8.0

after this pr compile doris should use build-env:1.4.0
2021-10-15 13:03:04 +08:00
7e30b28f3a [Optimize] Speed up converting the data of other types to string in mysql_result_writer (#6384)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-24 22:30:58 +08:00
2f5b06ae70 [Bug][Optimize] Fix race condition problem and optimize do_money_format function (#6350)
* [Bug][Optimize] Fix race condition problem and optimize do_money_format function

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-06 16:29:34 +08:00
80220af271 [Enhancement] Use Parallel Hash Map Replace Unordered Map In Dict Encodeing Map And Hyper Set (#5990)
Use Parallel Hash Map Replace Unordered Map In Dict Encodeing Map And Hyper Set To Improve Ferformance
2021-06-10 17:38:08 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
2021-01-21 12:09:09 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
75e0ba32a1 Fixes some be typo (#4714) 2020-10-13 09:37:15 +08:00
97d963468a [Code Cleanup] Template nest convert to c++11 syntax and style (#4442) 2020-08-26 10:51:52 +08:00
e20d905d70 Remove unused KUDU codes (#3175)
KUDU table is no longer supported long time ago. Remove code related to it.
2020-03-24 13:54:05 +08:00
09a4d3e50a [gutil] import scoped_refptr smart pointer from KUDU (#2899)
scoped_refptr is used to replace std::shared_ptr, is generally faster and smaller.
advantage
  (1) only requires a single allocation, and ref count is on the same cache line as the object
  (2) the pointer only requires 8 bytes (since the ref count is within the object)
  (3) you can manually increase or decrease reference counts when more control is required
  (4) you can convert from a raw pointer back to a scoped_refptr safely without worrying about double freeing
  (5) since we control the implementation, we can implement features, such as debug builds that capture the stack trace of every referent to help debug leaks.
disadvantage
  (1) the referred-to object must inherit from RefCounted
  (2) does not support the weak_ptr use cases
2020-02-14 13:32:03 +08:00
de4d1778c6 Fix incompatibility with arm architecture in util and gutil (#2650)
1. upgrade  gutil code from imapla  to new verison, include `cpuinfo`, `spinlock` and `linux_syscall_support `
2. impliments arm version  utf8 check code
3. remove incompatible code from stopwatch
2020-01-06 18:39:31 +08:00
2a8e77d9cb Support arm atomicops (#2626) (#2627) 2019-12-31 22:39:22 +08:00
9be86a3db8 Add gutil split and strip tool (#2238)
With these two tools, we can very easily perform splitting
and trimming operations on strings.

The subsequent PR will use them to replace the existing
`boost::split()` and `boost::trim()`
2019-11-20 09:58:52 +08:00
d0316d158d Refactor and reorganize the file utils (#2089) 2019-11-11 20:25:41 +08:00
024348d74b Enable auto convert when check in (#1926)
Leverage gitattributes to enable auto convert end-of-line to LF when
checking in. Convert already exist CRLF to LF by removing all files and
checking out with new .gitattributes file. Except .gitattributes, all
files are only modified at the end of line.
2019-10-09 22:31:27 +08:00
1e4dd77d2a Add bitmap agg type and udaf (#1610) 2019-08-26 14:24:42 +08:00
b2e678dfc1 Support Segment for BetaRowset (#1577)
We create a new segment format for BetaRowset. New format merge
data file and index file into one file. And we create a new format
for short key index. In origin code index is stored in format like
RowCusor which is not efficient to compare. Now we encode multiple
column into binary, and we assure that this binary is sorted same
with the key columns.
2019-08-06 17:15:11 +08:00
a0294b8f40 Add Env for file operation (#1321) 2019-06-17 10:18:16 +08:00