doris

Author	SHA1	Message	Date
DeadlineFen	a05dbd3f81	[chore](compile) Improves PCH cache hit ratio (#19469 ) Supplement the documentation of be-clion-dev, avoid the problem of undefined DORIS_JAVA_HOME and inability to find jni.h when using clion development without directly compiling through build.sh Complete the classification of header files in pch.h and introduce some header files that are not frequently modified in doris. Separate the declaration and definition in common/config.h. If you need to modify the default configuration now, please modify it in common/config.cpp. gen_cpp/version.h is regenerated every time it is recompiled, which may cause PCH to fail, so now you need to get the version information indirectly rather than directly.	2023-05-10 12:49:01 +08:00
奕冷	d24dd12b20	[enhancement](http) add fail reply for failed submitting tasks in single-replica-download (#19356 )	2023-05-10 10:54:32 +08:00
DeadlineFen	e08de52ee7	[chore](compile) using PCH for compilation acceleration under clang (#19303 )	2023-05-08 19:51:06 +08:00
奕冷	5bf1396efe	[enhancement](load) merge single-replica related services as non-standalone (#18421 )	2023-05-06 22:54:56 +08:00
Ashin Gau	b6c7f3aeb8	[opt](FileCache) Add file cache metrics and management (#19177 ) Add file cache metrics and management. 1. Get file cache metrics > If the performance of file cache is not efficient, there are currently no metrics to investigate the cause. In practice, hit ratio, disk usage, and segments removed status are very important information. API: `http://be_host:be_webserver_port/metrics` File cache metrics for each base path start with the `doris_be_file_cache_` prefix. `hits_ratio` is the hit ratio of the cache since BE startup; `removed_elements` is the number of removed segment files since BE startup; Every cache path has three queues: index, normal and disposable. The capacity ratio of the three queues is 1:17:2. ``` doris_be_file_cache_hits_ratio{path="/mnt/datadisk1/gaoxin/file_cache"} 0.500000 doris_be_file_cache_hits_ratio{path="/mnt/datadisk1/gaoxin/small_file_cache"} 0.500000 doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 0 doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 0 doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/file_cache"} 912680550400 doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/small_file_cache"} 8500000000 doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 217600 doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 102400 doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/file_cache"} 14129846 doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/small_file_cache"} 14874904 doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 18 doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 22 ... ``` 2. Release file cache > Frequent swapping of segment files can seriously affect the performance of file cache. Adding a deletion interface helps users clean up the file cache. API: `http://be_host:be_webserver_port/api/file_cache?op=release&base_path=${file_cache_base_path}` Return the number of released segment files. If `base_path` is not provided in the URL, all cache paths will be released. It's thread-safe to call this API, so only segment files that are not currently being read are released. ``` {"released_elements":22} ``` 3. Specify the base path to store cache data > Currently, regression testing lacks test cases of file cache, which cannot guarantee the stability of file cache. This interface is generally used in regression testing scenarios. Different queries use different paths to verify different usage cases and performance. Users can set the session variable `file_cache_base_path` to specify the base path to store cache data. The default, `file_cache_base_path="random"`, means choosing a random path from the cached paths to store cache data. If `file_cache_base_path` is not one of the base paths in BE configuration, a random path is used.	2023-05-05 14:28:01 +08:00
xiaojunjie	9813406757	[Enhancement](HttpServer) Add http interface authentication for BE (#17753 )	2023-05-04 23:46:49 +08:00
yongjinhou	bee3aa3007	be conf action supports specify item (#19159 )	2023-04-28 19:12:51 +08:00
yixiutt	aef9355cd3	[feature-wip](partial update) PART1: support basic partial write (#17542 )	2023-04-28 17:17:57 +08:00
Zhengguo Yang	52b1bd2c81	[clone](download) fix be clone action download tablet content length overflow (#18851 )	2023-04-28 11:35:17 +08:00
Adonis Ling	e412dd12e8	[chore](build) Use include-what-you-use to optimize includes (PART II) (#18761 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-19 23:11:48 +08:00
HappenLee	b68857902e	[Compile](BE) Fix compile failed with tcmalloc (#18748 )	2023-04-18 09:26:45 +08:00
Adonis Ling	9e960f4c4f	[chore](build) Use include-what-you-use to optimize includes (#18681 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-17 11:44:58 +08:00
Pxl	975b373896	[Chore](thrift) add some check on client cache && remove some unused code && catch st… #18683	2023-04-15 17:47:51 +08:00
Xinyi Zou	c704351273	[enhancement](memory) Refactor memory limit exceeded behavior (#18590 ) No check mem tracker limit and no cancel task in mem hook, only in Allocator. This helps in clearer analysis of memory issues and reduces performance loss. PODArray/hash table/arena memory allocation will use Allocator. Optimize mem limit exceeded log printing Optimize compilation time	2023-04-14 10:42:35 +08:00
gitccl	7f8d92656e	[fix](streamload) fix stream load failed when enable profile (#18364 ) #18015 enables stream load profile log, however be will encounter rpc fail when loading tpch data(see #18291). This is because when `is_report_success` is true, be will reportExecStatus to fe, but fe cannot find QueryInfo in `coordinatorMap`, thus it will return error to be.	2023-04-05 01:01:46 +08:00
Pxl	e77833bfa1	[Bug](materialized-view) fix where clause persistence replay incorrect (#18228 ) fix where clause persistence replay incorrect	2023-04-03 12:49:01 +08:00
Mingyu Chen	05db6e9b55	[refactor](file-system)(step-2) remove env, file_utils and filesystem_utils (#18009 ) Follow #17586. This PR mainly changes: Remove env/ Remove FileUtils/FilesystemUtils Some methods are moved to LocalFileSystem Remove olap/file_cache Add s3 client cache for s3 file system In my test, the time of open s3 file can be reduced significantly Fix cold/hot separation bug for s3 fs. This is the last PR of #17764. After this, all IO operation should be in io/fs. Except for tests in #17586, I also tested some case related to fs io: clone concurrency query on local/s3/hdfs load error log create and clean disk metrics	2023-03-29 09:00:52 +08:00
yiguolei	359f5be53e	[refactor](cgroup) remove cgroup manager it is useless (#18124 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-27 23:02:18 +08:00
gitccl	0523860877	[Enhancement](streamload) print profile for streamload (#18015 ) When both enable_profile and enable_stream_load_profile_log is true, stream load profile is printed to the log	2023-03-24 20:17:33 +08:00
AlexYue	6cbf393665	[enhance](meta action) remove useless pb field and refactor writer cooldown meta code (#17652 )	2023-03-22 11:13:13 +08:00
Mingyu Chen	cb79e42e5c	[refactor](file-system)(step-1) refactor file system on BE and remove storage_backend (#17586 ) See #17764 for details. I have tested: - Unit test for local/s3/hdfs/broker file system: be/test/io/fs/file_system_test.cpp - Outfile to local/s3/hdfs/broker. - Load from local/s3/hdfs/broker. - Query file on local/s3/hdfs/broker file system, with table value function and catalog. - Backup/Restore with local/s3/hdfs/broker file system Not tested: - cold & hot data separation case.	2023-03-21 21:08:38 +08:00
Yongqiang YANG	e687f3badd	Revert "[feature-wip](BE http)Support BE http service using brpc (#16123 )" (#17219 ) This reverts commit 049ecccc578802496e5421db19e21e7eb256699d. Merge back after streamload is handled.	2023-03-01 09:18:25 +08:00
lvliang	34813bae13	[improvement](meta) make database,table,column names to support unicode (replace PR #13467 with this) (#14531 ) Make database, table, column and other names support unicode by changing LABEL_REGEX COMMON_NAME_REGIEX COMMON_TABLE_NAME_REGEX COLUMN_NAME_REGEX regular expressions in class FeNameFormat. P.S. @SharpRay has transferred PR #13467 to me, and I'm responsible for the task now. There will be some modifications during the review period, so I created a new PR and the original #13467 can be closed. Thanks.	2023-02-28 18:50:36 +08:00
奕冷	049ecccc57	[feature-wip](BE http)Support BE http service using brpc (#16123 ) Now, streamload is not supported.	2023-02-28 09:59:29 +08:00
Zhengguo Yang	b51ce415e7	[Feature](load) Add submitter and comments to load job (#16878 ) * [Feature](load) Add submitter and comments to load job	2023-02-28 09:06:19 +08:00
yiguolei	03a4fe6f39	[enhancement](streamload) make stream load context as shared ptr and save it in global load mgr (#16996 )	2023-02-24 11:15:29 +08:00
Xinyi Zou	2074b83c67	[enhancement](third-party) Upgrade JEMalloc version from 5.2.1 to 5.3.0 (#14871 ) https://github.com/jemalloc/jemalloc/releases	2023-02-20 00:00:40 +08:00
zhengshengjun	d013d529c8	[Feature](ipv6)Support IPV6 (#14063 ) Support IPV6 in Apache Doris, the main changes are: 1. enable binding to IPV6 address if network priority in config file contains an IPV6 CIDR string 2. BRPC and HTTP support binding to IPV6 address 3. BRPC and HTTP support visiting IPV6 Services	2023-02-14 21:43:10 +08:00
huangzhaowei	f41a2055d3	[feature](Load)Remove user/password in properties for mysql load to avoid double auth. (#16073 ) Use FE cluster token to auth stream load. This auth is only open for BE, and FE auth still only supports HTTP basic auth. I will use this auth for mysql load to build a no-auth stream load from FE to BE. This will avoid double auth in mysql load. See the design doc for more information.	2023-02-13 10:00:08 +08:00
AlexYue	1f631c388d	[enhance](cooldown)accelerate cooldown task produce efficiency (#16089 )	2023-02-10 16:58:27 +08:00
yiguolei	4fcd6cd236	[refactor](remove unused code) remove load stream mgr (#16580 ) remove old stream load pipe remove old stream load manager --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-02-10 07:46:18 +08:00
weizuo93	da27039fe4	[Fix](load) Fix memory leak for stream load 2pc #16430 StreamLoadContext is not deleted correctly. Co-authored-by: weizuo <weizuo@xiaomi.com>	2023-02-06 15:52:17 +08:00
Pxl	5e4bb98900	[Chore](build) enable -Wpedantic and update lowest gcc version to 11.1 (#16290 ) enable -Wpedantic and update lowest gcc version to 11.1	2023-02-03 11:28:48 +08:00
huangzhaowei	b878a7e61e	[feature](Load)Support skipping a specified number of lines for csv stream load (#16055 ) Support setting the number of lines to skip when loading a csv file with stream load. Usage `-H skip_lines:number`: ``` curl --location-trusted -u root: -T test.csv -H skip_lines:5 -XPUT http://127.0.0.1:8030/api/testDb/testTbl/_stream_load ``` The number of lines to skip can also be set in mysql load, as below: ```sql LOAD DATA LOCAL INFILE '${mysql_load_skip_lines}' INTO TABLE ${tableName} COLUMNS TERMINATED BY ',' IGNORE 2 LINES PROPERTIES ("auth" = "root:"); ```	2023-02-01 20:42:43 +08:00
Xinyi Zou	97fcad76f8	[enhancement](memtracker) Improve readability (#15716 )	2023-01-16 16:30:35 +08:00
Pxl	b727033906	[Chore](build) enable -Wextra and remove some -Wno (#15760 ) enable -Wextra and remove some -Wno	2023-01-15 10:40:35 +08:00
pengxiangyu	58c520dbfd	[Feature](remote) Cooldown cold data to object storage only one replica (#15832 )	2023-01-14 23:58:00 +08:00
plat1ko	ad68764977	[enhancement](tablet) Unify redundant `create_rowset_writer` methods (#15519 ) * Remove redundant create_rowset_writer methods * Set resource id when setting FS in rowset meta * fix * fix ut	2022-12-30 22:57:12 +08:00
AlexYue	ffef81a6ab	[feature](BE)pad missed version with empty rowset (#15030 ) If all replicas of one tablet are broken, user can use this http api to pad the missed version with empty rowset.	2022-12-29 11:20:44 +08:00
spaces-x	a22ee89431	[Enhancement](jemalloc):support heap dump by http request at runtime (#15429 )	2022-12-28 20:10:50 +08:00
Xin Liao	bf71943605	[feature](load) stream load trim double quotes for csv (#15241 )	2022-12-26 11:45:54 +08:00
Tiewei Fang	ec055e1acb	[feature](new file reader) Integrate new file reader (#15175 )	2022-12-26 08:55:52 +08:00
Zhengguo Yang	a98636a970	[bugfix](from_unixtime) fix timezone not work for from_unixtime (#15298 ) * [bugfix](from_unixtime) fix timezone not work for from_unixtime	2022-12-23 19:05:09 +08:00
Pxl	1b07e3e18b	[Chore](refactor) some modify for pass c++20 standard (#15042 ) some modify for pass c++20 standard	2022-12-17 14:41:07 +08:00
Mingyu Chen	0e1e5a802b	[config](load) enable new load scan node by default (#14808 ) Set FE `enable_new_load_scan_node` to true by default. So that all load tasks(broker load, stream load, routine load, insert into) will use FileScanNode instead of BrokerScanNode to read data 1. Support loading parquet file in stream load with new load scan node. 2. Fix bug that new parquet reader can not read column without logical or converted type. 3. Change jsonb parser function to "jsonb_parse_error_to_null" So that if the input string is not a valid json string, it will return null for jsonb column in load task.	2022-12-16 09:41:43 +08:00
plat1ko	f3aea7f0f0	[Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744 )	2022-12-11 23:33:18 +08:00
Pxl	82da071b45	[Chore](format) update clang-format version to 15 (#13036 ) update clang-format version to 15	2022-11-29 14:46:10 +08:00
Xinyi Zou	21416f9947	[enhancement](memory) Support Jemalloc metrics and default allocator changed to Jemalloc (#14384 )	2022-11-18 21:02:54 +08:00
Xinyi Zou	dd11d5c0a5	[enhancement](memory) Support try catch bad alloc (#14135 )	2022-11-13 11:22:56 +08:00
xy720	035657c5a1	[typo](comment) Fix a lot of spell errors in be comments (#14208 ) fix typos.	2022-11-12 16:06:15 +08:00
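
Note on commit b6c7f3aeb8 (#19177): the metric lines quoted in that commit message can be read with a few lines of Python. The sketch below is only an illustration, not part of Doris. The function names `parse_file_cache_metrics` and `total_capacity_from_normal_queue` are hypothetical, the sample text is copied from the commit message, and the capacity estimate assumes that the normal queue is the "17" in the stated 1:17:2 ratio (index : normal : disposable).

```python
import re

# Sample lines copied from the commit message of #19177.
SAMPLE = '''\
doris_be_file_cache_hits_ratio{path="/mnt/datadisk1/gaoxin/file_cache"} 0.500000
doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/file_cache"} 912680550400
doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/small_file_cache"} 8500000000
doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/file_cache"} 14129846
'''

PREFIX = "doris_be_file_cache_"
# metric_name{path="..."} value
LINE_RE = re.compile(r'^(doris_be_file_cache_\w+)\{path="([^"]*)"\}\s+(\S+)$')


def parse_file_cache_metrics(text):
    """Return {cache_path: {metric_name_without_prefix: float_value}}."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that are not file cache metrics
        name, path, value = match.groups()
        result.setdefault(path, {})[name[len(PREFIX):]] = float(value)
    return result


def total_capacity_from_normal_queue(normal_max_size):
    """Estimate the total cache size in bytes from the normal queue size.

    Assumption: the normal queue holds 17 of the 20 parts in the 1:17:2 split.
    """
    return normal_max_size * 20 // 17


metrics = parse_file_cache_metrics(SAMPLE)
for path, values in metrics.items():
    normal_max = int(values["normal_queue_max_size"])
    print(path, total_capacity_from_normal_queue(normal_max))
```

Under that assumption, the two sample paths work out to round totals (1073741824000 and 10000000000 bytes), which is consistent with the ratio given in the commit message.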

259 Commits
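
Note on the release endpoint described in commit b6c7f3aeb8 (#19177): the message gives the URL shape `http://be_host:be_webserver_port/api/file_cache?op=release&base_path=${file_cache_base_path}` and the response `{"released_elements":22}`. The following sketch shows one way a client might build that URL and read the response. The helper names, the host `127.0.0.1` and the port `8040` are placeholders chosen for illustration, and it is an assumption that the server accepts a percent-encoded `base_path`.

```python
import json
from urllib.parse import urlencode


def build_release_url(be_host, be_webserver_port, base_path=None):
    """Build the file cache release URL; omit base_path to release all paths."""
    params = {"op": "release"}
    if base_path is not None:
        # urlencode percent-encodes the path (assumed to be decoded server-side).
        params["base_path"] = base_path
    query = urlencode(params)
    return f"http://{be_host}:{be_webserver_port}/api/file_cache?{query}"


def released_count(response_body):
    """Extract the number of released segment files from the JSON response."""
    return json.loads(response_body)["released_elements"]


# Placeholder host and port; the response body is the one quoted in the commit.
print(build_release_url("127.0.0.1", 8040))
print(released_count('{"released_elements":22}'))
```

The sketch only constructs the request and parses a response string, so it can be tried without a running BE.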