For an HDFS TVF like:
```
select count(*) from hdfs(
"uri" = "hdfs://HDFS8000871/path/to/1.parquet",
"fs.defaultFS" = "hdfs://HDFS8000871/",
"format" = "parquet"
);
```
Previously, if `fs.defaultFS` ended with `/`, the query would fail with an error like:
```
reason: RemoteException: File does not exist: /user/doris/path/to/1.parquet
```
You can see the path is wrong: it is resolved with the unexpected prefix `/user/doris`.
Users had to set `fs.defaultFS` to `hdfs://HDFS8000871` (without the trailing slash) to avoid this error.
This PR fixes this issue.
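To make the failure mode concrete, here is a rough sketch (not the actual Doris code; the helper name and structure are made up for illustration): if the `fs.defaultFS` prefix is not stripped cleanly, the remaining path loses its leading `/`, and HDFS resolves the relative path against the user's home directory, which is where the `/user/doris` prefix comes from.
```
#include <string>

// Hypothetical helper for illustration only (not the actual Doris fix):
// extract the path from the full URI using the fs.defaultFS prefix, and keep
// it absolute even when fs.defaultFS ends with '/'. A relative path would be
// resolved by HDFS against the user's home directory, e.g. /user/doris.
std::string extract_hdfs_path(const std::string& uri, std::string fs_prefix) {
    // "hdfs://HDFS8000871/" -> "hdfs://HDFS8000871"
    while (!fs_prefix.empty() && fs_prefix.back() == '/') {
        fs_prefix.pop_back();
    }
    std::string path = uri;
    if (uri.compare(0, fs_prefix.size(), fs_prefix) == 0) {
        path = uri.substr(fs_prefix.size());
    }
    // Keep the path absolute so HDFS does not treat it as relative.
    if (path.empty() || path.front() != '/') {
        path = "/" + path;
    }
    return path; // "/path/to/1.parquet"
}
```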
Previously, when FQDN was enabled, Doris called the DNS resolver to turn a hostname into an IP
every time 1) the FE got a BE's gRPC client, or 2) a BE got another BE's bRPC client.
So under high concurrency the DNS resolver could be overloaded and fail to resolve hostnames.
This PR mainly changes:
1. Add a DNSCache for both FE and BE (see the sketch below).
The DNSCache runs on every FE and BE node. It holds a cache whose keys are hostnames and values are IPs.
Callers get an IP by hostname from this cache; if the hostname is not cached yet, the cache resolves it
and stores the result.
In addition, DNSCache has a daemon thread that refreshes the cache every 1 minute, in case an IP
changes at any time.
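Below is a minimal sketch of such a cache, assuming a simple map guarded by a mutex plus a background refresh thread; the class layout and the `resolve()` helper are illustrative, not the actual Doris implementation.
```
#include <atomic>
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Illustrative sketch of a hostname -> IP cache with a background refresh
// thread; names and details are assumptions, not the exact Doris code.
class DNSCache {
public:
    DNSCache() : _refresher(&DNSCache::_refresh_loop, this) {}
    ~DNSCache() {
        _stop = true;
        _refresher.join();
    }

    // Return the cached IP; on a miss, resolve the hostname and cache it.
    std::string get(const std::string& hostname) {
        {
            std::lock_guard<std::mutex> l(_mu);
            auto it = _cache.find(hostname);
            if (it != _cache.end()) return it->second;
        }
        std::string ip = resolve(hostname);
        std::lock_guard<std::mutex> l(_mu);
        _cache[hostname] = ip;
        return ip;
    }

private:
    // Re-resolve all cached hostnames roughly every minute, since an IP may
    // change at any time. Sleeps in 1s slices so shutdown stays responsive.
    void _refresh_loop() {
        int elapsed = 0;
        while (!_stop) {
            std::this_thread::sleep_for(std::chrono::seconds(1));
            if (++elapsed < 60) continue;
            elapsed = 0;
            std::lock_guard<std::mutex> l(_mu);
            for (auto& entry : _cache) {
                entry.second = resolve(entry.first);
            }
        }
    }

    // Stand-in for a real DNS lookup (e.g. via getaddrinfo()).
    static std::string resolve(const std::string& hostname) { return hostname; }

    std::mutex _mu;
    std::map<std::string, std::string> _cache;
    std::atomic<bool> _stop {false};
    std::thread _refresher;
};
```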
There are other implementations of this DNS cache:
1. 36fed13997
This is for the BE side, but it does not handle the IP change case.
2. https://github.com/apache/doris/pull/28479
This is for the FE side, but it only works on the Master FE; other FE nodes will not be aware of the IP change.
Also, there are a number of `BackendServiceProxy` instances, and that PR only handles the cache in one of them.
The file meta cache on the BE caches metadata of external tables' files, such as Parquet footers.
This cache is limited by entry count, not by memory consumption.
So if a cached object is big (e.g. a large Parquet footer), the total memory consumption of this cache
can become large and cause an OOM.
This PR mainly changes:
1. Add a new method `exceed_prune_limit()` to `CachePolicy` (illustrated in the sketch after this list).
For `ObjLRUCache`, it always returns true, so that the minor or full GC on the BE prunes this cache every time.
2. Reduce the default capacity of the file meta cache from 20000 to 1000.
Also change the default capacity of the HDFS file handle cache from 20000 to 1000.
3. Change the judgment of whether to enable the file meta cache when querying.
If the number of files to read is larger than 1/3 of the file meta cache's capacity, the file meta cache
is disabled for that query, because the cache is useless when there are too many files.
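The sketch below illustrates the pruning hook and the per-query judgment described above; the type names, the `enable_file_meta_cache_for_query` helper, and the exact signatures are assumptions for illustration, not the actual Doris code.
```
#include <cstddef>

// Illustrative sketch only; class and function names are assumptions.
// A cache reports whether the BE's minor/full GC should prune it.
struct CachePolicySketch {
    virtual ~CachePolicySketch() = default;
    virtual bool exceed_prune_limit() { return false; }
};

// An object-count-based cache cannot reason about its memory footprint,
// so it always allows pruning, mirroring the ObjLRUCache behavior above.
struct ObjLRUCacheSketch : CachePolicySketch {
    bool exceed_prune_limit() override { return true; }
};

// Per-query judgment: disable the file meta cache when the query reads more
// than 1/3 of the cache's capacity, since entries would be evicted before reuse.
bool enable_file_meta_cache_for_query(size_t file_count, size_t cache_capacity) {
    return file_count <= cache_capacity / 3;
}
```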
Previously, the counters in `profile` could be updated when closing a file reader,
and the file reader could be closed while the object was being destructed.
But at that time the `profile` object may already have been deleted, causing a null pointer access and a BE crash.
This PR tries to fix this issue:
1. Remove the "profile counter update" logic from all `close()` methods.
2. Add a new interface `ProfileCollector` (sketched after this list).
It has 2 methods:
- `collect_profile_at_runtime()`
It can be called at runtime, e.g. in every `get_next_block()` method,
so that the counters in the profile can be updated at runtime.
- `collect_profile_before_close()`
It should be called before the object's `close()` is called, and it will only be called once.
3. Derive from `ProfileCollector`.
All classes that may update profile counters in their `close()` method should extend
`ProfileCollector` (such as `GenericReader`) and implement `collect_profile_before_close()`.
`collect_profile_before_close()` is called in `scanner->mark_to_need_to_close()`.
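Here is a minimal sketch of what such an interface could look like, assuming a once-only guard around the before-close collection; apart from the method names taken from the description, the details are illustrative rather than the actual Doris implementation.
```
#include <atomic>

// Illustrative sketch of the ProfileCollector interface; method names follow
// the description above, the surrounding details are assumptions.
class ProfileCollector {
public:
    virtual ~ProfileCollector() = default;

    // May be called repeatedly at runtime, e.g. from get_next_block(),
    // to push the current counters into the profile.
    virtual void collect_profile_at_runtime() = 0;

    // Must run before close(); guarded so that repeated calls are no-ops.
    void collect_profile_before_close() {
        if (!_collected.exchange(true)) {
            _collect_profile_before_close_impl();
        }
    }

protected:
    virtual void _collect_profile_before_close_impl() = 0;

private:
    std::atomic<bool> _collected {false};
};

// A reader that used to update profile counters in close() now updates them
// through the interface instead, before the profile can be destroyed.
class GenericReaderSketch : public ProfileCollector {
public:
    void collect_profile_at_runtime() override { /* update runtime counters */ }

protected:
    void _collect_profile_before_close_impl() override { /* flush final counters */ }
};
```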
In order to add common code to the value deleter of the LRU cache, all LRU cache values now inherit from the `LRUCacheValueBase` class, and memory is tracked in its destructor.
After all LRU caches inherit from `LRUCachePolicy`, they gain pruning of stale entries, eviction when memory exceeds the limit, and common properties. The `LRUCache` constructor is changed to private, so only `LRUCachePolicy` can construct it.
Implement `DummyLRUCache`: when an LRU cache's capacity is 0, there are no longer meaningless inserts and evictions.
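A minimal sketch of how these pieces could fit together, assuming simplified signatures; only the class names come from the description, everything else is illustrative.
```
#include <cstddef>
#include <string>

// Illustrative sketch; names mirror the description, details are assumptions.

// Common base for all LRU cache values: memory accounting is released in the
// destructor, so the cache's value deleter can share one code path.
struct LRUCacheValueBase {
    virtual ~LRUCacheValueBase() {
        // e.g. release tracked_bytes from the cache's memory tracker here
    }
    size_t tracked_bytes = 0;
};

// Minimal cache interface to show where DummyLRUCache fits in.
class LRUCachePolicySketch {
public:
    virtual ~LRUCachePolicySketch() = default;
    virtual void insert(const std::string& key, LRUCacheValueBase* value) = 0;
    virtual LRUCacheValueBase* lookup(const std::string& key) = 0;
};

// Used when the configured capacity is 0: insert frees the value immediately
// and lookup always misses, so no insert/evict bookkeeping is wasted.
class DummyLRUCache : public LRUCachePolicySketch {
public:
    void insert(const std::string&, LRUCacheValueBase* value) override { delete value; }
    LRUCacheValueBase* lookup(const std::string&) override { return nullptr; }
};
```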