Commit Graph

725 Commits

Author SHA1 Message Date
0143ae8266 [fix]Add logging before _builtin_unreachable() (#24101)
Co-authored-by: 宋光璠 <songguangfan@sf.com>
2023-09-09 00:30:11 +08:00
2f8b075b71 [improvement](bitmap) support version for ser/deser of bitmap (#23959) 2023-09-07 09:55:29 +08:00
Pxl
a96adc01aa [Chore](function) refactor of quantile_state (#23862)
refactor of quantile_state
2023-09-06 15:39:19 +08:00
039c76cbc0 [feature-wip] (arrow-flight) (step1) BE support Arrow Flight server, read data only (#23765) 2023-09-04 19:19:55 +08:00
228f0ac5bb [Feature](Multi-Catalog) support query doris bitmap column in external jdbc catalog (#23021) 2023-09-02 12:46:33 +08:00
75e2bc8a25 [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759) 2023-09-02 00:58:48 +08:00
e0efda1234 [Improvement](Slice) support move constructor and operator for Slice (#23694) 2023-09-01 21:05:10 +08:00
25b6e4deb2 [fix](daemon) Fix incorrect initialization order of daemon services (#23578)
Current initialization dependency:

      Daemon ───┬──► StorageEngine ──► ExecEnv ──► Disk/Mem/CpuInfo
                │
                │
BackendService ─┘
However, original code incorrectly initialize Daemon before StorageEngine.
This PR also stop and join threads of daemon services in their dtor, to ensure Daemon services release resources in reverse order of initialization via RAII.
2023-08-31 19:46:38 +08:00
hzq
c083336bbe [Improvement](pipeline) Cancel outdated query if original fe restarts (#23582)
If any FE restarts, queries that is emitted from this FE will be cancelled.

Implementation of #23704
2023-08-31 17:58:52 +08:00
1ce783fb23 [fix](stacktrace) Temporary fix ARM and MacOS stacktrace #23650 2023-08-30 14:51:20 +08:00
29b94c4ed7 [pipeline](refactor) refine pipeline fragment context (#23478) 2023-08-28 15:55:02 +08:00
ba351af452 [enhancement](thirdparty) upgrade thirdparty libs - again (#23414)
submit again #23290 (not upgrade brpc, because bthread local has error)

protobuf 3.15.0 -> 21.11
glog 0.4.0 -> 0.6.0
lz4 1.9.3 -> 1.9.4
curl 7.79.0 -> 8.2.1
zstd 1.5.2 -> 1.5.5
arrow 7.0.0 -> 13.0.0
abseil 20220623.1 -> 20230125.3
orc 1.7.2 -> 1.9.0
jemalloc for arrow 5.2.1 -> 5.3.0
xsimd 7.0.0 -> 13.0.0
opentelemetry-proto 0.19.0 -> 1.0.0
opentelemetry 1.8.3 -> 1.10.0

new:
c-ares -> 1.19.1
grpc -> 1.54.3
2023-08-26 22:59:10 +08:00
40be6a0b05 [fix](hive) do not split compress data file and support lz4/snappy block codec (#23245)
1. do not split compress data file
Some data file in hive is compressed with gzip, deflate, etc.
These kinds of file can not be splitted.

2. Support lz4 block codec
for hive scan node, use lz4 block codec instead of lz4 frame codec

4. Support snappy block codec
For hadoop snappy

5. Optimize the `count(*)` query of csv file
For query like `select count(*) from tbl`, only need to split the line, no need to split the column.

Need to pick to branch-2.0 after this PR: #22304
2023-08-26 12:59:05 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes, the partitions of a hive table may on different storage, eg, some is on HDFS, others on object storage(cos, etc).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc
    This is because, when accessing a file, the BE will get a hdfs client from hdfs client cache, and different file in one query
request may have different fs name, eg, some of are `hdfs://`, some of are `cosn://`, so we need to specify fs name
for each file, otherwise, it may return error:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://[172.xxxx:4007](http://172.xxxxx:4007/)`
2023-08-26 00:16:00 +08:00
17e7c1ca53 [fix](fqdn)Fqdn with ipv6 (#22454)
now,`hostname_to_ip` only can resolve `ipv4`,Therefore, a method is provided to parse ipv4 or ipv6 based on parameters。
when `_heartbeat` call `hostname_to_ip`,Resolve to ipv4 or ipv6, determined by `BackendOptions.is_bind_ipv6` Decision
Additionally, a method is provided to first attempt to parse the host into ipv4, and then try ipv6 if it fails
2023-08-25 21:24:55 +08:00
8ef6b4d996 [fix](json) fix json int128 overflow (#22917)
* support int128 in jsonb

* fix jsonb int128 write

* fix jsonb to json int128

* fix json functions for int128

* add nereids function jsonb_extract_largeint

* add testcase for json int128

* change docs for json int128

* add nereids function jsonb_extract_largeint

* clang format

* fix check style

* using int128_t = __int128_t for all int128

* use fmt::format_to instead of snprintf digit by digit for int128

* clang format

* delete useless check

* add warn log

* clang format
2023-08-25 11:40:30 +08:00
9cacf9535a [Opt](functions) Use preloaded cache to accelerate timezone parsing (#22694)
* opt

* bugfix

* fix ut

* fix stylecheck
2023-08-25 10:00:48 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
Pxl
8ed4045df9 [Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842) 2023-08-23 08:37:40 +08:00
8819d73abd [fix](be) fix the crash when there is no tzfile in docker env (#23071) 2023-08-22 18:56:36 +08:00
Pxl
1a1f86486d [Improvement](function) opt for case when (#23068)
opt for case when
2023-08-22 18:31:40 +08:00
33dfa0c454 [Improve](serde) support text serde for nested type-array/map (#22738)
Now we can not support nested type array/map 
so this pr aim to:
1. add format option for string convert defined datatype to keep with origin from_string
2. support array map can nested array and map
2023-08-21 10:32:28 +08:00
0967d7ec04 [improvement](agg) Do not serialize bitmap to string (#23172) 2023-08-21 10:10:15 +08:00
1c3cc77a54 [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)
* [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty

* add ut

* fix nereids

* fix regression-test
2023-08-18 14:37:49 +08:00
c5c984b79b [refactor](bitmap) using template to reduce duplicate code (#23060)
* [refactor](bitmap) support for batch value insertion

* fix values was not filled for int8 and int16
2023-08-17 18:14:29 +08:00
Pxl
cf1865a1c8 [Bug](scan) fix core dump due to store_path_map (#23084)
fix core dump due to store_path_map
2023-08-17 15:24:43 +08:00
209f36f1bf [fix](multi-catalog)fix jdbc loader (#22814) 2023-08-11 14:36:19 +08:00
4360b8520e [fix](executor)Fix query hang when query bitmap Orth intersect #22828 2023-08-10 18:50:33 +08:00
919bfd73f1 [improvement](multi-catalog)add scanner isolation class loader (#22247)
Add scanner isolation class loader to make each plugin non-conflicting.
The BE will get scanner classes by JNI call and use JniClassLoader load them.
In the last version,we always get canner classes from the system class path by default,
so it cannot isolate the classes for each scanner
2023-08-10 10:02:46 +08:00
21beebde7d [fix](taskgroup) Fix task group overcommit memory GC profile (#22764) 2023-08-09 18:29:46 +08:00
1847e440b2 [fix](memory) enable Jemalloc arena dirty pages (#22639)
If there is a core dump here, it may cover up the real stack, if stack trace indicates heap corruption
(which led to invalid jemalloc metadata), like double free or use-after-free in the application.
Try sanitizers such as ASAN, or build jemalloc with --enable-debug to investigate further.
2023-08-06 19:18:44 +08:00
c2c01825c1 [opt](stacktrace) Optimize stacktrace output #22467 2023-08-06 15:53:53 +08:00
Pxl
7839a0e708 [Bug](brpc) fix brpc failed on big query came concurrently (#22600)
fix PriorityThreadPool get_info get wrong number
change brpc pool from priority to fifo
do not use brpc pool when send eos
2023-08-05 21:24:32 +08:00
Pxl
c4cee5122b [Chore](brpc) make error messages more verbose when brpc pool offer failed (#22558) 2023-08-03 22:02:37 +08:00
86e6f5d039 [FIX](decimal)fix decimal precision (#22364)
Now we make wrong for decimal parse from string
if given string precision is bigger than defined decimal precision, we will return a overflow error, but only digit part is bigger than typed digit length , we should return overflow error when we traverse given string to decimal value
2023-08-03 21:13:58 +08:00
Pxl
3d0d7a427b [Chore](brpc) display pool name when try offer failed (#22514) 2023-08-02 22:31:33 +08:00
f16a39aea1 [feature](time) using timev2 type to replace the old time type. (#22269) 2023-08-01 15:59:07 +08:00
66e540bebe [Fix](executor)Fix incorrect mem_limit return value type (#22415) 2023-07-31 22:28:41 +08:00
f2919567df [feature](datetime) Support timezone when insert datetime value (#21898) 2023-07-31 13:08:28 +08:00
ee754307bb [refactor](load) refactor memtable flush actively (#21634) 2023-07-30 21:31:54 +08:00
63a9a886f5 [enhance](S3) add s3 bvar metrics for all s3 operation (#22105) 2023-07-30 21:09:17 +08:00
765f1b6efe [Refactor](load) Extract load public code (#22304) 2023-07-29 12:56:31 +08:00
302de27985 [Refactor] Refactor some code with three-way comparison (#22170)
Refactor some code with three-way comparison
2023-07-29 11:30:15 +08:00
7ed997cba8 [improvement](s3) increase the connection num of s3 client (#22049)
The default maxConnection of s3 client is 25.
It should be increased to improve the query performance.

In my test, a tpch 300 benchmark with data stored on object storage, the total time
can reduce from 430s -> 330s
2023-07-26 22:52:40 +08:00
9e16c69925 [improvement](compression) support LZ4_HC algorithm and parse LZ4_RAW (#22165) 2023-07-26 18:23:39 +08:00
1f3de0eae3 [fix](memory) fix invalid large memory check && fix memory info thread safety (#22027)
fix invalid large memory check
fix memory info thread safety
2023-07-26 12:18:31 +08:00
ba3a0922eb [fix](ipv6)Support IPV6 (#22219)
fe:Remove restrictions from IPv4
be: thrift server Specify binding address
be: Restore changed code of “be/src/olap/task/engine_clone_task.cpp”
2023-07-26 08:40:32 +08:00
Pxl
19ba6bec38 [Improvement](pipeline) support send eos on local exchange and remove some unused code (#22086)
support send eos on local exchange and remove some unused code
2023-07-24 09:25:32 +08:00
ddd7e9871d [improvement](Jsonb) optimization Jsonb path parse (#21495)
The previous logic was to read jsonbvalue while parsing the json path. For complex json paths, there will be a lot of repeated parsing work. The optimization idea is to separate the analysis and value of jsonpath
2023-07-23 18:59:12 +08:00
d1c5025bce [Fix](compaction) add error message when load unsupport compression code (#22033) 2023-07-21 16:50:46 +08:00