Commit Graph

736 Commits

Author SHA1 Message Date
fc12362a6d [feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314)
This is a POC, the design documentation will be updated soon
2023-09-20 14:42:54 +08:00
8aea31e383 [fix](timezone) fix timezone parse when there is no tzfile (#24578) 2023-09-20 14:28:12 +08:00
14bd290aec [feature](jsonb)support json_length and json_contains function (#24332) 2023-09-20 10:40:44 +08:00
23a75d0277 [FIX](decimalv3) Fix decimalv3 with abnormal value same with mysql result (#24499)
Fix decimalv3 with abnormal value same with mysql result
2023-09-18 11:12:26 +08:00
7dcc68736d [enhancement](disk) refine io error and report bad when disk is abnormal (#24390) 2023-09-17 11:07:05 +08:00
08740b47cd [FIX](decimalv3) fix decimalv3 value with leading zeros (#24416)
now we make error if we deal with leading zeros in decimal value , type_precision >= precision will make value overflow and DCHECK will fail , so if here has leading zero we should only make type_precision > precision to make value right
2023-09-15 13:35:20 +08:00
4fbb25bc55 [Enhancement](function) Support date_trunc(date) and use it in auto partition (#24341)
Support date_trunc(date) and use it in auto partition
2023-09-14 16:53:09 +08:00
049032b4b3 Refactor Slice move/copy ctor && assignment to default (#24169)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
Co-authored-by: yiguolei <676222867@qq.com>
2023-09-14 14:18:10 +08:00
2f74936382 [FIX](decimalv3) fix decimalv3 with precision cast (#24241)
now we use cast to decimalv3 may has error , because decimalv3 use type precision for translate string

mysql [test]>select cast("9999e-1" as decimal(2,1));
+------------------------------------+
| cast('9999e-1' as DECIMALV3(2, 1)) |
+------------------------------------+
|                              999.9 |
+------------------------------------+
1 row in set (0.01 sec)
this pr will fix this just keep reaction same with mysql

mysql> select cast('9999e-1' as decimalv3(2, 1));
+------------------------------------+
| cast('9999e-1' as DECIMALV3(2, 1)) |
+------------------------------------+
|                                9.9 |
+------------------------------------+
1 row in set (0.07 sec)
2023-09-13 11:35:33 +08:00
134b210c03 [improvement](shutdown) not print thread pool error stack trace when shutdown (#24155)
* [improvement](shutdown) not print thread pool error stack trace when shutdown

when thread pool shutdown, should not print error stack trace, it is very confuse.
arrow flight server should not call shutdown, if it is not enabled, because it will print error stack.
remove service unavailable from thrift because it is useless.
Part of this PR need to pick to 2.0 branch.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-09-11 12:20:07 +08:00
f9a75b5c4f [feature](csv_serde)1.append csv serde for serialize to csv and deserialize from csv. 2.let csvReader use csv serde not text_converter. (#23352)
1. append csv serde for serialize to csv and deserialize from csv.
2. let csvReader use csv serde not text_converter.
2023-09-10 00:16:21 +08:00
0143ae8266 [fix]Add logging before _builtin_unreachable() (#24101)
Co-authored-by: 宋光璠 <songguangfan@sf.com>
2023-09-09 00:30:11 +08:00
2f8b075b71 [improvement](bitmap) support version for ser/deser of bitmap (#23959) 2023-09-07 09:55:29 +08:00
Pxl
a96adc01aa [Chore](function) refactor of quantile_state (#23862)
refactor of quantile_state
2023-09-06 15:39:19 +08:00
039c76cbc0 [feature-wip] (arrow-flight) (step1) BE support Arrow Flight server, read data only (#23765) 2023-09-04 19:19:55 +08:00
228f0ac5bb [Feature](Multi-Catalog) support query doris bitmap column in external jdbc catalog (#23021) 2023-09-02 12:46:33 +08:00
75e2bc8a25 [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759) 2023-09-02 00:58:48 +08:00
e0efda1234 [Improvement](Slice) support move constructor and operator for Slice (#23694) 2023-09-01 21:05:10 +08:00
25b6e4deb2 [fix](daemon) Fix incorrect initialization order of daemon services (#23578)
Current initialization dependency:

      Daemon ───┬──► StorageEngine ──► ExecEnv ──► Disk/Mem/CpuInfo
                │
                │
BackendService ─┘
However, original code incorrectly initialize Daemon before StorageEngine.
This PR also stop and join threads of daemon services in their dtor, to ensure Daemon services release resources in reverse order of initialization via RAII.
2023-08-31 19:46:38 +08:00
hzq
c083336bbe [Improvement](pipeline) Cancel outdated query if original fe restarts (#23582)
If any FE restarts, queries that is emitted from this FE will be cancelled.

Implementation of #23704
2023-08-31 17:58:52 +08:00
1ce783fb23 [fix](stacktrace) Temporary fix ARM and MacOS stacktrace #23650 2023-08-30 14:51:20 +08:00
29b94c4ed7 [pipeline](refactor) refine pipeline fragment context (#23478) 2023-08-28 15:55:02 +08:00
ba351af452 [enhancement](thirdparty) upgrade thirdparty libs - again (#23414)
submit again #23290 (not upgrade brpc, because bthread local has error)

protobuf 3.15.0 -> 21.11
glog 0.4.0 -> 0.6.0
lz4 1.9.3 -> 1.9.4
curl 7.79.0 -> 8.2.1
zstd 1.5.2 -> 1.5.5
arrow 7.0.0 -> 13.0.0
abseil 20220623.1 -> 20230125.3
orc 1.7.2 -> 1.9.0
jemalloc for arrow 5.2.1 -> 5.3.0
xsimd 7.0.0 -> 13.0.0
opentelemetry-proto 0.19.0 -> 1.0.0
opentelemetry 1.8.3 -> 1.10.0

new:
c-ares -> 1.19.1
grpc -> 1.54.3
2023-08-26 22:59:10 +08:00
40be6a0b05 [fix](hive) do not split compress data file and support lz4/snappy block codec (#23245)
1. do not split compress data file
Some data file in hive is compressed with gzip, deflate, etc.
These kinds of file can not be splitted.

2. Support lz4 block codec
for hive scan node, use lz4 block codec instead of lz4 frame codec

4. Support snappy block codec
For hadoop snappy

5. Optimize the `count(*)` query of csv file
For query like `select count(*) from tbl`, only need to split the line, no need to split the column.

Need to pick to branch-2.0 after this PR: #22304
2023-08-26 12:59:05 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes, the partitions of a hive table may on different storage, eg, some is on HDFS, others on object storage(cos, etc).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc
    This is because, when accessing a file, the BE will get a hdfs client from hdfs client cache, and different file in one query
request may have different fs name, eg, some of are `hdfs://`, some of are `cosn://`, so we need to specify fs name
for each file, otherwise, it may return error:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://[172.xxxx:4007](http://172.xxxxx:4007/)`
2023-08-26 00:16:00 +08:00
17e7c1ca53 [fix](fqdn)Fqdn with ipv6 (#22454)
now,`hostname_to_ip` only can resolve `ipv4`,Therefore, a method is provided to parse ipv4 or ipv6 based on parameters。
when `_heartbeat` call `hostname_to_ip`,Resolve to ipv4 or ipv6, determined by `BackendOptions.is_bind_ipv6` Decision
Additionally, a method is provided to first attempt to parse the host into ipv4, and then try ipv6 if it fails
2023-08-25 21:24:55 +08:00
8ef6b4d996 [fix](json) fix json int128 overflow (#22917)
* support int128 in jsonb

* fix jsonb int128 write

* fix jsonb to json int128

* fix json functions for int128

* add nereids function jsonb_extract_largeint

* add testcase for json int128

* change docs for json int128

* add nereids function jsonb_extract_largeint

* clang format

* fix check style

* using int128_t = __int128_t for all int128

* use fmt::format_to instead of snprintf digit by digit for int128

* clang format

* delete useless check

* add warn log

* clang format
2023-08-25 11:40:30 +08:00
9cacf9535a [Opt](functions) Use preloaded cache to accelerate timezone parsing (#22694)
* opt

* bugfix

* fix ut

* fix stylecheck
2023-08-25 10:00:48 +08:00
51ac92f65c Revert "[fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)" (#23368)
This reverts commit 1c3cc77a54938ed948ad8186b8dea8385977d23c.
2023-08-23 18:27:35 +08:00
Pxl
8ed4045df9 [Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842) 2023-08-23 08:37:40 +08:00
8819d73abd [fix](be) fix the crash when there is no tzfile in docker env (#23071) 2023-08-22 18:56:36 +08:00
Pxl
1a1f86486d [Improvement](function) opt for case when (#23068)
opt for case when
2023-08-22 18:31:40 +08:00
33dfa0c454 [Improve](serde) support text serde for nested type-array/map (#22738)
Now we can not support nested type array/map 
so this pr aim to:
1. add format option for string convert defined datatype to keep with origin from_string
2. support array map can nested array and map
2023-08-21 10:32:28 +08:00
0967d7ec04 [improvement](agg) Do not serialize bitmap to string (#23172) 2023-08-21 10:10:15 +08:00
1c3cc77a54 [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty (#21236)
* [fix](function) to_bitmap parameter parsing failure returns null instead of bitmap_empty

* add ut

* fix nereids

* fix regression-test
2023-08-18 14:37:49 +08:00
c5c984b79b [refactor](bitmap) using template to reduce duplicate code (#23060)
* [refactor](bitmap) support for batch value insertion

* fix values was not filled for int8 and int16
2023-08-17 18:14:29 +08:00
Pxl
cf1865a1c8 [Bug](scan) fix core dump due to store_path_map (#23084)
fix core dump due to store_path_map
2023-08-17 15:24:43 +08:00
209f36f1bf [fix](multi-catalog)fix jdbc loader (#22814) 2023-08-11 14:36:19 +08:00
4360b8520e [fix](executor)Fix query hang when query bitmap Orth intersect #22828 2023-08-10 18:50:33 +08:00
919bfd73f1 [improvement](multi-catalog)add scanner isolation class loader (#22247)
Add scanner isolation class loader to make each plugin non-conflicting.
The BE will get scanner classes by JNI call and use JniClassLoader load them.
In the last version,we always get canner classes from the system class path by default,
so it cannot isolate the classes for each scanner
2023-08-10 10:02:46 +08:00
21beebde7d [fix](taskgroup) Fix task group overcommit memory GC profile (#22764) 2023-08-09 18:29:46 +08:00
1847e440b2 [fix](memory) enable Jemalloc arena dirty pages (#22639)
If there is a core dump here, it may cover up the real stack, if stack trace indicates heap corruption
(which led to invalid jemalloc metadata), like double free or use-after-free in the application.
Try sanitizers such as ASAN, or build jemalloc with --enable-debug to investigate further.
2023-08-06 19:18:44 +08:00
c2c01825c1 [opt](stacktrace) Optimize stacktrace output #22467 2023-08-06 15:53:53 +08:00
Pxl
7839a0e708 [Bug](brpc) fix brpc failed on big query came concurrently (#22600)
fix PriorityThreadPool get_info get wrong number
change brpc pool from priority to fifo
do not use brpc pool when send eos
2023-08-05 21:24:32 +08:00
Pxl
c4cee5122b [Chore](brpc) make error messages more verbose when brpc pool offer failed (#22558) 2023-08-03 22:02:37 +08:00
86e6f5d039 [FIX](decimal)fix decimal precision (#22364)
Now we make wrong for decimal parse from string
if given string precision is bigger than defined decimal precision, we will return a overflow error, but only digit part is bigger than typed digit length , we should return overflow error when we traverse given string to decimal value
2023-08-03 21:13:58 +08:00
Pxl
3d0d7a427b [Chore](brpc) display pool name when try offer failed (#22514) 2023-08-02 22:31:33 +08:00
f16a39aea1 [feature](time) using timev2 type to replace the old time type. (#22269) 2023-08-01 15:59:07 +08:00
66e540bebe [Fix](executor)Fix incorrect mem_limit return value type (#22415) 2023-07-31 22:28:41 +08:00
f2919567df [feature](datetime) Support timezone when insert datetime value (#21898) 2023-07-31 13:08:28 +08:00