Commit Graph

12906 Commits

Author SHA1 Message Date
650cc25ea4 [fix](light-schema-change) fix schema consistency check failed (#23283) 2023-08-28 16:40:30 +08:00
6ac694aede [Configuration](multi-catalog) Modify default external cache expire time to 10 mins. (#23490)
Configuration Modify default external cache expire time to 10 mins.
2023-08-28 16:16:43 +08:00
29b94c4ed7 [pipeline](refactor) refine pipeline fragment context (#23478) 2023-08-28 15:55:02 +08:00
7e7cfd17bf [fix](tablet sink) check data valid of tablet sink data (#23530)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2023-08-28 15:54:12 +08:00
f70638e895 [Fix](autobucket) Fix autobucket partition size by using getAllDataSize including cooldown size (#23557) 2023-08-28 15:24:48 +08:00
Pxl 6e82178847 [Bug](materialized-view) fix loaddb analyze failed on MaterializedIndexMeta (#23442)
* fix loaddb analyze failed on MaterializedIndexMeta

* update

* update
2023-08-28 15:18:18 +08:00
Pxl 3049533e63 [Bug](materialized-view) fix core dump on create materialized view when different mv columns have the same reference base column (#23425)
* Remove redundant predicates on scan node

update

fix core dump on create materialized view when different mv columns have the same reference base column

Revert "update"

This reverts commit d9ef8dca123b281dc8f1c936ae5130267dff2964.

Revert "Remove redundant predicates on scan node"

This reverts commit f24931758163f59bfc47ee10509634ca97358676.

* update

* fix

* update

* update
2023-08-28 14:40:51 +08:00
bea5701cce [minor](docs) update docs for variable sql_select_limit (#23262) 2023-08-28 14:38:32 +08:00
28a2e71084 [pipelineX](refactor) refine codes (#23521)
* [pipelineX](refactor) refine codes

* update

* update
2023-08-28 14:38:07 +08:00
4c8fc06e40 [Feature](fe) Add admin set partition version statement (#23086)
This commit adds a statement to modify a partition's visible version.
2023-08-28 14:31:54 +08:00
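A hedged sketch of how the statement added by the commit above might be used; the property names and values here are assumptions based on my reading of #23086 and should be checked against the PR and docs.

```sql
-- Hedged sketch (illustrative names/values): force a partition's visible version.
ADMIN SET TABLE example_db.example_tbl PARTITION VERSION
PROPERTIES ("partition_id" = "10075", "visible_version" = "100");
```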
82fe5aa5a0 [fix](regression) rename tables in test_stream_load_move_memtable (#23545) 2023-08-28 14:31:00 +08:00
83467e5d9e [docs](website) fix a typo in docs title (#23431) 2023-08-28 14:27:45 +08:00
c05319b8eb [fix](agg) incorrect result of bitmap_agg and bitmap_union (#23558) 2023-08-28 14:22:19 +08:00
f7d2c1faf6 [feature](Nereids) support select key encryptKey (#23257)
Add select key

```
- CREATE ENCRYPTKEY key_name AS "key_string"
- select key my_key
+-----------------------------+
| encryptKeyRef('', 'my_key') |
+-----------------------------+
| ABCD123456789               |
+-----------------------------+
```
2023-08-28 14:07:26 +08:00
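Based on the syntax shown in the commit message above, a minimal end-to-end sketch; the key name and value are illustrative, and the `AES_ENCRYPT(..., KEY ...)` usage follows the existing ENCRYPTKEY docs as I understand them, not this commit.

```sql
-- Illustrative only: create a key, inspect it, then use it for encryption.
CREATE ENCRYPTKEY my_key AS "ABCD123456789";
SELECT KEY my_key;                                      -- syntax supported by this commit
SELECT HEX(AES_ENCRYPT("Doris is Great", KEY my_key));  -- assumed existing KEY-reference usage
```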
5be8d57f52 [fix](be-ut) fix ColumnFixedLenghtObjectTest on 32-bit systems (#23519) 2023-08-28 14:02:05 +08:00
ef2fc44e5c [Improve](Job) Allow modifying the configuration of the Job queue and the number of consumer threads (#23547) 2023-08-28 12:01:49 +08:00
2799393a50 forbid: test_stream_load_move_memtable (#23556) 2023-08-28 11:41:08 +08:00
e84989fb6d [feature](Nereids) support map type (#23493) 2023-08-28 11:31:44 +08:00
b181a9f099 [feature](Nereids) support array type in fold constant framework (#23373)
1. use legacy planner way to process constant folding result from be
2. support signature with complex type for constant folding on fe
2023-08-28 10:47:43 +08:00
962221cb18 [test](log) add log for debug case failure (#23506) 2023-08-28 10:45:25 +08:00
d19dcd6bc1 [improve](jdbc catalog) support sqlserver uniqueidentifier data type (#23297) 2023-08-28 10:30:10 +08:00
eadffedb33 [Feature](fe) Add admin set table status statement (#23139)
For certain bugs, a job can get stuck in FE because of the table state. For example, there is a bug that causes a table to remain in the ROLLUP state after a rollup job is added; later alter jobs will then never succeed because the table state stays ROLLUP instead of NORMAL.

This commit adds a statement which is used to set the state of the specified table.
2023-08-28 10:22:09 +08:00
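A hedged sketch of the repair statement described above; the table name is made up and the exact `PROPERTIES` key follows my reading of #23139, so it is an assumption.

```sql
-- Hedged sketch: reset a table stuck in ROLLUP back to NORMAL (illustrative names).
ADMIN SET TABLE example_db.example_tbl STATUS PROPERTIES ("state" = "NORMAL");
```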
4dbec854e4 [Docs](inverted index) add tokenize function doc (#23518) 2023-08-28 10:19:03 +08:00
92bdf75836 [fix](Nereids): LogicalRepeat equals lack @Override (#23408) 2023-08-28 10:07:39 +08:00
981586155c [Improvement][json] optimize performance of json_extract by reusing json path object (#23430)
* reuse json path to speed up json function

* fix typo

* clang format

* path reentry safe

* fix compile error

* fix bug of continue
2023-08-27 17:39:10 +08:00
e0bf621fe0 [chore](build) Fix compilation errors for BE UT (#23535)
Issue Number: close #23536

This issue was introduced by #23414 .
2023-08-27 11:52:13 +08:00
153e8f0f72 [improvement](table property) support for alter table property: skip write index, single compaction (#23475) 2023-08-26 23:52:09 +08:00
ba351af452 [enhancement](thirdparty) upgrade thirdparty libs - again (#23414)
Submitted again from #23290 (brpc is not upgraded, because bthread local has an error)

protobuf 3.15.0 -> 21.11
glog 0.4.0 -> 0.6.0
lz4 1.9.3 -> 1.9.4
curl 7.79.0 -> 8.2.1
zstd 1.5.2 -> 1.5.5
arrow 7.0.0 -> 13.0.0
abseil 20220623.1 -> 20230125.3
orc 1.7.2 -> 1.9.0
jemalloc for arrow 5.2.1 -> 5.3.0
xsimd 7.0.0 -> 13.0.0
opentelemetry-proto 0.19.0 -> 1.0.0
opentelemetry 1.8.3 -> 1.10.0

new:
c-ares -> 1.19.1
grpc -> 1.54.3
2023-08-26 22:59:10 +08:00
93918253ba [fix](metric) fix an issue where the counter of rejected transactions does not cover some failure situations for load (#23363)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2023-08-26 20:06:42 +08:00
a5761a25c5 [feature](move-memtable)[7/7] add regression tests (#23515)
Co-authored-by: laihui <1353307710@qq.com>
2023-08-26 17:52:10 +08:00
30658ebeda [Fix](planner) Fix query queue cannot limit maxConcurrency (#23418)
Fix that the configured maxConcurrency is not enforced.
2023-08-26 17:31:44 +08:00
30e3c5bbe6 [bugfix](file cache) Fix the init file cache coredump (#23464)
* [bugfix](file cache) Fix the init file cache coredump

* fix compile
2023-08-26 16:50:50 +08:00
40be6a0b05 [fix](hive) do not split compressed data files and support lz4/snappy block codec (#23245)
1. Do not split compressed data files
Some data files in hive are compressed with gzip, deflate, etc.
These kinds of files cannot be split.

2. Support lz4 block codec
For the hive scan node, use the lz4 block codec instead of the lz4 frame codec.

3. Support snappy block codec
For hadoop snappy.

4. Optimize the `count(*)` query of csv files
For a query like `select count(*) from tbl`, only the lines need to be split; there is no need to split the columns.

Needs to be picked to branch-2.0 after this PR: #22304
2023-08-26 12:59:05 +08:00
36b7fcf055 [tmp](hive) support hive partition 00 (#23224)
In some cases, a hive table with an int partition column may have the following partition values:
hour=00, hour=01
We need to support this.
2023-08-26 12:58:31 +08:00
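To make the behavior above concrete, a hypothetical query against such a table; the catalog, database, and table names are made up.

```sql
-- Hypothetical: directories hour=00 and hour=01 under an int partition column `hour`.
-- After this change, the zero-padded directory value 00 resolves to the int value 0.
SELECT * FROM hive_catalog.demo_db.events WHERE hour = 0;
```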
bc020112fc [enhancement](routineload) add debug conf and set broker.name.ttl = 0 (#23302)
* set broker.name.ttl = 0

* add debug config for librdkafka
2023-08-26 10:56:35 +08:00
db8d18eb40 [Enhance](auth)row policy support role (#23022)
```
CREATE ROW POLICY test_row_policy_1 ON test.table1 
AS {RESTRICTIVE|PERMISSIVE} [TO  user] [TO ROLE role] USING (id in (1, 2)); // add `to role`

DROP [ROW] POLICY [IF EXISTS] test_row_policy;//delete `for user` and `on table`

SHOW ROW POLICY [FOR user][FOR ROLE role] // add `for role`
```
2023-08-26 10:24:59 +08:00
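A concrete example derived from the syntax block in the commit above; the policy and role names are illustrative.

```sql
-- Illustrative usage of the new `TO ROLE` / `FOR ROLE` clauses.
CREATE ROW POLICY test_row_policy_1 ON test.table1
AS RESTRICTIVE TO ROLE analyst_role USING (id in (1, 2));

SHOW ROW POLICY FOR ROLE analyst_role;

DROP ROW POLICY IF EXISTS test_row_policy_1;
```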
f32efe5758 [Fix](Outfile) Fix that it does not report error when export table to S3 with an incorrect ak/sk/bucket (#23441)
Problem:
It will return a result even though we use a wrong ak/sk/bucket name, such as:
```sql
mysql> select * from demo.student
    -> into outfile "s3://xxxx/exp_"
    -> format as csv
    -> properties(
    ->   "s3.endpoint" = "https://cos.ap-beijing.myqcloud.com",
    ->   "s3.region" = "ap-beijing",
    ->   "s3.access_key"= "xxx",
    ->   "s3.secret_key" = "yyyy"
    -> );
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
| FileNumber | TotalRows | FileSize | URL                                                                                                |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
|          1 |         3 |       26 | s3://xxxx/exp_2ae166e2981d4c08-b577290f93aa82ba_ |
+------------+-----------+----------+----------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)
```

The reason is that we did not catch the error returned in the `close()` phase.
2023-08-26 00:19:30 +08:00
f66f161017 [fix](multi-catalog)fix hive table with cosn location issue (#23409)
Sometimes the partitions of a hive table may be on different storage, e.g., some are on HDFS while others are on object storage (cos, etc.).
This PR mainly changes:

1. Fix the bug of accessing files via cosn.
2. Add a new field `fs_name` in TFileRangeDesc
    This is because, when accessing a file, the BE gets an hdfs client from the hdfs client cache, and different files in one query
request may have different fs names, e.g., some are `hdfs://` and some are `cosn://`, so we need to specify the fs name
for each file; otherwise, it may return an error:

`reason: IllegalArgumentException: Wrong FS: cosn://doris-build-1308700295/xxxx, expected: hdfs://172.xxxx:4007`
2023-08-26 00:16:00 +08:00
8af1e7f27f [Fix](orc-reader) Fix incorrect result if null partition fields in orc file. (#23369)
Fix incorrect result if null partition fields in orc file. 

### Root Cause
Theoretically, the underlying files of a hive partition table should not contain partition fields. But we found that in some user scenarios the partition fields do exist in the underlying orc/parquet files as null values. As a result, the pushed-down filter on these partition fields, which are null values, filters incorrectly.

### Solution
We handle this case by reading only the non-partition fields. The parquet reader already works this way; this PR handles the orc reader.
2023-08-26 00:13:11 +08:00
a3a951c71d [Fix](multi-catalog) Fix load string dict issue for transactional hive tables. (#23306)
Fix load string dict issue for transactional hive tables. The column name needs to be passed as 'row.column_name'.

apache/doris-thirdparty#112
2023-08-26 00:09:12 +08:00
2b6d876280 [feature](move-memtable)[6/7] add options to enable memtable on sink node (#23470)
Co-authored-by: Siyang Tang <82279870+TangSiyang2001@users.noreply.github.com>
2023-08-25 22:32:22 +08:00
da21b1cb24 [Feature](Job)Allow Job to perform all insert operations, and limit permissions to allow Admin operations (#23492) 2023-08-25 21:58:53 +08:00
6e6da733c6 [fix](inverted index) fix the keyword type index length limit (#23503) 2023-08-25 21:34:11 +08:00
006c88827f [fix](stats) Fix auto analyze (#20426)
We only reanalyze those partitions whose lastVisibleTime is later than the job's update time, so we shouldn't set this field when creating system jobs.
2023-08-25 21:30:59 +08:00
e3db0fddc1 [fix](iceberg) fix iceberg count(*) short circuit read bug (#23402) 2023-08-25 21:30:30 +08:00
468dfc97db [fix](meta) set broadcast_right_table_scale_factor when upgrading from 1.2 to 2.x (#23423)
When upgrading from 1.2 to 2.x (a future version higher than 2.0), the default value of the parameter broadcast_right_table_scale_factor may not be upgraded from the old default 10.0 to the new default 0.0, which makes broadcast join behave unexpectedly and may have a big performance impact. This PR forces the value to be reset to the new default 0.0 to ensure correct behavior.
2023-08-25 21:26:19 +08:00
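A quick way to verify and, if needed, manually align the variable after an upgrade; this is a sketch using the standard session-variable commands, not something prescribed by the commit above.

```sql
-- Check the effective value; 0.0 is the new default mentioned above.
SHOW VARIABLES LIKE 'broadcast_right_table_scale_factor';
-- If the old default 10.0 is still in effect, reset it manually (assumption: GLOBAL scope is desired).
SET GLOBAL broadcast_right_table_scale_factor = 0.0;
```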
17e7c1ca53 [fix](fqdn)Fqdn with ipv6 (#22454)
Currently, `hostname_to_ip` can only resolve IPv4, so a method is provided to resolve either IPv4 or IPv6 based on parameters.
When `_heartbeat` calls `hostname_to_ip`, whether the host is resolved to IPv4 or IPv6 is determined by `BackendOptions.is_bind_ipv6`.
Additionally, a method is provided to first attempt to resolve the host as IPv4, and then try IPv6 if that fails.
2023-08-25 21:24:55 +08:00
00826185c1 [fix](tvf view)Support Table valued function view for nereids (#23317)
Nereids doesn't support views based on table valued functions, because a tvf view doesn't contain the proper qualifier (catalog, db and table name). This PR adds support for this.

Also, fix a bug where the Nereids table valued function explain output exprs were incorrect.
2023-08-25 21:23:16 +08:00
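A minimal sketch of the scenario the commit above enables, assuming the built-in `numbers` table valued function; the view name is made up.

```sql
-- Hypothetical tvf view that Nereids previously could not resolve.
CREATE VIEW v_numbers AS SELECT * FROM numbers("number" = "10");
SELECT COUNT(*) FROM v_numbers;
```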
8be0202b94 [improvement](old planner)Prune extra slots with old planner for sql like select count(1) from view (#23393)
A sql like
`select count(1) from view`
would contain all of the view's columns in the old planner's execution plan, which is slow because the BE needs to read all of those columns from the data files. This PR improves the plan to contain only one column.
2023-08-25 21:22:03 +08:00
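An illustrative setup for the pruning described above; the table, view, and column contents are hypothetical.

```sql
-- Hypothetical wide table behind a view; after this PR the plan for the count keeps a single slot
-- instead of materializing every column of `orders`.
CREATE VIEW v_orders AS SELECT * FROM orders;
SELECT COUNT(1) FROM v_orders;
```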
29273771f7 [Fix](multi-catalog) Fix hive incorrect result by disable string dict filter if exprs contain null expr. (#23361)
Issue Number: close #21960

Fix hive incorrect result by disable string dict filter if exprs contain null expr.
2023-08-25 21:16:43 +08:00