Commit Graph

18263 Commits

Author SHA1 Message Date
Pxl
59c6139aa5 [Chore](parser) fix create view failed when view contained cast as varchar (#23043)
fix create view failed when view contained cast as varchar
2023-08-18 11:50:18 +08:00
df8e7f7f09 [enhancement](msg) add disk root path in message (#23000) 2023-08-18 11:21:59 +08:00
e6fe8c05d1 [fix](inverted index change) fix update delete bitmap incompletely when build inverted index on mow table (#23047) 2023-08-18 11:15:39 +08:00
16df7a7ec0 [chore](macOS) Fix SSL errors while building documents (#23127)
Issue Number: #23126

Add NODE_OPTIONS to fix this issue.
2023-08-18 10:57:05 +08:00
d018ac8fb7 fix show grants throw NullPointerException (#22943) 2023-08-18 10:48:56 +08:00
5b8a76a22e [doc](catalog)faq for lzo.jar not found (#23070) 2023-08-18 10:16:32 +08:00
a5ca6cadd6 [Improvement] Optimize count operation for iceberg (#22923)
Iceberg keeps its own metadata, which includes count statistics for table data. If the table contains no equality deletes, we can get the table's row count directly from those count statistics.
2023-08-18 09:57:51 +08:00
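The metadata-only count described above can be sketched as follows. This is an illustrative Python sketch, not Doris's actual implementation; the snapshot dictionary layout and field names (`delete_files`, `data_files`, `record_count`, `content`) are hypothetical stand-ins for Iceberg's manifest metadata.

```python
def iceberg_fast_count(snapshot):
    """Answer count(*) from snapshot metadata when it is safe to do so."""
    # Equality deletes can remove rows from data files, so the stored
    # record counts would overcount; return None to signal a real scan.
    if any(f["content"] == "EQUALITY_DELETES" for f in snapshot["delete_files"]):
        return None
    # Otherwise the per-file record counts already give the exact answer.
    return sum(f["record_count"] for f in snapshot["data_files"])
```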
03d59ba81e [Fix](Nereids) fix sql-cache for nereids. (#22808)
1. should not use ((LogicalPlanAdapter)parsedStmt).getStatementContext().getOriginStatement().originStmt.toLowerCase() as the cache key (do not invoke toLowerCase()); for example, select * from tbl1 where k1 = 'a' is different from select * from tbl1 where k1 = 'A', so the cache should miss.
2. according to issue 6735, the cache key should contain all views' DDL SQL (including nested views)
2023-08-18 09:36:07 +08:00
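The two fixes above can be sketched as a cache-key builder. A minimal Python sketch under stated assumptions: the separator and the sorting of view DDLs are hypothetical choices, not Doris's actual key format.

```python
def sql_cache_key(origin_stmt, view_ddls):
    # Keep the statement's original case: 'a' and 'A' are different
    # string literals, so lowercasing would cause false cache hits.
    # Append every referenced view's DDL (nested views included) so the
    # cached result is invalidated when any view definition changes.
    return origin_stmt + "\x00" + "\x00".join(sorted(view_ddls))
```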
hzq
38c182100a [refactor](mysql compatibility) An abstract class for all databases created for mysql compatibility (#23087)
Better code structure for mysql compatibility databases.
2023-08-18 09:16:23 +08:00
de98324ea7 [fix](inverted index change) make mutex for ALTER_INVERTED_INDEX task and STORAGE_MEDIUM_MIGRATE task (#22995) 2023-08-18 08:35:30 +08:00
314f5a5143 [Fix](orc-reader) Fix filling partition or missing column used incorrect row count. (#23096)
[Fix](orc-reader) Fix filling partition or missing column used incorrect row count.

`_row_reader->nextBatch` returns the number of rows read. When ORC lazy materialization is turned on, the number of read rows includes filtered rows, so the caller must look at `numElements` in the row batch to determine how many rows survived the filter and should be filled into the block.

In this case, filling partition or missing columns used the incorrect row count, which caused BE to crash on `filter.size() != offsets.size()` in the filter-column step.

When orc lazy materialization is turned off, add `_convert_dict_cols_to_string_cols(block, nullptr)` if `(block->rows() == 0)`.
2023-08-17 23:26:11 +08:00
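The row-count distinction above can be summarized in a small sketch. This is illustrative Python, not the C++ orc-reader code; the function name is hypothetical.

```python
def rows_for_block(read_rows, num_elements, lazy_materialization):
    # With lazy materialization on, read_rows counts filtered-out rows
    # too; only num_elements rows survived the filter and must be filled
    # into the block (partition and missing columns included), otherwise
    # the later filter step fails on filter.size() != offsets.size().
    return num_elements if lazy_materialization else read_rows
```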
1f19d0db3e [improvement](tablet clone) improve tablet balance, scaling speed etc (#22317) 2023-08-17 22:30:49 +08:00
57568ba472 [fix](be)shouldn't use arena to alloc memory for SingleValueDataString (#23075)
* [fix](be)shouldn't use arena to alloc memory for SingleValueDataString

* format code
2023-08-17 22:18:09 +08:00
29ff7b7964 [fix](merge-on-write) add sentinel mark when do compaction (#23078) 2023-08-17 20:08:01 +08:00
c5c984b79b [refactor](bitmap) using template to reduce duplicate code (#23060)
* [refactor](bitmap) support for batch value insertion

* fix values not being filled for int8 and int16
2023-08-17 18:14:29 +08:00
b91bb9f503 [fix](alter table property) fix alter property if rpc failed (#22845)
* fix alter property

* add regression case

* do not repeat
2023-08-17 18:02:34 +08:00
11d76d0ebe [fix](Nereids) non-inner join should not merge dist info (#22979)
1. left join should use left dist info.
2. right join should use right dist info.
3. full outer join should return ANY dist info.
2023-08-17 17:48:50 +08:00
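The three rules above can be sketched as a lookup on join type. An illustrative Python sketch only; Nereids' actual representation of distribution info is richer, and the string constants here are hypothetical.

```python
def output_dist_info(join_type, left_dist, right_dist):
    # Only an inner join may merge distribution info from both sides;
    # outer joins can inject nulls on the non-preserved side, so that
    # side's distribution property no longer holds for the output.
    if join_type == "LEFT_OUTER":
        return left_dist
    if join_type == "RIGHT_OUTER":
        return right_dist
    if join_type == "FULL_OUTER":
        return "ANY"
    # inner join: both sides' distribution properties remain valid
    return (left_dist, right_dist)
```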
330f369764 [enhancement](file-cache) limit the file cache handle num and init the file cache concurrently (#22919)
1. the real value of BE config `file_cache_max_file_reader_cache_size` will be capped at 1/3 of the process's max open file number.
2. use a thread pool to create or init the file caches concurrently.
    This solves the issue that BE startup is very slow when the file cache
    directories contain many files, because they were previously traversed sequentially.
2023-08-17 16:52:08 +08:00
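Both changes above can be sketched briefly. A minimal Python sketch, not the BE's C++ code: the function names and the `scan_one` callback are hypothetical, and `resource.getrlimit` stands in for however the BE reads its fd limit (Unix only).

```python
import resource
from concurrent.futures import ThreadPoolExecutor

def reader_cache_limit():
    # Cap cached file readers at 1/3 of the process's soft fd limit so
    # the cache cannot exhaust available file descriptors.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft // 3

def init_file_caches(cache_dirs, scan_one, workers=8):
    # Initialize all cache directories in parallel; a sequential
    # traversal stalls startup when directories hold many files.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_one, cache_dirs))
```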
d7a6b64a65 [Fix](Planner) fix case function with null cast to array null (#22947) 2023-08-17 16:37:07 +08:00
b252c49071 [fix](hash join) fix heap-use-after-free of HashJoinNode (#23094) 2023-08-17 16:29:47 +08:00
47aac84549 Revert "[pipeline](branch-2.0) pr to branch-2.0 also run checks (#23004)" (#23101)
This reverts commit 41a52d45d33be6c1770531cef230aafe676bcce7.
2023-08-17 15:53:22 +08:00
a248cb720c [fix](jdbc catalog) fix DefaultValueExpr in Jdbc table column when CTAS (#22978) 2023-08-17 15:52:20 +08:00
f092afc946 [Regression](pipeline) update p1 pipeline to required 0817 (#23100)
update p1 pipeline to required 0817
2023-08-17 15:47:40 +08:00
e289e03a1a [fix](executor)fix no return with old type in time_round 2023-08-17 15:34:26 +08:00
Pxl
cf1865a1c8 [Bug](scan) fix core dump due to store_path_map (#23084)
fix core dump due to store_path_map
2023-08-17 15:24:43 +08:00
3fe419eafa [Fix](statistics)Fix update cached column stats bug (#23049)
`show column cached stats` sometimes shows wrong min/max values:
```
mysql> show column cached stats hive.tpch100.region;
+-------------+-------+------+----------+-----------+---------------+------+------+--------------+
| column_name | count | ndv  | num_null | data_size | avg_size_byte | min  | max  | updated_time |
+-------------+-------+------+----------+-----------+---------------+------+------+--------------+
| r_regionkey | 5.0   | 5.0  | 0.0      | 24.0      | 4.0           | N/A  | N/A  | null         |
| r_comment   | 5.0   | 5.0  | 0.0      | 396.0     | 66.0          | N/A  | N/A  | null         |
| r_name      | 5.0   | 5.0  | 0.0      | 40.8      | 6.8           | N/A  | N/A  | null         |
+-------------+-------+------+----------+-----------+---------------+------+------+--------------+
```
This PR fixes the bug: when a ColumnStatistic object is serialized to JSON, the minExpr and maxExpr attributes are not included.
2023-08-17 15:20:02 +08:00
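The serialization bug above is easy to reproduce in miniature. An illustrative Python sketch, not the FE's Java code; the field set is a hypothetical subset of ColumnStatistic.

```python
import json

def column_stat_to_json(stat):
    # The bug: minExpr/maxExpr were omitted during serialization, so
    # the cached stats showed N/A for min/max after a round trip.
    # Including them explicitly preserves the values.
    return json.dumps({
        "count": stat["count"],
        "ndv": stat["ndv"],
        "minExpr": stat["minExpr"],
        "maxExpr": stat["maxExpr"],
    })
```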
d59c2f763f [fix](test) add sync for test_pk_uk_case (#23067) 2023-08-17 15:18:07 +08:00
bf2b92f5e8 [fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation (#23066)
* [fix](Nereids): PushdownDistinctThroughJoin don't push distinct for relation.

* fix test
2023-08-17 14:50:34 +08:00
f5da9f4ccc [fix](multi-catalog)convert to s3 path when use aws endpoint (#22784)
Convert to an s3 path when using an AWS endpoint.
For compatibility, we can also use the s3 client to access other clouds by setting the s3 endpoint properties.
2023-08-17 14:28:00 +08:00
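The path conversion above might look like the following sketch. This is an assumption-laden illustration, not Doris's actual logic: the scheme list and the `amazonaws.com` endpoint check are hypothetical.

```python
def to_s3_path(path, endpoint):
    # If the catalog is configured with an AWS S3 endpoint, rewrite
    # other object-store schemes to s3:// so one S3 client can serve
    # compatible stores as well.
    if "amazonaws.com" not in endpoint:
        return path
    for scheme in ("oss://", "cos://", "obs://", "s3a://", "s3n://"):
        if path.startswith(scheme):
            return "s3://" + path[len(scheme):]
    return path
```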
6e51632ca9 [docs](kerberos)add FAQ cases and enable krb5 debug (#22821) 2023-08-17 14:25:09 +08:00
8b51da0523 [Fix](load) fix partition NullPointerException (#22965) 2023-08-17 14:09:47 +08:00
41bce29ae3 [docs](docs)Rename Title and URL of Bitwise Functions (#22722) 2023-08-17 11:18:02 +08:00
92c8f842f7 [fix](nereids) dphyper join reorder use wrong method to get hash and other conjuncts (#22966)
should use getHashJoinConjuncts() and getOtherJoinConjuncts() to get the hash and other conjuncts of a hash join node, instead of categorizing conjuncts by checking whether they are 'EqualTo' expressions
2023-08-17 11:03:45 +08:00
a288377118 [fix](regression) Fix sql server external case (#23031) 2023-08-17 10:54:54 +08:00
d71b99b88a [fix](dbt) fix dbt doris non-root user permission for show frontends sql (#22815) 2023-08-17 09:40:53 +08:00
343a6dc29d [improvement](hash join) Return result early if probe side has no data (#23044) 2023-08-17 09:17:09 +08:00
a77e9fbc99 (chores)(ui) download profile filename add profile_id (#23065) 2023-08-17 09:11:01 +08:00
7a9ff47528 [Improve](CI)Modify Deadline-check trigger mode, and add maven cache for Sonarcheck (#23069)
There are a lot of dead links at present; we will re-enable the check after a full repair…
2023-08-16 22:31:50 +08:00
814acbf331 [pipeline](exec) disable pipeline load in master code (#23061)
disable pipeline load in master code
2023-08-16 21:53:58 +08:00
390c52f73a [Improve](complex-type) update for array/map element_at with nested complex type with local tvf (#22927) 2023-08-16 20:47:36 +08:00
a5c73c7a39 [fix](partial update) set io_ctx.reader_type when reading columns for partial update (#22630) 2023-08-16 19:34:39 +08:00
0aa57d159e [Fix](Partial update) Fix wrong position using in segment writer (#22782) 2023-08-16 19:31:06 +08:00
0594acfcf1 [fix](Nereids) scan should output all invisible columns (#23003) 2023-08-16 18:07:59 +08:00
b815cf327a [enhancement](merge-on-write) Add more log info when delete bitmap correctness check failed (#22984) 2023-08-16 17:25:11 +08:00
f1880d32d9 [fix](nereids)bind slot failed because of "default_cluster" #23008
Slot binding failed for the following queries:
select tpch.lineitem.* from lineitem
select tpch.lineitem.l_partkey from lineitem

The unbound slot is tpch.lineitem.l_partkey, but the bound slot is default_cluster:tpch.lineitem.l_partkey, so they do not match.
We need to ignore the default_cluster: prefix when comparing dbName.
2023-08-16 17:22:44 +08:00
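The prefix-insensitive comparison above can be sketched in a few lines. An illustrative Python sketch; the actual fix lives in Nereids' Java slot-binding code.

```python
def db_name_equal(a, b):
    # Legacy FE code may prefix database names with "default_cluster:";
    # strip it from both sides before comparing so that
    # "default_cluster:tpch" and "tpch" match.
    prefix = "default_cluster:"
    def strip(name):
        return name[len(prefix):] if name.startswith(prefix) else name
    return strip(a) == strip(b)
```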
92f443b3b8 [enhancement](Nereids): count(1) to count(*) #22999
add a rule to transform count(1) to count(*)
2023-08-16 17:19:23 +08:00
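The rewrite rule above is semantics-preserving because the literal 1 is never NULL, so count(1) counts every row exactly like count(*). A toy regex-based Python sketch of the transformation (the real rule operates on the Nereids expression tree, not on SQL text):

```python
import re

def rewrite_count_one(sql):
    # count(1) counts every row, exactly like count(*); rewriting lets
    # the optimizer use the cheaper count(*) path.
    return re.sub(r"count\(\s*1\s*\)", "count(*)", sql, flags=re.IGNORECASE)
```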
2dbca7a688 [Fix](Planner) fix multi phase analysis failed in multi instance environment substitution (#22840)
Problem:
When executing group_concat with an order by inside a view, the column cannot be found during analysis.

Example:
create view if not exists test_view as select group_concat(c1,',' order by c1 asc) from table_group_concat;
select * from test_view;
It returns an error like: "can not find c1 in table_list"

Reason:
When executing this SQL in a multi-instance environment, the Planner tries to create a plan with multi-phase
aggregation. Because test_view is analyzed independently of the tables outside the view, we cannot get
the table information inside the view.

Solution:
Substitute the order by expressions of the merge aggregation expressions.
2023-08-16 16:46:26 +08:00
7adb2be360 [Fix](Nereids) fix insert into return npe from follower node. (#22734)
When an insert into table command runs on a follower node, it is forwarded to the master node. The parsed statement is not set on the cascades context but on executor::parsedStmt, so we use the latter to get the user info.
2023-08-16 16:37:17 +08:00
6cf1efc997 [refactor](load) use smart pointers to manage writers in memtable memory limiter (#23019) 2023-08-16 16:34:57 +08:00
4512569a3a [docs](releasenote)Update en release note 2.0.0 (#23041) 2023-08-16 15:13:09 +08:00