9308 Commits

Author SHA1 Message Date
990dce9a47 [fix](load) fix load channel timeout too fast in routine load task (#17796)
enlarge the timeout in routine load

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-03-15 21:12:02 +08:00
0ad459fea5 [fix](multi-catalog) fix forward to master throw NPE (#17791)
The ConnectionContext may be null, and we don't actually need a ConnectionContext in MasterCatalogExecutor.
2023-03-15 20:50:29 +08:00
e4a1e57d6f [feature](multi-catalog) support sap hana jdbc catalog and jdbc external table (#17780) 2023-03-15 20:37:36 +08:00
54a51933b6 [improvement](FQDN) Support existing cluster upgrade (#17659) 2023-03-15 20:13:13 +08:00
079e6a3e12 [regression-test](vectorized) remove unused vectorization flag (#17662) 2023-03-15 17:59:22 +08:00
1fdc265083 [fix](Nereids) lock tables when analyze may lead to dead lock (#17776)
Nereids uses AutoCloseable to take and release table read locks.
However, if the AutoCloseable throws an exception while opening the resource, its close function is not called.
So we close manually when an exception is thrown in the opening stage.
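
The same pitfall can be illustrated outside Java. The following is a minimal Python sketch, not Doris code: a context manager stands in for AutoCloseable, and `TableReadLocks` and its `fail_at` parameter are hypothetical names introduced only for the demonstration. When opening fails partway through, the cleanup hook is not invoked automatically, so the opening code releases whatever it already holds before re-raising.

```python
import threading


class TableReadLocks:
    """Hypothetical stand-in for an AutoCloseable that locks several tables."""

    def __init__(self, locks, fail_at=None):
        self._locks = locks
        self._fail_at = fail_at  # index at which opening fails (demo only)
        self._held = []

    def __enter__(self):
        try:
            for i, lock in enumerate(self._locks):
                if i == self._fail_at:
                    raise RuntimeError("simulated failure while opening")
                lock.acquire()
                self._held.append(lock)
        except Exception:
            # __exit__ is NOT called when __enter__ raises,
            # so release the locks acquired so far by hand.
            self._release()
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        self._release()
        return False  # do not swallow exceptions from the with-body

    def _release(self):
        while self._held:
            self._held.pop().release()


locks = [threading.Lock(), threading.Lock()]
try:
    with TableReadLocks(locks, fail_at=1):
        pass
except RuntimeError:
    pass

# The first lock was taken before the failure; it must be free again.
still_locked = [lock.locked() for lock in locks]
```

Without the `except` branch in `__enter__`, the first lock would stay held after the failure, which is the kind of leak that can later surface as a deadlock.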
2023-03-15 14:00:27 +08:00
1e3da95359 [fix](Nereids): fix LAsscom split conjuncts. (#17792) 2023-03-15 13:39:08 +08:00
ceff7e851d [fix](cooldown)Check cooldown ttl and datetime when alter storage policy (#17779)
* Check cooldown ttl and datetime when alter storage policy
2023-03-15 12:19:30 +08:00
a378a6024d [fix](cooldown)Support change be.conf: max_sub_cache_file_size (#17773)
* delete files when sub file cache size is changed.
2023-03-15 12:19:12 +08:00
bbf88ecc49 [Bug](datetimev2) Fix BE crash if scale is invalid (#17763) 2023-03-15 12:08:23 +08:00
049b70b957 [test](Nereids) add yandex metrica p2 regression case (#17082) 2023-03-15 11:50:00 +08:00
c8de04f9d7 [fix][Nereids] fix not correct condition to checkReorder in InnerJoinRightAssociate. (#17799) 2023-03-15 11:49:03 +08:00
97bf07fe26 [enhancement](Nereids) add new distributed cost model (#17556)
Add a new distributed cost model in Nereids. The new cost model models the cost of the pipeline execution engine by dividing cost into start and run costs:
* START COST: the cost from starting to emitting the first tuple
* RUN COST: the cost from emitting the first tuple to emitting all tuples

For a parent operator and its child operator, we assume the following timeline:
  ```
  child start ---> child run --------------------> finish
             |---> parent start ---> parent run -> finish
  ```

Therefore, in the parallel model, we can get:
  ```
  start_cost(parent) = start_cost(child) + start_cost(parent)
  run_cost(parent) = max(run_cost(child), start_cost(parent) + run_cost(parent))
  ```
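
As a rough illustration of the two formulas, here is a small Python sketch. It assumes that the parent terms on the right-hand side are the parent's own costs and the left-hand side is the cumulative cost of the subtree; the `Cost` class, `combine` function, and the numbers are hypothetical and not taken from the Nereids code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    start: float  # cost until the first tuple is emitted
    run: float    # cost from the first tuple to the last tuple


def combine(child: Cost, parent_own: Cost) -> Cost:
    """Fold the child's cumulative cost into the parent's own cost."""
    start = child.start + parent_own.start
    # The parent starts working while the child is still running,
    # so the slower of the two paths determines the run cost.
    run = max(child.run, parent_own.start + parent_own.run)
    return Cost(start, run)


scan = Cost(start=1.0, run=10.0)       # hypothetical child operator
filter_own = Cost(start=0.5, run=4.0)  # hypothetical parent operator
total = combine(scan, filter_own)
```

With these inputs the child dominates the run phase, so the combined run cost equals the child's run cost, while the start costs add up.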
2023-03-15 11:22:31 +08:00
66f3ef568e (functions) optimize const_column to full convert 2023-03-15 10:57:03 +08:00
85080ee3c3 [vectorized](function) support array_map function (#17581) 2023-03-15 10:51:29 +08:00
ca0367d846 FIX: es doc (#17771) 2023-03-15 10:40:53 +08:00
5ab758674e [fix](planner) nested loop join with left semi generate repeat result (#17767) 2023-03-15 09:56:44 +08:00
45fcdaabc7 [Bug](catalog) Fix fetching information_schema table timed out(#17692) (#17694)
Co-authored-by: hugoluo <hugoluo@tencent.com>
2023-03-15 09:56:24 +08:00
16a4dc0a85 [ehancement](profile) Disable profiling for the internal query (#17720) 2023-03-15 09:48:29 +08:00
64c2437be5 [fix](coalesce) support coalesce function for bitmap (#17798) 2023-03-15 09:34:44 +08:00
9b047d2c94 Feat: Add byte size to TTypedesc in TExpr. Which will be used to carry scalarType information. (#17757)
Co-authored-by: libinfeng <libinfeng@selectdb.com>
2023-03-15 08:24:32 +08:00
7872f3626a [feature](Nereids): Rewrite InPredicate to disjunction if there exist items < 3 elements in InPredicate (#17646)
* [feature](Nereids): Rewrite InPredicate to disjunction if there exists < 3 elements in InPredicate

* fix SimplifyRange
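
The rewrite named in the title can be sketched on predicate strings. This is a minimal Python illustration, assuming a threshold of 3 as stated in the commit title; `rewrite_in_predicate` is a hypothetical helper, and the real rule operates on Nereids expression trees rather than on text.

```python
def rewrite_in_predicate(column, values, threshold=3):
    """Turn a short IN list into an OR-chain of equalities; keep long lists."""
    rendered = [repr(v) for v in values]
    if 0 < len(values) < threshold:
        return " OR ".join(f"{column} = {v}" for v in rendered)
    return f"{column} IN ({', '.join(rendered)})"


short_list = rewrite_in_predicate("a", [1, 2])
long_list = rewrite_in_predicate("a", [1, 2, 3])
```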
2023-03-15 08:23:56 +08:00
02220560c5 [Improvement](multi catalog)Hive splitter. Get HDFS/S3 splits by using FileSystem api (#17706)
Use the FileSystem API to get splits for files in HDFS/S3 instead of calling InputFormat.getSplits.
The splits are based on blocks in HDFS/S3.
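
The idea of block-based splitting can be sketched as follows. This Python snippet is only an illustration under the assumption of fixed-size blocks; `block_splits` is a hypothetical helper, and the actual change presumably works from block locations returned by the FileSystem API.

```python
def block_splits(file_length, block_size):
    """Return (offset, length) ranges, one per block of the file."""
    splits = []
    offset = 0
    while offset < file_length:
        length = min(block_size, file_length - offset)
        splits.append((offset, length))
        offset += length
    return splits


# A 300-byte file with 128-byte blocks yields two full splits and a tail.
splits = block_splits(file_length=300, block_size=128)
```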
2023-03-15 00:25:00 +08:00
b28f31f98d [fix](meta) fix show create table result of hive table (#17677)
make it usable in hive.

current issue: the types of partition columns are wrapped in ``, which is not legal in hive. One problem case:

CREATE TABLE t3p_parquet(
id int,
name string)
PARTITIONED BY (
dt int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
'hdfs://path/to/t3p_parquet'
TBLPROPERTIES (
'transient_lastDdlTime'='1671700883')
2023-03-14 22:50:35 +08:00
76f486980a [docs](user)update the users number (#17749) 2023-03-14 22:42:51 +08:00
e46077fbf4 print group id for physical plan node (#17742) 2023-03-14 22:35:08 +08:00
7180cf3d9b [Improve](row store) avoid serialize null slot into a jsonb row (#17734)
This could save some disk space
2023-03-14 22:13:41 +08:00
6348819c27 [fix](Nereids) remove bitmap_union_int(bigint) signature (#17356) 2023-03-14 20:42:47 +08:00
ff9e03e2bf [Feature](add bitmap udaf) add the bitmap intersection and difference set for mixed calculation of udaf (#15588)
* Add the bitmap intersection and difference set for mixed calculation of udaf

Co-authored-by: zhangbinbin05 <zhangbinbin05@baidu.com>
2023-03-14 20:40:37 +08:00
65f71d9e06 [enhance](nereids) broadcast cost calculate (#17711)
Update the broadcast join cost estimate according to the BE implementation.
There is an enhancement on BE: in a broadcast join, BE builds only one hash table, not instanceNum hash tables.
2023-03-14 19:45:03 +08:00
699159698e [enhancement](planner) support update from syntax (#17639)
support update from syntax

note: enable_concurrent_update is not supported now

```
UPDATE <target_table>
  SET <col_name> = <value> [ , <col_name> = <value> , ... ]
  [ FROM <additional_tables> ]
  [ WHERE <condition> ]
```

for example:
t1
```
+----+----+----+-----+------------+
| id | c1 | c2 | c3  | c4         |
+----+----+----+-----+------------+
| 3  | 3  | 3  | 3.0 | 2000-01-03 |
| 2  | 2  | 2  | 2.0 | 2000-01-02 |
| 1  | 1  | 1  | 1.0 | 2000-01-01 |
+----+----+----+-----+------------+
```

t2
```
+----+----+----+------+------------+
| id | c1 | c2 | c3   | c4         |
+----+----+----+------+------------+
| 4  | 4  | 4  |  4.0 | 2000-01-04 |
| 2  | 20 | 20 | 20.0 | 2000-01-20 |
| 5  | 5  | 5  |  5.0 | 2000-01-05 |
| 1  | 10 | 10 | 10.0 | 2000-01-10 |
| 3  | 30 | 30 | 30.0 | 2000-01-30 |
+----+----+----+------+------------+
```

t3
```
+----+
| id |
+----+
| 1  |
| 5  |
| 4  |
+----+
```

do update
```sql
 update t1 set t1.c1 = t2.c1, t1.c3 = t2.c3 * 100 from t2 inner join t3 on t2.id = t3.id where t1.id = t2.id;
```

the result
```
+----+----+----+--------+------------+
| id | c1 | c2 | c3     | c4         |
+----+----+----+--------+------------+
| 3  | 3  | 3  |    3.0 | 2000-01-03 |
| 2  | 2  | 2  |    2.0 | 2000-01-02 |
| 1  | 10 | 1  | 1000.0 | 2000-01-01 |
+----+----+----+--------+------------+
```
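
To check the result by hand, the statement can be simulated in a few lines of Python. This is only a sketch of the semantics on the sample rows above, using plain dictionaries keyed by `id`; it is not how the planner executes the update.

```python
# Sample rows from the tables above, keyed by id.
t1 = {
    3: {"c1": 3, "c2": 3, "c3": 3.0, "c4": "2000-01-03"},
    2: {"c1": 2, "c2": 2, "c3": 2.0, "c4": "2000-01-02"},
    1: {"c1": 1, "c2": 1, "c3": 1.0, "c4": "2000-01-01"},
}
# Only the t2 columns referenced by the SET clause are kept here.
t2 = {
    4: {"c1": 4, "c3": 4.0},
    2: {"c1": 20, "c3": 20.0},
    5: {"c1": 5, "c3": 5.0},
    1: {"c1": 10, "c3": 10.0},
    3: {"c1": 30, "c3": 30.0},
}
t3 = {1, 5, 4}

# FROM t2 INNER JOIN t3 ON t2.id = t3.id
joined = {id_: row for id_, row in t2.items() if id_ in t3}

# WHERE t1.id = t2.id, then SET t1.c1 = t2.c1, t1.c3 = t2.c3 * 100
for id_, src in joined.items():
    if id_ in t1:
        t1[id_]["c1"] = src["c1"]
        t1[id_]["c3"] = src["c3"] * 100
```

Only `id = 1` appears in all three tables, so it is the only row of `t1` that changes.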
2023-03-14 19:26:30 +08:00
f999b823fc [feature](array) support array for apache arrow convertor (#17682)
* support array type for arrow

* fix builder.Append() for each array row

* fix array child column append start offset
2023-03-14 17:53:16 +08:00
f1dde20315 [ehancemnet](nereids) Refactor statistics (#17637)
1. Support more expression types
2. Support derivation with histograms
3. Use StatisticRange to abstract the logic
4. Use Statistics rather than StatisDeriveResult
2023-03-14 13:10:55 +08:00
be3a7e69cd [refactor](Nereids): polish code SemiJoinLogicalJoinTranspose. (#17740) 2023-03-14 12:48:58 +08:00
77ab2fac20 [refactor](functioncontext) remove function context impl class (#17715)
Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-03-14 11:21:45 +08:00
3a97190661 [fix](Nereids) Compare plan with their output rather than string in UnrankTest (#17698)
After adding a unique ID, UnrankTest fails because each plan has a different ID in its string.
To avoid the effect of the unique ID, compare plans by their output rather than by the string.
2023-03-14 11:10:06 +08:00
5b39fa9843 [Feature](vec)(quantile_state): support quantile state in vectorized engine (#16562)
* [Feature](vectorized)(quantile_state): support vectorized quantile state functions
1. for now, quantile columns must be non-nullable
2. add some regression test cases
3. set default enable_quantile_state_type = true
---------

Co-authored-by: spaces-x <weixiang06@meituan.com>
2023-03-14 10:54:04 +08:00
36a0d40ac3 Fix errors in the data-partition.md (#17756) 2023-03-14 10:44:57 +08:00
ba0f5a2355 [test](mv) Add mv case from fe ut (#17204)
Add some MV cases from the FE unit test MaterializedViewFunctionTest.
2023-03-14 10:29:43 +08:00
2e0af4e33c [Enhancement](inverted-index) use read buffer when read index bytes in compound reader (#17306)
Read IO can be a problem when reading an inverted index from disk.
Use a read buffer to reduce IO.
Set the use-buffer flag to true when reading internal bytes in the compound reader for the inverted index.
2023-03-14 10:10:59 +08:00
7d91114304 [fix](join) fix wrong result of null aware left anti join (#17752) 2023-03-14 09:35:46 +08:00
c6630a06c1 [Fix](multi-catalog) Fix "test_hive_other" regression test. (#17611) 2023-03-14 09:16:48 +08:00
76458cf091 [typo](partition)Modify the list partition document #17744 2023-03-14 08:27:26 +08:00
883ae8a86d [typo](docs) Add some content for bitmap_hash.md. (#17747) 2023-03-14 08:27:07 +08:00
f3c6ee5961 [Enhance](ComputeNode) ES Scan node support to be scheduled to compute node (#16533)
Support scheduling ES scan nodes to compute nodes.
2023-03-14 00:13:24 +08:00
9b7596f1c6 [Feature](Dynamic schema table) step1 support schema change expression (#17494)
1. introduce a new type `VARIANT` to encapsulate dynamically generated columns, hiding the details of the types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` to perform schema changes, for extensibility
2023-03-13 15:12:42 +08:00
c302fa2564 [Feature](array-function) Support array_pushfront function (#17584) 2023-03-13 14:26:02 +08:00
ac944e2ac1 [fix](cooldown)Fix bug for storage policy in dynamic partition (#17665)
* fix bug for partition storage policy
2023-03-13 14:13:55 +08:00
be5147c32e [enhancement](feservice) catch throwable and print log for frontend service (#17708)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-13 11:27:00 +08:00
2b31fc1472 [fix](regression) segcompaction timeout too short (#16731) (#17565)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-03-13 11:19:21 +08:00