15483 Commits

Author SHA1 Message Date
6e5f84635b [fix](Nereids): remove duplicated dependency (#28279) 2023-12-12 17:57:52 +08:00
7ac12ac7d7 [feature](Nereids): return edges that can be pulled up (#28280) 2023-12-12 17:55:59 +08:00
f401a9c7ec [fix](pipelineX) fix use-after-free in filter timer queue (#28236) 2023-12-12 17:25:14 +08:00
4e50b2791a [fix](Nereids) pull up cte anchor should also pull up cte in apply (#28214) 2023-12-12 16:56:04 +08:00
45b2dbab6a [improve](group commit) Group commit support max filter ratio when rows is less than value in config (#28139) 2023-12-12 16:33:36 +08:00
d25cbdd4dc [feature](Nereids): one side eager aggregation (#28143) 2023-12-12 15:38:31 +08:00
92e04c1453 [feature](Nereids): support comparing outer join in materialized view (#28237) 2023-12-12 15:36:36 +08:00
d9fb77ad10 [fix](mtmv)fix show task tvf finishTime is null (#28252) 2023-12-12 15:20:02 +08:00
9861cfc4bc [Fix](Transactional-Hive) Fix transactional hive core dump when TransactionalHiveReader::init_row_filters(). (#28238)
Fix transactional hive core dump when TransactionalHiveReader::init_row_filters().
2023-12-12 14:17:26 +08:00
cd8885f63e [fix](Nereids): support Chinese characters set (#28256) 2023-12-12 13:22:14 +08:00
3f202477ec [minor](import) modify some imports (#28206) 2023-12-12 11:39:54 +08:00
a5a7ab3c65 [pipelineX](profile) Add debug string if enable profile (#28262) 2023-12-12 11:39:08 +08:00
e49ed3d885 [regression test](memtable) add case for aggregation memtable (#28056)
1. create aggregation table
2. insert some data
3. drop the table and create again
4. modify some parameters for some branch
5. insert some data
6. change the parameters back to their defaults
2023-12-12 11:14:59 +08:00
8a7b0e5485 [upgrade](thirdparty) upgrade librdkafka from 1.8.2 to 2.0.2 (#28210)
An error can occur when using routine load:
```
[INTERNAL_ERROR]Message at offset XXX might be too large to fetch, try increasing receive.message.max.bytes
```
See https://github.com/confluentinc/librdkafka/issues/2993; we should upgrade librdkafka to avoid this bug.
2023-12-12 11:12:10 +08:00
5ff110e845 [exec](profile) only build expr debug string enable profile (#28261) 2023-12-12 09:13:37 +08:00
7fba3fcb91 [pipelineX](improvement) block local shuffle sink by mem usage (#28224) 2023-12-11 21:25:31 +08:00
c4e484916b [Fix](table property) Fix table property disable_auto_compaction (#27853) 2023-12-11 20:48:11 +08:00
cd3d31ba13 [fix](statistics)Escape load stats sql (#28117)
Escape load stats sql, because column name may contain special characters.
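A minimal sketch of the kind of escaping this message describes, assuming the column name is embedded in a single-quoted, MySQL-style string literal. The helper names, the table `column_statistics`, and the column `col_id` are placeholders for illustration and are not taken from the commit.

```python
def escape_sql_literal(value: str) -> str:
    """Escape a value for use inside a single-quoted, MySQL-style string literal."""
    # Escape backslashes first so the quote handling below does not
    # re-process characters introduced by this step.
    return value.replace("\\", "\\\\").replace("'", "''")


def build_stats_query(column_name: str) -> str:
    """Build a lookup query; table and column names are hypothetical."""
    escaped = escape_sql_literal(column_name)
    return f"SELECT * FROM column_statistics WHERE col_id = '{escaped}'"
```

Where the client library supports bound parameters, those are usually preferable to manual escaping.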
2023-12-11 20:25:18 +08:00
Pxl
f30cc1f6ef [Bug](view) fix npe on create view with comment (#28189)
fix npe on create view with comment
2023-12-11 20:00:21 +08:00
d4f89389e3 [improve](group commit) Group commit support skip wal (#27957) 2023-12-11 19:38:32 +08:00
9b8de017df [Regression test](inverted index) fix regression case for index_compound_directory_fault_injection (#28232) 2023-12-11 19:17:28 +08:00
c1f666c497 [doc] fix typo (#28245) 2023-12-11 18:09:54 +08:00
877935442f [feature](pipelineX)use markFragments instead of markInstances in pipelineX (#27829) 2023-12-11 17:59:53 +08:00
3e1e8d2ebe [fix](jdbc catalog) Fixed data conversion problem when all data is null (#28230) 2023-12-11 17:57:57 +08:00
cff1de29ce [fix](group commit) Fix group commit memory calculation (#28242) 2023-12-11 17:05:26 +08:00
ad483efca5 [regression-test](case) forbid group commit case first #28244 2023-12-11 16:06:59 +08:00
3c2e8b0ecf [fix](Nereids) rewrite cte children check wrong map for consumer (#28220) 2023-12-11 14:58:42 +08:00
c2d6fbbc85 [feature](Nereids): add filter edge in hyperGraph (#28006) 2023-12-11 14:36:43 +08:00
593cc92501 [chore] Change default max segment size to 1GB (#28201) 2023-12-11 14:30:57 +08:00
1bbc54d1b2 [regression-test](variant) change p2 case to s3 load (#28193) 2023-12-11 12:31:25 +08:00
ac167f493b [fix](join) fix decimal overflow caused by left outer join (#28221)
For a left outer join or full outer join, when the build side is empty, null data is output for the build side, but the nested column data of the nullable column is not properly initialized, which may cause decimal arithmetic overflow.
2023-12-11 11:51:05 +08:00
7c163fdf21 [test](decimal) add some cases about overflow (#28198) 2023-12-11 11:22:53 +08:00
f2fd66ad3b [feature-wip](nereids) Make nereids more compatible with spark-sql syntax. (#27231)
**Thanks to** PR #21855 for providing a wonderful reference.

Implementing **a comprehensive logical plan adapter** may be very difficult and **costly**, and there may be only small syntax variations between Doris and some other engines (such as Hive/Spark), so we can just **focus on** the **differences** here.

This PR mainly focuses on the **syntax differences between Doris and spark-sql**. For instance, it adds some function transformations and overrides some syntax validations.

- add a dialect named `spark_sql`
- move method `NereidsParser#parseSQLWithDialect` to `TrinoParser`
- extract some `FnCallTransformer`/`FnCallTransformers` classes, so we can reuse the logic about the function transformers
- allow derived tables without alias when the dialect is set to `spark_sql` (both the legacy and Nereids parsers are supported)
- add some function transformers for hive/spark built-in functions

### Test case (from our online doris cluster)

- Test derived table without alias

```sql
MySQL [(none)]> show variables like '%dialect%';
+---------------+-----------+---------------+---------+
| Variable_name | Value     | Default_Value | Changed |
+---------------+-----------+---------------+---------+
| sql_dialect   | spark_sql | doris         | 1       |
+---------------+-----------+---------------+---------+
1 row in set (0.01 sec)

MySQL [(none)]> select * from (select 1);
+------+
| 1    |
+------+
|    1 |
+------+
1 row in set (0.03 sec)

MySQL [(none)]> select __auto_generated_subquery_name.a from (select 1 as a);
+------+
| a    |
+------+
|    1 |
+------+
1 row in set (0.03 sec)

MySQL [(none)]> set sql_dialect=doris;
Query OK, 0 rows affected (0.02 sec)

MySQL [(none)]> select * from (select 1);
ERROR 1248 (42000): errCode = 2, detailMessage = Every derived table must have its own alias
MySQL [(none)]> 
```

- Test spark-sql/hive built-in functions

```sql
MySQL [(none)]> show global functions;
Empty set (0.01 sec)

MySQL [(none)]> show variables like '%dialect%';
+---------------+-----------+---------------+---------+
| Variable_name | Value     | Default_Value | Changed |
+---------------+-----------+---------------+---------+
| sql_dialect   | spark_sql | doris         | 1       |
+---------------+-----------+---------------+---------+
1 row in set (0.01 sec)

MySQL [(none)]> select get_json_object('{"a":"b"}', '$.a');
+----------------------------------+
| json_extract('{"a":"b"}', '$.a') |
+----------------------------------+
| "b"                              |
+----------------------------------+
1 row in set (0.04 sec)

MySQL [(none)]> select split("a b c", " ");
+-------------------------------+
| split_by_string('a b c', ' ') |
+-------------------------------+
| ["a", "b", "c"]               |
+-------------------------------+
1 row in set (1.17 sec)
```
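
The rewrites visible in the session output above (`get_json_object` to `json_extract`, `split` to `split_by_string`) can be pictured as a name-mapping table. The following is only a text-level sketch under that assumption: the names `SPARK_TO_DORIS` and `rewrite_function_names` are hypothetical, and the PR itself works through `FnCallTransformer` classes on the parsed statement, which may also adjust arguments. A regex-based version like this one would also rewrite matches inside string literals, so it is for illustration only.

```python
import re

# Mappings taken from the session output above; the dictionary name is illustrative.
SPARK_TO_DORIS = {
    "get_json_object": "json_extract",
    "split": "split_by_string",
}

# An identifier immediately followed by "(" is treated as a function call.
_CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def rewrite_function_names(sql: str, mapping: dict = SPARK_TO_DORIS) -> str:
    """Replace known dialect function names and leave all other calls untouched."""
    def repl(match: re.Match) -> str:
        target = mapping.get(match.group(1).lower())
        return f"{target}(" if target else match.group(0)

    return _CALL.sub(repl, sql)
```

For example, `rewrite_function_names("select split('a b c', ' ')")` yields a statement that calls `split_by_string`, while a call such as `count(1)` passes through unchanged.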
2023-12-11 11:16:53 +08:00
e1587537bc [Fix](status) fix unhandled status in exprs #28218
These statuses were marked with static_cast<void> in https://github.com/apache/doris/pull/23395/files.
Partially fixes #28160.
2023-12-11 11:04:58 +08:00
53802fe0da [doc] document desc param is incorrect #26063 (#26064) 2023-12-11 10:33:07 +08:00
f236261256 [fix](regression) compaction cases adapt force_olap_table_replica_num option (#28136) 2023-12-11 10:08:21 +08:00
8f2202c89d [minor](log) Add debug info in operators (#28211) 2023-12-11 10:02:24 +08:00
1e5ff40e17 [refactor](group commit) remove future block (#27720)
Co-authored-by: huanghaibin <284824253@qq.com>
2023-12-11 08:41:51 +08:00
320ddf4987 [pipelineX](improvement) Support multiple instances execution on single tablet (#28178) 2023-12-10 20:18:41 +08:00
485d7db516 [fix](partial update) Fix missing rowsets during doing alignment when flushing memtable due to compaction (#28062) 2023-12-10 12:09:48 +08:00
a3cd36ce60 [bug](cooldown) Fix incorrect remote rowset dir after restarting BE (#28140) 2023-12-10 00:44:01 +08:00
5aa90a3bce [pipelineX](local shuffle) Fix bucket hash shuffle (#28202) 2023-12-10 00:35:00 +08:00
61379b141e [fix](insert) fix group commit regression test (#28142) 2023-12-09 16:24:20 +08:00
4e86f9bab5 [improve](move-memtable) include and check offset when append data (#28159) 2023-12-09 16:21:36 +08:00
16e232a8a1 [minor](lower-table-names) use GlobalVariable.lowerCaseTableNames instead of Config.lower_case_table_names (#27911)
GlobalVariable.lowerCaseTableNames instead of Config.lower_case_table_names
2023-12-09 12:04:26 +08:00
363721e066 [Bug](udf) java-udf function open failed cause BE core dump #28063
When the java-udf open function fails, some JNI objects have not been set,
so the close function cannot call JNI.
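The failure mode in this commit message, cleanup running against state that was never set up, is commonly handled by guarding the close path. The sketch below illustrates that general pattern only; the class `UdfExecutor` and its fields are hypothetical stand-ins and do not reflect the actual BE code.

```python
class UdfExecutor:
    """Hypothetical wrapper that owns a handle created during open()."""

    def __init__(self) -> None:
        self.handle = None    # set only after open() succeeds
        self.closed = False   # True once a real handle has been released

    def open(self, fail: bool = False) -> None:
        if fail:
            # Simulates open() failing before the handle exists.
            raise RuntimeError("open failed before the handle was created")
        self.handle = object()

    def close(self) -> None:
        # Skip cleanup for resources that were never acquired.
        if self.handle is None:
            return
        self.handle = None
        self.closed = True
```

With this guard, calling `close()` after a failed `open()` is a no-op instead of touching uninitialized state.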
2023-12-09 11:00:30 +08:00
42aa174405 [chore](log) Log to trace before wait rpc timeout #28024 2023-12-09 10:04:43 +08:00
9d9b6462bf [improve](group_commit) optimize group commit select be logic #28190
Group commit always chooses the first non-decommissioned BE among all BEs.

Choosing a BE with selectBackendIdsByPolicy, like common stream load, while still skipping decommissioned BEs, may be better.
2023-12-09 05:09:52 +08:00
287bd87a4f [typo](docs)add some faq for flink-connector-doris (#26309)
* add flink-connector-doris faq

* add faq
2023-12-09 02:19:49 +08:00
bd8130154a [fix](doc) spell errors fixes hardware-info-action (#28154) 2023-12-09 01:47:19 +08:00