Commit Graph

18429 Commits

Author SHA1 Message Date
45b2dbab6a [improve](group commit) Group commit support max filter ratio when rows is less than value in config (#28139) 2023-12-12 16:33:36 +08:00
d25cbdd4dc [feature](Nereids): one side eager aggregation (#28143) 2023-12-12 15:38:31 +08:00
92e04c1453 [feature](Nereids): support comparing outer join in materialized view (#28237) 2023-12-12 15:36:36 +08:00
d9fb77ad10 [fix](mtmv)fix show task tvf finishTime is null (#28252) 2023-12-12 15:20:02 +08:00
9861cfc4bc [Fix](Transactional-Hive) Fix transactional hive core dump when TransactionalHiveReader::init_row_filters(). (#28238)
2023-12-12 14:17:26 +08:00
cd8885f63e [fix](Nereids): support Chinese characters set (#28256) 2023-12-12 13:22:14 +08:00
3f202477ec [minor](import) modify some imports (#28206) 2023-12-12 11:39:54 +08:00
a5a7ab3c65 [pipelineX](profile) Add debug string if enable profile (#28262) 2023-12-12 11:39:08 +08:00
e49ed3d885 [regression test](memtable) add case for aggregation memtable (#28056)
1. create aggregation table
2. insert some data
3. drop the table and create again
4. modify some parameters for some branch
5. insert some data
6. change the parameters back to their defaults
2023-12-12 11:14:59 +08:00
8a7b0e5485 [upgrade](thirdparty) upgrade librdkafka from 1.8.2 to 2.0.2 (#28210)
The following error can occur when using routine load:
```
[INTERNAL_ERROR]Message at offset XXX might be too large to fetch, try increasing receive.message.max.bytes
```
According to https://github.com/confluentinc/librdkafka/issues/2993, we should upgrade librdkafka to avoid this bug.
2023-12-12 11:12:10 +08:00
5ff110e845 [exec](profile) only build expr debug string enable profile (#28261) 2023-12-12 09:13:37 +08:00
7fba3fcb91 [pipelineX](improvement) block local shuffle sink by mem usage (#28224) 2023-12-11 21:25:31 +08:00
c4e484916b [Fix](table property) Fix table property disable_auto_compaction (#27853) 2023-12-11 20:48:11 +08:00
cd3d31ba13 [fix](statistics)Escape load stats sql (#28117)
Escape load stats sql, because column name may contain special characters.
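As a rough illustration of why escaping matters here, the sketch below escapes a value before it is embedded in a single-quoted SQL string literal. It is a generic MySQL-style escape written for this note, not the code from the PR; the helper names `escape_sql_literal` and `quoted` are hypothetical.

```python
def escape_sql_literal(value: str) -> str:
    """Escape backslashes and single quotes for a single-quoted SQL literal.

    Assumption: MySQL-style backslash escaping. The actual Doris change may
    escape a different set of characters.
    """
    # Escape backslashes first so the backslashes added for quotes
    # are not doubled a second time.
    return value.replace("\\", "\\\\").replace("'", "\\'")


def quoted(value: str) -> str:
    """Wrap an escaped value in single quotes."""
    return "'" + escape_sql_literal(value) + "'"
```

A column named `it's` would then be embedded as `'it\'s'` instead of terminating the literal early.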
2023-12-11 20:25:18 +08:00
Pxl
f30cc1f6ef [Bug](view) fix npe on create view with comment (#28189)
2023-12-11 20:00:21 +08:00
d4f89389e3 [improve](group commit) Group commit support skip wal (#27957) 2023-12-11 19:38:32 +08:00
9b8de017df [Regression test](inverted index) fix regression case for index_compound_directory_fault_injection (#28232) 2023-12-11 19:17:28 +08:00
c1f666c497 [doc] fix typo (#28245) 2023-12-11 18:09:54 +08:00
877935442f [feature](pipelineX)use markFragments instead of markInstances in pipelineX (#27829) 2023-12-11 17:59:53 +08:00
3e1e8d2ebe [fix](jdbc catalog) Fixed data conversion problem when all data is null (#28230) 2023-12-11 17:57:57 +08:00
cff1de29ce [fix](group commit) Fix group commit memory calculation (#28242) 2023-12-11 17:05:26 +08:00
ad483efca5 [regression-test](case) forbid group commit case first #28244 2023-12-11 16:06:59 +08:00
3c2e8b0ecf [fix](Nereids) rewrite cte children check wrong map for consumer (#28220) 2023-12-11 14:58:42 +08:00
c2d6fbbc85 [feature](Nereids): add filter edge in hyperGraph (#28006) 2023-12-11 14:36:43 +08:00
593cc92501 [chore] Change default max segment size to 1GB (#28201) 2023-12-11 14:30:57 +08:00
1bbc54d1b2 [regression-test](variant) change p2 case to s3 load (#28193) 2023-12-11 12:31:25 +08:00
ac167f493b [fix](join) fix decimal overflow caused by left outer join (#28221)
For a left outer join or full outer join, when the build side is empty, null data is output for the build side, but the nested column data of the nullable column is not properly initialized, which may cause decimal arithmetic overflow.
2023-12-11 11:51:05 +08:00
7c163fdf21 [test](decimal) add some cases about overflow (#28198) 2023-12-11 11:22:53 +08:00
f2fd66ad3b [feature-wip](nereids) Make nereids more compatible with spark-sql syntax. (#27231)
**Thanks to** PR #21855 for providing a wonderful reference.

Implementing **a comprehensive logical plan adapter** may be difficult and **costly**, and there may be only small syntax variations between doris and other engines (such as hive/spark), so we can just **focus on** the **differences** here.

This PR mainly focuses on the **syntax differences between doris and spark-sql**. For instance, it adds some function transformations and overrides some syntax validations.

- add a dialect named `spark_sql`
- move method `NereidsParser#parseSQLWithDialect` to `TrinoParser`
- extract some `FnCallTransformer`/`FnCallTransformers` classes, so we can reuse the logic about the function transformers
- allow derived tables without an alias when the dialect is set to `spark_sql` (both the legacy and nereids parsers are supported)
- add some function transformers for hive/spark built-in functions

### Test case (from our online doris cluster)

- Test derived table without alias

```sql
MySQL [(none)]> show variables like '%dialect%';
+---------------+-----------+---------------+---------+
| Variable_name | Value     | Default_Value | Changed |
+---------------+-----------+---------------+---------+
| sql_dialect   | spark_sql | doris         | 1       |
+---------------+-----------+---------------+---------+
1 row in set (0.01 sec)

MySQL [(none)]> select * from (select 1);
+------+
| 1    |
+------+
|    1 |
+------+
1 row in set (0.03 sec)

MySQL [(none)]> select __auto_generated_subquery_name.a from (select 1 as a);
+------+
| a    |
+------+
|    1 |
+------+
1 row in set (0.03 sec)

MySQL [(none)]> set sql_dialect=doris;
Query OK, 0 rows affected (0.02 sec)

MySQL [(none)]> select * from (select 1);
ERROR 1248 (42000): errCode = 2, detailMessage = Every derived table must have its own alias
MySQL [(none)]> 
```

- Test spark-sql/hive built-in functions

```sql
MySQL [(none)]> show global functions;
Empty set (0.01 sec)

MySQL [(none)]> show variables like '%dialect%';
+---------------+-----------+---------------+---------+
| Variable_name | Value     | Default_Value | Changed |
+---------------+-----------+---------------+---------+
| sql_dialect   | spark_sql | doris         | 1       |
+---------------+-----------+---------------+---------+
1 row in set (0.01 sec)

MySQL [(none)]> select get_json_object('{"a":"b"}', '$.a');
+----------------------------------+
| json_extract('{"a":"b"}', '$.a') |
+----------------------------------+
| "b"                              |
+----------------------------------+
1 row in set (0.04 sec)

MySQL [(none)]> select split("a b c", " ");
+-------------------------------+
| split_by_string('a b c', ' ') |
+-------------------------------+
| ["a", "b", "c"]               |
+-------------------------------+
1 row in set (1.17 sec)
```
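The function rewriting visible in the result headers above (`get_json_object` shown as `json_extract`, `split` shown as `split_by_string`) can be sketched as a name-to-name lookup. This is only an illustrative model, not the `FnCallTransformer` implementation: the table `FUNCTION_NAME_MAP` and the helper `transform_call` are hypothetical, and only the two mappings shown in the output come from the source.

```python
from typing import Sequence

# Hypothetical mapping table. Only these two pairs appear in the test output
# above; a real transformer could also reorder or rewrite arguments.
FUNCTION_NAME_MAP = {
    "get_json_object": "json_extract",
    "split": "split_by_string",
}


def transform_call(name: str, args: Sequence[str], dialect: str) -> str:
    """Render a function call, renaming it when the dialect is spark_sql."""
    if dialect == "spark_sql":
        # Unknown functions fall through unchanged.
        name = FUNCTION_NAME_MAP.get(name.lower(), name)
    return f"{name}({', '.join(args)})"
```

With `dialect="doris"` the call is rendered unchanged, which mirrors the idea that the rewriting applies only under the `spark_sql` dialect.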
2023-12-11 11:16:53 +08:00
e1587537bc [Fix](status) fix unhandled status in exprs #28218
These statuses were marked with static_cast<void> in https://github.com/apache/doris/pull/23395/files
Partially fixes #28160
2023-12-11 11:04:58 +08:00
53802fe0da [doc] document desc param is incorrect #26063 (#26064) 2023-12-11 10:33:07 +08:00
f236261256 [fix](regression) compaction cases adapt force_olap_table_replica_num option (#28136) 2023-12-11 10:08:21 +08:00
8f2202c89d [minor](log) Add debug info in operators (#28211) 2023-12-11 10:02:24 +08:00
1e5ff40e17 [refactor](group commit) remove future block (#27720)
Co-authored-by: huanghaibin <284824253@qq.com>
2023-12-11 08:41:51 +08:00
320ddf4987 [pipelineX](improvement) Support multiple instances execution on single tablet (#28178) 2023-12-10 20:18:41 +08:00
485d7db516 [fix](partial update) Fix missing rowsets during doing alignment when flushing memtable due to compaction (#28062) 2023-12-10 12:09:48 +08:00
a3cd36ce60 [bug](cooldown) Fix incorrect remote rowset dir after restarting BE (#28140) 2023-12-10 00:44:01 +08:00
5aa90a3bce [pipelineX](local shuffle) Fix bucket hash shuffle (#28202) 2023-12-10 00:35:00 +08:00
61379b141e [fix](insert) fix group commit regression test (#28142) 2023-12-09 16:24:20 +08:00
4e86f9bab5 [improve](move-memtable) include and check offset when append data (#28159) 2023-12-09 16:21:36 +08:00
16e232a8a1 [minor](lower-table-names) use GlobalVariable.lowerCaseTableNames instead of Config.lower_case_table_names (#27911)
GlobalVariable.lowerCaseTableNames instead of Config.lower_case_table_names
2023-12-09 12:04:26 +08:00
363721e066 [Bug](udf) java-udf function open failed cause BE core dump #28063
When the java-udf open function fails, some JNI members have not been set,
so the close function cannot call JNI.
2023-12-09 11:00:30 +08:00
42aa174405 [chore](log) Log to trace before wait rpc timeout #28024 2023-12-09 10:04:43 +08:00
9d9b6462bf [improve](group_commit) optimize group commit select be logic #28190
Group commit always chooses the first non-decommissioned BE among all BEs.

Choosing a BE with selectBackendIdsByPolicy, as a regular stream load does, while still skipping decommissioned BEs may be better.
2023-12-09 05:09:52 +08:00
287bd87a4f [typo](docs)add some faq for flink-connector-doris (#26309)
* add flink-connector-doris faq

* add faq
2023-12-09 02:19:49 +08:00
bd8130154a [fix](doc) spell errors fixes hardware-info-action (#28154) 2023-12-09 01:47:19 +08:00
c6f8b1b2ee [fix](repository) the existing repo_file must contain the same name as the new repo (#27668)
The user manually adjusted the 'name' field in the __repo_info file under the repo file on S3, but did not modify the folder name. This led to an issue when the user created a repo with the same name as the folder in a certain cluster. The system parsed the 'name' field in the existing __repo_info and used an incorrect name, causing the subsequent repo to be unusable. A judgment has been added here: the 'name' field in the __repo_info must be the same as the new repo's name, otherwise, an error will be reported.
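The added check can be pictured with the minimal sketch below. It assumes the 'name' field has already been read from `__repo_info` into a string. The function name `check_repo_name` and the error text are hypothetical and are not taken from the Doris code.

```python
def check_repo_name(stored_name: str, new_repo_name: str) -> None:
    """Reject a repo whose existing __repo_info names a different repo.

    stored_name: the 'name' field parsed from the existing __repo_info
    new_repo_name: the name given when creating the repo
    """
    if stored_name != new_repo_name:
        raise ValueError(
            f"__repo_info name '{stored_name}' does not match "
            f"new repo name '{new_repo_name}'"
        )
```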
2023-12-09 01:46:54 +08:00
07336980f9 [fix](meta) show partitions with Limit for external HMS tables (27835) (#27835)
This enhancement shall extend existing logic for SHOW PARTITIONS FROM to include:

- Limit/Offset
- Where [partition name only] [equal operator and like operator]
- Order by [partition name only]

Issue Number: close #27834
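The combined semantics of these clauses can be sketched over a plain list of partition names, as below. This is an illustrative model only, not the FE implementation: `show_partitions` and `like_to_regex` are hypothetical helpers, and the order of operations (filter, then sort, then offset/limit) is assumed to follow usual SQL evaluation.

```python
import re
from typing import Iterable, List, Optional


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a SQL LIKE pattern ('%' and '_' wildcards) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def show_partitions(
    names: Iterable[str],
    equals: Optional[str] = None,
    like: Optional[str] = None,
    descending: bool = False,
    offset: int = 0,
    limit: Optional[int] = None,
) -> List[str]:
    """Filter by partition name, order by partition name, then page."""
    rows = list(names)
    if equals is not None:
        rows = [n for n in rows if n == equals]
    if like is not None:
        rx = like_to_regex(like)
        rows = [n for n in rows if rx.fullmatch(n)]
    rows.sort(reverse=descending)
    end = None if limit is None else offset + limit
    return rows[offset:end]
```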
2023-12-09 01:44:45 +08:00
99b38ddca7 [improve](env) Ensure next majority is met before drop an alive follower (#28101)
Here is an example:

```
mysql> ALTER SYSTEM DROP FOLLOWER "127.0.0.1:19017";
ERROR 1105 (HY000): errCode = 2, detailMessage = Unable to drop this alive
follower, because the quorum requirements are not met after this command
execution. Current num alive followers 2, num followers 3, majority after
execution 2
```
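The numbers in this error message can be reproduced with the sketch below. It assumes that the majority is computed as `n // 2 + 1` over the followers that remain after the drop, and that the dropped follower is one of the alive ones. The function names are hypothetical and this is not the FE code.

```python
def majority(num_followers: int) -> int:
    """Smallest number of followers that forms a strict majority."""
    return num_followers // 2 + 1


def can_drop_alive_follower(num_followers: int, num_alive: int) -> bool:
    """Check whether a quorum still exists after dropping one alive follower."""
    remaining = num_followers - 1   # followers left after the drop
    alive_after = num_alive - 1     # the dropped follower was alive
    return alive_after >= majority(remaining)
```

For the case in the message (3 followers, 2 alive), the majority after execution is `majority(2) == 2` while only 1 alive follower would remain, so the drop is refused.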
2023-12-09 01:41:38 +08:00
99be9d6ad3 [fix](memlimiter) refresh memtracker before flush active memtables (#28196)
Currently, _flush_active_memtables() is using stale memtracker data, especially when some other thread has just it.
Refresh memtrackers before flush to avoid this problem.
2023-12-09 01:40:51 +08:00