Commit Graph

12375 Commits

Author SHA1 Message Date
76da9f181d fix join count (#22619) 2023-08-05 12:19:28 +08:00
839b469879 [fix](meta) set parallel_pipeline_task_num when upgrading from 1.2 to 2.0 (#22618) 2023-08-05 11:04:39 +08:00
fcdd1b96d2 [docs](delete-recover) merge docs: recover catalog and recover tablet trash #22525
Doris trash includes the FE catalog recycle bin and the BE tablet trash. Users are sometimes confused about them, so the two docs are merged to make them easier to understand.
2023-08-05 10:31:48 +08:00
38f9ac99df [fix](bug) fix be custom conf persistence path and read path are inconsistent (#22520)
The be_custom.conf persistence path is ${doris_home}/conf/be_custom.conf, but if ${custom_config_dir} is set to a different path, BE cannot read be_custom.conf from ${custom_config_dir}.

Set the be_custom.conf persistence path to ${custom_config_dir}.
2023-08-05 10:22:08 +08:00
12262a2025 [fix](compaction) filter block row locations with delete sign should ignore merge on read scenario (#22628) 2023-08-05 09:15:38 +08:00
ef0e0b7d79 [case](fix) add sync after stream load (#22601) 2023-08-05 08:28:26 +08:00
26e78ab418 [fix](compaction)none vertical compaction should also use _unique_key_next_block function to read block (#22614) 2023-08-05 00:24:57 +08:00
55d46e0ab9 [Fix](ut)Fix NPE due to lack of editlog object in unit test of production consumption model (#22625) 2023-08-04 22:32:15 +08:00
c1c38c956d [exec] fix coredump when limit<0 and limit!=-1 with 1.2 fe (#22622) 2023-08-04 22:18:45 +08:00
8bbccc59ef [refactor](load) split segment flush out of beta rowset writer (#21725) 2023-08-04 19:48:56 +08:00
ea674aa540 [docs](community) Delete Gitter Manual of EN & CN Version (#22348) 2023-08-04 19:45:31 +08:00
846d6edab8 [docs](docs) Rename Advanced Usage Files for SEO (#22511) 2023-08-04 19:33:57 +08:00
d040a858f2 [docs](docs) Capitalize Query Acceleration Files Name and Title (#22512) 2023-08-04 19:33:31 +08:00
30b8c7b9e6 [docs](docs) Rename Lakehouse Files for SEO (#22513) 2023-08-04 19:33:02 +08:00
577cd51fde [docs](docs) Capitalize Ecosystem Files Name and Titles (#22515) 2023-08-04 19:32:39 +08:00
89fff98ced [docs](docs)Update spark_load.md (#22428) 2023-08-04 19:22:52 +08:00
265cded7da [Fix](Planner) fix window function in aggregation (#22603)
Problem:
When a window function appears inside an aggregate function, the executor reports an error like: Required field 'node_type' was not present!

Example:
SELECT SUM(MAX(c1) OVER (PARTITION BY c2, c3)) FROM test_window_in_agg;

Reason:
When analyzing the aggregate, the analytic expr (the window function carrier during analysis) is transferred to a slot and loses information. So when
serializing to the thrift package, TExpr cannot determine the node_type of the analytic expr.

Solution:
We do not support aggregate(window function) yet, so we report an error during analysis.
2023-08-04 19:15:51 +08:00
b122f9b80c [fix](concat) ColumnString::chars is resized with wrong size (#22610)
FunctionStringConcat::execute_impl resized ColumnString::chars with a size that includes the string null terminator, so ColumnString::chars.size() does not match ColumnString::offsets.back. This causes problems for some string functions, e.g. like and regexp.
2023-08-04 19:13:35 +08:00
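The size mismatch described in b122f9b80c can be illustrated with a minimal Python sketch. It is not Doris code: it assumes a simplified column layout (one flat byte buffer plus a list of per-row end offsets), and the names `concat_columns` and `is_consistent` are hypothetical. The point is only that the buffer is sized from the payload bytes alone, so its length equals the last offset.

```python
def concat_columns(left, right):
    """Concatenate two string columns row by row.

    Returns (chars, offsets): chars is one flat bytearray and offsets[i] is
    the end position of row i inside chars. This layout is a simplification
    assumed for illustration only.
    """
    if len(left) != len(right):
        raise ValueError("columns must have the same number of rows")
    # Size the buffer from payload bytes only, with no per-row terminator.
    total = sum(len(a) + len(b) for a, b in zip(left, right))
    chars = bytearray(total)
    offsets = []
    pos = 0
    for a, b in zip(left, right):
        row = a + b
        chars[pos:pos + len(row)] = row
        pos += len(row)
        offsets.append(pos)
    return chars, offsets


def is_consistent(chars, offsets):
    """The invariant: buffer length equals the last row's end offset."""
    return len(chars) == (offsets[-1] if offsets else 0)
```

If the buffer were sized with one extra byte per row, `is_consistent` would return False, which is the kind of mismatch the commit message says breaks functions that rely on the buffer length.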
872280135d [exec](pipeline) revert FE pipeline instance num pr (#22617)
* Revert "[fix](executor) only mysql connect to set GlobalPipelineTask (#22205)"
* Revert "[feature](executor) using fe version to set instance_num (#22047)"
2023-08-04 19:07:14 +08:00
d974af5feb [Fix](Load)Multi table plan not include task info (#22613) 2023-08-04 18:52:22 +08:00
9f92861c91 [fix](stats) Load partition stats unexpectedly (#22589)
After loading partition stats was supported, the syncLoadColStats method still invoked a stale method to deserialize column stats.
2023-08-04 18:50:38 +08:00
95aa4d8631 [Feature](Export) Supports concurrently export of table data (#21911) 2023-08-04 18:50:17 +08:00
93593a013d [feature](load) add segment bytes limit in segcompaction (#22526) 2023-08-04 18:00:52 +08:00
7fe08c74fe [fix](inverted index) return empty result instead of error for empty match query (#22592)
Return an empty result instead of an error for empty match queries such as the following:

`SELECT * FROM t WHERE msg MATCH ''`

`SELECT * FROM t WHERE msg MATCH 'stop_word'`
2023-08-04 17:36:32 +08:00
672acb8784 [fix](show-table-status) fix hive view NPE and external meta cache refresh issue (#22377) 2023-08-04 16:55:10 +08:00
dc06c486e8 [fix](compatibility) Version 1.2 upgraded to 2.0 compatible with miniload metadata (#22590) 2023-08-04 16:52:51 +08:00
9cf6b1b4cf [docs](typo) fix some typo of docs (#22591) 2023-08-04 16:29:04 +08:00
56e8ad197c [improvement](stats) Reduce unnecessary SQL from full auto analyze #22583
1. Remove a bunch of SQL statements related to partition information
2. Fix duplicate SQL submission
3. Fix a bug where table stats were not updated after the system job finished
2023-08-04 15:52:25 +08:00
7d1e08eafa [Fix](Nereids) rand() and uuid() should not fold constant (#22492)
rand() and uuid() should not be constant-folded, so the default value of fold constant for non-deterministic functions is changed to false.
2023-08-04 15:36:03 +08:00
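The idea behind 7d1e08eafa can be sketched in a few lines of Python. This is not the Nereids implementation: the tuple-based expression format, the `fold` function, and the `add`/`mul` operators are assumptions made for illustration. The folder evaluates calls whose arguments are all literals, but leaves any call to a non-deterministic function (and anything that depends on it) unevaluated.

```python
# Functions whose result changes on every call must never be folded.
NON_DETERMINISTIC = {"rand", "uuid"}


def fold(expr):
    """Fold constants in a tiny expression tree.

    An expression is either a literal (int, float, str) or a tuple of the
    form (function_name, *args). This format is hypothetical.
    """
    if not isinstance(expr, tuple):
        return expr  # literals are already constant
    name, *args = expr
    folded = [fold(a) for a in args]
    if name in NON_DETERMINISTIC:
        return (name, *folded)  # keep the call for execution time
    if any(isinstance(a, tuple) for a in folded):
        return (name, *folded)  # an argument could not be folded
    if name == "add":
        return folded[0] + folded[1]
    if name == "mul":
        return folded[0] * folded[1]
    return (name, *folded)  # unknown function: leave it untouched
```

For example, `fold(("add", 1, ("mul", 2, 3)))` collapses to a single literal, while `fold(("add", 1, ("rand",)))` is returned unchanged so that `rand` is evaluated once per row at execution time.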
ef53a27887 [fix](nereids) allow in or exists subquery in binary operator (#22391)
Support subqueries in binary operators, such as if(xx in (subquery), 1, 0)
2023-08-04 15:35:19 +08:00
d379b04b39 [fix](planner) fix bug of push conjuncts through second phase agg (#22417)
If there is a second-phase agg, the output of the first-phase agg is its intermediate tuple, not the output tuple.
This PR fixes it.
2023-08-04 15:21:18 +08:00
b9e344617a [typo](kerberos)support read jdk auth creds and add some krb tips in FAQ (#22535)
Support reading JDK auth creds and add some krb tips to the FAQ:
1. About 'javax.security.auth.useSubjectCredsOnly': https://stackoverflow.com/questions/43660265/java-automatically-uses-kerberos-ticketcache-when-it-shouldnt
2. Add tips for `No common protection layer between client and server` and the yum JDK version.
2023-08-04 14:51:31 +08:00
3d758de7a2 [improvement](binlog) gc be binlog metas when tablet is dropped. (#22447) 2023-08-04 14:38:13 +08:00
34164f69ba [Enhancement](binlog) Add Barrier log into BinlogManager (#22559)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-04 14:37:12 +08:00
34b7f381b1 [fix](multi catalog)Filter .hive-staging dir under hive file path. #22574
A Hive file path may contain temporary directories like these:

drwxrwxrwx   - root supergroup          0 2023-03-22 21:03 /usr/hive/warehouse/datalake_performance.db/clickbench_parquet_hits/.hive-staging_hive_2023-03-22_21-03-12_047_8461238469577574033-1
drwxrwxrwx   - root supergroup          0 2023-05-18 15:03 /usr/hive/warehouse/datalake_performance.db/clickbench_parquet_hits/.hive-staging_hive_2023-05-18_15-03-52_780_3065787006787646235-1
This causes an error when BE tries to read these files, so they need to be filtered out during FE planning.
2023-08-04 14:14:53 +08:00
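The filtering described in 34b7f381b1 might look roughly like the following Python sketch. It is not the FE code: the function names are hypothetical, and matching on a `.hive-staging` prefix of any path component is an assumption based on the directory names shown in the commit message.

```python
def is_staging_path(path, prefixes=(".hive-staging",)):
    """Return True if any component of the path starts with a staging prefix.

    `prefixes` must be a tuple; the default value is an assumption taken
    from the directory names quoted in the commit message.
    """
    parts = [p for p in path.split("/") if p]
    return any(part.startswith(prefixes) for part in parts)


def filter_paths(paths):
    """Drop staging directories and anything located beneath them."""
    return [p for p in paths if not is_staging_path(p)]
```

Checking every path component, rather than only the last one, also excludes files that sit inside a staging directory.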
3d5b90befe [fix](tablet clone) fix not add colocate replica and print some logs #22378 2023-08-04 14:09:02 +08:00
24c1953e91 [fix](debug) add bvar counter for memtable & loadchannel (#22578)
* format code

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-08-04 13:58:28 +08:00
658d75c816 [feature](Nereids): normalize join condition after expanding or condition NLJ (#22555) 2023-08-04 13:37:37 +08:00
d5a21de796 [Enhancement](planner)support fold constant for date_trunc() (#22122) 2023-08-04 13:32:48 +08:00
3d0d5bfd6d [chore](cmake) Split thirdparty into cmake/thirdparty.cmake (#22572)
* Add Apache License into thirdparty.cmake

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-04 13:21:22 +08:00
f828a3d826 [shape](nereids) ssb sf100 plan shape check (#22596) 2023-08-04 13:12:21 +08:00
62b1a7bcf3 [tpcds](nereids) add rule to eliminate empty relation #22203
1. Eliminate empty relations
2. Constant fold after filter pushdown
2023-08-04 12:49:53 +08:00
0e9fad4fe9 [stats](nereids) improve Anti join stats estimation #22444
No impact on TPC-H.
TPC-DS queries 16/69/94 improved.
2023-08-04 12:48:39 +08:00
d3cab017ec [chore](topn-opt) temporary disable two phase read for TableQueryPlanActionQ (#22543) 2023-08-04 11:53:48 +08:00
ed6bb1fc9d [fix](memory) remove memory tracker profile refresh thread #22582
Memtrackers are usually bound to operators in query/load. If a large number of queries/loads are stuck, the set of memtrackers grows very large, and the memory tracker profile refresh thread gets stuck on the lock.

This PR is for branch-2.0; I will rewrite the memory profile in the next PR.
2023-08-04 11:51:19 +08:00
868e65d618 [fix](compaction) rowid_conversion should ignore deleted row (#22579) 2023-08-04 11:41:17 +08:00
bad8237850 [BugFix](Es Catalog) fix bug that es catalog will return error when query partial columns (#22423)
Bug:
When the value of some ES columns is empty, querying these values in doc_values mode returns an error.

Reason:
In doc_values mode these values are empty, so we need to check whether the array is empty.
2023-08-04 11:28:30 +08:00
9c0528daf6 [Opt](orc-reader) opt the performance of date conversion. (#22381)
Optimize the performance of date conversion in the ORC reader.

```
mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (1.28 sec)

mysql> select count(l_commitdate) from lineitem;
+---------------------+
| count(l_commitdate) |
+---------------------+
|           600037902 |
+---------------------+
1 row in set (0.19 sec)
```
2023-08-04 10:52:09 +08:00
0d9caaee7d not run workload group by default (#22497) 2023-08-04 10:12:01 +08:00
0c68f7e347 [peformance](load) cancel unstarted segcompaction tasks when build rowset (#22392) 2023-08-04 10:10:38 +08:00