Commit Graph

11209 Commits

Author SHA1 Message Date
9c30fb5a21 [fix](script)Fix the JAVA_OPTS version error of the BE start script (#20766) 2023-06-14 15:25:00 +08:00
bcf103e993 [enhancement](log4j) support high performance mode for log4j to escape potential bottleneck for doris read and write (#20759)
As we know, log4j2 can sometimes become a bottleneck in the Doris FE when many logs are written in sync mode, while asynchronous logging performs better. We also find that capturing the caller location has a similar impact across all logging libraries and slows down asynchronous logging by about 30-100x. So here we provide three log modes for log4j2 to meet the needs of different users.
refer to https://logging.apache.org/log4j/2.x/performance.html
2023-06-14 15:16:04 +08:00
f707dc9395 [fix](stats) Fix NPE when analyze database sync (#20775) 2023-06-14 15:01:02 +08:00
f2025b9eed [fix](memory) before compaction run, check memory exceed limit #20782 2023-06-14 14:20:48 +08:00
20ac940711 [Bug](pipeline) fix bug for file scan node on pipeline engine (#20763) 2023-06-14 12:52:56 +08:00
1c394f4964 [Fix](Nereids) insert into table not need unpartitioned as root fragment's data partition (#20737) 2023-06-14 11:57:41 +08:00
8726047f86 [fix](nereids) select text as minimum column unexpected (#20745)
Columns of string and text types have width -1 and shouldn't be considered as the minimum-size column.
2023-06-14 11:49:22 +08:00
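The rule described in this commit can be illustrated with a minimal Python sketch. It is not the Nereids code: `Column`, `pick_minimum_column`, and the fallback behaviour when every column is variable-length are assumptions made for illustration only.

```python
from typing import List, NamedTuple, Optional


class Column(NamedTuple):
    name: str
    width: int  # -1 marks a variable-length type such as string or text


def pick_minimum_column(columns: List[Column]) -> Optional[Column]:
    """Pick the narrowest column, ignoring columns whose width is -1."""
    fixed = [c for c in columns if c.width >= 0]
    # Assumption: fall back to all columns only when none has a fixed width.
    candidates = fixed or columns
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.width)


columns = [Column("note", -1), Column("id", 8), Column("flag", 1)]
chosen = pick_minimum_column(columns)
```

A plain `min` over `width` would pick `note` because -1 sorts first, which is the unexpected selection the commit title refers to.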
affe36d32e [test](find_in_set) add find_in_set function test case (#20718) 2023-06-14 09:43:48 +08:00
9b4b0d4bf9 [fix](cooldown) Fix bug when cooldown a dropped tablet (#20750) 2023-06-14 09:42:55 +08:00
cd46f459db [minor](script) fix typo in build.sh (#20757) 2023-06-14 09:05:01 +08:00
edd0a1590d [chore](workflow) Improve the robustness of BE UT (Clang) (#20744) 2023-06-14 08:33:14 +08:00
a58a0d4003 [doc](community)update connector release doc (#20476)
Co-authored-by: wudi <>
2023-06-14 01:01:00 +08:00
ba3e065955 [typo](doc) add column type description for range partition (#20691) 2023-06-14 00:59:30 +08:00
fd97587aff [fix](merge-on-write) fix the merged rows is not equal to missed rows when do cumulative compaction (#20754) 2023-06-13 22:18:59 +08:00
35c19daec7 [opt](routine load) log BE id when get partitions failed. (#20749)
Add the BackendId to the log when getting partitions fails, to make debugging easier.
2023-06-13 19:15:05 +08:00
f1fd486f84 [fix](docker)Fix docker be init script restart failed bug (#20505)
fix docker be restart failed bug
2023-06-13 19:05:31 +08:00
5d2758cb8f [improvement](build) move all BE extension jars to java_extensions dir (#20740)
Follow-up to #20185.
Move all BE java extension jars to the `be/lib/java_extensions/` dir.
Also remove the `udf` dir, used for BE native udf, which has been deprecated since v1.2.

The final output is:

```
output
├── be
│   ├── bin
│   ├── conf
│   ├── dict
│   ├── lib
│   │   ├── java_extensions
│   │   │   ├── hudi-scanner-jar-with-dependencies.jar
│   │   │   ├── java-udf-jar-with-dependencies.jar
│   │   │   ├── jdbc-scanner-jar-with-dependencies.jar
│   │   │   ├── max-compute-scanner-jar-with-dependencies.jar
│   │   │   └── paimon-scanner-jar-with-dependencies.jar
│   ├── LICENSE-dist.txt
│   ├── licenses
│   ├── log
│   ├── NOTICE.txt
│   ├── storage
│   └── www
└── fe
    ├── bin
    ├── conf
    ├── doris-meta
    ├── lib
    ├── LICENSE-dist.txt
    ├── licenses
    ├── log
    ├── mysql_ssl_default_certificate
    ├── NOTICE.txt
    ├── spark-dpp
    └── webroot
```
2023-06-13 18:55:12 +08:00
Pxl
9244cb6553 [Chore](runtime-filter) do not make query fail when rf publish failed (#20742)
do not make query fail when rf publish failed
2023-06-13 18:23:46 +08:00
37db0145b4 [fix](load) fix mysql load parse response npe (#20699) 2023-06-13 18:14:03 +08:00
ad2f1b5647 [Update](clucene) synchronize clucene version to address PFOR adaptation issue (#20736) 2023-06-13 18:04:48 +08:00
7636dd1fdc [fix](nereids) always use colocate scan when agg's fragment has olap scan (#20695) 2023-06-13 17:59:17 +08:00
7942bd0bf9 [fix](planner) cast string literal to date like type should not be an implicit cast (#20709)
1. cast string literal to date like type should not be an implicit cast
2. the string representation of float like type should not be scientific notation
3. the data type of like function's regex expr should be string type even if it's a null literal
4. add -Xss4m in fe.conf to prevent stack overflow in some cases
2023-06-13 17:57:14 +08:00
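Item 2 above concerns how a float is rendered as a string. The planner itself is Java, so the following Python sketch only illustrates the behaviour, not the actual fix; `plain_float_string` is a hypothetical helper name.

```python
from decimal import Decimal


def plain_float_string(value: float) -> str:
    """Render a float in positional notation, never with an exponent."""
    # repr() gives the shortest string that round-trips the float;
    # Decimal's 'f' format then expands any exponent into plain digits.
    return format(Decimal(repr(value)), "f")


small = plain_float_string(1e-7)
large = plain_float_string(1e21)
```

For comparison, `str(1e-7)` yields the scientific form `1e-07`, which is the kind of representation the commit avoids.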
0e82c0d7a2 [Fix](Nereids) constant folding for function timestamp() (#20607) 2023-06-13 17:41:58 +08:00
feb21fc9e9 [fix](group_concat) use default separator ',' instead of ', ' for group_concat, to be consistent with mysql (#20741) 2023-06-13 17:20:29 +08:00
2dddab03a1 [compatibility](schema cache) ensure schema version when using schema cache (#20729)
When the FE is an old version and the BE is a new version, issuing a schema change (add column) and
then querying can go wrong: a query from the old FE carries no schema version, which could result in reading
a stale schema from the schema cache.
2023-06-13 15:19:26 +08:00
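One way to picture the stale-schema problem is a cache keyed by schema version that is bypassed when a request carries no version. The Python sketch below rests on that assumption; it is not the BE implementation, and `SchemaCache`, `index_id`, and `load_schema` are hypothetical names.

```python
class SchemaCache:
    """Toy schema cache keyed by (index_id, schema_version)."""

    def __init__(self):
        self._entries = {}
        self.loads = 0  # how many times a schema had to be built

    def get(self, index_id, schema_version, load_schema):
        if schema_version is None:
            # Assumption: an old FE sends no version, so the cache cannot
            # tell whether an entry is stale and must not be consulted.
            self.loads += 1
            return load_schema()
        key = (index_id, schema_version)
        if key not in self._entries:
            self.loads += 1
            self._entries[key] = load_schema()
        return self._entries[key]


cache = SchemaCache()
v3 = cache.get(1, 3, lambda: ["k", "v"])
v3_again = cache.get(1, 3, lambda: ["never built"])   # served from cache
v4 = cache.get(1, 4, lambda: ["k", "v", "new_col"])   # after add column
unversioned = cache.get(1, None, lambda: ["k", "v", "new_col"])
```

With the version in the key, the add-column change lands in a new entry instead of being shadowed by the old one.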
4b15185e25 [improvement](hdfs) add parquet footer cache and hdfs file handle cache (#20544)
1. Add hdfs file handle cache for hdfs file reader

    Copied from Impala, `https://github.com/apache/impala/blob/master/be/src/util/lru-multi-cache.h`. (Thanks to the Impala team.)
    This is an LRU cache that can store multiple entries with the same key.
    The key is built from {file name + modification time}.
    The value is the hdfsFile pointer that points to a certain hdfs file.

    This cache avoids reopening the same hdfs file multiple times, which can save
    query time.

    Add a BE config `max_hdfs_file_handle_cache_num` to limit the max number
    of cached file handles; the default is 20000.

2. Add file meta cache

	The file meta cache is an LRU cache. The key is {file name + modification time},
	the value is the parsed file meta info of the file, which saves
	the time of re-parsing the file meta every time.
	Currently, it is only used for caching parquet file footers.

Tests show that when the cache is hit, the `FileOpenTime` and `ParseFooterTime` in the query profile are reduced to almost 0,
which can save time when there are lots of files to read.
2023-06-13 15:13:57 +08:00
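The idea of an LRU cache that holds several entries under the same key can be sketched in Python as follows. This is a simplified illustration, not the Impala or Doris C++ code: the class and method names (`LruMultiCache`, `acquire`, `release`) and the string stand-ins for file handles are assumptions.

```python
import itertools
from collections import OrderedDict


class LruMultiCache:
    """LRU cache that may hold several idle values under the same key."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._idle = OrderedDict()  # entry id -> (key, value), oldest first
        self._by_key = {}           # key -> entry ids, newest last
        self._ids = itertools.count()

    def acquire(self, key):
        """Check out the most recently released value for key, or None."""
        ids = self._by_key.get(key)
        if not ids:
            return None
        entry_id = ids.pop()
        if not ids:
            del self._by_key[key]
        _, value = self._idle.pop(entry_id)
        return value

    def release(self, key, value):
        """Put a value back; return the values evicted to stay in capacity."""
        entry_id = next(self._ids)
        self._idle[entry_id] = (key, value)
        self._by_key.setdefault(key, []).append(entry_id)
        evicted = []
        while len(self._idle) > self._capacity:
            old_id, (old_key, old_value) = self._idle.popitem(last=False)
            self._by_key[old_key].remove(old_id)
            if not self._by_key[old_key]:
                del self._by_key[old_key]
            evicted.append(old_value)  # the caller would close the handle
        return evicted


cache = LruMultiCache(capacity=2)
key_a = ("/warehouse/a.parquet", 1686640000)  # (file name, modification time)
key_b = ("/warehouse/b.parquet", 1686640500)
cache.release(key_a, "handle-1")
cache.release(key_a, "handle-2")
evicted = cache.release(key_b, "handle-3")
```

Including the modification time in the key means a rewritten file never matches a handle that was opened on the old version of that file.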
2adf5169e6 [improvement](test) improve p2 case of githubevents (#20727)
Check the rows of the github_events table after the restore finishes.
2023-06-13 14:31:24 +08:00
54a7dbeb4d [Refactor](External) Move Common ODBC Methods to JDBC Class and Add Default config to Disable ODBC Creation (#20566)
This PR addresses the refactoring of common methods that were originally located within the ODBC classes, but were used by the JDBC classes. These methods have now been moved to the JDBC classes to improve code readability and maintainability.

In addition, we have disabled the creation of ODBC external tables by default. However, this will not affect the existing usage of ODBC. You can still enable the ODBC external tables through the enable_odbc_table setting. Please be aware that we plan to completely remove the ODBC external tables in future versions, so we recommend using the JDBC Catalog as a priority.
2023-06-13 14:29:04 +08:00
033f64de93 [tools](tpch)add analyze in run-tpch-queries.sh (#20733) 2023-06-13 14:11:45 +08:00
eaa13e66f9 [fix](planner) implement constant folding for function to_monday() (#20708) 2023-06-13 11:40:44 +08:00
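Constant folding replaces a function call whose arguments are all literals with its computed value at plan time. The Python sketch below shows the idea for a Monday-rounding function. It is not the planner's Java code: the expression classes and `fold` are hypothetical, and `to_monday` here ignores any engine-specific special cases.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Call:
    name: str
    args: tuple


def to_monday(d: date) -> date:
    """Round a date down to the Monday of its week."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday is 0


FOLDABLE = {"to_monday": to_monday}


def fold(expr):
    """Fold calls whose arguments are all literals; leave the rest as is."""
    if isinstance(expr, Call):
        args = tuple(fold(a) for a in expr.args)
        fn = FOLDABLE.get(expr.name)
        if fn is not None and all(isinstance(a, Literal) for a in args):
            return Literal(fn(*(a.value for a in args)))
        return Call(expr.name, args)
    return expr


folded = fold(Call("to_monday", (Literal(date(2023, 6, 14)),)))
unfolded = fold(Call("to_monday", (Column("dt"),)))
```

A call on a column reference stays a call, because its value is only known per row at execution time.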
Pxl
e010fa8d4f [Chore](runtime filter) remove runtime filter ready_for_publish/publish_finally (#20593) 2023-06-13 11:20:49 +08:00
ee0e2b40da [Improvement](meta) support return brief info of restore job (#20653) 2023-06-13 10:47:31 +08:00
ce3050d75c [fix](regression) fix vertical compaction test (#20601) 2023-06-13 10:31:22 +08:00
e28187feb7 [fix](hive) fix NPE of hive meta store client (#20664)
When connecting to the hive meta store fails, an exception is thrown.
But there is a bug: the exception object may not be set, causing an NPE.
2023-06-13 09:41:49 +08:00
57656b2459 [Enhancement](java-udf) java-udf module split to sub modules (#20185)
The java-udf module has become increasingly large and difficult to manage, making it inconvenient to package and use as needed. It needs to be split into multiple sub-modules, such as: java-common, java-udf, jdbc-scanner, hudi-scanner, paimon-scanner.

Co-authored-by: lexluo <lexluo@tencent.com>
2023-06-13 09:41:22 +08:00
51bbf17786 [Refactor](Profile) Add and refactor the join profile (#20693) 2023-06-13 09:06:51 +08:00
ef4410821f [typo](doc)document optimization (#20645)
* document optimization
2023-06-13 09:01:03 +08:00
4ac38ca67a [typo](docs) add a python example for stream load. (#20697) 2023-06-13 08:57:01 +08:00
550584e4e9 [docs](docs)Add the list of BI tools supported by Doris. (#20690) 2023-06-13 08:56:01 +08:00
73ad885e19 [Feature][Fix](multi-catalog) Implements transactional hive full acid tables. (#20679)
After supporting insert-only transactional hive full acid tables (#19518, #19419), this PR supports transactional hive full acid tables.

Support hive3 transactional hive full acid tables.
Hive2 transactional hive full acid tables need to run major compactions.
2023-06-13 08:55:16 +08:00
939575f5f3 [fix](mtmv)create mtmv failed when not specifying refresh strategy #20696
* fix no refresh error

* add ut
2023-06-13 08:53:24 +08:00
Pxl
5e3a96d605 [Bug](pipeline) fix memory leak because pipeline shared ptr not release #20710 2023-06-13 08:50:34 +08:00
412ca9059e [fix](routine-load) fix stackoverflow bug in routine load (#20704)
When executing a routine load job, a StackOverflowException may occur.
This is because the exprs in the column setting list are analyzed for each routine load sub task,
and there is a self-reference bug that may cause an endless loop when analyzing an expr.

The following columns expr list may trigger this bug:

```
columns(col1, col2,
col2=null_or_empty(col2),
col1=null_or_empty(col2))
```

This fix is verified by a user, but I can't add a regression test for this case, because I can't submit a routine load job
in our regression test, and this bug can only be triggered in routine load.
2023-06-13 00:07:56 +08:00
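How a self-referencing column mapping can drive a recursive analyzer into a stack overflow can be shown with a small Python sketch. It is not the FE code and does not reproduce Doris's column-mapping semantics: `resolve_naive`, `resolve`, and the `(function, argument)` tuples are hypothetical, and only the recursion guard is the point.

```python
def resolve_naive(name, mappings):
    """Expand a column by following its mapping, with no cycle check."""
    expr = mappings.get(name)
    if expr is None:
        return name  # a plain source column
    func, arg = expr
    return f"{func}({resolve_naive(arg, mappings)})"


def resolve(name, mappings, in_progress=frozenset()):
    """Same expansion, but stop when a column refers back to itself."""
    expr = mappings.get(name)
    if expr is None or name in in_progress:
        return name  # source column, or a reference to the raw value
    func, arg = expr
    return f"{func}({resolve(arg, mappings, in_progress | {name})})"


# Mirrors the shape of the columns list in the commit message.
mappings = {
    "col2": ("null_or_empty", "col2"),
    "col1": ("null_or_empty", "col2"),
}

try:
    resolve_naive("col1", mappings)
    overflowed = False
except RecursionError:
    overflowed = True

guarded = resolve("col1", mappings)
```

Tracking the names currently being expanded bounds the recursion depth by the number of mapped columns, so the self-reference on `col2` ends after one step.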
283c55720d [bug](cooldown) Fix the issue of unused remote files not being deleted (#19785) 2023-06-12 21:05:09 +08:00
1433544c56 [fix](case expr) fix coredump of case for null value 3 #20711 2023-06-12 20:58:01 +08:00
6652287b52 [Fix](regression-test) fix unstable test case nereids_p0/update (#20692) 2023-06-12 20:55:22 +08:00
b4e552c3c3 [typo](docs) add parameter version (#20672) 2023-06-12 18:47:32 +08:00
c25c19bddc [test](regression) Add cases to test join condition push and not like (#20453)
Add testing cases to issue #19613
2023-06-12 18:26:23 +08:00
Pxl
5fd9f58bd3 [Chore](pipeline-engine) adjust query canceled log on pipeline engine (#20702)
adjust query canceled log on pipeline engine
2023-06-12 18:23:19 +08:00
565095eb52 [bug](function) fix is_null/is_not_null check is_const has error (#20562)
fix is_null/is_not_null check is_const has error
2023-06-12 18:21:12 +08:00