10394 Commits

Author SHA1 Message Date
41d4ed8367 [Improvement](multicatalog) support show_partitions for hms catalog (#19242)
* [Improvement](multicatalog) support show_partitions for hms catalog

* update according review advice
2023-05-11 01:17:23 +08:00
840dbdc7c0 [typo](docs) add comment of partition and key/value column (#19448)
* change docker compose to 'docker-compose'

* modify sql of mysql

* fix docker start and stop cmd

* new commit

* add comment of partition and key/value column

* Update cn doc format

---------

Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>
2023-05-11 01:14:17 +08:00
68505a1192 [Test](multi catalog)Add test case for Iceberg External Table. #19488 2023-05-11 01:13:40 +08:00
47edc5a06e [fix](functions) Support nullable column for multi_string functions (#19498) 2023-05-11 01:13:13 +08:00
28e088aee1 [optimization](be) optimization for ColumnConst when writing mysql result (#19122)
* opt for result

* fix
2023-05-11 01:04:18 +08:00
8845c2cf44 [fix](bdbje) remove System.exit(-1) in BDBEnvironment.close() (#19335)
* https://github.com/apache/doris/issues/18766
2023-05-11 01:01:38 +08:00
0f6c69de53 [Fix](multi-catalog) Fix sync hms event failed when start FE soon. (#19344)
* [Fix](multi-catalog) Fix sync hms event failed when start FE soon after.

* [Fix](multi-catalog) Fix sync hms event failed when start FE soon after.

---------

Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>
2023-05-11 01:00:55 +08:00
b129c9901b [improvement](FQDN)Change the implementation of fqdn (#19123)
Main changes:
1. If fqdn is enabled in the configuration file, localAddr obtains the fqdn instead of the IP when fe starts, and priority_networks no longer takes effect
2. The IP and host names of Backend and Frontend are combined into one field, host. When fqdn is enabled, it represents the host name; when not enabled, it represents the IP address
3. Communication between cluster nodes uses fqdn directly, and the various connection pools add an authentication mechanism so that connections between nodes do not fail when the IP address behind a domain name changes
4. Polling to verify whether the IP has changed is no longer required, so fqdnManager is deleted
5. Change the method of verifying the legitimacy of nodes between FEs: instead of obtaining the client IP, the transmitting node declares its own identity in the HTTP request header or in the Thrift message body
6. When processing the heartbeat, if BE finds that the host stored by itself is inconsistent with the host stored by the master, after verifying the legitimacy of the host, it will change its own host instead of directly reporting an error
7. Simplify the generation logic of fe name

Scope of influence:
1. Establishing communication connections between clusters
2. Determine whether it is the same node through attributes such as IP
3. Print Log
4. Information display
5. Address Splicing
6. k8s deployment
7. Upgrade compatibility

Test plan:
1. While keeping the fqdn unchanged, change the IP addresses of fe and be, and verify whether the cluster can still read and write data normally
2. Use the master code to generate metadata, and use the previous metadata on the current pr to verify whether it is compatible with the old version (upgrading is no longer supported if fqdn has been enabled before)
3. Deploy fe and be clusters using k8s to verify whether the cluster can read and write data normally
4. Upgrade old clusters according to https://doris.apache.org/zh-CN/docs/dev/admin-manual/cluster-management/fqdn?_highlight=fqdn#%E6%97%A7%E9%9B%86%E7%BE%A4%E5%90%AF%E7%94%A8fqdn
5. Use streamload to specify the fqdn of fe and be to import data separately
6. Use different users to start transactions and write data using insert statements
2023-05-11 00:44:48 +08:00
3a22af836e [fix](jdbc catalog) fix error to clickhouse uint64 type Conversion (#19463)
* [fix](jdbc catalog) fix error to clickhouse uint64 type Conversion

* add test case
2023-05-10 21:53:30 +08:00
69ebb90225 [bugfix](core) be will core when coordinator callback (#19497)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-10 21:46:43 +08:00
9ffdbae442 [bugfix](jdbcconnector) jdbc connector cast string to array core (#19494)
introduced by https://github.com/apache/doris/pull/18328/files
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-10 21:46:20 +08:00
d0a8cd0fc5 [fix](nereids) dphyper join reorder may lost some join conjuncts (#19318) 2023-05-10 19:02:35 +08:00
337732ae01 [fix](nereids) lost exchange before global limit merge node sometimes (#19396)
should add exchange node between global and local limit
2023-05-10 17:57:21 +08:00
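The fix in 337732ae01 above concerns where the exchange sits in a two-phase limit. A minimal sketch of that shape follows; every function name is hypothetical and plain Python lists stand in for plan nodes, so this illustrates the idea rather than the Nereids implementation. Each instance applies a local limit, an exchange gathers the partial results, and only then does the global limit run.

```python
def local_limit(rows, limit):
    """Keep at most `limit` rows on one instance (hypothetical helper)."""
    return rows[:limit]


def exchange(partitions):
    """Gather the rows of every instance into one stream (hypothetical helper)."""
    return [row for partition in partitions for row in partition]


def global_limit(rows, limit):
    """Apply the final limit to the gathered stream (hypothetical helper)."""
    return rows[:limit]


def limited_scan(partitions, limit):
    # Local limit per instance, then exchange, then global limit.
    partial = [local_limit(rows, limit) for rows in partitions]
    return global_limit(exchange(partial), limit)


result = limited_scan([[1, 2, 3], [4, 5], [6, 7, 8, 9]], limit=3)
print(result)
```

Without the exchange step, the global limit would see only one instance's rows and could not enforce the limit across the whole query.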
894801f5ce [feature](load-refactor) Step1: InsertStmt as facade layer and run S3/Broker Load (#19142) 2023-05-10 17:48:50 +08:00
d20b5f90d8 [feature](executor) Automatically set the instance_num using the info from be. (#19345)
1. fixed some failing regression cases (wrong results with a large instance_num due to an incorrect order by).
2. if parallel_fragment_exec_instance_num is set to 0, the concurrency in the Pipeline execution engine is automatically set to half of the number of CPU cores.
3. add a limit to parallel_fragment_exec_instance_num: it cannot be set to more than fe.conf::max_instance_num (default: 128)
```
mysql [(none)]>set parallel_fragment_exec_instance_num = 514;
ERROR 1231 (42000): errCode = 2, detailMessage = Variable 'parallel_fragment_exec_instance_num' can't be set to the value of '514(Should not be set to more than 128)'
```
2023-05-10 17:07:41 +08:00
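The rules described in d20b5f90d8 above can be sketched as a small function. This is an illustration only: `resolve_instance_num` and its parameters are hypothetical names, the exception type is arbitrary, and the floor of one instance on a single-core machine is an assumption that the commit message does not state.

```python
MAX_INSTANCE_NUM = 128  # default of fe.conf::max_instance_num per the commit message


def resolve_instance_num(requested, cpu_cores, max_instance_num=MAX_INSTANCE_NUM):
    """Return the effective instance count for a requested session value.

    Hypothetical helper: 0 means "choose automatically", which the commit
    describes as half of the number of CPU cores.
    """
    if requested > max_instance_num:
        raise ValueError(
            f"Should not be set to more than {max_instance_num}: {requested}"
        )
    if requested == 0:
        # Assumption: never go below one instance on a single-core machine.
        return max(1, cpu_cores // 2)
    return requested


print(resolve_instance_num(0, cpu_cores=16))
print(resolve_instance_num(8, cpu_cores=16))
```

Passing 514, as in the session transcript above, raises an error in this sketch because it exceeds the default cap of 128.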
0dd35c81b4 [docs](data-model):add sql statements to import data (#19390)
* [docs](data-model):add sql statements to import data
* [docs](data-model)synchronize documents in English
2023-05-10 17:06:50 +08:00
Pxl
9b7a419aed [Chore](build) update some doc about build enviroment (#19325)
update some doc about build environment
2023-05-10 16:18:44 +08:00
bdf54963ae Update query-analysis.md (#19456) 2023-05-10 16:05:51 +08:00
7631c82eff [typo](doc) Fixed typos in native-user-defined-function.md (#19459) 2023-05-10 16:05:30 +08:00
e1d8d2aa64 [typo](doc)optimize description of bitmap in materialized view document (#19464)
* [DOC]optimize description of bitmap in materialized view document

* Update materialized-view.md

---------

Co-authored-by: zhuwei <zhuwei8421@gmail.com>
Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>
2023-05-10 16:05:16 +08:00
e60129a28b [typo](doc) Fixed typos in variables.md (#19451)
* [typo](doc) Fixed typos in variables.md

* Update variables.md
2023-05-10 16:04:53 +08:00
14a56da397 [chore](testcase) change tpcds q67 testcase name to q67_ignore_temporarily (#19227)
Rename the tpcds q67 test case to q67_ignore_temporarily, since its error should be ignored for now.
The error is caused by a precision problem that will be fixed later.
2023-05-10 15:06:23 +08:00
208d21b01d [tools](tpch) use origin TPCH qurries (#19479) 2023-05-10 14:29:45 +08:00
4483e3a6e1 [Improvement](scan) add a config for scan queue memory limit (#19439) 2023-05-10 13:14:23 +08:00
a05dbd3f81 [chore](compile) Improves PCH cache hit ratio (#19469)
- Supplement the be-clion-dev documentation to avoid the undefined DORIS_JAVA_HOME and missing jni.h problems when developing in clion without compiling directly through build.sh.
- Complete the classification of header files in pch.h and add some doris header files that are not frequently modified.
- Separate the declarations and definitions in common/config.h. To modify a default configuration value now, modify it in common/config.cpp.
- gen_cpp/version.h is regenerated on every recompilation, which may cause the PCH to fail, so the version information must now be obtained indirectly rather than directly.
2023-05-10 12:49:01 +08:00
ab8cfbbfb6 [bugfix](regression-test) add some window function test (#19460)
Only the 2000-union case causes BE to use a lot of memory, so this PR re-enables the other tests and disables only the 2000-union case.



---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-10 12:06:02 +08:00
553068f7be [feat](Nereids): trace enumeration of DPHyp (#19394) 2023-05-10 11:57:35 +08:00
fae2e5fd22 [enchancement](statistics) implement automatically analyzing statistics and support table level statistics #19420
Add table-level statistics and support the SHOW TABLE STATS statement to show them.
Implement automatic statistics analysis and support the ANALYZE ... WITH AUTO ... statement to trigger it.
TODO:

collate relevant p0 tests
Supplement the design description to README.md
2023-05-10 11:47:34 +08:00
601565341b [fix](gson) avoid gson serde with EsRepository (#19385)
To avoid error like:

class org.apache.doris.external.elasticsearch.EsRepository declares multiple JSON fields named runnable
2023-05-10 11:37:18 +08:00
Pxl
5473795a51 [Bug](scan) forbiden push down in predicate when in_state->use_set is false (#19471)
forbid pushing down the in predicate when in_state->use_set is false
2023-05-10 11:12:20 +08:00
78435823b6 [Fix](multi catalog)Return all partition values while reading hive table. (#19434)
Return all partition values while reading hive table.
Add a config item for the maximum size of the hive-table-to-partition-list cache.
The default value is 100.
2023-05-10 10:55:33 +08:00
d24dd12b20 [enhancement](http) add fail reply for failed submitting tasks in single-replica-download (#19356) 2023-05-10 10:54:32 +08:00
b72ff93c7a [chore](java udf)Add Java UDF compilation options (#19468) 2023-05-10 10:51:11 +08:00
1a423350f8 [Interface](exec) Add interface for multi cast data sink (#19372) 2023-05-10 10:29:33 +08:00
cf8ceb8586 [fix](scan) fix scanner mem tracker (#19354) 2023-05-10 09:56:41 +08:00
b2371c1246 [Refact](Literal)refact literal get field and value (#19351) 2023-05-10 09:01:17 +08:00
03538381a3 [enhancement](memory) MemCounter supports lock-free thread safety (#19256)
make try_add() and update_peak() thread-safe.
2023-05-10 02:24:07 +08:00
68eb420cab [fix](MySQL) the way Doris handles boolean type is consistent with MySQL (#19416) 2023-05-10 00:58:09 +08:00
f8eb08252c [chore](workflows) Disable PCH in GitHub workflows by default (#19447)
PCH slows the BE UT workflows down. Disable it by default in workflows.
2023-05-10 00:05:32 +08:00
096aa25ca6 [improvement](orc-reader) Implements ORC lazy materialization (#18615)
- Implements ORC lazy materialization, integrating with the implementations in https://github.com/apache/doris-thirdparty/pull/56 and https://github.com/apache/doris-thirdparty/pull/62.
- Refactor code: move `execute_conjuncts()` and `execute_conjuncts_and_filter_block()` from `parquet_group_reader` to `VExprContext`, so that both the parquet reader and the orc reader can use them.
- Add session variables `enable_parquet_lazy_materialization` and `enable_orc_lazy_materialization` to control whether lazy materialization is enabled.
- Modify `build.sh` to update apache-orc submodule or download package every time.
2023-05-09 23:33:33 +08:00
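As background for 096aa25ca6 above, lazy materialization in general means decoding the predicate columns first and reading the remaining columns only for rows that pass the filter. The sketch below shows that general technique with in-memory lists. `lazy_scan`, `read_column`, `STRIPE`, and `decoded` are hypothetical names, and nothing here reflects the actual ORC reader code.

```python
def lazy_scan(read_column, num_rows, predicate_column, predicate, lazy_columns):
    """Read the predicate column fully, then other columns for surviving rows only."""
    all_rows = list(range(num_rows))
    predicate_values = read_column(predicate_column, all_rows)
    # Row indexes that pass the filter.
    selected = [row for row, value in zip(all_rows, predicate_values) if predicate(value)]
    block = {predicate_column: [predicate_values[row] for row in selected]}
    for name in lazy_columns:
        # Lazy columns are decoded only at the selected row indexes.
        block[name] = read_column(name, selected)
    return block


# Toy column store standing in for a file stripe (illustrative data).
STRIPE = {"id": [1, 2, 3, 4], "payload": ["a", "b", "c", "d"]}
decoded = []  # records (column name, number of values decoded)


def read_column(name, rows):
    decoded.append((name, len(rows)))
    return [STRIPE[name][row] for row in rows]


block = lazy_scan(read_column, 4, "id", lambda value: value > 2, ["payload"])
print(block)
print(decoded)
```

In this toy run the `payload` column is decoded for two rows instead of four, which is where the saving comes from when the filter is selective.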
Pxl
dfad7b6b38 [Feature](generic-aggregation) some prowork of generic aggregation (#19343)
some prowork of generic aggregation
2023-05-09 21:42:21 +08:00
7c7db9ce93 [typo](docs) Add an open page cache hint to the benchmark (#19449) 2023-05-09 21:28:39 +08:00
1bc405c06f [fix](catalog) fix doris jdbc catalog largeint select error (#19407)
When a doris jdbc catalog is created with mysql-jdbc 5.1.47, largeint columns cannot be selected.
When mysql-jdbc reads a largeint, it converts the value to a string because it is too long.

mysql> select `largeint` from type3;
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Fail to convert jdbc type of java.lang.String to doris type LARGEINT on column: largeint. You need to check this column type between external table and doris table.
2023-05-09 17:34:48 +08:00
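The error in 1bc405c06f above arises because the driver hands back a string where an integer is expected. A minimal sketch of accepting either form follows. `to_largeint` is a hypothetical helper, and the signed 128-bit bounds are an assumption made for illustration, not taken from the commit.

```python
# Assumption: LARGEINT is treated here as a signed 128-bit integer.
LARGEINT_MIN = -(2 ** 127)
LARGEINT_MAX = 2 ** 127 - 1


def to_largeint(value):
    """Convert a driver value (int or decimal string) to a bounded integer."""
    number = int(value) if isinstance(value, str) else value
    if not LARGEINT_MIN <= number <= LARGEINT_MAX:
        raise OverflowError(f"value out of assumed LARGEINT range: {value}")
    return number


print(to_largeint("170141183460469231731687303715884105727"))
print(to_largeint(42))
```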
b07053f47d [chore](simdjson reader) default enable simdjson for json reader (#19375) 2023-05-09 16:53:21 +08:00
aeb3450151 [feature](graph)Support querying data from the Nebula graph database (#19209)
Support querying data from the Nebula graph database
This feature comes from the needs of commercial customers who have used Doris and Nebula, hoping to connect these two databases

changes mainly include:

* add New Graph Database JDBC Type
* Adapt the type and map the graph to the Doris type
2023-05-09 15:30:11 +08:00
1424fb96ca [bugfix](regression-test) disable string column length too large test and disable auto statistics collector and disable window function test (#19428) 2023-05-09 14:57:02 +08:00
2504b243f0 [Fix](build) fix clucene build type (#19376)
RelWithDebInfo uses O2 as its compile flag by default, which hurts clucene performance
2023-05-09 14:29:04 +08:00
e3d4723849 [fix](JDBC) set jdbc parameters to compatible with both MySQL and Doris when reading boolean type (#19399)
Fix errors when reading the boolean type from an external doris cluster through a jdbc catalog:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = (172.16.10.11)[INTERNAL_ERROR]Fail to convert jdbc type of java.lang.Integer to doris type BOOL on column: deleted. 
You need to check this column type between external table and doris table.
```
The MySQL types and the return values of GetColumnTypeName and GetColumnClassName are listed in https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-type-conversions.html.
However, when tinyInt1isBit=false, GetColumnClassName of MySQL returns java.lang.Boolean, while that of Doris returns java.lang.Integer. To be compatible with both MySQL and Doris, the jdbc params should set tinyInt1isBit=true&transformedBitIsBoolean=true.
2023-05-09 13:53:17 +08:00
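The two parameters named in e3d4723849 above are set on the JDBC URL. The sketch below shows one way to force them onto an existing URL. `with_boolean_params` is a hypothetical helper, the URLs are made up, and this is not how the commit itself applies the parameters.

```python
from urllib.parse import parse_qsl, urlencode

# Parameter names and values are taken from the commit message.
BOOLEAN_PARAMS = {"tinyInt1isBit": "true", "transformedBitIsBoolean": "true"}


def with_boolean_params(jdbc_url):
    """Return jdbc_url with both boolean-handling parameters forced to true."""
    base, _, query = jdbc_url.partition("?")
    params = dict(parse_qsl(query, keep_blank_values=True))
    params.update(BOOLEAN_PARAMS)  # overrides any existing conflicting value
    return base + "?" + urlencode(params)


print(with_boolean_params("jdbc:mysql://127.0.0.1:9030/demo"))
print(with_boolean_params("jdbc:mysql://127.0.0.1:9030/demo?useSSL=false&tinyInt1isBit=false"))
```

Existing parameters keep their position, and a conflicting `tinyInt1isBit=false` is overwritten rather than duplicated.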
729cd319f1 [enhance](regression) add timeout for cold&heat case (#19360) 2023-05-09 13:08:40 +08:00
d8dd0536e0 [enhance](S3FileWriter) sync when s3 file writer early quits #19393 2023-05-09 11:02:58 +08:00