7448 Commits

Author SHA1 Message Date
302da03b18 [enhancement](Nereids): Use long bitmap in DPHyp (#14725) 2022-12-01 20:47:45 +08:00
ba9a777554 [fix](function) StringRef should not be key of timezone cache (#14719) 2022-12-01 16:31:47 +08:00
9dd1d989e8 [test](decimalv3) add regression test cases for decimalv3 (#14672) 2022-12-01 15:18:40 +08:00
f496d1972a [improvement](multi-catalog) return root cause of exception (#14708) 2022-12-01 14:58:05 +08:00
176f519fa1 [enhancement](memtracker) Optimize exec node memory tracking (#14711) 2022-12-01 14:52:21 +08:00
b4d32a0c44 [fix](join) runtime filter shared from other instance wasn't be published (#14717) 2022-12-01 14:17:23 +08:00
Pxl
bba77fa9dd [Enhancement](profile) enhance column predicates display on profile (#14664) 2022-12-01 13:07:12 +08:00
3c6b96b9be [enhancement](Nereids) avoid add project that output same with child to memo (#14180) 2022-12-01 10:49:44 +08:00
7873bc95a6 [Enhancement](bitmapfilter) Support bitmap filter to apply zone_map index to filter pages (#14635) 2022-12-01 10:41:09 +08:00
ce9a160d16 [enhancement](macOS) Make CLion work out of the box (#14689)
The project cannot be built after importing it into CLion on macOS. Some options must be provided by default.
2022-12-01 10:40:04 +08:00
12791f1c79 [opt](docs) Add select except usage in select doc (#14696) 2022-12-01 10:06:17 +08:00
2a3a758c75 [doc](community) update release-verify doc when gpg import (#14706) 2022-12-01 10:04:58 +08:00
6c70d794f6 [fix](bitmapfilter) fix core dump caused by bitmap filter (#14702) 2022-12-01 09:56:22 +08:00
36737fe9f4 [feature](Nereids): Add cache to avoid repeatly calculation in DPhyp (#14585) 2022-11-30 21:35:45 +08:00
9bbbcf031c [enhancement](k8s) Support fqdn mode for be in k8s enviroment (#9172)
In a k8s environment, the IP of a pod can change, but the hostname of the pod is stable. When the pod's host machine fails, k8s can reschedule the failed pod onto a new host machine and rebuild it there. The newly created pod keeps the same hostname but gets a new IP address. When enable_fqdn_mode is true, FQDNManager detects the change of the BE node's IP address.

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-11-30 20:42:15 +08:00
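The hostname-stable, IP-volatile behaviour described in the #9172 entry above can be illustrated with a small sketch. This is not the FQDNManager code: the function name `detect_ip_changes`, the hostnames, and the idea of passing the resolver in as a parameter are all assumptions made for illustration.

```python
import socket
from typing import Callable, Dict, Tuple


def detect_ip_changes(
    known: Dict[str, str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Tuple[str, str]]:
    """Return {hostname: (old_ip, new_ip)} for every host whose IP changed.

    `known` maps a stable hostname to the last IP seen for it.
    `resolve` turns a hostname into its current IP (DNS by default).
    """
    changes = {}
    for hostname, old_ip in known.items():
        new_ip = resolve(hostname)
        if new_ip != old_ip:
            changes[hostname] = (old_ip, new_ip)
    return changes


# Hypothetical example: a fake resolver stands in for cluster DNS so the
# sketch runs without network access.
fake_dns = {"be-0.doris": "10.0.0.9", "be-1.doris": "10.0.0.2"}
last_seen = {"be-0.doris": "10.0.0.1", "be-1.doris": "10.0.0.2"}

changed = detect_ip_changes(last_seen, fake_dns.__getitem__)
for host, (old_ip, new_ip) in changed.items():
    last_seen[host] = new_ip  # record the new address under the same hostname
```

Keying the bookkeeping on the hostname rather than the IP is what lets a rescheduled pod be recognised as the same node.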
80baca2643 [Docs](memory) Admin-manual adds mem tracker, memory exceeds limit, OOM analysis (#14419) 2022-11-30 18:02:05 +08:00
738c36109f rename tpch dir (#14668) 2022-11-30 17:59:13 +08:00
11735043d6 [improvement](test) logging load result (#14694)
When a load fails, we have to log in to Doris to investigate the result.
2022-11-30 16:57:35 +08:00
593a916ae6 [feature](nereids) split AggregateDisassemble into two rules (#14611)
# Proposed changes

Issue Number: close #14280

## Problem summary

The AggregateDisassemble rule is refactored and split into two rules that do not depend on each other.
1. AggregateDisassemble splits the agg into two phases: Local, Global.
1.1. For the count function, the implementation is as follows: distinct_multi_count(update) + distinct_multi_count(merge)

2. DistinctAggregateDisassemble splits the agg into four stages: Local, Global, Distinct Local, Distinct Global.
2.1. For the count function, the implementation is as follows: distinct_multi_count(update) + distinct_multi_count(merge) + sum(update) + sum(merge)
2022-11-30 14:02:42 +08:00
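The phase splitting described in the #14611 entry above can be pictured with a conceptual sketch. It does not reproduce the Nereids rules or the distinct_multi_count function; it only shows, on plain Python lists, the general idea of a two-phase count and a four-stage distinct count. The function names, the bucket count, and the sample data are assumptions.

```python
from typing import Hashable, List, Sequence


def two_phase_count(partitions: Sequence[Sequence[Hashable]]) -> int:
    # Local phase (update): every partition counts its own rows.
    partial_counts = [len(rows) for rows in partitions]
    # Global phase (merge): the partial counts are combined.
    return sum(partial_counts)


def four_stage_count_distinct(
    partitions: Sequence[Sequence[Hashable]], num_buckets: int = 2
) -> int:
    # Stage 1 (Local): remove duplicates inside each partition.
    local_sets = [set(rows) for rows in partitions]
    # Stage 2 (Global): route equal values to the same bucket and
    # remove duplicates again, now across partitions.
    buckets: List[set] = [set() for _ in range(num_buckets)]
    for values in local_sets:
        for value in values:
            buckets[hash(value) % num_buckets].add(value)
    # Stage 3 (Distinct Local): count the distinct values per bucket.
    partial_counts = [len(bucket) for bucket in buckets]
    # Stage 4 (Distinct Global): sum the per-bucket counts.
    return sum(partial_counts)


sample = [[1, 2, 2], [2, 3], [3, 3, 4]]
total_rows = two_phase_count(sample)                # 8 rows in total
distinct_rows = four_stage_count_distinct(sample)   # 4 distinct values
```

The extra two stages exist because a distinct count cannot be merged by simply adding partial results; the values must be deduplicated across partitions first.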
79688a54d6 [bug](jsonb) fix be core at insert invalid json to JSONB column (#14686) 2022-11-30 14:00:50 +08:00
f3cf83a933 (fix)[test] add some logs (#14695) 2022-11-30 12:45:12 +08:00
3ca3af2234 [improvement](planner)sort show catalogs result by name (#14684)
The results of show databases, show tables, and show data are all sorted by name, so make show catalogs behave the same way.
2022-11-30 11:55:14 +08:00
3ff409551c [enhencement](netty) bind netty's default logger when launching fe (#14675)
Doris FE logs through log4j, while netty may go through slf4j to choose a logger.
Confusing behavior has been reported in this situation.
The binding does not take effect if the bind logic is moved to another file or to another place within PaloFe.java,
so it has to stay before the main function.
2022-11-30 11:54:39 +08:00
486a77fec0 [fix](tcmalloc) use low_watermark instead of hard_mem_limit (#14660)
* [fix](tcmalloc) use low_watermark instead of hard_mem_limit

hard_mem_limit is removed.

* format
2022-11-30 11:29:57 +08:00
9272680d00 [feature](multi-catalog) support Jdbc catalog (#14527)
Issue Number: close #xxx

Add a JDBC catalog to the Doris multi-catalog feature.
Currently, the JDBC catalog only supports MySQL.

TODO:

Support for PostgreSQL.
Support for other databases.
Problem summary
For the JDBC catalog, a catalog can be created like this:

CREATE CATALOG jdbc4 PROPERTIES (
    "type"="jdbc",
    "jdbc.user"="root",
    "jdbc.password"="123456",
    "jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false",
    "jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar",
    "jdbc.driver_class" = "com.mysql.jdbc.Driver"
);
Note:
yearIsDateType is a JDBC URL parameter:
If the yearIsDateType configuration property is set to false, then the returned object type is java.lang.Short. If set to true (the default), then the returned object is of type java.sql.Date with the date set to January 1st, at midnight.
For compatibility with MySQL, FE forces the use of yearIsDateType=false. If the user sets yearIsDateType=true, Doris FE overrides it to yearIsDateType=false.
2022-11-30 11:28:08 +08:00
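The forced yearIsDateType=false described in the #14527 entry above amounts to rewriting one query parameter of the JDBC URL. The sketch below shows that idea only; it is not the FE code, the helper name `force_year_is_date_type_false` is hypothetical, and the exact-case match on the parameter name is an assumption.

```python
from urllib.parse import parse_qsl, urlencode


def force_year_is_date_type_false(jdbc_url: str) -> str:
    """Return jdbc_url with yearIsDateType pinned to false."""
    base, _, query = jdbc_url.partition("?")
    # Keep every other parameter in its original order and drop any
    # user-supplied yearIsDateType value.
    params = [
        (key, value)
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key != "yearIsDateType"
    ]
    params.append(("yearIsDateType", "false"))
    return base + "?" + urlencode(params)


rewritten = force_year_is_date_type_false(
    "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=true"
)
```

Note that `urlencode` percent-encodes special characters in parameter values, so a production version would need to decide whether that is acceptable for the target driver.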
82f3980774 [feature](Nereids) estimation without column statistics (#14526)
Estimate plan cost without column statistics.
Change list:
1. Remove the original StatsCalculator, which is replaced by StatsCalculatorV2; rename StatsCalculatorV2 to StatsCalculator.
2. Remove FilterSelectivityCalculator, which is replaced by FilterEstimation.
3. Remove the session variable ENABLE_NEREIDS_STATS_DERIVE_V2.
4. Add ColumnStatistics.isUnKnown, which means the column has not been analyzed and its stats are not accurate.
5. Add an estimatedRowCount() function for OLAP tables.
6. Add unit tests for FilterEstimation and StatsCalculator.
2022-11-30 11:27:51 +08:00
3a362fab76 [fix](fe)table function node use wrong info for projection (#14667) 2022-11-30 10:41:32 +08:00
Pxl
7a1fde379c [Enhancement](function) optimize for decimal arithmetic calculation (#14674)
* optimize for decimal arithmetic calculation

* Apply suggestions from code review

Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2022-11-30 10:41:03 +08:00
ca90253b09 [config](storage-policy) add a FE config to disable storage policy by default (#14655)
The cold-hot separation feature is still under development, and some issues remain unsolved.
So this adds an FE config, enable_storage_policy (default false), to disable the creation and usage of storage policies by default.

This makes users aware that they are using an experimental feature at their own risk, and that it will not be formally released in v1.2.0.

Disable storage policy by default, so users cannot use or create a storage policy. Configured by enable_storage_policy.

Remove the property remote_storage_policy, which duplicates storage_policy.

Change the persisted field in DataProperty.java.
Also remove remoteCooldownTime from DataProperty, because it can be obtained from StoragePolicy.
2022-11-30 10:04:33 +08:00
dd7ec8f4ca [improvement](test) add tpch1 orc for hive catalog and refactor some test dir (#14669)
Add a tpch 1g orc test case in the hive docker.

Refactor some suite dirs of the catalog test cases.

Add "-internal" for the dlf endpoint, to support accessing oss within an aliyun vpc.
2022-11-30 10:03:58 +08:00
4faca56819 [bug](jsonb) fix INSERT/CAST NULL to JSONB (#14682)
Add NULL -> JSONB in implicitCastMap to support INSERT/CAST NULL to JSONB.
2022-11-30 09:53:16 +08:00
d5ee721621 [improvement](planner)Adjust the field naming rules when creating tables (#14671)
Adjust the field naming rules when creating tables.

Previously, a table field name had to start with a letter, an underscore, or the @ character,
followed by at most 63 characters, so the total could not exceed 64 characters.
However, in many industries, such as the financial industry, derived field names often exceed 64
characters, so the limit in the regular expression is raised from 64 characters to 128 characters.
Many users load data from Hive into Doris through external tables or Broker Load.
Hive allows a digit as the first character of a field name, so the regular expression is also adjusted
to allow a digit as the first character.
2022-11-30 09:45:27 +08:00
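The naming rule change described in the #14671 entry above can be sketched as two regular expressions. These are not the patterns used in Doris. They assume that the characters after the first one are limited to letters, digits, and underscores, and that the new limit of 128 applies to the total length; both points are assumptions, as are the names `OLD_RULE`, `NEW_RULE`, and `is_valid_field_name`.

```python
import re

# Old rule (as described): letter, underscore, or @ first,
# then at most 63 more characters (64 in total).
OLD_RULE = re.compile(r"[A-Za-z_@][A-Za-z0-9_]{0,63}")

# New rule (assumed shape): a digit may also come first,
# and the total length may reach 128 characters.
NEW_RULE = re.compile(r"[A-Za-z0-9_@][A-Za-z0-9_]{0,127}")


def is_valid_field_name(name: str, rule: re.Pattern = NEW_RULE) -> bool:
    # fullmatch requires the whole name to satisfy the rule.
    return rule.fullmatch(name) is not None


# A Hive-style name starting with a digit passes only the new rule.
hive_style = "2022_revenue"
accepted_now = is_valid_field_name(hive_style)
accepted_before = is_valid_field_name(hive_style, OLD_RULE)
```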
b12ac90d8f [tools](tpch) upgrade decimal type to decimalv3 (#14665) 2022-11-30 08:41:06 +08:00
5a2e3869df [regression](test) enable fe and be fuzzy test (#14673) 2022-11-30 08:40:32 +08:00
33cda9f22a [improvement](planner)support like in show catalogs stmt #14678
Co-authored-by: yuleiyang <yuleiyang@tencent.com>
2022-11-30 08:38:42 +08:00
33ad616839 [fix](statistics) Fix potential NPE in ShowStatisticsStmt #14679
When the required cache has not been loaded yet, the cache always returns ColumnStatistics.DEFAULT, which does not define the max/min literal expr. Add a check for that case.
2022-11-30 08:38:20 +08:00
898d0d42f1 [improvement](load)add more log for better bug tracing experience for be write (#14424)
Recently, while tracing a bug that occurred in version 1.1.4,
I found several places where more logging would make tracing easier.
2022-11-29 22:28:39 +08:00
82579126cf [fix](Dictionary-codec) heap overflow with in-predicate on nullable columns (#14319) (#14641)
Losing the segment id info will mess up the _segment_id_to_value_in_dict_flags map
in InListPredicate, causing two distinct segments to collide and eventually
crash the BE.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2022-11-29 21:22:18 +08:00
22883e7e08 [fuzzy](test) be fuzzy conf (#14654) 2022-11-29 19:38:40 +08:00
03aa5572da [feature](docker)Add Broker Docker image related files (#14621)
Add Broker Docker image related files
2022-11-29 18:34:10 +08:00
85ce3c37b5 [fix](DOE) fix ES query dsl is wrong after FE restarted. (#14652)
Some of the default properties of the ES catalog are not persisted in the EditLog. So when FE is restarted,
the default properties are lost, such as `elasticsearch.doc_value_scan`, `elasticsearch.keyword_sniff` and so on.
2022-11-29 17:06:48 +08:00
a60490651f [improvement](function) add timezone cache for convert_tz (#14616) 2022-11-29 17:00:54 +08:00
1713af6cd6 [test](java udf)add new java udf case (#14653) 2022-11-29 16:43:53 +08:00
fe95b84c34 [fix](jsonb)fix CAST String to JSONB nullable problem (#14626)
Fix the CAST String to JSONB nullable problem in DEBUG mode.
2022-11-29 16:22:22 +08:00
7a08a799e9 [Vectorized](function) support order by convert_to function (#14555) 2022-11-29 15:22:27 +08:00
facb7cf4e2 [fix](spark load)Temp partition with spark load (#14648)
* [fix](spark load)losing temporary partition item entry

* [fix](spark load)Temp partition with spark load
2022-11-29 15:21:44 +08:00
c5f9fd5619 [fix](spark load)partition column is not duplicate key, spark load IndexOutOfBounds error (#14661)
* [fix](spark load)partition column is not duplicate key,spark load IndexOutOfBoundsException error

Co-authored-by: 张放(vivianv.zhang) <vivianv.zhang@huolala.cn>
2022-11-29 15:21:21 +08:00
3e8b3658c7 [feature-wip](decimalv3) Support basic agg and arithmetic operations for decimal v3 (#14513) 2022-11-29 15:12:41 +08:00
Pxl
82da071b45 [Chore](format) update clang-format version to 15 (#13036)
update clang-format version to 15
2022-11-29 14:46:10 +08:00
97f0d3a756 [Improvement](datatype) disable new types if vectorized engine is disabled (#14561)
* [Improvement](datatype) disable new types if vectorized engine is disabled

disable datev2/datetimev2/decimalv3 if vectorized engine is disabled
2022-11-29 10:33:46 +08:00