4601 Commits

Author SHA1 Message Date
cd105bee0a [refactor](es) Clean es tcp scannode and related thrift definitions (#9553)
PaloExternalSourcesService was designed for es_scan_node using the TCP protocol.
However, the ES TCP protocol requires deploying a TCP jar into the ES code. Both the ES and Lucene versions have since been upgraded,
and the TCP jar is no longer maintained.

So this change removes all the related code and thrift definitions.
2022-05-14 10:03:55 +08:00
a9653f00bb [fix](lateral-view) Error view includes lateral view (#9530)
Fixed #9529

When a lateral view is based on an inline view that belongs to a view,
Doris could not resolve the lateral view's columns in the query.
When a query uses a view, it mainly relies on the view's string representation.
That is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view did not handle the lateral view,
which led to query errors when using such views.
This PR fixes the string representation of inline views.
2022-05-14 09:57:08 +08:00
fa6e4db4ca [fix](Function) fix case when function return null with abs function (#9493) 2022-05-14 09:50:45 +08:00
b817efd652 [feature] add vectorized vjson_scanner (#9311)
This PR adds the vectorized vjson_scanner, which supports vectorized JSON import in the stream load flow.
2022-05-14 09:50:05 +08:00
f144041a3c [doc] [Improved] The flink connector documentation is perfect (#9528)
Co-authored-by: 王磊 <lei.wang@unidt.com>
2022-05-13 16:22:54 +08:00
4ca5be94a7 [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor (#9491)
* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor
2022-05-13 16:13:07 +08:00
650e3a6ba0 [feature-wip](array-type) array_contains support more nested data types (#9170)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-13 12:42:40 +08:00
34e64fbea9 [doc]Add ARM architecture compilation tutorial content (#9535)
Co-authored-by: manyi <fop@freeoneplus.com>
2022-05-13 10:24:19 +08:00
8c166d747c Clean the version.sh file before build, otherwise the version information in the binary package produced by this compilation is still the commit id of the last time. (#9534)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-05-13 10:23:44 +08:00
e0ef04a5a7 [fix][vectorized-storage] did not check column writer's write status 2022-05-13 09:57:33 +08:00
955b7a3ba2 [bugfix](load) fix coredump in ordinal index flush (#9518)
Commit #9123 introduced this bug. The bitshuffle page builder returns an error when
the page is full, so the scalar column writer cannot switch to the next page, which leaves
the ordinal index null at flush time.

All page builders should return OK when the page is full, and the column writer procedure
should be: append_data, check is_page_full, switch to the next page.
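
The procedure above (append_data, check is_page_full, switch to the next page) can be sketched roughly as follows. This is only an illustration in Java, not the C++ storage code: `PageBuilder`, `ColumnWriter`, the fixed page capacity, and the integer ordinal index are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch in Java; the real page builders and column writer live in the C++ storage layer.
class ColumnWriterSketch {

    /** Takes as many values as fit; a full page is reported via isPageFull(), never as an error. */
    static final class PageBuilder {
        private final int capacity;
        private int count = 0; // the sketch only counts values, it does not store them

        PageBuilder(int capacity) {
            this.capacity = capacity;
        }

        /** Returns how many of the remaining values were accepted (0 when the page is already full). */
        int appendData(int[] data, int offset) {
            int accepted = Math.min(capacity - count, data.length - offset);
            count += accepted;
            return accepted;
        }

        boolean isPageFull() {
            return count >= capacity;
        }

        int size() {
            return count;
        }
    }

    static final class ColumnWriter {
        private final int pageCapacity;
        private final List<Integer> ordinalIndex = new ArrayList<>(); // first ordinal of each finished page
        private PageBuilder page;
        private int nextOrdinal = 0;
        private int pageFirstOrdinal = 0;

        ColumnWriter(int pageCapacity) {
            if (pageCapacity < 1) {
                throw new IllegalArgumentException("pageCapacity must be positive");
            }
            this.pageCapacity = pageCapacity;
            this.page = new PageBuilder(pageCapacity);
        }

        void append(int[] data) {
            int offset = 0;
            while (offset < data.length) {
                int accepted = page.appendData(data, offset); // 1. append_data
                offset += accepted;
                nextOrdinal += accepted;
                if (page.isPageFull()) {                      // 2. check is_page_full
                    finishPage();                             // 3. switch to the next page
                }
            }
        }

        private void finishPage() {
            if (page.size() == 0) {
                return; // nothing buffered, so no index entry is needed
            }
            ordinalIndex.add(pageFirstOrdinal);
            pageFirstOrdinal = nextOrdinal;
            page = new PageBuilder(pageCapacity);
        }

        /** Finishes the last partial page and returns the ordinal index. */
        List<Integer> flush() {
            finishPage();
            return ordinalIndex;
        }
    }

    public static void main(String[] args) {
        ColumnWriter writer = new ColumnWriter(3);
        writer.append(new int[] {10, 20, 30, 40, 50, 60, 70});
        System.out.println(writer.flush()); // prints [0, 3, 6]
    }
}
```

Because a full page is signalled through `isPageFull()` rather than an error, the writer always gets the chance to record an index entry before moving on, so the index is populated by the time of the flush.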

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-12 21:10:49 +08:00
8a0097cfb9 [style](java) format fe code with some check rules (#9460)
Issue Number: close #9403 

Set the severity of the rules below to error and format the code according to the check output.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found
2022-05-12 20:14:38 +08:00
86c9227dbb [regression test]add the regression test for json load (#9517)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-05-12 16:08:03 +08:00
da1a0c96db Incorrect sequence numbers in revision documents. (#9496)
Co-authored-by: smallhibiscus <844981280>
2022-05-12 15:44:41 +08:00
4ccaa0dfc5 [Bug] (load) Broker load kerberos auth fail (#9494) 2022-05-12 15:43:29 +08:00
a0b95d8fcb [fix](storage) fix core for string predicate in storage layer (#9500)
Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-05-12 15:41:39 +08:00
4cd579b155 [refactor] Check status precise_code instead of construct OLAPInternalError (#9514)
* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status
2022-05-12 15:39:29 +08:00
d26f5d22be [refactor]Cleanup unused empty files (#9497) 2022-05-12 14:58:28 +08:00
d7705ace65 [fix](binlog-load) binlog load fails because txn exceeds the default value (#9471)
When a binlog load fails to resume because the txn exceeds the default value,
give the user a friendly error message right away. Previously the resume reported success
and then failed after a while, which left users confused.
Issue Number: close #9468
2022-05-12 13:31:22 +08:00
cfbf13710b [fix](broker-load) can't load parquet file with column name case sensitive with Doris column (#9358) 2022-05-12 13:27:03 +08:00
122cc3b772 [chore](fe code style)add suppressions to fe check style (#9429)
Currently the FE checkstyle checks all files, but some rules should apply only to production files.
Add suppressions to suppress those rules on test files.
2022-05-12 12:16:55 +08:00
f11d320213 [feature] support row policy filter (#9206) 2022-05-11 22:11:10 +08:00
289608cc20 [fixbug]fix bug for OLAP_SUCCESS with Status (#9427) 2022-05-11 20:04:06 +08:00
e3bac86b43 [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (#9462)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-11 18:00:56 +08:00
3ba5ff4705 [doc] update fe checkstyle doc (#9373) 2022-05-11 15:44:29 +08:00
74352c807e [refactor](Nereids): cascades refactor (#9470)
Describe the overview of changes.

- rename GroupExpression
- use `HashSet<GroupExpression> groupExpressions` in `memo`
- add label of `Nereids` for CI
- remove `GroupExpr` from Plan
2022-05-11 11:07:58 +08:00
ad88eb739b [fix](http) Hardening Recommendations Disable TRACE/TRAC methods (#9479) 2022-05-11 09:41:59 +08:00
8fa0122ed0 [refactor](backend) Refactor the logic of selecting Backend in FE. (#9478)
There are many places in FE where a group of BE nodes needs to be selected according to certain requirements. For example:
1. When creating replicas for a tablet.
2. When selecting a BE to execute Insert.
3. When Stream Load forwards http requests to BE nodes.

These operations all have the same logic. So this CL mainly changes:
1. Create a new `BeSelectionPolicy` class to describe the set of conditions for selecting BE.
2. The logic of selecting BE nodes in `SystemInfoService` has been refactored, and the following two methods are used uniformly:
    1. `selectBackendIdsByPolicy`: Select the required number of BE nodes according to the `BeSelectionPolicy`.
    2. `selectBackendIdsForReplicaCreation`: Select the BE node for the replica creation operation.

Note that there are some changes here:
For the replica creation operation, the round-robin method was used to select BE nodes before,
but now it is changed to `random` selection for the following reasons:
1. Although the previous logic is round-robin, it is actually random.
2. The final diff of the random algorithm will not be greater than 5%, so it can be considered that the random algorithm
     can distribute the data evenly.
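
The idea of describing the conditions in a policy object and then selecting randomly among the matching nodes might look roughly like this. It is a simplified sketch, not the FE code: the `Backend` fields, the two policy flags, and the extra `backends` and `random` parameters of `selectBackendIdsByPolicy` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified sketch; field names and conditions are illustrative only.
class BeSelectionSketch {

    static final class Backend {
        final long id;
        final boolean alive;
        final boolean loadAvailable;

        Backend(long id, boolean alive, boolean loadAvailable) {
            this.id = id;
            this.alive = alive;
            this.loadAvailable = loadAvailable;
        }
    }

    /** Describes the set of conditions a BE node must satisfy. */
    static final class BeSelectionPolicy {
        final boolean needAlive;
        final boolean needLoadAvailable;

        BeSelectionPolicy(boolean needAlive, boolean needLoadAvailable) {
            this.needAlive = needAlive;
            this.needLoadAvailable = needLoadAvailable;
        }

        boolean isMatch(Backend be) {
            if (needAlive && !be.alive) {
                return false;
            }
            return !needLoadAvailable || be.loadAvailable;
        }
    }

    /** Randomly picks `number` matching BE ids, or returns an empty list if too few nodes match. */
    static List<Long> selectBackendIdsByPolicy(
            List<Backend> backends, BeSelectionPolicy policy, int number, Random random) {
        List<Long> candidates = new ArrayList<>();
        for (Backend be : backends) {
            if (policy.isMatch(be)) {
                candidates.add(be.id);
            }
        }
        if (candidates.size() < number) {
            return Collections.emptyList();
        }
        Collections.shuffle(candidates, random); // random selection instead of round-robin
        return new ArrayList<>(candidates.subList(0, number));
    }

    public static void main(String[] args) {
        List<Backend> backends = Arrays.asList(
                new Backend(1L, true, true),
                new Backend(2L, false, true),
                new Backend(3L, true, true),
                new Backend(4L, true, false));
        BeSelectionPolicy policy = new BeSelectionPolicy(true, true);
        // Only backends 1 and 3 satisfy the policy, so they are returned in random order.
        System.out.println(selectBackendIdsByPolicy(backends, policy, 2, new Random()));
    }
}
```

Keeping the conditions in one policy object lets every caller (replica creation, Insert, Stream Load forwarding) share the same filtering code and differ only in the flags it sets.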
2022-05-11 09:40:57 +08:00
a738d385db [regression] add regression test for compaction (#9437)
Trigger compaction via REST API in this case.
2022-05-11 09:40:21 +08:00
375c1bf5c0 [feature](mysql-table) support utf8mb4 for mysql external table (#9402)
This patch supports utf8mb4 for mysql external table.

Some users need a MySQL external table with the utf8mb4 charset, but only the utf8 charset is supported right now.

When creating a MySQL external table, you can add an optional property "charset" that sets the character set of the MySQL connection.
The default value is "utf8". You can set it to "utf8mb4" instead when needed.
2022-05-11 09:39:23 +08:00
092a12e983 [feature] show create materialized view (#9391) 2022-05-11 09:29:55 +08:00
718a51a388 [refactor][style] Use clang-format to sort includes (#9483) 2022-05-10 21:25:35 +08:00
ce926a7abb [refactor] delete OLAP_LOG_WARNING related macro definition (#9484)
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
2022-05-10 20:53:45 +08:00
b34ed43ec9 [feature-wip] (memory tracker) (step6, End) Fix some details (#9301)
1. Fix the memory tracking accuracy of LoadTask, ChunkAllocator, TabletMeta, and Brpc.
2. Modified some MemTracker names, deleted some unnecessary trackers, and improved readability.
3. More powerful MemTracker debugging capabilities.
4. Avoid creating TabletColumn temporary objects and improve BE startup time by 8%.
5. Fix some other details.
2022-05-10 18:17:09 +08:00
99b8e08a5f [Enhancement](Optimizer) Nereids pattern matching base framework (#9474)
This PR provides a new pattern matching framework for the Nereids optimizer.

The new pattern matching framework contains these concepts:

1. `Pattern`/`PatternDescriptor`: the multi-level shape of the tree nodes, e.g. the pattern `logicalJoin(logicalJoin(), any())` describes a plan whose root is a `LogicalJoin` and whose left child is also a `LogicalJoin`.
2. `MatchedAction`: a callback invoked when the pattern matches; usually it creates a new plan to replace the originally matched plan.
3. `MatchingContext`: the parameter passed to the MatchedAction; it contains the matched plan root and the PlannerContext.
4. `PatternMatcher`: contains a PatternDescriptor and a MatchedAction.
5. `Rule`: a rewrite rule that contains a RuleType, PatternPromise, Pattern, and transform function (equivalent to MatchedAction).
6. `RuleFactory`: a factory that helps build Rules easily. RuleFactory extends the Patterns interface and has some predefined pattern descriptors.

For example, join commutativity:
```java
public class JoinCommutative extends OneExplorationRuleFactory {
    @Override
    public Rule<Plan> build() {
        return innerLogicalJoin().thenApply(ctx -> {
            return new LogicalJoin(
                JoinType.INNER_JOIN,
                ctx.root.getOnClause(),
                ctx.root.right(),
                ctx.root.left()
            );
        }).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
    }
}
```

The code above shows the three steps to create a Rule:
1. `innerLogicalJoin()` declares that the pattern is an inner logical join. `innerLogicalJoin` is a predefined pattern.
2. Invoke `thenApply()` to attach a MatchedAction, which returns a new LogicalJoin with its children exchanged.
3. Invoke `toRule()` to convert the result to a Rule.

You can think of a Rule as containing three parts:
1. Pattern
2. transform function / MatchedAction
3. RuleType and RulePromise

So:
1. `innerLogicalJoin()` creates a `PatternDescriptor`, which contains a `Pattern`.
2. `PatternDescriptor.then()` converts the `PatternDescriptor` to a `PatternMatcher`, which contains the Pattern and the MatchedAction.
3. `PatternMatcher.toRule()` converts the `PatternMatcher` to a Rule.

These three steps are inspired by currying in functional programming.
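
To make the descriptor, matcher, rule chain concrete, here is a self-contained miniature of the idea. It is not the Nereids code: the `Join` class with string children, the `Predicate`-based pattern, and the `transform` method returning an `Optional` are hypothetical stand-ins chosen so the example runs on its own.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical miniature of the descriptor -> matcher -> rule chain; not the Nereids classes.
class PatternCurryingSketch {
    enum RuleType { LOGICAL_JOIN_COMMUTATIVE }

    /** Stand-in for a plan node: a join over two named children. */
    static final class Join {
        final boolean inner;
        final String left;
        final String right;

        Join(boolean inner, String left, String right) {
            this.inner = inner;
            this.left = left;
            this.right = right;
        }
    }

    /** Step 1: knows only the shape to match. */
    static final class PatternDescriptor {
        final Predicate<Join> pattern;

        PatternDescriptor(Predicate<Join> pattern) {
            this.pattern = pattern;
        }

        /** Step 2: bind an action to the pattern. */
        PatternMatcher thenApply(Function<Join, Join> action) {
            return new PatternMatcher(pattern, action);
        }
    }

    static final class PatternMatcher {
        final Predicate<Join> pattern;
        final Function<Join, Join> action;

        PatternMatcher(Predicate<Join> pattern, Function<Join, Join> action) {
            this.pattern = pattern;
            this.action = action;
        }

        /** Step 3: bind a rule type, producing the finished rule. */
        Rule toRule(RuleType type) {
            return new Rule(type, pattern, action);
        }
    }

    static final class Rule {
        final RuleType type;
        final Predicate<Join> pattern;
        final Function<Join, Join> action;

        Rule(RuleType type, Predicate<Join> pattern, Function<Join, Join> action) {
            this.type = type;
            this.pattern = pattern;
            this.action = action;
        }

        /** Applies the action only when the pattern matches. */
        Optional<Join> transform(Join plan) {
            return pattern.test(plan) ? Optional.of(action.apply(plan)) : Optional.empty();
        }
    }

    /** Predefined pattern, analogous in spirit to a predefined pattern descriptor. */
    static PatternDescriptor innerJoin() {
        return new PatternDescriptor(join -> join.inner);
    }

    static Rule joinCommutative() {
        return innerJoin()
                .thenApply(join -> new Join(true, join.right, join.left))
                .toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
    }

    public static void main(String[] args) {
        Join swapped = joinCommutative().transform(new Join(true, "a", "b")).get();
        System.out.println(swapped.left + " " + swapped.right); // prints b a
    }
}
```

Each call supplies one more argument and returns an object that waits for the next one, which is the currying-like shape described above.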

Note that #9446 provides a generic type for TreeNode's children, so this pattern matching framework can infer types across multiple levels of the tree. You therefore get the real tree node type without an unsafe cast, like this:
```java
logicalJoin(logicalJoin(), any()).then(j -> {
     // j can be inferred type to LogicalJoin<LogicalJoin<Plan, Plan>, Plan>
     // so j.left() can be inferred type to LogicalJoin<Plan, Plan>,
     // so you don't need to cast j.left() from 'Plan' to 'LogicalJoin'
     var node = j.left().left();
})
```
2022-05-10 10:06:04 +08:00
e61d296486 [Refactor] Replace '#ifndef' with '#pragma once' (#9456)
* Replace '#ifndef' with '#pragma once'
2022-05-10 09:25:59 +08:00
51db78d375 [refactor] modify all OLAP_LOG_WARNING to LOG(WARNING) (#9473)
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
2022-05-10 09:25:25 +08:00
76154f11b7 [docs][typo] Fix some typoes in "update.md" content. (#9455)
Fix some typoes in "update.md" content
2022-05-10 09:02:04 +08:00
70642e3bff [Doc] Add CTAS documentation (#9454)
* ADD: Add CTAS documentation
2022-05-10 09:01:42 +08:00
eec1dfde3a [feature] (vec) instead of converting line to src tuple for stream load in vectorized. (#9314)
Co-authored-by: xiepengcheng01 <xiepengcheng01@xafj-palo-rpm64.xafj.baidu.com>
2022-05-09 11:24:07 +08:00
d1b85d51a0 [code style](fe) Include test sources (#9366)
Include test sources; we also need to check them.
2022-05-09 09:40:44 +08:00
ae01862ae4 [fix](ut) fix DeltaWriter::close_wait parameter mismatch in delta_writer_test (#9457) 2022-05-09 09:38:12 +08:00
7e86c1beab [fix] UT MathFunctionTest.round_test fix (#9447)
The round function supports two forms, round(double) and round(double, int), so its arguments are variadic.
But FunctionBinaryArithmetic does not support variadic arguments yet, so get_function fails for round(double, int).

Steps to reproduce:
1. set enable_vectorized_engine=true;
2. Call round(double, int):
```
> select round(10.12345,2);
ERROR 1105 (HY000): errCode = 2, detailMessage = Function round is not implemented
```
2022-05-09 09:37:27 +08:00
6834fb23ca [fix](s3) fix s3 Temp file may write failed because of has no space on disk (#9421) 2022-05-09 09:28:43 +08:00
5df5d39161 [doc] update data-model.md and data-partition.md (#9448)
update data-model.md and data-partition.md
2022-05-09 09:19:09 +08:00
35f0725387 [doc] Update DECIMAL.md (#9451)
* Update DECIMAL.md
2022-05-09 09:17:24 +08:00
327f61b796 Update data-partition.md (#9450)
Update data-partition.md
2022-05-09 09:17:00 +08:00
8932fcaf59 [Doc] fix doc link suffix .html to .md (#9442)
* fix doc link suffix html to md
2022-05-09 09:16:06 +08:00
580ce38a3f [fix](schema_hash) Fix bug that introduced by removing schema_hash (#9449) 2022-05-08 21:03:10 +08:00
c633402ce3 [feature] (sql-digest) support sql digest (#8919) 2022-05-08 17:25:41 +08:00