13721 Commits

Author SHA1 Message Date
4ba75d3195 [feature] Add StoragePolicyResource for Remote Storage (#9554)
Add StoragePolicyResource for Remote Storage
2022-05-17 20:17:33 +08:00
d95fe08458 [feature] group_concat support distinct (#9576) 2022-05-17 19:29:47 +08:00
ec2cd0083a [code format]Upgrade clang-format in BE Code Formatter from 8 to 13 (#9602) 2022-05-17 19:28:15 +08:00
7417f9dfa3 [doc]modified the spark-load doc (#9605) 2022-05-17 19:27:02 +08:00
0aac9489ae [doc]add largeint doc (#9609)
add largeint doc
2022-05-17 19:26:45 +08:00
536d8ca1ed [Bug][Vectorized] Fix insert bitmap column with nullable column (#9408)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-17 14:42:20 +08:00
1cc9653bd8 [Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization (#9547)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-17 14:01:22 +08:00
7d9fa04472 [fix](storage-vectorized) fix VMergeIterator core dump (#9564)
It can be reproduced on a rowset with many segments, that is, when segments overlap. It may not be easy to reproduce.
2022-05-17 11:58:59 +08:00
72e0042efb [feature-wip](hudi) Step1: Support create hudi external table (#9559)
support create hudi table
support show create table for hudi table

### Design
1. create hudi table without schema (recommended)
```sql
    CREATE [EXTERNAL] TABLE table_name
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```

2. create hudi table with schema
```sql
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```
When creating a hudi table with a schema, the columns must exist in the corresponding table in the hive metastore.
2022-05-17 11:30:23 +08:00
bee5c2f8aa [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (#9433) 2022-05-17 09:37:17 +08:00
7e8e14b3c6 [docs]Modified flink-doris-connector.md (#9595) 2022-05-17 09:01:14 +08:00
5660815dc6 [chore] Fix compilation errors reported by clang (#9584) 2022-05-16 22:36:16 +08:00
c731e84341 [fix](planner)VecNotImplException thrown when query needs rewrite and some slot cannot be changed to nullable (#9589) 2022-05-16 22:34:02 +08:00
9f9b666bc1 [Feature](Nereids) Data structure of comparison predicate (#9506)
1. The data structure of the comparison expression
2. Refactored the inheritance and implementation relationship of tree node

```
        +-- ---- ---- ---+- ---- ---- ---- ---+- ---- ----- ---- ----TreeNode-----------------+
        |                |                    |                                               |
                                                                                              |
        |                |                    |                                               |
                                                                                              v
        v                v                    v                                           Abstract Tree Node
    Leaf Node        Unary Node          Binary Node                              +--------          ---------+
        |                |                    |                                   |        (children)         |
                                                                                  |                           |
        v                v                    v                                   v                           v
Leaf Expression   Unary Expression      Binary Expression              +------Expression----+           Plan Node
        |                |                    |                        |                    |
                                                                       |                    |
        |                |                    |                        v                    v
        |                |                    +- ---- ---- -----> Comparison Predicate     Named Expr
                                                                                       +----   -------+
        |                |                                                             v              v
        |                +- -- --- --- --- --- --- --- --- --- --- --- --- --- ---> Alias Expr      Slot
                                                                                                      ^
        |                                                                                             |
        |                                                                                             |
        +---- --- ---- ------ ---- ------- ------ ------- --- ------ ------ ----- ---- ----- ----- ---+
```
2022-05-16 15:01:13 +08:00
953429e370 [fix](function) fix last_value get wrong result when have order by clause (#9247) 2022-05-15 23:56:01 +08:00
e0c790094c [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (#9566) 2022-05-15 21:18:32 +08:00
cfe22f8691 [Doc]Add show tables help documentation (#9568) 2022-05-15 10:18:33 +08:00
1075648093 [doc]fix doc typo in data-model and date data type (#9571) 2022-05-15 10:17:46 +08:00
9151cf717a ADD: Supplement the IDEA development documentation with the steps for generating help-resource.zip (#9561) 2022-05-14 19:04:11 +08:00
3cfa83784e [bugfix](vectorized) vectorized write: invalid memory access caused by podarray resize (#9556) 2022-05-14 19:03:51 +08:00
cd105bee0a [refactor](es) Clean es tcp scannode and related thrift definitions (#9553)
PaloExternalSourcesService was designed for es_scan_node using the tcp protocol.
But the es tcp protocol requires deploying a tcp jar into the es code. Both the es version and the lucene version have been upgraded,
and the tcp jar is no longer maintained.

So I removed all the related code and thrift definitions.
2022-05-14 10:03:55 +08:00
a9653f00bb [fix](lateral-view) Error view includes lateral view (#9530)
Fixed #9529

When a lateral view is based on an inline view that belongs to a view,
Doris could not resolve the columns of the lateral view in the query.
When a query uses a view, it mainly refers to the string representation of the view.
That is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view lacked handling of the lateral view,
which led to query errors when using such views.
This PR fixes the string representation of inline views.
2022-05-14 09:57:08 +08:00
fa6e4db4ca [fix](Function) fix case when function return null with abs function (#9493) 2022-05-14 09:50:45 +08:00
b817efd652 [feature] add vectorized vjson_scanner (#9311)
This pr is used to add the vectorized vjson_scanner, which can support vectorized json import in stream load flow.
2022-05-14 09:50:05 +08:00
f144041a3c [doc] [Improved] Complete the flink connector documentation (#9528)
Co-authored-by: 王磊 <lei.wang@unidt.com>
2022-05-13 16:22:54 +08:00
4ca5be94a7 [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor (#9491)
* [Improvement] remove unnecessary memcpy in OlapBlockDataConvertor
2022-05-13 16:13:07 +08:00
650e3a6ba0 [feature-wip](array-type) array_contains support more nested data types (#9170)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-13 12:42:40 +08:00
34e64fbea9 [doc]Add ARM architecture compilation tutorial content (#9535)
Co-authored-by: manyi <fop@freeoneplus.com>
2022-05-13 10:24:19 +08:00
8c166d747c Clean the version.sh file before build; otherwise the version information in the binary package produced by this compilation still carries the commit id of the previous build. (#9534)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-05-13 10:23:44 +08:00
e0ef04a5a7 [fix][vectorized-storage] did not check column writer's write status 2022-05-13 09:57:33 +08:00
955b7a3ba2 [bugfix](load) fix coredump in ordinal index flush (#9518)
Commit #9123 introduced the bug. The bitshuffle page returns an error when
the page is full, so the scalar column writer cannot switch to the next page, which leaves
the ordinal index null at flush.

All page builders should return ok when the page is full, and the column writer procedure
should be: append_data, check is_page_full, switch to the next page.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-12 21:10:49 +08:00
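The column writer procedure described in commit 955b7a3ba2 (append_data, check is_page_full, switch to the next page) can be sketched roughly as follows. This is a minimal illustration in Java, not the BE code, which is C++. `PageBuilderSketch`, `ColumnWriterSketch`, and all of their members are hypothetical names chosen for this example. The point it tries to show is that a full page is treated as a normal condition rather than an error, so the writer can always move on and every finished page gets an ordinal index entry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical page builder: holds at most `capacity` values.
class PageBuilderSketch {
    private final int capacity;
    private final List<Integer> values = new ArrayList<>();

    PageBuilderSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Takes as many values as fit and reports how many were taken.
    // A full page is not an error: the builder simply takes fewer values.
    int add(int[] data, int offset, int count) {
        int taken = Math.min(count, capacity - values.size());
        for (int i = 0; i < taken; i++) {
            values.add(data[offset + i]);
        }
        return taken;
    }

    boolean isPageFull() {
        return values.size() >= capacity;
    }

    int size() {
        return values.size();
    }

    void reset() {
        values.clear();
    }
}

// Hypothetical column writer: append, check whether the page is full, switch pages.
class ColumnWriterSketch {
    private final PageBuilderSketch page;
    // First row ordinal of every finished page.
    private final List<Integer> ordinalIndex = new ArrayList<>();
    private int nextOrdinal = 0;
    private int pageFirstOrdinal = 0;

    ColumnWriterSketch(int pageCapacity) {
        this.page = new PageBuilderSketch(pageCapacity);
    }

    void appendData(int[] data) {
        int offset = 0;
        while (offset < data.length) {
            int taken = page.add(data, offset, data.length - offset);
            offset += taken;
            nextOrdinal += taken;
            if (page.isPageFull()) {
                finishCurrentPage(); // switch to the next page
            }
        }
    }

    // Flushes the last, partially filled page if there is one.
    void finish() {
        if (page.size() > 0) {
            finishCurrentPage();
        }
    }

    private void finishCurrentPage() {
        ordinalIndex.add(pageFirstOrdinal);
        pageFirstOrdinal = nextOrdinal;
        page.reset();
    }

    List<Integer> ordinalIndex() {
        return ordinalIndex;
    }
}
```

With a page capacity of 4, appending 10 values and then calling `finish()` records three pages whose first ordinals are 0, 4 and 8.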
8a0097cfb9 [style](java) format fe code with some check rules (#9460)
Issue Number: close #9403 

Set the severity of the rules below to error and format the code according to the check info.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found
2022-05-12 20:14:38 +08:00
86c9227dbb [regression test]add the regression test for json load (#9517)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-05-12 16:08:03 +08:00
da1a0c96db Fix incorrect sequence numbers in documents. (#9496)
Co-authored-by: smallhibiscus <844981280>
2022-05-12 15:44:41 +08:00
4ccaa0dfc5 [Bug] (load) Broker load kerberos auth fail (#9494) 2022-05-12 15:43:29 +08:00
a0b95d8fcb [fix](storage) fix core for string predicate in storage layer (#9500)
Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-05-12 15:41:39 +08:00
4cd579b155 [refactor] Check status precise_code instead of construct OLAPInternalError (#9514)
* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status
2022-05-12 15:39:29 +08:00
d26f5d22be [refactor]Cleanup unused empty files (#9497) 2022-05-12 14:58:28 +08:00
d7705ace65 [fix](binlog-load) binlog load fails because txn exceeds the default value (#9471)
When a binlog load fails because txn exceeds the default value, resume now fails as well
and gives the user a friendly prompt message. Previously it reported success
and then failed again after a while, which left the user confused.
Issue Number: close #9468
2022-05-12 13:31:22 +08:00
cfbf13710b [fix](broker-load) can't load parquet file whose column names differ in case from the Doris columns (#9358) 2022-05-12 13:27:03 +08:00
122cc3b772 [chore](fe code style)add suppressions to fe check style (#9429)
The current fe check style checks all files. But some rules should only be applied to production files.
Add suppressions to suppress some rules on test files.
2022-05-12 12:16:55 +08:00
f11d320213 [feature] support row policy filter (#9206) 2022-05-11 22:11:10 +08:00
289608cc20 [fixbug]fix bug for OLAP_SUCCESS with Status (#9427) 2022-05-11 20:04:06 +08:00
e3bac86b43 [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (#9462)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-11 18:00:56 +08:00
3ba5ff4705 [doc] update fe checkstyle doc (#9373) 2022-05-11 15:44:29 +08:00
74352c807e [refactor](Nereids): cascades refactor (#9470)
Describe the overview of changes.

- rename GroupExpression
- use `HashSet<GroupExpression> groupExpressions` in `memo`
- add label of `Nereids` for CI
- remove `GroupExpr` from Plan
2022-05-11 11:07:58 +08:00
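Commit 74352c807e mentions keeping `HashSet<GroupExpression> groupExpressions` in the memo. A minimal sketch of why a hash set is a natural fit there, assuming (this is not stated in the commit) that the set is used to detect structurally identical group expressions: `GroupExpressionSketch` and `MemoSketch` are hypothetical, simplified stand-ins, not the Nereids classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified group expression: an operator name plus child group ids.
final class GroupExpressionSketch {
    final String operator;
    final List<Integer> childGroupIds;

    GroupExpressionSketch(String operator, Integer... childGroupIds) {
        this.operator = operator;
        this.childGroupIds = Arrays.asList(childGroupIds);
    }

    // Two expressions are equal when operator and children match.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GroupExpressionSketch)) {
            return false;
        }
        GroupExpressionSketch other = (GroupExpressionSketch) o;
        return operator.equals(other.operator) && childGroupIds.equals(other.childGroupIds);
    }

    @Override
    public int hashCode() {
        return Objects.hash(operator, childGroupIds);
    }
}

// Hypothetical memo that records each distinct group expression once.
class MemoSketch {
    private final Set<GroupExpressionSketch> groupExpressions = new HashSet<>();

    // Returns true if the expression is new, false if an equal one is already recorded.
    boolean add(GroupExpressionSketch expression) {
        return groupExpressions.add(expression);
    }

    int size() {
        return groupExpressions.size();
    }
}
```

In this sketch, adding `new GroupExpressionSketch("Join", 1, 2)` twice leaves a single entry in the memo, while `("Join", 2, 1)` counts as a different expression because the child order differs.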
ad88eb739b [fix](http) Hardening Recommendations Disable TRACE/TRAC methods (#9479) 2022-05-11 09:41:59 +08:00
8fa0122ed0 [refactor](backend) Refactor the logic of selecting Backend in FE. (#9478)
There are many places in FE where a group of BE nodes needs to be selected according to certain requirements. For example:
1. When creating replicas for a tablet.
2. When selecting a BE to execute Insert.
3. When Stream Load forwards http requests to BE nodes.

These operations all have the same logic. So this CL mainly changes:
1. Create a new `BeSelectionPolicy` class to describe the set of conditions for selecting BE.
2. The logic of selecting BE nodes in `SystemInfoService` has been refactored, and the following two methods are used uniformly:
    1. `selectBackendIdsByPolicy`: Select the required number of BE nodes according to the `BeSelectionPolicy`.
    2. `selectBackendIdsForReplicaCreation`: Select the BE node for the replica creation operation.

Note that there are some changes here:
For the replica creation operation, the round-robin method was used to select BE nodes before,
but now it is changed to `random` selection for the following reasons:
1. Although the previous logic is round-robin, it is actually random.
2. The final diff of the random algorithm will not be greater than 5%, so it can be considered that the random algorithm
     can distribute the data evenly.
2022-05-11 09:40:57 +08:00
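The idea behind commit 8fa0122ed0 (describe the selection conditions in one policy object, then pick the required number of matching BE nodes at random) can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `BackendSketch`, `SelectionPolicySketch`, `BackendSelectorSketch`, the two example conditions (alive, storage medium), and the choice to return an empty list when too few nodes match are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for a BE node.
class BackendSketch {
    final long id;
    final boolean alive;
    final String storageMedium;

    BackendSketch(long id, boolean alive, String storageMedium) {
        this.id = id;
        this.alive = alive;
        this.storageMedium = storageMedium;
    }
}

// Hypothetical policy: the set of conditions a BE node must satisfy.
class SelectionPolicySketch {
    final boolean needAlive;
    final String storageMedium; // null means "any medium"

    SelectionPolicySketch(boolean needAlive, String storageMedium) {
        this.needAlive = needAlive;
        this.storageMedium = storageMedium;
    }

    boolean matches(BackendSketch backend) {
        if (needAlive && !backend.alive) {
            return false;
        }
        return storageMedium == null || storageMedium.equals(backend.storageMedium);
    }
}

class BackendSelectorSketch {
    // Returns `number` randomly chosen ids of matching backends,
    // or an empty list if fewer than `number` backends match (an assumption of this sketch).
    static List<Long> selectByPolicy(SelectionPolicySketch policy, int number,
                                     List<BackendSketch> backends, Random random) {
        List<Long> candidates = new ArrayList<>();
        for (BackendSketch backend : backends) {
            if (policy.matches(backend)) {
                candidates.add(backend.id);
            }
        }
        if (candidates.size() < number) {
            return Collections.emptyList();
        }
        // Random selection instead of round-robin, as the commit message describes.
        Collections.shuffle(candidates, random);
        return new ArrayList<>(candidates.subList(0, number));
    }
}
```

Every caller (replica creation, Insert, Stream Load forwarding) would then build a policy describing its own conditions and go through the same selection method, rather than repeating the filtering logic.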
a738d385db [regression] add regression test for compaction (#9437)
Trigger compaction via REST API in this case.
2022-05-11 09:40:21 +08:00
375c1bf5c0 [feature](mysql-table) support utf8mb4 for mysql external table (#9402)
This patch supports utf8mb4 for mysql external table.

Some users need a mysql external table with the utf8mb4 charset, but only the utf8 charset is supported right now.

When creating a mysql external table, you can add an optional property "charset" that sets the character set of the mysql connection.
The default value is "utf8". You can set "utf8mb4" instead of "utf8" when you need it.
2022-05-11 09:39:23 +08:00