2290 Commits

Author SHA1 Message Date
8bb65863f5 [Doc] Update doc of fe-idea-dev.md (#4485) 2020-08-31 10:09:10 +08:00
7b67da30d2 [Spark Load] Redirect the spark launcher's log to a separated log file (#4470) 2020-08-30 21:10:04 +08:00
1d93ba027a [Compaction] Compaction show policy type and disk format (#4466)
Add more information in compaction show api
1. Add cumulative policy type
2. Format rowset total disk size
2020-08-30 21:09:47 +08:00
wyb
ffe696d17c [Doc] Add spark load sql statement doc and update manual (#4463)
1. add sql statement in dml
2. update spark load manual
2020-08-30 21:09:17 +08:00
65cacbff7c [Bug] Fix bug that memory copy may overflow in MemIndex::load_segment (#4458)
The content of a segment index file is not zero-initialized when the file is constructed in the write procedure.
When the index is later loaded from this file and a null VARCHAR cell is encountered,
the null field of the cell is 0, but the uninitialized length field may hold a large random number,
so the memory copy may overflow.
This patch fixes the bug and also skips useless memory copies to improve performance slightly.
2020-08-30 21:08:55 +08:00
123237afb7 [Compaction] Persistence stale rowsets meta (#4454)
Persist stale rowsets meta. When BE reboots, the stale rowsets meta
can be restored and the stale versions remain readable until the stale gc time.

ISSUE: #4453
2020-08-30 21:05:48 +08:00
3b7614e174 [Refactor] Use camelCase in thrift generated java sources (#4443)
Use camelCase in thrift generated java sources to make the FE code style more unified
2020-08-28 13:28:11 +08:00
0db9194dc0 [Doc] Fix wrong doc name (#4477)
Co-authored-by: morningman <chenmingyu@baidu.com>
2020-08-28 11:56:59 +08:00
004b955ca4 [Bug] Fix a null pointer bug in PlanFragmentExecutor. (#4473)
Fix a null pointer bug in PlanFragmentExecutor. Add null check operation before it is used.
Detail: #4472
2020-08-28 09:28:23 +08:00
wyb
ec64789e89 [Bug][Colocation Join] Fix colocation balance endless loop bug (#4471)
1. Only one available backend.
2. All backends are checked but this round is not changed. For example, all backends are on the same host.
2020-08-28 09:27:57 +08:00
174c9f89ea [DOCS] Add batch delete docs (#4435)
update documents for batch delete #4051
2020-08-28 09:24:07 +08:00
wyb
82940a4905 [Spark Load] Fix spark load bugs (#4464)
1. Fix writing the dpp result when dpp throws an exception
2. Accept boolean values: true, false (case-insensitive), 0, 1
3. Fix wrong dest column used in the source data check
4. Support * in source file path
5. If the job state is cancelled or finished, submitPushTasks would throw an "all partitions have no load data" exception,
    because tableToLoadPartitions was already cleaned up

#3433
2020-08-27 23:40:33 +08:00
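The boolean handling listed in item 2 above can be sketched as follows. This is only an illustration in Python, not the Spark Load code itself; the function name `parse_boolean` and the choice to raise `ValueError` on anything else are assumptions.

```python
def parse_boolean(text: str) -> bool:
    """Parse a source value as a boolean.

    Accepted inputs follow the commit message: "true" and "false" in any
    letter case, plus "0" and "1". Raising ValueError for other input is
    an assumption made for this sketch.
    """
    value = text.lower()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"not a boolean value: {text!r}")
```

For example, `parse_boolean("TRUE")` and `parse_boolean("1")` both yield `True`, while `parse_boolean("yes")` is rejected.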
84c63f1350 [Bug] replace libltdl.so when compile the unixodbc library (#4461) 2020-08-27 20:53:28 +08:00
976e3bb219 [Bug][Compile] Add missing imports (#4468)
Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-08-27 18:14:11 +08:00
ad738fa198 Add OLAP_ERR_DATE_QUALITY_ERR error status to display schema change failure (#4388)
During the historical data transformation of materialized views, the transformation may fail because of data quality problems.
Add an error status code, OLAP_ERR_DATE_QUALITY_ERR, to determine whether a data problem is causing the failure.

#3344
2020-08-27 17:52:53 +08:00
fe0c21bf93 [Bug] Fix mysql return bug (#4450)
Send the fields packet only after the first row has arrived, so that an error packet can be sent to the client
when an exception is thrown from coord.getNext().
Golang and Python clients cannot identify the error if the fields packet arrives before the error packet.
2020-08-27 12:17:24 +08:00
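The ordering described in the commit above can be sketched as a small model. It is a simplified illustration in Python, not the FE implementation: packets are represented as plain tuples, and the names `send_result_set`, `get_next`, and `failing_source` are hypothetical. The real MySQL protocol has more packet types than shown here.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Packet = Tuple[str, object]


def send_result_set(columns: Sequence[str],
                    get_next: Callable[[], Optional[tuple]]) -> List[Packet]:
    """Return the packets, in order, that would be written to the client."""
    packets: List[Packet] = []
    fields_sent = False
    try:
        while True:
            row = get_next()      # may raise, like coord.getNext()
            if row is None:       # None marks the end of the result set
                break
            if not fields_sent:
                # Delay the fields packet until the first row has arrived.
                packets.append(("FIELDS", tuple(columns)))
                fields_sent = True
            packets.append(("ROW", row))
    except Exception as exc:
        # If the failure happens before any row, ERR is the first packet.
        packets.append(("ERR", str(exc)))
        return packets
    if not fields_sent:
        # An empty result set still needs its fields packet.
        packets.append(("FIELDS", tuple(columns)))
    packets.append(("EOF", None))
    return packets


def failing_source() -> Optional[tuple]:
    """Hypothetical row source that fails on the first fetch."""
    raise RuntimeError("query failed")
```

With `failing_source`, the only packet produced is the error packet, which is the case the commit message says Golang and Python clients need.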
3c784b9c90 [SQL] support StringLiteral try to cast BigInt (#4445) 2020-08-27 12:15:28 +08:00
b85bb0e2e9 [Bug-Fix] Some deleted tablets are not recycled on BE (#4401) 2020-08-27 12:09:19 +08:00
8c38c79104 [SparkLoad]Use the yarn command to get status and kill the application (#4383)
This CL uses the yarn command as follows to kill an application running on YARN or to get its status.

```
yarn --config confdir application <-kill | -status> <Application ID>
```
2020-08-27 12:08:55 +08:00
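As an illustration of the command shape quoted in the commit above, a small helper could assemble the argument list. This is a sketch in Python rather than the FE code; the helper name `yarn_application_command` and the example config directory and application ID are assumptions.

```python
from typing import List


def yarn_application_command(config_dir: str, action: str,
                             application_id: str) -> List[str]:
    """Build `yarn --config <confdir> application <-kill | -status> <id>`.

    Only the two actions named in the commit message are accepted.
    """
    if action not in ("kill", "status"):
        raise ValueError(f"unsupported action: {action!r}")
    return ["yarn", "--config", config_dir, "application",
            f"-{action}", application_id]
```

The returned list can be passed to `subprocess.run(...)` on a host where the `yarn` client is installed.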
f218327dd9 [Mysql Compatibility] Support convert() and signed/unsigned interger cast (#4364)
1. Support convert(expr, target_type) function, which is same as CastExpr
2. Support cast (expr as signed/unsigned int)
   This is just for compatibility, the signed/unsigned specification is meaningless.
2020-08-27 12:07:58 +08:00
8b0b120aca [Profile] Add 2 Segment related metrics in query profile (#4348)
Total number of segments and number of filtered segments
2020-08-27 12:07:21 +08:00
e4e9af4577 This PR contain three things (#4448)
1. Fix core bug wild pointer in PlanFragmentExecutor, fix issue #4447
2. Fix core bug wild pointer json load, fix issue #4452
3. Change the declare order of ODBC type in thrift for compatibility
2020-08-26 10:53:53 +08:00
78e1615db9 Show column display name on Show Proc stmt (#4446)
The mv column with bitmap_union function is named `mv_bitmap_union_k1` inside of Doris.
But this column name should not be shown to user in `Show Proc` stmt.
Instead, using define expr is easier to understand.

Change-Id: Id07274fef9b3a97c97f1635dd3d6cf7b09561c1e
2020-08-26 10:52:56 +08:00
97d963468a [Code Cleanup] Template nest convert to c++11 syntax and style (#4442) 2020-08-26 10:51:52 +08:00
09129b5ddd [MV] Keep the scale and precision of type when creating mv (#4436)
The DECIMAL, CHAR, VARCHAR have their own scale and precision in column.
The mv column should keep those scale and precision.

Fixed #4433
Change-Id: Ie288738a4356e60d11ea472dd274e54bc7ae6990
2020-08-26 10:51:12 +08:00
b4d8b3d9ba Forbidden the illegal column types on BITMAP_UNION OR HLL_UNION mv (#4432)
1. The base column of bitmap_union must be an integer type. LARGEINT is not supported either.
2. The base column of hll_union could not be decimal.

Check error msg of const expr in Union Node

If a user tries to insert a negative number into a bitmap mv, Doris will throw the exception 'invalid input'.
The const value in Union Node is checked in this commit.
2020-08-26 10:49:32 +08:00
a5d1d010c0 [Doc] Fix typo about plugin content (#4416) 2020-08-26 10:48:07 +08:00
664e6a5898 [Storage] "align_tag_path" and ALIGN_TAG_PREFIX is needless (#4410) 2020-08-26 10:47:21 +08:00
ca5e224594 [Bug] Fix the bug that replication_num in show create table is incorrect (#4393) 2020-08-26 10:43:59 +08:00
763a42c9af [MySQL Compatibility 2/4][Bug] Fix bug and improve compatibility with mysql protocol (#4362)
1. select database() will only return database name, without cluster name.
2. select user() will return the IP which user connected in.
2020-08-26 10:40:42 +08:00
613c44e889 [Optimize]Optimize the disk selection strategy on BE for tablet creation (#4373)
When creating a tablet, it is necessary to select a disk from all disks that
meet the requirements on the BE node to store the tablet.

In Doris, the current disk selection strategy is to randomly select a disk
from all disks that meet the requirements for tablet creation.

After the cluster has been running for a long time, we found that the
distribution of the number of tablets on different disks in a BE node is unbalanced.

In order to solve this problem, we introduced the algorithm of "two random choices"
for disk selection when creating the tablet:
(1) Select two disks from all disks that meet the requirements on the BE node randomly;
(2) Choose the disk with the smaller number of tablets from the two disks selected in (1) for tablet creation.
2020-08-26 10:35:33 +08:00
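The "two random choices" strategy described in the commit above can be sketched as follows. This is a minimal Python illustration, not the BE implementation; the names `pick_disk` and `simulate`, the disk paths, and the tie-breaking rule are assumptions.

```python
import random
from typing import Dict, Optional


def pick_disk(tablet_counts: Dict[str, int],
              rng: Optional[random.Random] = None) -> str:
    """Pick a disk for a new tablet using "two random choices".

    tablet_counts maps each eligible disk path to its current tablet count.
    """
    rng = rng or random.Random()
    disks = list(tablet_counts)
    if not disks:
        raise ValueError("no eligible disk")
    if len(disks) == 1:
        return disks[0]
    # Step (1): sample two distinct eligible disks uniformly at random.
    first, second = rng.sample(disks, 2)
    # Step (2): keep the one holding fewer tablets (ties go to the first).
    return first if tablet_counts[first] <= tablet_counts[second] else second


def simulate(num_disks: int = 4, num_tablets: int = 1000,
             seed: int = 0) -> Dict[str, int]:
    """Create tablets one by one and return the final per-disk counts."""
    rng = random.Random(seed)
    counts = {f"/data{i}": 0 for i in range(num_disks)}
    for _ in range(num_tablets):
        counts[pick_disk(counts, rng)] += 1
    return counts
```

Compared with picking a single random disk, comparing two candidates keeps the per-disk tablet counts close together, because a disk that is already ahead of its sampled partner never receives the new tablet.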
0040153c51 [MySQL Compatibility 1/4][Bug] Fix bug that set sql_mode with concat() function failed (#4359)
Support `set sql_mode = concat(@@sql_mode, "STRICT_TRANS_TABLES");`
2020-08-26 10:28:25 +08:00
b1c7841c20 [SQL] Fix TupleIsNull miss in SelectStmt resultExpr (#4279) 2020-08-26 10:27:50 +08:00
d5a0a738f4 [SQL] Rewrite count(distinct if(bool, bitmap, null)) to bitmap_union_count (#4201)
Add IF(BOOL, BITMAP, BITMAP) function.
2020-08-26 10:26:40 +08:00
wyb
691227922e [SQL Plan]Fix explicit broadcast join bug (#4424)
Use broadcast join when users specify explicitly [BROADCAST] in queries.
2020-08-25 22:06:45 +08:00
c201cf6e4f Support batch delete[part 2] (#4425)
support batch delete for read compaction
2020-08-25 14:05:04 +08:00
1410d4e623 [Doc] Add in predicate support content in delete-manual.md (#4404)
Add in predicate support content in delete-manual.md
2020-08-24 21:52:28 +08:00
67b842ce04 [License] Organize and modify the license of the code (#4371)
1. Disable the MySQL client and LZO library by default when building Doris.

    MySQL client library is used for MySQL external table feature.
    This feature will be replaced by the new ODBC external table soon.

    LZO library is used to compress/decompress data in some old Doris data formats,
    which are no longer used.

2. Add missing license to some files.

3. All non-Apache-License code is explained in the NOTICE file, and the corresponding license is declared.

4. Remove the js source code from webroot, it will be downloaded as thirdparty
2020-08-24 21:51:55 +08:00
976820ba20 [SegmentV2] Change the default storage format to SegmentV2 (#4387)
Since Segment V2 has been released for a long time, we should make it the default storage format for newly created tables.

This CL mainly changes:
1. For all newly created tables, their default storage format is Segment V2.
2. For all existing tablets, the storage format remains unchanged.
3. Fix bugs described in #4384 and #4385
2020-08-24 21:51:17 +08:00
5fc79561d7 [MemTracker][Bug-Fix] Fix core in DECHECK in memory tracker (#4421)
Fix DECHECK failed in mem_tracker, issue #4420
2020-08-23 22:41:02 +08:00
af2b749a87 make some readFields Deprecated (#4399)
We have changed most of our serialization methods to JSON. To stay compatible with previous data, these classes still retain the readFields method. PRs that modify metadata often modify the readFields method as well. To avoid this, we should mark these methods as Deprecated. #4398
2020-08-21 22:58:08 +08:00
d61c10b761 [Delete] Support batch delete [part 1] (#4310)
* Implements the grammar of the batch delete #4051 
* Process create, alter table when table has delete sign column
* Support the syntax for enabling the delete column
* Automatically filter deleted data in the select statement.
* Automatically add the delete sign when creating a rollup table
TODO:
 * Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction
2020-08-21 22:57:16 +08:00
984006adf9 [ODBC SCAN NODE] 2/4 Add Thrift Interface of odbc_scan_node (#4389)
issue:#4376
2020-08-21 21:26:48 +08:00
a8fe54b7b9 [ODBC SCAN NODE] 1/4 Add unix odbc library. (#4377) 2020-08-21 21:26:14 +08:00
5976395bb6 [BUG] Remove the deduplication of LEFT SEMI/ANTI JOIN with not equal predicate (#4417)
```
SELECT *
FROM
  (SELECT cs_order_number,
          cs_warehouse_sk
   FROM catalog_sales
   WHERE cs_order_number = 125005
     AND cs_warehouse_sk = 4) cs1
LEFT SEMI JOIN
  (SELECT cs_order_number,
          cs_warehouse_sk
   FROM catalog_sales
   WHERE cs_order_number = 125005) cs2
ON cs1.cs_order_number = cs2.cs_order_number
AND cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk;
```

The above query has an equal predicate and a not equal predicate.
If a not equal predicate exists, the build table should remain
as it is. So the deduplication should be removed.
2020-08-21 19:55:09 +08:00
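Why the deduplication has to be skipped in the commit above can be seen in a toy hash join. The sketch below is a Python illustration, not the BE join node; the function `left_semi_join`, its `dedup_build` flag, and the sample rows are made up for the example.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, int]  # (cs_order_number, cs_warehouse_sk)


def left_semi_join(probe: List[Row], build: List[Row],
                   dedup_build: bool) -> List[Row]:
    """LEFT SEMI JOIN on equal order number and different warehouse."""
    table: Dict[int, List[Row]] = {}
    for row in build:
        bucket = table.setdefault(row[0], [])
        if dedup_build and bucket:
            continue  # keep only the first build row per equal-join key
        bucket.append(row)
    result = []
    for row in probe:
        # The not equal predicate must see every build row with this key.
        if any(row[1] != other[1] for other in table.get(row[0], [])):
            result.append(row)
    return result


probe_rows = [(125005, 4)]
build_rows = [(125005, 4), (125005, 7)]

with_dedup = left_semi_join(probe_rows, build_rows, dedup_build=True)
without_dedup = left_semi_join(probe_rows, build_rows, dedup_build=False)
```

With deduplication only the build row `(125005, 4)` survives, so the `<>` predicate never matches and the probe row is lost; without it, the build row `(125005, 7)` satisfies the predicate and the probe row is returned.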
76a04de6c4 [MV] Input correct keys type of index meta when Add Partition (#4408)
Define Expr will not be serialized in Column `toThrift`.

1. When adding partition, different indexes should use their own keys type
instead of using the keys type of base table uniformly.
2. There are two kinds of define expr in Column , one is analyzed, and the other is not analyzed.
Currently, analyzed define expr is only used when creating materialized views, so the define expr in RollupJob must be analyzed.
In other cases, such as define expr in `MaterializedIndexMeta`, it may not be analyzed after being replayed.
When executing the load, the analyzed define expr (such as to_bitmap(cast(k1, varchar))) will not be analyzed again.
Only a cast function will be added to the inner layer(such as to_bitmap(cast(cast(k1 ,int), varchar))) which is analyzed too.
The define expr that has not been analyzed (such as cast(k1, varchar)) will be analyzed when executing the load.
2020-08-21 10:42:41 +08:00
a7422ee142 [UT][Bug-Fix] Resolve UT memory leak problem (#4406)
Fix ut memory leak on Fix #4164
2020-08-21 10:41:54 +08:00
09b1965499 [MV] Fix errors when alter materialized view which based on dup table (#4375)
1. Input the correct keys type when mv is updated.
The keys type of the mv should be used in the schema change job rather than the keys type of the base table.
Otherwise, the BE will crash and throw the exception "Create replicas failed".

2. Forbid adding a non-key column to an agg mv directly when the base table is duplicate model
If a dup table has an agg mv, the user cannot add a non-key column to the mv.
The non-key column can only be added to the dup index.
2020-08-21 10:36:03 +08:00
0715c54004 Fix mispelling (#4407)
Centers to centos
2020-08-21 09:14:21 +08:00
04a75b7c28 [Doc] Fix spelling errors in dynamic partition docs (#4395)
Change-Id: I84de1602b99c6b89b59ccc5869c96516c40a181d
2020-08-20 09:31:33 +08:00