2696 Commits

Author SHA1 Message Date
47e33c7987 Support create index on unique value column (#5305)
* support create index on unique table value columns
2021-02-03 13:22:00 +08:00
ddd85d8ae8 [ODBC] Fix Memory consumption of ODBC MySQL Driver (#5322) (#5323) 2021-02-01 00:12:49 +08:00
bb7ba00ccf [Backup]Support content, exclude and whole database in backup (#5314)
This PR supports the following functions:
1. Support the content property in the backup statement. Users can back up only metadata, or
metadata plus data, selected through the content [METADATA_ONLY|ALL] attribute.
2. Support excluding some tables in the backup and restore statements. This means that some
very large and unimportant tables can be excluded when the entire database is backed up.
3. Support backing up and restoring a whole database instead of declaring each table name
in the backup and restore statement.

The backup and restore API has changed as follows:
```
BACKUP SNAPSHOT [db_name].{snapshot_name}
TO 'repo_name'
[ON|EXCLUDE (
    'table_name' [partition (p1,...)]
)]
[properties (
    "content" = "metadata_only|all"
)]

RESTORE SNAPSHOT [db_name].{snapshot_name}
TO 'repo_name'
[EXCLUDE|ON (
    'table_name' [partition (p1,...)]
)]
[properties (
)]
```
2021-02-01 00:12:35 +08:00
Ben
7b5468e7b8 [Build] Fix ui build error: classnames and react-router dependencies (#5283) 2021-02-01 00:11:47 +08:00
2d70cc532c [Bug] Fix CompactionPermitLimiter cv starve bug (#5274)
Fix a bug where _permits_cv.wait may starve indefinitely.
2021-02-01 00:11:29 +08:00
b315244ba7 [Doc] Fix the error description for the number of bytes of double type. (#5273)
Correct the documented size of the double type from 12 bytes to 8 bytes.
2021-02-01 00:11:14 +08:00
f3aded9370 [Bug] System metric init failed cause be start failed (#5262)
A failure to initialize system metrics caused BE startup to fail.
2021-02-01 00:10:57 +08:00
be0b0f930c [Load] Load job should not begin transaction when task queue in loadingLoadTaskScheduler is full to avoid txn timeout (#5205) 2021-02-01 00:10:24 +08:00
cd96ded1ad [Bugs] Fix bugs that FE heartbeat api of httpv2 does not return version info (#5306)
Co-authored-by: morningman <chenmingyu@baidu.com>
2021-01-30 20:34:33 +08:00
de57667d6d [Delete] Support delete with multi partitions (#5252)
Support delete statement like:
1. delete from table partitions(p1, p2) where xxx;  // apply to p1, p2
2. delete from table where xxx;     // apply to all partitions

Also remove code about the deprecated sync/async delete job.

This CL changes FE meta version to 94
2021-01-30 20:33:34 +08:00
bf0cb78b67 [optimization] avoid extra memory copy while build hash table (#5301)
Avoid an extra memory copy while building the hash table.
2021-01-30 20:32:12 +08:00
8fe372f82b [Bug] Fix NoSuchElementException when accessing empty partition info (#5201) 2021-01-30 16:36:24 +08:00
90c2da54bd [Bug] Fix bug and add graceful exit for compaction producer (#5124)
1. Add a graceful exit mechanism for the compaction producer thread.
2. If a compaction task fails to submit, it should be removed from `_tablet_submitted_compaction`.
2021-01-30 16:35:36 +08:00
4ffc61be32 fix apply condition to unique table value columns incorrectly (#5302) 2021-01-29 10:34:47 +08:00
6bd22bc573 [BackupAndRestore] Support backup and restore view and external odbc table (#5299)
1. Support backup and restore of views and odbc tables. The syntax is the same as for backing up and restoring a table.
2. If the table associated with a view does not exist in the snapshot,
   the view can still be backed up successfully, but a TableNotFound exception will be thrown when querying the view.
3. If an odbc table is associated with an odbc resource, the odbc resource is backed up and restored together with it.
4. If the same view, odbc table or resource already exists in the database, its metadata is compared with the metadata in the snapshot.
   If they are inconsistent, the restoration fails.
5. This PR also modifies the JSON format of the backup information.
   A `new_backup_objects` object is added to the root node to store backup meta-information for objects other than olap tables,
   such as views and external tables.
   ```
   {
       "backup_objects": {},
       "new_backup_objects": {
           "view": [
               {"name": "view1", "id": "10001"}
           ],
           "odbc_table": [
               {"name":"xxx", xxx}
           ],
           "odbc_resources": [
               {"name": "bj_oracle"}
           ]
       }
   }
   ```
6. This PR changes the serialization and deserialization of backup information
   from manual construction to automatic handling by Gson.

Change-Id: I216469bf2a6484177185d8354dcca2dc19f653f3
2021-01-28 18:50:18 +08:00
e774314ffb Fix some problems related to thrift rpc when using the nonblocking IO model (#5117)
* Fix some problems related to thrift rpc when using the nonblocking IO model

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:57:30 +08:00
54814a7260 [Audit] Fix bug that only the last statement is audited when a user sends multiple statements (#5244)
Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:54:18 +08:00
67b0631257 [Enhancement] Fix bug in the auditloader plugin that audit events cannot be processed in time (#5194)
* [Enhancement] Fix bug that audit events cannot be processed in time

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:53:32 +08:00
ca10205137 [Function] Support show create function statement (#5197)
* [Function]Support show create function stmt

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:52:37 +08:00
41ef9ccda9 (#5224)some little fix for spark load (#5233)
* (#5224)some little fix for spark load

* 1. Use yyyy-MM-dd instead of YYYY-MM-DD
  2. Unify lower case for bitmap column names
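
Why the pattern letters matter can be sketched in Python. This is only an illustration, assuming the patterns are Java date format patterns, where `YYYY` means week year and `DD` means day of year. The ISO week-based year is used as a stand-in for Java's locale-dependent week year, and all function names are hypothetical.

```python
from datetime import date


def calendar_year(d: date) -> int:
    """Calendar year, the analogue of the 'yyyy' pattern."""
    return d.year


def week_based_year(d: date) -> int:
    """ISO week-based year, a stand-in for the 'YYYY' pattern."""
    return d.isocalendar()[0]


def day_of_month(d: date) -> int:
    """Day of month, the analogue of the 'dd' pattern."""
    return d.day


def day_of_year(d: date) -> int:
    """Day of year, the analogue of the 'DD' pattern."""
    return d.timetuple().tm_yday


new_year = date(2021, 1, 1)  # a Friday that belongs to ISO week 53 of 2020
february = date(2021, 2, 1)

print(calendar_year(new_year), week_based_year(new_year))
print(day_of_month(february), day_of_year(february))
```

Near a year boundary the two year fields disagree, and after January the two day fields disagree, which is why the lower-case pattern is the safe choice for plain calendar dates.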
2021-01-27 11:16:59 +08:00
c084276600 Revert "[Bug] Fix row_number and group by are inconsistent with 0 and -0 partition (#5226)" (#5297)
This reverts commit 34bfc429868a9a22481d209c24ccd50d85cc3c9f.
The hash algorithm may overflow.
2021-01-26 13:58:19 +08:00
92bcd2f9fa [Doc] Update README.md: a reword suggestion (#5288) 2021-01-26 09:14:08 +08:00
6729c3b226 [gitignore] Update .gitignore: remove a duplicate pattern (#5289) 2021-01-26 09:13:49 +08:00
2010d331d3 [Doc] Translate a Chinese statement which appears in English version doc (#5290) 2021-01-26 09:13:30 +08:00
8ee4c48f13 [Compile] fix compile error in gcc10 (#5294) 2021-01-26 09:13:11 +08:00
067abac342 [bug] fix bug of getBrokerDesc object is null (#5295)
Co-authored-by: wangxiaobaidu11 <328642799@qq.com>
2021-01-26 09:12:44 +08:00
3db08a14bf [Bug] Fix bug of outer join cause error result. (#5285)
issue: #5284
2021-01-24 10:14:26 +08:00
72ca5c5f3d [Optimize] Remove path check when start BE (#5268)
remove path check when start BE
2021-01-24 10:14:07 +08:00
139709d060 [Storage] Optimize Zone map create policy (#5260)
If a table has very large fields, each page may contain only one row,
and that row still gets its own zone map index.
This can expand the stored data to three times the size of the original data,
and reading those segments also takes more memory.
Therefore, we disable the creation of zone map indexes for segments with too few rows.
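
The policy described above can be sketched as follows. This is a minimal illustration, not the BE implementation: the threshold `MIN_ROWS_FOR_ZONE_MAP` and the function `build_zone_map` are hypothetical names, and the real cutoff is not stated in this commit message.

```python
from typing import List, Optional, Tuple

# Hypothetical threshold; the actual value used by the storage engine is not given here.
MIN_ROWS_FOR_ZONE_MAP = 30


def build_zone_map(
    values: List[int], min_rows: int = MIN_ROWS_FOR_ZONE_MAP
) -> Optional[Tuple[int, int]]:
    """Return (min, max) for a segment, or None when the segment has too few rows."""
    if not values or len(values) < min_rows:
        # With very few rows the index costs about as much as the data it summarizes.
        return None
    return (min(values), max(values))


print(build_zone_map([42]))             # too few rows, no zone map
print(build_zone_map(list(range(100))))  # enough rows, zone map is built
```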
2021-01-24 10:11:21 +08:00
ab06e92021 [Load Parallel][2/3] Support parallel flushing memtable during load (#5163)
In the previous implementation, within a load job,
multiple memtables of the same tablet were written to disk sequentially.
In fact, multiple memtables can be written in parallel and out of order,
as long as each memtable uses a different segment writer.
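
The idea can be sketched with a thread pool. This is a simplified illustration, not the BE code: `SegmentWriter`, `flush_memtable` and `flush_all` are hypothetical names, and in-memory lists stand in for memtables and segment files.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class SegmentWriter:
    """Stand-in for a segment writer; each memtable gets its own instance."""

    def __init__(self, segment_id: int) -> None:
        self.segment_id = segment_id
        self.rows: List[int] = []

    def write(self, rows: List[int]) -> None:
        # A memtable is flushed in sorted order.
        self.rows.extend(sorted(rows))


def flush_memtable(segment_id: int, rows: List[int]) -> SegmentWriter:
    writer = SegmentWriter(segment_id)  # never shared between memtables
    writer.write(rows)
    return writer


def flush_all(memtables: List[List[int]], workers: int = 4) -> Dict[int, List[int]]:
    """Flush all memtables in parallel; completion order does not matter."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(flush_memtable, segment_id, rows)
            for segment_id, rows in enumerate(memtables)
        ]
        writers = [future.result() for future in futures]
    return {writer.segment_id: writer.rows for writer in writers}


print(flush_all([[3, 1], [2], [5, 4]]))
```

Because no writer is shared, the flushes need no locking among themselves, which is the condition the commit message states.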
2021-01-24 10:10:30 +08:00
7e61400e3c [Load Parallel][1/3] Broker Load supports setting the load parallelism (#5277)
* [Load] Broker Load supports setting the load parallelism

Similar to the parallel_fragment_exec_instance_num parameter,
it allows the user to set the parallelism of the load execution plan
on a single node when the broker load is submitted.

eg:
```
...
properties (
"load_parallelism" = "4";
...
)
```

This parameter currently only supports setting the load parallelism;
it does not yet significantly improve load speed.
The speed improvement will be completed in subsequent code submissions.
Documents will also be added in subsequent submissions.

This PR also updates the FE meta version.
2021-01-24 10:09:53 +08:00
1884c7b25d [Doc] Update README.md (#5287)
Target:
1. Make the statement more friendly.
2. Replace the word "environment" with its plural form "environments".
2021-01-23 21:10:30 +08:00
d3f1b49faa [Bug] Fix bug that recover table throw NPE (#5279) 2021-01-23 21:09:54 +08:00
34bfc42986 [Bug] Fix row_number and group by are inconsistent with 0 and -0 partition (#5226)
The essence of the problem is how negative zero (-0.0) behaves in comparison with positive zero (+0.0).
Currently, in GroupBy and HashPartition, -0.0 is not equal to 0.0 (they produce different hash values),
so -0.0 and 0.0 are divided into two partitions.

In the row_number analytic function, for sorted data, a new partition is opened when the values of
the upper and lower rows are not equal. But in C++ the comparison 0.0 == -0.0 is true, so 0.0 and -0.0
fall into the same partition for row_number.

(Floating point arithmetic in C++ is often IEEE-754. This standard defines two different representations for
the value zero: positive zero and negative zero. It also defines that those two representations must
compare equal. Refer to https://stackoverflow.com/questions/45795397)
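
The mismatch can be reproduced in a few lines of Python, which also uses IEEE-754 doubles. This is only a sketch: CRC32 over the raw bytes stands in for whatever hash function the engine uses, and `normalized_hash` shows one common remedy (mapping -0.0 to +0.0 before hashing), which is an assumption and not necessarily what this commit did.

```python
import struct
import zlib


def raw_bits(x: float) -> bytes:
    """Return the 8-byte IEEE-754 encoding of a double (big-endian)."""
    return struct.pack(">d", x)


def bytewise_hash(x: float) -> int:
    """Hash the raw bytes, as a byte-oriented hash function would."""
    return zlib.crc32(raw_bits(x))


def normalized_hash(x: float) -> int:
    """Map -0.0 to +0.0 before hashing so that equal values share a hash."""
    if x == 0.0:
        x = 0.0  # replaces -0.0 with +0.0; +0.0 is unchanged
    return bytewise_hash(x)


pos, neg = 0.0, -0.0
values_equal = pos == neg                    # comparison treats them as equal
bits_equal = raw_bits(pos) == raw_bits(neg)  # the sign bit differs

print(values_equal, bits_equal)
print(bytewise_hash(pos) == bytewise_hash(neg))
print(normalized_hash(pos) == normalized_hash(neg))
```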
2021-01-23 21:08:43 +08:00
2ddf537094 [Meta] Add some consistency check in image put api (#5219) 2021-01-23 21:08:04 +08:00
a5298d617d [Performance Improve] Push Down _conjunctf of 'not in' and '!=' to Storage Engine. (#5207) 2021-01-23 21:07:01 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the use of VLOG in the code is quite inconsistent.
Some of it follows the VLOG_XX format inherited from Impala, and some uses the VLOG(number) format.
The VLOG(number) format has no unified specification, so this PR standardizes the use of VLOG.
2021-01-21 12:09:09 +08:00
50ba5d336d [Bug] Colocate Join and Bucket shuffle join may scan some tablets twice. (#5256)
Fix issue #5255
2021-01-20 21:42:04 +08:00
b25bcee5d3 [Bug] Remove schema hash and fix bug of calculating table signature (#5254)
1. Schema hash has been useless for a long time.
    Currently, the schema hash can simply be generated as a random integer; there is no need to calculate it
    from the real schema.

2. The CRC32 algorithm is not sufficient to generate a table's signature.
    A table's signature is used to determine whether tables have the same schema,
    and the current CRC32 algorithm may return the same signature even if the tables' schemas are different.

    So I changed it to calculate the MD5 of a signature string assembled from the schema info of a table.
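
A minimal sketch of the second point, assuming the signature string is built from the table name and each column's name and type. The exact fields and layout used by FE are not given in this message, so `signature_string` and `table_signature` are hypothetical.

```python
import hashlib
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, column type)


def signature_string(table_name: str, columns: List[Column]) -> str:
    """Assemble a deterministic string from schema info (hypothetical layout)."""
    parts = [table_name] + [f"{name}:{ctype}" for name, ctype in columns]
    return "|".join(parts)


def table_signature(table_name: str, columns: List[Column]) -> str:
    """MD5 of the signature string: 128 bits instead of the 32 bits of CRC32."""
    text = signature_string(table_name, columns)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


sig_a = table_signature("orders", [("k1", "INT"), ("v1", "VARCHAR")])
sig_b = table_signature("orders", [("k1", "BIGINT"), ("v1", "VARCHAR")])
print(sig_a)
print(sig_b)
```

The wider digest makes accidental collisions between different schemas far less likely than with a 32-bit checksum.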
2021-01-20 21:38:06 +08:00
83b7a23d5c fix alter routine load not work (#5257) 2021-01-20 10:52:02 +08:00
3a6476b37b add as sdk to thirdparty (#5234) 2021-01-20 10:51:22 +08:00
a59831d119 [Bug] Fix fe restart failed bug when replay erase table log (#5221)
Co-authored-by: gengjun <gengjun@dorisdb.com>
2021-01-19 10:25:49 +08:00
64b3660be2 [UT] fix the bug of getting current running dir (#5193)
Fix the logic after `readlink` and add a test_util function `GetCurrentRunningDir()`.
2021-01-19 10:23:50 +08:00
73a67901ed [Metric] Add system memory metrics for fe (#5149)
Currently, FE's SystemMetrics only supports TCP metrics. This change adds system memory metrics for FE,
which can be used to troubleshoot memory problems.
2021-01-17 09:37:01 +08:00
6794dd08bd [Doc] Update PULL_REQUEST_TEMPLATE.md (#5248)
A reword suggestion.
Reasons: Before my change, the statement was "If this change need a document change, I have updated the document",
which contains an evident grammar error: "change" cannot be paired with "need".
Either "changes need" or "change needs" would be grammatically correct.
According to the context, "changes need" is better.
Now, the statement is "If these changes need document changes, I have updated the document".
2021-01-16 21:38:38 +08:00
3dcbbbea95 [Enhancement] Fill assignment param of bucket shuffle and colocate shuffle for debug (#5167)
When Doris is in debug mode, the function `Coordinator#traceInstance` is used to print
the physical execution plan of a fragment instance for debugging.
`Coordinator#traceInstance` uses the param `scanRangeAssignment` to print
the details of a fragment, but bucket shuffle join and colocate shuffle join do not fill this param,
so debugging does not work well.
This patch fills the assignment param of bucket shuffle and colocate shuffle for debugging.
2021-01-16 21:37:33 +08:00
99b22c92f8 [Feature] Add a http interface for single tablet migration between different disks (#5101)
Based on PR #4475, this patch adds a new feature for single tablet migration between different disks via HTTP.
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-01-16 21:35:20 +08:00
d692764934 [Optimize]Take all scan nodes of one sql into consideration when select host for a tablet (#4984)
Currently, when a scan node scans many tablets, Doris balances load when choosing which replica executes each scan task. But it does not take other scan nodes into consideration to achieve a global load balance. This patch tries to balance load across all scan nodes of a query.
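
The difference between per-node and global balancing can be sketched as a greedy assignment that keeps one load counter for the whole query. This is an illustration only, not the FE algorithm: `assign_hosts` and its input shape (one dict per scan node, mapping tablet to replica hosts) are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def assign_hosts(scan_nodes: List[Dict[str, List[str]]]) -> List[Dict[str, str]]:
    """Pick a replica host per tablet, balancing across ALL scan nodes of one query."""
    load: Counter = Counter()  # shared by every scan node, not reset per node
    plan: List[Dict[str, str]] = []
    for tablets in scan_nodes:
        node_plan: Dict[str, str] = {}
        for tablet, replicas in tablets.items():
            # Least-loaded host wins; ties are broken by host name for determinism.
            host = min(replicas, key=lambda h: (load[h], h))
            load[host] += 1
            node_plan[tablet] = host
        plan.append(node_plan)
    return plan


# Two scan nodes, each with one tablet replicated on hosts "a" and "b".
print(assign_hosts([{"t1": ["a", "b"]}, {"t2": ["a", "b"]}]))
```

If the counter were reset for each scan node, both tablets in the example would land on host "a"; with the shared counter the second scan node is steered to host "b".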

Co-authored-by: wangxixu <wangxixu@xiaomi.com>
2021-01-15 11:18:57 +08:00
78fd4b68f8 [Bug] Fix bucket shuffle join bug of query failed (#5228)
Fix #5227: fix a bug where the query fails when buckets are cut in the left table.
2021-01-15 10:44:04 +08:00
35bb099b8b [Doc] Update README.md: a reword suggestion (#5241) 2021-01-15 10:43:07 +08:00