18263 Commits

Author SHA1 Message Date
3fa3dfbeda [Bug][Fold constant] remove reanalyze in get constant expr (#6400)
fix #6399
2021-08-11 16:38:30 +08:00
708b6c529e [RoutineLoad] Support pause or resume all routine load jobs (#6394)
1. PAUSE ALL ROUTINE LOAD;
2. RESUME ALL ROUTINE LOAD;
2021-08-11 16:38:06 +08:00
7e93405df3 [Alter] Support alter table and column's comment (#6387)
1. alter table tbl1 modify comment "new comment";
2. alter table tbl1 modify column k1 comment "k1", modify column v1 comment "v1";
2021-08-11 16:37:42 +08:00
9216735cfa [New Feature] Support Vectorization Execution Engine Interface For Doris (#6329)
1. FE vectorized plan code
2. Function register vec function
3. Diff function nullable type
4. New thirdparty code and new thrift struct
2021-08-11 14:54:06 +08:00
1a5b03167a [Doc] Add document for datax and sample codes (#6389)
Add documents for datax in extension catalog.
Add documents for samples in best-practice catalog.
2021-08-11 11:51:13 +08:00
10f410f1c3 [Improvement] Improve metrics text format for FE (#6382) (#6383)
Fix #6382
2021-08-11 10:26:19 +08:00
0930e89452 [http][manager] Add manager related http interface. (#6396)
Encapsulate some http interfaces for better management and maintenance of doris clusters.

The http interface includes getting cluster connection information, node information, node configuration information, batch modifying node configuration, and getting query profile.

For details, please refer to the document:  
`docs/zh-CN/administrator-guide/http-actions/fe/manager/`
2021-08-10 10:58:31 +08:00
636b30b1d1 [Bug] Fix be core when failed to add batch (#6388)
Fix be core when failed to add batch
2021-08-10 10:57:57 +08:00
Qi
5f7c7ce743 [Bug][Cache] Map.get with cache key real value. (#6377) 2021-08-10 10:14:46 +08:00
929b33ac0a [DataX] doriswriter support csv (#6373)
Make doriswriter of DataX support the csv format. The csv format is simpler and faster than
the json format when the data is simple.

add property format: csv/json
add property column_separator: takes effect only when format is csv, for example "\x01", "^", etc.
2021-08-10 10:14:21 +08:00
35c8b6a0bf [DOC] Update dynamic-partition.md (#6371)
Update dynamic-partition.md
The default value of dynamic_partition_check_interval_seconds is 600 in source code.
2021-08-10 10:13:45 +08:00
312dc83118 [Bug][BloomFilter] Fix bloom filter null flag (#6367)
Fix #6366 

There is a bloom filter for each data page in a column which has bloom filter index.
`_has_null` flag can help to judge whether `null` exists in a data page.
If a `null` value is added to a data page, `_has_null` will be set to `true`.
After the bloom filter for a data page is finished, `_has_null` should be reset to `false` to prepare for the next data page.
2021-08-10 10:13:30 +08:00
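The per-page reset described in 312dc83118 above can be sketched roughly as follows. This is an illustration under assumptions, not the Doris code: `PageFilter` and `PageFilterBuilder` are hypothetical names, and a plain hash set stands in for a real bloom filter so the example stays self-contained.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical per-page summary; a real bloom filter stores bits, not a set.
struct PageFilter {
    std::unordered_set<std::size_t> hashes;
    bool has_null = false;
};

// Hypothetical builder that produces one filter per data page.
class PageFilterBuilder {
public:
    // Add one cell; std::nullopt stands for SQL NULL.
    void add(const std::optional<std::string>& value) {
        if (!value) {
            _has_null = true;
            return;
        }
        _hashes.insert(std::hash<std::string>{}(*value));
    }

    // Close the current data page and reset all per-page state.
    void flush() {
        _pages.push_back(PageFilter{std::move(_hashes), _has_null});
        _hashes.clear();
        _has_null = false;  // without this reset, later pages inherit a stale flag
    }

    const std::vector<PageFilter>& pages() const { return _pages; }

private:
    std::unordered_set<std::size_t> _hashes;
    bool _has_null = false;
    std::vector<PageFilter> _pages;
};
```

If the reset in `flush()` were missing, every page after the first one containing a `null` would also report that it holds a `null`.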
bf616dcb8f [Config] Add default configuration of load_parallelism (#6290)
- Make load_parallelism configurable. 
- Different clusters should be configured with different load_parallelism values.
- Some users don't know how to set load_parallelism, or don't know the best load_parallelism value.
2021-08-10 10:11:46 +08:00
Pxl
236e0f1eda [Feature] Support for querying the trash used capacity (#6247)
Support for querying the trash used capacity.

```
SHOW TRASH [ON ...]
```

Now users can proactively scan the trash directory.
2021-08-10 10:10:47 +08:00
d9fc1bf3ca [Feature]:Flink-connector supports streamload parameters (#6243)
Flink-connector supports streamload parameters
#6199
2021-08-09 22:12:46 +08:00
c8c571af37 [New Feature] Support synchronizing MySQL binlog in real time [stage 1] (#6289)
This commit is the first stage of #6287 

In this commit, we support:
1. Sync Job
1) Creating sync job and data channel in FE.
2) Pause sync job.
3) Resume sync job.
4) Stop sync job.
5) Show sync jobs.

2. Canal
1) Subscribing to and getting the binlog data of canal when creating a sync job.
2021-08-08 21:39:34 +08:00
882242ed15 [Enhance][Fold constant] Support fold constants in InlineView by BE (#6393)
Add support for folding constants in InlineView by BE.
2021-08-07 21:34:02 +08:00
3519a4ff47 [BUG] Fix Left Semi Join Got a Wrong Result (#6379)
```
SELECT count(distinct products_id) FROM a_table as a WHERE 1=1 AND products_id in ( SELECT products_id from b_table );
```
Errors in hash table construction could lead to unstable results.
2021-08-07 21:33:44 +08:00
36fe112eb7 [BUG] Fix query failure caused by using case-insensitive system view names in information_schema. (#6374)
The system view names in information_schema are case-insensitive,
but we should not refer to one of these using different cases within the same statement.  

The following sql is correct:
```
select * from information_schema.TAbles where TAbles.ENGINE = 'Doris';
```

The following sql is wrong because `TAbles` and `tables` are used:
```
select * from information_schema.TAbles order by tables.CREATE_TIME;
```
2021-08-07 21:33:29 +08:00
612684fb2e [DOC]Add a profile counter of local exchange send bytes (#6372)
Add a profile counter of local exchange send bytes: LocalBytesSent
2021-08-07 21:32:44 +08:00
c6aa37f5ef [Alter] Support doing compaction for tablets under alter operation (#6365)
The problem I want to solve is described in #6355.
This CL mainly changes:

1. Support compacting tablets under alter operations

   On BE side, the compaction logic will select tablets whose state is "TABLET_NOTREADY" to do cumulative compaction.

2. Remove "alter_task" field in tablet's meta on BE side.

   The "alter_task" field has not been used for a long time.

3. Support doing delete operation when table is doing alter operation.

   Previously, when a table was under an alter operation, executing delete returned the error: Table's state is not NORMAL.
   Now delete can be executed successfully, provided the condition column is not under schema change.
   And delete condition will be applied to all materialized indexes.
2021-08-07 21:32:26 +08:00
d587440e40 [Best Practice] Add systemd service config file (#6353)
Add Systemd service config file, which can manage the start and stop of Doris services,
and automatically restart them when they fail unexpectedly.
2021-08-07 21:31:43 +08:00
f772649535 [Optimize] Optimize lock when check error storage (#6321)
1. `StorageEngine::_delete_tablets_on_unused_root_path` will try to obtain tablet shard write lock in `TabletManager`
```
StorageEngine::_delete_tablets_on_unused_root_path
  TabletManager::drop_tablets_on_error_root_path
    obtain each tablet shard's write lock
```
2. `TabletManager::build_all_report_tablets_info` and other methods will obtain tablet shard read lock frequently.

So, `StorageEngine::_delete_tablets_on_unused_root_path` will hold `_store_lock` for a long time.
This makes it difficult for other threads, such as `StorageEngine::get_stores_for_create_tablet`, to acquire the `_store_lock` write lock.

`drop_tablets_on_error_root_path` is a low-probability event, so `TabletManager::drop_tablets_on_error_root_path` should return early when its param `tablet_info_vec` is empty.
2021-08-07 21:30:49 +08:00
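The early return proposed in f772649535 above might look like the sketch below. Everything except the names `drop_tablets_on_error_root_path` and `tablet_info_vec` is an assumption made for illustration: `TabletManagerSketch`, `Shard`, the fixed four shards, and the returned lock count are hypothetical and only exist to make the effect observable.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-ins; the real Doris types carry far more state.
struct TabletInfo {
    std::uint64_t tablet_id;
};

class TabletManagerSketch {
public:
    void add_tablet(std::uint64_t tablet_id) {
        Shard& shard = _shards[tablet_id % _shards.size()];
        std::unique_lock<std::shared_mutex> wlock(shard.lock);
        shard.tablet_ids.insert(tablet_id);
    }

    // Readers (for example tablet reports) only need the shared lock.
    std::size_t tablet_count() {
        std::size_t total = 0;
        for (Shard& shard : _shards) {
            std::shared_lock<std::shared_mutex> rlock(shard.lock);
            total += shard.tablet_ids.size();
        }
        return total;
    }

    // Returns how many shard write locks were taken.
    std::size_t drop_tablets_on_error_root_path(const std::vector<TabletInfo>& tablet_info_vec) {
        if (tablet_info_vec.empty()) {
            return 0;  // common case: nothing to drop, so never compete with readers
        }
        std::size_t locks_taken = 0;
        for (Shard& shard : _shards) {
            std::unique_lock<std::shared_mutex> wlock(shard.lock);
            ++locks_taken;
            for (const TabletInfo& info : tablet_info_vec) {
                shard.tablet_ids.erase(info.tablet_id);
            }
        }
        return locks_taken;
    }

private:
    struct Shard {
        std::shared_mutex lock;
        std::set<std::uint64_t> tablet_ids;
    };
    std::array<Shard, 4> _shards;
};
```

With the guard in place, a call with an empty vector touches no shard lock at all, so the caller's hold time on its own outer lock no longer depends on how busy the readers are.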
70825ce846 [Feature] Support alias function (#6261)
Implement #6260.

Add alias function type.
2021-08-07 21:29:13 +08:00
2f5b06ae70 [Bug][Optimize] Fix race condition problem and optimize do_money_format function (#6350)
* [Bug][Optimize] Fix race condition problem and optimize do_money_format function

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-06 16:29:34 +08:00
b067bdcdd5 [Bug] fix overflow while convert fixed char to number (#6368)
Because Slice::size is an unsigned int64, the condition p >= 0 was always true.
2021-08-05 14:35:29 +08:00
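The pitfall named in b067bdcdd5 above is a general one for unsigned loop counters. The sketch below assumes, purely for illustration, that the affected code scanned a fixed-length char buffer backwards over its `'\0'` padding; `trimmed_length` is a hypothetical helper, not the Doris function.

```cpp
#include <cstddef>

// Length of a fixed-width CHAR value once trailing '\0' padding is dropped.
//
// A loop written as `for (std::size_t p = size - 1; p >= 0; --p)` cannot stop
// on its condition: p is unsigned, so `p >= 0` is always true, and `--p` at 0
// wraps around to the largest std::size_t value, indexing far out of bounds.
std::size_t trimmed_length(const char* data, std::size_t size) {
    std::size_t len = size;
    while (len > 0 && data[len - 1] == '\0') {
        --len;  // test `len > 0` before decrementing, so len never wraps
    }
    return len;
}
```

Counting a length down to zero, rather than an index down past zero, keeps the comparison meaningful for unsigned types and also handles an empty buffer.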
39ee97e95d [Doc] Add a description of the restriction of the materialized view on the use of the unique model (#6362)
Add a description of the restriction of the materialized view on the use of the unique model
2021-08-05 14:35:13 +08:00
de7376062a [Demo] Add flink mysql cdc to doris demo (#6352)
add flink mysql cdc to doris
2021-08-05 14:34:52 +08:00
b5d8ee35f5 [Demo] Add spring mybatis jdbc demo (#6349)
add spring mybatis jdbc demo
2021-08-05 14:34:34 +08:00
21f94c5d6c [BUG][ARRAY] fix array bound bug (#6347)
fix #6346
2021-08-05 14:34:12 +08:00
216295d1b8 [Demo] Add stream load demo (#6344)
add stream load demo
2021-08-05 14:33:39 +08:00
2f3cd0573a [Broker] Fix ugi confusion bug (#6325)
Use UserGroupInformation.loginUserFromKeytabAndReturnUGI instead of UserGroupInformation.loginUserFromKeytab in multiple principal scenario.
2021-08-05 14:33:18 +08:00
12730e7a3b [Broker] Fix bug that broker can not handle FSDataInputStream which does not implement ByteBufferReadable (#6308)
Currently, Doris supports loading OSS/S3A files by using params like fs.s3a.access.key, but loading such files hits a bug. The root cause is that the broker cannot handle an FSDataInputStream which does not implement ByteBufferReadable.

See Issue #6307
S3A input stream to support ByteBufferReadable
https://issues.apache.org/jira/browse/HADOOP-14603
2021-08-05 14:16:36 +08:00
2823e4daba [Feature] Support SHOW DATA SKEW stmt (#6219)
SHOW DATA SKEW FROM tbl PARTITION(p1)

to view the data distribution of a specified partition

```
mysql> admin show data skew from tbl1 partition(tbl1);
+-----------+-------------+-------+---------+
| BucketIdx | AvgDataSize | Graph | Percent |
+-----------+-------------+-------+---------+
| 0         | 0           |       | 100.00% |
+-----------+-------------+-------+---------+
1 row in set (0.01 sec)
```

Also modify the result of `admin show replica distribution` to add the replica size distribution:

```
mysql> admin show replica distribution from tbl1 partition(tbl1);
+-----------+------------+-------------+----------+------------+-----------+-------------+
| BackendId | ReplicaNum | ReplicaSize | NumGraph | NumPercent | SizeGraph | SizePercent |
+-----------+------------+-------------+----------+------------+-----------+-------------+
| 10002     | 1          | 0           | >        | 100.00%    |           | 100.00%     |
+-----------+------------+-------------+----------+------------+-----------+-------------+
```
2021-08-05 14:05:41 +08:00
a16ad3fccd [BUG] Fix bug in fe metric Http restful API (#6364)
Fix bug in fe metric Http restful API. see #6363
2021-08-05 00:16:55 +08:00
866814dc47 [Improve] Add FE function timestamp (#6339)
* Add FE function timestamp
2021-08-04 11:52:46 +08:00
d1007afe80 Use fmt and std::from_chars to make convert integer to string and convert string to integer more efficient (#6361)
* [Optimize] optimize the speed of converting integer to string

* Use fmt and std::from_chars to make convert integer to string and convert string to integer more efficient

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-04 10:55:19 +08:00
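For readers unfamiliar with `std::from_chars`, mentioned in d1007afe80 above, here is a small sketch of the string-to-integer direction. For the integer-to-string direction the commit names the fmt library; to keep this example free of third-party dependencies it swaps in the standard `std::to_chars` instead, which is an assumption of this sketch and not what the commit uses. `parse_int64` and `format_int64` are hypothetical helper names.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Parse a base-10 integer. Returns an empty optional on non-numeric input,
// trailing characters, or overflow. Note that std::from_chars accepts a
// leading '-' but neither a leading '+' nor leading whitespace.
std::optional<std::int64_t> parse_int64(std::string_view text) {
    std::int64_t value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return std::nullopt;
    }
    return value;
}

// Format an integer without locale or stream overhead.
// Stand-in for fmt: std::to_chars is used here to stay dependency-free.
std::string format_int64(std::int64_t value) {
    char buf[24];  // enough for the 20 characters of the smallest int64
    auto [ptr, ec] = std::to_chars(buf, buf + sizeof(buf), value);
    return ec == std::errc() ? std::string(buf, ptr) : std::string();
}
```

Both functions avoid locale lookups and heap allocation during the conversion itself, which is the usual reason they outperform stream-based or `strtol`-style conversions.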
16bc5fa585 [Bug] Fix wrong hash value for decimal values caused by violating C/C++ aliasing rules (#6348)
In the RuntimeFilter BloomFilter, a decimal column gets a wrong hash value because the code violates aliasing rules:
decimal12_t decimal = { 12, 12 };
murmurhash3(decimal) in bloom filter: 2167721464
expect: 4203026776
2021-08-03 12:00:03 +08:00
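The commit 16bc5fa585 above does not show the offending code, so the following is only a sketch of the usual remedy for aliasing violations in hash routines: read words with `std::memcpy` instead of dereferencing a cast pointer. `Decimal12` is a hypothetical layout modeled on the `decimal12_t` in the message, and `hash_words` is a simple FNV-1a style stand-in, not murmurhash3, so it will not reproduce the hash values quoted above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 12-byte decimal: 8-byte integer part plus 4-byte fraction.
struct Decimal12 {
    std::int64_t integer;
    std::int32_t fraction;
};

// Well-defined way to read 4 bytes from any object: copy them out.
// Dereferencing `reinterpret_cast<const std::uint32_t*>(p)` on memory that
// does not hold a uint32_t breaks the aliasing rules, and the optimizer may
// then reorder or drop the loads.
inline std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t word;
    std::memcpy(&word, p, sizeof(word));
    return word;
}

// Stand-in FNV-1a style mix over 32-bit words; NOT murmurhash3.
std::uint32_t hash_words(const void* data, std::size_t len) {
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i + 4 <= len; i += 4) {
        h = (h ^ load_u32(bytes + i)) * 16777619u;
    }
    return h;
}

std::uint32_t hash_decimal(const Decimal12& d) {
    // Copy the fields into a packed buffer so struct padding never reaches the hash.
    unsigned char buf[12];
    std::memcpy(buf, &d.integer, 8);
    std::memcpy(buf + 8, &d.fraction, 4);
    return hash_words(buf, sizeof(buf));
}
```

Compilers typically turn the fixed-size `std::memcpy` calls into plain loads, so the well-defined version usually costs nothing at runtime.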
2c208e932b [Bug][RoutineLoad] Avoid TOO_MANY_TASKS error (#6342)
Use `commitAsync` to commit offset to kafka, instead of using `commitSync`, which may block for a long time.
Also assign a group.id to the routine load job if the user has not specified the "property.group.id" property, so that all consumers of
this job use the same group.id instead of a random id for each consume task.
2021-08-03 11:59:06 +08:00
748604ff4f [RoutineLoad] Support alter broker list and topic for kafka routine load (#6335)
```
alter routine load for cmy2 from kafka("kafka_broker_list" = "ip2:9094", "kafka_topic" = "my_topic");
```

This is useful when the kafka broker list or topic has been changed.

Also modify `show create routine load` to support showing "kafka_partitions" and "kafka_offsets".
2021-08-03 11:58:38 +08:00
a2eb1b9d5b fix version (#6334) 2021-07-30 09:25:22 +08:00
9ca369aa58 [Feature][LDAP] Add LDAP authentication login and LDAP group authorization support. (#6333)
* [Feature][LDAP] Add LDAP authentication login and LDAP group authorization support.

* Update docs/.vuepress/sidebar/en.js

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2021-07-30 09:24:50 +08:00
cf1fcdd614 fix BE coredump in UserFunctionCache (#6331)
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-07-30 09:24:30 +08:00
14db74fac6 [Bug] Fix show load like match (#6314)
* fix show load like match

* Compatible with historical issues
2021-07-30 09:24:06 +08:00
6597a338dc [Feature] Support config max length of zone map index (#6293) 2021-07-30 09:23:11 +08:00
5a7237062f remove palo_internal_service.proto and PInternalService from code base, because it is not used now (#6341) 2021-07-30 09:22:50 +08:00
d416cd3e79 [Bug] Fix BE coredump when flushing memtable (#6317)
Fix #6316

If the memtable is larger than the max segment size, it is flushed into more than
one segment file, and a BE coredump is triggered during the flush.
2021-07-27 13:42:19 +08:00
1454aacd69 [Metric] Add metrics to monitor size of queued tasks in load thread pool (#6306)
(1) Add metrics to monitor the size of queued tasks in load thread pool.
(2) Change some log level to VLOG_NOTICE
2021-07-27 13:41:44 +08:00
776df2effc [BUG][stack-buffer-overflow] fix overflow while calculating hash code in ArrayType and fix some warnings 2021-07-27 13:41:00 +08:00
cdffe1ae20 [Doc] Modify the storage path configuration instructions in the installation and BE configuration documents (#6298) 2021-07-27 13:40:15 +08:00