5948 Commits

Author SHA1 Message Date
fa290383dc [Doc] Modify README to add some statistical indicators (#6486)
1. Add license/total line/release badges.
2. Add monthly active contributor and contributor growth graph
3. fix a pom.xml bug
4. Modify some routine load log on BE side
2021-08-25 09:36:26 +08:00
7e30b28f3a [Optimize] Speed up converting the data of other types to string in mysql_result_writer (#6384)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-24 22:30:58 +08:00
146060dfc0 [Bug]Fix result_writer may coredump (#6482)
Fix a possible core dump in result_writer by letting BufferControlBlock own the memory.
2021-08-22 22:04:00 +08:00
4ff6eb55d0 [FlinkConnector] Make flink datastream source parameterized (#6473)
Make the Flink DataStream source parameterized as List<?> instead of Object.
2021-08-22 22:03:32 +08:00
c71f58fef9 [Doc] Add sidebar for percentile doc (#6470) 2021-08-22 22:03:07 +08:00
0cf2bc6644 [Doc] Refactor all grammar help documents (#6337)
See #6336 for details
2021-08-22 22:02:51 +08:00
6c23f8d413 [Bug] Fix bug that check point load image failed in some circumstance (#6465)
Fix a bug where loading the checkpoint image failed in some circumstances.
2021-08-19 14:17:57 +08:00
52f39e3fde [Bug][SparkLoad]: bitmap value in or operator in spark load should be deep copied (#6453)
Fix multiple rollups holding the same reference to a bitmap value, which may then be updated repeatedly.
fix #6452
2021-08-19 14:17:31 +08:00
fa382f8602 [Bug][MemLimit] Modify the memory limit of storage page cache (#6451)
This CL mainly changes:

1. the `storage_page_cache_limit` is based on config `mem_limit`

    the default is 20% of `mem_limit`. 

2. the `buffer_pool_limit` is based on config `mem_limit`

    the default is 20% of `mem_limit`. 

3. the `buffer_pool_clean_pages_limit` is based on config `buffer_pool_limit`

    the default is 50% of `buffer_pool_limit`

4. Fix some display bugs in the LRU cache hit ratio and usage ratio
5. Fix a create view bug that `notEvalNondeterministicFunction` should be reset after analyze.
2021-08-19 14:16:53 +08:00
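
The default ratios listed in the commit above (#6451) can be worked through with a small calculation. This is only an illustrative sketch: the function name `derive_cache_limits` and the use of integer byte counts are assumptions for the example, not code from Doris.

```python
def derive_cache_limits(mem_limit_bytes: int) -> dict:
    """Apply the default ratios named in the commit message to a mem_limit.

    Hypothetical helper for illustration; BE computes these values internally.
    """
    # 20% of mem_limit for the storage page cache
    storage_page_cache_limit = mem_limit_bytes * 20 // 100
    # 20% of mem_limit for the buffer pool
    buffer_pool_limit = mem_limit_bytes * 20 // 100
    # 50% of buffer_pool_limit for clean pages
    buffer_pool_clean_pages_limit = buffer_pool_limit * 50 // 100
    return {
        "storage_page_cache_limit": storage_page_cache_limit,
        "buffer_pool_limit": buffer_pool_limit,
        "buffer_pool_clean_pages_limit": buffer_pool_clean_pages_limit,
    }


GIB = 1024 ** 3
limits = derive_cache_limits(100 * GIB)
for name, value in limits.items():
    print(f"{name}: {value / GIB:.0f} GiB")
```

With a `mem_limit` of 100 GiB, the two 20% limits come to 20 GiB each and the clean pages limit to 10 GiB.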
c65ec3136b [Improvement] spark load without agg and de/serialization (#6270)
fix #6269 

These changes reduce memory usage to avoid OOM in BE and speed up the calculation.
1. We do not need to do Aggregation in load, which has already been done in the ETL spark job.
2. Based on 1, we do not need to serialize/deserialize bitmap/HLL objects.
2021-08-19 14:15:01 +08:00
4ea2fcefbc [Improve]The connector supports spark 3.0, flink 1.13 (#6449)
Modify the flink/spark compilation documentation
2021-08-18 15:57:50 +08:00
66a7a4b294 [Feature] Support exact percentile aggregate function (#6410)
Support calculating the exact percentile value array of numeric column `col` at the given percentage(s).
2021-08-18 15:56:06 +08:00
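
To make the idea of an exact percentile in the commit above (#6410) concrete, here is a minimal sketch. The commit message does not state the interpolation rule or the range of the percentage argument, so linear interpolation between the two closest ranks and percentages in [0, 1] are both assumptions; the function names are hypothetical.

```python
def exact_percentile(values, p):
    """Exact percentile of `values` at fraction `p` in [0, 1].

    Assumes linear interpolation between the two closest ranks;
    the rule used by Doris is not given in the commit message.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must be in [0, 1]")
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)           # fractional rank
    lo = int(pos)                          # rank just below (pos >= 0)
    hi = min(lo + 1, len(ordered) - 1)     # rank just above, clamped
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def exact_percentile_array(values, percentages):
    """Percentile values for several percentages at once."""
    return [exact_percentile(values, p) for p in percentages]


print(exact_percentile_array([1, 2, 3, 4], [0.25, 0.5, 0.75]))
```

Unlike an approximate sketch-based percentile, this keeps and sorts every value, which is what makes the result exact and also what makes it memory-hungry on large columns.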
9148bcb673 [Build] Reduce the parallel of build (#6469) 2021-08-18 15:24:19 +08:00
Pxl
999eaeb276 fix Wrong use on SCOPED_RAW_TIMER (#6459) 2021-08-18 09:06:18 +08:00
0c5c3f7d87 Fixed the problem that there may be redundant retries when the query result export fails (#6436) 2021-08-18 09:06:02 +08:00
8738ce380b Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391) 2021-08-18 09:05:40 +08:00
2f90aaab8e [Doc] flink/spark connector: add sources/javadoc plugins (#6435)
spark-doris-connector and flink-doris-connector add plugins to generate javadoc and sources jars,
so they are easier to distribute and debug.
2021-08-16 22:41:24 +08:00
b13e512a65 [Feature] Support spark connector sink data to Doris (#6256)
Support the Spark connector writing DataFrames to Doris.
2021-08-16 22:40:43 +08:00
63a0d9d23a Add statistics struct and Support manually inject statistics (#6420)
* Add statistics struct and Support manually inject statistics

This PR mainly developed the data structure used by statistical information
and the function of manually modifying the statistical information.
We use a statistics package alone to store statistical information,
and use the 'statistics manager' as a unified entry for statistical information.
For detailed data structure and explanation, please refer to the comments on the class.

Manual modification of statistics covers table statistics and column statistics.
The syntax is explained in the issue #6370.

* Show table and column statistics

'SHOW TABLE STATS' is used to show the statistics of a table.
'SHOW COLUMN STATS' is used to show the statistics of columns.

Currently, only the tables and columns for setting statistics
will be displayed in the results.
2021-08-16 17:20:05 +08:00
4be06a470f fix typo: dynamic_partitoin -> dynamic_partition (#6445) 2021-08-16 09:17:57 +08:00
285d44cd48 [BUG] Fix potential overflow exception when do money format for double (#6408)
* [BUG] Fix potential overflow bug when do money format for double

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-15 18:40:26 +08:00
2030c44dba [Log] Modify some log level on BE side (#6381) 2021-08-14 10:25:45 +08:00
42fedc0a56 [Docs] Support json file format in routine load doc (#6439) 2021-08-14 10:25:06 +08:00
34af66bf1d [BUG][Memory] fix memory tracker DCHECK fail in debug mode and Fix Process Memory limit fail (#6438) 2021-08-14 10:24:33 +08:00
d9cc235d6d Fix typo pdqosrt (#6441) 2021-08-14 10:24:12 +08:00
6f6d50a484 fix typo: '分许'->'分离' (#6440) 2021-08-14 10:22:28 +08:00
5e6f1b89da [Feature] Support sql block rule (#6192)
Support grammar:
- SHOW SQL_BLOCK_RULE [FOR NAME]
- CREATE SQL_BLOCK_RULE test_rule PROPERTIES ("user"="default", "sql"="select .* from .* join .*", "enable"="true");
- ALTER SQL_BLOCK_RULE test_rule PROPERTIES ("user"="test_user", "enable"="false");
- DROP SQL_BLOCK_RULE test_rule1,test_rule2;
2021-08-13 21:56:34 +08:00
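
The rule in the commit above (#6192) pairs a user with a regular expression over SQL text. A rough sketch of how such a rule could be evaluated follows. The class `SqlBlockRule`, the full-match and case-insensitive semantics, and the treatment of the user "default" as "applies to every user" are all assumptions made for illustration; the commit message does not specify them.

```python
import re
from dataclasses import dataclass


@dataclass
class SqlBlockRule:
    """Hypothetical in-memory model of a SQL block rule."""
    name: str
    user: str          # assumption: "default" applies to every user
    sql: str           # regular expression matched against the statement
    enable: bool = True

    def blocks(self, user: str, statement: str) -> bool:
        """Return True if this rule would reject `statement` from `user`."""
        if not self.enable:
            return False
        if self.user != "default" and self.user != user:
            return False
        # Assumption: the whole statement must match, ignoring case.
        flags = re.IGNORECASE | re.DOTALL
        return re.fullmatch(self.sql, statement, flags) is not None


rule = SqlBlockRule("test_rule", "default", "select .* from .* join .*")
print(rule.blocks("alice", "select a from t1 join t2 on t1.k = t2.k"))
print(rule.blocks("alice", "select a from t1"))
```

Setting `enable` to false, as in the ALTER example, turns the rule off without dropping it.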
240dd9b110 fix typo: '一下' -> '以下' (#6434) 2021-08-13 12:18:42 +08:00
671d8f6af8 [Bug] Return error if user failed to pause/resume a certain routine load. (#6426)
When operating on a single job, maintain the same behavior as before.
This problem was introduced by #6394.
2021-08-12 11:50:57 +08:00
a7d620c359 fix issue6390,wrong result in query view with an added null column (#6395) 2021-08-12 10:08:58 +08:00
Pxl
8a267f1ac5 [Feature] Support for cleaning the trash actively (#6323) 2021-08-12 10:07:51 +08:00
047e31d977 [Repository] Normalize the path of file on repository (#6402) 2021-08-12 09:41:34 +08:00
8ca86824df fix compile error (#6425)
Change-Id: If022dd00f00772166096483ee1d82f2cd34e0dec

Co-authored-by: qijianliang01 <qijianliang01@baidu.com>
2021-08-12 09:34:28 +08:00
Pxl
9a6e53a7f8 [Bug] fix wrong title at 'show trash' (#6407) 2021-08-11 16:39:21 +08:00
f6bcabe0d1 [Bug] Fixed bug that caused export and backup to fail when principal keytab file was created failed (#6404)
Co-authored-by: Geoffrey <gaofeng01@rd.netease.com>
2021-08-11 16:39:01 +08:00
3fa3dfbeda [Bug][Fold constant] remove reanalyze in get constant expr (#6400)
fix #6399
2021-08-11 16:38:30 +08:00
708b6c529e [RoutineLoad] Support pause or resume all routine load jobs (#6394)
1. PAUSE ALL ROUTINE LOAD;
2. RESUME ALL ROUTINE LOAD;
2021-08-11 16:38:06 +08:00
7e93405df3 [Alter] Support alter table and column's comment (#6387)
1. alter table tbl1 modify comment "new comment";
2. alter table tbl1 modify column k1 comment "k1", modify column v1 comment "v1";
2021-08-11 16:37:42 +08:00
9216735cfa [New Feature] Support Vectorization Execution Engine Interface For Doris (#6329)
1. FE vectorized plan code
2. Function register vec function
3. Diff function nullable type
4. New thirdparty code and new thrift struct
2021-08-11 14:54:06 +08:00
1a5b03167a [Doc] Add document for datax and sample codes (#6389)
Add documents for datax in extension catalog.
Add documents for samples in best-practice catalog.
2021-08-11 11:51:13 +08:00
10f410f1c3 [Improvement] Improve metrics text format for FE (#6382) (#6383)
Fix #6382
2021-08-11 10:26:19 +08:00
0930e89452 [http][manager] Add manager related http interface. (#6396)
Encapsulate some http interfaces for better management and maintenance of doris clusters.

The http interface includes getting cluster connection information, node information, node configuration information, batch modifying node configuration, and getting query profile.

For details, please refer to the document:  
`docs/zh-CN/administrator-guide/http-actions/fe/manager/`
2021-08-10 10:58:31 +08:00
636b30b1d1 [Bug] Fix be core when failed to add batch (#6388)
Fix be core when failed to add batch
2021-08-10 10:57:57 +08:00
Qi
5f7c7ce743 [Bug][Cache] Map.get with cache key real value. (#6377) 2021-08-10 10:14:46 +08:00
929b33ac0a [DataX] doriswriter support csv (#6373)
Make the DataX doriswriter support the csv format. Csv is simpler and faster than
json when the data is simple.

Add property format: csv/json
Add property column_separator: takes effect when format is csv, for example "\x01", "^", etc.
2021-08-10 10:14:21 +08:00
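
The difference between the two formats in the commit above (#6373) can be sketched as follows. This is not the doriswriter implementation: the function `encode_rows`, its tab default for the separator, and the lack of escaping or null handling are assumptions made to keep the example short.

```python
import json


def encode_rows(rows, columns, fmt="json", column_separator="\t"):
    """Encode rows as a load payload in the given format.

    Hypothetical helper; the real writer's handling of nulls and
    escaping is not described in the commit message.
    """
    if fmt == "csv":
        # One line per row, fields joined by the configured separator.
        return "\n".join(
            column_separator.join(str(value) for value in row) for row in rows
        )
    if fmt == "json":
        # One object per row, repeating every column name.
        return json.dumps([dict(zip(columns, row)) for row in rows])
    raise ValueError(f"unsupported format: {fmt}")


rows = [(1, "a"), (2, "b")]
columns = ["id", "name"]
csv_payload = encode_rows(rows, columns, fmt="csv", column_separator="\x01")
json_payload = encode_rows(rows, columns, fmt="json")
print(len(csv_payload), len(json_payload))
```

The csv payload carries only values and separators, while the json payload repeats the column names in every row, which is one reason csv can be smaller and cheaper to parse for simple data. A separator such as "\x01" is useful because it rarely occurs inside field values.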
35c8b6a0bf [DOC] Update dynamic-partition.md (#6371)
Update dynamic-partition.md
The default value of dynamic_partition_check_interval_seconds is 600 in source code.
2021-08-10 10:13:45 +08:00
312dc83118 [Bug][BloomFilter] Fix bloom filter null flag (#6367)
Fix #6366 

There is a bloom filter for each data page in a column that has a bloom filter index.
The `_has_null` flag helps to judge whether `null` exists in a data page.
If a `null` value is added to a data page, `_has_null` is set to `true`.
After the bloom filter for a data page is finished, `_has_null` should be reset to `false` to prepare for the next data page.
2021-08-10 10:13:30 +08:00
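
The per-page reset described in the commit above (#6367) can be illustrated with a simplified builder. The class `PageBloomFilterBuilder` is hypothetical, and a plain Python set stands in for the real bloom filter bits; only the handling of the `_has_null` flag mirrors the commit message.

```python
class PageBloomFilterBuilder:
    """Toy per-page index builder; a set stands in for the bloom filter."""

    def __init__(self):
        self._values = set()
        self._has_null = False
        self.finished_pages = []   # list of (values, has_null) per page

    def add(self, value):
        if value is None:
            # Remember that this page contains at least one null.
            self._has_null = True
        else:
            self._values.add(value)

    def finish_page(self):
        self.finished_pages.append((frozenset(self._values), self._has_null))
        # Reset all per-page state, including the null flag. Skipping the
        # flag reset would leak a null marker into every later page.
        self._values = set()
        self._has_null = False


builder = PageBloomFilterBuilder()
builder.add(None)
builder.add("a")
builder.finish_page()
builder.add("b")
builder.finish_page()
print([has_null for _, has_null in builder.finished_pages])
```

Without the reset in `finish_page`, the second page would be reported as containing a null even though only "b" was added to it.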
bf616dcb8f [Config] Add default configuration of load_parallelism (#6290)
- Make load_parallelism configurable. 
- Different clusters should be configured with different load_parallelism values.
- Some users don't know how to set load_parallelism or what the best value is.
2021-08-10 10:11:46 +08:00
Pxl
236e0f1eda [Feature] Support for querying the trash used capacity (#6247)
Support for querying the trash used capacity.

```
SHOW TRASH [ON ...]
```

Users can now proactively scan the trash directory.
2021-08-10 10:10:47 +08:00
d9fc1bf3ca [Feature]:Flink-connector supports streamload parameters (#6243)
Flink-connector supports streamload parameters
#6199
2021-08-09 22:12:46 +08:00