Commit Graph

6608 Commits

Author SHA1 Message Date
904a32c758 [docs] fix 0.14 release date in download page (#7253)
The release date of 0.14 on the download page was wrong.
2021-11-30 15:00:36 +08:00
9b3c834396 [docs](release) Update download page to add release 0.15 (#7244)
Also modify some steps in release processing document
2021-11-29 16:06:32 +08:00
91a3150910 [fix](reader) Fix the bug that reader calls _capture_rs_readers function twice (#7224) 2021-11-26 10:17:33 +08:00
baa5d6089f [fix](alter) Fix bug that partition column of a unique key table can be modified (#7217)
The partition columns can not be modified.
2021-11-26 10:16:01 +08:00
948a2a738d [performance] Improve DeltaWriter's performance. (#7216)
1. Support batch write for DeltaWriter.
2. Use mutex instead of SpinLock.
2021-11-26 10:15:27 +08:00
178fda593d [docs] Refine documents for commit message tags. (#7215) 2021-11-26 10:14:39 +08:00
52cd12a1f9 [fix](planner) fix preaggregation reason error (#7205)
This PR fixes #7204.
2021-11-26 10:13:53 +08:00
a1bf2878c0 [feat-opt](json-function) optimize get_json_xx function (#7157)
Avoid repeatedly parsing the JSON string if the first parameter of the function is constant.
2021-11-26 10:12:55 +08:00
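The idea behind a1bf2878c0 can be illustrated with a minimal Python sketch. It is not the Doris implementation: the class name `ConstJsonGetter`, its methods, and the use of a plain key instead of a JSON path are all assumptions made for illustration. It only shows that a constant JSON argument needs to be parsed once rather than once per row.

```python
import json


class ConstJsonGetter:
    """Hypothetical evaluator for a get_json_xx-style call.

    If the JSON argument is constant, it is parsed once up front;
    otherwise it is parsed again for every row.
    """

    def __init__(self, constant_json=None):
        self.parse_calls = 0  # counts how often JSON text is parsed
        self._is_constant = constant_json is not None
        self._cached = self._parse(constant_json) if self._is_constant else None

    def _parse(self, text):
        self.parse_calls += 1
        return json.loads(text)

    def get(self, key, row_json=None):
        # Reuse the cached document when the JSON argument is constant.
        doc = self._cached if self._is_constant else self._parse(row_json)
        return doc.get(key)


getter = ConstJsonGetter('{"a": 1, "b": 2}')
values = [getter.get("a"), getter.get("b"), getter.get("a")]
```

Here three lookups against the constant document trigger a single parse, which `getter.parse_calls` records.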
70670b5a42 [feat-wip](lateral-view) Pruning output slot of TableFunctionNode (#7148)
Once the lateral view function has been evaluated,
its result is returned directly to the upper layer,
which can cause a lot of memory copying and network transmission.
The reason is that the original column participating
in the lateral view is very likely to hold a very long value.
If Doris still retains this column after evaluating the lateral view,
it needs to perform a memory copy.
However, in many cases the upper plan node does not need the original columns of the lateral view,
so column pruning is performed after the lateral view is evaluated
to avoid useless memory copies and network transmission.
For example, the following query can prune the original column v1:

```select k1, e1 from table lateral view explode_split(v1, ",") tmp as e1;```

The `outputSlotIds` in TableFunctionNode is used to store the columns that should be retained after pruning.
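
As a rough illustration of this pruning (a Python sketch, not the Doris planner code), the function below expands each row on a split column and emits only the columns listed in `output_slots`, which plays the role of `outputSlotIds`. The function name, its parameters, and the dict-based row representation are assumptions made for the example.

```python
def explode_split_pruned(rows, split_col, sep, alias, output_slots):
    """Expand each row on split_col and emit only the columns in output_slots."""
    result = []
    for row in rows:
        for part in row[split_col].split(sep):
            # Build the output row from the retained slots only, so the
            # (possibly very long) original column is never copied.
            out = {name: row[name] for name in output_slots if name != alias}
            if alias in output_slots:
                out[alias] = part
            result.append(out)
    return result


rows = [{"k1": 1, "v1": "a,b"}, {"k1": 2, "v1": "c"}]
pruned = explode_split_pruned(rows, "v1", ",", "e1", ["k1", "e1"])
```

With `output_slots` set to `k1` and `e1`, the original column `v1` does not appear in any output row.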

* Support scalar function in lateral view

The first child (child 0) of the explode_split function can be a scalar function,
such as: concat(k1, ",", k2)

This PR mainly checks whether a lateral view that uses a function satisfies the following semantic rules:
1. The columns in the function must all belong to the original table.
2. The function must be a scalar function.
2021-11-26 10:10:05 +08:00
Pxl
2445f10868 [fix](bitmap-function) fix core dump at some bitmap function (#7221) 2021-11-25 22:52:50 +08:00
c9e578032b Optimize bitmap function count by using the roaring cardinality method, which is faster than the current version (#7151) 2021-11-24 14:42:48 +08:00
b6a9207a25 [deps](brpc) fix compile bug that could not find protobuf lib during compile (#7197) 2021-11-24 10:44:26 +08:00
fb5adaf18e [fix](mem-tracker) Fix mem limit -1 in partition aggregate node (#7181)
Make the error message clearer.
2021-11-24 10:43:35 +08:00
3fd8148100 [doc] Add build-dev image 1.4.2 to compilation document (#7174)
Add build-dev image 1.4.2 to compilation document
2021-11-24 10:42:52 +08:00
5a8591aaf0 [doc] add FAQ document (#7173)
From the Apache Doris WeChat account, with authorization.
2021-11-24 10:42:33 +08:00
Pxl
3fcb3db57a [fix](vectorized-engine) fix core dump when enable_vectorized_engine is on (#7159) 2021-11-24 10:42:12 +08:00
e74bfea8e4 [chore](clang-format)(license-eye) Add Clang Format/Skywalking eyes github action (#7132)
1. The clang format action will be triggered when a PR is submitted.
2. Skywalking eyes actions will be triggered when a PR is submitted and after merging to master branch.
2021-11-24 10:41:02 +08:00
3b988204fc [doc] Modify the wrong comment of the ScanTime (#7109)
Modify the wrong comment of the ScanTime.
2021-11-24 10:40:00 +08:00
Pxl
a74fdf184c [refactor](be) refactor predicate function creator (#7054)
Refactor predicate function creator, make MinMaxFunction/HybridSet/BloomFilter
use a unified interface through template to get function.
2021-11-24 10:39:29 +08:00
d3c020b3cb [feat-opt](fe-config) Add tablets number limit to avoid wrong usage (#7025)
1. Add new FE config `default_db_replica_quota_size`
2. Check replica quota after creating a table/partition
2021-11-24 10:37:54 +08:00
4b45b806da [doc] Created commit-format-specification.md (#7190)
We found that many commit messages submitted at present contain ambiguous information.
Clear commit messages make pull requests more readable for developers,
easier for committers to merge, and easier for the Release Manager to release.

Therefore, we have drafted a commit format specification.
We hope that future contributors will write their commit messages according to
the specification when submitting pull requests.
2021-11-24 10:30:54 +08:00
d420ff0afd Display current load bytes to show load progress (#7134)
This value may be greater than the file size when loading
Parquet or ORC files, and less than the file size when loading
CSV files.
2021-11-24 10:08:32 +08:00
e2d3d0134e Add a method to get Doris current memory usage (#6979)
Add a check on overall memory usage when TryConsume is called.
2021-11-24 10:07:54 +08:00
ad0d2b82ab [fix](memory) fix bug that ~BitShufflePageDecoder destroys uninitialized chunk (#7172)
Added a safe way to destroy Chunk.
2021-11-23 15:24:25 +08:00
ce7fa5d6d9 [typo] Update multi-tenant.md (#7162)
A double quote is missing
2021-11-22 14:47:00 +08:00
836c95c2ca [feat](memory-track) Print peak memory use of all backend after query in audit log (#7030)
Add a new field `peakMemoryBytes` in fe.audit.log
2021-11-22 14:46:08 +08:00
07296a301b [chore](fe) Fix build error caused by Inaccessible pentaho-aggdesigner-algorithm jar (#7161) 2021-11-20 21:48:26 +08:00
fcd4f0b5c2 [fix](profile) fix some bugs about ReportProfile on BE (#7144)
1. Setting _report_thread_active to false does not need to be protected by _report_thread_lock, because
_report_thread_active is a bool, and writing data is thread-safe when its size is <= the machine word length.

2. The report_profile thread may terminate early. In report_profile(), the loop while (_report_thread_active) may
exit if _report_thread_active is false: the thread calling open() may be scheduled out between
_report_thread_started_cv.wait(l) and _report_thread_active = true, and we should not make assumptions about how much
time elapses between two schedulings of a thread.
2021-11-20 21:43:57 +08:00
49eac402e3 [fix](export) fix export retry error (#7143)
fix #7142
Clear the export status `alreadySentBackendIds` before the Coordinator retries the Export task.
2021-11-20 21:41:53 +08:00
a81f4da4e4 [feat](minidump) Add minidump support (#7124)
A minidump file is now created when BE crashes.
Users can also manually trigger a minidump by sending SIGUSR1 to the BE process.

More details can be found in the minidump.md document.
2021-11-20 21:41:26 +08:00
a88541d2d4 [refactor] extract duplicate code to writePropertiesToFile (#7119)
Extract duplicate code to writePropertiesToFile in org/apache/doris/persist/Storage.java
2021-11-20 21:40:50 +08:00
143d3769b1 [feat](config) add FE config to limit the replica num per tablet (#7087) 2021-11-20 21:40:23 +08:00
52ebb3d8f5 [feat](mysql-compatibility) Increase compatibility with mysql (#7041)
Increase compatibility with MySQL:
  1. Added two system tables, files and partitions.
  2. Improved the return logic of MySQL error codes to make them more compatible with MySQL.
  3. Added the lock/unlock tables statement and the show columns statement for compatibility with mysqldump.
  4. Made Doris compatible with the mysqldump tool; you can now use mysqldump to dump data and table structure from Doris.

Currently, mysqldump may print an error message like:
```
$ mysqldump -h127.0.0.1 -P9130 -uroot test_query_qa > a
mysqldump: Error: 'errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `EXTRA`' when trying to dump tablespaces
```

This error message does not affect the exported file; you can add `--no-tablespaces` to avoid it.
2021-11-20 21:39:37 +08:00
e9282205f1 [feat-opt](spark-load) support bitmap binary data from hive in spark load (#6883)
Support to load the binary data of bitmap value from Hive into Doris.
fix #6461
2021-11-20 21:38:38 +08:00
1238f8de46 [fix](auth) do not allow drop or create root user (#7140)
root user should not be dropped or created
2021-11-18 14:39:33 +08:00
94fa6db196 [feat-opt](binlog-load) add how to open binlog load to the error message (#7138) 2021-11-18 14:38:42 +08:00
4f7d7a52bd [refactor] remove unused code (#7137)
Remove unused code in ImportAction.java
2021-11-18 14:37:31 +08:00
b842b282b5 [typo] fix typo in CONTRIBUTING.md and CONTRIBUTING_CN.md (#7136)
"Apache" is incorrectly spelled as "Aapche"
2021-11-18 14:35:29 +08:00
00e653e812 [deps] add breakpad for minidump (#7128)
Related #7123
2021-11-18 14:34:38 +08:00
eaebe6a40b [typo] correct getLogger argument (#7127) 2021-11-18 14:33:54 +08:00
be89f0f77e [feat-opt](routine-load) Support show offset lag in show routine load stmt (#7114)
Add a new field `Lag` in result of `show routine load` stmt.

`Lag: {"0":10, "1":0}` means Kafka partition 0 is 10 messages behind and partition 1 is up to date.
2021-11-18 14:31:16 +08:00
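The `Lag` value added in be89f0f77e is a JSON object keyed by Kafka partition, so a client can post-process it. The following Python sketch assumes the field arrives as a JSON string in the format shown in the commit message; the helper name `lagging_partitions` is hypothetical.

```python
import json


def lagging_partitions(lag_json):
    """Return {partition: messages_behind} for partitions that are not caught up."""
    lag = json.loads(lag_json)
    return {int(partition): behind for partition, behind in lag.items() if behind > 0}


behind = lagging_partitions('{"0":10, "1":0}')
```

For the example value from the commit message, only partition 0 is reported as lagging.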
74e8264c48 [fix](session-var) Fix the incompatibility of sql mode between Doris and MySQL (#7108)
Introduced by PR #4359.

VariableMgr.fillValue() should not be called in ExpressionFunctions.eval(),
because it has already been called once in SysVariableDesc.analyzeImpl().

If VariableMgr.fillValue() is called twice, the type of SysVariableDesc becomes BigInt,
which is incorrect.
2021-11-18 14:30:31 +08:00
9487899047 Update members.md (#7115)
Update the Committer List
2021-11-17 14:38:48 +08:00
01c5ef2f05 [Doc] Update members.md (#7133)
Add some Committers to the list; they were voted in as Committers in the last half year.
The Chinese documents were updated in #7115.
2021-11-17 14:38:15 +08:00
36360ba846 [BUG] fix profile not working with sql_cache enabled (#7105)
Fix the profile not working when sql_cache is enabled; a NullPointerException was thrown.
The reason is that the Coordinator is null during profile initialization when the cache is enabled.
Therefore, cache hits and cache misses need different profile processing to avoid the null pointer.

Fixed #7104
2021-11-17 14:38:00 +08:00
f5a35c28e9 [Optimize] [Memory] BitShufflePageDecoder use memory allocated by ChunkAllocator instead of Faststring (#6515)
BitShufflePageDecoder reuses the memory for storing decoder results and allocates memory directly from the
`ChunkAllocator`, which improves performance to a certain extent.

In the case of #6285, the total time consumption is reduced by 13.5%, and the share of time spent in `~Reader()`
is reduced from 17.65% to 1.53%. Memory allocation is also unified under `ChunkAllocator` for centralized
management, which is conducive to subsequent memory optimization.

This can avoid the memory waste caused by `Mempool`, because a chunk can be freed at any time, but the
performance is lower than allocating from `Mempool`. The guess is that, without `Mempool`'s secondary
allocation from large chunks, a large number of small chunks are requested directly from `ChunkAllocator`, and
locking in `pop_free_chunk` and `push_free_chunk` takes longer (although the flame graphs of BE's CPU and
contention do not confirm this).
2021-11-17 11:20:21 +08:00
7b712925fc [Lateral View] Multi lateral views map one TableFunctionNode (#7000)
1. Forbid non-string columns as parameters of explode_view.
The first parameter of explode_view must be a string column (VARCHAR/CHAR/STRING).

2. N-1: n lateral views map to one TableFunctionNode.
The TableFunctionNode includes all of the fnExprs that belong to one table.
For example:
select pageid,mycol1, mycol2 from pageAds
    lateral view explode_string(col1) myTable1 as mycol1
    lateral view explode_string(col2) myTable2 as mycol2;
TableFunctionNode
|----
|- fnExprList: explode_string(col1), explode_string(col2)
2021-11-17 11:13:08 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr with std::shared_ptr
2. replace all boost::scoped_ptr with std::unique_ptr
3. replace all boost::scoped_array with std::unique_ptr<T[]>
4. replace all boost::thread with std::thread
2021-11-17 10:18:35 +08:00
4bc5ba8819 Mark the load job as failed when more than half of a tablet's replicas fail to write (#7126)
Previously, the code checked whether more than half of all replicas had failed to write.
2021-11-17 10:18:04 +08:00
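The per-tablet majority rule from 4bc5ba8819 can be sketched in Python as follows. This is an illustration under assumptions, not the Doris code: the function names and the `(replica_count, failed_count)` tuple representation are hypothetical.

```python
def tablet_write_failed(replica_count, failed_count):
    """A tablet fails when more than half of its replicas failed to write."""
    return 2 * failed_count > replica_count


def load_job_failed(tablets):
    """tablets is a list of (replica_count, failed_count) pairs, one per tablet."""
    return any(tablet_write_failed(total, failed) for total, failed in tablets)


# One tablet lost 2 of 3 replicas: the job fails, even though only
# 2 of the 9 replicas overall failed.
job_a = load_job_failed([(3, 2), (3, 0), (3, 0)])
# Every tablet lost 1 of 3 replicas: each still has a majority, so the job passes.
job_b = load_job_failed([(3, 1), (3, 1), (3, 1)])
```

The first case shows why counting failures across all replicas is not equivalent to checking each tablet.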
dcad6ff5e5 [License] Add License header for missing files (#7130)
1. Add License header for missing files
2. Modify the spark pom.xml to correct the location of `thrift`
2021-11-16 18:37:54 +08:00