5948 Commits

Author SHA1 Message Date
667eac9b7d Utility-Statements SQL Help (#8952)
Utility-Statements SQL Help
2022-04-12 08:44:16 +08:00
067309c466 [fix](compile) fix compilation bug (#8950) 2022-04-11 13:12:34 +08:00
0b1b3e225d Revert "[Thirdparty]Add llvm for codegen (#8938)" (#8948)
This reverts commit 32133621c69a2d7544549c5ea54ed6d9de60415e.

Reverts #8938
LLVM requires GLIBC_2.15, so the LLVM feature will live on a separate branch for now.
Once the issue with older glibc versions is resolved, it will be merged back to master.
2022-04-11 13:03:16 +08:00
Pxl
8a066e2586 [fix](vectorized) core dump on ST_AsText (#8870) 2022-04-11 09:39:32 +08:00
8158b05ea0 [fix] Fix bug that tablet data size and row num info are failed to report. (#8945)
Introduced from #8146
2022-04-11 09:38:28 +08:00
fd054ca2f6 [doc](java-udf) add docs for Java UDF (#8944) 2022-04-11 09:37:48 +08:00
2abb9c1bca [doc](readme) Add Spark / Flink Connector (#8943)
Add Spark / Flink Connector
2022-04-11 09:37:14 +08:00
7f7172807f [feature](function)(vectorized) Support all geolocation functions on vectorized engine (#8846) 2022-04-11 09:36:53 +08:00
0d761f9909 [feature-wip][UDF][DIP-1] Support variable-size input and output for Java UDF (#8678)
This feature is proposed in DSIP-1. This PR supports variable-length input and output for Java UDFs.
2022-04-11 09:36:16 +08:00
174e22b9f0 [feature](github-action) add scope labeler (#8935) 2022-04-10 23:06:03 +08:00
936b942e3a [fix](error-code) replace invalid format specifier (#8940)
change %lu and %ld to %d
2022-04-10 20:37:32 +08:00
32133621c6 [Thirdparty]Add llvm for codegen (#8938) 2022-04-10 20:37:09 +08:00
6ed59bb98b [refactor](code_style) remove useless inline #8933
1. Member functions defined inside a class are implicitly inline, so the keyword does not need to be added.
2. inline is a keyword used for implementation, which has no effect when placed before the function declaration.
2022-04-10 18:29:55 +08:00
1fe4ea4c7c [Refactor-step1] Add OLAPInternalError to status (#8900) 2022-04-10 00:16:43 +08:00
71aedb994e [refactor][doc] add data-table doc (#8927) 2022-04-09 19:18:44 +08:00
1ee8633e5e [fix](account) use LOG.info instead of LOG.debug (#8911)
This complements (#8849)
2022-04-09 19:18:13 +08:00
5706679e08 [fix] fix the problem that using tsan to compile,BE will stack overflow when start (#8904)
Currently, TSAN builds can only be compiled with Clang, not GCC.
When compiling with -O0, a stack overflow occurs at startup (issue #8868).
A function definition is also reported missing at compile time, so the file provided in PR #8665 is required.
2022-04-09 19:17:28 +08:00
f28ad36c02 [test][improvement] support execute multiple sql in sql file (#8902)
The regression testing framework now supports executing multiple SQL statements from a single SQL file.
2022-04-09 19:15:53 +08:00
ce6b5169c2 [fix](join) Fix error bucket num get in bucket shuffle join in dynamic partition (#8891) 2022-04-09 19:11:44 +08:00
2c1c7f40b6 [refactor][doc] Add data backup, data restore and data delete recovery (#8865)
1. Add data backup doc
2. Add data restore doc
3. Add data delete recovery doc
2022-04-09 19:04:57 +08:00
1de0ea2dc4 [refactor][doc] Added documentation for advanced usage section (#8826)
1. Materialized view
2. Schema Change
3. Dynamic Partition
4. Bucket Shuffle Join
5. Colocation Join
6. Runtime Filter
7. Partition cache
8. Orthogonal BITMAP calculation
9. Variable
10. Time zone
11. File Manager
2022-04-09 19:03:43 +08:00
a290104966 [fix](routine load) Routine load task doesn't reallocate when previous BE is down. (#8824)
If the previous BE is not alive, the task should be assigned to another available BE instead.
2022-04-09 19:02:55 +08:00
0f10f84075 [refactor][doc] Add update-delete documentation (#8821) 2022-04-09 19:02:16 +08:00
ddf7ef9327 [improvement](join) update broadcast join cost algorithm (#8695)
The broadcast join cost is currently computed from the compressed data size,
so the amount of memory actually used may be significantly more than estimated.
This patch:
1. Adds a compression ratio to the broadcast join cost, set to 5 based on experience.
2. Adds a new session variable `auto_broadcast_join_threshold` that limits the memory used by broadcast, in bytes. The default value is 1073741824 (1 GB).
2022-04-09 19:00:27 +08:00
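The estimate described in ddf7ef9327 can be sketched roughly as follows. This is a minimal illustration, not the planner code: only the ratio of 5 and the 1 GB default come from the commit message, while the function names and the way the two values are combined are assumptions.

```python
# Values taken from the commit message.
COMPRESSION_RATIO = 5                        # "set to 5 according to the experience"
AUTO_BROADCAST_JOIN_THRESHOLD = 1073741824   # bytes, default of the session variable


def estimated_broadcast_memory(compressed_bytes: int,
                               ratio: int = COMPRESSION_RATIO) -> int:
    """Estimate the in-memory size of the broadcast side (hypothetical helper).

    The compressed on-disk size is scaled by the compression ratio, because
    the data takes more room once it is decompressed in memory.
    """
    return compressed_bytes * ratio


def can_broadcast(compressed_bytes: int,
                  threshold: int = AUTO_BROADCAST_JOIN_THRESHOLD) -> bool:
    """Assumed check: broadcast only if the estimate fits under the threshold."""
    return estimated_broadcast_memory(compressed_bytes) <= threshold


if __name__ == "__main__":
    mib = 1024 * 1024
    print(can_broadcast(100 * mib))  # 500 MiB estimated, under 1 GB
    print(can_broadcast(300 * mib))  # 1500 MiB estimated, over 1 GB
```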
2059e88d43 [fix][doc] remove non-exist outfile.md (#8913) 2022-04-08 23:24:10 +08:00
Pxl
453485abfb [Bug] Fix some bugs(rewrite rule/symbol transport) of like predicate (#8770) 2022-04-08 14:32:09 +08:00
c5718928df [feature-wip](array-type) support explode and explode_outer table function (#8766)
explode(ArrayColumn) desc:
> Create a row for each element in the array column. 

explode_outer(ArrayColumn) desc:
> Create a row for each element in the array column. Unlike explode, if the array is null or empty, it returns a single row containing NULL.

Usage example:
1. Create a table with an array column and insert some data.
2. Turn on enable_lateral_view and enable_vectorized_engine:
```
set enable_lateral_view = true;
set enable_vectorized_engine=true;
```
3. Use explode_outer:
```
> select * from array_test;
+------+------+--------+
| k1   | k2   | k3     |
+------+------+--------+
|    3 | NULL | NULL   |
|    1 |    2 | [1, 2] |
|    2 |    3 | NULL   |
|    4 | NULL | []     |
+------+------+--------+

> select k1,explode_column from array_test LATERAL VIEW explode_outer(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
|    2 |           NULL |
|    4 |           NULL |
|    3 |           NULL |
+------+----------------+
```
4. Use explode. explode returns no rows when the array is null or empty:
```
> select k1,explode_column from array_test LATERAL VIEW explode(k3) TempExplodeView as explode_column;
+------+----------------+
| k1   | explode_column |
+------+----------------+
|    1 |              1 |
|    1 |              2 |
+------+----------------+
```
2022-04-08 12:11:04 +08:00
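The difference between the two table functions in c5718928df can be modelled in a few lines. This is only a sketch of the row semantics shown in the example output above, not the vectorized engine implementation; the function names mirror the SQL functions and the tuple layout is an assumption.

```python
from typing import Iterator, List, Optional, Sequence, Tuple

# (k1, k3) pairs taken from the array_test example in the commit message.
ARRAY_TEST: List[Tuple[int, Optional[Sequence[int]]]] = [
    (3, None),
    (1, [1, 2]),
    (2, None),
    (4, []),
]


def explode(rows) -> Iterator[Tuple[int, int]]:
    """Emit one row per array element; null or empty arrays emit nothing."""
    for k1, arr in rows:
        for elem in arr or []:
            yield (k1, elem)


def explode_outer(rows) -> Iterator[Tuple[int, Optional[int]]]:
    """Like explode, but null or empty arrays emit one row holding None."""
    for k1, arr in rows:
        if not arr:
            yield (k1, None)
        else:
            for elem in arr:
                yield (k1, elem)


if __name__ == "__main__":
    print(list(explode(ARRAY_TEST)))
    print(list(explode_outer(ARRAY_TEST)))
```

Row order here follows the input list; SQL gives no such ordering guarantee without an ORDER BY.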
bd0a3369b7 [fix] check disk capacity before writing data (#8887)
1. We forgot to check disk capacity when writing data.
2. TODO: the user-specified disk capacity is not used yet. We need to find a way to use it.
3. Avoid printing too many compaction logs when there is no suitable version for compaction.
2022-04-08 11:29:49 +08:00
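The idea behind bd0a3369b7 (check free space before a write) can be sketched as below. This is an assumption-laden illustration, not the BE code: `has_capacity` and `min_free_bytes` are hypothetical names, and the real check and its thresholds are not shown in the commit message.

```python
import shutil


def has_capacity(path: str, incoming_bytes: int, min_free_bytes: int = 0) -> bool:
    """Return True if writing incoming_bytes under path leaves enough free space.

    min_free_bytes is a hypothetical safety margin that must remain free
    after the write completes.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free - incoming_bytes >= min_free_bytes


if __name__ == "__main__":
    # Refuse the write up front instead of failing halfway through it.
    if has_capacity(".", incoming_bytes=4096):
        print("enough space, proceed with the write")
    else:
        print("disk capacity exceeded, reject the write")
```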
f854f0e83e remove unreadable char in comment (#8909) 2022-04-08 09:26:53 +08:00
3dd6b42781 [fix](datax) Fix the problem of keyword error when importing datax (#8893) 2022-04-08 09:20:54 +08:00
Pxl
dbbc6549bd [feature](vectorized) support vexplode_bitmap (#8890) 2022-04-08 09:20:26 +08:00
fa8e4ec2f0 [fix] Disable cast operation of object type (#8882)
Disable casts between string and object types (bitmap, hll, quantile_state).
2022-04-08 09:13:56 +08:00
69ab1f8681 [doc] add doc of fe dev in vscode (#8875) 2022-04-08 09:12:43 +08:00
3f04220d49 [typo] Fix typo in function.cpp (#8873) 2022-04-08 09:09:19 +08:00
e3daa9580a [Fix](Lateral View) The Error expr type when exploding a function result of inline view (#8851)
Fixed #8850

A column in an inline view may be a function instead of a slotRef.
So when this column is used as the input of explode function,
it can't be converted to slotRef.

The correct way is to treat it as an Expr and extract the required slotRef for materialization.
For example:
```
with d as (select k1+k1 as k1_plus from table)
select k1_plus from d explode_split(k1_plus, ",")
```
FnExpr: SlotRef<k1_plus>
SubstituteFnExpr: functionCallExpr<k1+k1>
originSlotRefList: SlotRef<k1>
2022-04-08 09:08:55 +08:00
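The fix in e3daa9580a treats the substituted column as a general expression and pulls out the slot references it depends on. A minimal sketch of that extraction follows; the class names echo the terms in the commit message, but the data model and `collect_slot_refs` are hypothetical and do not reproduce the FE classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class SlotRef:
    """A reference to a concrete column (illustrative stand-in)."""
    name: str


@dataclass
class FunctionCallExpr:
    """A function applied to child expressions (illustrative stand-in)."""
    fn: str
    children: List["Expr"] = field(default_factory=list)


Expr = Union[SlotRef, FunctionCallExpr]


def collect_slot_refs(expr: Expr) -> List[SlotRef]:
    """Walk the expression tree and return each distinct SlotRef once."""
    seen: Dict[str, SlotRef] = {}

    def walk(node: Expr) -> None:
        if isinstance(node, SlotRef):
            seen.setdefault(node.name, node)
        else:
            for child in node.children:
                walk(child)

    walk(expr)
    return list(seen.values())


if __name__ == "__main__":
    # k1_plus is defined as k1 + k1 in the inline view.
    k1_plus = FunctionCallExpr("add", [SlotRef("k1"), SlotRef("k1")])
    print([ref.name for ref in collect_slot_refs(k1_plus)])
```

For `k1+k1` the walk yields a single `k1` reference, matching the `originSlotRefList: SlotRef<k1>` line in the commit message.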
318feb01f3 [improvement](account) support to account management sql (#8849)
Add [IF EXISTS] / [IF NOT EXISTS] support to the following account management statements:
- CREATE [IF NOT EXISTS] USER
- CREATE [IF NOT EXISTS] ROLE
- DROP [IF EXISTS] USER
- DROP [IF EXISTS] ROLE
2022-04-08 09:08:08 +08:00
0b98d78664 [improvement](hll) Optimize Hyperloglog (#8829)
At Meituan, PR #6625 was reverted due to an OOM problem.
We are now modifying the old HyperLogLog implementation instead; this work builds on PR #8555.
In our tests, it performs better than both the old HLL and the apache:master HLL.

Changes summary:

- use SIMD max to speed up the heavy function _merge_registers
- use phmap::flat_hash_set rather than std::set
- replace std::max
- other small changes
2022-04-08 09:06:08 +08:00
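Merging two HyperLogLog sketches takes the per-register maximum, which is why a SIMD max instruction suits `_merge_registers` in 0b98d78664. The scalar sketch below shows only that operation; the function body is an assumption and not the Doris implementation, which processes many registers per instruction.

```python
def merge_registers(dst: bytearray, src: bytes) -> None:
    """Merge src into dst in place by keeping the larger value per register."""
    if len(dst) != len(src):
        raise ValueError("register arrays must have the same length")
    for i, value in enumerate(src):
        # A SIMD version performs this comparison on a whole vector at once.
        if value > dst[i]:
            dst[i] = value


if __name__ == "__main__":
    registers = bytearray([0, 3, 1, 5])
    merge_registers(registers, bytes([2, 1, 1, 7]))
    print(list(registers))
```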
b88bf73ca7 [refactor][doc] Added doc for compilation, deployment and data export (#8776) 2022-04-08 09:04:03 +08:00
519305cb22 [feature-wip] (memory tracker) (step4) Switch TLS mem tracker to separate more detailed memory usage (#8669)
Based on #8605, Separate out the memory usage of each operator from the Query/Load/StorageEngine mem tracker.
2022-04-08 09:02:26 +08:00
7fb4b6a6e2 [chore](tsan) add file mremap_fallback for tsan (#8665) 2022-04-08 09:01:53 +08:00
24bb9810b4 [doc](manager) Add space list documents (#8658)
Add space list and access control document. Remove some pictures to reduce the size of source code.
2022-04-08 09:01:23 +08:00
d51545a952 [fix](ut)(memory-leak) Fix be asan ut failed and hdfs file reader memory leak (#8905) 2022-04-08 00:07:00 +08:00
Pxl
2a25b90cb3 [Test] Fix explode test and build fail (#8885) 2022-04-07 14:23:57 +08:00
32bba15e34 [refactor][fix] remove useless import in Config.java (#8878) 2022-04-07 11:40:05 +08:00
c9cb07a270 [typo](doc)Update upgrade.md (#8866) 2022-04-07 11:36:39 +08:00
02be8176c3 [fix] access parallel_flat_hash_map via thread safely methods (#8854)
Iterators of parallel_flat_hash_map are not thread-safe, so
we should use if_contains instead.
2022-04-07 11:35:59 +08:00
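The pattern behind 02be8176c3 is to run a callback on the value while the map's lock is held, rather than handing an iterator to the caller. The sketch below illustrates that idea with a single lock. `LockedMap` is a hypothetical class; only the method name `if_contains` comes from the commit message, and the real container's locking scheme is not reproduced here.

```python
import threading
from typing import Any, Callable, Dict, Hashable


class LockedMap:
    """A dict guarded by one lock (simplified stand-in for a concurrent map)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[Hashable, Any] = {}

    def insert(self, key: Hashable, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def if_contains(self, key: Hashable, fn: Callable[[Any], None]) -> bool:
        """Call fn(value) under the lock if key exists; report whether it did."""
        with self._lock:
            if key not in self._data:
                return False
            fn(self._data[key])
            return True


if __name__ == "__main__":
    tablets = LockedMap()
    tablets.insert(10196, "tablet-meta")
    found = tablets.if_contains(10196, lambda meta: print("visiting", meta))
    print(found)
```

The value never escapes the locked region unless the callback copies it out, which is what makes this safer than holding an iterator across threads.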
64d18364db [improvement](restore) set table property 'dynamic_partition.enable' to false after restore (#8852)
When a table with dynamic partition properties is restored, 'dynamic_partition.enable' is set to its value at backup time.
However, Doris cannot turn on dynamic partitioning automatically during restore.
As a result, a restored table could have dynamic_partition.enable set to 'true' and still never create dynamic partitions.
2022-04-07 11:34:01 +08:00
ce50c4d826 [feature](diagnose) support "ADMIN DIAGNOSE TABLET" stmt (#8839)
`ADMIN DIAGNOSE TABLET tablet_id`

This statement makes it easier to quickly diagnose the status of a tablet.
See "ADMIN-DIAGNOSE-TABLET.md" for details

```
mysql> admin diagnose tablet 10196;
+----------------------------------+------------------------------+------------+
| Item                             | Info                         | Suggestion |
+----------------------------------+------------------------------+------------+
| TabletExist                      | Yes                          |            |
| TabletId                         | 10196                        |            |
| Database                         | default_cluster:db1: 10192   |            |
| Table                            | tbl1: 10194                  |            |
| Partition                        | tbl1: 10193                  |            |
| MaterializedIndex                | tbl1: 10195                  |            |
| Replicas(ReplicaId -> BackendId) | {"10197":10002}              |            |
| ReplicasNum                      | OK                           |            |
| ReplicaBackendStatus             | Backend 10002 is not alive.  |            |
| ReplicaVersionStatus             | OK                           |            |
| ReplicaStatus                    | OK                           |            |
| ReplicaCompactionStatus          | OK                           |            |
+----------------------------------+------------------------------+------------+
```
2022-04-07 11:30:03 +08:00
ca4055244e [fix](storage) Fix core bug of convert to predicate column (#8833)
To reproduce:
When `enable_low_cardinality_optimize = true`, the following SQL query on the TPCH dataset causes a core dump:
```sql
select count(*) from lineitem where l_comment = 'ously even exc';
```

This SQL triggers the execution of `ColumnDictionary::convert_to_predicate_column_if_dictionary`, and `res->reserve(_codes.size())` is problematic because the current `_codes.size()` is smaller than its reserve value, so inserting a value into `PredicateColumn` causes a core dump.
2022-04-07 11:29:26 +08:00
e72ccfd80c [Refactor][httpv2]remove http v1 code (#8848)
HTTP v2 has been tested in production and can completely replace the HTTP v1 code. To simplify code maintenance, remove the HTTP v1 code.
2022-04-07 08:38:29 +08:00