6621 Commits

Author SHA1 Message Date
e094e6ca71 [typo](docs)add hive-bitmap compile and package des #13237 2022-10-10 14:52:50 +08:00
63903136c4 [refactor](jcup) Format keywords in sql_parser.cup (#13133)
The keyword definition section of `sql_parser.cup` is unordered and messy:
1. It is almost unreadable
2. There are no rules for formatting it when we change it
3. **It takes unnecessary effort to resolve conflicts caused by the unordered keywords**

We can apply some simple rules to format it:
1. Sort the keywords in lexicographical order
2. Break them into several "sections"; the keywords in each section share the same prefix `KW_${first_letter}`
3. Separate every two adjacent sections with a line containing only 4 spaces

e.g.

```
terminal String
    KW_A...

    KW_B...

    ...

    KW_Z...
```
2022-10-10 14:34:51 +08:00
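The formatting rules described in #13133 can be sketched as a small function. This is only an illustration: `format_keywords` is a hypothetical helper, not part of Doris, and the comma separators and closing semicolon are an assumption about how the `terminal` declaration is laid out in the real file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Doris): renders a list of KW_* terminal
// names in the sorted, sectioned layout described in the commit message.
std::string format_keywords(std::vector<std::string> keywords) {
    const std::string indent(4, ' ');
    // Rule 1: lexicographical order.
    std::sort(keywords.begin(), keywords.end());

    std::string out = "terminal String\n";
    char section = '\0';
    for (std::size_t i = 0; i < keywords.size(); ++i) {
        const std::string& kw = keywords[i];
        // Every name is expected to look like KW_<letter>...
        assert(kw.size() > 3 && kw.compare(0, 3, "KW_") == 0);
        // Rules 2 and 3: a new first letter starts a new section, separated
        // from the previous one by a line holding only four spaces.
        if (section != '\0' && kw[3] != section) {
            out += indent + "\n";
        }
        section = kw[3];
        out += indent + kw;
        out += (i + 1 < keywords.size()) ? ",\n" : ";\n";
    }
    return out;
}
```

Because the output depends only on the set of keywords, two branches that add different keywords produce the same ordering after formatting, which is what keeps merge conflicts small.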
86d55dd79c [Improvement](like function) avoid to convert const column to full column (#13214) 2022-10-10 14:19:46 +08:00
a8535e91af [Improvement](runtimefilter) DO NOT allocate memory for bbf in prepare phase (#13207) 2022-10-10 14:19:33 +08:00
375dfedd83 [feature](nereids) dump physical tree and memo (#13091)
Dump memo info and the physical plan to stdout and the log.
Set the `enable_nereids_trace` variable to true/false to turn this dump on/off.

The following is a fragment of the memo:
```
Group[GroupId#8]
GroupId#8(plan=PhysicalHashJoin ( type=INNER_JOIN, hashJoinCondition=[(r_regionkey#250 = n_regionkey#255)], otherJoinCondition=Optional.empty, stats=null )) children=[GroupId#6 GroupId#7 ] stats=(rows=25, isReduced=false, width=2)
GroupId#8(plan=PhysicalHashJoin ( type=INNER_JOIN, hashJoinCondition=[(r_regionkey#250 = n_regionkey#255)], otherJoinCondition=Optional.empty, stats=null )) children=[GroupId#7 GroupId#6 ] stats=(rows=25, isReduced=false, width=2)
```
2022-10-10 13:05:28 +08:00
Pxl
bdcb600f3d [Bug](load) fix core dump on big block load (#13014) 2022-10-10 12:38:32 +08:00
1cd4e5cec6 refactor insert_xxx functions (#13088)
As mentioned in #13074, there are some problems in ColumnVector<int>::insert_many_in_copy_way.
Column::insert_xxx functions append data, so they should reserve or resize before appending.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-10 11:54:27 +08:00
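The "reserve before append" idea from #13088 can be sketched as follows. This is a minimal illustration, not the Doris code: `insert_many` is a hypothetical function and `std::vector<int>` stands in for the column's data buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column's append path; std::vector<int> plays
// the role of the column's data buffer. Not the Doris implementation.
void insert_many(std::vector<int>& column, const int* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    // Grow the buffer once, up front, instead of possibly several times
    // while the values are appended one by one.
    column.reserve(column.size() + count);
    for (std::size_t i = 0; i < count; ++i) {
        column.push_back(src[i]);
    }
}
```

Reserving up front also guarantees that no reallocation happens during the loop, so pointers into the buffer stay valid while the values are appended.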
20b583c91e [Bug](array-type) Fix memory buffer overflow (#13074) 2022-10-10 11:42:13 +08:00
935ef5a598 [feature-wip](new-scan) Add new ES scanner and new ES scan node #13027 2022-10-10 09:56:38 +08:00
dd089259be [feature-wip](multi-catalog) Optimize the performance of boolean & dictionary decoding (#13212)
Generate vector for dictionary data.
Decode boolean values in batch.
2022-10-10 08:41:11 +08:00
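Decoding booleans in batch, as mentioned in #13212, could look roughly like the sketch below. It assumes the Parquet PLAIN layout for booleans (bit-packed, least significant bit first); `decode_booleans` is a hypothetical name, and the actual Doris change may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: unpacks `count` bit-packed booleans (least significant
// bit first) into one byte per value. The output is sized once, so the loop
// does no per-value allocation.
std::vector<std::uint8_t> decode_booleans(const std::uint8_t* packed, std::size_t count) {
    assert(packed != nullptr || count == 0);
    std::vector<std::uint8_t> values(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Byte i / 8 holds value i at bit position i % 8.
        values[i] = static_cast<std::uint8_t>((packed[i / 8] >> (i % 8)) & 1u);
    }
    return values;
}
```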
e829061614 [fix](sort)should not change resolvedTupleExprs in toThrift method (#13211)
The toThrift method is called multiple times to send data to different BEs, but resolvedTupleExprs should be changed only once. This PR makes sure that resolvedTupleExprs is changed only once.
2022-10-10 08:39:58 +08:00
3dc4dc6d43 [compaction](http_action) enable be run manual compaction concurrently (#13219)
In some cases, we need to run manual compaction via the HTTP interface
concurrently, so we remove the mutex; the tablet's compaction lock
is enough to prevent concurrent compaction within a tablet.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-10 08:33:18 +08:00
15c7c0b754 [chore](release build) copy license and notice file to output folder and strip debug info from meta tool (#13222)
* [chore](release build) copy license and notice file to output folder and strip debug info from meta tool

Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-10-10 08:31:34 +08:00
581494dea8 [fix](test) resolve load in tpch_sf100_unique_p2 and tpch_sf10_unique_p2 (#13208) 2022-10-09 20:30:00 +08:00
b9516b50c1 [typo](docs)fix docs 404 url (#13157)
* fix docs 404 url
2022-10-09 20:02:48 +08:00
7b2fdd26a1 [schema change](fix) fix coredump of schema change (#13183)
When schema change and compaction execute simultaneously, both
nullable and non-nullable data can be read for the same column, so we need to
reset _nullmap for each Block when converting Block data, or else the Column
cast will be wrong.
2022-10-09 19:44:00 +08:00
3302e0b57e [enhancement](regression-test) add sync for unique table debug test (#13210) 2022-10-09 19:32:28 +08:00
f2159709a8 [Regression](outfile) Fix concurrency test failure caused by outfile (#13209) 2022-10-09 19:09:44 +08:00
fc711d89c8 [fix](projections) Open the project expressions properly. (#13162)
In the current 'ExecNode::open' function, the 'open(_projections)' call is unreachable, which might cause serious crashes. (#13150)
2022-10-09 18:43:45 +08:00
89514fc964 [fix](rowset) fix that rowset writer doesn't process the return value, which may result in data loss (#13189) 2022-10-09 17:10:11 +08:00
15fc3c2c89 [enhancement](statistics) optimize the default configuration related to statistics, etc. (#13136)
This PR mainly optimizes statistics tasks. It includes the following:
1. Statistics tasks are no longer generated for empty tables, and the logic for skipping empty partitions moves into task generation.
2. The default configuration related to statistics is adjusted to improve the efficiency of statistics collection. The parameters include `cbo_concurrency_statistics_task_num`, `statistic_job_scheduler_execution_interval_ms` and `statistic_task_scheduler_execution_interval_ms`.
3. The display of statistics tasks is optimized.
4. In addition, some `org.apache.parquet.Strings` imports are changed to `com.google.common.base.Strings` to avoid the exception that Strings cannot be found in local debugging.

etc.
2022-10-09 16:34:20 +08:00
da933ecd21 [fix](Nereids) plan broadcast on right semi join by mistake (#13206) 2022-10-09 16:32:12 +08:00
cfade2dfe0 [typo](docs)Fix Docs 404 Url #13175 2022-10-09 16:22:26 +08:00
dc2d33298b [chore](be config) remove config use_mmap_allocate_chunk #13196
This config is never used in production and has bugs when enabled, so this commit removes the config and its related tests.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-10-09 16:19:59 +08:00
e5fbecc621 [typo](docs)Fix the jump link 404 in delete recover.md (#13156)
* [typo](docs)Fix the jump link 404 in delete-recover.md
2022-10-09 16:12:34 +08:00
207e913b55 fix the bad link for delete-recover.md (#13203)
fix the bad link for delete-recover.md
2022-10-09 16:08:19 +08:00
9c64fde8f5 [tools](benchmark) upgrade date type (#13197)
upgrade date type to datev2
2022-10-09 14:17:12 +08:00
f373b22dcf [fix](string) Fix over-allocated memory for string type (#13167)
For the string/varchar/text types, the length field in `ColumnMetaPB` is fixed to 2GB.
We don't actually have to allocate 2GB for every string field, because
`WrapperField::from_string()` reallocates memory of the precise size
that the string needs:

```
    Status from_string(const std::string& value_string, const int precision = 0,
                       const int scale = 0) {
        if (_is_string_type) {
            if (value_string.size() > _var_length) {
                Slice* slice = reinterpret_cast<Slice*>(cell_ptr());
                slice->size = value_string.size();
                _var_length = slice->size;
                _string_content.reset(new char[slice->size]);
                slice->data = _string_content.get();
            }
        }
        return _rep->from_string(_field_buf + 1, value_string, precision, scale);
    }
```
2022-10-09 14:14:39 +08:00
Pxl
245490d6b7 [Enhancement](runtime filter) optimize for runtime filter (#12856)
optimize for runtime filter
2022-10-09 14:11:03 +08:00
8f36f8b83a Add be Parameter Description(#13201)
Add be Parameter Description
2022-10-09 12:49:57 +08:00
33fe389d62 [regression](datev2) Add regression tests for datev2 (#13040) 2022-10-09 11:55:06 +08:00
e0cff02c1a add sync for stream load test (#13185) 2022-10-09 11:36:01 +08:00
bbb6d2758a [fix](regression-test) fix test_segment_iterator_delete using order_qt_sql (#13192) 2022-10-09 11:35:22 +08:00
62c82bd575 [enhancement](test) Rewrite test_update_schema_change case (#13191) 2022-10-09 11:35:05 +08:00
9e42804298 [feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change (#12886) 2022-10-09 11:31:53 +08:00
671dc93035 [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance (#12363) 2022-10-09 11:31:27 +08:00
e6f4c771d9 [fix](docs) fix trim, lower, upper function docs error (#13179) 2022-10-09 10:32:26 +08:00
555f9520e3 fix community module error url (#13182)
fix community module error url
2022-10-09 10:27:02 +08:00
c53d2d6a8b install deploy doc fix (#13177)
install deploy doc fix
2022-10-09 10:26:28 +08:00
e0044e5a5f [typo](docs)Sql doc link fix (#13151)
* sql doc link fix
2022-10-09 09:26:00 +08:00
ece4a6c194 [doc][fix](multi-catalog) add doc for multi catalog and fix refresh bug (#13097)
1. Add all documentation for the multi-catalog feature.
2. Fix a bug where the REFRESH edit log is not handled.
2022-10-09 09:14:44 +08:00
d16ff79217 [fix](flinkCDC Demo):fix flinkcdc demo execution error (#13148) 2022-10-09 09:13:18 +08:00
b8b18e5153 [enhancement](array-type) Handle cast empty string value to array (#13028)
Handle empty values between two commas when casting a string to the array type.

before:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', ',', ',']          |
+------------------------------------+
1 row in set (0.01 sec)

after:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', '', '', '']        |
+------------------------------------+
1 row in set (0.01 sec)
2022-10-08 21:45:42 +08:00
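The "after" behaviour shown for #13028 can be reproduced with a small sketch. `parse_string_array` is a hypothetical function, not the Doris cast implementation. It assumes that an empty field between two commas becomes an empty string, while empty text after the last comma is dropped, which is what the six elements in the "after" output suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parser (not the Doris implementation): splits a literal such
// as "[a,b,c,,,,]" into its elements, keeping empty fields between commas.
std::vector<std::string> parse_string_array(const std::string& text) {
    std::vector<std::string> elements;
    if (text.size() < 2 || text.front() != '[' || text.back() != ']') {
        return elements;  // not an array literal; this sketch returns nothing
    }
    const std::string body = text.substr(1, text.size() - 2);
    std::size_t start = 0;
    while (true) {
        // Invariant: start never runs past the end of body.
        assert(start <= body.size());
        const std::size_t comma = body.find(',', start);
        if (comma == std::string::npos) {
            // Text after the last comma is kept only when it is non-empty.
            if (start < body.size()) {
                elements.push_back(body.substr(start));
            }
            break;
        }
        // A field that ends at a comma is always kept, even when empty.
        elements.push_back(body.substr(start, comma - start));
        start = comma + 1;
    }
    return elements;
}
```

For the input `[a,b,c,,,,]` this yields `a`, `b`, `c` followed by three empty strings, matching the "after" result above.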
869fe2bc5d [Improvement](outfile) Support ORC format in outfile (#13019) 2022-10-08 20:56:32 +08:00
344377beb7 [typo](docs)Fix jump link 404 in jdbc load.md (#13170) 2022-10-08 20:01:52 +08:00
86e47650cf Update outfile.md (#13172) 2022-10-08 20:01:20 +08:00
4386f41442 sql server 2017 version ODBC usage instructions (#13178)
sql server 2017 version ODBC usage instructions
2022-10-08 20:00:53 +08:00
6b0410450b [typo](docs)Fix jump link 404 in external storage load.md (#13173) 2022-10-08 19:59:44 +08:00
c5f802b93c [Bug](libjvm) reorder initialization of JNI (#13165) 2022-10-08 18:53:47 +08:00
b81a8789c3 [feature-wip](parquet-reader) optimize the performance of column conversion (#13122)
Convert Parquet columns into Doris columns in batches.
In the previous implementation, only numeric types could be converted in batches,
and other types could only be inserted one by one.
That process generated repeated virtual function calls and container expansion.
2022-10-08 18:03:10 +08:00