10111 Commits

Author SHA1 Message Date
fd4576e420 [Fix](auth) fix some problem of skip_localhost_auth_check in FE config #18996 2023-04-25 09:10:01 +08:00
171a194070 [minor](regression) fix unstable test case (#19018)
* [minor](regression) fix unstable test case

* update
2023-04-25 09:09:24 +08:00
93c48f2bb0 [fix](regression) fix show create table in_memory = false test result error #19022 2023-04-25 09:04:59 +08:00
72632b1e32 [improvement](regression-test) add max_failure_num to skip tests when too much failure #19003 2023-04-25 09:03:36 +08:00
bf75e74065 [typo](docs) add oceanbase jdbc catalog doc (#18994)
* [typo](docs) add oceanbase jdbc catalog doc

* fix
2023-04-25 08:50:31 +08:00
8e808abbd4 [doc](remove-useless-code)remove useless doc description #18957
Co-authored-by: journeychen <journeychen@tencent.com>
2023-04-25 08:49:24 +08:00
207c827cdb [fix](test) fix result of CHARACTER_OCTET_LENGTH in . (#18896) 2023-04-25 08:42:54 +08:00
4e9b32d622 [bugfix](exception) remove fmt code to test if there still exist core (#19009) 2023-04-25 07:24:14 +08:00
3899c08036 [optimize](compile) remove unused template param from load channel (#18980)
* [optimize](compile) remove unused template param from load channel



---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-24 23:36:47 +08:00
efebb3d21e [fix](schema) fix show create table get wrong random distribution info (#18895)
* [fix](schema) fix show create table get wrong random distribution info


---------

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-04-24 23:33:42 +08:00
54d58364c1 [fix](Nereids): move SimplifyAggGroupBy before NormalizeAggregate. (#18918) 2023-04-24 19:00:27 +08:00
b2c26e17e1 [Compile](vec) Fix compile by BHREAD_SCANNER (#18979) 2023-04-24 17:07:06 +08:00
a01a9c2d7d [community](action) modify teamcity trigger action (#18981) 2023-04-24 15:17:47 +08:00
16a394da0e [chore](build) Use include-what-you-use to optimize includes (PART III) (#18958)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-24 14:51:51 +08:00
6b219ab599 [typo](doc) add declaration of row level auth policy #18959 2023-04-24 14:27:45 +08:00
17e206c538 [Feature](resource-group) Support drop resource group (#18873) 2023-04-24 14:00:00 +08:00
6bf51150f3 [fix](nereids) remove unnecessary project above scan node (#18920)
1. remove the unnecessary project node above the scan node.
2. fix a bug where an IN subquery may be recognized as a scalar subquery
3. fix the return type of some Quantile-related functions
2023-04-24 13:58:57 +08:00
d368326cc2 [fix](Nereids) should not fallback to legacy planner when execution failed (#18847) 2023-04-24 13:29:29 +08:00
22cdfc5970 [refactor](fs)(step1) add new storage file system (#18938)
PR1: add a new storage file system template and move the old storage to a new package
PR2: extract some methods from the old storage into the new file system.
PR3: use storages to access remote object storage, and use file systems to access files in local or remote locations. Will add some unit tests.

---------

Co-authored-by: jinzhe <jinzhe@selectdb.com>
2023-04-24 11:41:48 +08:00
1f9450e0f7 [Chore](case) add some regression-test case about materialized-view #18946 2023-04-24 11:36:56 +08:00
296b0c92f7 [Enhancement](compaction) stop tablet compaction when table dropped (#18702)
* [Enhancement](compaction) stop tablet compaction when table dropped

* fix be ut
2023-04-24 11:04:27 +08:00
ab2a6864bc [function](json) Json unquote (#18037) 2023-04-24 10:33:29 +08:00
8d7a9fd21b [refactor](exceptionsafe) add factory creator to some class (#18978)
make vexprecontext,vexpr,function,query context,runtimestate thread safe.


---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-24 10:32:11 +08:00
8e4710079d [improvement](profile) Insert into add LoadChannel runtime profile (#18908)
TabletSink and LoadChannel in BE have an M:N relationship.
Every once in a while, a LoadChannel randomly returns its own runtime profile to a TabletSink, so each TabletSink usually holds the runtime profiles of all LoadChannels, but the copies of the same LoadChannel profile held by different TabletSinks differ in freshness. Each TabletSink periodically reports all the LoadChannel profiles it holds to FE, and the latest LoadChannel profile is kept according to the timestamp.
2023-04-24 09:41:57 +08:00
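The "keep the latest profile according to the timestamp" rule from commit 8e4710079d can be illustrated with a minimal sketch. This is not the Doris implementation: `ChannelProfile`, `LatestProfileRegistry`, and the millisecond timestamp field are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for one LoadChannel runtime profile report.
struct ChannelProfile {
    int64_t timestamp_ms;  // when the LoadChannel produced this profile
    std::string payload;   // opaque profile content
};

// Hypothetical registry that keeps only the newest profile per LoadChannel.
class LatestProfileRegistry {
public:
    // Stores the report only if it is newer than the stored one.
    // Returns true when the stored profile was replaced.
    bool update(const std::string& channel_id, const ChannelProfile& report) {
        assert(report.timestamp_ms >= 0);
        auto it = _profiles.find(channel_id);
        if (it != _profiles.end() && it->second.timestamp_ms >= report.timestamp_ms) {
            return false;  // stale or duplicate copy from another TabletSink
        }
        _profiles[channel_id] = report;
        return true;
    }

    // Returns the stored profile, or nullptr if the channel is unknown.
    const ChannelProfile* get(const std::string& channel_id) const {
        auto it = _profiles.find(channel_id);
        return it == _profiles.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, ChannelProfile> _profiles;
};
```

Because several TabletSinks may report the same LoadChannel at different times, comparing timestamps makes the update order irrelevant: an older copy arriving late is simply ignored.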
d2f50ce3f5 [Fix](HttpServer) Chinese garbled characters appear when obtaining query plan (#18820)
When obtaining the query plan, Chinese characters in the predicate are garbled, which leads to incorrect data results.
2023-04-24 08:49:44 +08:00
2d7903e2bd [Feature](multi-catalog) support query hive views. (#18815)
A very simple implementation for querying Hive views; it is an EXPERIMENTAL feature.
It parses the DDL of a Hive view and tries to execute the query, relying on the fact that HiveQL
is very similar to Doris SQL. If the DDL of a Hive view uses complicated or incompatible grammar,
the query might fail.
2023-04-24 08:49:26 +08:00
0dd45ce158 [feature](ui)add copy profile button #18965 2023-04-24 08:44:38 +08:00
0c95d760fe [fix](fixed_hashtable) The incorrect implementation of copy constructor (#18921) 2023-04-24 08:36:52 +08:00
4ba6c8b6ce [community](collaborator) add more collaborators (#18976) 2023-04-24 08:30:20 +08:00
b4282641c1 [typo](doc) Fixed typos in ADMIN-SHOW-CONFIG.md (#18969)
* [typo](doc) Fixed typos in ADMIN-SHOW-CONFIG.md

* Update ADMIN-SHOW-CONFIG.md
2023-04-24 08:29:55 +08:00
e4f058bad5 modified some text errors (#18968) 2023-04-24 08:29:45 +08:00
5c31a0867c [typo](doc) Fixed typos in OUTFILE.md (#18967) 2023-04-24 08:29:35 +08:00
27b8227cb5 [typo](docs)Optimize the installation And deployment directory structure (#18966) 2023-04-24 08:29:24 +08:00
07ea350201 [Fix](inverted index) fix memory leak when create bkd reader (#18914)
The function compoundReader->openInput is called three times, and if any of these calls fail,
an error is logged, and the function returns early. If one or two of the calls succeed, but the others fail,
there might be a situation where the allocated memory for the IndexInput objects is not freed.

To fix this, you could use std::unique_ptr to manage the memory for IndexInput objects.
This would automatically clean up the memory when the function goes out of scope.
2023-04-23 23:21:44 +08:00
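The pattern described in commit 07ea350201 (three open calls, early return on failure, `std::unique_ptr` for cleanup) can be sketched as follows. `FakeInput`, `open_input`, and `open_three` are hypothetical stand-ins, not the CLucene or Doris API; the live-instance counter exists only to make the absence of a leak observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an index input handle; counts live instances.
struct FakeInput {
    static int live;
    std::string name;
    explicit FakeInput(std::string n) : name(std::move(n)) { ++live; }
    ~FakeInput() { --live; }
    FakeInput(const FakeInput&) = delete;
    FakeInput& operator=(const FakeInput&) = delete;
};
int FakeInput::live = 0;

// Hypothetical open call: returns nullptr to signal failure.
std::unique_ptr<FakeInput> open_input(const std::string& name, bool fail) {
    if (fail) {
        return nullptr;
    }
    return std::unique_ptr<FakeInput>(new FakeInput(name));
}

// Opens three inputs. On any failure the function returns early and the
// inputs opened so far are destroyed automatically by unique_ptr.
bool open_three(bool fail_first, bool fail_second, bool fail_third) {
    std::unique_ptr<FakeInput> a = open_input("a", fail_first);
    if (!a) return false;
    std::unique_ptr<FakeInput> b = open_input("b", fail_second);
    if (!b) return false;  // `a` is released here
    std::unique_ptr<FakeInput> c = open_input("c", fail_third);
    if (!c) return false;  // `a` and `b` are released here
    assert(FakeInput::live == 3);
    // A real reader would take ownership of a, b, c at this point.
    return true;
}
```

With raw pointers, each early return would need its own matching `delete` calls; with `std::unique_ptr` the cleanup is tied to scope exit, so no failure path can skip it.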
c3baa65de3 [feature](io) enable s3 file writer with multi part uploading concurrently (#17585)
Formerly, S3FileWriter had to fill a buffer of 5MB or more and upload it as one part, and only after that was done could it process incoming data, which is blocking and inefficient. This PR introduces a buffer pool: data is written into a memory buffer immediately if a free buffer is available, and the buffer is then uploaded to S3.
This PR does not elegantly handle the case where there is no free buffer; that is left as future work.
2023-04-23 23:19:44 +08:00
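The buffer pool idea in commit c3baa65de3 can be sketched under some assumptions: a fixed number of equally sized buffers, and a non-blocking acquire that reports "no free buffer" to the caller. `FixedBufferPool` and its methods are hypothetical names; the actual upload to S3 is out of scope here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool of in-memory buffers.
class FixedBufferPool {
public:
    using Buffer = std::vector<char>;

    FixedBufferPool(std::size_t count, std::size_t buffer_size) {
        for (std::size_t i = 0; i < count; ++i) {
            _free.push_back(std::unique_ptr<Buffer>(new Buffer(buffer_size)));
        }
    }

    // Returns a free buffer, or nullptr when none is available.
    // What to do in the nullptr case is left to the caller.
    std::unique_ptr<Buffer> try_acquire() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_free.empty()) {
            return nullptr;
        }
        std::unique_ptr<Buffer> buf = std::move(_free.back());
        _free.pop_back();
        return buf;
    }

    // Hands a buffer back to the pool, e.g. after its part has been uploaded.
    void release(std::unique_ptr<Buffer> buf) {
        assert(buf != nullptr);
        std::lock_guard<std::mutex> lock(_mutex);
        _free.push_back(std::move(buf));
    }

    std::size_t free_count() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _free.size();
    }

private:
    mutable std::mutex _mutex;
    std::vector<std::unique_ptr<Buffer>> _free;
};
```

A writer would call `try_acquire()`, copy incoming data into the buffer, hand the buffer to an upload task, and have that task call `release()` when the part is uploaded, so accepting new data no longer waits for the previous upload.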
3736530585 [refactor](query context) rename query fragments context to query context and make query context safe (#18950)
* [refactor](query context) rename query fragments context to query context and make query context safe

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-23 22:53:56 +08:00
29fdf1fb7e [typo](docs) add enable_ssl config doc (#18961) 2023-04-23 22:27:28 +08:00
1e7ef35741 [fix](Nereids) two phase read for topn only support simple case (#18955)
1. topn must have a merge node
2. topn must be the top node of the plan
2023-04-23 21:32:23 +08:00
45d0f53529 [Regression-test](Export) add regression test for export #18897 2023-04-23 19:43:22 +08:00
a9ac930e5f [Fix](mutli-catalogs) Fix jdbc regression tests. (#18927)
- Fix `test_show_where` result.
- Remove `enable_decimal_conversion = true` in `test_mysql_jdbc_catalog`.
- Remove `test_show_create_catalog`.
2023-04-23 19:42:13 +08:00
25e8c71943 [test](fix) fix postgresql test (#18900)
* [test](fix) fix postgresql test

* fix
2023-04-23 18:41:41 +08:00
2c776584e5 [doc](releasenote)release 1.2.4 (#18934)
* release 1.2.4

* Update README.md

* Update sidebars.json
2023-04-23 16:04:25 +08:00
0da2cf270a [improvement](fetch data) Merge result into batch to reduce rpc times (#17828) 2023-04-23 15:07:28 +08:00
63e8fb7300 [chore](regression) Add 'sync' after stream_load in some cases (#18945) 2023-04-23 14:39:33 +08:00
166bed11d4 [Enchancement](auth) Forbid to login doris from 127.0.0.1 without password (#18816)
* forbid to login from 127.0.0.1 without password

* add localhost limit

* rename
2023-04-23 13:56:31 +08:00
61b44108e2 [bugfix](asan) fix possible asan check bug in exception to string (#18936)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-23 12:26:36 +08:00
29f502380c [opt](FileReader) merge small IO to optimize read performace (#18796)
Add `MergeRangeFileReader` to merge small IO to optimize parquet&orc read performance.

`MergeRangeFileReader` is a FileReader that efficiently supports random access in formats like parquet and orc.
To merge small IO in parquet and orc, the random access ranges should be generated when creating the
reader. The random access ranges are a list of ranges ordered by offset.
The ranges should be read sequentially; a range can be skipped, but can't be read repeatedly.
When calling read_at, if the start offset is located in the random access ranges, the slice size should not span two ranges.

For example, in parquet, the random access ranges are the column offsets in a row group.

When reading at offset, if [offset, offset + 8MB) contains many random access ranges,
the reader will read the data in [offset, offset + 8MB) as a whole, and copy the data in the random access ranges into small
buffers (named boxes, default 1MB each, 64MB in total). A box can be occupied by many ranges,
and uses a reference counter to record how many ranges are cached in the box. If the reference counter equals zero,
the box can be released or reused by other ranges. When there is no empty box for a new read operation,
the read operation is performed directly.

## Effects
The runtime of ClickBench reduces from 102s to 77s, and the runtime of Query 24 reduces from 24.74s to 9.45s.
The profile of Query 24:
```
 VFILE_SCAN_NODE  (id=0):(Active:  8s344ms,  %  non-child:  83.06%)
    -  FileReadBytes:  534.46  MB
    -  FileReadCalls:  1.031K  (1031)
    -  FileReadTime:  28s801ms
    -  GetNextTime:  8s304ms
    -  MaxScannerThreadNum:  12
    -  MergedSmallIO:  0ns
        -  CopyTime:  157.774ms
        -  MergedBytes:  549.91  MB
        -  MergedIO:  94
        -  ReadTime:  28s642ms
        -  RequestBytes:  507.96  MB
        -  RequestIO:  1.001K  (1001)
    -  NumScanners:  18
```
1001 request IOs have been merged into 94 IOs.

## Remaining problems
1. Add p2 regression test in next PR
2. Profiles are scattered in various codes and will be refactored in the next PR
3. Support ORC reader
2023-04-23 10:51:38 +08:00
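The core idea of commit 29f502380c, serving many small offset-ordered ranges with fewer, larger reads, can be illustrated with a simplified greedy sketch. This is not the `MergeRangeFileReader` algorithm itself: the box cache and reference counting are omitted, and `Range` and `merge_small_ranges` are hypothetical names. The `window` parameter plays the role of the 8MB span mentioned in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open byte range [start, end) within a file.
struct Range {
    uint64_t start;
    uint64_t end;
};

// Greedily coalesces offset-ordered ranges: a range joins the current merged
// read as long as the merged read stays within `window` bytes of its start.
// Assumes `ranges` is sorted by start offset.
std::vector<Range> merge_small_ranges(const std::vector<Range>& ranges, uint64_t window) {
    std::vector<Range> merged;
    uint64_t prev_start = 0;
    for (const Range& r : ranges) {
        assert(r.start <= r.end);
        assert(r.start >= prev_start);  // input must be ordered by offset
        prev_start = r.start;
        if (!merged.empty() && r.end - merged.back().start <= window) {
            // Fits in the current window: extend the merged read if needed.
            if (r.end > merged.back().end) {
                merged.back().end = r.end;
            }
        } else {
            // Too far away (or first range): start a new read.
            merged.push_back(r);
        }
    }
    return merged;
}
```

Each merged range corresponds to one physical read; the bytes between the original ranges are read but discarded, which is the trade-off visible in the profile above, where `MergedBytes` exceeds `RequestBytes`.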
b81b470d4f [fix](planner) fix pr "using crchash replace murmurhash in the runtime filter" (#18759) 2023-04-23 10:33:35 +08:00
9756be6bf0 [improvement](stream-load) use vector instead of skiplist when insert dup keys (#18686) 2023-04-23 09:40:09 +08:00
e7ad536a71 [scirpte](download) add 1.2.4 download script (#18932) 2023-04-23 07:40:19 +08:00