8276 Commits

Author SHA1 Message Date
500c36717d [Bug-Fix][Vectorized] Full join return error result (#9690)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-23 13:29:37 +08:00
77297bb7ee Fix some typos in fe/. (#9682) 2022-05-23 12:11:01 +08:00
5b13fa2b15 [typo] Fix typos in comments (#9710) 2022-05-23 12:01:37 +08:00
ddda91c89d [doc] Update dev image (#9721) 2022-05-23 11:59:15 +08:00
d97e2b1eb2 [doc] update docs for FE UT (#9718) 2022-05-22 21:36:45 +08:00
d8f1b77cc1 [improvement](planner) Backfill the original predicate pushdown code (#9703)
Due to the current architecture, predicate derivation during rewrite cannot cover all cases:
the rewrite is applied to the ON clause first and then to the WHERE clause, and when there are subqueries, not every case can be derived.
So the predicate pushdown method is kept here.

e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;

InferFiltersRule can't infer t2 = 1, because this is out of specification.

The expression (t2 = 1) can actually be deduced and pushed down to the scan node.
2022-05-22 21:35:32 +08:00
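
The equivalence behind the pushdown in #9703 can be illustrated with a small sketch. It is not Doris planner code: the toy tables, the column name `k`, and the helper `left_join` are all assumptions made for illustration, and the commit's shorthand `t1 = t2` is read here as an equality on one join column.

```python
# Hypothetical toy data; the column "k" stands in for the join column
# that the commit message abbreviates as "t1 = t2".
def left_join(left, right):
    """LEFT JOIN on left.k = right.k; unmatched left rows pair with None."""
    out = []
    for l in left:
        matches = [r for r in right if r["k"] == l["k"]]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    return out

t1 = [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}, {"k": 1, "v": "c"}]
t2 = [{"k": 1, "w": "x"}, {"k": 2, "w": "y"}, {"k": 3, "w": "z"}]

# Plan A: join everything, then apply WHERE t1.k = 1.
plan_a = [(l, r) for l, r in left_join(t1, t2) if l["k"] == 1]

# Plan B: filter t1 at its scan, and also apply the derived t2.k = 1
# at the t2 scan before joining.
t1_scan = [row for row in t1 if row["k"] == 1]
t2_scan = [row for row in t2 if row["k"] == 1]
plan_b = left_join(t1_scan, t2_scan)

assert plan_a == plan_b
```

Under these assumptions both plans return the same rows, but Plan B reads fewer rows from t2, which is the point of pushing the derived predicate down to the scan node.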
3768fdd3f8 [doc] Add trim_tailing_spaces_for_external_table_query variable to the docs. (#9701) 2022-05-22 21:32:23 +08:00
d270f4f2d4 [config](checksum) Disable consistency checker by default (#9699)
Disabled by default because the current checksum logic has some bugs.
It also adds some overhead.
2022-05-22 21:31:43 +08:00
ad4da4aa8f [doc] Fix typos in documentation (#9692) 2022-05-22 21:30:22 +08:00
c13a6a1d8a [fix] NullPredicate should implement evaluate_vec (#9689)
select column from table where column is null
2022-05-22 21:29:53 +08:00
75b3707a28 [refactor](load) add tablet errors when close_wait return error (#9619) 2022-05-22 21:27:42 +08:00
3391de482b [Refactor] simplify some code in routine load (#9532) 2022-05-22 21:25:39 +08:00
b3a2a92bf5 [deps] libhdfs3 build enable kerberos support (#9524)
Currently, the libhdfs3 library integrated into Doris BE does not support accessing clusters with Kerberos authentication
enabled; the Kerberos-related dependencies (gsasl and krb5) were found to be missing when libhdfs3 was built.

This PR enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:

- gsasl version: 1.8.0
- krb5 version: 1.19
2022-05-22 20:58:19 +08:00
97fad7a2ff [doc]Add insert best practices (#9723)
Add insert best practices
2022-05-22 16:24:20 +08:00
31e40191a8 [Refactor] add vpre_filter_expr for vectorized to improve performance (#9508) 2022-05-22 11:45:57 +08:00
0c4b47756a [enhancement](community): enhance java style (#9693)
Enhance java style.

Now: the checkstyle rule for code order follows this page: Class and Interface Declarations

This PR lets IntelliJ IDEA rearrange code automatically.
2022-05-20 15:24:30 +08:00
61a60d1dcc [code style] minor update for code style (#9695) 2022-05-20 11:47:49 +08:00
8fa677b59c [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (#9666)
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug where the vjson scanner does not support `range_from_file_path`
2. fix core dump in the vjson/vbroker scanner when src/dest slot nullability differs
3. fix bug in vparquet filter_block when the reference of a column is not 1
4. refactor to simplify the code

This only changes the vectorized load, not the original row-based load.

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-20 11:43:03 +08:00
6f61af7682 [Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (#9440) 2022-05-20 10:26:09 +08:00
5fa6e892be [fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (#9190)
Hive and Trino/Presto automatically trim trailing spaces, but Doris doesn't.
This causes query results to differ from Hive.

Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, the broker scan node trims the trailing spaces of each column when reading CSV.
2022-05-20 09:55:13 +08:00
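
The behavior toggled by the session variable in #9190 can be sketched as follows. This is only an illustration using Python's standard `csv` module, not the broker scanner itself; the function name `read_csv_rows` and its `trim_trailing_spaces` flag are hypothetical stand-ins for the session variable.

```python
import csv
import io

def read_csv_rows(text, trim_trailing_spaces=False):
    """Parse CSV text; optionally strip trailing spaces from every field."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if trim_trailing_spaces:
            # Only trailing spaces are removed; leading spaces are kept.
            record = [field.rstrip(" ") for field in record]
        rows.append(record)
    return rows

data = "1,abc  \n2, def \n"
untrimmed = read_csv_rows(data)
trimmed = read_csv_rows(data, trim_trailing_spaces=True)
```

With the flag off, a field such as `abc  ` keeps its padding, so an equality filter against `abc` would miss it; with the flag on, the padding is gone and the comparison matches.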
defdae1e7d [improvement](stream-load) adjust read unit of http to optimize stream load (#9154) 2022-05-20 09:52:36 +08:00
1e940f28b0 [docs] Fix error command of meta tool docs (#9590)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-20 09:36:26 +08:00
c2d41c84bf [feature](nereids): add join rules base code (#9598) 2022-05-20 08:18:08 +08:00
2c79d223e4 [refactor][rowset]move rowset writer to a single place (#9368) 2022-05-19 23:57:02 +08:00
c048b1f0f9 [fix](sparkload): fix min_value will be negative number when maxGlobalDictValue exceeds integer range (#9436) 2022-05-19 23:56:24 +08:00
ef65f484df [Enhancement] improve parquet reader via arrow's prefetch and multi thread (#9472)
* add ArrowReaderProperties to parquet::arrow::FileReader

* support prefetch batch
2022-05-19 23:52:01 +08:00
1355bc162b [Enhance] Add host info to heartbeat error msg (#9499) 2022-05-19 23:45:53 +08:00
Pxl
6951c42d5c [Bug][Vectorized] fix schema change add varchar type column default value get wrong result (#9523) 2022-05-19 23:38:57 +08:00
c09858671d [improvement][performance] improve lru cache resize performance and memory usage (#9521) 2022-05-19 23:37:59 +08:00
939daa07f1 [fix] fix Code Quality Analysis failed (#9685) 2022-05-19 23:13:47 +08:00
0f9ef26576 [Bug] Fix timestamp_diff issue when timeunit is year and month (#9574) 2022-05-19 21:24:43 +08:00
73c4ec7167 Fix some typos in be/. (#9681) 2022-05-19 20:55:39 +08:00
87e3904cc6 Fix some typos for docs. (#9680) 2022-05-19 20:55:21 +08:00
cbc7b167b1 [Feature] cancel load support state (#9537) 2022-05-19 16:37:56 +08:00
119ff2c02d [enhancement] Improve debugging experience. (#9677) 2022-05-19 16:36:37 +08:00
235d586f11 [style](fe) code correct rules and name rules (#9670)
* [style](fe) code correct rules and name rules

* revert some change according to comments
2022-05-19 16:36:03 +08:00
7c2db79b73 [BUG] fix bug for vectorized compaction and some storage vectorization bug (#9610) 2022-05-19 16:35:15 +08:00
cbf1e20fbc [doc]update streamload 2pc doc (#9651)
Co-authored-by: wudi <>
2022-05-19 14:30:17 +08:00
7a9bf5b23e [FeConfig](Project) Project optimization is enabled by default (#9667) 2022-05-19 14:03:14 +08:00
86b2c01e85 [refactor][regressiontest] reorder license header and import statement (#9672) 2022-05-19 14:00:33 +08:00
dd5e9fa9a4 [doc] Fixed a error in the Bitmap Index section of the document (#9679) 2022-05-19 13:55:52 +08:00
3efe97e73c [website] fix doris website with no link to the Privacy Policy. (#9665)
All websites must link to the Privacy Policy
2022-05-18 22:49:49 +08:00
a3183ec45c [fix](planner) unnecessary cast will be added on children in CaseExpr sometimes (#9600)
An unnecessary cast is added to the children of CaseExpr because symbolized equality is used to compare `Expr` types.
This leads to incorrect expression comparison, which in turn makes expression substitution fail when `ExprSubstitutionMap` is used.
2022-05-18 22:44:51 +08:00
6602adf499 [regression test] Add compaction regression test case for different data models (#9660) 2022-05-18 17:12:20 +08:00
bdaf0b3fcc [fix](storage) low_cardinality_optimize core dump when is null predicate (#9586)
Issue Number: close #9555
Make the last value of the dictionary null. When ColumnDict inserts a null value,
add the encoding corresponding to the last value of the dictionary.
2022-05-18 14:57:13 +08:00
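
The idea described in #9586 (reserve the last dictionary slot for null) can be sketched roughly as below. The class `ToyColumnDict` and its methods are hypothetical and much simpler than the real ColumnDict; the sketch assumes the distinct values passed in do not themselves contain null.

```python
class ToyColumnDict:
    """Dictionary-encoded column whose last dictionary entry represents null."""

    def __init__(self, distinct_values):
        # The last slot is reserved for null (None here).
        self.dictionary = list(distinct_values) + [None]
        self.code_of = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = []

    def insert(self, value):
        # A null value gets the code of the last dictionary entry.
        self.codes.append(self.code_of[value])

    def is_null_mask(self):
        # An "is null" predicate becomes a comparison against one code.
        null_code = len(self.dictionary) - 1
        return [code == null_code for code in self.codes]

col = ToyColumnDict(["x", "y"])
for v in ["x", None, "y", None]:
    col.insert(v)
```

Because every inserted value, null included, has a valid code, a predicate evaluated on the codes never has to look up a missing entry.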
c9ab5e22fe [fixbug](vec-load) fix core of segment_writer while it is not thread-safe (#9569)
Introduced in stream-load-vec #9280. Multiple threads can operate on the
same segment writer because BetaRowset enables multi-threaded
memtable flush: memtable flush calls rowset_writer.add_block, which
writes through the member variable _segment_writer, so several
threads end up writing to the same segment writer.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-18 11:29:15 +08:00
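
The hazard described in #9569 is several flush threads writing through one shared writer. The sketch below shows one generic way to make such a shared member safe, by serializing access with a lock. It is an assumption for illustration and not necessarily how the PR fixed it; `ToySegmentWriter`, `ToyRowsetWriter`, and `flush_concurrently` are hypothetical names.

```python
import threading

class ToySegmentWriter:
    """Stand-in for a writer that is not safe to use from several threads."""

    def __init__(self):
        self.rows = []

    def append_block(self, block):
        self.rows.extend(block)

class ToyRowsetWriter:
    def __init__(self):
        self._segment_writer = ToySegmentWriter()
        self._lock = threading.Lock()

    def add_block(self, block):
        # Serialize access so only one flush thread uses the shared writer.
        with self._lock:
            self._segment_writer.append_block(block)

def flush_concurrently(writer, blocks):
    """Simulate concurrent memtable flushes, one thread per block."""
    threads = [threading.Thread(target=writer.add_block, args=(b,)) for b in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

writer = ToyRowsetWriter()
blocks = [[i] * 100 for i in range(8)]
flush_concurrently(writer, blocks)
```

An alternative design is to give each flush its own writer so that no state is shared at all; the lock is simply the smallest change to show here.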
94c89e8a37 [improment](planner) push down predicate past two phase aggregate (#9498)
The existing push-down of predicates past an aggregate cannot push predicates past a two-phase aggregate.

The original plan looks like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
It should be optimized to:
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```
2022-05-18 10:09:39 +08:00
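
Why the rewrite in #9498 preserves results can be seen in a small sketch. It assumes the predicate refers only to the grouping column, so filtering before or after aggregation keeps the same groups; the helper names and the SUM aggregate are illustrative assumptions, not Doris code.

```python
from collections import Counter

def first_phase(rows):
    """Partial SUM(value) GROUP BY key over one partition."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial

def second_phase(partials):
    """Merge the partial sums from all partitions."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

def keep(key):
    # Hypothetical predicate on the grouping column.
    return key != "b"

partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]

# Original plan: the conjunct is evaluated above the second-phase aggregate.
merged = second_phase(first_phase(p) for p in partitions)
plan_a = {k: v for k, v in merged.items() if keep(k)}

# Optimized plan: the conjunct is evaluated at the scan, below both phases.
scanned = [[row for row in p if keep(row[0])] for p in partitions]
plan_b = second_phase(first_phase(p) for p in scanned)

assert plan_a == plan_b
```

Both plans produce the same groups, but the optimized plan drops the filtered rows before either aggregation phase has to process them.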
682cc14182 [bug] (init) Java version check fail (#9607) 2022-05-18 07:47:03 +08:00
bfb1ab059d [BUG] fix information_schema.columns results not correctly on vec engine (#9612)
* VSchemaScanNode get_next bugfix

* add regression-test case for VSchemaScanNode

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-18 07:44:32 +08:00
b6f5c89f6c [regression test] add some case for json load regression test (#9614)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-05-18 07:43:51 +08:00