31e40191a8
[Refactor] add vpre_filter_expr for vectorized to improve performance ( #9508 )
2022-05-22 11:45:57 +08:00
0c4b47756a
[enhancement](community): enhance java style ( #9693 )
...
Enhance java style.
Currently, the checkstyle rule for code order is described on this page: Class and Interface Declarations.
This PR lets IDEA rearrange code automatically.
2022-05-20 15:24:30 +08:00
61a60d1dcc
[code style] minor update for code style ( #9695 )
2022-05-20 11:47:49 +08:00
8fa677b59c
[Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner ( #9666 )
...
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug where the vjson scanner does not support `range_from_file_path`
2. fix core dump in the vjson/vbroker scanner when the src and dest slots differ in nullability
3. fix bug in vparquet filter_block when the reference count of a column is not 1
4. refactor to simplify the code
Only the vectorized load is changed, not the original row-based load.
Co-authored-by: lihaopeng <lihaopeng@baidu.com >
2022-05-20 11:43:03 +08:00
6f61af7682
[Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf ( #9440 )
2022-05-20 10:26:09 +08:00
5fa6e892be
[fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. ( #9190 )
...
Hive and Trino/Presto automatically trim trailing spaces, but Doris does not.
This causes query results to differ from Hive.
Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, trailing spaces of each column are trimmed when reading CSV from the broker scan node.
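The trimming behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the broker scanner code; `trim_trailing_spaces` and its `enabled` flag are hypothetical names standing in for the session variable.

```python
def trim_trailing_spaces(fields, enabled):
    """Return the CSV fields of one row, trimming trailing spaces if enabled.

    `enabled` stands in for the session variable described above
    (hypothetical parameter name). Leading spaces are left untouched.
    """
    if not enabled:
        return list(fields)
    return [field.rstrip(" ") for field in fields]


# Example row as it might come out of a CSV split.
row = "alice  ,42 , bob".split(",")
trimmed = trim_trailing_spaces(row, enabled=True)
```

Only trailing spaces are removed, so a field such as `" bob"` keeps its leading space.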
2022-05-20 09:55:13 +08:00
defdae1e7d
[improvement](stream-load) adjust read unit of http to optimize stream load ( #9154 )
2022-05-20 09:52:36 +08:00
1e940f28b0
[docs] Fix error command of meta tool docs ( #9590 )
...
Co-authored-by: lihaopeng <lihaopeng@baidu.com >
2022-05-20 09:36:26 +08:00
c2d41c84bf
[feature](nereids): add join rules base code ( #9598 )
2022-05-20 08:18:08 +08:00
2c79d223e4
[refactor][rowset]move rowset writer to a single place ( #9368 )
2022-05-19 23:57:02 +08:00
c048b1f0f9
[fix](sparkload): fix min_value will be negative number when maxGlobalDictValue exceeds integer range ( #9436 )
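As a general illustration of the failure mode named in the title (not the code from this change): a value above the signed 32-bit range wraps to a negative number when narrowed to a 32-bit integer. `to_int32` is a hypothetical helper that mimics that narrowing in Python.

```python
def to_int32(value):
    """Mimic narrowing an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF                 # keep the low 32 bits
    if value >= 0x80000000:             # sign bit set: interpret as negative
        value -= 0x100000000
    return value


largest_int32 = 2**31 - 1
wrapped = to_int32(largest_int32 + 1)   # one past the int range becomes negative
```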
2022-05-19 23:56:24 +08:00
ef65f484df
[Enhancement] improve parquet reader via arrow's prefetch and multi thread ( #9472 )
...
* add ArrowReaderProperties to parquet::arrow::FileReader
* support prefetching batches
2022-05-19 23:52:01 +08:00
1355bc162b
[Enhance] Add host info to heartbeat error msg ( #9499 )
2022-05-19 23:45:53 +08:00
6951c42d5c
[Bug][Vectorized] fix schema change add varchar type column default value get wrong result ( #9523 )
2022-05-19 23:38:57 +08:00
c09858671d
[improvement][performance] improve lru cache resize performance and memory usage ( #9521 )
2022-05-19 23:37:59 +08:00
939daa07f1
[fix] fix Code Quality Analysis failed ( #9685 )
2022-05-19 23:13:47 +08:00
0f9ef26576
[Bug] Fix timestamp_diff issue when timeunit is year and month ( #9574 )
2022-05-19 21:24:43 +08:00
73c4ec7167
Fix some typos in be/. ( #9681 )
2022-05-19 20:55:39 +08:00
87e3904cc6
Fix some typos for docs. ( #9680 )
2022-05-19 20:55:21 +08:00
cbc7b167b1
[Feature] cancel load support state ( #9537 )
2022-05-19 16:37:56 +08:00
119ff2c02d
[enhancement] Improve debugging experience. ( #9677 )
2022-05-19 16:36:37 +08:00
235d586f11
[style](fe) code correct rules and name rules ( #9670 )
...
* [style](fe) code correct rules and name rules
* revert some change according to comments
2022-05-19 16:36:03 +08:00
7c2db79b73
[BUG] fix bug for vectorized compaction and some storage vectorization bug ( #9610 )
2022-05-19 16:35:15 +08:00
cbf1e20fbc
[doc]update streamload 2pc doc ( #9651 )
...
Co-authored-by: wudi <>
2022-05-19 14:30:17 +08:00
7a9bf5b23e
[FeConfig](Project) Project optimization is enabled by default ( #9667 )
2022-05-19 14:03:14 +08:00
86b2c01e85
[refactor][regressiontest] reorder license header and import statement ( #9672 )
2022-05-19 14:00:33 +08:00
dd5e9fa9a4
[doc] Fixed an error in the Bitmap Index section of the document ( #9679 )
2022-05-19 13:55:52 +08:00
3efe97e73c
[website] fix doris website with no link to the Privacy Policy. ( #9665 )
...
All websites must link to the Privacy Policy
2022-05-18 22:49:49 +08:00
a3183ec45c
[fix](planner) unnecessary cast will be added on children in CaseExpr sometimes ( #9600 )
...
Unnecessary casts are sometimes added on the children of CaseExpr because a symbolic equality check is used to compare against the `Expr`'s type.
This leads to incorrect expression comparison, which in turn makes expression substitution fail when `ExprSubstitutionMap` is used.
2022-05-18 22:44:51 +08:00
6602adf499
[regression test] Add compaction regression test case for different data models ( #9660 )
2022-05-18 17:12:20 +08:00
bdaf0b3fcc
[fix](storage) low_cardinality_optimize core dump when is null predicate ( #9586 )
...
Issue Number: close #9555
Make the last value of the dictionary null. When ColumnDict inserts a null value,
it adds the encoding corresponding to the last value of the dictionary.
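A minimal sketch of the idea, assuming a simple in-memory dictionary column. `DictColumn` and its methods are hypothetical names for illustration, not the ColumnDict implementation.

```python
class DictColumn:
    """Toy dictionary-encoded column whose last dictionary slot is NULL."""

    def __init__(self, values):
        # Real values get codes 0..n-1; the final slot (code n) is reserved for NULL.
        self.dictionary = list(values) + [None]
        self.code_of = {value: code for code, value in enumerate(values)}
        self.null_code = len(values)
        self.codes = []

    def insert(self, value):
        # A NULL is stored as the code of the last dictionary entry.
        if value is None:
            self.codes.append(self.null_code)
        else:
            self.codes.append(self.code_of[value])

    def is_null_mask(self):
        # An "is null" predicate becomes a plain comparison on codes.
        return [code == self.null_code for code in self.codes]


col = DictColumn(["a", "b"])
for v in ["a", None, "b"]:
    col.insert(v)
```

With NULL given its own code, an `is null` predicate never has to look up a value that is missing from the dictionary.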
2022-05-18 14:57:13 +08:00
c9ab5e22fe
[fixbug](vec-load) fix core of segment_writer while it is not thread-safe ( #9569 )
...
Introduced in stream-load-vec #9280. BetaRowset enables multi-threaded
memtable flush, and memtable flush calls rowset_writer.add_block, which
writes through the member variable _segment_writer. As a result, multiple
threads operate on the same segment writer, which is not thread-safe.
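The commit message does not say how the race is fixed. As a generic sketch only, one common way to protect a shared, non-thread-safe writer is to serialize access with a lock. All class and function names below are hypothetical, and the lock is an assumption rather than the approach taken in this change.

```python
import threading


class SegmentWriter:
    """Stand-in for a writer that is not safe to call from several threads."""

    def __init__(self):
        self.rows = []

    def append(self, block):
        self.rows.extend(block)


class RowsetWriter:
    """Owns one shared segment writer and guards it with a lock."""

    def __init__(self):
        self._segment_writer = SegmentWriter()
        self._lock = threading.Lock()

    def add_block(self, block):
        # Only one flushing thread may touch the shared writer at a time.
        with self._lock:
            self._segment_writer.append(block)


def flush_concurrently(writer, blocks):
    """Flush each block from its own thread and return the total rows written."""
    threads = [threading.Thread(target=writer.add_block, args=(b,)) for b in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(writer._segment_writer.rows)
```

An alternative design, equally hypothetical here, is to give each flush its own segment writer so that no state is shared at all.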
Co-authored-by: yixiutt <yixiu@selectdb.com >
2022-05-18 11:29:15 +08:00
94c89e8a37
[improvement](planner) push down predicate past two phase aggregate ( #9498 )
...
The existing rule for pushing predicates past an aggregate cannot push them past a two-phase aggregate.
The original plan looks like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
It should be optimized to:
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```
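The rewrite shown in the two plans above can be sketched on a toy plan tree. This is an illustration only: `PlanNode` and `push_down_past_two_phase_agg` are hypothetical names, and the sketch assumes every conjunct on the second-phase aggregate refers only to scan node tuples, as in the diagram. A real planner has to check that before moving a predicate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    name: str
    conjuncts: List[str] = field(default_factory=list)
    child: Optional["PlanNode"] = None


def push_down_past_two_phase_agg(second_phase: PlanNode) -> PlanNode:
    """Move conjuncts from the second-phase agg down to the scan node.

    Assumes the shape second agg -> first agg -> scan, and that all
    conjuncts reference only scan node tuples (not checked here).
    """
    first_phase = second_phase.child
    if first_phase is None or first_phase.child is None:
        return second_phase  # not a two-phase aggregate over a scan
    scan = first_phase.child
    scan.conjuncts.extend(second_phase.conjuncts)
    second_phase.conjuncts = []
    return second_phase


scan = PlanNode("olap scan node")
plan = PlanNode("second phase agg", ["k1 > 10"],
                PlanNode("first phase agg", [], scan))
push_down_past_two_phase_agg(plan)
```

After the call, the predicate sits on the scan node, so rows are filtered before either aggregation phase runs.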
2022-05-18 10:09:39 +08:00
682cc14182
[bug] (init) Java version check fail ( #9607 )
2022-05-18 07:47:03 +08:00
bfb1ab059d
[BUG] fix information_schema.columns returning incorrect results on vec engine ( #9612 )
...
* VSchemaScanNode get_next bugfix
* add regression-test case for VSchemaScanNode
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com >
2022-05-18 07:44:32 +08:00
b6f5c89f6c
[regression test] add some case for json load regression test ( #9614 )
...
Co-authored-by: hucheng01 <hucheng01@baidu.com >
2022-05-18 07:43:51 +08:00
26353ba8b5
[clang build]fix clang compile error ( #9615 )
2022-05-18 07:42:31 +08:00
908f9cb7b9
[Improvement][ASAN] make BE can exit normally and ASAN memory leak checking work ( #9620 )
2022-05-18 07:40:57 +08:00
4312ef93d7
[Improvement] reduce string size in serialization ( #9550 )
2022-05-17 22:38:34 +08:00
7d9c25e718
[config] Remove some old config and session variable ( #9495 )
...
1. Remove session variable "enable_lateral_view"
2. Remove Fe config: enable_materialized_view
3. Remove Fe config: enable_create_sync_job
4. Fe config dynamic_partition_enable is now only used to disable the dynamic partition scheduler.
2022-05-17 22:37:11 +08:00
2ba81899d0
[fix] fix bug that replica can not be repaired due to DECOMMISSION state ( #9424 )
...
Reset the state of replicas that are in the DECOMMISSION state after scheduling finishes.
2022-05-17 22:36:30 +08:00
4ba75d3195
[feature] Add StoragePolicyResource for Remote Storage ( #9554 )
2022-05-17 20:17:33 +08:00
d95fe08458
[feature] group_concat support distinct ( #9576 )
2022-05-17 19:29:47 +08:00
ec2cd0083a
[code format]Upgrade clang-format in BE Code Formatter from 8 to 13 ( #9602 )
2022-05-17 19:28:15 +08:00
7417f9dfa3
[doc]modified the spark-load doc ( #9605 )
2022-05-17 19:27:02 +08:00
0aac9489ae
[doc]add largeint doc ( #9609 )
2022-05-17 19:26:45 +08:00
536d8ca1ed
[Bug][Vectorized] Fix insert bitmap column with nullable column ( #9408 )
...
Co-authored-by: lihaopeng <lihaopeng@baidu.com >
2022-05-17 14:42:20 +08:00
1cc9653bd8
[Bug][Vectorized] Fix BE crash with delete condition and enable_storage_vectorization ( #9547 )
...
Co-authored-by: lihaopeng <lihaopeng@baidu.com >
2022-05-17 14:01:22 +08:00
7d9fa04472
[fix](storage-vectorized) fix VMergeIterator core dump ( #9564 )
...
The core dump can be reproduced on a rowset with many segments, that is, when segments overlap. It may not be easy to reproduce.
2022-05-17 11:58:59 +08:00
72e0042efb
[feature-wip](hudi) Step1: Support create hudi external table ( #9559 )
...
Support creating a Hudi table.
Support SHOW CREATE TABLE for a Hudi table.
### Design
1. create hudi table without schema (recommended)
```sql
CREATE [EXTERNAL] TABLE table_name
ENGINE = HUDI
[COMMENT "comment"]
PROPERTIES (
"hudi.database" = "hudi_db_in_hive_metastore",
"hudi.table" = "hudi_table_in_hive_metastore",
"hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
);
```
2. create hudi table with schema
```sql
CREATE [EXTERNAL] TABLE table_name
[(column_definition1[, column_definition2, ...])]
ENGINE = HUDI
[COMMENT "comment"]
PROPERTIES (
"hudi.database" = "hudi_db_in_hive_metastore",
"hudi.table" = "hudi_table_in_hive_metastore",
"hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
);
```
When creating a Hudi table with a schema, the columns must exist in the corresponding table in the Hive metastore.
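The column constraint above amounts to a subset check between the declared columns and the metastore table. A rough sketch, with hypothetical names; comparing names case-insensitively is an assumption made here for illustration, not something the commit states.

```python
def missing_columns(declared, metastore_columns):
    """Return declared column names that are absent from the metastore table.

    Names are compared case-insensitively (an assumption of this sketch).
    """
    known = {name.lower() for name in metastore_columns}
    return [name for name in declared if name.lower() not in known]


# Columns in the CREATE TABLE statement vs. columns in the Hive metastore table.
declared = ["id", "Name", "ts"]
in_metastore = ["id", "name"]
missing = missing_columns(declared, in_metastore)
```

A non-empty result would mean the CREATE TABLE statement should be rejected.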
2022-05-17 11:30:23 +08:00