Commit Graph

6608 Commits

Author SHA1 Message Date
218b0857ab [fix](string) allocate memory according to actual size instead of max size (#13112)
The string column length limit is 2GB. If we allocate memory according to the column length,
strings would consume a lot of memory. It would also mislead the memory tracker.
2022-10-06 09:56:22 +08:00
d286aa7bf7 [fix](spark-load) no need to filter row group when doing spark load (#13116)
1. Fix issue #13115 
2. Modify the `get_next_block` method of `GenericReader` to return "read_rows" explicitly.
    Some columns in a block may not be filled by the reader; if the first column is not filled, `block->rows()` cannot return the real number of rows.
3. Add more checks for broker load test cases.
2022-10-05 23:00:56 +08:00
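The reason for returning "read_rows" explicitly can be illustrated with a minimal Python sketch. The `Block` class and the `get_next_block` function below are hypothetical stand-ins, not the Doris C++ code; they only assume, as the commit message states, that the block derives its row count from the first column.

```python
class Block:
    """Hypothetical column-oriented block: one Python list per column."""

    def __init__(self, num_columns):
        self.columns = [[] for _ in range(num_columns)]

    def rows(self):
        # Assumption from the commit message: the row count is taken
        # from the first column only.
        return len(self.columns[0]) if self.columns else 0


def get_next_block(block, source_rows, filled_column_indexes):
    """Fill only the requested columns and return the rows read explicitly."""
    for idx in filled_column_indexes:
        block.columns[idx].extend(row[idx] for row in source_rows)
    return len(source_rows)


block = Block(num_columns=2)
# The reader fills only the second column, leaving the first one empty.
read_rows = get_next_block(block, [(1, "a"), (2, "b")], filled_column_indexes=[1])
```

In this sketch `block.rows()` reports 0 while `read_rows` is 2, which is the mismatch the explicit return value avoids.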
90512ebd59 [typo](docs)Metadata Operations and Maintenance link error (#13090)
* Metadata Operations and Maintenance link error
2022-10-05 22:58:24 +08:00
7b75c2df54 [fix](BE) fix the stream load error when upgrade BE from 1.1.2 to master (#13058) 2022-10-05 12:13:26 +08:00
80e1f401f0 [enhancement](memory) Fix USE_MEM_TRACKER=OFF compile (#13085) 2022-10-05 12:10:49 +08:00
4a0b4f1836 [fix](fe-test) TestWithFeService do not clean up dorisHome (#13073) 2022-10-04 21:32:27 +08:00
b083fb6d5f [fix](decimal) retain Decimal trailing zero when select on fe (#13065) 2022-10-04 21:31:18 +08:00
74fc98ceeb [improvement](ResourceTag) support upper case in tag name (#13063) 2022-10-04 21:30:37 +08:00
984d387945 [Regression](load) Add broker load regression test. (#13062)
Add basic broker load regression test. It has been tested. But default
2022-10-04 21:29:05 +08:00
e00124d825 [typo](doc) Modify the comment of light schema change (#13061) 2022-10-04 21:28:11 +08:00
0c67b14b6d [typo](doc) replace unuse parameter max_base_compaction_concurrency (#13047) 2022-10-04 21:27:38 +08:00
3f47f67b16 [fix](parquet) fix parquet write setting property is not effective (#12912) 2022-10-04 21:25:57 +08:00
e167aa120f [fix](jdbc) fix insert into date type to oracle using wrong type (#12883)
When using JDBC to insert a date type into ORACLE,
the to_date function should be used to convert the string to java.sql.date
2022-10-04 21:24:33 +08:00
5092ef78da [doc] Add python env for Mac M1 (#12792)
For Mac M1, the default is python3 instead of python.
When FE compiles, there will be an error that python cannot be found.
This PR complements this part of the description.
2022-10-04 21:24:08 +08:00
d10ab474f4 [fix](test) try to let cases run in parallel (#13114) 2022-10-04 20:56:22 +08:00
Pxl
db89b0b703 [Enhancement](optimize) optimize for function multiply on decimalv2 (#13049)
optimize for function multiply on decimalv2
2022-10-04 16:07:18 +08:00
0dd2fb758c [fix](test) add sync and drop table for insert.groovy and test_array_load.groovy (#13105)
We need sync for multi fe env.
2022-10-04 10:24:38 +08:00
b53533408b not allow alter mow property (#13108) 2022-10-03 21:31:09 +08:00
026ffaf10d [feature-wip](parquet-reader) add detail profile for parquet reader (#13095)
Add more detail profile for ParquetReader:
ParquetColumnReadTime: the total time of reading parquet columns
ParquetDecodeDictTime: time to parse dictionary page
ParquetDecodeHeaderTime: time to parse page header
ParquetDecodeLevelTime: time to parse page's definition/repetition level
ParquetDecodeValueTime: time to decode page data into doris column
ParquetDecompressCount: counter of decompressing page data
ParquetDecompressTime: time to decompress page data
ParquetParseMetaTime: time to parse parquet meta data
2022-10-02 15:11:48 +08:00
8b14c4aa98 [fix](compaction) don't log cumu policy name for quick compaction (#13101) 2022-10-01 21:40:42 +08:00
6fb9337095 [fix](test) add sync for some cases and adjust data path for tpch_unique_sql_zstd_p0 (#13102) 2022-10-01 21:26:50 +08:00
e9809b5721 [fix](test) add tpch_sf100 and fix results of tpcds_sf100 (#13098) 2022-10-01 20:53:04 +08:00
d44af5decf [fix](alter-load) fix bug that tablet version may be wrong when doing alter and load (#13070)
The `isRunning()` method of `TransactionState` is missing the `PRE_COMMITTED` status,
which causes a wrong judgment in `isPreviousTransactionsFinished`.
2022-09-30 23:39:30 +08:00
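A rough Python sketch of the issue described above, not the actual FE Java code: the `Status` enum, the set of running states, and both function names are assumptions made for illustration. Only the role of `PRE_COMMITTED` comes from the commit message.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical subset of transaction states, for illustration only.
    PREPARE = "PREPARE"
    PRE_COMMITTED = "PRE_COMMITTED"
    COMMITTED = "COMMITTED"
    VISIBLE = "VISIBLE"
    ABORTED = "ABORTED"


def is_running(status, include_pre_committed=True):
    """A transaction counts as running until it becomes visible or aborts."""
    running = {Status.PREPARE, Status.COMMITTED}
    if include_pre_committed:
        # Including this state models the behaviour after the fix.
        running.add(Status.PRE_COMMITTED)
    return status in running


def previous_transactions_finished(statuses, include_pre_committed=True):
    """True only if none of the earlier transactions is still running."""
    return not any(is_running(s, include_pre_committed) for s in statuses)
```

With `include_pre_committed=False` (the behaviour before the fix), a list containing a `PRE_COMMITTED` transaction is wrongly reported as finished.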
48d32de9ae [enhancement](test) add some cases from trino to p0 (#12699) 2022-09-30 21:35:30 +08:00
fd52f3bd51 [Doc](ReadME) Update the slack links (#13089) 2022-09-30 20:50:37 +08:00
95561baddd [fix](planner) throw NPE when all group by expr is constant and no agg expr in select list (#13087) 2022-09-30 18:47:01 +08:00
3294b18674 [Improvement](datev2) fix some compatible problems for datev2 (#13079) 2022-09-30 13:56:01 +08:00
e7f18e998a [chore](be-ut) Remove useless lines which cause compilation errors (#13053) 2022-09-30 11:26:25 +08:00
d73e437718 [fix](array-type) fix the be core dump when use string to insert array (#12728)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-09-30 10:44:27 +08:00
90f11ed7c1 [enhancement](Nereids) remove unnecessary exchange between global and distinct local aggregate node (#13057)
Add partition info into LogicalAggregate and set it as original group expression list of aggregate when we do aggregate disassemble with distinct aggregate function.
2022-09-29 23:12:37 +08:00
31a23baa37 [fix](planner) Add default execution interval time for stats framework (#13044)
Set a default execution interval for stats collection related threads.
2022-09-29 22:40:27 +08:00
7aae98eb71 [fix](comment) sparkload comment mislead which file types it support (#12982) 2022-09-29 20:23:57 +08:00
287ff50a6f [Bug](datev2) Fix compatible error between datev2 and date (#13024) 2022-09-29 18:01:55 +08:00
a7b42a7029 [Fix](Nereids) Fix exception message when can't bind slot. (#13048) 2022-09-29 16:51:07 +08:00
42729786bf [enhancement](Nereids) push filter into join otherJoinCondition (#12842) 2022-09-29 16:19:30 +08:00
1ae9454771 [enhancement](Nereids) planner performance speed up (#12858)
Optimize the planner by:
1. reducing duplicated calculation in equals, getOutput, computeOutput, etc.
2. getOnClauseUsedSlots: the two sides of equalTo are certainly slots, so there is no need to use a List.
2022-09-29 16:01:10 +08:00
34b14a71c8 [Improvement](string) Optimize scanning for string #12911
~0.2X performance boost for queries containing string predicates
2022-09-29 15:11:16 +08:00
fef1062835 [optimization](array-type) optimize the help docs of array type (#13001)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-09-29 14:36:32 +08:00
fae7296336 [Enhancement](fe-core) make UT-SelectRollupTest more stable (#13030) 2022-09-29 14:25:01 +08:00
c2fae109c3 [Improvement](outfile) Support output null in parquet writer (#12970) 2022-09-29 13:36:30 +08:00
29fc167548 [Bug](Datax)Fix bug that the dataxwriter will drop column when convert map to json (#13042)
* Fix bug that when a value is null, toJSONString will drop this key-value pair.
2022-09-29 11:37:10 +08:00
6b6d548df9 [enhancement](test) add more p0 cases (#12285) 2022-09-29 10:45:17 +08:00
bc2966ed80 [fix](like)the dictionary column should call get_shrink_value to get correct string value (#13032)
2022-09-29 09:09:36 +08:00
36bf8ad3eb [Opt](Vec) Support const column check nullable and remove nullable (#13020) 2022-09-29 08:39:19 +08:00
a853dd3c61 [Bug](aarch64) Fix the mmap errors which make BE down during starting up (#13031) 2022-09-29 08:36:58 +08:00
d53205076e [feature](Nereids) implicit cast StringLiteral to another side type of BinaryOperator if available (#13038)
For the expression 5 > '1': before this PR, we normalized it to '5' > '1'. After this PR, we normalize it to 5 > 1 to be compatible with the legacy planner.
2022-09-28 21:34:25 +08:00
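The normalization above can be sketched in Python as follows. This is an illustrative model, not the Nereids implementation: `normalize_comparison` is a hypothetical name, Python values stand in for literals, and falling back to a string comparison when the cast fails is an assumption based on the "if available" wording and the previous behaviour.

```python
def normalize_comparison(left, right):
    """Cast a string literal to the other operand's type when possible."""

    def try_cast(literal, target_type):
        try:
            return target_type(literal)
        except ValueError:
            return None

    if isinstance(right, str) and not isinstance(left, str):
        casted = try_cast(right, type(left))
        # Assumed fallback: compare as strings if the cast is not available.
        return (left, casted) if casted is not None else (str(left), right)
    if isinstance(left, str) and not isinstance(right, str):
        casted = try_cast(left, type(right))
        return (casted, right) if casted is not None else (left, str(right))
    return left, right
```

For example, `normalize_comparison(5, '1')` yields operands `5` and `1`, matching the 5 > 1 form described in the commit message.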
820ec435ce [feature-wip](parquet-reader) refactor parquet_predicate (#12896)
This change serves the following purposes:
1. Use ScanPredicate instead of TCondition for external tables, so that the old code branch can be reused.
2. Simplify and delete some useless old code.
3. Use ColumnValueRange to save predicates.
2022-09-28 21:27:13 +08:00
d739aa7c53 [enhancement](Nereids) optimization for star-schema join reorder (#12817)
The basic idea of star-schema support is:
1. fact_table JOIN dimension_table: if the dimension table is filtered, the result can be regarded as applying a filter on the fact table.
2. fact_table JOIN dimension_table: if the dimension table is not filtered, the number of join result tuples equals the number of fact tuples.
3. dimension_table JOIN fact_table: the number of join result tuples is that of the fact table or 2 times that of the dimension table.

If star-schema support is enabled:
1. Nereids regards the duplicate key (unique key/aggregation key) as the primary key.
2. Nereids tries to regard one join key as the primary key and the other join key as the foreign key.
3. If Nereids finds that no join key is a primary key, it falls back to normal estimation.
2022-09-28 21:09:55 +08:00
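The first two estimation rules above can be sketched as a single formula. The function name and the idea of scaling the fact table by the fraction of dimension rows that survive the filter are assumptions made for illustration; the commit message does not give the exact formula Nereids uses, and rule 3 is not modelled here.

```python
def estimate_fact_dim_join_rows(fact_rows, dim_rows, dim_rows_after_filter):
    """Estimate the row count of fact_table JOIN dimension_table.

    Assumes the join key is a primary key of the dimension table, so each
    fact row matches at most one dimension row.
    """
    if dim_rows <= 0:
        return 0.0
    # A filter on the dimension table acts like a filter on the fact table
    # with the same selectivity; without a filter the ratio is 1.
    return fact_rows * dim_rows_after_filter / dim_rows
```

When the dimension table is unfiltered, the estimate equals the number of fact rows, which corresponds to rule 2.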
7019166469 [enhancement](Nereids) let BinaryArithmetic's dataType and nullable match with BE (#13015)
Do type promotion for BinaryArithmetic:
- Add
- Subtract
- Multiply

Make the result always nullable for:
- Mod
2022-09-28 20:02:27 +08:00
cd549d8a8f [improvement](scan) remove concurrency limit if scan has predicate (#13021)
If a scan node has a predicate, we cannot limit the concurrency of the scanners,
because we don't know how much data needs to be scanned.
Limiting the concurrency would make the query very slow.

For example:
select * from tbl limit 1: the concurrency will be 1;
select * from tbl where k1=1 limit 1: the concurrency will not be limited.
2022-09-28 17:07:07 +08:00
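The rule in the commit above can be summarized with a simplified Python sketch. The function `scanner_concurrency` and its parameters are hypothetical, and reducing the concurrency to 1 for any LIMIT without a predicate is a simplification of the single example given in the message, not the actual BE logic.

```python
from typing import Optional


def scanner_concurrency(limit: Optional[int], has_predicate: bool,
                        max_concurrency: int) -> int:
    """Pick the scanner concurrency for a scan node (simplified model)."""
    if limit is not None and not has_predicate:
        # Without a predicate every scanned row is a result row,
        # so a small LIMIT is satisfied quickly by a single scanner.
        return 1
    # With a predicate the amount of data to scan is unknown,
    # so the concurrency is not limited.
    return max_concurrency
```

Under these assumptions, `select * from tbl limit 1` maps to `scanner_concurrency(1, False, n)` and returns 1, while adding `where k1=1` returns the full `n`.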