Commit Graph

423 Commits

Author SHA1 Message Date
581494dea8 [fix](test) resolve load in tpch_sf100_unique_p2 and tpch_sf10_unique_p2 (#13208) 2022-10-09 20:30:00 +08:00
3302e0b57e [enhancement](regression-test) add sync for unique table debug test (#13210) 2022-10-09 19:32:28 +08:00
f2159709a8 [Regression](outfile) Fix concurrency test failure caused by outfile (#13209) 2022-10-09 19:09:44 +08:00
fc711d89c8 [fix](projections) Open the project expressions properly. (#13162)
In the current 'ExecNode::open' function, the 'open(_projections)' call is unreachable, which might cause serious crashes. (#13150)
2022-10-09 18:43:45 +08:00
33fe389d62 [regression](datev2) Add regression tests for datev2 (#13040) 2022-10-09 11:55:06 +08:00
e0cff02c1a add sync for stream load test (#13185) 2022-10-09 11:36:01 +08:00
bbb6d2758a [fix](regression-test) fix test_segment_iterator_delete using order_qt_sql (#13192) 2022-10-09 11:35:22 +08:00
62c82bd575 [enhancement](test) Rewrite test_update_schema_change case (#13191) 2022-10-09 11:35:05 +08:00
b8b18e5153 [enhancement](array-type) Handle cast empty string value to array (#13028)
Handle empty values between two commas when casting a string to an array type.

before:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', ',', ',']          |
+------------------------------------+
1 row in set (0.01 sec)

after:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', '', '', '']        |
+------------------------------------+
1 row in set (0.01 sec)
2022-10-08 21:45:42 +08:00
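The "after" output of b8b18e5153 above can be reproduced with a short Python sketch. This is only an illustration, not the Doris implementation: the function name `cast_string_to_array` is hypothetical, and dropping the single empty piece after the final comma is an assumption chosen because it matches the six elements shown in the "after" result.

```python
from typing import List


def cast_string_to_array(text: str) -> List[str]:
    """Parse a bracketed, comma-separated string into a list of strings.

    Hypothetical helper: empty values between two commas are kept as
    empty strings. Dropping the empty piece after the last comma is an
    assumption made to match the "after" output in the commit message.
    """
    stripped = text.strip()
    if len(stripped) < 2 or stripped[0] != "[" or stripped[-1] != "]":
        raise ValueError("expected a bracketed list such as '[a,b]'")
    inner = stripped[1:-1]
    if inner.strip() == "":
        return []  # assumption: '[]' maps to an empty array
    elements = [piece.strip() for piece in inner.split(",")]
    if elements[-1] == "":
        elements.pop()  # discard the piece after the trailing comma
    return elements


print(cast_string_to_array("[a,b,c,,,,]"))  # ['a', 'b', 'c', '', '', '']
```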
cf2b93532b [fix](file-scanner) fix some logic about broker load with parquet with new file scanner (#13135)
Fix some logic about broker load using new file scanner, with parquet format:

1. If columns are specified in the load stmt but none of them are in the parquet file,
    an error is thrown, such as `err: No columns found in file`. See `parquet_s3_case4`

2. If the first column of the table is not in the file, the resulting number of rows is wrong.
    See `parquet_s3_case8`

3. If a column specified in `columns` in the load stmt exists in neither the file nor the table,
    an error is thrown, such as `failed to find default value expr for slot: x1`. See `parquet_s3_case2`
2022-10-08 13:08:08 +08:00
e0f17f217f [fix](test) resolve tpch_sf100_unique_p2 and tpch_sf10_unique_p2 to run in parallel (#13138) 2022-10-08 09:10:22 +08:00
8b03977689 fix bug that last line of data lost for stream load when line delimiter is more than one character (#13066) 2022-10-07 16:12:05 +08:00
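The failure mode named in 8b03977689 (a final line with no trailing delimiter, combined with a multi-character delimiter) can be illustrated with a small Python sketch. It is not the Doris stream load code: `split_lines` is a hypothetical name, and the buffering strategy is an assumption. It shows one way to carry a partial delimiter across chunk boundaries and still emit the last line.

```python
from typing import Iterable, Iterator


def split_lines(chunks: Iterable[bytes], delimiter: bytes) -> Iterator[bytes]:
    """Yield lines from a chunked byte stream split on a delimiter.

    Hypothetical sketch: the delimiter may be longer than one byte and
    may straddle two chunks, so unconsumed bytes stay in a buffer.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break  # no complete delimiter yet; wait for more data
            yield buffer[:index]
            buffer = buffer[index + len(delimiter):]
    if buffer:
        # The last line has no trailing delimiter; emit it rather than drop it.
        yield buffer


print(list(split_lines([b"a||b|", b"|c"], b"||")))  # [b'a', b'b', b'c']
```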
d286aa7bf7 [fix](spark-load) no need to filter row group when doing spark load (#13116)
1. Fix issue #13115 
2. Modify the `get_next_block` method of `GenericReader` to return "read_rows" explicitly.
    Some columns in the block may not be filled by the reader. If the first column is not filled, `block->rows()` cannot return the real number of rows.
3. Add more checks for broker load test cases.
2022-10-05 23:00:56 +08:00
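Item 2 of d286aa7bf7 above can be illustrated with a toy Python model. The `Block` class and the `get_next_block` function below are hypothetical stand-ins, not the Doris C++ types: the sketch assumes that a block derives its row count from its first column, and shows why a reader that may leave that column unfilled needs to report the number of rows read explicitly.

```python
from typing import Dict, Iterable, List, Set


class Block:
    """Toy columnar block whose row count comes from its first column."""

    def __init__(self, column_names: Iterable[str]) -> None:
        self.columns: Dict[str, List[object]] = {name: [] for name in column_names}

    def rows(self) -> int:
        for values in self.columns.values():
            return len(values)  # length of the first column only
        return 0


def get_next_block(block: Block, batch: List[dict], file_columns: Set[str]) -> int:
    """Fill only the columns present in the file and return the rows read."""
    for name, values in block.columns.items():
        if name in file_columns:
            values.extend(row[name] for row in batch)
    return len(batch)  # explicit count, independent of which columns were filled


block = Block(["not_in_file", "v"])
read_rows = get_next_block(block, [{"v": 1}, {"v": 2}], {"v"})
print(read_rows, block.rows())  # 2 0
```

In this model the explicit return value stays correct even though the first column of the block was never filled.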
b083fb6d5f [fix](decimal) retain Decimal trailing zero when select on fe (#13065) 2022-10-04 21:31:18 +08:00
984d387945 [Regression](load) Add broker load regression test. (#13062)
Add basic broker load regression test. It has been tested. But default
2022-10-04 21:29:05 +08:00
d10ab474f4 [fix](test) try to let cases run in parallel (#13114) 2022-10-04 20:56:22 +08:00
0dd2fb758c [fix](test) add sync and drop table for insert.groovy and test_array_load.groovy (#13105)
We need sync for multi-FE environments.
2022-10-04 10:24:38 +08:00
6fb9337095 [fix](test) add sync for some cases and adjust data path for tpch_unique_sql_zstd_p0 (#13102) 2022-10-01 21:26:50 +08:00
e9809b5721 [fix](test) add tpch_sf100 and fix results of tpcds_sf100 (#13098) 2022-10-01 20:53:04 +08:00
48d32de9ae [enhancement](test) add some cases from trino to p0 (#12699) 2022-09-30 21:35:30 +08:00
d73e437718 [fix](array-type) fix the be core dump when use string to insert array (#12728)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-09-30 10:44:27 +08:00
287ff50a6f [Bug](datev2) Fix compatible error between datev2 and date (#13024) 2022-09-29 18:01:55 +08:00
c2fae109c3 [Improvement](outfile) Support output null in parquet writer (#12970) 2022-09-29 13:36:30 +08:00
6b6d548df9 [enhancement](test) add more p0 cases (#12285) 2022-09-29 10:45:17 +08:00
28ce1878ca [fix](planner) fix push down no grouping agg (#12983)
The value column of the agg does not support the zone_map index. This fixes a null pointer caused by pushing the value column down to the zone map.
2022-09-28 17:01:01 +08:00
e627d285e0 [chore](regression-test) add default group(p0) for regression-test (#12977) 2022-09-28 11:47:19 +08:00
a79d2e592b [improvement](test) cache data from s3 to cacheDataPath (#13018)
Currently, regression data is stored in sf1DataPath, which is either local or remote.
For performance reasons, we use a local dir for the community pipeline. However, we need to prepare the data on every machine,
and this process is error-prone. So we transparently cache data from s3 locally, and thus only need to configure one data source.
2022-09-28 10:43:55 +08:00
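The transparent caching described in a79d2e592b above can be sketched in Python as a download-on-miss lookup. This is an illustration under stated assumptions, not the regression framework's code: `cached_data_path` and the `fetch` callback are hypothetical names, and the remote read is abstracted behind the callback so the sketch makes no claim about any S3 client API.

```python
import os
from typing import Callable


def cached_data_path(name: str, cache_dir: str, fetch: Callable[[str], bytes]) -> str:
    """Return a local path for `name`, fetching it only on a cache miss.

    Hypothetical sketch: `fetch` stands in for whatever reads the object
    from the remote data source.
    """
    local_path = os.path.join(cache_dir, name)
    if os.path.exists(local_path):
        return local_path  # cache hit: no remote access needed
    parent = os.path.dirname(local_path) or "."
    os.makedirs(parent, exist_ok=True)
    partial_path = local_path + ".part"
    with open(partial_path, "wb") as handle:
        handle.write(fetch(name))
    # Rename only after the write finishes, so an interrupted download
    # is never mistaken for a complete cached file.
    os.replace(partial_path, local_path)
    return local_path
```

With a lookup of this kind, each machine fills its own cache on first use, so only the remote source has to be configured.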
339877930d [fix](join)report 'natural join is not supported' instead of getting wrong result (#13008)
* [fix](join)report 'natural join is not supported' instead of getting wrong result

* add regression test
2022-09-28 09:08:56 +08:00
Pxl
ee3dd423b9 [Bug](function) core dump on substr #13007 2022-09-28 08:54:49 +08:00
57570f2090 [feature](Nereids) Set pre-aggregation status for OLAP table scan (#12785)
This is the second step for #12303.

The previous PR, #12464, added the framework for selecting the rollup index of an OLAP table, but pre-aggregation is turned on by default.
This PR sets the pre-aggregation status for OLAP table scans.

The main steps are as below:
1. Select the rollup index when an aggregate is present. This is handled by the `SelectRollupWithAggregate` rule. Expressions in aggregate functions, grouping expressions, and pushdown predicates are used to check whether pre-aggregation should be turned off.
2. Selecting from an OLAP table scan without an aggregate plan is handled by `SelectRollupWithoutAggregate`.
2022-09-27 19:12:15 +08:00
c21ecdd867 [enhancement](test) add tpcds_sf1000 to p2 (#12695) 2022-09-27 17:12:52 +08:00
eba71cf5da [enhancement](test) add tpch_sf10 cases to p2 (#12698) 2022-09-27 17:12:37 +08:00
cbdef66757 [test](join)add join case5 #12854 2022-09-27 15:48:36 +08:00
3dfcfc69ee [regression-test](join)add join case5 #12854 2022-09-27 15:47:36 +08:00
3f99dd5c4b [function](bitmap) support bitmap_hash64 (#12992) 2022-09-27 12:16:02 +08:00
a6db5e63df [fix](projection)sort node's unmaterialized slots should be removed from resolvedTupleExprs (#12963) 2022-09-27 11:46:44 +08:00
df9dcba6db [regression-case](improve) improve regression test case (#12979) 2022-09-27 08:53:53 +08:00
1bb42a7bc0 [function](hash) add support of murmur_hash3_64 (#12923) 2022-09-26 14:23:37 +08:00
7f2ea35b63 [enhancement](test) add brown cases to p2 (#12694) 2022-09-25 23:46:45 +08:00
60556070bb [enhancement](test) add github events cases to p2 (#12696) 2022-09-25 23:46:15 +08:00
bb36490d95 [test](Nereids) add TPC-H Q2 as regression test case (#12840) 2022-09-23 11:00:31 +08:00
0203b36cc4 [regressiontest](test_with)add with_case test (#12814) 2022-09-23 09:10:33 +08:00
6cd4c9ecb5 [bugfix](fe) Fix test_materialized_view_hll case npt. (#12829)
When light schema change is enabled, running the test_materialized_view_hll case throws a NullPointerException:
  java.lang.NullPointerException: null
      at org.apache.doris.analysis.SlotDescriptor.setColumn(SlotDescriptor.java:153)
      at org.apache.doris.planner.OlapScanNode.updateSlotUniqueId(OlapScanNode.java:399)
2022-09-22 09:50:53 +08:00
7b46e2400f [enhancement](Nereids) add all necessary PhysicalDistribute on Join's child to ensure get correct cost (#12483)
An earlier PR, #11976, added shuffle join and bucket shuffle support. However, if the distribution spec of the join's right child already satisfied the join's requirement, no distribute was added on the right child. Instead, it was added in the plan translator.
It is hard to calculate an accurate cost this way, because some distribute costs are not counted.
This PR introduces a new shuffle type, BUCKET, and changes how enforcers are added so that all necessary distributes are added in the cost and enforcer job.
2022-09-21 12:18:37 +08:00
b6e20db997 [fix](outfile) select OBJECT and HLL columns into outfile as null. (#12734) 2022-09-21 11:24:31 +08:00
632867c1c1 [Bug](datetimev2) Fix lost precision for datetimev2 (#12723) 2022-09-21 11:15:02 +08:00
f1539761e8 [Bugfix](string_functions) rearrange code to avoid global buffer overflow in FindInSetOp::execute (#12677) 2022-09-21 09:19:38 +08:00
c5b6056b7a [fix](lateral_view) fix lateral view explode_split with temp table (#12643)
Problem description:

The following SQL returns a wrong result:
WITH example1 AS ( select 6 AS k1 ,'a,b,c' AS k2) select k1, e1 from example1 lateral view explode_split(k2, ',') tmp as e1;

Wrong result:

+------+------+
| k1   | e1   |
+------+------+
|    0 | a    |
|    0 | b    |
|    0 | c    |
+------+------+
Correct result should be:
+------+------+
| k1   | e1   |
+------+------+
|    6 | a    |
|    6 | b    |
|    6 | c    |
+------+------+
Why?
TableFunctionNode::outputSlotIds does not include column k1.

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-09-21 09:19:18 +08:00
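The correct result in c5b6056b7a above amounts to each exploded row carrying over the source row's other columns. The Python sketch below models only that expected semantics. It is not the Doris `TableFunctionNode`, and the `explode_split` function here is a hypothetical helper operating on plain dictionaries.

```python
from typing import Dict, List


def explode_split(rows: List[Dict[str, object]], split_col: str,
                  separator: str, out_col: str) -> List[Dict[str, object]]:
    """Emit one output row per split piece, keeping all source columns."""
    result = []
    for row in rows:
        for piece in str(row[split_col]).split(separator):
            out_row = dict(row)  # carry k1 (and any other column) through
            out_row[out_col] = piece
            result.append(out_row)
    return result


rows = explode_split([{"k1": 6, "k2": "a,b,c"}], "k2", ",", "e1")
print([(r["k1"], r["e1"]) for r in rows])  # [(6, 'a'), (6, 'b'), (6, 'c')]
```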
7dfbb7c639 [chore](regression-test) add order by column in tpch_sf1_p1/tpch_sf1/nereids/q11.groovy (#12770) 2022-09-20 22:26:24 +08:00
cc072d35b7 [Bug](date) Fix wrong type in TimestampArithmeticExpr (#12727) 2022-09-20 21:08:48 +08:00