This PR makes the following changes to JdbcOracleClient:
#### bug fixes:
1. Fix an issue where, if a table name containing `/` characters has a similarly named table, schema synchronization mixes up the columns of the two tables
2. Fix an NPE during metadata synchronization when the `lower_case_table_names` configuration is enabled
#### improvements:
1. Change how the Oracle user to Doris database mapping is synchronized: use `metadata.getSchemas` instead of `SELECT DISTINCT OWNER FROM all_tables`
2. When synchronizing metadata, pass `conn.getCatalog()` instead of `null` at the catalog level
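
For reference, a minimal JDBC sketch of the new approach (the connection URL, credentials, and schema/table names are placeholders, and this is not the actual `JdbcOracleClient` code):

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class OracleSchemaSync {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//host:1521/service", "user", "password")) {
            DatabaseMetaData metadata = conn.getMetaData();

            // List schemas (Oracle users) via the driver instead of querying all_tables.
            try (ResultSet schemas = metadata.getSchemas()) {
                while (schemas.next()) {
                    System.out.println(schemas.getString("TABLE_SCHEM"));
                }
            }

            // Pass the connection's catalog instead of null when reading column metadata.
            try (ResultSet columns = metadata.getColumns(
                    conn.getCatalog(), "SOME_SCHEMA", "SOME_TABLE", null)) {
                while (columns.next()) {
                    System.out.println(columns.getString("COLUMN_NAME")
                            + " " + columns.getString("TYPE_NAME"));
                }
            }
        }
    }
}
```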
A row of a complex type may be stored across two (or more) pages, and the parameter `align_rows` indicates whether the reader should read the remaining values of the last row from the previous page.
`ParquetReader` confuses the logical/physical/slot ids of columns. When only scalar types are read, nothing goes wrong, but when reading complex types, `RowGroup` and `PageIndex` pick up the wrong statistics. Therefore, if a query contains complex types and pushed-down predicates, the result set is likely to be incorrect.
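
To illustrate why the ids diverge, here is a small, self-contained sketch using the parquet-mr Java API with a made-up schema (not taken from the reader code): a complex column expands into several leaf columns, so indexing row-group or page-index statistics by the top-level field position no longer lines up.

```java
import java.util.List;

import org.apache.parquet.column.ColumnDescriptor;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class ColumnIdMismatch {
    public static void main(String[] args) {
        // A table with one scalar column, one map column (complex type), and another scalar.
        MessageType schema = MessageTypeParser.parseMessageType(
                "message t {"
                + "  required int32 id;"
                + "  optional group tags (MAP) {"
                + "    repeated group key_value {"
                + "      required binary key (UTF8);"
                + "      optional binary value (UTF8);"
                + "    }"
                + "  }"
                + "  optional double score;"
                + "}");

        // Logical (top-level) fields: id, tags, score -> 3
        System.out.println("top-level fields = " + schema.getFieldCount());

        // Physical (leaf) columns: id, tags.key_value.key, tags.key_value.value, score -> 4
        List<ColumnDescriptor> leaves = schema.getColumns();
        for (int i = 0; i < leaves.size(); i++) {
            System.out.println("leaf " + i + " = " + String.join(".", leaves.get(i).getPath()));
        }
        // Row-group and page-index statistics are stored per leaf column, so looking them
        // up with the top-level index of `score` (2) would return the statistics of
        // `tags.key_value.value` instead.
    }
}
```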
FEATURE:
1. enable array type in Nereids
2. support generics on function signatures (see the sketch after these lists)
3. support array and map types in type coercion and type checking
4. add `element_at` and `element_slice` syntax to the Nereids parser
REFACTOR:
1. remove `AbstractDataType`
BUG FIX:
1. remove `FROM` from the nonReserved keyword list
TODO:
1. support lambda expressions
2. use Nereids' own way to do function type coercion
3. use `castIfNotSame` when doing implicit casts on `BoundFunction`
4. let `AnyDataType` type coercion do the same thing as function type coercion
5. add the array functions below:
- array_apply
- array_concat
- array_filter
- array_sortby
- array_exists
- array_first_index
- array_last_index
- array_count
- array_shuffle / shuffle
- array_pushfront
- array_pushback
- array_repeat
- array_zip
- reverse
- concat_ws
- split_by_string
- explode
- bitmap_from_array
- bitmap_to_array
- multi_search_all_positions
- multi_match_any
- tokenize
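
To make the "generics on function signatures" item concrete, here is a purely hypothetical Java sketch of the idea; all class and method names are invented for illustration and are not the Nereids API. A signature such as `element_at(ARRAY<T>, INT) -> T` binds the type variable `T` from the actual argument type and derives the return type from it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical type model, only for illustration.
interface DataType {}
record DecimalType(int precision, int scale) implements DataType {}
record ArrayType(DataType itemType) implements DataType {}
record TypeVariable(String name) implements DataType {}

public class GenericSignatureDemo {
    // Resolve type variables by matching a declared argument type against the actual one.
    static void bind(DataType declared, DataType actual, Map<String, DataType> bindings) {
        if (declared instanceof TypeVariable tv) {
            bindings.put(tv.name(), actual);
        } else if (declared instanceof ArrayType da && actual instanceof ArrayType aa) {
            bind(da.itemType(), aa.itemType(), bindings);
        }
    }

    // Substitute bound variables into the declared return type.
    static DataType substitute(DataType declared, Map<String, DataType> bindings) {
        if (declared instanceof TypeVariable tv) {
            return bindings.getOrDefault(tv.name(), declared);
        }
        if (declared instanceof ArrayType a) {
            return new ArrayType(substitute(a.itemType(), bindings));
        }
        return declared;
    }

    public static void main(String[] args) {
        // element_at(ARRAY<T>, INT) -> T; the INT index argument needs no binding and is omitted.
        DataType declaredArg = new ArrayType(new TypeVariable("T"));
        DataType declaredRet = new TypeVariable("T");

        // Called as element_at(ARRAY<DECIMAL(10, 2)>, 3).
        DataType actualArg = new ArrayType(new DecimalType(10, 2));

        Map<String, DataType> bindings = new HashMap<>();
        bind(declaredArg, actualArg, bindings);
        System.out.println("return type = " + substitute(declaredRet, bindings));
        // prints: return type = DecimalType[precision=10, scale=2]
    }
}
```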
If we write SQL like `select cast(array() as array<varchar(10)>)`,
`CastExpr` in FE will call `analyze()`, which uses `Type.matchExactType(childType, type, true);`.
For array types this only checks `contains_null`, but it should also check the inner item type for `matchExactType` on arrays to be correct.
If we write SQL like `select array(1.0, 2.0, null, null, 2.0)`,
FE passes the argument type `uint8` (for the null literals) to BE, which does not match the `array()` function signature with decimal arguments and makes BE core dump. So a cast should be added so that the null arguments carry the decimal type.
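
As a rough illustration of the intended check, here is a simplified, stand-alone Java sketch (not the actual `Type.matchExactType` implementation in FE): an exact match for array types has to recurse into the item type instead of only comparing the `contains_null` flag.

```java
// Simplified model of the idea; the real FE type classes carry more state.
interface Type {}
record ScalarType(String name, int len) implements Type {}           // e.g. VARCHAR(10)
record ArrayType(Type itemType, boolean containsNull) implements Type {}

public class MatchExactTypeSketch {
    static boolean matchExactType(Type a, Type b) {
        if (a instanceof ArrayType aa && b instanceof ArrayType ab) {
            // Comparing only containsNull is not enough: ARRAY<VARCHAR(10)> and
            // ARRAY<INT> would otherwise be treated as an exact match.
            return aa.containsNull() == ab.containsNull()
                    && matchExactType(aa.itemType(), ab.itemType());
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        Type a = new ArrayType(new ScalarType("VARCHAR", 10), true);
        Type b = new ArrayType(new ScalarType("INT", 4), true);
        System.out.println(matchExactType(a, a)); // true
        System.out.println(matchExactType(a, b)); // false: inner item types differ
    }
}
```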
[Fix](orc-reader) Fix using an incorrect row count when filling partition and missing columns.
`_row_reader->nextBatch` returns the number of rows read. When ORC lazy materialization is turned on, this number includes filtered rows, so the caller must look at `numElements` in the row batch to determine how many rows survived filtering and should be filled into the block.
In this case, partition and missing columns were filled with the incorrect row count, which causes BE to crash on `filter.size() != offsets.size()` in the filter-column step.
When ORC lazy materialization is turned off, call `_convert_dict_cols_to_string_cols(block, nullptr)` if `block->rows() == 0`.
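
The same "trust the batch, not the request" rule exists in the Java ORC API; below is a minimal sketch for context (the file path and Hadoop configuration are placeholders, and this is the Java library, not the BE's C++ reader): the number of rows actually delivered comes from the batch itself.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.RecordReader;

public class OrcBatchRowCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Reader reader = OrcFile.createReader(new Path("/tmp/example.orc"),
                OrcFile.readerOptions(conf));
        // Ask for up to 1024 rows per batch; the reader may deliver fewer.
        VectorizedRowBatch batch = reader.getSchema().createRowBatch(1024);
        RecordReader rows = reader.rows();
        long total = 0;
        while (rows.nextBatch(batch)) {
            // The number of rows actually in this batch is read from the batch
            // itself (batch.size), never assumed from the requested size.
            total += batch.size;
        }
        rows.close();
        System.out.println("rows read = " + total);
    }
}
```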
Previously, delete statements with conditions on value columns were only supported on duplicate tables. After introducing the delete-sign mechanism for batch delete, a delete statement with conditions on value columns on a unique table is transformed into the corresponding `insert into ..., __DELETE_SIGN__ select ...` statement. However, for unique tables with merge-on-write enabled, the overhead of inserting this data can be eliminated. So this PR adds the ability to allow delete predicates on value columns for merge-on-write unique tables.
## Proposed changes
Refactor thoughts: close #22383
Descriptions of `enclose` and `escape`: #22385
## Further comments
2023-08-09:
It's a pity that experiments show the original way of parsing plain CSV is faster. Therefore, the refactor is only applied to the enclose-related code; the plain CSV parser keeps the original logic.
Some performance regression is unavoidable anyway. From the `CSV reader`'s perspective, the real weak point may be the column-writing behavior, as shown by the flame graph.
Trimming of escape characters will be enabled after the fix in #22411 is merged.
Cases to be discussed:
1. When an incomplete enclose appears at the beginning of a large amount of data, the line delimiter is unreachable until EOF. Will the buffer become extremely large?
2. What if an infinitely long line occurs? Essentially, case 1 is equivalent to this.
Only stream load is supported in this PR, as a trial, to avoid too many unrelated changes. Docs will be added when `enclose` and `escape` are available for all kinds of load.
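
For readers unfamiliar with the two options, here is a small, self-contained Java sketch of what an enclose/escape-aware field splitter has to handle (the rules are simplified for illustration and this is not the BE implementation): a column separator inside an enclosed field must not split the field, and an escaped character must be taken literally.

```java
import java.util.ArrayList;
import java.util.List;

public class EncloseAwareSplit {
    /**
     * Split one CSV line into fields, honoring an enclose character and an
     * escape character. Simplified: no trimming, single-character separators only.
     */
    static List<String> split(String line, char sep, char enclose, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean enclosed = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                cur.append(line.charAt(++i));        // escaped char is taken literally
            } else if (c == enclose) {
                enclosed = !enclosed;                // toggle the enclosed state
            } else if (c == sep && !enclosed) {
                fields.add(cur.toString());          // separator outside enclose ends a field
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The comma inside the quotes is data, not a separator; \" is an escaped quote.
        System.out.println(split("1,\"a,b\",c\\\"d", ',', '"', '\\'));
        // prints: [1, a,b, c"d]
    }
}
```

Note that an unterminated enclose keeps the parser in the enclosed state until the end of the input, which is exactly the buffer-growth concern raised in case 1 above.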