A SQL query that joins tables with ARRAY columns calls `ColumnArray::replicate`. In this PR, we implement `replicate` for the ARRAY type to support SQL like this:
`SELECT count(lo_array),count(d_array),SUM(lo_extendedprice*lo_discount) AS REVENUE FROM lineorder, date WHERE lo_orderdate = d_datekey AND d_year = 1993 AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25;`
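For context, `replicate` expands a column by repeating each source row a given number of times, which is what the join needs when one row matches several rows on the other side. Below is a minimal, self-contained sketch of that semantics on a simplified offset-based array column; `SimpleArrayColumn` and its per-row counts are illustrative assumptions, not the actual Doris `ColumnArray` interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Illustrative only: an array column stored as flat element data plus row
// offsets, in the spirit of ColumnArray (not the real Doris class).
struct SimpleArrayColumn {
    std::vector<int32_t> data;    // flattened element values
    std::vector<size_t> offsets;  // offsets[i] = end of row i in `data`

    size_t row_begin(size_t row) const { return row == 0 ? 0 : offsets[row - 1]; }
    size_t row_end(size_t row) const { return offsets[row]; }

    // replicate: repeat row i `counts[i]` times, preserving element order.
    SimpleArrayColumn replicate(const std::vector<size_t>& counts) const {
        SimpleArrayColumn out;
        for (size_t row = 0; row < offsets.size(); ++row) {
            for (size_t rep = 0; rep < counts[row]; ++rep) {
                for (size_t i = row_begin(row); i < row_end(row); ++i) {
                    out.data.push_back(data[i]);
                }
                out.offsets.push_back(out.data.size());
            }
        }
        return out;
    }
};

int main() {
    // Two rows: [1, 2] and [3]; replicate the first twice and the second once.
    SimpleArrayColumn col{{1, 2, 3}, {2, 3}};
    SimpleArrayColumn rep = col.replicate({2, 1});
    for (size_t row = 0; row < rep.offsets.size(); ++row) {
        for (size_t i = rep.row_begin(row); i < rep.row_end(row); ++i) {
            std::cout << rep.data[i] << ' ';
        }
        std::cout << '\n';  // prints "1 2", "1 2", "3"
    }
}
```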
## Design:
For now, there are two categories of types in Doris: scalar types (such as int and char) and composite types (such as array). For performance, we can cache the type info of scalar types globally as unique objects, because the number of scalar types is limited. For composite types, the type info is normally generated at runtime (a cache strategy could also be used to speed this up), so the memory allocated for composite type info has to be reclaimed.
There are many interfaces for getting the type info of a specific type. I reorganized them as described below.
1. `const TypeInfo* get_scalar_type_info(FieldType field_type)`
The function returns the type info of a scalar type. Because the result comes from a global cache, the caller uses it **WITHOUT** worrying about memory reclamation (a minimal sketch of this caching approach follows the list).
2. `const TypeInfo* get_collection_type_info(FieldType sub_type)`
The function returns the type info of an array type with exactly **ONE** level of nesting. Because the result is cached, the caller uses it **WITHOUT** worrying about memory reclamation.
3. `TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb)`
4. `TypeInfoPtr get_type_info(const TabletColumn* col)`
These functions return the type info of **BOTH** scalar types and composite types. The caller is responsible for managing the returned resources.
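A minimal sketch of the global caching idea for scalar type info, assuming a small closed `FieldType` enum; the enum values and the trivial `TypeInfo` struct here are illustrative, not the real Doris definitions.

```cpp
#include <array>
#include <cstddef>

// Illustrative stand-ins for the real FieldType / TypeInfo.
enum class FieldType { INT = 0, BIGINT, VARCHAR, NUM_TYPES };

struct TypeInfo {
    FieldType type;
};

// One global, immutable TypeInfo per scalar type: callers receive a raw
// pointer into the cache and never free it.
const TypeInfo* get_scalar_type_info(FieldType field_type) {
    static const std::array<TypeInfo, static_cast<std::size_t>(FieldType::NUM_TYPES)>
            kScalarTypeInfos = {TypeInfo{FieldType::INT}, TypeInfo{FieldType::BIGINT},
                                TypeInfo{FieldType::VARCHAR}};
    return &kScalarTypeInfos[static_cast<std::size_t>(field_type)];
}

int main() {
    const TypeInfo* a = get_scalar_type_info(FieldType::INT);
    const TypeInfo* b = get_scalar_type_info(FieldType::INT);
    // The same cached object is returned both times; no ownership transfer.
    return (a == b) ? 0 : 1;
}
```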
#### About the new type `TypeInfoPtr`
`TypeInfoPtr` is an alias for a `unique_ptr` with a custom deleter; a minimal sketch of the pattern follows the two cases below.
1. For scalar types, the deleter does nothing.
2. For composite types, the deleter reclaims the memory.
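A minimal sketch of the alias-with-custom-deleter pattern, assuming a hypothetical `is_scalar` flag to tell cached scalar objects apart from heap-allocated composite ones; the real implementation may distinguish the two cases differently.

```cpp
#include <memory>

// Illustrative stand-in, not the real Doris TypeInfo.
struct TypeInfo {
    bool is_scalar;  // hypothetical flag: cached scalar vs. heap-allocated composite
};

// Custom deleter: a no-op for cached scalar type info, `delete` for composite
// type info that was created at runtime.
struct TypeInfoDeleter {
    void operator()(const TypeInfo* type_info) const {
        if (type_info != nullptr && !type_info->is_scalar) {
            delete type_info;
        }
    }
};

using TypeInfoPtr = std::unique_ptr<const TypeInfo, TypeInfoDeleter>;

TypeInfoPtr wrap_scalar(const TypeInfo* cached) {
    return TypeInfoPtr(cached);  // destruction is a no-op
}

TypeInfoPtr create_array_type_info() {
    return TypeInfoPtr(new TypeInfo{/*is_scalar=*/false});  // destruction frees it
}

int main() {
    static const TypeInfo kIntInfo{/*is_scalar=*/true};
    TypeInfoPtr scalar = wrap_scalar(&kIntInfo);    // cached object, never freed
    TypeInfoPtr array = create_array_type_info();   // reclaimed by the deleter
    return (scalar && array) ? 0 : 1;
}
```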
Analyzing the callers of `get_type_info` shows that the following classes should hold a `TypeInfoPtr`:
1. `Field`
2. `ColumnReader`
3. `DefaultValueColumnIterator`
Other classes are either constructed by the classes above or hold one of them, so for performance they can use the raw `TypeInfo` pointer directly (see the ownership sketch after this list).
1. `ScalarColumnWriter` - holds `Field`
    1. `ZoneMapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `ZoneMapIndexWriter`, only uses scalar types.
    2. `BitmapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `BitmapIndexWriter`, uses the `type_info` in `BitmapIndexWriter`; `BitmapIndexWriter` doesn't support `ArrayType`.
    3. `BloomFilterIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `BloomFilterIndexWriter`, only uses scalar types.
2. `IndexedColumnReader` - initializes `type_info` from the field type in the meta (scalar types only).
3. `ColumnVectorBatch`
    1. `ZoneMapIndexReader` creates `ColumnVectorBatch`, which uses the `type_info` in `IndexedColumnReader`.
    2. `BitmapIndexReader` supports scalar types only; it creates `ColumnVectorBatch`, which uses the `type_info` in `BitmapIndexReader`.
    3. `BloomFilterIndexWriter` supports scalar types only; it creates `ColumnVectorBatch`, which uses the `type_info` in `BloomFilterIndexWriter`.
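A short, hypothetical sketch of the ownership split described above: `Field` owns its type info through `TypeInfoPtr`, while a writer constructed from it only borrows the raw pointer. The class shapes are simplified and do not mirror the real Doris classes; it reuses the `TypeInfo`/`TypeInfoPtr` stand-ins from the sketch above.

```cpp
#include <utility>
// (assumes the TypeInfo and TypeInfoPtr definitions from the previous sketch)

// Owner: keeps the TypeInfoPtr alive for as long as the field exists.
class Field {
public:
    explicit Field(TypeInfoPtr type_info) : _type_info(std::move(type_info)) {}
    const TypeInfo* type_info() const { return _type_info.get(); }

private:
    TypeInfoPtr _type_info;
};

// Borrower: constructed from a Field, stores only the raw pointer, so it must
// not outlive the Field that owns the type info.
class ScalarColumnWriter {
public:
    explicit ScalarColumnWriter(const Field& field) : _type_info(field.type_info()) {}

private:
    const TypeInfo* _type_info;
};
```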
1. Add a new config to regression-conf.groovy: `enableHdfs`, default `false`, to skip tests that require HDFS.
2. Fix a bug where an exception was thrown when a double-type column result is null.
- Add two new formats to stream load and broker load: **csv_with_names** and **csv_with_names_and_types**
- Add the same two formats to export: **csv_with_names** and **csv_with_names_and_types**
This reverts commit de7dce4df84fcbfbbaf715cbac151e802321f80f.
Reverts apache/incubator-doris#8976
That change caused a BE UT failure: `sh run-be-ut.sh --run --filter OlapTableSinkTest.*`
```
==62008==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x7ffff36867c0 in thread T0
```
* [Forbidden](Vec) Switch to non-vec engine when outer join + not null column
Vectorized execution will core dump in the case of `outer join + not null column`; see issue #7901.
So we need to fall back from vectorized mode to non-vectorized mode when we encounter this situation.
If a null-side column of the outer join must be non-null, such as the output of count(*), there is no way to force that column to be nullable.
Vectorization cannot handle this case yet, so it is necessary to fall back to non-vectorized processing (a sketch of the decision rule appears after the example below).
For example:
```
Query: set enable_vectorized_engine=true
Query: select * from t1 left join (select k1, count(k2) as count_k2 from t2 group by k1) tmp on t1.k1=tmp.k1
Result: the query runs on the non-vectorized engine
```
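Purely as an illustration of the decision rule described above (the real check lives in the planner and differs in detail), a hedged sketch: fall back whenever a slot on the null-producing side of an outer join cannot be made nullable. `SlotDesc` and the function name are assumptions made for this sketch.

```cpp
#include <vector>

// Hypothetical slot descriptor: only the two properties the rule cares about.
struct SlotDesc {
    bool nullable;                     // can this column hold NULL?
    bool on_null_side_of_outer_join;   // produced by the null side of an outer join?
};

// Fall back to the non-vectorized engine if an outer join would need to emit
// NULL through a column that must stay non-nullable (e.g. a count(*) output).
bool must_fall_back_to_non_vectorized(const std::vector<SlotDesc>& slots) {
    for (const auto& slot : slots) {
        if (slot.on_null_side_of_outer_join && !slot.nullable) {
            return true;
        }
    }
    return false;
}
```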