doris

Author	SHA1	Message	Date
Mingyu Chen	7cfebd05fd	[fix](hierarchical-storage) Fix bug that storage medium property change back to SSD (#9158 ) 1. fix bug described in #9159 2. fix a `fill_tuple` bug introduced from #9173	2022-04-26 10:15:19 +08:00
spaces-x	62b38d7a75	[fix](spark load) fix `getHashValue` of string type is always zero in spark load. (#9136 ) Buffer flip is used incorrectly. When the hash key is string type, the hash value is always zero. The reason is that the buffer of string type is obtained by wrap, which is not needed to flip. If we do so, the buffer limit for read will be zero.	2022-04-26 10:14:21 +08:00
camby	88115ffcb3	[feature-wip](array-type) ArrayFileColumnIterator bug fix (#9114 )	2022-04-26 09:35:46 +08:00
zhangstar333	cdd1b6d6dd	[fix](function) fix lag/lead function return invalid data (#9076 )	2022-04-26 09:34:46 +08:00
dataroaring	9e13be4cb6	[github] enable requested status check before merging pull requests (#9222 )	2022-04-26 08:58:36 +08:00
Henry2SS	bdf915abd4	[Enhancement] (image) check image validity as soon as generated (#9011 ) * load newly generated image file as soon as generated to check if it is valid. * delete the latest invalid image file * fix * fix * get filePath from saveImage() to ensure deleting the correct file while exception happens * fix Co-authored-by: wuhangze <wuhangze@jd.com>	2022-04-25 19:35:41 +08:00
dataroaring	687421b43f	keep at least one validated image file (#9192 ) * rename ImageSeq to LatestImageSeq in Storage * keep at least one validated image file	2022-04-25 19:32:43 +08:00
yiguolei	3bdfcde8e8	[Improvement] not print logs to fe.out when fe is running under daemon mode (#9195 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-04-25 18:29:29 +08:00
Stalary	7226089116	FIX: getChannel -> getChannel() (#9217 ) Co-authored-by: Rongqian Li <rongqian_li@idgcapital.com>	2022-04-25 17:46:00 +08:00
dataroaring	5b9a1a2a5d	avoiding a corrupt image file when there is image.ckpt with non-zero … (#9180 ) * avoiding a corrupt image file when there is image.ckpt with non-zero size For now, saveImage writes data to image.ckpt via an append-mode FileOutputStream, so if a non-zero-size file named image.ckpt already exists, the resulting image file is corrupt. Even worse, FE only keeps the latest image file and removes the others. BTW, the image file should be synced to disk. It is dangerous to keep only the latest image file, because an image file is validated only when the next image file is generated; we would then keep a non-validated image file but remove the validated ones. So I will issue a PR which keeps at least 2 image files. * append other data after MetaHeader * use channel.force instead of sync	2022-04-25 17:01:01 +08:00
Gabriel	b81f49b0d3	[BUG] fix compiling bug for java udf (#9161 )	2022-04-25 10:02:01 +08:00
SleepyBear	c3d0fee01b	[fix](broker load) sync the workflow of BrokerScanner to other Scanner to avoid oom (#9173 )	2022-04-25 10:01:42 +08:00
Stalary	af2295f971	MOD: remove <scope>provided</scope> (#9177 )	2022-04-25 10:00:57 +08:00
dataroaring	a608c3d5dc	[Fixbug]assure transaction num in image file is right (#9181 ) For now, dbTransactionManager::getTransactionNum is only used by checkpoint to get the transaction num to put into an image file. However, the transactions written into an image file do not come from the same data structure as the num. Thus, we would have to take great care to ensure the two data structures are consistent in size, which is very difficult to do. This patch just lets getTransactionNum get the number from the same data structure as the write method. The change was introduced by b93e841688.	2022-04-25 09:59:18 +08:00
Pxl	2d83167e50	[Feature] [Lateral-View] support outer combinator of table function (#9147 )	2022-04-24 12:09:40 +08:00
Stalary	4e1b75f5e7	[doc] add docker for Mac note (#9178 )	2022-04-23 22:08:53 +08:00
caoliang-web	48ac0d9591	[Refactor][doc]Modify the flink doris connector compilation documentation (#9169 )	2022-04-23 22:08:09 +08:00
wudi	bfa9814350	[doc] add scala2.11 compile doc (#9166 )	2022-04-23 22:07:45 +08:00
jiafeng.zhang	f2d741fa95	[doc] Modify the release version to prepare the key generation problem solution (#9165 )	2022-04-23 22:06:48 +08:00
liuzhuang2017	4911d6898a	[docs][typo] Fix some typos in "alter-table" content. (#9131 )	2022-04-23 22:05:13 +08:00
jakevin	6756db6587	[enhancment](): polish ignore with build_ (#9128 )	2022-04-23 22:04:46 +08:00
liuzhuang2017	4445d3188d	[docs][typo] Fix some typos in "getting-started" content. (#9124 )	2022-04-23 22:03:59 +08:00
ZenoYang	ae25633d50	[fix](cache) Generate md5 value using utf8 encoding for sqlkey string (#9121 )	2022-04-23 21:37:34 +08:00
caiconghui	89d37d920e	[fix](transaction) Fix running transaction num always be zero when execute show proc '/transactions' stmt (#9106 )	2022-04-23 21:37:18 +08:00
Henry2SS	4a10b37ca2	[feature](image tool) support image load tool (#8982 )	2022-04-23 21:36:58 +08:00
pengxiangyu	e157c2c254	[feature-wip](remote-storage) step3: Support remote storage, only for be, add migration_task_v2 (#8806 ) 1. Add TStorageMigrationReqV2 and EngineStorageMigrationTask to support migration action 2. Change TabletManager::create_tablet() for remote storage 3. Change TabletManager::try_delete_unused_tablet_path() for remote storage	2022-04-22 22:38:10 +08:00
Elvin wei	e880dde7a5	[feature-wip](statistics) step1: create the statistics job (#8858 ) This is the first PR for statistics collection; it includes some implementations of the statistics (#6370). It will not affect any existing code, and users will not yet be able to create statistics jobs. It mainly implements the semantic checking module for statistical information collection jobs, and the job creation module. The syntax is: ANALYZE [[ db_name.tb_name ] [( column_name [, ...] )], ...] [ PROPERTIES(...) ] e.g. ANALYZE; ANALYZE tbl1; ANALYZE tbl1(col1, col2) PROPERTIES("cbo_statistics_task_timeout" = "10"); Two configurations have been added: max_cbo_statistics_task_timeout_sec, the timeout of a single task; and cbo_max_statistics_job_num, the maximum number of running jobs the system can accept. Co-authored-by: weizhengte <1141550741@qq.com> Co-authored-by: weizhengte <weizhengte@foxmail.com> Co-authored-by: EmmyMiao87 <522274284@qq.com> Co-authored-by: frankywei <frankywei@tencent.com>	2022-04-22 18:24:54 +08:00
Mingyu Chen	81ff49f8e3	[revert] "[Fix bug] fix non-equal out join is not supported (#8857 )" (#9150 ) The reverted PR caused FE unit test failures: InferFiltersRuleTest testOn3Tables1stInner2ndRightJoinEqLiteralAt2nd testOn3Tables1stInner2ndRightJoinEqLiteralAt3rd	2022-04-21 18:20:19 +08:00
Zhengguo Yang	ae680b4248	[UDF] support RPC udaf part 1: support create RPC udaf in fe (#8510 )	2022-04-21 17:38:58 +08:00
shee	2c2e06a5fe	[Fix bug] fix non-equal out join is not supported (#8857 )	2022-04-21 12:44:20 +08:00
Zhengguo Yang	7af684ad0f	[fix] Fix a compatibility problem caused by using a non-existent database when connecting via mysql client (#9127 )	2022-04-21 12:43:39 +08:00
Mingyu Chen	7b3865b524	[fix](ut)(vectorized) fix a potential stack overflow bug and some unit test (#9140 )	2022-04-21 12:17:03 +08:00
caiconghui	fa5b5fc6d1	[fix](dynamic_partition) fix dynamic partition scheduler not work for olap table with random hash info (#9108 )	2022-04-21 12:16:27 +08:00
zhangy5	498f50a837	[regression-test] update test case dir which divided by basic functions (#9084 ) 1. Add test case dir. 2. Add some test suites.	2022-04-21 11:55:41 +08:00
Pxl	dda7604e16	[Bug][Storage-vectorized] fix code dump on outer join with not nullable column (#9112 )	2022-04-21 11:02:04 +08:00
caiconghui	40362dfaca	[fix](partition) Fix wrong partition distribution key info for random hash olap table (#9104 )	2022-04-20 17:08:42 +08:00
jiafeng.zhang	f253e260c8	[fix] Modify fe jetty configuration parameters (#9075 )	2022-04-20 14:51:25 +08:00
spaces-x	39c0fec680	[fix] fix bug when partition_id exceeds integer range in spark load (#9073 )	2022-04-20 14:50:55 +08:00
camby	a2edc6fd8b	[feature-wip](array-type) replicate impl for ColumnArray to support join with array column (#9070 ) SQL with a JOIN on tables that have ARRAY columns calls the function ColumnArray::replicate. In this PR, we implement replicate for the ARRAY type to support SQL like this: `SELECT count(lo_array),count(d_array),SUM(lo_extendedprice*lo_discount) AS REVENUE FROM lineorder, date WHERE lo_orderdate = d_datekey AND d_year = 1993 AND lo_discount BETWEEN 1 AND 3 AND lo_quantity < 25;`	2022-04-20 14:50:34 +08:00
caiconghui	df3a8545dc	[fix](routine_load) Add retry mechanism for routine load task which encounter Broker transport failure (#9067 )	2022-04-20 14:49:58 +08:00
jakevin	3cd432c83a	[community](*) polish config about project (#8987 ) - Add `.editorconfig` - Polish `.gitignore`	2022-04-20 14:48:32 +08:00
Adonis Ling	bd126f0679	[improvement] Refactor type info for further optimizations. (#8786 ) ## Design: For now, there are two categories of types in Doris, one is for scalar types (such as int, char and etc.) and the other is for composite types (array and etc.). For the sake of performance, we can cache type info of scalar types globally (unique objects) due to the limited number of scalar types. When we consider the composite types, normally, the type info is generated in runtime (we can also use some cache strategy to speed up). The memory thereby should be reclaimed when we create type info for composite types. There are a lot of interfaces to get the type info of a specific type. I reorganized those as the following describes. 1. `const TypeInfo* get_scalar_type_info(FieldType field_type)` The function is used to get the type info of scalar types. Due to the cache, the caller uses the result WITHOUT considering the problems about memory reclaim. 2. `const TypeInfo* get_collection_type_info(FieldType sub_type)` The function is used to get the type info of array types with just ONE depth. Due to the cache, the caller uses the result WITHOUT considering the problems about memory reclaim. 3. `TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb)` 4. `TypeInfoPtr get_type_info(const TabletColumn* col)` These functions are used to get the type info of BOTH scalar types and composite types. The caller should be responsible to manage the resources returned. #### About the new type `TypeInfoPtr` `TypeInfoPtr` is an alias type to `unique_ptr` with a custom deleter. 1. For scalar types, the deleter does nothing. 2. For composite types, the deleter reclaims the memory. By analyzing the callers of `get_type_info`, these classes should hold TypeInfoPtr: 1. `Field` 2. `ColumnReader` 3. `DefaultValueColumnIterator` Other classes are either constructed by the foregoing classes or hold those, so they can just use the raw pointer of `TypeInfo` directly for the sake of performance. 1. `ScalarColumnWriter` - holds `Field` 1. `ZoneMapIndexWriter` - created by `ScalarColumnWriter`, use `type_info` from the field in `ScalarColumnWriter` 1. `IndexedColumnWriter` - created by `ZoneMapIndexWriter`, only uses scalar types. 2. `BitmapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter` 1. `IndexedColumnWriter` - created by `BitmapIndexWriter`, uses `type_info` in `BitmapIndexWriter` and `BitmapIndexWriter` doesn't support `ArrayType`. 3. `BloomFilterIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter` 1. `IndexedColumnWriter` - created by `BloomFilterIndexWriter`, only uses scalar types. 2. `IndexedColumnReader` initializes `type_info` by the field type in meta (only scalar types). 3. `ColumnVectorBatch` 1. `ZoneMapIndexReader` creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `IndexedColumnReader` 2. `BitmapIndexReader` supports scalar types only and it creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `BitmapIndexReader` 3. `BloomFilterIndexWriter` supports scalar types only and it creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `BloomFilterIndexWriter`	2022-04-20 14:47:29 +08:00
zhannngchen	1b4cd76847	[feature](vectorized)(function) Support min_by/max_by function. (#8623 ) Support min_by/max_by on vectorized engine.	2022-04-20 14:46:19 +08:00
jiafeng.zhang	d58e8d76b5	[doc]Release manager document (#9081 ) Add Release manager document	2022-04-20 14:16:12 +08:00
FreeOnePlus	304bd9ab62	Change Get-Starting (#9102 ) Modify the download address	2022-04-20 14:15:18 +08:00
liuzhuang2017	8d0f06e49a	[docs][typo] Fixed Chinese and English "advance-usage.md" files. (#9099 ) Fixed Chinese and English "advance-usage.md" files	2022-04-20 14:14:39 +08:00
liuzhuang2017	3a9008f06a	[dosc][typo] Fix "basic-usage.md" files. (#9097 ) Fix "basic-usage.md"	2022-04-20 14:14:12 +08:00
smallhibiscus	1d0629925f	Modify the compilation docs whether it supports avx2. (#9095 ) Modify the compilation docs whether it supports avx2	2022-04-20 14:13:39 +08:00
liuzhuang2017	48f805fbab	Fix routine-load-manual.md (#9090 ) Fix routine-load-manual	2022-04-20 14:13:19 +08:00
LightGHLi	37bd89d24b	[typo](docs) fix some typos in docs (#9085 ) fix some typos in docs	2022-04-20 14:12:54 +08:00
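
Note on 62b38d7a75 (spark load `getHashValue`): the commit message says a buffer obtained by wrap must not be flipped, because the read limit then becomes zero. The following is a minimal sketch of that standard `java.nio.ByteBuffer` behaviour only; the class `WrapFlipDemo` and the method `readableBytes` are hypothetical names for illustration and are not taken from the Doris source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class WrapFlipDemo {
    // Hypothetical helper: reports how many bytes are readable after wrap(),
    // optionally followed by the unnecessary flip() described in the commit message.
    static int readableBytes(String key, boolean flipAfterWrap) {
        // wrap() already yields position = 0 and limit = array length,
        // so the buffer is ready for reading as-is.
        ByteBuffer buf = ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
        if (flipAfterWrap) {
            // flip() sets limit = position (0 here) and position = 0,
            // which leaves nothing to read.
            buf.flip();
        }
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println(readableBytes("doris", false)); // 5
        System.out.println(readableBytes("doris", true));  // 0
    }
}
```

A hash computed over the flipped buffer sees no bytes at all, which is consistent with the "always zero" symptom in the commit title.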


4435 Commits
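
Note on 5b9a1a2a5d (image.ckpt): the commit message describes two hazards, appending to a leftover non-empty `image.ckpt` and not syncing the file to disk, and mentions using `channel.force`. The sketch below shows one way to avoid both with standard `java.nio` calls. It is an illustration under assumptions, not the actual Doris `saveImage` code: the class `CheckpointWriteDemo`, the methods `writeImage` and `demo`, the file name `image.1`, and the final rename step are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class CheckpointWriteDemo {
    // Hypothetical writer: truncates any stale checkpoint file instead of appending,
    // forces the bytes to disk, then renames the checkpoint to its final name.
    static void writeImage(Path ckpt, Path image, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(ckpt,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Flush file content and metadata to the storage device.
            ch.force(true);
        }
        // Assumption for this sketch: the checkpoint is renamed once fully written.
        Files.move(ckpt, image, StandardCopyOption.REPLACE_EXISTING);
    }

    // Simulates a leftover non-empty image.ckpt and returns what ends up in the image.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("image-demo");
            Path ckpt = dir.resolve("image.ckpt");
            Files.write(ckpt, "stale".getBytes(StandardCharsets.UTF_8));
            Path image = dir.resolve("image.1");
            writeImage(ckpt, image, "fresh".getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(image), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // fresh
    }
}
```

With an append-mode stream the same scenario would produce `stalefresh`, which is the kind of corrupt image the commit message warns about.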