This PR implements the default partition for list partitioning, as proposed in #15507.
It is similar to Greenplum's default partition, which stores all rows that do not satisfy any prior
partition's key constraints. The optimizer does not prune the default partition, which means the default
partition is scanned every time you select data from a table that has one.
Users can either create a table with a default partition or add one via ALTER TABLE.
```sql
PARTITION LIST(key) {
    PARTITION p1 VALUES IN (xx, xx),
    PARTITION DEFAULT
}
ALTER TABLE XXX ADD PARTITION DEFAULT;
```
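A hypothetical end-to-end version of the sketch above; the table layout, key values, and the exact placement of the DEFAULT keyword are illustrative assumptions, not the final grammar:
```sql
-- Illustrative only: a list-partitioned table with a default partition.
CREATE TABLE sales (
    region VARCHAR(20),
    amount INT
)
PARTITION BY LIST(region) (
    PARTITION p_east VALUES IN ('east'),
    PARTITION p_west VALUES IN ('west'),
    PARTITION DEFAULT  -- catches rows matching no prior partition
)
DISTRIBUTED BY HASH(region) BUCKETS 1;

-- Or add the default partition to an existing list-partitioned table:
ALTER TABLE sales ADD PARTITION DEFAULT;
```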
When a new partition is added, we don't automatically migrate rows in the default partition that satisfy
the new partition's key constraint into it. Users should select those rows from the default partition,
using the new constraint as a predicate, and insert them into the new partition.
```sql
insert into tbl select * from tbl partition default where partition_key=xx;
```
Consider the SQL below:
```sql
SELECT sum(cc.qlnm) AS qlnm
FROM
    outerjoin_A
    LEFT JOIN (
        SELECT
            outerjoin_B.b,
            coalesce(outerjoin_C.c, 0) AS qlnm
        FROM
            outerjoin_B
            INNER JOIN outerjoin_C ON outerjoin_B.b = outerjoin_C.c
    ) cc ON outerjoin_A.a = cc.b
GROUP BY outerjoin_A.a;
```
The `coalesce(outerjoin_C.c, 0)` expression was evaluated in the agg node, which is wrong.
This PR corrects that: the expression is now evaluated in the inner join node.
1. Organize HTTP documents
2. Add HTTP interface authentication for FE
3. Support HTTPS interface for FE
4. Provide authentication interface
5. Add HTTP interface authentication for BE
6. Support HTTPS interface for BE
For adding or dropping an inverted index: replaying logModifyTableAddOrDropInvertedIndices created a new schema change job with a new CreateTime. The schema change job should instead be created only when not replaying the log.
1. strdup(const char*) allocates a new char array and copies the input string into it.
2. Constructing a std::string from that pointer copies the contents into the string's own buffer.
3. The char array allocated by strdup is never freed, so it leaks.
Add the hint NTH_OPTIMIZED_PLAN to let the optimizer select the n-th optimized plan. For example, you could use
```sql
select /*+SET_VAR("nth_optimized_plan"=2) */ * from table;
```
to select the second-best plan in the optimizer.
The BloomFilter in a MoW table may consume a lot of memory, and its life cycle is the same as the segment's. This patch tries to improve the efficiency of recycling the segment cache, to release the memory in time.
Support decoding nested array columns in the parquet reader:
1. FE should generate the right nested column type. FE doesn't check the nesting depth or legality, like map\<array\<int\>, int\>.
2. `ParquetColumnReader` has removed page-index filtering to support nested array types, because
it's too difficult to skip values in nested complex types. We may support page-index filtering and lazy read in a later PR.
3. `ExternalFileScanNode` had a bug in creating the default value expression.
4. Reading repetition levels in a while loop may be slow; I'll optimize this in a follow-up PR.
5. Array columns have temporary `SchemaElement`s in their thrift definition;
the former implementation removed them and kept their parent.
The remaining parent should inherit the repetition and definition levels of its child.
Improve performance of parquet reader filter calculation.
- Use `filter_data` instead of `(*filter_ptr)` when merging filters to improve performance.
- Use the mutable column filter function instead of the original new-column filter function introduced by #16850.
- Avoid unnecessary copying caused by column ref-count increases by passing the column pointer by reference.
This PR does two things:
1. Fix:
JdbcExecutor used `column[0]` to judge the class type, but `column[0]` may be null!
2. Enhancement:
In the original logic, all fields in a jdbc catalog table were set to Nullable.
However, this is inefficient for fields that are not actually nullable. We can learn through JDBC
whether a field in the source table is nullable, and set the corresponding field in the Doris jdbc catalog to nullable or not accordingly.
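For context, a minimal jdbc catalog definition of the kind this affects; the connection details below are placeholders, not from this PR. After the change, nullability reported by the source should carry over instead of every column being forced to Nullable:
```sql
CREATE CATALOG jdbc_mysql PROPERTIES (
    "type" = "jdbc",
    "user" = "root",
    "password" = "",
    "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
    "driver_url" = "mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver"
);
-- Describing a table through this catalog should now show NOT NULL
-- source columns as non-nullable rather than uniformly Nullable.
```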
The LoadScanProvider doesn't get hidden columns from the stream load parameters.
This may cause the stream load delete operation to fail. This PR passes the hidden columns to LoadScanProvider.
sub_bitmap's return type should be ALWAYS_NULLABLE, not dependent on its children.
For example,
sub_bitmap(bitmap_empty(), 1, 2) returns NULL even though none of its children is null.
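A minimal illustration of that case, relying only on the built-in bitmap functions named above:
```sql
-- bitmap_empty() yields an empty bitmap, so there is no sub-range to
-- extract: the result is NULL even though every argument is non-NULL.
SELECT sub_bitmap(bitmap_empty(), 1, 2);
```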
Sense I/O errors,
and retry the query when an I/O error occurs.
Greylist: when a disk is found to be completely broken, or the difference between the tablet counts in BE and FE meta is too large, reduce the query priority of that BE.
Fix: Redhat 4.x's /proc/meminfo has no MemAvailable, so disable using MemAvailable to control memory there.
Record vm_rss_str and mem_available_str when GC is triggered, so that memory changes during GC don't make the logs inaccurate.
Catch bad_alloc in the join probe, which may allocate 64G of memory at a time, to avoid OOM.
Update the documented names of doris_be_all_segments_num and doris_be_all_rowsets_num.
The body of a create view stmt is parsed twice.
In the second parse, we get the SQL string from the CreateViewStmt.viewDefStmt.toSql() function, which missed the select list.
Consider the SQL:
```sql
SELECT *
FROM
    (SELECT * FROM test_1) a
    INNER JOIN (SELECT * FROM test_2) b ON a.id = b.id
    INNER JOIN (SELECT * FROM test_3) c ON a.id = c.id;
```
Because a.id comes from a subquery, finding its source table requires the getSrcSlotRef() function.
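A hypothetical reproduction: wrap the query above in a view definition (assuming the test_* tables exist with an id column):
```sql
-- The view body is parsed twice; the second parse used
-- CreateViewStmt.viewDefStmt.toSql(), which missed the select list,
-- breaking views like this one.
CREATE VIEW join_view AS
SELECT *
FROM
    (SELECT * FROM test_1) a
    INNER JOIN (SELECT * FROM test_2) b ON a.id = b.id
    INNER JOIN (SELECT * FROM test_3) c ON a.id = c.id;
```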
A fulltext index is an inverted index with a specified tokenizer. Before this PR, a fulltext index could only evaluate match predicates; this PR adds support for evaluating equal predicates and list (IN) predicates.
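A sketch of the predicate kinds involved; the table and column names are hypothetical, and MATCH_ANY stands in for the existing match predicates:
```sql
-- Already supported: match predicate evaluated by the fulltext index.
SELECT * FROM docs WHERE body MATCH_ANY 'doris';
-- Newly supported: equal predicate evaluated by the same index.
SELECT * FROM docs WHERE body = 'doris';
-- Newly supported: list (IN) predicate.
SELECT * FROM docs WHERE body IN ('doris', 'olap');
```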
In version 1.2.1, users could set `"hadoop.username" = "xxx"` when creating a hive catalog
to specify a remote user to access HDFS.
But in version 1.2.2, we upgraded the hadoop version from 2.8 to 3.3; some behavior changed and the
user-specified remote user no longer took effect.
This PR tries to fix this by delegating through `UserGroupInformation`.
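For reference, the property in question on a hive catalog; the metastore URI below is a placeholder:
```sql
CREATE CATALOG hive PROPERTIES (
    "type" = "hms",
    "hive.metastore.uris" = "thrift://127.0.0.1:9083",
    -- the remote user used to access HDFS; with hadoop 3.3 it is now
    -- applied through UserGroupInformation delegation
    "hadoop.username" = "xxx"
);
```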