doris

Author	SHA1	Message	Date
Ashin Gau	9458a24cd7	[fix](multi-catalog) values in sqlserver should be enclosed by single quotes (#19971 ) Fix errors when inserting string/date/datetime values into SQLServer: ERROR 1105 (HY000): errCode = 2, detailMessage = (172.21.0.101)[INTERNAL_ERROR]UdfRuntimeException: JDBC executor sql has error: CAUSED BY: SQLServerException: Invalid column name '2021-10-30'. When using double quotes enclose string values, it will be parsed as column name, so we should enclose string values with single quotes.	2023-05-26 20:04:45 +08:00
lihangyu	317338913c	[Bug](topn) Fix topn fetch set real default value (#20074 ) 1. Before this PR if rowset does not contain column which should be read for related SlotDescriptor will call `insert_default` to column, but it's not this real defautl value.Real default value relevant information should be provided by the frontend side. 2. Support fetch when light schema change is not enabled, but disable for AGG or UNIQUE MOR model	2023-05-26 16:06:55 +08:00
zhangstar333	53ae24912f	[vectorized](feature) support partition sort node (#19708 )	2023-05-25 11:22:02 +08:00
Qi Chen	53ba46e404	[Fix][Refactor] Fix 'not member call on null pointer of type 'doris::TextConverter' error in ubsan env and refactor text converter. (#19849 ) Fix 'not member call on null pointer of type doris::TextConverter' error in ubsan env and refactor text converter.	2023-05-22 21:00:19 +08:00
WenYao	481e9aebdb	[Refactor](spark load) remove parquet scanner (#19251 )	2023-05-18 19:19:13 +08:00
HappenLee	ef0657c072	[Bug](pipeline) RegressionTest failed release resouce cause DCHECK failed (#19783 ) RegressionTest failed release resouce cause DCHECK failed	2023-05-18 18:57:25 +08:00
lihangyu	fd4fa5c64e	[Optimize](row store) optimize serialization and deserialization (#19691 ) 1. Get DataTypeSerde in advance to avoid get temporary DataTypeSerde iterate each column 2. Iterate the original row once is enoungh for deserializing by introducing a map for record the index of each column's unique id	2023-05-18 16:22:38 +08:00
HappenLee	fe42e52851	[pipeline](CTE) Support multi stream data sink in pipeline (#19519 )	2023-05-18 10:34:37 +08:00
mch_ucchi	1d05feea1b	[Feature](Nereids) add executable function to support fold constant for functions (#18209 ) 1. Add date-time functions for fold constant for Nereids. This is the list of executable date-time function nereids supports up to now: - now() - now(int) - current_timestamp() - current_timestamp(int) - localtime() - localtimestamp() - curdate() - current_date() - curtime() - current_time() - date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}() - datediff() - {date/datev2}() - {year/quarter/month/day/hour/minute/second}() - dayof{year/month/week}() - date_format() - date_trunc() - from_days() - last_day() - to_monday() - from_unixtime() - unix_timestamp() - utc_timestamp() - to_date() - to_days() - str_to_date() - makedate() 2. solved problem: - enable datev2/datetimev2 default. - refactor Nereids foldConstantOnFE and support fold nested expression. - separate the executable into multi-files for easily-reading and adding new functions	2023-05-17 21:26:31 +08:00
Ashin Gau	30c4f25cb3	[fix](multi-catalog) verify the precision of datetime types for each data source (#19544 ) Fix threes bugs of timestampv2 precision: 1. Hive catalog doesn't set the precision of timestampv2, and can't get the precision from hive metastore, so set the largest precision for timestampv2; 2. Jdbc catalog use datetimev1 to parse timestamp, and convert to timestampv2, so the precision is lost. 3. TVF doesn't use the precision from meta data of file format.	2023-05-17 20:50:15 +08:00
luozenglin	272a7565b8	[improvement](tracing) Remove useless span levels from be side tracing (#19665 ) 1. Remove an exec node method corresponding to a span and replace it with an exec node corresponding to a span; 2. Fix some problems with tracing in pipeline.	2023-05-17 19:04:52 +08:00
lihangyu	1462e44162	[Bug](topn) fix rowid fetcher merge with empty block (#19712 )	2023-05-17 10:56:32 +08:00
TengJianPing	2bdfaac609	[fix](ubsan) fix ubsan errors (#19658 ) ixu ubsan errors: doris/be/src/util/string_parser.hpp:275:58: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int' doris/be/src/vec/functions/functions_comparison.h:214:51: runtime error: addition of unsigned offset to 0x7fea6c6b7010 overflowed to 0x7fea6c6b700c doris/be/src/vec/functions/multiply.cpp:67:50: runtime error: signed integer overflow: 1295699415680000000 * 0x0000000000015401d0a4cd4890a77700 cannot be represented in type '__int128 doris/be/src/vec/aggregate_functions/aggregate_function_percentile_approx.h:445:73: runtime error: addition of unsigned offset to 0x7feca3343d10 overflowed to 0x7feca3343d08 doris/be/src/exec/schema_scanner/schema_tables_scanner.cpp:330:24: run	2023-05-17 09:32:03 +08:00
lihangyu	e22f5891d2	[WIP](row store) two phase opt read row store (#18654 )	2023-05-16 13:21:58 +08:00
zclllyybb	92bf485abd	[Bug] Fix doris pipeline shared scan and top n opt (#19599 )	2023-05-15 10:00:44 +08:00
yiguolei	8ef9212ddc	[enhancement](exceptionsafe) force check exec node method's return value (#19538 )	2023-05-12 10:21:00 +08:00
yiguolei	1d421a26d9	[bugfix](memory) merge block may allocate failed (#19507 )	2023-05-11 10:42:47 +08:00
Qi Chen	4418eb36a3	[Fix](multi-catalog) Fix some hive partition issues. (#19513 ) Fix some hive partition issues. 1. Fix be will crash when using hive partitions field of `date`, `timestamp`, `decimal` type. 2. Fix hdfs uri decode error when using `timestamp` partition filed which will cause some url-encoding for special chars, such as `%3A` will encode `:`.	2023-05-11 07:49:46 +08:00
DeadlineFen	e08de52ee7	[chore](compile) using PCH for compilation acceleration under clang (#19303 )	2023-05-08 19:51:06 +08:00
yiguolei	4e4fb33995	[refactor](conjuncts) simplify conjuncts in exec node (#19254 ) Co-authored-by: yiguolei <yiguolei@gmail.com> Currently, exec node save exprcontext*, but the object is in object pool, the code is very unclear. we could just use exprcontext.	2023-05-04 18:04:32 +08:00
yixiutt	aef9355cd3	[feature-wip](partial update) PART1: support basic partial write (#17542 )	2023-04-28 17:17:57 +08:00
WenYao	2c836251b2	[Fix](schema scanner) Fixed the problem of overflow when multiplying two INT	2023-04-25 23:58:47 +08:00
WenYao	339d804ec4	[Refactor](exceptionsafe) add factory creator to some class (#19000 )	2023-04-25 14:33:47 +08:00
yiguolei	8d7a9fd21b	[refactor](exceptionsafe) add factory creator to some class (#18978 ) make vexprecontext,vexpr,function,query context,runtimestate thread safe. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-24 10:32:11 +08:00
Adonis Ling	e412dd12e8	[chore](build) Use include-what-you-use to optimize includes (PART II) (#18761 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-19 23:11:48 +08:00
Xinyi Zou	79c446c89f	[enhancement](exception) Column filter/replicate supports exception safety (#18503 )	2023-04-18 19:23:09 +08:00
HappenLee	b68857902e	[Compile](BE) Fix compile failed with tcmalloc (#18748 )	2023-04-18 09:26:45 +08:00
yongjinhou	b59c4b4702	[fix](build) Fix missing header files (#18740 )	2023-04-17 21:22:15 +08:00
Adonis Ling	9e960f4c4f	[chore](build) Use include-what-you-use to optimize includes (#18681 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-17 11:44:58 +08:00
Xinyi Zou	c704351273	[enhancement](memory) Refactor memory limit exceeded behavior (#18590 ) No check mem tracker limit and no cancel task in mem hook, only in Allocator. This helps in clearer analysis of memory issues and reduces performance loss. PODArray/hash table/arena memory allocation will use Allocator. Optimize mem limit exceeded log printing Optimize compilation time	2023-04-14 10:42:35 +08:00
YueW	0b8bc51b72	[fix](inverted index) Fix key column match query failed (#18436 ) * [fix](inverted index) Fix key column match query failed * [chore](regression case) add regression case * [fix] fix regression case no order by	2023-04-08 15:45:08 +08:00
Tiewei Fang	759f1da32e	[Enhencement](Backends) add `HostName` filed in backends table and delete backends table in information_schema (#18156 ) 1. Add `HostName` field for `show backends` statement and `backends()` tvf. 2. delete the `backends` table in `information_schema` database	2023-04-07 08:30:42 +08:00
zhangstar333	54dbb4af67	[vectorzied](jdbc) refactor jdbc table read array type (#18187 ) jdbc read array type get result from Doris is string, PG is java.sql.array, CK is java.lang.object it's difficult to maintain and read the code, so change all database's array result to string, then add a cast function from string to doris array type	2023-04-04 11:57:04 +08:00
Pxl	e77833bfa1	[Bug](materialized-view) fix where clause persistence replay incorrect (#18228 ) fix where clause persistence replay incorrect	2023-04-03 12:49:01 +08:00
Xinyi Zou	d9fe5f7b67	[enhancement](memory) Remove MemPool and replace it with Arena (#17820 ) Arena can replace MemPool in most scenarios. Except for memory reuse, MemPool supports reuse of previous memory chunks after clear, but Arena does not. Some comparisons between MemPool and Arena: 1. Expansion Arena is less than 128M index 2 alloc chunk; more than 128M memory, allocate 128M * n > `size`, n is equal to the minimum value that satisfies the expression; MemPool less than 512K index 2 alloc chunk, greater than 512K memory, separately apply for a `size` length chunk After Arena applied for a chunk larger than 128M last time, the minimum chunk applied for after that is 128M. Does this seem to be a waste of memory? MemPool is also similar. After the chunk of 512K was applied for last time, the minimum chunk of subsequent applications is 512K. 2. Alignment MemPool defaults to 16 alignment, because memtable and other places that use int128 require 16 alignment; Arena has no default alignment; 3. Memory reuse Arena only supports `rollback`, which reuses the memory of the current chunk, usually the memory requested last time. MemPool supports clear(), all chunks can be reused; or call ReturnPartialAllocation() to roll back the last requested memory; if the last chunk has no memory, search for the most free chunk for allocation 4. Realloc Arena supports realloc contiguous memory; it also supports realloc contiguous memory from any position at the time of the last allocation. The difference between `alloc_continue` and `realloc` is: 1. Alloc_continue does not need to specify the old size, but the default old size = head->pos - range_start 2. alloc_continue supports expansion from range_start when additional_bytes is between head and pos, which is equivalent to reusing a part of memory, while realloc completely allocates a new memory MemPool does not support realloc, but supports transferring or absorbing chunks between two MemPools 5. check mem limit MemPool checks the mem limit, and Arena checks at the Allocator layer. 6. Support for ASAN Arena does something extra 7. Error handling MemPool supports returning the error message of application failure directly through `Status`, and Arena throws Exception. Tests that Arena can consider 1. After the last applied chunk is larger than 128M, the minimum applied chunk is 128M, which seems to waste memory; 2. Support clear, memory multiplexing; 3. Increase the large list, alloc the memory larger than 128M, and the size is equal to `size`, so as to avoid the current chunk not being fully used, which is wasteful. 4. In some cases, it may be possible to allocate backwards to find chunks t	2023-03-29 20:56:49 +08:00
yiguolei	ac5b47e515	[bugfix](addlog) expr context is not closed and will core during deconstructor (#18134 )	2023-03-27 21:59:46 +08:00
yongkang.zhong	b0948ea4cd	[Fix](SAP Hana External Table) fix that SAP Hana external table can not insert batch values (#17957 ) In the batch insertion scenario, sap hana database does not support syntax insert into tables values (...),(...); what it supports is: ```sql INSERT INTO table(col1,col2) SELECT c1v1, c2v1 FROM dummy UNION ALL SELECT c1v2, c2v2 FROM dummy; ```	2023-03-23 18:49:50 +08:00
Pxl	40ca250678	[Feature](materialized-view) support where clause on create materialized view (#17534 ) support where clause on create materialized view	2023-03-22 11:25:13 +08:00
Mingyu Chen	cb79e42e5c	[refactor](file-system)(step-1) refactor file sysmte on BE and remove storage_backend (#17586 ) See #17764 for details I have tested: - Unit test for local/s3/hdfs/broker file system: be/test/io/fs/file_system_test.cpp - Outfile to local/s3/hdfs/broker. - Load from local/s3/hdfs/broker. - Query file on local/s3/hdfs/broker file system, with table value function and catalog. - Backup/Restore with local/s3/hdfs/broker file system Not test: - cold & host data separation case.	2023-03-21 21:08:38 +08:00
Gabriel	bd8e3e6405	[refactor](date) unify DateTimeValue and VecDateTimeValue (#17670 )	2023-03-20 16:27:08 +08:00
yiguolei	dd53bc1c8d	[unify type system](remove unused type desc) remove some code (#17921 ) There are many type definitions in BE. Should unify the type system and simplify the development. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-19 14:05:02 +08:00
yongkang.zhong	921e8192b2	[fix](multi-catalog) fix hana jdbc catalog insert error (#17838 )	2023-03-16 07:25:19 +08:00
Pxl	16fc3a0e22	[Chore](compile) remove some unused static on inline function to reduce compile time (#17603 ) remove some unused static on inline function to reduce compile time	2023-03-13 11:11:59 +08:00
huangzhaowei	4ba93efc98	[Enhance](DOE)Support parse default es iso datetime string (#17412 ) * support parse default es iso datetime string	2023-03-10 09:59:20 +08:00
WenYao	a745ab1703	[fix](schema scanner) fix query some schema table report invalid parameter (#17626 ) Example: SELECT ROUTINE_SCHEMA AS PROCEDURE_CAT, NULL AS PROCEDURE_SCHEM,ROUTINE_NAME AS PROCEDURE_NAME,NULL AS NUM_INPUT_PARAMS,NULL AS NUM_OUTPUT_PARAMS,NULL AS NUM_RESULT_SETS,ROUTINE_COMMENT AS REMARKS,IF(ROUTINE_TYPE = 'FUNCTION', 2,IF(ROUTINE_TYPE= 'PROCEDURE', 1, 0)) AS PROCEDURE_TYPE FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = DATABASE(); ERROR 1105 (HY000): errCode = 2, detailMessage = invalid parameter This wrong and some BI tools could not work correctly.	2023-03-10 08:52:09 +08:00
Jerry Hu	08f0170895	[fix](olap) The 'scan key' generated by the 'is null' expression causes incorrect query results (#17569 )	2023-03-10 08:51:06 +08:00
qiye	a767472c56	[fix](DOE)Fix es p0 case error (#17502 ) Fix es array parse error, introduced by #16806	2023-03-08 08:06:30 +08:00
AlexYue	ee1be6edd7	[chore](fe) enhance_mysql_data_type (#17429 )	2023-03-06 10:42:01 +08:00
WenYao	a8f20eb4ac	[Enhencement](schema_scanner) Optimize the performance of reading information schema tables (#17371 ) batch fill block batch call rpc from FE to get table desc For 34w colunms SELECT COUNT( * ) FROM information_schema.columns; time: 10.3s --> 0.4s	2023-03-06 09:53:01 +08:00
Tiewei Fang	f1db0d9501	[Enhencement](File Reader) delete old file_reader (#17261 ) * delete old file_reader * fix 1	2023-03-01 20:24:03 +08:00

1 2 3 4 5 ...

867 Commits