doris

Author	SHA1	Message	Date
airborne12	8378ab5e41	[Fix](inverted index) fix memory leak when inverted index writer does not finish correctly (#20028 ) * [Fix](inverted index) fix memory leak when inverted index writer does not finish correctly * [Update](inverted index) use smart pointer to avoid memory leak * [Chore](format) code format --------- Co-authored-by: airborne12 <airborne12@gmail.com>	2023-05-29 12:18:14 +08:00
Mryange	a86134cb39	[fix](executor) Fixed an error with cast as time. #20144 before mysql [(none)]>select cast("10:10:10" as time); +-------------------------------+ \| CAST('10:10:10' AS TIMEV2(0)) \| +-------------------------------+ \| 00:00:00 \| +-------------------------------+ after mysql [(none)]>select cast("10:10:10" as time); +-------------------------------+ \| CAST('10:10:10' AS TIMEV2(0)) \| +-------------------------------+ \| 10:10:10 \| +-------------------------------+ In the past, we supported this syntax. mysql [(none)]>select cast("2023:05:01 13:14:15" as time); +------------------------------------------+ \| CAST('2023:05:01 13:14:15' AS TIMEV2(0)) \| +------------------------------------------+ \| 13:14:15 \| +------------------------------------------+ However, "10:10:10" is also a valid datetime. mysql [(none)]>select cast("10:10:10" as datetime); +-----------------------------------+ \| CAST('10:10:10' AS DATETIMEV2(0)) \| +-----------------------------------+ \| 2010-10-10 00:00:00 \| +-----------------------------------+ So here, the order of parsing has been adjusted.	2023-05-29 12:17:21 +08:00
Jerry Hu	9f8de89659	[refactor](exec) replace the single pointer with an array of 'conjuncts' in ExecNode (#19758 ) Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity. By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed. This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.	2023-05-29 11:47:31 +08:00
Kang	859b03dfdf	[Improvement](topn) prevent memory usage of key topn increasing unlimited (#19978 )	2023-05-29 10:16:15 +08:00
Yongqiang YANG	e0d9f7f955	[enhancement](load) add some profile items for load (#20141 )	2023-05-29 09:54:03 +08:00
yujun	42239d635a	[fix](tablet_manager_lock) fix create tablet timeout #20067 (#20069 )	2023-05-28 23:05:13 +08:00
AlexYue	4573ee9a49	[enhance](PrefetchReader) abort load task when data size returned by S3 is smaller than requested (#19947 ) We encountered a confusing situation in which the buffered reader was trapped in an endless loop when calling readat. The cause was that the returned data size was smaller than requested: the actual data size was about 2 MB, but the readat call retrieved only about 1 MB.	2023-05-28 21:48:17 +08:00
amory	9d44918036	[Improve](data-type) Clean up useless datatype code (#20145 ) * fix struct_export out data * delete useless code with data type	2023-05-28 20:48:29 +08:00
bobhan1	c45da40ed7	[refactor-WIP](TaskWorkerPool) add specific classes for ALTER_TABLE, CLONE, STORAGE_MEDIUM_MIGRATE task (#20140 )	2023-05-28 19:27:08 +08:00
YueW	ae352997b4	[Enhancement](alter inverted index) Improve alter inverted index performance with light weight add or drop inverted index (#19063 )	2023-05-28 11:23:07 +08:00
AlexYue	da17c45c0b	[enhance](FileWriter)enhance s3 file writer bvar to avoid adding abort bytes (#20138 ) * don't add each time upload or it would add aborted bytes * alloca memory	2023-05-28 10:52:37 +08:00
bobhan1	0434c6a738	[refactor-WIP](TaskWorkerPool) add specific classes for PUSH, PUBLIC_VERION, CLEAR_TRANSACTION tasks (#19822 )	2023-05-27 22:47:45 +08:00
zhangstar333	509689491f	[improvement](exec) Refactor the partition sort node to send data in pipeline mode (#20128 ) before: the node will wait to retrieve all data from child, then send data to parent. now: for data from child that does not require sorting, it can be sent to parent immediately.	2023-05-27 22:42:10 +08:00
airborne12	ac8599fedb	[Fix](single replica load) fix indices_size key not found core (#20047 )	2023-05-27 13:28:07 +08:00
lihangyu	f3d8af330a	[Bug](point query) check point query before check two phase read (#20055 ) * [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead 1. checkEnableTwoPhaseRead relies on the short circuit flag 2. add more metrics to display lookup profile * fix rebase	2023-05-27 12:38:58 +08:00
Adonis Ling	16c46974c5	[chore](build) Fix compilation errors reported by GCC-13 (#20114 )	2023-05-27 08:25:52 +08:00
Jack Drogon	93933308e6	[Feature-WIP](CCR): Add ccr doris interface (WIP) (#17881 )	2023-05-26 23:40:49 +08:00
HappenLee	e5b0d7a5cd	[CTE](eof) Support cte reuse reduce counter by eof status and pipeline task mem can release (#20056 )	2023-05-26 22:03:29 +08:00
plat1ko	3c6227a900	[fix](filesystem) Fix core caused by using moved variable in batch_delete_impl #20033	2023-05-26 21:39:27 +08:00
Gabriel	23ad72e734	[Bug](runtime filter) Fix min/max filter for decimalv3 (#20005 )	2023-05-26 21:35:21 +08:00
Qi Chen	cb4a57f44f	[Opt](orc-reader) Support merge small IO facility in orc reader. (#20092 ) #18976 introduced the merge small IO facility to optimize performance, and it is used by the parquet reader. This PR supports this facility in the orc reader. The current ORC reader implementation needs to reposition the parent present stream when reading lazy columns under lazy materialization, so this PR makes that work by removing `DCHECK_GE(offset, cached_data.end_offset)`.	2023-05-26 21:06:12 +08:00
zclllyybb	346c51faa2	[fix](expr) Make VExprContext exit gracefully (#19984 )	2023-05-26 20:21:53 +08:00
Ashin Gau	9458a24cd7	[fix](multi-catalog) values in sqlserver should be enclosed by single quotes (#19971 ) Fix errors when inserting string/date/datetime values into SQLServer: ERROR 1105 (HY000): errCode = 2, detailMessage = (172.21.0.101)[INTERNAL_ERROR]UdfRuntimeException: JDBC executor sql has error: CAUSED BY: SQLServerException: Invalid column name '2021-10-30'. When string values are enclosed in double quotes, they are parsed as column names, so string values should be enclosed in single quotes.	2023-05-26 20:04:45 +08:00
Xinyi Zou	a928b21434	[improvement](exception-safe) sort node is completely exception safe #20041	2023-05-26 18:29:02 +08:00
qiye	9e70a9ef84	[opt](compaction) add pick rowset to compact interval config (#19868 )	2023-05-26 17:39:02 +08:00
lihangyu	317338913c	[Bug](topn) Fix topn fetch set real default value (#20074 ) 1. Before this PR, if a rowset did not contain a column that should be read for the related SlotDescriptor, `insert_default` was called on the column, but that is not the real default value. Information about the real default value should be provided by the frontend. 2. Support fetch when light schema change is not enabled, but disable it for the AGG or UNIQUE MOR model	2023-05-26 16:06:55 +08:00
TengJianPing	488c9ba7c2	[improvement](exchange) test: data stream sender stop sending data to receiver if it returns eos early (#20081 )	2023-05-26 16:05:38 +08:00
zhangdong	b481045372	[improvement](k8s)when the IP of be is different from the IP specified by the master, obey it (#19951 ) When FE is deployed on a virtual machine and CN is deployed on k8s, FE needs to use a proxy IP to communicate with CN nodes, and BE cannot resolve the proxy IP from the local network card. We change the previous verification rules and obey the IP assigned by the master	2023-05-26 15:40:29 +08:00
yiguolei	0ed817ed1a	[improvement](status) should send query timeout status to be, instead of internal error (#20016 ) If a query is cancelled, the reason is very unclear and we do not know the call stack. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-05-26 15:11:17 +08:00
luozenglin	b81e9e2521	[fix](resource-group) Fix resource group memory isolation may release too much memory (#20066 ) Suppose three queries are executed in a resource group with a memory_limit of 8G, and they consume memory of query_a = 3G, query_b = 3G, and query_c = 3G. The total memory used is counted as 9G when the resource group GC is executed, which exceeds the resource group limit and cancels query_a. When the resource group GC next runs, the memory of query_a may not be freed yet, and it will be counted again in the total memory consumed by that resource group, which again exceeds the resource group limit and cancels query_b. From the user's perspective, it is fine to execute query_a and query_b at the same time, but executing query_a, query_b and query_c together causes two queries to be cancelled, which is not as expected. This PR skips the queries that are cancelled when counting the memory used by the resource group. If this causes the process memory to grow, the process GC will handle it.	2023-05-26 14:50:12 +08:00
Pxl	43aa062fb1	[Chore](hash-join) remove useless conditions and add some case (#20050 )	2023-05-26 14:45:24 +08:00
amory	ee34b6de2d	[Refact] (serde) refact mysql serde with data type (#19543 ) refactor mysql output (de)serialization with data type serde, avoiding the switch-case on primitive type written in mysqlWriter	2023-05-26 14:11:17 +08:00
Pxl	15a7420661	[Chore](ub) fix some undefined behaviors (#19986 ) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_reader.cpp:895:21: runtime error: load of value 423208544, which is not a valid value for type 'doris::ReaderType' /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_decimal.cpp:260:33: runtime error: load of misaligned address 0x7fa3348b301c for type 'int64_t' (aka 'long'), which requires 8 byte alignment /home/zcp/repo_center/doris_master/doris/be/src/olap/block_column_predicate.cpp:82:24: runtime error: variable length array bound evaluates to non-positive value 0 /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_string.h:225:26: runtime error: null pointer passed as argument 2, which is declared to never be null	2023-05-26 14:08:40 +08:00
Mryange	92a6122f74	[feature](profile)Add the filtering information of the Bloom filter in profile. (#19789 )	2023-05-26 10:56:58 +08:00
Xinyi Zou	56360ba04a	[fix](memory) Load flush memtable no check memory exceed #20036	2023-05-26 09:57:00 +08:00
AlexYue	bc4e0e97f2	[enhance](S3FileWriter) abort when s3 file writer quits abnormally and optimize s3 buffer pool (#19944 ) 1. reduce s3 buffer pool's ctor cost 2. before this PR, if an s3 file writer returned an error when calling the append or close function, the caller would not call the abort function, which resulted in a confusing DCHECK failure	2023-05-26 09:14:38 +08:00
Yongqiang YANG	686711adda	[fix](publish) do not use wait_for for publish synchronization (#20029 ) It leads to a use-after-free problem.	2023-05-25 20:01:06 +08:00
TengJianPing	3598518e59	[fix](revert) data stream sender stop sending data to receiver if it returns eos early (#19847 )" (#20040 ) * Revert "[fix](sink) fix END_OF_FILE error for pipeline caused by VDataStreamSender eof (#20007)" This reverts commit 2ec1d282c5e27b25d37baf91cacde082cca4ec31. * [fix](revert) data stream sender stop sending data to receiver if it returns eos early (#19847)" This reverts commit c73003359567067ea7d44e4a06c1670c9ec37902.	2023-05-25 16:50:17 +08:00
zhangstar333	002c76e06f	[vectorized](udaf) support udaf function work with window function (#19962 )	2023-05-25 14:38:47 +08:00
zhangstar333	53ae24912f	[vectorized](feature) support partition sort node (#19708 )	2023-05-25 11:22:02 +08:00
TengJianPing	2ec1d282c5	[fix](sink) fix END_OF_FILE error for pipeline caused by VDataStreamSender eof (#20007 ) * [fix](sink) fix END_OF_FILE error for pipeline caused by VDataStreamSender eof	2023-05-25 10:29:35 +08:00
HappenLee	2d668e8d0b	[DEBUG](Log) Add debug string for pipeline task cancel (#20026 )	2023-05-25 09:58:31 +08:00
bobhan1	bf4072e5b0	[fix](StorageEngine) release DataDir after the thread pool has been shutdown (#20014 )	2023-05-24 23:51:06 +08:00
zhannngchen	ff54b45775	[fix](partial-update) should hold tablet meta lock before calling lookup_row_key() (#19964 )	2023-05-24 16:37:27 +08:00
Pxl	3ba7c2336b	[Chore](build) change CMAKE_CXX_STANDARD from 17 to 20 #19987	2023-05-24 16:16:42 +08:00
Jerry Hu	e5eed53b89	[improvement](bitmap) Use shared_ptr in BitmapValue to avoid deep copying (#19101 ) Currently the BitmapValue type is copied between columns, which costs a lot of memory. Use a shared_ptr in BitmapValue to avoid copying data.	2023-05-24 16:13:01 +08:00
TengJianPing	c730033595	[improvement](exchange) data stream sender stop sending data to receiver if it returns eos early (#19847 ) For broadcast join, only one build fragment instance builds the hash table; the other fragment instances just receive and throw away the build side data, which is a waste of memory and CPU. This PR improves this: the data stream receiver tells the sender that it does not need data from it, and the sender stops sending any data to it.	2023-05-24 15:11:32 +08:00
Xinyi Zou	14b4c7abf9	[fix](hashtable) Check query cancel status during build hash table #19970 should cancel query during hash table build stage if the query is cancelled.	2023-05-24 14:24:03 +08:00
Xinyi Zou	cf7a74f6ec	[fix](memory) query check cancel while waiting for memory in Allocator, and optimize log (#19967 ) After a query finds in Allocator that the process memory exceeds the limit, it waits up to 5s. Previously, Allocator did not check whether the query had been cancelled while waiting for memory, so the query could not end quickly.	2023-05-24 11:08:48 +08:00
YueW	08ec5e2eb5	[fix](function) fix result column is nullable type when fast execute (#19889 )	2023-05-24 10:27:50 +08:00
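
The accounting change described in b81e9e2521 (#20066) can be illustrated with a small C++ sketch. This is not the Doris implementation: `QueryInfo`, `countable_memory` and `gc_pass` are hypothetical names made up for this example, and cancelling queries in list order is an assumption. It only shows the idea of skipping already-cancelled queries when summing the memory used by a resource group.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a running query; not a Doris type.
struct QueryInfo {
    std::string id;
    int64_t mem_bytes;  // memory currently attributed to the query
    bool cancelled;     // true once the GC has cancelled it
};

// Sum the memory of queries that have not been cancelled.
// Cancelled queries are skipped even if their memory is not freed yet.
int64_t countable_memory(const std::vector<QueryInfo>& queries) {
    int64_t total = 0;
    for (const auto& q : queries) {
        if (!q.cancelled) {
            total += q.mem_bytes;
        }
    }
    return total;
}

// One GC pass: cancel queries in list order (an assumption of this sketch)
// until the counted usage fits the limit. Returns the ids cancelled now.
std::vector<std::string> gc_pass(std::vector<QueryInfo>& queries, int64_t limit) {
    std::vector<std::string> cancelled_now;
    int64_t used = countable_memory(queries);
    for (auto& q : queries) {
        if (used <= limit) {
            break;
        }
        if (q.cancelled) {
            continue;
        }
        q.cancelled = true;
        used -= q.mem_bytes;
        cancelled_now.push_back(q.id);
    }
    return cancelled_now;
}
```

With three 3G queries and an 8G limit, the first pass in this sketch cancels only query_a. A second pass counts 6G, because query_a is skipped, and cancels nothing, which matches the behaviour the commit message asks for.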


4588 Commits