doris

Author	SHA1	Message	Date
feiniaofeiafei	4c75fecea9	[fix](compile) be compile failed in mac due to std::max (#37238 ) (#38860 ) cherry-pick #37238 to branch-2.1	2024-08-05 16:31:39 +08:00
Gabriel	bb962a8291	[minor](fix) Fix incorrect fmt arguments (#38840 ) (#38861 ) pick #38840	2024-08-05 16:06:32 +08:00
Uniqueyou	65154f8abe	[branch-2.1] (doris-future) Support auto partition name function (#38853 ) cherry-pick https://github.com/apache/doris/pull/34258 to branch-2.1	2024-08-05 16:04:24 +08:00
Pxl	86ef0069ea	[Feature](function) support group concat with distinct and order by (#38851 ) pick from #38744 and #38776	2024-08-05 15:44:51 +08:00
daidai	607c0b82a9	[opt](serde)Optimize the filling of fixed values into block columns without repeated deserialization. (#37377 ) (#38245 ) (#38810 ) pick pr: #38575 and fix this pr bug : #38245	2024-08-05 09:13:08 +08:00
amory	2653087843	[pick](array-funcs)fix array with empty arg in be behavior (#38708 ) backport: https://github.com/apache/doris/pull/36845	2024-08-05 09:08:28 +08:00
zhangstar333	1b3d4b4d31	[cherry-pick](branch-21)fix operator do_projections should use local_state intermediate_projections (#38612 ) (#38765 ) cherry-pick from master https://github.com/apache/doris/pull/38612	2024-08-05 09:07:16 +08:00
daidai	5d02c48715	[feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. (#38432 ) (#38809 ) bp #38432 Add `hive_parquet_use_column_names` and `hive_orc_use_column_names` session variables to read a table after a column has been renamed in `Hive`. These two session variables are modeled on `parquet_use_column_names` and `orc_use_column_names` of the `Trino` hive connector. By default, these two session variables are true. When they are set to false, reading orc/parquet accesses the columns according to their ordinal position in the Hive table definition. For example: ```mysql in Hive : hive> create table tmp (a int , b string) stored as parquet; hive> insert into table tmp values(1,"2"); hive> alter table tmp change column a new_a int; hive> insert into table tmp values(2,"4"); in Doris : mysql> set hive_parquet_use_column_names=true; Query OK, 0 rows affected (0.00 sec) mysql> select * from tmp; +-------+------+ \| new_a \| b \| +-------+------+ \| NULL \| 2 \| \| 2 \| 4 \| +-------+------+ 2 rows in set (0.02 sec) mysql> set hive_parquet_use_column_names=false; Query OK, 0 rows affected (0.00 sec) mysql> select * from tmp; +-------+------+ \| new_a \| b \| +-------+------+ \| 1 \| 2 \| \| 2 \| 4 \| +-------+------+ 2 rows in set (0.02 sec) ``` You can use `set parquet.column.index.access/orc.force.positional.evolution = true/false` in hive 3 to control the results of reading the table, like these two session variables. However, for a parquet table in which a field inside a struct column has been renamed, hive and doris behave differently.	2024-08-05 09:06:49 +08:00
Jerry Hu	53773ae6b7	[opt](join) check datatype of intermediate slots in hash join (#38556 ) (#38792 ) pick #38556	2024-08-05 09:03:21 +08:00
zclllhhjj	8fa0710cb3	[branch-2.1](load) fix miss writer in concurrency incremental open (#38605 ) (#38793 ) pick https://github.com/apache/doris/pull/38605	2024-08-05 08:56:23 +08:00
hui lai	6035edad0b	[fix](multi table) fix single stream multi table memory leak (#38255 ) (#38824 ) pick (#38255) We hit OOM when using single stream multi table ![image](https://github.com/user-attachments/assets/748e9914-d591-4f41-8b28-412d3cecc841) There is a memory leak, and the heap profile looks like: ![image](https://github.com/user-attachments/assets/af30c593-88ea-44f6-bba1-82436b13f99f) The stream load context is not released under some exception conditions, such as a plan failure when high concurrency causes a timeout while obtaining the read lock. It was introduced by https://github.com/apache/doris/pull/35458 The effect of the fix is shown in the following figure: it can run stably with a small amount of memory ![image](https://github.com/user-attachments/assets/4483e0a5-6c0c-4cdc-b8ed-3408da6a86b2)	2024-08-04 22:12:44 +08:00
Luwei	0603ec1d9d	[enhancement](compaction) optimizing memory usage for compaction (#37099 ) (#37486 )	2024-08-04 10:49:18 +08:00
HappenLee	7bdc508ac7	[Bug](fix) fix coredump case in (not null, null) except (not null, not null) case (#38756 ) Issue Number: close #38612	2024-08-04 10:44:10 +08:00
bobhan1	64b69ed1ba	[branch-2.1] Picks "[opt](merge-on-write) Skip the alignment process of some rowsets in partial update #38487 " (#38682 ) picks https://github.com/apache/doris/pull/38487	2024-08-02 20:05:31 +08:00
amory	556f0fc784	[pick](json-keys) support json_keys function (#38631 ) backport: https://github.com/apache/doris/pull/36411	2024-08-02 19:10:00 +08:00
amory	9b07cd2069	[pick](json-serde)pick jsonb string deserialize with spec char (#38711 ) backport: https://github.com/apache/doris/pull/37176	2024-08-02 13:37:41 +08:00
qiye	b3f335ba5f	[enhancement](index compaction) Enable index compaction by default (#36812 ) (#38676 ) bp #36812	2024-08-02 12:03:57 +08:00
amory	1d982ada45	[pick](array-funcs)pick array func array_enumerate_uniq bugfix (#38721 ) backport: https://github.com/apache/doris/pull/38384	2024-08-02 11:25:17 +08:00
amory	f5bc65989c	[pick](array-range)improve array_range func for large param (#38707 ) backport: https://github.com/apache/doris/pull/38284	2024-08-02 11:22:46 +08:00
amory	b7e1588be9	[pick](upgrade)fix log message (#38710 ) backport: https://github.com/apache/doris/pull/38254	2024-08-02 11:20:20 +08:00
yujun	327069fdbc	[branch-2.1](log) add tablet clear cache log (#38713 )	2024-08-02 08:40:02 +08:00
zzzxl	0da388ade5	[fix](inverted index) fix match_phrase_ edge query result error #38327 (#38740 )	2024-08-01 23:17:53 +08:00
qiye	4d980b8235	[feature](http action)Add http action to show nested inverted index file (#38272 ) (#38672 ) backport #38272	2024-08-01 19:30:59 +08:00
Gabriel	3e5255a862	[pipeline](fix) Fix blocking task which is not triggered by 2nd RPC (#38568) (#38694 ) Once a query is cancelled for any reason, BE may not receive the 2nd RPC from FE. If so, we must ensure the execution dependency is ready so tasks will not be blocked.	2024-08-01 18:23:41 +08:00
Gabriel	82c681595e	[fix](local exchange) Fix local exchange blocked by a huge data block (#38657) (#38693 ) If a huge block is pushed into the local exchanger, it will be blocked due to concurrency problems. This PR uses a unique lock to resolve it.	2024-08-01 18:04:19 +08:00
meiyi	e8690b62ee	[fix](group commit) Pick add debug log show why group commit not work; delete wal when replay success (#38611 ) (#38659 ) Pick https://github.com/apache/doris/pull/38611	2024-08-01 16:59:54 +08:00
Gabriel	9d23ccf1f2	[Improvement](schema scan) Use async scanner for schema scanners (#38403) (#38666 )	2024-08-01 16:05:24 +08:00
Qi Chen	4042cdf553	[Fix](memory) Fix allocator.h compiling failed on mac. (#38646 ) Backport #38562. Fix the allocator.h compile failure on mac, which was introduced by #37257.	2024-08-01 13:56:53 +08:00
Xin Liao	63a3ff570b	[Opt](load) print tablet id when memtable flush coredump #38618 (#38656 ) cherry pick from #38618	2024-08-01 13:52:50 +08:00
HappenLee	28998300d4	[Bug](fix) fix ubsan use int32_t pointer access bool value (#38621 ) Issue Number: close #38617	2024-08-01 13:52:12 +08:00
amory	338fa32303	[pick](simdjson) fix simdjson with object array when jsonroot is not empty (#38633 ) backport: https://github.com/apache/doris/pull/38490	2024-08-01 11:04:54 +08:00
wuwenchi	41fa7bc9fd	[bugfix](paimon)Fixed the reading of timestamp with time zone type data for 2.1 (#37716 ) (#38592 ) bp: #37716	2024-08-01 10:23:06 +08:00
amory	184b8cbbe4	[pick](json)fix jsonb deserialize (#38630 ) backport: https://github.com/apache/doris/pull/37251	2024-08-01 10:18:27 +08:00
airborne12	66ebf709ba	[Fix](inverted index) fix fast execute for not_in expr #37745 (#38594 ) cherry pick from #37745	2024-07-31 19:58:12 +08:00
airborne12	7730aa2170	[Fix](inverted index) fix wrong no need read data when same column in inverted index and like function #36687 (#38581 ) cherry pick from #36687	2024-07-31 19:41:39 +08:00
airborne12	a75511ae08	[Feature](inverted index) add no need read data optimize config (#38584 ) pick from #36686	2024-07-31 19:39:17 +08:00
airborne12	232ee74566	[Fix](inverted index) fix memory leak for index compaction (#38586 ) Pick from (#36209)	2024-07-31 19:19:38 +08:00
airborne12	aed0cc8ba0	[Fix](inverted index) remove duplicate stats of inverted_index_query_cache_miss #36707 (#38580 ) cherry pick from #36707	2024-07-31 19:18:58 +08:00
airborne12	7357d7bd3b	[Update](inverted index) Add column name to debug point for "no need to read data" optimization #37649 (#38579 ) cherry pick from #37649	2024-07-31 19:17:46 +08:00
HappenLee	3b234cfab6	[performance](exec) Performance problem create too many scanner task (#38460 ) cherry pick the pr: #38430	2024-07-31 14:34:01 +08:00
lihangyu	aa9bdd76d0	[Pick](Variant) pick some fix #38413 #38364 (#38512 )	2024-07-31 11:03:31 +08:00
walter	182bf4d323	[chore](fe) Returns dropped tables in GetMeta request (#38541 ) Cherry-pick #38019	2024-07-31 10:57:00 +08:00
Mryange	017dad8c54	[fix](type)support runtime predicate for time type (#38258 ) (#38465 ) https://github.com/apache/doris/pull/38258	2024-07-31 10:27:36 +08:00
camby	715bcd13f1	[opt](mow) opt mow lookup with sequence column (#38287 ) (#38406 )	2024-07-30 09:46:09 +08:00
airborne12	cefee4dbc0	[Pick 2.1](clucene) update clucene version (#38496 ) backport #38482	2024-07-30 09:40:04 +08:00
hui lai	17d351af80	[fix](csv reader) fix csv parser incorrect if enclosing line_delimiter (#38347 ) (#38445 ) The CSV reader parses data incorrectly when the data encloses the line_delimiter. For example, when line_delimiter is \n and enclose is ', the following data: ``` 'aaaaaaaaaaaa bbbb' ``` is parsed as two columns, `'aaaaaaaaaaaa` and `bbbb',`, rather than as one column: ``` 'aaaaaaaaaaaa bbbb' ``` This happens because the CSV reader does not reset the result when the enclose is not matched in this `output_buf_read`, which causes incorrect truncation. Co-authored-by: Xin Liao <liaoxinbit@126.com>	2024-07-29 14:55:45 +08:00
Jerry Hu	87cf2d1fb4	[fix](spill) Duplicate calls to Dependency::set_ready() in hash join(#37461 ) (#38399 ) pick #37461 Calling the function `Dependency::set_ready()` more than once will cause pipeline tasks to be scheduled incorrectly.	2024-07-29 09:44:48 +08:00
Xin Liao	e9f12fac47	[fix](load) fix no error url for stream load #38325 (#38417 ) cherry pick from #38325	2024-07-28 19:06:57 +08:00
Xin Liao	d8744cd3d0	[Opt](load) don't print stack when some errors occur for stream load #38332 (#38418 ) cherry pick from #38332	2024-07-28 19:04:24 +08:00
Gabriel	c93f3bd24e	[Improvement](bloom filter) Forbid small bloom filter (#38349 ) (#38392 ) A bloom filter has an expected filter ratio when there is enough data. This PR forbids bloom filters that are too small, whose filter ratio is heavily biased. pick #38349	2024-07-26 10:11:31 +08:00


8079 Commits