doris

Author	SHA1	Message	Date
HappenLee	90d6985f91	[Fix](bug) Is null predicate get error query result (#41704 ) cherry-pick #41668	2024-10-12 13:18:14 +08:00
airborne12	34429bfa0e	[Chore](inverted index) remove useless code of compound filters for inverted index #40258 (#41448 ) cherry pick from #40258	2024-09-29 17:27:29 +08:00
Socrates	0b4552f74b	[cherry-pick](branch-2.1) pick hive text write from master (#40537 ) ## Proposed changes pick prs: https://github.com/apache/doris/pull/38549 https://github.com/apache/doris/pull/40183 https://github.com/apache/doris/pull/40315 --------- Co-authored-by: Calvin Kirs <kirs@apache.org>	2024-09-27 20:57:07 +08:00
bobhan1	eb13cd4154	[branch-2.1] Picks "[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update #40272 " (#40964 ) picks https://github.com/apache/doris/pull/40272	2024-09-26 22:54:27 +08:00
lihangyu	c6a6adb3a4	[Fix](topn) avoid mismatched row count when upgrading (#40999 ) #41000	2024-09-21 08:46:57 +08:00
Mingyu Chen	8e860a26a7	[fix](systable) fix unstable case for partitions table (#40553 ) (#41043 ) bp #40553	2024-09-20 17:13:30 +08:00
Socrates	e0fac66223	[branch-2.1](fix) fix snappy decompressor bug (#40862 ) ## Proposed changes Hadoop SnappyCodec source: https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/codec/SnappyCodec.cc Example: OriginData (the original data is divided into several large data blocks): large data block1 \| large data block2 \| large data block3 \| .... Each large data block is divided into several small data blocks. Suppose a large data block is divided into three small blocks: large data block1: \| small block1 \| small block2 \| small block3 \| CompressData: <A [B1 compress(small block1) ] [B2 compress(small block2) ] [B3 compress(small block3)]> A : original (uncompressed) length of the current large data block. sizeof(A) = 4 bytes. A = length(small block1) + length(small block2) + length(small block3) Bx : compressed length of small data block x. sizeof(Bx) = 4 bytes. Bx = length(compress(small blockx))	2024-09-20 11:57:14 +08:00
Jerry Hu	b8bc9b699c	[fix](scan) Incorrect scan keys lead to wrong query results. (#40814 ) (#40971 ) ## Proposed changes pick #40814 ``` mysql [doris_14555]>select * from table_9436528_3; +------+------+------+------+------------------------+--------------------+------+ \| col1 \| col2 \| col3 \| col5 \| col4 \| col6 \| col7 \| +------+------+------+------+------------------------+--------------------+------+ \| -100 \| 1 \| -82 \| 1 \| 2024-02-16 04:37:37.00 \| -1299962421.904282 \| NULL \| \| -100 \| 1 \| 92 \| 1 \| 2024-02-16 04:37:37.00 \| 23423423.0324234 \| NULL \| \| -100 \| 0 \| -82 \| 0 \| 2023-11-11 10:49:43.00 \| 840968969.872149 \| NULL \| ``` wrong result: ``` mysql [doris_14555]>select * from table_9436528_3 where col1 <= -100 and col2 in (true, false) and col3 = -82; +------+------+------+------+------------------------+--------------------+------+ \| col1 \| col2 \| col3 \| col5 \| col4 \| col6 \| col7 \| +------+------+------+------+------------------------+--------------------+------+ \| -100 \| 1 \| -82 \| 1 \| 2024-02-16 04:37:37.00 \| -1299962421.904282 \| NULL \| \| -100 \| 1 \| 92 \| 1 \| 2024-02-16 04:37:37.00 \| 23423423.0324234 \| NULL \| +------+------+------+------+------------------------+--------------------+------+ ```	2024-09-19 22:01:02 +08:00
Xinyi Zou	b52b572ade	[branch-2.1](memory) When Load ends, check memory tracker value returns is equal to 0 (#40850 ) pick #38960 #39908 #40043 #40092 #40016 #40439 --------- Co-authored-by: hui lai <1353307710@qq.com> Co-authored-by: yiguolei <676222867@qq.com>	2024-09-15 23:47:53 +08:00
Kaijie Chen	7851563829	[fix](brpc_client_cache) resolve hostname in DNS cache before passing to brpc (#40074 ) (#40786 ) backport #40074	2024-09-13 14:28:01 +08:00
Vallish Pai	3604d63184	[Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) (#40687 ) backport https://github.com/apache/doris/pull/40568 https://github.com/apache/doris/pull/40455 https://github.com/apache/doris/pull/40456 https://github.com/apache/doris/pull/40153 https://github.com/apache/doris/pull/34384 Test result: 2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309) - Recover original connection 2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) - Execute sql: REVOKE SELECT_PRIV ON test_partitions_schema_db.duplicate_table FROM partitions_user 2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299) - Create new connection for user 'partitions_user' 2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) - Execute tag: select_check_5, sql: select TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME from information_schema.partitions where table_schema="test_partitions_schema_db" order by TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME 2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309) - Recover original connection 2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120) - Run test_partitions_schema in /root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy succeed 2024-09-11 11:00:45.652 INFO 
[main] (RegressionTest.groovy:259) - Start to run single scripts 2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) - Success suites: /root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy: group=default,p0, name=test_partitions_schema 2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All suites success. 2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1 suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts 2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test finished 2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) - Execute tag: select_check_5, sql: select * from information_schema.table_options ORDER BY TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM; 2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309) - Recover original connection 2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120) - Run test_table_options in /root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy succeed 2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start to run single scripts 2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) - Success suites: /root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy: group=default,p0, name=test_table_options 2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All suites success.
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1 suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts 2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test finished ************************* 7. row ************************* PartitionId: 18035 PartitionName: p100 VisibleVersion: 2 VisibleVersionTime: 2024-09-11 10:59:28 State: NORMAL PartitionKey: col_1 Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647]; ) DistributionKey: pk Buckets: 10 ReplicationNum: 1 StorageMedium: HDD CooldownTime: 9999-12-31 15:59:59 RemoteStoragePolicy: LastConsistencyCheckTime: NULL DataSize: 2.872 KB IsInMemory: false ReplicaAllocation: tag.location.default: 1 IsMutable: true SyncWithBaseTables: true UnsyncTables: NULL CommittedVersion: 2 RowCount: 4 7 rows in set (0.01 sec) --------- Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2024-09-12 11:50:09 +08:00
qiye	8708fae420	[fix](ES Catalog)Support parse single value for array column (#40614 ) (#40660 ) bp #40614	2024-09-11 17:26:48 +08:00
qiye	314f6ae823	[fix](ES Catalog)Fix int parse error when querying by doc_values (#40385 ) (#40521 ) bp #40385	2024-09-09 14:29:21 +08:00
Mingyu Chen	92752b90e7	[feature](metacache) add system table catalog_meta_cache_statistics #40155 (#40210 ) bp #40155	2024-09-02 23:23:35 +08:00
yiguolei	ca07a00c93	Revert "[branch-2.1](hive) support hive write text table (#38549 ) (#4… (#40157 ) …0063)" This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68. Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-08-30 10:25:38 +08:00
Socrates	c6df7c21a3	[branch-2.1](hive) support hive write text table (#38549 ) (#40063 ) 1. Support write hive text table 2. Add SessionVariable `hive_text_compression` to write compressed hive text table 3. Supported compression type: gzip, bzip2, snappy, lz4, zstd pick from https://github.com/apache/doris/pull/38549	2024-08-29 16:50:40 +08:00
Mingyu Chen	131238ff71	[fix](file-cache) change metric_value column in file_cache_statistics table to string (#40083 ) Make it more flexible followup #39552	2024-08-29 16:39:22 +08:00
Mingyu Chen	173aafc86f	[Enhancement] add information_schema.table_properties #38745 (#38746 ) (#39886 ) bp #38746 --------- Co-authored-by: Vallish Pai <vallishpai@gmail.com>	2024-08-27 17:22:19 +08:00
wangbo	6ceb574aa0	[branch-2.1]Pick IO limit/workload group usage table (#39839 )	2024-08-23 18:51:47 +08:00
wangbo	a55e109e97	[pick][Improvement]Add schema table workload_group_privileges (#38436 ) (#39708 ) pick #38436	2024-08-22 00:44:43 +08:00
Mingyu Chen	0bfcee1251	[opt](file-cache) support system table file_cache_statistics (#39552 ) 1. Add a new system table, `file_cache_statistics`, for viewing file cache metrics on the BE side: ``` mysql> select * from information_schema.file_cache_statistics limit 10; +-------+---------------+----------------------------+--------------------------------+--------------------+ \| BE_ID \| BE_IP \| CACHE_PATH \| METRIC_NAME \| METRIC_VALUE \| +-------+---------------+----------------------------+--------------------------------+--------------------+ \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| disposable_queue_curr_elements \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| disposable_queue_curr_size \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| disposable_queue_max_elements \| 102400 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| disposable_queue_max_size \| 21474836480 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| hits_ratio \| 0.8539634687001242 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| hits_ratio_1h \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| hits_ratio_5m \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| index_queue_curr_elements \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| index_queue_curr_size \| 0 \| \| 10003 \| 172.20.32.136 \| /mnt/output/be/file_cache/ \| index_queue_max_elements \| 102400 \| +-------+---------------+----------------------------+--------------------------------+--------------------+ ``` It shows the file cache metrics of each BE. 2. Add new file cache metrics `hits_ratio_1h` and `hits_ratio_5m`. These two metrics show the file cache hit ratio over the most recent 1 hour and 5 minutes, so the recent hit ratio is visible instead of only the global historical hit ratio.	2024-08-21 10:03:39 +08:00
qiye	43cc8d648d	[fix](ES Catalog)Check isArray before parse json to array (#39104 ) (#39273 ) ## Proposed changes bp #39104	2024-08-13 15:13:40 +08:00
wangbo	fc0222a64c	[opt](info) processlist schema table support show all fe (#38701 ) (#38953 ) pick #38701	2024-08-07 11:01:46 +08:00
Gabriel	9d23ccf1f2	[Improvement](schema scan) Use async scanner for schema scanners (#38… (#38666 ) …403)	2024-08-01 16:05:24 +08:00
Mryange	017dad8c54	[fix](type)support runtime predicate for time type (#38258 ) (#38465 ) ## Proposed changes https://github.com/apache/doris/pull/38258	2024-07-31 10:27:36 +08:00
zzzxl	e2bb86e7f8	[fix](inverted index) fixed in_list condition not indexed on pipelinex (#38178 ) ## Proposed changes https://github.com/apache/doris/pull/36565 https://github.com/apache/doris/pull/37842 https://github.com/apache/doris/pull/37921 https://github.com/apache/doris/pull/37386	2024-07-25 14:42:34 +08:00
Xinyi Zou	10c5c336d8	[branch-2.1](arrow-flight-sql) Add config arrow_flight_result_sink_buffer_size_rows (#38223 ) pick #38221	2024-07-24 15:15:39 +08:00
qiye	e5339a4014	[feature](ES Catalog)Support control scroll level by config #37180 (#37290 ) ## Proposed changes backport #37180	2024-07-15 16:41:38 +08:00
qiye	f8cee439b6	[feature](ES Catalog) map nested/object type in ES to JSON type in Doris (#37101 ) (#37182 ) backport #37101	2024-07-05 10:48:32 +08:00
abmdocrt	02fad48870	[Fix](upgrade) Fix fields not handled correctly during upgrade and downgrade (#36691 ) master version is #36690	2024-06-22 14:23:04 +08:00
lihangyu	445d42a57d	[fix](topn-opt) remove redundant check for fetch phase (#36676 ) #36629	2024-06-21 22:28:38 +08:00
zclllyybb	bd47d5a681	[branch-2.1](auto-partition) Fix auto partition load failure in multi replica (#36586 ) This PR: 1. picks #35630, which was previously reverted by #36098. 2. picks #36344 from master. These two PRs fix existing bugs in auto partition load. --------- Co-authored-by: Kaijie Chen <ckj@apache.org>	2024-06-20 17:51:18 +08:00
Pxl	dda25cceb6	[Bug](information-schema) fix some bug of information_schema.PROCESSLIST (#36447 ) ## Proposed changes pick from #36409	2024-06-18 16:45:48 +08:00
zclllyybb	3b23eee37c	Revert "[fix](auto-partition) fix auto partition load lost data in multi sender (#35287 )" (#36098 ) Reverts apache/doris#35630 because it introduced more damaging bugs. We will fix it and merge it in the next version.	2024-06-11 17:11:42 +08:00
wangbo	75a6f28f2e	[cherry-pick]Add query type when report (#35918 ) pick #34978	2024-06-11 10:51:59 +08:00
amory	b5a35b9cef	[FIX] Pick array inverted index bugfix (#35837 ) Picks several bug fixes for arrays with inverted index. See also: https://github.com/apache/doris/pull/34766 https://github.com/apache/doris/pull/35086 https://github.com/apache/doris/pull/34683 https://github.com/apache/doris/pull/34076	2024-06-06 09:54:14 +08:00
amory	fe1a4c4136	[Feature](IP) support ipv4/ipv6 with inverted index and conjuncts for query (#35734 ) Support the ipv4/ipv6 data types with inverted index, so that conjunct expressions on IP columns such as >, <, >=, <=, and in/not in can be sped up by the inverted index.	2024-06-03 23:24:03 +08:00
Kaijie Chen	c2fc485327	[fix](auto-partition) fix auto partition load lost data in multi sender (#35287 ) (#35630 ) ## Proposed changes Change the `use_cnt` mechanism for incremental (auto partition) channels and streams; it is now counted dynamically. Use `close_wait()` of regular partitions as a synchronization point to make sure all sinks are in the close phase before closing any incremental (auto partition) channels and streams. Add a dummy (fake) partition and tablet if there is no regular partition in the auto partition table. Backport #35287 Co-authored-by: zhaochangle <zhaochangle@selectdb.com>	2024-05-31 10:27:03 +08:00
Qi Chen	b91d2caab8	[Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. (#35587 ) backport #34929	2024-05-29 16:40:54 +08:00
yiguolei	f38ecd349c	[enhancement](memory) return error if allocate memory failed during add rows method (#35085 ) * return error when add rows failed * f --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-05-22 00:53:34 +08:00
shee	fb28d0b185	[BUG] fix scan range boundary handling is incorrect (#34832 ) Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>	2024-05-21 13:00:50 +08:00
Tiewei Fang	c0fd98abe5	[Fix](tvf) Fix that tvf reading empty files in compressed formats. (#34926 ) 1. Fix the issue with tvf reading empty compressed files. 2. move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0	2024-05-21 12:59:31 +08:00
abmdocrt	42425808a1	[Cherry-Pick](branch-2.1) Pick "Fix multiple replica partial update auto inc data inconsistency problem #34788 " (#35056 ) * [Fix](auto inc) Fix multiple replica partial update auto inc data inconsistency problem (#34788) * Problem: For tables with auto-increment columns, updating partial columns can cause data inconsistency among replicas. Cause: Previously, the implementation for updating partial columns in tables with auto-increment columns was done independently on each BE (Backend), leading to potential inconsistencies in the auto-increment column values generated by each BE. Solution: Before distributing blocks, determine if the update involves partial columns of a table with an auto-increment column. If so, add the auto-increment column to the last column of the block. After distributing to each BE, each BE will check if the data key for the partial column update exists. If it exists, the previous auto-increment column value is used; if not, the auto-increment column value from the last column of the block is used. This ensures that the auto-increment column values are consistent across different BEs. * 2 * [Fix](regression-test) Fix auto inc partial update unstable regression test (#34940)	2024-05-20 15:43:46 +08:00
xueweizhang	9b5028785d	[fix](prepare) fix datetimev2 return err when binary_row_format (#34662 ) Fix datetimev2 returning an error under binary_row_format. Before this PR, the backend always returned datetimev2 via to_string. Also fix the datetimev2 return metadata losing its scale.	2024-05-18 18:37:41 +08:00
zzzxl	cc00666be6	[opt](inverted index) add inlist condition handling to compound (#34134 ) 1. Previously, the compound did not support the inlist condition, which could impact performance if an inverted index was created.	2024-05-10 14:35:47 +08:00
wangbo	39fdc9ba0c	[refactor](executor)Rename workload schedule policy #34497	2024-05-08 08:35:20 +08:00
wangbo	299d069da9	Fix alter policy failed (#33910 )	2024-04-22 22:33:24 +08:00
wangbo	03c3419265	[Refactor](executor)Add workload schedule policy table (#33729 )	2024-04-20 20:06:34 +08:00
zclllyybb	25358564ca	[Fix](compile) Fix gcc compile on master (#33864 ) This was introduced by #33511, which wrongly used ColumnStr<T> (); this violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still supported by clang up to now (see llvm/llvm-project#58112).	2024-04-19 23:41:37 +08:00
Mryange	74590e4836	[refine](node) Remove the cse DCHECK from the constructor (#33856 ) It's possible that a failure in the fe caused the check to fail, and at that moment, it may not be possible to retrieve the corresponding query ID from be.out.	2024-04-19 23:41:37 +08:00
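
The block layout described in commit e0fac66223 (`<A [B1 compress(small block1)] [B2 ...] [B3 ...]>`) can be sketched as a small encoder/decoder pair. This is only an illustration, not the Doris or Hadoop implementation: the function names `encode_blocks` and `decode_blocks` are hypothetical, the 4-byte length fields are assumed to be big-endian, and the codec is passed in as a callable so the example runs with the standard library (`zlib` stands in for snappy here).

```python
import struct
import zlib
from typing import Callable, Iterable


def encode_blocks(large_blocks: Iterable[bytes], small_size: int,
                  compress: Callable[[bytes], bytes]) -> bytes:
    """Frame each large block as <A [B1 c1] [B2 c2] ...>."""
    out = bytearray()
    for block in large_blocks:
        # A: uncompressed length of the whole large block (4 bytes).
        out += struct.pack(">I", len(block))
        for start in range(0, len(block), small_size):
            compressed = compress(block[start:start + small_size])
            # Bx: compressed length of this small block (4 bytes).
            out += struct.pack(">I", len(compressed))
            out += compressed
    return bytes(out)


def decode_blocks(buf: bytes, decompress: Callable[[bytes], bytes]) -> bytes:
    """Read consecutive large blocks until the buffer is exhausted."""
    out = bytearray()
    pos = 0
    while pos < len(buf):
        (large_len,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        produced = 0
        # Keep reading small blocks until A bytes have been recovered.
        while produced < large_len:
            (small_len,) = struct.unpack_from(">I", buf, pos)
            pos += 4
            chunk = decompress(buf[pos:pos + small_len])
            pos += small_len
            produced += len(chunk)
            out += chunk
        if produced != large_len:
            raise ValueError("small blocks do not add up to the large block length")
    return bytes(out)


if __name__ == "__main__":
    data = [b"a" * 10, b"hello world" * 3]
    framed = encode_blocks(data, 4, zlib.compress)
    print(decode_blocks(framed, zlib.decompress) == b"".join(data))
```

The point of the sketch is that a decoder has to loop over the small blocks until `A` bytes have been produced, rather than assume one compressed chunk per large block.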


1068 Commits
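
The solution described in commit 42425808a1 (carry a pre-generated auto-increment value as the last column of the block, and let each BE keep the existing value when the key is already present) can be illustrated with a toy model. Everything here is an assumption made for illustration: `prepare_block`, `apply_on_replica`, and the dict-based store are hypothetical names and structures, not Doris code.

```python
from itertools import count
from typing import Dict, Iterator, List, Tuple


def prepare_block(rows: List[Tuple[str, str]],
                  id_gen: Iterator[int]) -> List[Tuple[str, str, int]]:
    """Sender side: append one candidate auto-increment value as the last column."""
    return [(key, value, next(id_gen)) for key, value in rows]


def apply_on_replica(store: Dict[str, dict],
                     block: List[Tuple[str, str, int]]) -> None:
    """Replica side: reuse the stored id if the key exists, else take the candidate."""
    for key, value, candidate_id in block:
        existing = store.get(key)
        auto_id = existing["id"] if existing is not None else candidate_id
        store[key] = {"id": auto_id, "value": value}


if __name__ == "__main__":
    # Two replicas that start from the same state.
    replica_a = {"k1": {"id": 7, "value": "old"}}
    replica_b = {"k1": {"id": 7, "value": "old"}}

    # The ids are generated once, before the block is distributed.
    block = prepare_block([("k1", "new"), ("k2", "x")], count(100))
    apply_on_replica(replica_a, block)
    apply_on_replica(replica_b, block)

    print(replica_a == replica_b)
```

Because the candidate ids are generated once on the sender rather than independently on each replica, both replicas end up with the same auto-increment value for newly inserted keys.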