18404 Commits

Author SHA1 Message Date
7cb00a8e54 [Feature](hive-writer) Implements s3 file committer. (#34307)
Backport #33937.
2024-04-29 19:56:49 +08:00
1bfe0f0393 [feature](iceberg)support read iceberg complex type,iceberg.orc format and position delete. (#33935) (#34256)
master #33935
2024-04-29 14:40:12 +08:00
9b7e007ef6 [Bug](union) fix union operator set eos is not incorrect (#34250)
* [test](case) fix unstable case without order by distinct row

* [Bug](union) fix union operator set eos is not incorrect
2024-04-29 13:38:03 +08:00
20bd0c2987 [FIX](cases )fix ipv6 value for regress case 2024-04-29 13:37:29 +08:00
222289697d [improve](regression) Support qt_target_sql (#34236) (#34270) 2024-04-29 11:50:35 +08:00
5277a55791 (pick 34003) release fd for shutdown tablets (#34224) 2024-04-29 10:51:19 +08:00
946d28646a [fix](outfile)Fixed orcOutputStream.close() throwing an exception during destruction causing the program to hang. (#34254)
bp #34243
2024-04-28 19:54:34 +08:00
417431fd83 [Enhancement](hdfs-file-system) Change fs_handler ptr to shared_ptr and remove ref count operations. (#34049)
Backport #33959.
2024-04-28 19:45:30 +08:00
99af54f779 [Fix](orc-reader) Fix the issue when string col has mixed plain and dict encoding in different stripes. (#34146) (#34248)
backport #34146
2024-04-28 19:43:57 +08:00
11039ade7b [opt](paimon) support mapping Paimon column type "Row" to Doris type "Struct" (#34239)
backport: #33786
2024-04-28 19:38:50 +08:00
1fda68f738 [feature](planner) Support select constant from dual syntax sugar (#34200) (#34232)
In MySQL, it's common to use a simplified syntax like `SELECT constant FROM dual`
which is equivalent to just `SELECT constant`.
This syntax is often used by BI tools when utilizing MySQL connectors to verify connection validity.
To enhance compatibility and ensure seamless integration with such tools,
we have now implemented this feature in Doris.

### Key Changes:
- Doris now interprets `SELECT constant FROM dual` as `SELECT constant`, aligning with MySQL's behavior.
- This update ensures that BI tools can use standard MySQL connectors without modifications or errors when connecting to Doris.
2024-04-28 15:56:16 +08:00
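
The `FROM dual` equivalence described in the commit above can be illustrated with a small sketch. This is not the Doris planner change itself, which this log does not show. The helper name `strip_from_dual` and the regex approach are assumptions made for illustration, and the sketch only handles the simple case where `FROM dual` ends the statement.

```python
import re

# Hypothetical illustration only; the real change lives in the Doris planner.
# Matches a trailing "FROM dual" (case-insensitive) at the end of a statement.
_FROM_DUAL = re.compile(r"\s+from\s+dual$", re.IGNORECASE)


def strip_from_dual(sql: str) -> str:
    """Return the statement with a trailing `FROM dual` removed, if present.

    A trailing semicolon and surrounding whitespace are dropped first.
    Statements that do not end in `FROM dual` are returned otherwise unchanged.
    """
    statement = sql.strip().rstrip(";").rstrip()
    return _FROM_DUAL.sub("", statement)


if __name__ == "__main__":
    print(strip_from_dual("SELECT 1 FROM dual"))
    print(strip_from_dual("select 'ok' from DUAL;"))
    print(strip_from_dual("SELECT a FROM t"))
```

A real implementation would work on the parsed statement rather than on text, so that clauses such as `WHERE` after `dual` are handled as well.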
341f5cd7a3 [fix](branch-2.1) Fix streamload profile not set (#34221) 2024-04-28 14:36:58 +08:00
45556686ea [fix](test) fix some external test cases (#34209)
Fix some test cases and enable `test_information_schema_external` suite
2024-04-27 23:25:33 +08:00
a6bf35efdf 2.1.3-rc03 2024-04-27 20:54:06 +08:00
7ab425ee4b [improve](move-memtable) reduce default load stream per node to 2 for stream load (#34065) (#34205)
Co-authored-by: Kaijie Chen <ckj@apache.org>
2024-04-27 18:20:57 +08:00
cd1c9edd71 [fix](pipeline-load) fix no error url when data quality error and total rows is negative (#34072) (#34204)
Co-authored-by: HHoflittlefish777 <77738092+HHoflittlefish777@users.noreply.github.com>
2024-04-27 18:19:08 +08:00
36e80af327 [fix](schema change) fix the defineName field is not the same when copying column (#34201)
* [fix](schema change) fix the defineName field is not the same when copying column

* fix
2024-04-27 11:59:07 +08:00
cf700a62b6 [test](case) fix unstable case without order by distinct row (#34167) 2024-04-27 11:20:36 +08:00
30a68c1240 [fix](spill) use different algorithm to avoid partition data skew (#34162) 2024-04-27 11:20:36 +08:00
4b9772062b [refactor](statistic) fetch statistic data with catalog and database id (#33928) (#34202)
bp #33928
2024-04-27 09:38:41 +08:00
c998e2f714 [Enhancement](planner) Support string input for sql_select_limit (#34177) 2024-04-27 02:29:47 +08:00
414fbd353e [fix](ES catalog)Make col != '' behavior consistent with SQL (#34151)
In SQL syntax, `col != ''` equals `col.length() > 0`.
It means that this column must exist in ES doc fields and its content is not empty.
In this PR, we make a special translation for this binary predicate to keep the behavior of both consistent.

---------

Co-authored-by: Luennng <luennng@gmail.com>
2024-04-27 02:29:33 +08:00
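
The semantics stated in the commit above (the field must exist and its content must not be empty) could be expressed in Elasticsearch query DSL roughly as follows. This is a sketch under assumptions: the exact query the PR generates is not shown in this log, and the function name `not_empty_string_query` is hypothetical. It combines the standard `bool`, `exists`, and `term` queries.

```python
import json


def not_empty_string_query(field: str) -> dict:
    """Build an ES query body fragment for the SQL predicate `field != ''`.

    Assumed translation (not taken from the PR): the field must exist in
    the document, and its value must not be the empty string.
    """
    return {
        "bool": {
            "must": [{"exists": {"field": field}}],
            "must_not": [{"term": {field: ""}}],
        }
    }


if __name__ == "__main__":
    print(json.dumps(not_empty_string_query("col"), indent=2))
```

The `exists` clause is what keeps documents that lack the field out of the result, which a bare `must_not` on the empty string would otherwise let through.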
3ba42a7823 [improvement](mtmv) Optimize the nested materialized view performance (#34163)
Record refresh version increases more accurately.
The refreshVersion in the memo increases when an MV rewrite succeeds.
During query rewrite, the group's struct info is refreshed only if the refresh version
differs from that of the current struct info map in the group.
2024-04-27 02:29:33 +08:00
970d0c80df [Improvement](agg) Improve count distinct distribute keys (#33167) 2024-04-27 02:29:33 +08:00
c125148deb [opt](Nereids) bucket shuffle downgrade expansion (#34088)
Expand the bucket shuffle downgrade condition, which originally required a single partition after pruning, a basic table, and bucket number < para number. We now expect this option to disable bucket shuffle more efficiently, without the above restrictions.

Co-authored-by: zhongjian.xzj <zhongjian.xzj@zhongjianxzjdeMacBook-Pro.local>
2024-04-27 02:29:33 +08:00
5e9eb417ad [bugfix](insert) fix cherry pick : redundant branch judgment #34160 2024-04-27 02:19:39 +08:00
90040e7f92 [Fix] fix compile problem (#34184)
fix compile problem in branch-2.1
2024-04-26 17:57:16 +08:00
10e098845d [fix](compile) fix two compile errors on MacOS (#33834) (#34149) 2024-04-26 17:02:44 +08:00
627245f93a [fix](Nereids) support not in predicate for delete command (#34153) 2024-04-26 15:06:28 +08:00
0f0c0a266b [opt](parquet)Skip page with offset index (#33082)
Make skip_page() in ColumnChunkReader more efficient: page headers are no longer read if the chunk has page locations.
2024-04-26 15:06:16 +08:00
acc2b532e7 [Test](hive-writer) Adjust test_hive_write_partitions regression test to resolve special characters issue with git on windows. (#34026) 2024-04-26 15:05:47 +08:00
91887a285e Implement HLL with 128 buckets to support statistics cache. (#34124) 2024-04-26 15:05:36 +08:00
b24ff9953d [fix](Nereids) column pruning should prune map in cte consumer (#34079)
We save a bi-map in the CTE consumer to get the mapping between producer and consumer.
The consumer's output is decided by this map.
So the CTE consumer should be output-prunable and should remove useless entries from the map during column pruning.
2024-04-26 15:05:19 +08:00
b41a5339d3 [Fix](nereids) fix rule merge_aggregate when has project (#33892) 2024-04-26 15:05:09 +08:00
a34ed4643a [fix](planner)date_add function should accept date type as its param (#34035) 2024-04-26 15:04:45 +08:00
5adc823b14 [fix](nereids)move ReplaceVariableByLiteral rule to analyze phase (#33997) 2024-04-26 15:04:45 +08:00
b7b87fbb95 [fix](planner)cast expr should do nothing in compactForLiteral method (#34047) 2024-04-26 15:04:45 +08:00
fdf91759b6 [fix](nereids)prevent null pointer access if translate expression fails (#33990) 2024-04-26 15:04:33 +08:00
60e20a3afe [fix](pipeline_x) Crc32HashPartitioner should use ShuffleChannelIds (#34147) 2024-04-26 15:03:11 +08:00
7f4b7b04ad [test](hive)add subnet for hive docker compose (#34000) (#34157)
bp #34000
Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>
2024-04-26 13:49:33 +08:00
50f9d47e96 [test](hive) run suite cases both in hive2 and hive3 (#33874) (#34156)
bp #33874

Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>
2024-04-26 13:48:09 +08:00
9aa08d8deb [improve](disk) Not add disk path to broken list if check status is not IO_ERROR (#34111) 2024-04-26 07:44:12 +08:00
7af5fc7321 [fix](Nereids) check after rewrite cannot handle agg in other operator (#34114)
This was a simple mistake: a class with the same name was imported from another package.
2024-04-26 07:43:40 +08:00
4f6b9db7a7 Update doris_main.cpp (#34128)
* Update doris_main.cpp

Log(FATAL) introduces a core dump, which is confusing for users. We should print error msg and exit without a core dump.

* Update doris_main.cpp
2024-04-26 07:43:40 +08:00
9f0a5690a6 [profile](scan) add projection time in scanner #34120 2024-04-26 07:43:40 +08:00
52031c86b7 [improvement](mtmv) Optimize the performance of nested materialized view rewriting (#34127)
Optimize the performance of nested materialized view rewriting gracefully; future performance optimizations will build on this.
2024-04-26 07:43:25 +08:00
55d5ed9ab6 [test](streamload) add load empty file regression test (#34110) 2024-04-26 07:42:09 +08:00
Pxl
7fbca522b7 [Bug](runtime-filter) fix bloom filter size error on rf merge (#34082)
fix bloom filter size error on rf merge

W20240424 11:28:56.826277 3494287 ref_count_closure.h:80] RPC meet error status: [INVALID_ARGUMENT]PStatus: (172.21.0.15)[INVALID_ARGUMENT]bloom filter size not the same: already allocated bytes 65536, expected allocated bytes 32768
2024-04-26 07:41:56 +08:00
75644392f4 [fix](Nereids) support aggregate function only in having statement (#34086)
SQL like

> SELECT 1 AS c1 FROM t HAVING count(1) > 0 OR c1 IS NOT NULL
2024-04-26 07:41:45 +08:00
a237f7ec6e [feature](Nereids): add equal set in functional dependencies (#33642) 2024-04-26 07:41:45 +08:00