265 Commits

Author SHA1 Message Date
09f2516c7c [fix](test) fix some test cases #43217 (#43216)
bp #43217
2024-11-04 22:15:30 +08:00
98d3db03b1 [fix](regression-test) fix test_hive_serde_prop #42886 (#43098)
cherry pick from #42886

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-11-01 23:11:45 +08:00
c573351e4e [fix](tvf) fix FE cannot start when replay alter view from tvf (#42866)
bp: #40872
2024-10-31 14:15:48 +08:00
cc30a7e78e [fix](test) fix some unstable external p0 test cases (#42685) (#42943)
cherry-pick #42685

Co-authored-by: daidai <2017501503@qq.com>
2024-10-31 12:36:03 +08:00
cdd32d9582 [enhance](hive) support reading hive table with OpenCSVSerde #42257 (#42940)
cherry pick from #42257

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-10-31 11:12:07 +08:00
4a62d9e44b Revert "[2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool" (#42481)
Reverts apache/doris#42255

We found that disabling the connection pool causes class-loading and
connection-release problems for some data sources. We are removing this
feature for now and will re-add it once these problems are fixed and
fully tested.
2024-10-25 19:37:36 +08:00
2defa90be7 [test](ES Catalog)Add mapping _routing test case (#42074) (#42282)
## Proposed changes

bp #42074
2024-10-23 10:14:12 +08:00
157d67e7ca [enhance](hive) Add regression-test cases for hive text ddl and hive text insert and fix reading null string bug #42200 (#42273)
cherry pick from #42200

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-10-22 23:56:57 +08:00
bde8e2d474 [2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool (#42255)
pick (#41992)

We originally introduced the JDBC connection pool to improve the
connection performance of the JDBC catalog, but the pool repeatedly
caused unexpected errors. This change therefore adds a catalog property,
`enable_connection_pool`, that controls whether the JDBC catalog uses
the connection pool; it defaults to false. However, catalogs that
already exist keep the connection pool enabled after an upgrade; only
newly created catalogs default to false.

We ran performance tests on this change, and the performance loss is
within the expected range.

- Enable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM mysql.test.test
limit 1;' --create-schema=mysql --delimiter=";" --verbose
Benchmark
        Average number of seconds to run all queries: 0.008 seconds
        Minimum number of seconds to run all queries: 0.004 seconds
        Maximum number of seconds to run all queries: 0.133 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1

- Disable connection pool: mysqlslap -uroot -h127.0.0.1 -P9030
--concurrency=1 --iterations=100 --query='SELECT * FROM
mysql_no_pool.test.test limit 1;' --create-schema=mysql --delimiter=";"
--verbose
Benchmark
        Average number of seconds to run all queries: 0.054 seconds
        Minimum number of seconds to run all queries: 0.047 seconds
        Maximum number of seconds to run all queries: 0.184 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1
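
As a rough reading of the two runs above, the reported averages can be compared directly. The sketch below only restates those averages; the variable names are illustrative and not part of the change.

```python
# Average seconds per run, copied from the two mysqlslap reports above.
avg_with_pool = 0.008
avg_without_pool = 0.054

# Absolute and relative cost of disabling the pool in this
# single-client, single-query test.
extra_seconds = avg_without_pool - avg_with_pool
slowdown = avg_without_pool / avg_with_pool

print(f"extra latency per query: {extra_seconds * 1000:.0f} ms")
print(f"slowdown factor: {slowdown:.2f}x")
```

In this test, disabling the pool adds roughly 46 ms per query, which is consistent with paying for a new connection on each query.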
2024-10-22 23:28:28 +08:00
c1d2b8d548 [2.1][improvement](jdbc catalog) Disallow non-constant type conversion pushdown and implicit conversion pushdown (#42242)
pick (#42102)

Add a variable `enable_jdbc_cast_predicate_push_down`, false by default,
which prohibits pushing down non-constant predicates that involve a type
conversion and all predicates that involve an implicit conversion. This
prevents wrong predicates from being pushed down to the JDBC data source
and producing incorrect query results, because predicates with a cast
were previously not pushed down correctly. If your data was read
correctly and performance was better before this change, you can
manually set this variable to true.

```
| Expression                                          | Can Push Down |
|-----------------------------------------------------|---------------|
| column type equals const type                       | Yes           |
| column type equals cast const type                  | Yes           |
| cast column type equals const type                  | No            |
| cast column type equals cast const type             | No            |
| column type not equals column type                  | No            |
| column type not equals cast const type              | No            |
| cast column type not equals const type              | No            |
| cast column type not equals cast const type         | No            |

```
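
One way to read the table above is as a two-part rule: a comparison is pushed down only when the column side carries no cast and its type equals the type of the constant side (after any cast on the constant). The sketch below encodes that reading. It is not the Doris implementation, and `Comparison` and `can_push_down` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """Hypothetical summary of a `column <op> constant` predicate."""
    column_type: str      # type of the column side, after any cast on it
    column_is_cast: bool  # True if the column side is wrapped in a cast
    const_type: str       # type of the constant side, after any cast on it


def can_push_down(cmp: Comparison) -> bool:
    # Rows 3, 4, 7, 8 of the table: any cast on the column side blocks pushdown.
    if cmp.column_is_cast:
        return False
    # Rows 1, 2 versus rows 5, 6: the types must match exactly, so that
    # no implicit conversion would be needed on the data source.
    return cmp.column_type == cmp.const_type


print(can_push_down(Comparison("INT", False, "INT")))      # uncast column, same type
print(can_push_down(Comparison("INT", True, "INT")))       # cast on the column side
print(can_push_down(Comparison("INT", False, "VARCHAR")))  # types differ
```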
2024-10-22 17:27:29 +08:00
47ff6f1300 [fix](OrcReader) fix the issue that orc_reader can not read DECIMAL(0,0) type of orc file #41795 (#42220)
cherry pick from #41795

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-10-22 10:10:25 +08:00
e713b92321 [fix](multi-catalog) Disable string dictionary filtering when predicate express is not slot #42113 (#42222)
cherry pick from #42113

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-10-22 09:43:29 +08:00
084434e25c [Test](tvf) add regression tests for testing orc reader #41606 #42188 (#42120)
cherry pick from #42031 #42188

---------

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: TieweiFang <ftw2139@163.com>
2024-10-21 21:31:18 +08:00
bbd4970ed8 [feature](jdbc catalog) support gbase jdbc catalog #41027 #41587 (#42123)
cherry pick from #41027 #41587

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-10-21 16:52:23 +08:00
a32ad0b1f7 [cherry-pick](branch-2.1) support reading brotli compressed parquet file (#42162)
pick pr: https://github.com/apache/doris/pull/41875
2024-10-21 16:48:09 +08:00
a150d160ea [fix](jdbc catalog) fix and add mysql and doris extremum test #41679 (#42122)
cherry pick from #41679

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-10-21 16:39:40 +08:00
5ba0da4a84 [fix](test) fix unstable external p0 cases #42069 (#42153)
cherry pick from #42069
2024-10-21 15:04:40 +08:00
968e33f07e [cherry-pick](branch-21) pick (#39057) (#41352) (#41958)
## Proposed changes

pick from master (#39057) (#41352)

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
1b901f6fcc [cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug (#41931)
## Proposed changes
pick pr:
  https://github.com/apache/doris/pull/41683
  https://github.com/apache/doris/pull/41506
  https://github.com/apache/doris/pull/41338
  https://github.com/apache/doris/pull/39326

---------

Co-authored-by: morningman <morningman@163.com>
2024-10-17 14:20:58 +08:00
cf2ec26bc2 [fix](catalog) should return error if try using a unknown database (#40479) (#41971)
bp #40479
2024-10-17 11:13:56 +08:00
4888c632f4 [cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684)
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
b2bac26c17 [fix](jdbc catalog) Disable oracle scan null operator pushdown (#41563) (#41712)
Because Oracle versions below Oracle21 do not support null as an
operator, and considering that most users' Oracle versions are below
Oracle21, we disable Oracle's null operator pushdown by default.
pick (#41563)
2024-10-11 21:01:05 +08:00
8c0f73cb90 [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) (#41610)
bp #40225 , #40888 ,#41386

## Proposed changes
Of these, #40225 introduces the new MaxCompute (mc) API,
#40888 fixes a bug when reading null between the new and old APIs,
and #41386 adds compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
0fb42d3a48 [Enhancement](tvf)catalog tvf implements user permission checks and hides sensitive information (#41497) (#41604)
bp #41497 

before #21790
## Proposed changes
This PR unifies the duplicate parts of `catalog tvf` and `show
catalogs`, adds permission check when querying `catalog tvf`, and hides
sensitive information.
2024-10-10 17:55:40 +08:00
308700f0ca [fix](test) fix unstable test_export_external_table cases (#41523) (#41570)
bp #41523
2024-10-09 11:53:22 +08:00
4f81fc474c [bugfix](paimon)Get the file format by file name (#41020) (#41487)
bp #41020
2024-09-30 15:46:13 +08:00
0b4552f74b [cherry-pick](branch-2.1) pick hive text write from master (#40537)
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
2024-09-27 20:57:07 +08:00
4deda2fce7 [improvement](nereids) Simplify ScanNode projection handling by removing redundant conditions (#40801) (#41315)
pick from master #40801

This PR simplifies the handling of `ScanNode` projection logic.
Previously, the code included multiple conditional checks to determine
whether a `projectionTuple` should be generated. These conditions have
been removed, and now `projectionTuple` is always generated for
`ScanNode`, ensuring a consistent projection setup. Additionally,
redundant handling of `SlotId` and `SlotRef` has been eliminated, making
the code cleaner and easier to maintain. The behavior for `OlapScanNode`
remains unchanged.
2024-09-26 10:35:01 +08:00
5b3b2cec80 [feat](metatable) support table$partitions for hive table (#40774) (#41230)
bp #40774
and pick part of #34552, add `isPartitionedTable()` interface in `TableIf`
2024-09-25 09:52:07 +08:00
e0fac66223 [branch-2.1](fix) fix snappy decompressor bug (#40862)
## Proposed changes
Hadoop SnappyCodec source:

https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/codec/SnappyCodec.cc
Example:
OriginData (the original data is divided into several large data
blocks):
     large data block1 | large data block2 | large data block3 | ....
Each large data block is divided into several small data blocks.
Suppose a large data block is divided into three small blocks:
large data block1: | small block1 | small block2 | small block3 |
CompressData: <A [B1 compress(small block1)] [B2 compress(small block2)]
[B3 compress(small block3)]>

A: original (uncompressed) length of the current large data block.
sizeof(A) = 4 bytes.
A = length(small block1) + length(small block2) + length(small block3)
Bx: compressed length of small data block x.
sizeof(Bx) = 4 bytes.
Bx = length(compress(small block x))
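
The framing described above can be sketched as a small encoder and decoder. This is an illustration under stated assumptions, not the Doris decompressor: the 4-byte lengths are assumed to be big-endian (the description does not state the byte order), `zlib` stands in for Snappy because the Python standard library has no Snappy codec, and the function names are hypothetical.

```python
import struct
import zlib
from typing import Callable, List

Codec = Callable[[bytes], bytes]


def encode_large_block(small_blocks: List[bytes], compress: Codec) -> bytes:
    """Frame one large block as <A [B1 c1] [B2 c2] ...>."""
    original_len = sum(len(block) for block in small_blocks)
    out = bytearray(struct.pack(">I", original_len))   # A (assumed big-endian)
    for block in small_blocks:
        compressed = compress(block)
        out += struct.pack(">I", len(compressed))      # Bx
        out += compressed                              # compress(small block x)
    return bytes(out)


def decode_stream(data: bytes, decompress: Codec) -> bytes:
    """Decode a concatenation of framed large blocks."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        (original_len,) = struct.unpack_from(">I", data, pos)        # A
        pos += 4
        produced = 0
        # Keep reading small blocks until their uncompressed sizes add up to A.
        while produced < original_len:
            (compressed_len,) = struct.unpack_from(">I", data, pos)  # Bx
            pos += 4
            chunk = decompress(data[pos:pos + compressed_len])
            pos += compressed_len
            if not chunk:
                raise ValueError("empty small block inside a non-empty large block")
            produced += len(chunk)
            out += chunk
        if produced != original_len:
            raise ValueError("small blocks do not add up to A")
    return bytes(out)


# Two large blocks; the first is split into three small blocks.
stream = (
    encode_large_block([b"aaaa", b"bbbbbb", b"cc"], zlib.compress)
    + encode_large_block([b"dddd"], zlib.compress)
)
restored = decode_stream(stream, zlib.decompress)
print(restored)
```

The decoder never needs the number of small blocks up front: A alone tells it when the current large block is complete, which is why a reader must track the uncompressed bytes produced rather than the compressed bytes consumed.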
2024-09-20 11:57:14 +08:00
5f583fa329 [branch-2.1][test](jdbc catalog) add oceanbase ce jdbc catalog test (#40978)
pick (#34972)
2024-09-19 22:11:24 +08:00
774efe78e6 [fix](regression)fix maxcompute p0 case (#40933)
2024-09-19 01:09:53 +08:00
4b7b43b5ca [bugfix](hive/iceberg)align with Hive insert overwrite table functionality (#39840) (#40724)
bp #39840
2024-09-12 19:20:15 +08:00
8708fae420 [fix](ES Catalog)Support parse single value for array column (#40614) (#40660)
bp #40614
2024-09-11 17:26:48 +08:00
314f6ae823 [fix](ES Catalog)Fix int parse error when querying by doc_values (#40385) (#40521)
bp #40385
2024-09-09 14:29:21 +08:00
962c382077 [fix](jdbc catalog) Fix type recognition error when using query tvf to query doris (#40481)
pick  (#40122)

Using string to match Doris type will not work with query tvf, so use
field matching instead
2024-09-06 19:30:32 +08:00
8104b992d1 [fix](ES Catalog)Do not extract doc_values of field with ignore_above setting (#40314) (#40464)
bp #40314
2024-09-06 16:25:30 +08:00
41271ecba0 [fix](ES Catalog)Do not push down limit to ES when predicates can not be processed by ES. (#40111) (#40265)
bp #40111
2024-09-03 11:17:24 +08:00
ca07a00c93 Revert "[branch-2.1](hive) support hive write text table (#38549) (#4… (#40157)
…0063)"

This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-30 10:25:38 +08:00
a7156ee775 [fix](parquet)Fix the be core issue when reading parquet unsigned types. (#39926) (#40123)
bp #39926
2024-08-29 21:52:52 +08:00
c6df7c21a3 [branch-2.1](hive) support hive write text table (#38549) (#40063)
1. Support write hive text table
2. Add SessionVariable `hive_text_compression` to write compressed hive
text table
3. Supported compression type: gzip, bzip2, snappy, lz4, zstd

pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
263746b04b [fix](paimon) fix crash when enable cache with paimon deletion vector(#39877) (#39875)
bp #39877
2024-08-24 17:58:20 +08:00
de2e8f0ae6 [fix](ctas) fix NPE when ctas with old planner and varchar issue (#39744) (#39871)
bp #39744
2024-08-24 09:24:47 +08:00
c40246efa9 [bugfix](iceberg)Fixed random core with writing iceberg partitioned table for 2.1 (#39808)(#39569) (#39832)
## Proposed changes

bp: #39808 #39569
2024-08-23 17:19:48 +08:00
67a8099991 [fix](multi-catalog)fix max compute array and map type read offset (#39822)
bp #39680
2024-08-23 16:53:52 +08:00
1f16daa5f6 Revert "[bugfix](iceberg)clear block for partition values for 2.1 (#39569)" (#39815)
Reverts apache/doris#39729
2024-08-23 11:58:42 +08:00
40a58b9e42 [branch-2.1][regression test](jdbc catalog) Enable CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS for clickhouse docker (#39667)
pick (#39425) #39693
2024-08-23 09:59:03 +08:00
dc732fe33f [bugfix](iceberg)clear block for partition values for 2.1 (#39569) (#39729)
## Proposed changes

bp: #39569

clear block, or we will get wrong partition values.
2024-08-22 22:43:02 +08:00
f553645a71 [fix](mtmv) transfer col in mysql varchar to text when create MTMV (#37668) (#39727)
pick from master #37668
2024-08-22 15:20:59 +08:00
ebbebdf590 [regression](kerberos)add hive with kerberos write back case (#39682)
bp #38647
2024-08-21 18:29:42 +08:00