702abbff0f
[Opt](orc)Optimize the merge io when orc reader read multiple tiny stripes. ( #42004 ) ( #44239 )
...
bp #42004
Co-authored-by: kaka11chen <kaka11.chen@gmail.com >
2024-11-22 11:01:41 +08:00
3136fa48a6
branch-2.1: [chore](ci) adjust some invalid url #44261 ( #44270 )
...
Cherry-picked from #44261
Co-authored-by: Dongyang Li <lidongyang@selectdb.com >
2024-11-19 19:28:04 +08:00
83b74827aa
branch-2.1: [fix](iceberg)Fix count(*) error with dangling delete problem #44039 ( #44101 )
...
Cherry-picked from #44039
Co-authored-by: wuwenchi <wuwenchi@selectdb.com >
2024-11-19 17:19:25 +08:00
efb3bdd96e
[fix](test) fix clickhouse jdbc catalog func push down case #43196 ( #44151 )
...
cherry pick from #43196
Co-authored-by: zy-kkk <zhongyk10@gmail.com >
2024-11-18 18:03:10 +08:00
48e33bfb2a
branch-2.1: [fix](hive)Fixed the issue of reading hive table with empty lzo files #43979 ( #44063 )
...
Cherry-picked from #43979
Co-authored-by: wuwenchi <wuwenchi@selectdb.com >
2024-11-16 16:14:50 +08:00
4531cd86e3
branch-2.1: [fix](regression-test) add checks for existence and successful upload of data files in hive-metastore.sh #43853 ( #43888 )
...
Cherry-picked from #43853
Co-authored-by: Socrates <suyiteng@selectdb.com >
2024-11-14 11:23:23 +08:00
a1ff02288f
branch-2.1: [fix](hive) support query hive view created by spark ( #43553 )
...
Cherry-picked from #43530
Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com >
Co-authored-by: morningman <yunyou@selectdb.com >
2024-11-11 23:28:53 +08:00
1ac00ea983
branch-2.1: [feat](doris compose) Copy latest compose code from master branch ( #43464 )
...
Copy latest code from master branch to support running docker suites
without an external doris cluster, enable the jvm debug port, etc.
2024-11-08 09:47:19 +08:00
cdd32d9582
[enhance](hive) support reading hive table with OpenCSVSerde #42257 ( #42940 )
...
cherry pick from #42257
Co-authored-by: Socrates <suxiaogang223@icloud.com >
2024-10-31 11:12:07 +08:00
fce4695f37
[Configuration](transactional-hive) Add skip_checking_acid_version_file session var to skip checking acid version file in some hive envs. ( #42111 )( #42225 ) ( #42939 )
...
cherry-pick (#42111 )(#42225 )
---------
Co-authored-by: Qi Chen <kaka11.chen@gmail.com >
2024-10-31 09:52:20 +08:00
2defa90be7
[test](ES Catalog)Add mapping _routing test case ( #42074 ) ( #42282 )
...
## Proposed changes
bp #42074
2024-10-23 10:14:12 +08:00
157d67e7ca
[enhance](hive) Add regression-test cases for hive text ddl and hive text insert and fix reading null string bug #42200 ( #42273 )
...
cherry pick from #42200
Co-authored-by: Socrates <suxiaogang223@icloud.com >
2024-10-22 23:56:57 +08:00
38e529cd29
[cherry-pick](branch-2.1) support decimal256 for parquet reader ( #42241 )
...
## Proposed changes
pick pr: https://github.com/apache/doris/pull/41526
2024-10-22 19:42:09 +08:00
c1d2b8d548
[2.1][improvement](jdbc catalog) Disallow non-constant type conversion pushdown and implicit conversion pushdown ( #42242 )
...
pick (#42102 )
Add a variable `enable_jdbc_cast_predicate_push_down`, false by default,
which disables pushdown of non-constant predicates with type conversion
and of all predicates with implicit conversion. Previously, predicates
with cast were not pushed down to the Jdbc data source correctly, which
could produce wrong query results; this change prevents that. If your
data was read correctly and performance was better before this change,
you can manually set this variable to true.
```
| Expression                                          | Can Push Down |
|-----------------------------------------------------|---------------|
| column type equals const type                       | Yes           |
| column type equals cast const type                  | Yes           |
| cast column type equals const type                  | No            |
| cast column type equals cast const type             | No            |
| column type not equals column type                  | No            |
| column type not equals cast const type              | No            |
| cast column type not equals const type              | No            |
| cast column type not equals cast const type         | No            |
```
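One way to read the table above, as a minimal Python sketch rather than the actual Doris implementation: a predicate is pushed down only when no CAST wraps a column and both sides end up with the same type. The `Operand` class and `can_push_down` function are hypothetical names introduced for illustration; only `enable_jdbc_cast_predicate_push_down` comes from the commit message, and treating `true` as "always push down" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Operand:
    """One side of a binary predicate (hypothetical model, not a Doris class)."""
    kind: str                      # "column" or "const"
    type_name: str                 # declared type, e.g. "INT"
    cast_to: Optional[str] = None  # target type if wrapped in CAST, else None

    @property
    def effective_type(self) -> str:
        # The type the operand has after any explicit CAST is applied.
        return self.cast_to or self.type_name


def can_push_down(left: Operand, right: Operand,
                  enable_jdbc_cast_predicate_push_down: bool = False) -> bool:
    """Decide pushdown according to one reading of the table above."""
    if enable_jdbc_cast_predicate_push_down:
        # Assumption: true restores the old behavior of pushing everything down.
        return True
    for operand in (left, right):
        # A CAST on a column is a non-constant type conversion: never pushed.
        if operand.kind == "column" and operand.cast_to is not None:
            return False
    # Differing types would need an implicit conversion: never pushed.
    return left.effective_type == right.effective_type


col_int = Operand("column", "INT")
print(can_push_down(col_int, Operand("const", "INT")))             # True
print(can_push_down(col_int, Operand("const", "STRING", "INT")))   # True
print(can_push_down(Operand("column", "STRING", "INT"),
                    Operand("const", "INT")))                      # False
print(can_push_down(col_int, Operand("const", "STRING")))          # False
```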
2024-10-22 17:27:29 +08:00
a32ad0b1f7
[cherry-pick](branch-2.1) support reading brotli compressed parquet file ( #42162 )
...
pick pr: https://github.com/apache/doris/pull/41875
2024-10-21 16:48:09 +08:00
a150d160ea
[fix](jdbc catalog) fix and add mysql and doris extremum test #41679 ( #42122 )
...
cherry pick from #41679
---------
Co-authored-by: zy-kkk <zhongyk10@gmail.com >
2024-10-21 16:39:40 +08:00
1b901f6fcc
[cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug ( #41931 )
...
## Proposed changes
pick pr:
https://github.com/apache/doris/pull/41683
https://github.com/apache/doris/pull/41506
https://github.com/apache/doris/pull/41338
https://github.com/apache/doris/pull/39326
---------
Co-authored-by: morningman <morningman@163.com >
2024-10-17 14:20:58 +08:00
4888c632f4
[cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text ( #41684 )
...
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
4f81fc474c
[bugfix](paimon)Get the file format by file name ( #41020 ) ( #41487 )
...
bp #41020
2024-09-30 15:46:13 +08:00
0b4552f74b
[cherry-pick](branch-2.1) pick hive text write from master ( #40537 )
...
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315
---------
Co-authored-by: Calvin Kirs <kirs@apache.org >
2024-09-27 20:57:07 +08:00
c744eb87c5
[fix](regression)fix some regression test ( #40928 ) ( #41046 )
...
bp #40928
2024-09-20 18:17:44 +08:00
5f583fa329
[branch-2.1][test](jdbc catalog) add oceanbase ce jdbc catalog test ( #40978 )
...
pick (#34972 )
2024-09-19 22:11:24 +08:00
7bb9ca91c8
[branch-2.1](fix) adjust data download url about hive docker ( #40846 )
...
## Proposed changes
fix paimon regression test
Co-authored-by: Dongyang Li <hello_stephen@qq.com >
Co-authored-by: stephen <hello-stephen@qq.com >
2024-09-14 23:19:54 +08:00
8708fae420
[fix](ES Catalog)Support parse single value for array column ( #40614 ) ( #40660 )
...
bp #40614
2024-09-11 17:26:48 +08:00
8104b992d1
[fix](ES Catalog)Do not extract doc_values of field with ignore_above setting ( #40314 ) ( #40464 )
...
bp #40314
2024-09-06 16:25:30 +08:00
ca07a00c93
Revert "[branch-2.1](hive) support hive write text table ( #38549 ) (#4… ( #40157 )
...
…0063)"
This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.
## Proposed changes
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-08-30 10:25:38 +08:00
c6df7c21a3
[branch-2.1](hive) support hive write text table ( #38549 ) ( #40063 )
...
1. Support write hive text table
2. Add SessionVariable `hive_text_compression` to write compressed hive
text table
3. Supported compression type: gzip, bzip2, snappy, lz4, zstd
pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
b9da934b16
[fix](hive) report error with escape char and null format ( #39700 ) ( #39869 )
...
bp #39700
Co-authored-by: Socrates <suxiaogang223@icloud.com >
2024-08-24 09:23:03 +08:00
cf698fb615
[fix](regression) fix some jdbc datasource docker health check ( #39141 ) ( #39872 )
2024-08-24 03:29:18 +08:00
508c7a7040
[fix](hive)Modify the Hive notification event processing method when using meta cache and add parameters to the Hive catalog. ( #39239 ) ( #39865 )
...
bp #39239
Co-authored-by: daidai <2017501503@qq.com >
2024-08-23 23:21:02 +08:00
40a58b9e42
[branch-2.1][regression test](jdbc catalog) Enable CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS for clickhouse docker ( #39667 )
...
pick (#39425 ) #39693
2024-08-23 09:59:03 +08:00
27ba2542e2
[case](iceberg)append iceberg schema change case. ( #38766 ) ( #39630 )
...
bp #38766
2024-08-21 09:17:12 +08:00
2948b5ea2b
[branch-2.1][fix](jdbc scan) Remove the conjuncts.remove call in JdbcScan ( #39407 )
...
pick (#39180 )
In #37565, the calling order of finalize changed, so the final generated
Plan is missing the PREDICATES that have been pushed down to Jdbc. That
behavior is correct in itself, but until the pushdown of every kind of
PREDICATE is handled perfectly, we need to keep all conjuncts so that
data is still filtered correctly when the data returned by Jdbc is a
superset of the matching rows.
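The reasoning can be illustrated with a small Python sketch. It is not the Doris `JdbcScan` code; `remote_scan`, `scan`, and the `keep_conjuncts` flag are hypothetical names, and the remote source is assumed to honor only a subset of the predicates.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def remote_scan(rows: List[Row], honored: List[Predicate]) -> List[Row]:
    """Stand-in for the JDBC source: applies only the predicates it honors."""
    return [r for r in rows if all(p(r) for p in honored)]


def scan(rows: List[Row], conjuncts: List[Predicate],
         honored: List[Predicate], keep_conjuncts: bool = True) -> List[Row]:
    """Fetch from the remote source, then optionally re-apply every conjunct."""
    fetched = remote_scan(rows, honored)
    if not keep_conjuncts:
        # Dropping the conjuncts trusts the remote result completely.
        return fetched
    # Keeping them filters out any extra rows the remote source returned.
    return [r for r in fetched if all(p(r) for p in conjuncts)]


rows = [{"id": 1, "v": 10}, {"id": 2, "v": 20}, {"id": 3, "v": 30}]
id_gt_1 = lambda r: r["id"] > 1
v_lt_30 = lambda r: r["v"] < 30

# The remote source only honors `id > 1`, so it returns a superset.
with_filter = scan(rows, [id_gt_1, v_lt_30], [id_gt_1], keep_conjuncts=True)
without_filter = scan(rows, [id_gt_1, v_lt_30], [id_gt_1], keep_conjuncts=False)
print(with_filter)     # [{'id': 2, 'v': 20}]
print(without_filter)  # [{'id': 2, 'v': 20}, {'id': 3, 'v': 30}]
```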
2024-08-16 19:01:40 +08:00
43cc8d648d
[fix](ES Catalog)Check isArray before parse json to array ( #39104 ) ( #39273 )
...
## Proposed changes
bp #39104
2024-08-13 15:13:40 +08:00
3da2d1c9d6
[bug](parquet)Fix the problem that the parquet reader reads the missing sub-columns of the struct and fails. ( #38718 ) ( #39192 )
...
bp #38718
2024-08-11 20:37:40 +08:00
607c0b82a9
[opt](serde)Optimize the filling of fixed values into block columns without repeated deserialization. ( #37377 ) ( #38245 ) ( #38810 )
...
## Proposed changes
pick pr: #38575 and fix this pr bug : #38245
2024-08-05 09:13:08 +08:00
5d02c48715
[feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. ( #38432 ) ( #38809 )
...
bp #38432
## Proposed changes
Add `hive_parquet_use_column_names` and `hive_orc_use_column_names`
session variables to read tables whose columns were renamed in `Hive`.
These two session variables are modeled on `parquet_use_column_names`
and `orc_use_column_names` of the `Trino` hive connector.
By default, both session variables are true. When they are set to
false, the orc/parquet reader accesses columns by their ordinal
position in the Hive table definition.
For example:
```mysql
-- In Hive:
hive> create table tmp (a int , b string) stored as parquet;
hive> insert into table tmp values(1,"2");
hive> alter table tmp change column a new_a int;
hive> insert into table tmp values(2,"4");
-- In Doris:
mysql> set hive_parquet_use_column_names=true;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|  NULL | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)
mysql> set hive_parquet_use_column_names=false;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|     1 | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)
```
In hive 3, you can use `set parquet.column.index.access = true/false`
and `set orc.force.positional.evolution = true/false` to control how
the table is read, similar to these two session variables. However, for
parquet tables where a rename happens inside a struct column, hive and
doris behave differently.
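The difference between the two modes can be sketched in a few lines of Python. This is an illustration of name-based versus position-based column resolution under simplified assumptions, not the Doris reader; `read_rows` and its parameters are hypothetical names, and each "file" is modeled as a list of column names plus rows.

```python
from typing import List, Optional, Sequence, Tuple


def read_rows(file_columns: Sequence[str], file_rows: Sequence[tuple],
              table_columns: Sequence[str],
              use_column_names: bool) -> List[Tuple[Optional[object], ...]]:
    """Map rows of one data file onto the current table schema."""
    result = []
    for row in file_rows:
        if use_column_names:
            # Match by name: a column the file does not contain becomes NULL.
            by_name = dict(zip(file_columns, row))
            result.append(tuple(by_name.get(c) for c in table_columns))
        else:
            # Match by ordinal position in the table definition.
            result.append(tuple(row[i] if i < len(row) else None
                                for i in range(len(table_columns))))
    return result


table_columns = ["new_a", "b"]         # schema after the rename
old_file = (["a", "b"], [(1, "2")])    # written before the rename
new_file = (["new_a", "b"], [(2, "4")])  # written after the rename

by_name = [r for cols, rows in (old_file, new_file)
           for r in read_rows(cols, rows, table_columns, True)]
by_position = [r for cols, rows in (old_file, new_file)
               for r in read_rows(cols, rows, table_columns, False)]
print(by_name)      # [(None, '2'), (2, '4')]
print(by_position)  # [(1, '2'), (2, '4')]
```

The two printed lists correspond to the two `select * from tmp` results in the example above.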
2024-08-05 09:06:49 +08:00
c0caca7c55
[fix](ES Catalog)Fix unstable test test_es_query ( #38801 ) ( #38802 )
...
## Proposed changes
bp #38801
2024-08-03 23:49:00 +08:00
b0943064e0
[fix](kerberos)fix and refactor ugi login for kerberos and simple authentication ( #38607 )
...
pick from (#37301 )
2024-08-01 14:01:32 +08:00
41fa7bc9fd
[bugfix](paimon)Fixed the reading of timestamp with time zone type data for 2.1 ( #37716 ) ( #38592 )
...
bp: #37716
2024-08-01 10:23:06 +08:00
ef8a1918c3
[case][fix](iceberg)move rest cases from p2 to p0 and fix iceberg version issue for 2.1 ( #37898 ) ( #38589 )
...
bp: #37898
2024-07-31 22:41:56 +08:00
86dd2d24ce
[fix](test) Modify SQLServer image to custom hub ( #38515 ) ( #38613 )
...
pick from master #38515
Co-authored-by: zy-kkk <zhongyk10@gmail.com >
2024-07-31 19:21:28 +08:00
c011060e4f
[chore](ci) adjust thirdparty docker image source for easy management… ( #38558 )
...
… (#37307 )
pick from master #37307
Co-authored-by: stephen <hello-stephen@qq.com >
2024-07-31 14:47:16 +08:00
f7068b5658
[cherry-pick](branch-2.1) Make doris read hive text table parameters and behavior consistent with hive ( #37840 )
...
## Proposed changes
pick from master https://github.com/apache/doris/pull/37638
2024-07-16 22:24:50 +08:00
bdf3e3a17e
[test](docker) change the default region for docker compose ( #37768 ) ( #37813 )
...
bp #37768
2024-07-15 22:18:33 +08:00
e5339a4014
[feature](ES Catalog)Support control scroll level by config #37180 ( #37290 )
...
## Proposed changes
backport #37180
2024-07-15 16:41:38 +08:00
ea12114549
[fix](dockerfile) Switch repos to point to vault.centos.org because CentOS 7 is EOL ( #37568 ) ( #37763 )
...
bp #37568
2024-07-15 15:57:56 +08:00
16de141743
[regression](kerberos)add hive kerberos docker regression env ( #37657 )
...
## Proposed changes
pick:
[regression](kerberos)fix regression pipeline env when write hosts
(#37057 )
[regression](kerberos)add hive kerberos docker regression env (#36430 )
2024-07-15 09:35:39 +08:00
56a207c3f0
[case](paimon/iceberg)move cases from p2 to p0 ( #37276 ) ( #37738 )
...
bp #37276
Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com >
2024-07-13 10:01:05 +08:00
81360cf897
[opt](test) shorten the external p0 running time ( #37320 ) ( #37473 )
...
bp #37320
2024-07-09 15:35:15 +08:00