Commit Graph

246 Commits

Author SHA1 Message Date
ed94a6d21a branch-2.1: [fix](docker)Add docker-ps 'sudo' permissions #52395 (#52457)
Cherry-picked from #52395

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
Co-authored-by: wuwenchi.wwc <wuwenchi.wwc@oceanbase.com>
2025-06-28 20:06:51 +08:00
3a1e95c6c2 branch-2.1: [improvement](jdbc catalog) Optimize the acquisition of identity type in SQLServer (#51659)
pick #51285
2025-06-16 16:50:37 +08:00
ac65fed0ed branch-2.1: [fix](jdbc test) Add more connections to mysql docker #50970 (#51210)
Cherry-picked from #50970

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2025-05-24 17:46:23 +08:00
5c344ea043 branch-2.1: [opt](docker) add a script flag to control load data or not #51065 (#51083)
Cherry-picked from #51065

Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>
2025-05-21 12:09:07 +08:00
13fbc9efa6 branch-2.1: [fix](hive) fix write hive partition by Doris #50864 (#50921)
Cherry-picked from #50864

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2025-05-17 16:14:23 +08:00
48778eab4d branch-2.1: [fix](iceberg)Fix the inconsistency between the data in pg and the data in MinIO. #50578 (#50641)
Cherry-picked from #50578

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2025-05-07 23:15:02 +08:00
4aff17f355 branch-2.1: [fix](docker hive3) hive server oom and not auto-restart #50456 (#50507)
Cherry-picked from #50456

Co-authored-by: Thearas <gaozifeng@selectdb.com>
2025-05-03 22:44:29 +08:00
0710d9b2d6 branch-2.1: [fix](orc) Should not pass selection vector when decode child column of List or Map #50136 (#50316)
bp: #50136
2025-04-25 09:04:06 +08:00
94986fc574 branch-2.1: [fix](multi-catalog) Fix bug: "Can not create a Path from an empty string" (#49382) (#49641)
### What problem does this PR solve?
Problem Summary:
In HiveMetaStoreCache, the function FileInputFormat.setInputPaths is
used to set input paths. However, this function splits paths using
commas, which is not the expected behavior. As a result, when partition
values contain commas, it leads to incorrect path parsing and potential
errors.
```java
  public static void setInputPaths(JobConf conf, String commaSeparatedPaths) {
    setInputPaths(conf, StringUtils.stringToPath(
                        getPathStrings(commaSeparatedPaths)));
  }
```
To prevent FileInputFormat.setInputPaths from splitting paths by commas,
we use another overloaded version of the method. Instead of passing a
comma-separated string, we explicitly pass a Path object, ensuring that
partition values containing commas are handled correctly.
```java
  public static void setInputPaths(JobConf conf, Path... inputPaths) {
    Path path = new Path(conf.getWorkingDirectory(), inputPaths[0]);
    StringBuffer str = new StringBuffer(StringUtils.escapeString(path.toString()));
    for(int i = 1; i < inputPaths.length;i++) {
      str.append(StringUtils.COMMA_STR);
      path = new Path(conf.getWorkingDirectory(), inputPaths[i]);
      str.append(StringUtils.escapeString(path.toString()));
    }
    conf.set(org.apache.hadoop.mapreduce.lib.input.
      FileInputFormat.INPUT_DIR, str.toString());
  }
```
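
The difference between the two overloads can be illustrated without Hadoop on the classpath. The following is a minimal sketch, not Hadoop's implementation: `CommaPathDemo`, `naiveSplit`, `escape`, and `splitEscaped` are hypothetical names, and the backslash escaping is a simplified stand-in for what `StringUtils.escapeString` does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommaPathDemo {
    // Naive behaviour: every comma is treated as a path separator.
    public static List<String> naiveSplit(String commaSeparatedPaths) {
        return Arrays.asList(commaSeparatedPaths.split(","));
    }

    // Escape the escape character and commas inside a single path.
    public static String escape(String path) {
        return path.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Split only on commas that are not escaped with a backslash.
    public static List<String> splitEscaped(String joined) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == '\\' && i + 1 < joined.length()) {
                current.append(joined.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical partition location whose value contains a comma.
        String partitionPath = "/warehouse/t/city=a,b";
        System.out.println(naiveSplit(partitionPath));            // two broken fragments
        System.out.println(splitEscaped(escape(partitionPath)));  // one intact path
    }
}
```

With the naive split, the single partition location turns into two fragments; with escaping applied per path before joining, it survives as one entry.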

### Release note

None
2025-03-29 09:13:43 +08:00
676b868d99 branch-2.1:[opt](docker) Add ranger docker component (#47697) (#48359)
### What problem does this PR solve?
bp  https://github.com/apache/doris/pull/47697

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [x] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [x] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg:
https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
2025-02-27 09:47:25 +08:00
fb31586612 branch-2.1: [test](jdbc catalog) add more jdbc catalog extreme test (#47799)
cherry-pick (#47525)
2025-02-14 17:03:49 +08:00
45c9daa063 branch-2.1:[fix](docker) Starting thirdpaty script with only the res… (#47592) 2025-02-08 10:11:32 +08:00
226f848ad8 branch-2.1: [fix](hive docker)Table partition_location_1 miss data #47539 (#47559)
Cherry-picked from #47539

Co-authored-by: Thearas <gaozifeng@selectdb.com>
2025-02-07 11:21:47 +08:00
af55eba242 branch-2.1: [opt](hive docker)Exit on creating table failed #47390 (#47453) 2025-01-26 17:28:20 +08:00
7c9d64d79a [opt](iceberg docker)Add health check for iceberg rest container (#46767) (#47422) 2025-01-25 09:04:27 +08:00
5f2438aeab branch-2.1: [opt](docker)Add healthy check for ES and Kafka #47362 (#47414)
Cherry-picked from #47362

Co-authored-by: Thearas <gaozifeng@selectdb.com>
2025-01-25 09:00:50 +08:00
407d04fab5 branch-2.1: [opt](docker)Replace healthy container with --wait #47357 (#47421)
Cherry-picked from #47357

Co-authored-by: Thearas <gaozifeng@selectdb.com>
2025-01-25 08:31:15 +08:00
baaf026e82 [fix](hive docker)Reserve host port for hive2 namenode and datanode (#47262) (#47354)
Problem Summary:

The [External hive CI](http://43.132.222.7:8111/buildConfiguration/Doris_External_Regression/612304?buildTab=log&linesState=3650&logView=flowAware)
failed because of a `namenode` error (port 50070 already in use). Docker
logs:
```txt
2025-01-21T04:22:37.955682469Z java.net.BindException: Port in use: 0.0.0.0:50070
2025-01-21T04:22:37.955686106Z 	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:940)
2025-01-21T04:22:37.955689402Z 	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:876)
2025-01-21T04:22:37.955692708Z 	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:142)
2025-01-21T04:22:37.955697828Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:760)
2025-01-21T04:22:37.955701444Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:639)
2025-01-21T04:22:37.955704831Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:819)
2025-01-21T04:22:37.955708237Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:803)
2025-01-21T04:22:37.955711674Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1500)
2025-01-21T04:22:37.955715090Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1566)
2025-01-21T04:22:37.955718446Z Caused by: java.net.BindException: Address already in use
2025-01-21T04:22:37.955722013Z 	at sun.nio.ch.Net.bind0(Native Method)
2025-01-21T04:22:37.955725460Z 	at sun.nio.ch.Net.bind(Net.java:433)
2025-01-21T04:22:37.955729227Z 	at sun.nio.ch.Net.bind(Net.java:425)
2025-01-21T04:22:37.955733074Z 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
2025-01-21T04:22:37.955736600Z 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
2025-01-21T04:22:37.955740197Z 	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
2025-01-21T04:22:37.955743884Z 	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:934)
2025-01-21T04:22:37.955747391Z 	... 8 more
2025-01-21T04:22:37.961686454Z 25/01/21 04:22:37 INFO util.ExitUtil: Exiting with status 1
```

The best option is to keep services off server ports in the
`/proc/sys/net/ipv4/ip_local_port_range` range (32768-60999). But since the
namenode [hardcodes the exposed port `50070` in its docker
image](https://hub.docker.com/layers/bde2020/hadoop-datanode/2.0.0-hadoop2.7.4-java8/images/sha256-5623fca5e36d890983cdc6cfd29744d1d65476528117975b3af6a80d99b3c62f),
we add the port to `net.ipv4.ip_local_reserved_ports` and introduce a
new flag `--reserve-ports` to control it (default false, because not
everyone wants to modify the system's reserved ports).
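
Why port `50070` is at risk can be checked with a few lines. This is a minimal sketch, assuming the range file holds two whitespace-separated numbers as it does on Linux; `ReservedPortCheck`, `parseRange`, and `inEphemeralRange` are hypothetical names, and the range is passed in as a string rather than read from `/proc` so the example runs anywhere.

```java
public class ReservedPortCheck {
    // Parses the content of /proc/sys/net/ipv4/ip_local_port_range,
    // which holds the lower and upper bound separated by whitespace.
    public static int[] parseRange(String content) {
        String[] parts = content.trim().split("\\s+");
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    // A port inside the range may be handed out by the kernel as an
    // ephemeral source port, so a server binding to it can lose the race.
    public static boolean inEphemeralRange(int port, int[] range) {
        return port >= range[0] && port <= range[1];
    }

    public static void main(String[] args) {
        int[] range = parseRange("32768\t60999"); // values quoted in the description above
        System.out.println(inEphemeralRange(50070, range)); // true
        System.out.println(inEphemeralRange(8020, range));  // false
    }
}
```

A port that falls inside the range either has to be moved or, as done here, listed in `net.ipv4.ip_local_reserved_ports` so the kernel skips it when picking ephemeral ports.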

Change-Id: I03a81e9931cb555695199436b6f0517cccf83588
2025-01-24 16:12:03 +08:00
3aad9e5f67 [opt](oceanbase docker)Use LTS docker image and print unhealthy docker logs (#46647) (#47349)
### What problem does this PR solve?

Problem Summary:
The OceanBase container sometimes fails to start.
<img width="653" alt="image" src="https://github.com/user-attachments/assets/d95c66cf-7e04-4179-a565-9b9dd8b87128" />

We do two things:
1. Print the last 100 lines of docker logs of the unhealthy container for debugging.
2. Upgrade the OceanBase docker image to the newest `4.2.1-lts`, since it is
7 months newer than `4.2.1` and more stable.
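
The first step can be sketched as follows. This is an illustration only, not the script changed by the PR: `UnhealthyLogs`, `tailCommand`, and `printTail` are hypothetical names, the container name `oceanbase` is an assumption, and the only external command relied on is `docker logs --tail <n> <container>`.

```java
import java.util.Arrays;
import java.util.List;

public class UnhealthyLogs {
    // Builds the command that prints the last `lines` log lines of a container.
    public static List<String> tailCommand(String container, int lines) {
        return Arrays.asList("docker", "logs", "--tail", String.valueOf(lines), container);
    }

    // Runs the command and forwards its output to this process's stdout/stderr.
    // Requires a docker CLI on the PATH; returns the command's exit code.
    public static int printTail(String container, int lines) throws Exception {
        Process process = new ProcessBuilder(tailCommand(container, lines))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only print the command here so the example runs without docker installed.
        System.out.println(String.join(" ", tailCommand("oceanbase", 100)));
    }
}
```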
2025-01-24 11:22:02 +08:00
50b3303385 branch-2.1:[fix](docker) Start kerberos docker correctly (#47315)
### What problem does this PR solve?

Kerberos was wrongly started twice, which may cause a container name conflict.

Related PR: https://github.com/apache/doris/pull/46858
2025-01-22 17:55:37 +08:00
7568b21273 branch-2.1: [fix](docker) solve kerberos docker conflict #47260 (#47273)
Cherry-picked from #47260

Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>
2025-01-22 10:17:39 +08:00
4bd55b2f8b branch-2.1: [Opt](external-docker) Modify kerberos network mode to host #47043 (#47095)
Cherry-picked from #47043

Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>
2025-01-16 23:12:05 +08:00
13fa4ea2ee branch-2.1 [Opt](docker) kerberos docker healthy check (#46662) (#46858)
#46662

Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>
2025-01-13 15:38:17 +08:00
c016eb49c5 [enhance](mtmv)When obtaining the partition list fails, treat the paimon table as an unpartitioned table (#46641) (#46708)

pick: https://github.com/apache/doris/pull/46641
2025-01-10 10:46:09 +08:00
72cdedc47f branch-2.1: [opt](iceberg docker)Use PostgreSQL as the backend for the Iceberg REST server. #46289 (#46576)
Cherry-picked from #46289

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2025-01-09 22:30:03 +08:00
eddea8b309 [opt](hive docker)Parallel put hive data (#46571) (#46682)
Problem Summary:
Put the `tpch1.db`, `paimon1` and `tvf_data` hive data in parallel. This reduces
the time cost from 22m to 16m on a 16C machine.

Change-Id: Ib75c57d397ce1f96d5108d4b570bcb215f31d421
2025-01-09 14:08:35 +08:00
3bc70876c4 branch-2.1: [fix](test) Optimize the health check after oceanbase docker starts #46434 (#46599)
Cherry-picked from #46434

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2025-01-08 20:29:40 +08:00
4d0037a928 branch-2.1: [fix](ES catalog)Fix query long value exception with doc_value #46554 (#46581)
Cherry-picked from #46554

Co-authored-by: qiye <luen@selectdb.com>
2025-01-08 15:26:58 +08:00
5d2930e783 [fix](shellcheck) fix hive-metastore and enable shellcheck in docker (#46496) (#46574)
cherry-pick (#46496)

Co-authored-by: Socrates <suyiteng@selectdb.com>
2025-01-08 11:10:34 +08:00
d8c94d6392 branch-2.1: [fix](regression)fix hive translation unstable case. #46385 (#46409)
Cherry-picked from #46385

Co-authored-by: daidai <changyuwei@selectdb.com>
2025-01-04 08:59:56 +08:00
02239e4fb2 branch-2.1: [chore](regression) do not hard code S3 bucket and endpoint of hive t… #46159 (#46169)
Cherry-picked from #46159

Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>
2024-12-31 11:44:36 +08:00
6dd92be33d [feature](statistics)Support get row count for pg and sql server. (#42674) (#46131)
backport: https://github.com/apache/doris/pull/42674
2024-12-29 19:37:21 +08:00
a380f5d222 [enchement](utf8)import enable_text_validate_utf8 session var (#45537) (#46070)
bp #45537
2024-12-28 10:05:03 +08:00
303557ac70 [fix](hive)fix hive insert only transaction table. (#45753)
### What problem does this PR solve?
bp #44001 , but no hive4 acid table.

Problem Summary:
1. Fixed the issue that no ACID check was performed when reading
insert-only transactional tables, which caused data to be read multiple
times (i.e., reading data from the previous base_n).
2. Forbid creating, inserting data into, and deleting ACID tables.
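
The duplicate read in item 1 comes from scanning every `base_n` directory instead of only the newest one. The sketch below shows the idea under strong simplifications: it assumes directory names of the form `base_<writeId>` and `delta_<minWriteId>_<maxWriteId>`, and it ignores valid write-id lists and aborted transactions, which a real ACID check also has to consider. `InsertOnlyDirs` and `visibleDirs` are hypothetical names, not Doris or Hive APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InsertOnlyDirs {
    // Returns the directories a reader should scan: the newest base_N
    // plus every delta whose minimum write id is greater than N.
    public static List<String> visibleDirs(List<String> dirs) {
        long newestBaseId = -1;
        String newestBase = null;
        for (String dir : dirs) {
            if (dir.startsWith("base_")) {
                long id = Long.parseLong(dir.substring("base_".length()));
                if (id > newestBaseId) {
                    newestBaseId = id;
                    newestBase = dir;
                }
            }
        }
        List<String> visible = new ArrayList<>();
        if (newestBase != null) {
            visible.add(newestBase);
        }
        for (String dir : dirs) {
            if (dir.startsWith("delta_")) {
                long minWriteId = Long.parseLong(dir.split("_")[1]);
                if (minWriteId > newestBaseId) {
                    visible.add(dir); // not yet folded into the newest base
                }
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
                "base_0000002", "base_0000005",
                "delta_0000003_0000003", "delta_0000006_0000006");
        // Older base and already-compacted delta are skipped.
        System.out.println(visibleDirs(dirs)); // [base_0000005, delta_0000006_0000006]
    }
}
```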
2024-12-22 21:23:21 +08:00
19c0e89da7 [enchement](iceberg)support read iceberg partition evolution table. (#45367) (#45569)
cherry-pick #45367

Co-authored-by: daidai <changyuwei@selectdb.com>
2024-12-20 08:56:51 +08:00
7d32e4f71f branch-2.1: [Fix](ORC) Not push down fixed char type in orc reader #45484 (#45525)
cherry-pick #45484
2024-12-19 14:06:00 +08:00
ea24410faf [enhancement][docker] fix kafka docker issue (#45091) 2024-12-06 14:36:57 +08:00
11c517fe1e [enhancement][docker]update routine docker file (#45048) 2024-12-05 17:27:44 +08:00
702abbff0f [Opt](orc)Optimize the merge io when orc reader read multiple tiny stripes. (#42004) (#44239)
bp #42004

Co-authored-by: kaka11chen <kaka11.chen@gmail.com>
2024-11-22 11:01:41 +08:00
3136fa48a6 branch-2.1: [chore](ci) adjust some invalid url #44261 (#44270)
Cherry-picked from #44261

Co-authored-by: Dongyang Li <lidongyang@selectdb.com>
2024-11-19 19:28:04 +08:00
83b74827aa branch-2.1: [fix](iceberg)Fix count(*) error with dangling delete problem #44039 (#44101)
Cherry-picked from #44039

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-11-19 17:19:25 +08:00
efb3bdd96e [fix](test) fix clickhouse jdbc catalog func push down case #43196 (#44151)
cherry pick from #43196

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-11-18 18:03:10 +08:00
48e33bfb2a branch-2.1: [fix](hive)Fixed the issue of reading hive table with empty lzo files #43979 (#44063)
Cherry-picked from #43979

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-11-16 16:14:50 +08:00
4531cd86e3 branch-2.1: [fix](regression-test) add checks for existence and successful upload of data files in hive-metastore.sh #43853 (#43888)
Cherry-picked from #43853

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-11-14 11:23:23 +08:00
a1ff02288f branch-2.1: [fix](hive) support query hive view created by spark (#43553)
Cherry-picked from #43530

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
Co-authored-by: morningman <yunyou@selectdb.com>
2024-11-11 23:28:53 +08:00
cdd32d9582 [enhance](hive) support reading hive table with OpenCSVSerde #42257 (#42940)
cherry pick from #42257

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-10-31 11:12:07 +08:00
fce4695f37 [Configuration](transactional-hive) Add skip_checking_acid_version_file session var to skip checking acid version file in some hive envs. (#42111)(#42225) (#42939)
cherry-pick (#42111)(#42225)

---------

Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-10-31 09:52:20 +08:00
2defa90be7 [test](ES Catalog)Add mapping _routing test case (#42074) (#42282)
## Proposed changes

bp #42074
2024-10-23 10:14:12 +08:00
157d67e7ca [enhance](hive) Add regression-test cases for hive text ddl and hive text insert and fix reading null string bug #42200 (#42273)
cherry pick from #42200

Co-authored-by: Socrates <suxiaogang223@icloud.com>
2024-10-22 23:56:57 +08:00
38e529cd29 [cherry-pick](branch-2.1) support decimal256 for parquet reader (#42241)
## Proposed changes
pick pr: https://github.com/apache/doris/pull/41526
2024-10-22 19:42:09 +08:00