Files
doris/docker
daidai 5d02c48715 [feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. (#38432) (#38809)
bp #38432 

## Proposed changes
Add `hive_parquet_use_column_names` and `hive_orc_use_column_names`
session variables to read the table after rename column in `Hive`.

These two session variables are referenced from
`parquet_use_column_names` and `orc_use_column_names` of `Trino` hive
connector.

By default, these two session variables are true. When they are set to
false, reading orc/parquet will access the columns according to the
ordinal position in the Hive table definition.

For example:
```mysql
in Hive :
hive> create table tmp (a int , b string) stored as parquet;
hive> insert into table tmp values(1,"2");
hive> alter table tmp  change column  a new_a int;
hive> insert into table tmp values(2,"4");

in Doris :
mysql> set hive_parquet_use_column_names=true;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|  NULL | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)

mysql> set hive_parquet_use_column_names=false;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|     1 | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)
```

You can use `set
parquet.column.index.access/orc.force.positional.evolution = true/false`
in hive 3 to control the results of reading the table like these two
session variables. However, for the rename struct inside column parquet
table, the effects of hive and doris are different.
2024-08-05 09:06:49 +08:00
..

Doris Develop Environment based on docker

Preparation

  1. Download the Doris code repo

    $ cd /to/your/workspace/
    $ git clone https://github.com/apache/doris.git
    $ cd doris
    $ git submodule update --init --recursive
    

    You can remove the .git dir in doris/ to make the dir size smaller. So that the following generated docker image can be smaller.

  2. Copy Dockerfile

    $ cd /to/your/workspace/
    $ cp doris/docker/Dockerfile ./
    

After preparation, your workspace should like this:

.
├── Dockerfile
├── doris
│   ├── be
│   ├── bin
│   ├── build.sh
│   ├── conf
│   ├── DISCLAIMER-WIP
│   ├── docker
│   ├── docs
│   ├── env.sh
│   ├── fe
│   ├── ...

Build docker image

$ cd /to/your/workspace/
$ docker build -t doris:v1.0  .

doris is docker image repository name and v1.0 is tag name, you can change them to whatever you like.

Use docker image

This docker image you just built does not contain Doris source code repo. You need to download it first and map it to the container. (You can just use the one you used to build this image before)

$ docker run -it -v /your/local/path/doris/:/root/doris/ doris:v1.0
$ docker run -it -v /your/local/.m2:/root/.m2 -v /your/local/doris-DORIS-x.x.x-release/:/root/doris-DORIS-x.x.x-release/ doris:v1.0

Then you can build source code inside the container.

$ cd /root/doris/
$ sh build.sh

NOTICE

The default JDK version is openjdk 11, if you want to use openjdk 8, you can run the command:

$ alternatives --set java java-1.8.0-openjdk.x86_64
$ alternatives --set javac java-1.8.0-openjdk.x86_64
$ export JAVA_HOME=/usr/lib/jvm/java-1.8.0

The version of jdk you used to run FE must be the same version you used to compile FE.

Latest update time

2022-1-23