Files

Socrates 94986fc574 branch-2.1: [fix](multi-catalog) Fix bug: "Can not create a Path from an empty string" (#49382) (#49641)

### What problem does this PR solve?
Problem Summary:
In HiveMetaStoreCache, the function FileInputFormat.setInputPaths is
used to set input paths. However, this function splits paths using
commas, which is not the expected behavior. As a result, when partition
values contain commas, it leads to incorrect path parsing and potential
errors.
```java
  public static void setInputPaths(JobConf conf, String commaSeparatedPaths) {
    setInputPaths(conf, StringUtils.stringToPath(
                        getPathStrings(commaSeparatedPaths)));
  }
```
To prevent FileInputFormat.setInputPaths from splitting paths by commas,
we use another overloaded version of the method. Instead of passing a
comma-separated string, we explicitly pass a Path object, ensuring that
partition values containing commas are handled correctly.
```java
  public static void setInputPaths(JobConf conf, Path... inputPaths) {
    Path path = new Path(conf.getWorkingDirectory(), inputPaths[0]);
    StringBuffer str = new StringBuffer(StringUtils.escapeString(path.toString()));
    for(int i = 1; i < inputPaths.length;i++) {
      str.append(StringUtils.COMMA_STR);
      path = new Path(conf.getWorkingDirectory(), inputPaths[i]);
      str.append(StringUtils.escapeString(path.toString()));
    }
    conf.set(org.apache.hadoop.mapreduce.lib.input.
      FileInputFormat.INPUT_DIR, str.toString());
  }
```
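The difference between the two overloads can be illustrated with a small, self-contained sketch. It does not use Hadoop; `naiveSplit`, `escapeAndJoin`, and `splitEscaped` are hypothetical helpers that only imitate the behavior described above (split on every comma versus escape commas before joining), and the partition path is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the behavior described above; this is NOT Hadoop's code.
final class CommaPathDemo {
    // Imitates the String overload: every comma is treated as a path separator.
    static List<String> naiveSplit(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        for (String p : commaSeparatedPaths.split(",")) {
            paths.add(p);
        }
        return paths;
    }

    // Imitates the Path... overload: commas inside a path are escaped before joining.
    static String escapeAndJoin(List<String> paths) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paths.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(paths.get(i).replace("\\", "\\\\").replace(",", "\\,"));
        }
        return sb.toString();
    }

    // Splits only on unescaped commas and drops the escape characters again.
    static List<String> splitEscaped(String joined) {
        List<String> paths = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == '\\' && i + 1 < joined.length()) {
                cur.append(joined.charAt(++i)); // keep the escaped character literally
            } else if (c == ',') {
                paths.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        paths.add(cur.toString());
        return paths;
    }

    public static void main(String[] args) {
        String partition = "hdfs://ns/warehouse/t/city=a,b"; // hypothetical partition path
        System.out.println(naiveSplit(partition));
        System.out.println(splitEscaped(escapeAndJoin(List.of(partition))));
    }
}
```

In this sketch the naive split breaks the single partition path into two pieces, while the escaped round trip returns it unchanged.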

### Release note

None

2025-03-29 09:13:43 +08:00

common

…

conf

[fix](ES Catalog)Make sure ES meta is synced before using (#46781) (#47711)

2025-02-11 21:00:08 +08:00

ctas_p0

…

data

branch-2.1: [fix](multi-catalog) Fix bug: "Can not create a Path from an empty string" (#49382) (#49641)

2025-03-29 09:13:43 +08:00

framework

[regression-test](cases) mv some cases nonConcurrent (#49460)

2025-03-26 11:22:44 +08:00

java-udf-src

[chore](test)Exclude Hive-related packages from java-udf-src. (#40757) (#40785)

2024-09-13 13:44:05 +08:00

pipeline

branch-2.1: [chore](ci) rm unused file #48326 (#49290)

2025-03-20 16:48:17 +08:00

plugins

[fix](regression)Fix unstable compaction related cases (#46920) (#47003)

2025-01-15 13:42:04 +08:00

regression-test/realData/insert_p0

[Feat](nereids) support pull up predicate from set operator (#39450) (#44056)

2024-12-03 16:24:52 +08:00

script

…

ssl_default_certificate

…

suites

branch-2.1: [fix](multi-catalog) Fix bug: "Can not create a Path from an empty string" (#49382) (#49641)

2025-03-29 09:13:43 +08:00

certificate.p12

…

README.md

branch-2.1: [fix](regression-test) fix injection would not be removed when exception #46357 (#46360)

2025-01-03 22:02:50 +08:00

README.md

Guide for test cases

General Case

Write "def" before variable names; otherwise, they will be global variables and may be affected by other cases running in parallel.

Problematic code:
```
ret = ***
```
Correct code:
```
def ret = ***
```
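Why a shared variable is risky can be sketched outside the test framework. The following Java model is only an analogy: `BINDING` is a hypothetical stand-in for state visible to every case, and the two "cases" are interleaved by hand rather than run in parallel so that the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: BINDING stands in for state that all cases can see.
final class SharedVariableDemo {
    static final Map<String, Object> BINDING = new HashMap<>();

    // Like `ret = ...` without def: the value is stored in shared state.
    static void caseAStoresGlobally() {
        BINDING.put("ret", 1);
    }

    // Another case that happens to use the same variable name.
    static void caseBStoresGlobally() {
        BINDING.put("ret", 2);
    }

    static Object caseAReadsGlobal() {
        return BINDING.get("ret");
    }

    // Like `def ret = ...`: the value lives in a local variable.
    static int caseAWithLocal() {
        int ret = 1;
        caseBStoresGlobally(); // the other case runs in between
        return ret;            // still the value case A assigned
    }

    public static void main(String[] args) {
        caseAStoresGlobally();
        caseBStoresGlobally();
        System.out.println("global ret seen by case A: " + caseAReadsGlobal());
        System.out.println("local ret seen by case A: " + caseAWithLocal());
    }
}
```

In the model, case A reads back the value written by case B when the variable is shared, but keeps its own value when the variable is local.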
Avoid setting global session variables or modifying cluster configurations in cases, as it may affect other cases.

Problematic code:
```
sql """set global enable_pipeline_x_engine=true;"""
```
Correct code:
```
sql """set enable_pipeline_x_engine=true;"""
```
If it is necessary to set global variables or modify cluster configurations, specify the case to run in a nonConcurrent manner.

For cases involving time-related operations, it is best to use fixed time values instead of dynamic values like the now() function to prevent cases from failing after some time.

Problematic code:
```
sql """select count(*) from table where created < now();"""
```
Correct code:
```
sql """select count(*) from table where created < '2023-11-13';"""
```
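The effect of a moving cutoff can be shown with a minimal Java sketch. The row dates and the `countBefore` helper are hypothetical; they only mimic a `created < cutoff` filter.

```java
import java.time.LocalDate;
import java.util.List;

final class FixedTimeDemo {
    // Counts rows whose "created" date is strictly before the cutoff.
    static long countBefore(List<LocalDate> created, LocalDate cutoff) {
        return created.stream().filter(d -> d.isBefore(cutoff)).count();
    }

    public static void main(String[] args) {
        // Hypothetical table contents.
        List<LocalDate> created = List.of(
                LocalDate.parse("2023-11-01"),
                LocalDate.parse("2023-11-20"));

        // Fixed cutoff: the result is the same whenever the case runs.
        System.out.println(countBefore(created, LocalDate.parse("2023-11-13")));

        // Dynamic cutoff: the result depends on the day the case runs.
        System.out.println(countBefore(created, LocalDate.now()));
    }
}
```

With the fixed cutoff the count never changes, whereas the count against the current date shifts once the clock passes a row's date, so a recorded expected result can stop matching.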
After a stream load in a case, run a sync before querying so that the case stays stable in a multi-FE environment.

Problematic code:
```
streamLoad { ... }
sql """select count(*) from table """
```
Correct code:
```
streamLoad { ... }
sql """sync"""
sql """select count(*) from table """
```
For UDF cases, make sure to copy the corresponding JAR file to all BE machines.

To avoid conflicts, do not create the same table in different cases under the same directory.
Cases that use injection should be marked as nonConcurrent and must ensure that the injection is removed after the case has run.
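One common way to guarantee cleanup is a try/finally block, sketched here in plain Java. `activeInjections` and `runWithInjection` are hypothetical stand-ins, not part of the regression framework; the point is only that the removal step runs even when the case body throws.

```java
import java.util.HashSet;
import java.util.Set;

final class InjectionCleanupDemo {
    // Hypothetical stand-in for the injections currently enabled on the cluster.
    static final Set<String> activeInjections = new HashSet<>();

    // Enables the injection, runs the case body, and always removes the injection.
    static void runWithInjection(String name, Runnable body) {
        activeInjections.add(name);
        try {
            body.run();
        } finally {
            activeInjections.remove(name); // runs on success and on exception
        }
    }

    public static void main(String[] args) {
        try {
            runWithInjection("hypothetical.injection.point", () -> {
                throw new IllegalStateException("case failed");
            });
        } catch (IllegalStateException e) {
            System.out.println("remaining injections: " + activeInjections);
        }
    }
}
```

Even though the body fails, no injection is left behind for the cases that run afterwards.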

Compatibility case

A compatibility case covers resources or rules (such as permissions and UDFs) that are created on the initial cluster during FE testing or upgrade testing and must still work normally after the cluster is restarted or upgraded.

These cases need to be split into two files, load.groovy and xxxx.groovy, placed in one folder, and tagged with the restart_fe group label.