1. Trim Schema Names: Adapted the system to remove trailing spaces from DB2 schema names, ensuring compatibility without affecting query operations.
2. XML Mapping: Implemented a feature to directly map XML types to String.
In the previous design, we were compatible with MySQL's auto-increment column and default value to bypass the null value check when writing back Jdbc External Table. However, because MySQL's default value is not completely unified with Doris, this resulted in The unsuitable default value is wrong. In response to this situation, I made the following optimizations
1. For JDBC External Table, we always allow certain columns to be missing during insertion. Even if these columns are not allowed to be empty at the source end, the error should be generated by the source end, not Doris herself.
2. When the target column is non-nullable and the insertion is done via `INSERT INTO tbl VALUES()` or `INSERT INTO tbl SELECT constants`, Doris should verify any inconsistency between them and throw an exception. This check is not applied for `INSERT INTO tbl SELECT ... FROM tbl` operations.
1. add volume for es logs
2. optimize health check, waiting for es status to be green
3. fix es6 valume path error
4. optimize disk watermark to avoid es disk watermark error
5. fix es6 create index error
6. add custom elasticsearch.yml for es6
7. add log4j2.properties for es6, es7, es8
This PR proposes mapping external catalog JSON types to String instead of JsonB in Apache Doris. This change is motivated by the realization that JDBC retrieves JSON data as a String JSON string, regardless of its storage format (Json(String) or Json(Binary)). Mapping to String streamlines data retrieval, simplifies write-backs, and ensures compatibility with all JSON(String) and JSON(Binary) functions, despite potentially misleading displays of JSON data as Strings in Doris. This approach avoids the performance overhead and complexity of converting each row of data from JsonB to String, making the process more efficient and elegant.
About Upgrade
To ensure query compatibility with existing Catalogs in the upgraded version,we currently still retain the capability to query external JSON types as JSONB. However, once you upgrade to the new version and either refresh the Catalog or create a new one, all external JSON types will be treated as Strings. To ensure consistent behavior,and possible future removal of support for JSON as JSONB query code, it is highly recommended that you manually refresh your Catalog as soon as possible after upgrading to the new version.
In the previous logic, when we restored the Column in the predicate pushdown based on the logical syntax tree for JdbcScanNode, in order to avoid query errors caused by keywords such as `key`, we added escape characters for it, but before we only Binary predicates are processed, which is imperfect. We should add escape characters to all columns that appear in the predicate to avoid errors with keywords or illegal characters.
Add `information_schema` database for all catalog.
This is useful when using BI tools to connect to Doris,
the tools can get meta info from `information_schema`.
This PR mainly changes:
1. There will be a `information_schema` db in each catalog.
2. Each `information_schema` db only store the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, default is false, if set to true,
The `TABLE_SCHEMA` column's value is the like `ctl.db`, because:
When connect to Doris, the `database` info in connection url will be: `xxx?db=ctl.db`.
And then some BI will try to query `information_schema` with sql like:
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
So it has to be format as `ctl.db`
eg, the `information_schema.columns` table in external catalog `doris` is like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
6. Modify the behavior of
- show tables
- shwo databases
- show columns
- show table status
The above statements may query the `information_schema` db if there is `where` predicate after them
This enhancement shall extend existing logic for SHOW PARTITIONS FROM to include: -
Limit/Offset
Where [partition name only] [equal operator and like operator]
Order by [partition name only]
Issue Number: close#27834
If specified, got a column of constant. otherwise an incremental series like it always be.
mysql> select * from numbers("number" = "5", "const_value" = "-123");
+--------+
| number |
+--------+
| -123 |
| -123 |
| -123 |
| -123 |
| -123 |
+--------+
5 rows in set (0.11 sec)
Opt zstd block decompression by `ZSTD_decompressDCtx()` to replace streaming decompression.
It will improve performance but consume more memory.
Test result:
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
Fix null map issue in parquet reader which cause result incorrect such as `min()`, `max()`.
In order to share null map between parquet converted src column and dst column to avoid copying. It is very tricky that will call mutable function `doris_nullable_column->get_null_map_column_ptr()` which will set `_need_update_has_null = true`. Because some operations such as agg will call `has_null()` to set `_need_update_has_null = false`.
This PR adds the processing of lowercase Column names in Oracle Jdbc Catalog. In the previous behavior, we changed all Oracle columns to uppercase queries by default, but could not handle the lowercase case. This PR can solve this situation and improve All Jdbc Catalog works