Problem:
When a partial column update does not specify the auto-increment column and the imported data contains new keys, the load fails with an error stating that the auto-increment column could not be found.
Reason:
The logic for partial column updates does not account for new keys in auto-increment columns. Since auto-increment columns can be generated by the system, it's possible to omit this column data during import. However, partial column updates treat this as a regular column, expecting it to be nullable or have a default value for automatic filling, overlooking the fact that auto-increment columns can also be auto-filled. This oversight leads to the error.
Solution:
Incorporate a check for auto-increment columns into the partial column update logic, and include the logic for generating auto-increment column values in the process of completing partial updates.
Issue Number: close #29406
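The fill order described in the Solution can be sketched roughly as below. This is only an illustration in Python, not the actual BE code: `SCHEMA`, `complete_partial_row` and the dict-based table are hypothetical names, and the id generator stands in for whatever produces auto-increment values in Doris.

```python
import itertools

# Hypothetical schema: column name -> properties (not the real Doris schema API).
SCHEMA = {
    "k": {"key": True},
    "id": {"auto_increment": True},
    "v": {"nullable": True},
}


def complete_partial_row(partial, existing, schema, id_gen):
    """Fill the columns that a partial update did not provide."""
    key = tuple(partial[c] for c, p in schema.items() if p.get("key"))
    old = existing.get(key)
    full = {}
    for col, props in schema.items():
        if col in partial:
            full[col] = partial[col]          # value supplied by the load
        elif old is not None:
            full[col] = old[col]              # existing key: keep the old value
        elif props.get("auto_increment"):
            full[col] = next(id_gen)          # new key: generate the value
        elif "default" in props:
            full[col] = props["default"]
        elif props.get("nullable"):
            full[col] = None
        else:
            raise ValueError(f"column {col!r} has no value for new key {key!r}")
    existing[key] = full
    return full


table = {}
ids = itertools.count(1)
first = complete_partial_row({"k": "a", "v": 10}, table, SCHEMA, ids)   # new key
second = complete_partial_row({"k": "a"}, table, SCHEMA, ids)           # existing key
third = complete_partial_row({"k": "b"}, table, SCHEMA, ids)            # new key
```

The point of the sketch is the `auto_increment` branch: without it, a new key falls through to the default/nullable checks and fails.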
1. increase the supported lzop version to 0x1040.
This is only so that lzo files compressed by a higher version of lzop can be decompressed;
the decompression logic is unchanged.
Strictly speaking, 0x1040 should also support the "F_H_FILTER" feature,
but it is mainly used for audio and image data, so we do not support it.
2. use orc::lzoDecompress() instead of lzo1x_decompress_safe() to decompress lzo data
3. use crc32c::Extend() instead of lzo_crc32()
4. use olap_adler32() instead of lzo_adler32()
5. thus, remove the dependency on Markus F.X.J. Oberhumer's lzo library
6. remove DORIS_WITH_LZO, so lzo files are supported by stream load and broker load by default
7. add some regression tests
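For reference, a rough Python sketch of the two pieces touched above: the version gate and an Adler-32 checksum written without the lzo library. It is not the Doris code. `check_lzop_header` and `adler32` are illustrative names, the `F_H_FILTER` bit value is my assumption about the lzop header flags, and comparing the file's version against 0x1040 is an assumption about how the limit is applied.

```python
import zlib

MAX_SUPPORTED_LZOP_VERSION = 0x1040
F_H_FILTER = 0x00000800  # assumed flag bit, not verified against lzop's headers


def check_lzop_header(file_version, flags):
    """Reject files this reader cannot handle (hypothetical helper)."""
    if file_version > MAX_SUPPORTED_LZOP_VERSION:
        raise ValueError(f"lzop version {file_version:#x} is too new")
    if flags & F_H_FILTER:
        raise ValueError("F_H_FILTER is not supported")


def adler32(data, value=1):
    """Plain Adler-32; `value` lets the checksum be extended chunk by chunk."""
    a = value & 0xFFFF
    b = (value >> 16) & 0xFFFF
    for byte in data:
        a = (a + byte) % 65521
        b = (b + a) % 65521
    return (b << 16) | a


whole = adler32(b"Wikipedia")
chunked = adler32(b"pedia", adler32(b"Wiki"))
reference = zlib.adler32(b"Wikipedia")
```

The chunked call mirrors how a block checksum can be computed incrementally while reading a stream.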
Add an `information_schema` database to every catalog.
This is useful when BI tools connect to Doris,
because the tools can get meta info from `information_schema`.
This PR mainly changes:
1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, false by default. If set to true,
the `TABLE_SCHEMA` column's value has the form `ctl.db`, because:
when connecting to Doris, the `database` info in the connection url will be `xxx?db=ctl.db`,
and some BI tools will then query `information_schema` with sql like:
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
So the value has to be formatted as `ctl.db`.
For example, the `information_schema.columns` table in the external catalog `doris` looks like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
6. Modify the behavior of
- show tables
- show databases
- show columns
- show table status
The above statements may query the `information_schema` db if they are followed by a `where` predicate.
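The effect of `show_full_dbname_in_info_schema_db` on `TABLE_SCHEMA` can be pictured with a small sketch. The function name and signature are made up for illustration; the real logic lives in the FE.

```python
def table_schema_value(catalog, db, show_full_dbname):
    """Value shown in TABLE_SCHEMA (illustrative helper, not the FE code)."""
    return f"{catalog}.{db}" if show_full_dbname else db


short_name = table_schema_value("doris", "__internal_schema", False)
full_name = table_schema_value("doris", "__internal_schema", True)
```

With the variable set to true, the value matches what a BI tool sends in `where TABLE_SCHEMA = "ctl.db"`.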
Use a weak ptr as a lock between the fragment execution thread and the scanner thread, to fix the core dump that occurs when the scanner's destructor accesses the scan node's profile.
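The idea can be illustrated with Python's `weakref` as a loose analogue of `std::weak_ptr`. The actual change is in C++, and the class and counter names here are invented. The scanner only touches the profile after it has obtained a strong reference, and skips the update if the owner is already gone.

```python
import weakref


class ScanNodeProfile:
    """Stand-in for the scan node's profile (hypothetical)."""

    def __init__(self):
        self.counters = {}

    def add(self, name, n):
        self.counters[name] = self.counters.get(name, 0) + n


class Scanner:
    def __init__(self, profile):
        # Keep only a weak reference; the scan node owns the profile.
        self._profile = weakref.ref(profile)

    def close(self):
        profile = self._profile()  # like weak_ptr::lock(): strong ref or None
        if profile is None:
            return False           # owner already destroyed, do not touch it
        profile.add("ScannersClosed", 1)
        return True


node_profile = ScanNodeProfile()
scanner = Scanner(node_profile)
closed_while_alive = scanner.close()
counters_snapshot = dict(node_profile.counters)
del node_profile                   # the fragment side releases the profile
closed_after_release = scanner.close()
```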
Fix the problem of not being able to read Parquet files compressed with lz4. By default, the data is decompressed as Hadoop lz4 format. If that fails, it falls back to the standard lz4 format.
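A minimal sketch of the try-then-fall-back flow, under stated assumptions: the Hadoop framing is taken to be a 4-byte big-endian uncompressed size and a 4-byte big-endian compressed size before each block, and the raw lz4 block decoder is passed in as `raw_decompress` because Python has no lz4 in the standard library. All function names are illustrative, and the demo uses a fake pass-through decoder, not real lz4.

```python
import struct


class NotHadoopFormat(Exception):
    pass


def try_hadoop_lz4(buf, raw_decompress):
    """Decode buf as a sequence of [uncompressed size][compressed size][block]."""
    out = bytearray()
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            raise NotHadoopFormat("truncated block header")
        uncompressed_size, compressed_size = struct.unpack_from(">II", buf, pos)
        pos += 8
        if compressed_size > len(buf) - pos:
            raise NotHadoopFormat("block larger than remaining input")
        try:
            out += raw_decompress(buf[pos:pos + compressed_size], uncompressed_size)
        except ValueError as exc:
            raise NotHadoopFormat(str(exc))
        pos += compressed_size
    return bytes(out)


def decompress_parquet_lz4(buf, expected_size, raw_decompress):
    """Try the Hadoop framing first, then fall back to a plain lz4 block."""
    try:
        data = try_hadoop_lz4(buf, raw_decompress)
        if len(data) == expected_size:
            return data
    except NotHadoopFormat:
        pass
    return raw_decompress(buf, expected_size)


def fake_raw_decompress(block, size):
    # Pass-through stand-in for a real lz4 block decoder (demo only).
    if len(block) != size:
        raise ValueError("size mismatch")
    return bytes(block)


framed = struct.pack(">II", 5, 5) + b"hello"
from_hadoop = decompress_parquet_lz4(framed, 5, fake_raw_decompress)
from_plain = decompress_parquet_lz4(b"helloworld", 10, fake_raw_decompress)
```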
Use blocksproduced and rowsproduced to unify the counter names in DataStreamSender and the other exec nodes, and likewise in the exchange operator and the other operators.
Blocks produced and rows produced are easier to understand.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
1. fix a race condition when getting the tablet load index
2. change the tablet search algorithm from random to round-robin for random distribution tables when load_to_single_tablet is set to false
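Both points can be pictured with a small sketch: a round-robin picker whose index is advanced under a lock, so concurrent callers never read and update the index at the same time. `TabletPicker` and the tablet ids are hypothetical and are not the names used in Doris.

```python
import threading


class TabletPicker:
    """Round-robin tablet selection with a lock around the shared index."""

    def __init__(self, tablet_ids):
        self._tablets = list(tablet_ids)
        self._next = 0
        self._lock = threading.Lock()

    def next_tablet(self):
        with self._lock:  # read and advance the index atomically
            tablet = self._tablets[self._next % len(self._tablets)]
            self._next += 1
            return tablet


picker = TabletPicker([101, 102, 103])
picked = [picker.next_tablet() for _ in range(5)]
```

Compared with random selection, round-robin spreads consecutive batches evenly across tablets.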
Effect: The client will see an error message like the one below when the BE hits a plan logic error.
ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3
fix the bug from #24059.
Added some information_schema scanner tests for the following tables:
- files
- schema_privileges
- table_privileges
- partitions
- rowsets
- statistics
- table_constraints
With infodb_support_ext_catalog=false, the tests now cover all tables under the information_schema database.