Issue Number: close #xxx
This PR aims to enhance the compatibility of BI tools (such as Dbeaver, DataGrip) when using the mysql connector to connect to Doris, because some BI tools query some tables in the mysql database. In our tests, the user and procs_priv tables were mainly queried. This PR adds these two tables and adds actual data to the user table. However, please note that most of the fields in the user table are in Doris' own format rather than mysql format, so it can only ensure that the BI tool is querying No error is reported when accessing these tables, which does not guarantee that the data is completely displayed, and the tables under Doris's mysql database do not support data modification.
Thanks to @liujiwen-up for assisting in testing
Add `information_schema` database for all catalog.
This is useful when using BI tools to connect to Doris,
the tools can get meta info from `information_schema`.
This PR mainly changes:
1. There will be a `information_schema` db in each catalog.
2. Each `information_schema` db only store the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, default is false, if set to true,
The `TABLE_SCHEMA` column's value is the like `ctl.db`, because:
When connect to Doris, the `database` info in connection url will be: `xxx?db=ctl.db`.
And then some BI will try to query `information_schema` with sql like:
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
So it has to be format as `ctl.db`
eg, the `information_schema.columns` table in external catalog `doris` is like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
6. Modify the behavior of
- show tables
- shwo databases
- show columns
- show table status
The above statements may query the `information_schema` db if there is `where` predicate after them
fix bug that #24059 .
Added some information_schema scanner tests.
files
schema_privileges
table_privileges
partitions
rowsets
statistics
table_constraints
Based on infodb_support_ext_catalog=false, it currently includes tests for all tables under the information_schema database.
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
There are many type definitions in BE. Should unify the type system and simplify the development.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
Example:
SELECT ROUTINE_SCHEMA AS PROCEDURE_CAT, NULL AS PROCEDURE_SCHEM,ROUTINE_NAME AS PROCEDURE_NAME,NULL AS NUM_INPUT_PARAMS,NULL AS NUM_OUTPUT_PARAMS,NULL AS NUM_RESULT_SETS,ROUTINE_COMMENT AS REMARKS,IF(ROUTINE_TYPE = 'FUNCTION', 2,IF(ROUTINE_TYPE= 'PROCEDURE', 1, 0)) AS PROCEDURE_TYPE FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = DATABASE();
ERROR 1105 (HY000): errCode = 2, detailMessage = invalid parameter
This wrong and some BI tools could not work correctly.
1. Spark can set the timestamp precision by the following configuration:
spark.sql.parquet.outputTimestampType = INT96(NANOS), TIMESTAMP_MICROS, TIMESTAMP_MILLIS
DATETIME V1 only keeps the second precision, DATETIME V2 keeps the microsecond precision.
2. If using DECIMAL V2, the BE saves the value as decimal128, and keeps the precision of decimal as (precision=27, scale=9). DECIMAL V3 can maintain the right precision of decimal
Increase compatibility with mysql
1. Added two system tables files and partitions
2. Improved the return logic of mysql error code to make the error code more compatible with mysql
3. Added lock/unlock tables statement and show columns statement for compatibility with mysql dump
4. Compatible with mysqldump tool, now you can use mysql dump to dump data and table structure from doris
now use mysqldump may print error message like
```
$ mysqldump -h127.0.0.1 -P9130 -uroot test_query_qa > a
mysqldump: Error: 'errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `EXTRA`' when trying to dump tablespaces
```
This error message not effect the export file, you can add `--no-tablespaces` to avoid this error
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
1. Add metadata table 'statistics' to store index information;
2. In the header information returned by mysql, the data type length is returned according to the actual type.