turn on all test case in scalar function W except width_bucket(fix be bug in next PR)
turn off all test case for group_concat(distinct order by)
fix return nullable in TimestampArithmetic
Clear transaction state log occupies too much time, so we change clear transaction log level from info to debug
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
In #16343, we split the timeout variable into two ones (one is for query and another is for insertion).
The function `ConnectProcessor::handleQuery` uses the corresponding session variable to change the timeout for the queries requested by MySQL client. However, the function `StmtExecutor::handleInsert` doesn't use the session variable to change the timeout, so we can't change the timeout for the CTAS and MTMV insertion job.
Reason: column_name[k5], decimal value is not valid for definition, value=123.123, precision=9, scale=3, min=-999999.999, max=-999999.999; . src line [];
#17273
build-tpcds-tools.sh
gen-tpcds-data.sh
gen-tpcds-queries.sh
create-tpcds-tables.sh
load-tpcds-data.sh
run-tpcds-queries.sh
generate data and queries support specify SCALE,
create table may need to be edited handly to specify BUCKETS or change int to bigint if SCALE is too big.
---------
Co-authored-by: stephen <hello_stephen@@qq.com>
1. The first property is `only_specified_database`:
In the past, `Jdbc Catalog` will synchronize all database from source database.
Now we add a parameter called `only_specified_database` to jdbc catalog to allow only the specified database to be synchronized, eg:
```sql
create resource if not exists ${resource_name} properties(
"type"="jdbc",
"user"="root",
"password"="123456",
"jdbc_url" = "jdbc:mysql://172.18.0.1:${mysql_port}/doris_test?useSSL=false",
"driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar",
"driver_class" = "com.mysql.cj.jdbc.Driver",
"only_specified_database" = "true"
);
```
if `only_specified_database` is `true`, jdbc catalog will only synchronize the database which is specified in `jdbc_url`.
2. The second property is `lower_case_table_names`:
This property will synchronize jdbc external data source table names in lower case.
```sql
create resource if not exists ${resource_name} properties(
"type"="jdbc",
"user"="doris_test",
"password"="123456",
"jdbc_url" = "jdbc:oracle:thin:@172.18.0.1:${oracle_port}:${SID}",
"driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/ojdbc8.jar",
"driver_class" = "oracle.jdbc.driver.OracleDriver",
"lower_case_table_names" = "true"
);
```
1. could not use static INSTANCE for FoldConstantOnBE rule, because it is stateful
2. if expression root is Alias, should use its child to do const collection
This CL mainly changes:
Support specifying csv schema manually in s3/hdfs table valued function
s3 (
'URI' = 'https://bucket1/inventory.dat',
'ACCESS_KEY'= 'ak',
'SECRET_KEY' = 'sk',
'FORMAT' = 'csv',
'column_separator' = '|',
'csv_schema' = 'k1:int;k2:int;k3:int;k4:decimal(38,10)',
'use_path_style'='true'
)
Add new session variable dry_run_query
If set to true, the real query result will not be returned, instead, it will only return the number of returned rows.
mysql> select * from bigtable;
+--------------+
| ReturnedRows |
+--------------+
| 10000000 |
+--------------+
This can avoid large result set transmission time and focus on real execution time of query engine.
For debug and analysis purpose.
background:
At the moment, match query must with inverted index,
problem description:
After drop inverted index which is the only index in table, there still can use match query for this index column.
fix it:
The index should be updated on BE regardless of whether the indexes_desc from FE is empty.
We set LIBHDFS3_CONF env in start_be.sh, so libhdfs3 will try to read this hdfs-site.xml,
if file does not exist, it will throw error. But Doris does not handle this error, cause BE crash.
This CL mainly changes:
Modify start_be.sh to only set LIBHDFS3_CONF if hdfs-site.xml exist.
Refactor the HDFSCommonBuilder so that it can return error correctly.
Add BE IP info in status, so that we can get ip from error msg like:
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]failed to init reader for file 000.snappy.orc, err:
[INTERNAL_ERROR][172.21.0.101]failed to init HDFSCommonBuilder, please check check be/conf/hdfs-site.xml
The logic of prefer compute node is wrong, which causing the external table query can only assign up to 3 backends.
This CL refactor this logic and also change some FE config:
prefer_compute_node_for_external_table
If set to true, query on external table will prefer to assign to compute node.
And the max number of compute node is controlled by min_backend_num_for_external_table.
If set to false, query on external table will assign to any node.
min_backend_num_for_external_table
Only take effect when prefer_compute_node_for_external_table is true.
If the compute node number is less than this value, query on external table will try to get some mix node
to assign, to let the total number of node reach this value.
If the compute node number is larger than this value, query on external table will assign to compute node only.
Add a special counter for memtracker, faster, but relaxed ordering and not accurate in real time
Track thread create and destroy memory, which was previously removed due to performance loss and added back