Problem:
When a table is created with a column of datatype varchar() (no length given), the column defaults to the maximum length. However, DESC does not show
the real length; it shows only varchar().
Reason:
Since the upgrade from 2.0.1 to 2.0.2, creating varchar() without a length is supported, and DESC shows the type exactly as it was written in
the DDL schema. As a result, users are confused about the actual length of the varchar column.
Solved:
Display varchar() as varchar(65533), which is compatible with Hive.
When CREATE TABLE LIKE is run against a table that has a column such as decimal(10, 0), the precision and scale are lost in the new table because the scale is 0. This change fixes that bug.
**Before the fix, create a table with the following SQL**:
CREATE TABLE IF NOT EXISTS db_test.table_test
(
`name` varchar COMMENT "1m size",
`id` SMALLINT COMMENT "[-32768, 32767]",
`timestamp0` decimal null comment "c0",
`timestamp1` decimal(38, 0) null comment "c1"
)
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES ('replication_num' = '1');
**and then run**:
CREATE TABLE db_test.table_test_like LIKE db_test.table_test;
SHOW CREATE TABLE db_test.table_test_like;
The field `timestamp1` is shown as decimal(9, 0), which is wrong; it was declared as decimal(38, 0). This change fixes it.
## Proposed changes
Infer the column name when creating a view if the column is an expression.
## Further comments
The strategy for inferring expression column names is as follows:
| expr | example | column name(before) | Inferred column name(if position is 2) |
| ------------- | --------------------------------------- | ------------------------------ | -------------------------------------- |
| function | dayofyear() | dayofyear() | __dayofyear_1 |
| cast | cast(1 as bigint) | CAST(1 AS BIGINT) | __cast_1 |
| analyticExpr | min() | min() | __min_1 |
| predicate | 1 in (1,2,3,4) | 1 IN (1, 2, 3, 4) | __in_predicate_1 |
| literal | 1 or 'string_var_name' | 1 or 'string_var_name' | __literal_1 |
| arithmeticExpr | & | ... & ... | __arithmetic_expr_1 |
| identifier | a or b | a or b | a or b |
| case | CASE WHEN remark = 's' THEN 1 ELSE 2 END | CASE WHEN remark = 's' THEN 1 ELSE 2 END | __case_1 |
| window | min(timestamp) OVER (...) | min(timestamp) OVER(...) | __min_1 |
Example SQL:
```sql
CREATE VIEW v1 AS
SELECT
error_code,
1,
'string',
now(),
dayofyear(op_time),
cast (source AS BIGINT),
min(`timestamp`) OVER (
ORDER BY
op_time DESC ROWS BETWEEN UNBOUNDED PRECEDING
AND 1 FOLLOWING
),
1 > 2,
2 + 3,
1 IN (1, 2, 3, 4),
remark LIKE '%like',
CASE WHEN remark = 's' THEN 1 ELSE 2 END,
TRUE | FALSE
FROM
db_test.table_test1
```
The output column names are as follows:
```
error_code
__literal_1
__literal_2
__now_3
__dayofyear_4
__cast_expr_5
__min_6
__binary_predicate_7
__arithmetic_expr_8
__in_predicate_9
__like_predicate_10
__case_expr_11
__arithmetic_expr_12
```
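The naming pattern in the output above can be sketched in a few lines of Python. This is only an illustration of the observed pattern (`__<label>_<position>`, with plain column references kept unchanged), not the FE implementation. The function name `infer_column_names` and the `(label, identifier)` input shape are hypothetical, and the labels are copied from the output list above.

```python
from typing import List, Optional, Tuple


def infer_column_names(items: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Return one column name per select-list entry.

    Each item is (label, identifier). `identifier` is set only for plain
    column references, which keep their own name. Every other expression
    gets "__<label>_<position>", where position is the 0-based index in
    the select list (hypothetical helper, mirroring the output above).
    """
    names = []
    for position, (label, identifier) in enumerate(items):
        if identifier is not None:
            names.append(identifier)
        else:
            names.append(f"__{label}_{position}")
    return names


# The select list of view v1, reduced to (label, identifier) pairs.
select_list = [
    ("identifier", "error_code"),
    ("literal", None),           # 1
    ("literal", None),           # 'string'
    ("now", None),               # now()
    ("dayofyear", None),         # dayofyear(op_time)
    ("cast_expr", None),         # cast(source AS BIGINT)
    ("min", None),               # min(`timestamp`) OVER (...)
    ("binary_predicate", None),  # 1 > 2
    ("arithmetic_expr", None),   # 2 + 3
    ("in_predicate", None),      # 1 IN (1, 2, 3, 4)
    ("like_predicate", None),    # remark LIKE '%like'
    ("case_expr", None),         # CASE WHEN ... END
    ("arithmetic_expr", None),   # TRUE | FALSE
]

inferred = infer_column_names(select_list)
print("\n".join(inferred))
```

Because the position is part of the name, two expressions of the same kind (for example the two literals) never collide.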
Previously, the lambda function Expr did not implement the toSqlImpl() function,
so the parent class's implementation was called, which is not suitable for lambda functions
and caused an error when creating a view.
Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica.
The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica.
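A minimal sketch of the selection rule follows. The description does not say which hash function is used, so CRC32 of the decimal replica_id is a stand-in assumption here. The function names, the peer map, and the assumption that the peer list includes the local replica are also hypothetical.

```python
import zlib
from typing import Dict


def stand_in_hash(replica_id: int) -> int:
    # Assumption: the real hash function is not given in the description;
    # CRC32 of the decimal id is used only to make the sketch deterministic.
    return zlib.crc32(str(replica_id).encode("ascii"))


def pick_compaction_replica(peers: Dict[int, str]) -> int:
    """Return the replica_id with the smallest hash value.

    `peers` maps replica_id -> host. Ties are broken by replica_id so that
    every backend reaches the same answer from the same peer list.
    """
    return min(peers, key=lambda rid: (stand_in_hash(rid), rid))


def should_fetch_from_peer(local_replica_id: int, peers: Dict[int, str]) -> bool:
    # A replica fetches compaction results unless it is the selected one.
    return pick_compaction_replica(peers) != local_replica_id


# Hypothetical peer list for one tablet, as it might be returned by the FE.
peers = {10001: "be-host-1", 10002: "be-host-2", 10003: "be-host-3"}
primary = pick_compaction_replica(peers)
fetchers = sorted(rid for rid in peers if should_fetch_from_peer(rid, peers))
print(primary, peers[primary], fetchers)
```

Since every backend evaluates the same pure function over the same peer list, the replicas agree on the executor without any extra coordination.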
The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool.
When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset versions are then compared with the local rowset versions, and the first version that can be fetched is selected.
Make database, table, column, and other names support Unicode by changing the LABEL_REGEX, COMMON_NAME_REGEX, COMMON_TABLE_NAME_REGEX, and COLUMN_NAME_REGEX regular expressions in class FeNameFormat.
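To illustrate the kind of change involved, here is a small Python sketch that contrasts an ASCII-only name pattern with a Unicode-aware one. Both patterns are illustrative stand-ins, not the actual expressions in FeNameFormat, and the helper `is_valid_name` is hypothetical.

```python
import re

# Assumption: stand-in patterns for illustration only.
# ASCII-only: an ASCII letter followed by ASCII letters, digits, or underscores.
ASCII_ONLY_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Unicode-aware: any Unicode letter followed by Unicode word characters.
# In Python 3, \w matches Unicode word characters for str patterns, and
# [^\W\d_] narrows that set to letters only.
UNICODE_NAME = re.compile(r"[^\W\d_]\w*")


def is_valid_name(name: str, pattern: "re.Pattern[str]") -> bool:
    # fullmatch requires the whole name to match, like ^...$ anchors.
    return pattern.fullmatch(name) is not None


for candidate in ["table_1", "表名_1", "1table"]:
    print(
        candidate,
        is_valid_name(candidate, ASCII_ONLY_NAME),
        is_valid_name(candidate, UNICODE_NAME),
    )
```

Under these stand-in patterns, `表名_1` is rejected by the ASCII-only rule and accepted by the Unicode-aware one, while `1table` is rejected by both.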
P.S. @SharpRay has transferred PR #13467 to me, and I'm now responsible for the task. Since there will be some modifications during the review period, I have created a new PR, and the original #13467 can be closed. Thanks.
Fix a bug where ArithmeticExpr's write method is not implemented, causing an FE crash when creating a function such as:
CREATE ALIAS FUNCTION IF NOT EXISTS mesh_udf_test1(INT,INT) WITH PARAMETER(n,d) AS ROUND(1+floor(n/d));
Add IF EXISTS and IF NOT EXISTS support for DROP FUNCTION and CREATE FUNCTION.
Fix a minor bug where the hdfs() table-valued function throws an NPE if the file does not exist.
When executing a CREATE TABLE AS SELECT (CTAS) statement,
the varchar/char/string columns in the created table are unified to the string type.
This is because, when selecting from an external table (MySQL, PostgreSQL, etc.), the length of a varchar in the external database
is measured in characters, not bytes.
So if the external table has a varchar(10) column, the created table gets the same varchar(10) column,
but the byte length of the data in the external table may be larger than 10, causing the CTAS to fail.
Changing the type to string does not affect performance or disk storage capacity.
Note that if a string column is the first column, it is changed to varchar(65535),
because a string column is not allowed as a sort key column.
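The character-versus-byte mismatch is easy to reproduce outside the database. The following Python sketch assumes the data is stored as UTF-8 (an assumption; the description does not name the encoding) and uses a made-up sample value.

```python
# A 10-character value that would fit varchar(10) in a database that
# counts characters (hypothetical sample data).
value = "数据库" * 3 + "表"

char_len = len(value)                  # length in characters
byte_len = len(value.encode("utf-8"))  # length in bytes, assuming UTF-8

# With a byte-based limit of 10, the same value no longer fits.
fits_char_limit = char_len <= 10
fits_byte_limit = byte_len <= 10

print(char_len, byte_len, fits_char_limit, fits_byte_limit)
```

Each of these CJK characters takes 3 bytes in UTF-8, so the 10-character value needs 30 bytes, which is why copying the varchar(10) definition unchanged can make the CTAS fail.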