Design Documentation Linked to #25514
Regression test add a new group: arrow_flight_sql,
./run-regression-test.sh -g arrow_flight_sql to run regression-test, can use jdbc:arrow-flight-sql to run all Suites whose group contains arrow_flight_sql.
./run-regression-test.sh -g p0,arrow_flight_sql to run regression-test, can use jdbc:arrow-flight-sql to run all Suites whose group contains arrow_flight_sql, and use jdbc:mysql to run other Suites whose group contains p0 but does not contain arrow_flight_sql.
Requires attention, the formats of jdbc:arrow-flight-sql and jdbc:mysql and mysql client query results are different, for example:
Datatime field type: jdbc:mysql returns 2010-01-02T05:09:06, mysql client returns 2010-01-02 05:09:06, jdbc:arrow-flight-sql also returns 2010-01-02 05:09 :06.
Array and Map field types: jdbc:mysql returns ["ab", "efg", null], {"f1": 1, "f2": "a"}, jdbc:arrow-flight-sql returns ["ab ","efg",null], {"f1":1,"f2":"a"}, which is missing spaces.
Float field type: jdbc:mysql and mysql client returns 6.333, jdbc:arrow-flight-sql returns 6.333000183105469, in query_p0/subquery/test_subquery.groovy.
If the query result is empty, jdbc:arrow-flight-sql returns empty and jdbc:mysql returns \N.
use database; and query should be divided into two SQL executions as much as possible. otherwise the results may not be as expected. For example: USE information_schema; select cast ("0.0101031417" as datetime) The result is 2000-01-01 03:14:1 (constant fold), select cast ("0.0101031417" as datetime) The result is null (no constant fold),
In addition, doris jdbc:arrow-flight-sql still has unfinished parts, such as:
Unsupported data type: Decimal256. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E3] write_column_to_arrow with type Decimal256
Unsupported null value of map key. INVALID_ARGUMENT: [INTERNAL_ERROR]Fail to convert block data to arrow data, error: [E33] Can not write null value of map key to arrow.
Unsupported data type: ARRAY<MAP<TEXT,TEXT>>
jdbc:arrow-flight-sql not support connecting to specify DB name, such asjdbc:arrow-flight-sql://127.0.0.1:9090/{db_name}", In order to be compatible with regression-test, use db_nameis added before all SQLs whenjdbc:arrow-flight-sql` runs regression test.
select timediff("2010-01-01 01:00:00", "2010-01-02 01:00:00");, error java.lang.NumberFormatException: For input string: "-24:00:00"
Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.
CREATE TABLE `person` (
`id` int(11) NULL,
`name` text NULL,
`age` int(11) NULL,
`class` int(11) NULL,
`address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);
insert into person values (10001,'test1',30,2,'test\\\\,xxx');
Adding logs:
select * from person where address like '%\\\\%';
I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%
It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.
bug: some chinese word not sort by pinyin in GBK coding
CREATE TABLE `test_convert` (
`a` varchar(100) NULL
) ENGINE=OLAP
DUPLICATE KEY(`a`)
DISTRIBUTED BY HASH(`a`) BUCKETS 3
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
insert into test_convert values("b"), ("a"), ("c"), ("睿"), ("多"), ("丝");
Query OK, 6 rows affected (0.03 sec)
{'label':'insert_ca73a6acc2194d5b_888218a3949355a6', 'status':'VISIBLE', 'txnId':'18068'}
mysql [test]>select * from test_convert;
+------+
| a |
+------+
| a |
| c |
| 丝 |
| b |
| 多 |
| 睿 |
+------+
6 rows in set (0.01 sec)
mysql [test]>select * from test_convert order by convert(a using gbk);
+------+
| a |
+------+
| a |
| b |
| c |
| 多 |
| 丝 |
| 睿 |
+------+
6 rows in set (0.01 sec)
Previous logic of reverse function might not be strong enough to handle illegal character. For example, one one byte size character would be mistaken as one utf-8 character which occupies more than one byte space. And unfortunately exceeding the buffer space during future process.