Commit Graph

11043 Commits

Author SHA1 Message Date
b65094c8df [Improvement](multi-catalog) paimon supports projection push down (#20522)
Co-authored-by: hugoluo <hugoluo@tencent.com>
2023-06-07 00:39:08 +08:00
880e2d8373 [typo](doc) update spark connector version compatibility instructions (#20477) 2023-06-06 23:33:27 +08:00
43ae2c59c3 [typo](doc) Fixed some description of date_format function documentation (#20504) 2023-06-06 23:32:34 +08:00
5a749e6f4d [doc](catalog-hive) Add the property of hive catalog with kerberos. (#20502)
Co-authored-by: smallhibiscus <844981280>
2023-06-06 23:30:19 +08:00
c991249360 [enhancement](cooldown) use cooldown replica first when generating scan node (#20384) 2023-06-06 22:15:49 +08:00
a68afd0672 [fix](cooldown) fix bug due to tablets info changed (#20465) 2023-06-06 22:15:17 +08:00
b22e364cdb [fix](log) publish version log is printed too frequently (#20507) 2023-06-06 20:34:38 +08:00
82cf76f92b [fix](Nereids) join condition not extract as conjunctions (#20498) 2023-06-06 20:34:19 +08:00
05bdbce8fc [Feature](Nereids) support update unique table statement (#20313) 2023-06-06 20:32:43 +08:00
61d9bd2ba1 [fix](regression) fix export file test cases (#20463) 2023-06-06 20:07:31 +08:00
1f63c56e20 [sample](doris-soruce) add demo for reading data from doris be using thrift (#20192) 2023-06-06 19:57:34 +08:00
0c6292abaa [fix](stats) skip forbid_unknown_col_stats check for invisible column and internal db (#20362)
1. skip forbidUnknownColStats check for invisible columns
2. use columsStatistics.isUnknown to tell whether the stats are unknown
3. skip unknown stats check for internal schema
2023-06-06 19:07:33 +08:00
625a8bcb05 [fix](merge-on-write) fix that set_txn_related_delete_bitmap may coredump (#20300) 2023-06-06 17:49:01 +08:00
a569d371b3 [fix](Nereids) give clean error message when there are subquery in the on clause (#20211)
Add a rule for checking the join node in the `analysis/CheckAnalysis.java` file. When we check the join node, we should also check its ON clause. If it contains any subquery expression, we should throw an exception.

Before this PR
```
mysql> select a.k1 from baseall a join test b on b.k2 in (select 49);
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: nul
```

After this PR
```
mysql> select a.k1 from baseall a join test b on b.k2 in (select 49);
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: Not support OnClause contain Subquery, expr:k2 IN (INSUBQUERY) (LogicalOneRowRelation ( projects=[49 AS `49`#28], buildUnionNode=true ))
```
2023-06-06 16:50:20 +08:00
b1a8bb28f7 [Fix](WorkloadGroup)Fix query queue nereids bug #20484 2023-06-06 16:44:35 +08:00
4bc221aa25 [improvement](column reader) lazy load indices (#20456)
Currently, when reading column data, all types of indices are read even if they are not actually used. This PR implements lazy loading of indices.
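The lazy-loading idea can be sketched as follows. This is a minimal Python illustration, not the Doris column reader itself; `LazyColumnReader`, `fake_loader`, and the index type names are hypothetical and exist only for this example.

```python
class LazyColumnReader:
    """Loads each index on first use instead of eagerly at construction."""

    # Hypothetical index type names, used only for illustration.
    INDEX_TYPES = ("ordinal", "zone_map", "bitmap", "bloom_filter")

    def __init__(self, loader):
        self._loader = loader   # callable: index type -> index object
        self._indices = {}      # cache of the indices loaded so far

    def get_index(self, index_type):
        if index_type not in self.INDEX_TYPES:
            raise KeyError(f"unknown index type: {index_type}")
        if index_type not in self._indices:
            # First access: pay the load cost now and remember the result.
            self._indices[index_type] = self._loader(index_type)
        return self._indices[index_type]


loaded = []  # records which index types were actually loaded


def fake_loader(index_type):
    loaded.append(index_type)
    return f"<{index_type} index>"


reader = LazyColumnReader(fake_loader)
reader.get_index("zone_map")
reader.get_index("zone_map")  # second call is served from the cache
print(loaded)  # ['zone_map']
```

Only the index that was requested is loaded, and it is loaded once; the other index types are never touched.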
2023-06-06 16:36:06 +08:00
17259672ff [typo](docs)modify http_port to webserver_port (#20447) 2023-06-06 16:08:45 +08:00
7df8459e21 [fix](regression-test) add retry time to avoid regression test failed (#20487)
After alter table ${tbl} set('dynamic_partition.end'='5'), the dynamic partitions are now added asynchronously.
The test therefore needs to wait for the dynamic partition scheduler.
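A retry loop of this kind might look like the sketch below. It is a Python illustration under assumptions, not the Groovy regression test from the PR; `wait_until`, `partitions_created`, and the retry settings are hypothetical.

```python
import time


def wait_until(condition, retries=10, interval_s=0.01):
    """Poll `condition` until it returns True or the retries run out."""
    for _ in range(retries):
        if condition():
            return True
        time.sleep(interval_s)
    return False


# Simulated asynchronous scheduler: the partitions appear on the third poll.
polls = {"count": 0}


def partitions_created():
    polls["count"] += 1
    return polls["count"] >= 3


ok = wait_until(partitions_created)
```

The test asserts on the result only after `wait_until` returns, which avoids failing when the scheduler has simply not run yet.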
2023-06-06 15:50:11 +08:00
48021366bf [fix](load) fix unified load redirect status delegate error (#20467) 2023-06-06 15:46:48 +08:00
13f1b90768 [Fix] (tablet) fix tablet queryable set (#20413) (#20414) 2023-06-06 15:38:01 +08:00
24f9610cbb [fix](docker)Add container graceful exit logic (#20474)
Make the FE and BE containers run the Stop script when the exit command is executed, so that the metadata is written successfully and restart exceptions caused by BDBJE are minimized.
2023-06-06 15:25:21 +08:00
1b02b28c40 [feature](docker)Docker example hive-broker-doris (#20473)
add new docker example: hdfs-broker-doris
2023-06-06 15:24:58 +08:00
a3bcdf7b44 [docs](docs)Fix docker example doc (#20472)
fix run-docker-cluster docs docker volumes:conf
2023-06-06 15:24:32 +08:00
0337dd573c [fix](docker)Fix docker example script (#20471)
remove docker example script volumes: conf
2023-06-06 15:24:04 +08:00
f1db1f3663 [fix](docker)Fix BE init script Bug (#20470)
change /bin/env bash -> /bin/bash
2023-06-06 15:23:35 +08:00
5184b31620 [feature](docker)Add new example MySQL-Flink-Doris Demo (#20469)
Add new example MySQL-Flink-Doris Demo
2023-06-06 15:23:11 +08:00
fe63a0a3bb [Feature](multi-catalog)support paimon catalog (#19681)
CREATE CATALOG paimon_n2 PROPERTIES (
"dfs.ha.namenodes.HDFS1006531" = "nn2,nn1",
"dfs.namenode.rpc-address.HDFS1006531.nn2" = "172.16.65.xx:4007",
"dfs.namenode.rpc-address.HDFS1006531.nn1" = "172.16.65.xx:4007",
"hive.metastore.uris" = "thrift://172.16.65.xx:7004",
"type" = "paimon",
"dfs.nameservices" = "HDFS1006531",
"hadoop.username" = "hadoop",
"paimon.catalog.type" = "hms",
"warehouse" = "hdfs://HDFS1006531/data/paimon1",
"dfs.client.failover.proxy.provider.HDFS1006531" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
);
2023-06-06 15:08:30 +08:00
ae428c29e2 [feature](planner)(nereids) support user defined variable (#20334)
Support user-defined variables.
After this PR, we can use `set @a = xx` to define a user variable and use it in the query like `select @a`.

the changes of this PR:
1. Support the grammar for `set user variable` in the parser.
2. Add the `userVars` in `VariableMgr` to store the user-defined variables.
3. For the `set @a = xx`, we will store the variable name and its value in the `userVars` in `VariableMgr`.
4. For the `select @a`, we will get the value for the variable name in `userVars`.
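
The store-and-look-up flow in steps 3 and 4 can be sketched as below. This is a simplified Python illustration, not the Doris parser or its `VariableMgr`; the regular expressions, the `execute` helper, and the choice to return `None` for an undefined variable are assumptions made for the example.

```python
import re


class VariableMgr:
    """Stores user-defined variables by name (stand-in for `userVars`)."""

    def __init__(self):
        self.user_vars = {}

    def set_user_var(self, name, value):
        self.user_vars[name] = value

    def get_user_var(self, name):
        # Assumption: an undefined variable evaluates to None (NULL).
        return self.user_vars.get(name)


# Toy grammar covering only `set @name = value` and `select @name`.
SET_RE = re.compile(r"\s*set\s+@(\w+)\s*=\s*(.+?)\s*;?\s*", re.IGNORECASE)
SELECT_RE = re.compile(r"\s*select\s+@(\w+)\s*;?\s*", re.IGNORECASE)


def execute(mgr, stmt):
    m = SET_RE.fullmatch(stmt)
    if m:
        # Values are kept as raw strings in this sketch.
        mgr.set_user_var(m.group(1), m.group(2))
        return None
    m = SELECT_RE.fullmatch(stmt)
    if m:
        return mgr.get_user_var(m.group(1))
    raise ValueError(f"unsupported statement: {stmt}")


mgr = VariableMgr()
execute(mgr, "set @a = 10")
print(execute(mgr, "select @a"))  # 10
```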
2023-06-06 14:35:16 +08:00
0fce7b9011 [fix](http) Let the sdk find the httpclient package determined (#20205) 2023-06-06 14:20:38 +08:00
1f032a551d [Improve](array-functions) support array first function (#20397)
add array_first(lambda, [1,2,3,null]) function for doris
2023-06-06 12:08:46 +08:00
1b94b6368f [fix](load) in strict mode, return error for insert if datatype convert fails (#20378)
* [fix](load) in strict mode, return error for load and insert if datatype convert fails

Revert "[fix](MySQL) the way Doris handles boolean type is consistent with MySQL (#19416)"

This reverts commit 68eb420cabe5b26b09d6d4a2724ae12699bdee87.

Since it changed other behaviours: e.g. in strict mode, insert into t_int values ("a")
inserts 0 into the table, but it should return an error instead.

* fix be ut

* fix regression tests
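
The intended strict-mode behaviour can be sketched as below. This is a Python illustration, not Doris code; `convert_to_int` is a hypothetical helper, and returning `None` (NULL) in non-strict mode is an assumption of this sketch.

```python
def convert_to_int(value, strict):
    """Convert `value` to an int, failing loudly in strict mode."""
    try:
        return int(value)
    except (TypeError, ValueError):
        if strict:
            # Strict mode: report the failure instead of inserting 0.
            raise ValueError(f"cannot convert {value!r} to INT in strict mode")
        # Assumption: non-strict mode stores NULL for a failed conversion.
        return None


print(convert_to_int("42", strict=True))   # 42
print(convert_to_int("a", strict=False))   # None
```

With `strict=True`, `convert_to_int("a", strict=True)` raises an error rather than silently producing 0.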
2023-06-06 12:04:03 +08:00
65100d8083 [improvement](profile)add max/min rpc time (#20339) 2023-06-06 12:03:01 +08:00
e553615a27 [opt](Nereids) prefer to use datev2 / datetimev2 in date related functions (#20224)
1. Update the signature order of all date-related functions.
1.1. If the return value needs to be computed with time info, signatures with datetimev2 args come first in the list, followed by datev2, datetime and date.
1.2. If the return value needs to be computed with date info only, signatures with datev2 args come first in the list, followed by datetimev2, date and datetime.
2. Prefer datev2 when a date must be cast to datev2 or datetime/datetimev2.
2023-06-06 11:42:29 +08:00
c56eddbfa9 [bug](jdbc) fix trino date/datetime filter (#20443)
When querying Trino's JDBC catalog, if our WHERE filter condition is k1 >= '2022-01-01', this format is incorrect. 
In Trino, the correct format should be k1 >= date '2022-01-01' or k1 >= timestamp '2022-01-01 00:00:00'. 
Therefore, the date string in the WHERE condition needs to be converted to the date or timestamp format supported by Trino.
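The conversion described above can be sketched as follows. This is a Python illustration, not the Doris JDBC code; `to_trino_literal` and `to_trino_predicate` are hypothetical helpers, and deciding the literal kind from the string's shape (rather than from the column type) is a simplifying assumption.

```python
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
DATETIME_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def to_trino_literal(value):
    """Wrap a date or datetime string in the typed literal Trino expects."""
    if DATE_RE.fullmatch(value):
        return f"date '{value}'"
    if DATETIME_RE.fullmatch(value):
        return f"timestamp '{value}'"
    # Anything else is passed through as a plain string literal.
    return f"'{value}'"


def to_trino_predicate(column, op, value):
    return f"{column} {op} {to_trino_literal(value)}"


print(to_trino_predicate("k1", ">=", "2022-01-01"))
# k1 >= date '2022-01-01'
print(to_trino_predicate("k1", ">=", "2022-01-01 00:00:00"))
# k1 >= timestamp '2022-01-01 00:00:00'
```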
2023-06-06 11:20:42 +08:00
2fc1141c5f [test](regression) update some case in p2 (#20436) 2023-06-06 11:05:56 +08:00
d02737a293 [feature](struct-type) support struct_element function (#19045)
This commit adds a function that returns a field column of a named struct column.
Since the function can return any type, this commit also adds support for ANY_STRUCT_TYPE
and ANY_ELEMENT_TYPE.
2023-06-06 10:44:08 +08:00
f839c90c27 [fix][refactor](backend-policy)(compute) refactor the hierarchy of external scan node and fix compute node bug #20402
There should be 2 kinds of ScanNode:

1. OlapScanNode
2. ExternalScanNode

The Backends used for ExternalScanNode should be controlled by FederationBackendPolicy.
But currently, only FileScanNode is controlled by FederationBackendPolicy; other scan nodes such as MysqlScanNode and
JdbcScanNode use Mix Backends even if we enable and prefer Compute Backends.

In this PR, I modified the hierarchy of ExternalScanNode, the new hierarchy is:

ScanNode
    OlapScanNode
    SchemaScanNode
    ExternalScanNode
        MetadataScanNode
        DataGenScanNode
        EsScanNode
        OdbcScanNode
        MysqlScanNode
        JdbcScanNode
        FileScanNode
            FileLoadScanNode
            FileQueryScanNode
                MaxComputeScanNode
                IcebergScanNode
                TVFScanNode
                HiveScanNode
                    HudiScanNode
Previously, the BackendPolicy was a member of FileScanNode; I have now moved it to ExternalScanNode,
so that every ExternalScanNode subtype can use the BackendPolicy to choose a Compute Backend to execute the query.

All ExternalScanNode subtypes should implement the abstract method createScanRangeLocations().

For scan nodes such as the JDBC scan node and the MySQL scan node, the scan range locations are selected randomly from
the compute nodes (if preferred).

As for compute node selection: if all scan nodes are external scan nodes and prefer_compute_node_for_external_table
is set to true, only compute nodes are selected as the BEs for this query.
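
The refactored ownership of the backend policy can be sketched as below. This is a Python illustration of the design, not the Doris Java classes; the class and method names are borrowed from the description above, while the backend dictionaries, the `role` field, and the fallback to all backends when no compute node exists are assumptions of this sketch.

```python
import random
from abc import ABC, abstractmethod


class FederationBackendPolicy:
    """Picks a backend for an external scan (simplified stand-in)."""

    def __init__(self, backends, prefer_compute_node):
        compute = [b for b in backends if b["role"] == "compute"]
        # Assumption: fall back to every backend when no compute node exists.
        use_compute = prefer_compute_node and bool(compute)
        self.candidates = compute if use_compute else list(backends)

    def get_next_be(self):
        return random.choice(self.candidates)


class ScanNode(ABC):
    pass


class ExternalScanNode(ScanNode):
    """Owns the backend policy so that every external subtype shares it."""

    def __init__(self, backend_policy):
        self.backend_policy = backend_policy
        self.scan_range_locations = []

    @abstractmethod
    def create_scan_range_locations(self):
        """Each subtype decides where its scan ranges run."""


class JdbcScanNode(ExternalScanNode):
    def create_scan_range_locations(self):
        # A JDBC scan has no data locality, so any candidate backend will do.
        self.scan_range_locations = [self.backend_policy.get_next_be()]


backends = [
    {"host": "be1", "role": "mix"},
    {"host": "be2", "role": "compute"},
]
policy = FederationBackendPolicy(backends, prefer_compute_node=True)
node = JdbcScanNode(policy)
node.create_scan_range_locations()
print(node.scan_range_locations)  # [{'host': 'be2', 'role': 'compute'}]
```

Because the policy lives on the shared base class, a new external scan node type only has to implement `create_scan_range_locations` to respect the compute-node preference.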
2023-06-06 10:35:30 +08:00
c7888f4bfa [feature](profile)Add the filtering info of the in filter in profile #20321
Currently, it is difficult to obtain the ID of IN filters, so the ID of some IN filters is -1.
2023-06-06 10:24:59 +08:00
378ffa133e [fix](regression-test) Add lost ddl file for tpcds_sf1_p2 #20288 2023-06-06 09:57:38 +08:00
5f4ccb1f2e [fix](load) fix generate delete bitmap in memtable flush (#20446)
1. Generate delete bitmap for one segment at a time.
2. Generate delete bitmap before segment compaction.
Fix #20445
2023-06-06 09:48:30 +08:00
22eec4148b [fix](conf) fix fe host in doris-cluster.conf #20422 2023-06-06 09:15:36 +08:00
1fc48e83f2 [fix](executor)Fix duplicate timer and add open timer #20448
1. Currently, a Node's total timer counter is timed twice (in Open and alloc_resource), which may make the timer in the profile incorrect.
2. Add more timers to find more code that may cost a lot of time.
2023-06-06 08:55:52 +08:00
4f77578d8a [enhancement](profile) add build get child next time (#20460)
Currently, the build time does not include the child(1)->get_next time, which is very confusing in the shared hash table scenario, so this PR adds a profile counter for it.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-06-06 08:55:19 +08:00
6c96e1dc9f [fix](regression) add sync after streamload in test_stream_load (#20425)
Add sync after streamload in test_stream_load to fix following error:

Exception in load_p0/stream_load/test_stream_load.groovy(line 180):

                throw exception
            }
            log.info("Stream load result: ${result}".toString())
            def json = parseJson(result)
            assertEquals("success", json.Status.toLowerCase())
            assertEquals(1, json.NumberTotalRows)
            assertEquals(0, json.NumberFilteredRows)
        }
    }
    order_qt_sql1 " SELECT * FROM ${tableName2}"
^^^^^^^^^^^^^^^^^^^^^^^^^^ERROR LINE^^^^^^^^^^^^^^^^^^^^^^^^^^

    // test common case
    def tableName3 = "test_all"
    def tableName4 = "test_less_col"
    def tableName5 = "test_bitmap_and_hll"
    def tableName6 = "test_unique_key"
    def tableName7 = "test_unique_key_with_delete"
    def tableName8 = "test_array"
    def tableName10 = "test_struct"
    sql """ DROP TABLE IF EXISTS ${tableName3} """

Exception:
java.lang.IllegalStateException: Check tag 'sql1' failed:
Check tag 'sql1' failed, line 1 mismatch, real line is empty, but expect is 2019  9  9  9  7.700  a  2019-09-09  1970-01-01T08:33:39  k7  9.0  9.0
sql:
 SELECT * FROM load_nullable_to_not_nullable
2023-06-06 08:32:25 +08:00
b7fc17da68 [feature-wip](multi-catalog)(step2)support read max compute data by JNI (#19819)
Issue Number: #19679
2023-06-05 22:10:08 +08:00
e576d533b2 [typo](doc)Remove useless hints (#20457)
Co-authored-by: hechao <hechao@selectdb.com>
2023-06-05 21:13:52 +08:00
0a90a9d507 [feature-wip](duplicate_no_keys) Add some test cases of all the duplicate tables in test case tpcds_sf100_dup_without_key_p2 and make them duplicate tables without keys (#20431) 2023-06-05 21:04:41 +08:00
25aa86087c [fix](audit) Fix the error of peakMemoryBytes in the audit log (#20449) 2023-06-05 21:02:18 +08:00
05d497d21e [fix](sequence) value predicates shouldn't be push down when has sequence column (#20408)
* (fix)[sequence] value predicates shouldn't be push down when has sequence column

* add case
2023-06-05 19:18:34 +08:00
fac0b50f56 [Fix](Planner)fix cast date/datev2/datetime to float/double return null. (#20008) 2023-06-05 19:06:50 +08:00