
1489 Commits

Author SHA1 Message Date
daf8ce29ca [Bug] Fix bucket shuffle bug when left table is without any data (#5965) 2021-06-16 09:39:31 +08:00
d0b60541af [Bug] Fix use of incorrect table name in expand star (#6003)
SelectStmt uses new TableName(null, tableRef.getAlias()) to expand the star expression, but tableRef.getAlias() returns the full name, including both the database name and the table name.
Using it as the table name generates wrong SQL in CreateViewStmt.
This patch fixes the problem by using the correct database name and table name in the expand star method.
2021-06-15 14:18:00 +08:00
54c7d177f8 [Log] Fix a log issue in BDBJournalCursor (#6006) 2021-06-10 17:39:25 +08:00
d33a6d1b98 [Function] Support date function: yearweek(), week(), makedate(). (#6000) 2021-06-10 17:38:25 +08:00
206a711f9b [Bug] SimplifyInvalidDateBinaryPredicatesDateRule may cause invalid query plan (#5987)
1. "where 1k > to_date(now())" will return EMPTYSET in query plan.
2. DateLiteral should accept date string like "2021-6-1".
2021-06-10 17:37:26 +08:00
97ed59780d [Bug] Outer join dispose constexpr error in inlineview (#5986) 2021-06-10 17:36:29 +08:00
e7a7b8d2d1 [Bug] Fix bug that start time is null when SQL is forwarded to master (#5966) 2021-06-10 17:34:59 +08:00
6106cc7d96 [Doris On ES][Bug-Fix] split es publish_address if it has host (#5955) 2021-06-10 17:34:44 +08:00
e245aee33e [Feature] Select outfile support parquet format (#5938)
`Select outfile into` currently only supports exporting data in CSV format.
This patch extends the feature to support the Parquet format.

Usage:
LocalFile:
```
SELECT citycode FROM table1 INTO OUTFILE "file:///root/doris/" FORMAT AS PARQUET PROPERTIES 
("schema"="required,int32,siteid;", "parquet.compression"="snappy");
```

BrokerFile:
```
SELECT siteid FROM table1 INTO OUTFILE "hdfs://host/test_sql_prc_2019_02_19/" FORMAT AS PARQUET
PROPERTIES ( 
"broker.name" = "hdfs_broker",
"broker.hadoop.security.authentication" = "kerberos",
"broker.kerberos_principal" = "test",
"broker.kerberos_keytab_content" = "base64" ,
"schema"="required,int32,siteid;"
);
```

Field `schema` is required; it defines the schema of the Parquet file.
Properties prefixed with `parquet.` are Parquet file properties, such as compression, version, and enable_dictionary.
2021-06-10 17:34:01 +08:00
ad365b3b64 [Bug] Fix bug that cannot cancel alter table operation when table is unstable (#5998)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-06-09 15:00:17 +08:00
d9c128b744 [BrokerLoad] Support read properties for broker load when read data (#5845)
* [BrokerLoad] support read properties for broker load when read data

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-06-09 14:59:55 +08:00
c95bebfa5c [Bug] Ignore drop table log when table has been dropped (#5973)
The table lock controls concurrent modification of a table by different threads,
but it cannot prevent another thread from dropping the table.
For example, a drop table and a table update can occur at the same time:

1. Thread 1 gets the table object.
2. Thread 2 drops the table while holding the table lock.
3. Thread 1 updates the table object.

In this sequence, step 3 operates on a table that no longer exists, which eventually causes a NullPointerException.

In fact, a table modification log that arrives after the drop can be ignored, because it is meaningless to modify information on a table that no longer exists.
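
The idea of skipping a modification whose table has already been dropped can be sketched as follows. This is a minimal illustration in Python, not the Doris implementation; `Catalog`, `replay_modify_table`, and the property names are hypothetical.

```python
from typing import Dict, Optional


class Catalog:
    """Toy stand-in for a metadata catalog (hypothetical, not Doris code)."""

    def __init__(self) -> None:
        self.tables: Dict[int, dict] = {}

    def get_table(self, table_id: int) -> Optional[dict]:
        return self.tables.get(table_id)

    def drop_table(self, table_id: int) -> None:
        self.tables.pop(table_id, None)


def replay_modify_table(catalog: Catalog, table_id: int, new_props: dict) -> bool:
    """Apply a 'modify table' log entry, or skip it if the table is gone."""
    table = catalog.get_table(table_id)
    if table is None:
        # The table was dropped earlier; modifying it would be meaningless,
        # so the entry is ignored instead of raising an error.
        return False
    table.update(new_props)
    return True


catalog = Catalog()
catalog.tables[11100] = {"name": "table2"}
catalog.drop_table(11100)  # the drop wins the race
applied = replay_modify_table(catalog, 11100, {"replica_count": 3})
```

Here `applied` is `False`: the late modification is dropped silently rather than dereferencing a missing table.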

Fixed #5687
2021-06-09 13:00:16 +08:00
60062d97da [Enhance] convert byte size into a human-readable format (#5970) 2021-06-06 22:07:58 +08:00
4b23bca2be [Refactor] catch throwable instead of exception (#5942)
Co-authored-by: 孙忠健(20113660) <sunzj10@ke.com>
2021-06-06 22:06:02 +08:00
f1e881e6f0 [Bug-fix] Show view stmt report error when view references invalid table (#5919) 2021-06-06 22:05:22 +08:00
a5c35eb20f [Bug] Fix the bug of null pointer exception of colocate join (#5961) 2021-06-04 10:19:58 +08:00
3ca6e386c7 [Function] Support Function current_timestamp(), curtime(), current_time() in FE. (#5946)
Support Function `current_timestamp()`, `curtime()`, `current_time()` in FE to do constant fold.
2021-06-03 18:39:19 +08:00
8e4b601ff2 [Bug] Fix the bug in checking whether a Fragment is a colocate / bucket shuffle join (#5940) 2021-05-31 12:14:44 +08:00
cf2e0cf2c8 [Bug] Fix export job sometimes stuck in exporting state after timeout (#5932)
Fix #5931
The reason is that the method coordinate.exec() is sometimes not called when the job times out,
so the query profile in this coordinate is not initialized,
which causes an NPE in the execution of ExportExportingTask.
2021-05-30 23:09:29 +08:00
ba868c610f [Optimize] Optimize some tablet scheduling logic (#5926)
1. The partitions set by the admin repair command are prioritized
   to ensure that the tablets of these partitions can be repaired as soon as possible.

2. Add an FE metric "query_begin" to monitor the number of queries submitted to the Doris.
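
Item 1 above amounts to a priority ordering over pending tablets. A minimal sketch, assuming a simple two-level priority and first-come-first-served ordering within each level (the class `TabletQueue` and its fields are hypothetical and do not reflect the actual scheduler):

```python
import heapq
from itertools import count
from typing import Iterable, List, Tuple


class TabletQueue:
    """Pending-tablet queue that serves prioritized partitions first."""

    def __init__(self, prioritized_partitions: Iterable[str]) -> None:
        self._prioritized = set(prioritized_partitions)
        self._heap: List[Tuple[int, int, int]] = []
        self._seq = count()  # tie-breaker that preserves insertion order

    def add(self, tablet_id: int, partition: str) -> None:
        # Rank 0 sorts before rank 1, so repair-requested partitions go first.
        rank = 0 if partition in self._prioritized else 1
        heapq.heappush(self._heap, (rank, next(self._seq), tablet_id))

    def pop(self) -> int:
        return heapq.heappop(self._heap)[2]


queue = TabletQueue(prioritized_partitions={"p_repair"})
queue.add(1, "p_other")
queue.add(2, "p_repair")
queue.add(3, "p_other")
queue.add(4, "p_repair")
order = [queue.pop() for _ in range(4)]
```

With these inputs the tablets of `p_repair` (2 and 4) are popped before the others.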
2021-05-30 23:08:59 +08:00
0da59aab53 [Bug] Fix bug of multi load and some issue about httpv2 (#5848)
To be compatible with http v1, so that users don't need to change their code.
2021-05-30 23:08:38 +08:00
63c99eb4cb [Cache][Enhancement] Assure sql cache only one version (#5793)
For PR #5792. This patch adds a new param `cache type` to distinguish sql cache from partition cache.
When updating the sql cache, we ensure that one sql key has only one cached version.
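
The "one sql key, one version" rule can be illustrated with a small sketch. It assumes a plain in-memory map; the class `SqlCache` and its methods are hypothetical and say nothing about how the real cache is stored.

```python
from typing import Dict, List, Optional, Tuple


class SqlCache:
    """Keeps at most one cached version per sql key."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[int, List[tuple]]] = {}

    def update(self, sql_key: str, version: int, rows: List[tuple]) -> None:
        # Overwriting the slot discards any older version for this key.
        self._entries[sql_key] = (version, rows)

    def fetch(self, sql_key: str, version: int) -> Optional[List[tuple]]:
        entry = self._entries.get(sql_key)
        if entry is None or entry[0] != version:
            return None  # miss: nothing cached, or cached version is stale
        return entry[1]


cache = SqlCache()
cache.update("select count(*) from t", version=1, rows=[(10,)])
cache.update("select count(*) from t", version=2, rows=[(12,)])
old_hit = cache.fetch("select count(*) from t", version=1)
new_hit = cache.fetch("select count(*) from t", version=2)
```

After the second update only version 2 remains, so a lookup for version 1 misses.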
2021-05-28 13:45:47 +08:00
c844e602a7 [BUG] Fix the bug of Desc Query cause Unknown error and some doc revise (#5921) 2021-05-28 11:21:31 +08:00
aa17d40865 [Bug-fix] Update correct data partition of fragment which contains Repeat Node (#5910)
The Repeat Node will change the data partition of fragment
  when the origin data partition of fragment is HashPartition.
The Repeat Node will generate some new rows.
The distribution of these new rows is completely inconsistent with the original data distribution,
  their distribution is RANDOM.

If the data distribution is not corrected,
  an error will occur when the agg node determines whether to perform colocate.
Wrong data distribution will cause the agg node to think that agg can be colocated,
  leading to wrong results.
For example, the following query can not be colocated although the distributed column of table is k1:
```
SELECT k1, k2, SUM( k3 )
FROM table
GROUP BY GROUPING SETS ( (k1, k2), (k1), (k2), ( ) )
```
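
Why the repeated rows break the hash distribution can be seen in a small simulation. This is a simplified sketch: it uses a toy hash (`k1 % 2`), two nodes, and omits the grouping id that a real GROUPING SETS plan carries.

```python
from typing import Dict, List, Optional, Set, Tuple

Row = Tuple[Optional[int], Optional[int], int]  # (k1, k2, k3)

GROUPING_SETS = [("k1", "k2"), ("k1",), ("k2",), ()]


def repeat(rows: List[Row]) -> List[Row]:
    """Emit one copy of each row per grouping set, nulling unused keys."""
    out: List[Row] = []
    for k1, k2, k3 in rows:
        for cols in GROUPING_SETS:
            out.append((k1 if "k1" in cols else None,
                        k2 if "k2" in cols else None,
                        k3))
    return out


# Input rows are hash-distributed on k1 across two nodes.
rows: List[Row] = [(1, 10, 100), (2, 10, 200), (3, 20, 300), (4, 20, 400)]
nodes: Dict[int, List[Row]] = {0: [], 1: []}
for row in rows:
    nodes[row[0] % 2].append(row)

# Group keys that each node would see after the repeat step.
keys_per_node: Dict[int, Set[tuple]] = {
    n: {(k1, k2) for k1, k2, _ in repeat(rs)} for n, rs in nodes.items()
}
# Keys present on both nodes cannot be aggregated locally (no colocate).
split_keys = keys_per_node[0] & keys_per_node[1]
```

Every group whose `k1` was nulled out, such as `(None, 10)`, appears on both nodes, so a node-local aggregation would return partial sums for it.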
2021-05-27 22:06:10 +08:00
ce3ae764e5 [Bug] Bucket Shuffle Join may cause:Failed to send brpc batch, Not connected to 0.0.0.0:0 (#5901) 2021-05-27 22:05:15 +08:00
d958bbedc9 [Bug] Fix the problem that the result of query from the view is incorrect (#5860) (#5897)
Fix an issue where the priority of CompoundPredicates in created view does not match the expectation.
2021-05-27 22:04:33 +08:00
ba38973209 use virtual hosted-style request to access object store (#5894)
* use virtual hosted-style access request object store
2021-05-27 15:52:07 +08:00
510606ddd4 [DynamicPartition] Support specifying hot data partition (#5877)
In some scenarios, when users use dynamic partitions, they hope to use Doris' hierarchical storage
function at the same time.
For example, for the dynamic partition rule of partitioning by day, we hope that the partitions of the last 3 days
are stored on the SSD storage medium and automatically migrated to the HDD storage medium after expiration.

This CL adds a new dynamic partition property: "hot_partition_num".
This parameter is used to specify how many recent partitions need to be stored on the SSD storage medium.
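
As a rough illustration of the rule, the sketch below assigns a storage medium to daily partitions. The partition naming scheme, the choice to count today's partition as hot, and the function name are assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Dict


def storage_medium_by_day(today: date, history_days: int,
                          hot_partition_num: int) -> Dict[str, str]:
    """Map each daily partition name to 'SSD' or 'HDD'."""
    plan: Dict[str, str] = {}
    for age in range(history_days):
        day = today - timedelta(days=age)
        name = "p" + day.strftime("%Y%m%d")  # assumed naming scheme
        # The most recent hot_partition_num partitions stay on SSD.
        plan[name] = "SSD" if age < hot_partition_num else "HDD"
    return plan


plan = storage_medium_by_day(date(2021, 5, 26), history_days=5,
                             hot_partition_num=3)
```

With `hot_partition_num=3`, the partitions for May 24 to May 26 map to SSD and the two older ones to HDD.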
2021-05-26 10:00:24 +08:00
0b12218022 [Log] Change unauthorized access log to debug level (#5873) 2021-05-26 09:59:29 +08:00
ba69f7a7c8 [Command] [SQL] Add show database/table/partition id command (#5807)
In BE, when a problem happens, the log contains the database id, table id, and partition id,
but not the database name, table name, or partition name.

In FE, there is also no way to find the database name/table name/partition name according to
the database id/table id/partition id. Therefore, this patch adds 3 new commands:

1. show database id;
mysql> show database 10002;
+----------------------+
| DbName               |
+----------------------+
| default_cluster:test |
+----------------------+

2. show table id;
mysql> show table 11100;
+----------------------+-----------+-------+
| DbName               | TableName | DbId  |
+----------------------+-----------+-------+
| default_cluster:test | table2    | 10002 |
+----------------------+-----------+-------+

3. show partition id;
mysql> show partition 11099;
+----------------------+-----------+---------------+-------+---------+
| DbName               | TableName | PartitionName | DbId  | TableId |
+----------------------+-----------+---------------+-------+---------+
| default_cluster:test | table2    | p201708       | 10002 | 11100   |
+----------------------+-----------+---------------+-------+---------+
2021-05-26 09:58:02 +08:00
d0ca7b037c [Bug] NULL value in where and on clause should return EmptySetNode (#5872) 2021-05-24 12:32:59 +08:00
76eca9de56 [Bug] Kill the FE process when writing BDBJE journal failed (#5861)
1. When an oom error occurs when writing bdbje, catch the error and exit the process.
2. Increase the timeout period of bdbje replica ack and change it to a configuration.
2021-05-22 23:38:47 +08:00
07ad038870 [Feature][RoutineLoad] Support for consuming kafka from the point of time (#5832)
Support starting consumption from a specified point in time, instead of a specific offset, when creating a kafka routine load.
e.g.:
```
FROM KAFKA
(
    "kafka_broker_list" = "broker1:9092,broker2:9092",
    "kafka_topic" = "my_topic",
    "property.kafka_default_offsets" = "2021-10-10 11:00:00"
);

or

FROM KAFKA
(
    "kafka_broker_list" = "broker1:9092,broker2:9092",
    "kafka_topic" = "my_topic",
    "kafka_partitions" = "0,1,2",
    "kafka_offsets" = "2021-10-10 11:00:00, 2021-10-10 11:00:00, 2021-10-10 12:00:00"
);
```

This PR also refactors the analysis of properties when creating or altering
routine load jobs, and unifies the analysis process in the `RoutineLoadDataSourceProperties` class.
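
One way to picture the property analysis is a parser that accepts either a numeric offset or a datetime string and turns the latter into an epoch timestamp in milliseconds, the unit Kafka clients typically use for time-based offset lookups. The function names, the returned tuple shape, and the explicit time zone argument are assumptions for this sketch, not the behavior of `RoutineLoadDataSourceProperties`.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def parse_kafka_offset(value: str, tz: timezone) -> Tuple[str, int]:
    """Classify one offset value as a plain offset or a timestamp (ms)."""
    text = value.strip()
    try:
        return ("offset", int(text))
    except ValueError:
        pass  # not an integer, so try the datetime form
    moment = datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return ("timestamp_ms", int(moment.timestamp() * 1000))


def parse_kafka_offsets(prop: str, tz: timezone) -> List[Tuple[str, int]]:
    """Split a comma-separated 'kafka_offsets' value and parse each item."""
    return [parse_kafka_offset(item, tz) for item in prop.split(",")]


utc = timezone.utc
cst = timezone(timedelta(hours=8))  # +08:00, as in the commit dates
parsed = parse_kafka_offsets(
    "2021-10-10 11:00:00, 2021-10-10 11:00:00, 2021-10-10 12:00:00", utc)
```

The same wall-clock string yields a timestamp 8 hours earlier when parsed as +08:00 than as UTC, which is why the sketch takes the time zone explicitly.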
2021-05-22 23:37:53 +08:00
5a06dba4f1 [Colocate plan][Step2] Colocate aggregation covers more situations (#5555)
The old colocate aggregation can only cover the case where the child is scan.
In fact, as long as the child's data distribution meets the requirements,
no matter what the plan node on the child node is, a colocate aggregation can be performed.

This PR also fixes the correct data partition attribute of fragment.
The data partition of fragment which contains scan node is Hash Partition rather than Random.
This modification is mainly to determine the possibility of colocate
through the correct distribution of child fragments.
2021-05-22 23:36:49 +08:00
d4cc5735ac [Bug-fix][Backup] Modify the persistence logic of backup and restore (#5853)
The expose annotation is used in the persistence logic of the old backup and restore.
This annotation is meant to make some variables be ignored during serialization and deserialization.
However, it was used incorrectly, and gson did not ignore the variables that should have been ignored.
This results in duplicate initialization when FE is restarted.

This PR uses the Doris-wrapped Gson directly, which eliminates the use of the expose annotation.
It fixes sortedTabletInfoList being repeatedly initialized, which resulted in incorrect numbers.

Fixed #5852
2021-05-21 12:23:07 +08:00
12e4ff2689 [Doc] Fix doc for 'SHOW EXPORT' (#5840) 2021-05-19 09:31:57 +08:00
3bee6c69ed [FE] [Refactor] Remove a useless MD5 calculation in PluginZip. (#5834)
Remove a useless MD5 calculation in PluginZip.
2021-05-19 09:30:05 +08:00
bf4443578c [Feature] Support show view statement for table (#5813)
Help to find all views which contain the given table
2021-05-19 09:29:00 +08:00
65ff464e3d [Feature] Support show data order by (#5770)
Currently, `show data` does not support sorting. When the number of tables increases, this is inconvenient to manage, so sorting needs to be supported.

like:
```
mysql>  show data order by ReplicaCount desc,Size asc;
+-----------+-------------+--------------+
| TableName | Size        | ReplicaCount |
+-----------+-------------+--------------+
| table_c   | 3.102 KB    | 40           |
| table_d   | .000        | 20           |
| table_b   | 324.000 B   | 20           |
| table_a   | 1.266 KB    | 10           |
| Total     | 4.684 KB    | 90           |
| Quota     | 1024.000 GB | 1073741824   |
| Left      | 1024.000 GB | 1073741734   |
+-----------+-------------+--------------+
```
2021-05-19 09:27:27 +08:00
8d74176970 [Optimize] Check invalid datetime to avoid scanning lots of partitions (#5643)
Support parsing date formats `'%Y-%m-%d %H:%i'` and `'%Y-%m-%d %H'`
Support handling date time with nanoseconds
2021-05-19 09:25:58 +08:00
add8c4bb74 [Load] Support reading multi-line json objects for JsonScanner (#5774)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-05-18 15:44:45 +08:00
4de21d604a [Debug Enhancement] Associate the old and new query id by retry (#5820)
When a query is retried, the FE log cannot quickly associate the new and old queries by query id.
This will increase the complexity of troubleshooting.
Modify the log printing logic of FE to associate the new and old query ids, and the print log looks like this:
Query {old_query_id} {retry_times} times with new query id: {new_query_id}
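
A minimal sketch of producing a line in that shape, using a hypothetical helper name and made-up ids:

```python
def format_retry_log(old_query_id: str, retry_times: int,
                     new_query_id: str) -> str:
    """Build the log line that links a retried query to its new id."""
    return (f"Query {old_query_id} {retry_times} times "
            f"with new query id: {new_query_id}")


line = format_retry_log("a1b2", 2, "c3d4")
```

Searching the log for either id then finds the line that ties the two attempts together.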
2021-05-17 11:53:21 +08:00
8610903924 abstract some codes to a new parse method for further scalability (#5800) 2021-05-17 11:52:45 +08:00
00a60ae5bb Enable http v2 (#5797)
* Solve the situation that the hardware information of the Web UI home page cannot be loaded

* Http v2 version is enabled by default

Co-authored-by: zhangjf@shuhaisc.com <zhangfeng800729>
2021-05-17 11:51:41 +08:00
9d25bfe980 [Bug] Fix bug that database not found when replaying batch transaction remove log (#5815)
* [Bug] Fix bug that database not found when replaying batch transaction remove log

[GlobalTransactionMgr.replayBatchRemoveTransactions():353] replay batch remove transactions failed. db 0
org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = databaseTransactionMgr[0] does not exist
        at org.apache.doris.transaction.GlobalTransactionMgr.getDatabaseTransactionMgr(GlobalTransactionMgr.java:84) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.transaction.GlobalTransactionMgr.replayBatchRemoveTransactions(GlobalTransactionMgr.java:350) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:601) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2452) [palo-fe.jar:3.4.0]
        at org.apache.doris.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:101) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]

The id of the information_schema database is 0, and it has no txn at all.
2021-05-17 11:50:46 +08:00
e7a6d659a9 [Optimize] Use BufferedOutputStream to speed up checkpoint (#5802)
Use BufferedOutputStream to speed up checkpoint
2021-05-13 22:34:23 +08:00
7c9396b760 [Bug-fix] Fix BE core dump when user specifies HLL or BITMAP as operand of BinaryPredicate (#5799)
Fix BE core dump when the user specifies an HLL or BITMAP type as an operand of BinaryPredicate.
2021-05-13 22:33:57 +08:00
0c83e43a67 [Optimize] Optimize profile lock conflict and view profile while query is executing (#5762)
1. Reduce lock conflicts in RuntimeProfile of BE;
2. Allow viewing the query profile while the query is executing;
3. Reduce wait time for 'show proc /current_queries'.
2021-05-13 22:33:26 +08:00
bdd2a6d055 [Optimize] Make array config readable (#5780) 2021-05-12 11:01:07 +08:00
55ca52a42d [Bug] Fix bug that Drop olap table may introduce some problems when table's state is not normal (#5712)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-05-12 10:38:23 +08:00