Commit Graph

3509 Commits

Author SHA1 Message Date
dcad6ff5e5 [License] Add License header for missing files (#7130)
1. Add License header for missing files
2. Modify the spark pom.xml to correct the location of `thrift`
2021-11-16 18:37:54 +08:00
5710cf8feb [Blog] Example of binlog load usage (#7080)
Example of binlog load usage
2021-11-16 12:12:44 +08:00
5b01f7bba2 [Feature] Support query hive table (#6569)
Users can directly query the data in the hive table in Doris, and can use join to perform complex queries without laboriously importing data from hive.

The main changes are listed below:

FE:

Extend HiveScanNode from BrokerScanNode.
HiveMetaStoreClientHelper communicates with Hive and HDFS.

BE:

Treat HiveScanNode as BrokerScanNode and HiveTable as BrokerTable.
broker_scanner.cpp: support reading columns from the HDFS path.
orc_scanner.cpp: support reading HDFS files.

POM:

Add hive.version=2.3.7, hive-metastore and hive-exec
Add hadoop.version=2.8.0, hadoop-hdfs
Upgrade commons-lang to fix incompatibility with Java 9 and later.

Thrift:

Add THiveTable
Add read_by_column_def in TBrokerRangeDesc
2021-11-16 11:59:07 +08:00
e476e155c6 [Bug] Fix inappropriate Cmake option used to build ZSTD (#7111) 2021-11-16 10:04:16 +08:00
ccb1ea801a [Refactor] logger error in BDBStateChangeListener.java (#7101)
The logger in BDBStateChangeListener.java should be created with BDBStateChangeListener.class instead of EditLog.class.
2021-11-16 10:03:53 +08:00
5aaf24bf55 [Compile] Remove unused import (#7112) 2021-11-15 11:57:35 +08:00
8b557c0e70 [Refactor] Refactor code of sequence column (#7007) 2021-11-15 11:10:45 +08:00
896a08cbcf [Enhancement] Add thread id in BE log (#6891)
Add the thread id to the BE log in order to quickly find the query id that caused the BE to crash with a segmentation fault.
See #6890
2021-11-14 18:52:01 +08:00
85fd05a8ae website bugfix (#7103)
website bugfix
2021-11-13 18:10:22 +08:00
d4c0156e0f [Doc] REPLACE_IF_NOT_NULL document modification (#7100)
REPLACE_IF_NOT_NULL document modification
2021-11-13 17:11:20 +08:00
7db90cb6ac [Build] Openssl development package (#7088)
Ubuntu: libssl-dev
RedHat/CentOS: openssl-devel
2021-11-13 17:11:05 +08:00
11cca0b15d [JoinReorder] Add session variable to close join order (#7076)
The new session variable 'close_join_reorder' is used to turn off all automatic join reorder algorithms.
If close_join_reorder is true, Doris will execute the joins in the order in which they appear in the original query.
2021-11-13 17:10:44 +08:00
88651a47c7 [Feature] Support String type in Flink and Spark connectors (#7075)
Support String type for Flink and Spark connector
2021-11-13 17:10:22 +08:00
ed61055912 [SparkConnector] Add thrift dir for spark connector (#7074)
Add thrift dir for spark connector, to fix error when building spark-doris-connector
2021-11-13 17:09:52 +08:00
93ccef4ec7 [Feature] Add degradation strategy for local_replica_selection. (#7064)
When local_replica_selection is turned on, support selecting a non-local BE to serve the query when the local BE is unavailable.
2021-11-13 17:09:25 +08:00
4f245952cb [Github] Guiding users to use mailing lists (#7099)
Guide users to use the mailing list in the github issue templates.
2021-11-13 16:56:14 +08:00
3d8166504a [Alter] Support alter table engine type from MySQL to ODBC (#6993)
Support alter table engine type from MySQL to ODBC:

```
ALTER TABLE tbl MODIFY ENGINE TO odbc PROPERTIES("driver" = "odbc");
```
2021-11-12 15:12:41 +08:00
f93dae98e4 [Doc] Reorganize documents (#7093)
1. Migrate some of the best practice articles to the Blog
2. Changed the names of performance tests and best practices to performance tests and examples
2021-11-12 12:05:10 +08:00
c7e9430432 [Optimize] HLL optimization: trace memory usage, allocate explicit data only when really needed (#6971)
1. Reduce the memory occupied by HLL:
    replace uint64_t _explicit_data[1602] with uint64_t
    allocate memory for explicit data only when it is really needed
2. Trace HLL memory usage
2021-11-12 11:35:06 +08:00
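The lazy allocation described in the HLL commit above (#6971) can be pictured with a minimal sketch. It is not the Doris implementation: the class name `LazyExplicitBuffer` and its methods are hypothetical, and only the capacity of 1602 entries is taken from the commit message. The idea is that a fixed inline array costs its full size in every object, while a heap buffer allocated on first insert costs nothing until it is used.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical illustration only; names are not taken from the Doris code base.
class LazyExplicitBuffer {
public:
    // Capacity quoted in the commit message (_explicit_data[1602]).
    static constexpr std::size_t kCapacity = 1602;

    // Appends a hash value; the backing array is allocated on the first call.
    // Returns false when the buffer is already full.
    bool insert(std::uint64_t hash) {
        if (count_ >= kCapacity) {
            return false;
        }
        if (!data_) {
            data_ = std::make_unique<std::uint64_t[]>(kCapacity);
        }
        data_[count_++] = hash;
        return true;
    }

    std::size_t size() const { return count_; }

    // Heap bytes currently held for explicit data, usable for memory tracing.
    std::size_t memory_bytes() const {
        return data_ ? kCapacity * sizeof(std::uint64_t) : 0;
    }

private:
    std::unique_ptr<std::uint64_t[]> data_;  // null until the first insert
    std::size_t count_ = 0;
};
```

An empty `LazyExplicitBuffer` reports zero bytes from `memory_bytes()`, and the full array appears only after the first `insert`.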
c55e7221dc [Bug] Fix bug when using tableId to get table in publish version (#7091)
If table has been dropped when finishing txn, skip it.
2021-11-12 10:56:33 +08:00
9692131abc [BUG] Fix CacheAnalyzer's bug when aggregate column contains expression. (#7085)
When partition_cache is enabled and the query's aggregate columns contain an expression,
CacheAnalyzer may throw an exception and cause the query to fail.
2021-11-12 10:54:24 +08:00
890bcdf606 [Feature] Clean up old sync jobs regularly (#7061)
#7060
#6287

Each job that has been stopped for more than 3 days (set with Config.label_keep_max_second)
will be permanently cleaned up.
2021-11-12 10:53:50 +08:00
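The retention rule in the sync job cleanup commit above (#7061) comes down to a timestamp comparison. The sketch below is only an illustration under stated assumptions: the function `should_clean_sync_job` and the constant `kThreeDaysSec` are hypothetical names, timestamps are assumed to be Unix seconds, and the real logic lives in the FE rather than in C++.

```cpp
#include <cstdint>

// Hypothetical helper, not Doris code: decides whether a stopped sync job
// is old enough to be removed permanently.
// stopped_at_sec and now_sec are assumed to be Unix timestamps in seconds;
// keep_max_sec plays the role of Config.label_keep_max_second.
bool should_clean_sync_job(std::int64_t stopped_at_sec,
                           std::int64_t now_sec,
                           std::int64_t keep_max_sec) {
    // "More than" the retention period: a job exactly at the limit is kept.
    return now_sec - stopped_at_sec > keep_max_sec;
}

// The 3 days mentioned in the commit message, expressed in seconds.
constexpr std::int64_t kThreeDaysSec = 3 * 24 * 60 * 60;
```

With a retention of `kThreeDaysSec`, a job stopped at time 0 is kept at second 259200 and becomes eligible for cleanup one second later.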
d751937828 [Optimize] Optimize mem_tracker (#6988)
1. Optimize HighWaterMarkCounter::add(): call `UpdateMax()` only if delta is greater than 0,
to reduce the number of function calls

2. Delete useless code to keep MemTracker clean
    Some member variables are never set but their values are still checked; those if conditions can never be met, so the code is removed
2021-11-12 10:51:45 +08:00
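The first point of the mem_tracker commit above (#6988) relies on the fact that only a positive delta can raise a high-water mark. A minimal sketch of that idea follows; the class `HighWaterMarkSketch` and its members are hypothetical stand-ins, not the actual `HighWaterMarkCounter` from Doris, and the use of `std::atomic` here is an assumption made for the illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a high-water-mark counter; not the Doris class.
class HighWaterMarkSketch {
public:
    void add(std::int64_t delta) {
        std::int64_t new_val =
                current_.fetch_add(delta, std::memory_order_relaxed) + delta;
        // A zero or negative delta cannot produce a new maximum,
        // so the update is skipped in that case.
        if (delta > 0) {
            update_max(new_val);
        }
    }

    std::int64_t current() const { return current_.load(std::memory_order_relaxed); }
    std::int64_t peak() const { return max_.load(std::memory_order_relaxed); }

private:
    // Raises max_ to v if v is larger, retrying when another thread interferes.
    void update_max(std::int64_t v) {
        std::int64_t old_max = max_.load(std::memory_order_relaxed);
        while (v > old_max &&
               !max_.compare_exchange_weak(old_max, v, std::memory_order_relaxed)) {
            // old_max has been refreshed by compare_exchange_weak; loop re-checks it.
        }
    }

    std::atomic<std::int64_t> current_{0};
    std::atomic<std::int64_t> max_{0};
};
```

Releasing memory (a negative delta) then costs a single atomic add, while the compare-and-swap loop runs only on allocations.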
795d549eb3 [Proc] Add stream load info to system info of web site (#6970)
#6969
2021-11-12 10:44:09 +08:00
553cf8fa0c [Bug] Fix inaccurate check of whether an RLE page is full (#6934)
Fix #5957
2021-11-12 10:42:13 +08:00
047b83b987 [Optimize][Set Operation Node] Reduce the memory expansion operation of the hash table in ExceptNode and IntersectNode (#6915)
Reduce the memory expansion operation of the hash table in ExceptNode and IntersectNode
2021-11-12 10:39:59 +08:00
35da149ebe [SparkDpp]Add not() and xor() methods to bitmapValue (#6885)
Add not() and xor() methods to bitmapValue
2021-11-12 10:38:15 +08:00
6674af6001 [BUG] Fix streaming_preagg DCHECK that causes BE to go down (#6873)
In debug mode, when query memory is insufficient, the BE may go down.
The FE sets useStreamingPreagg to true, but the BE function CreateHashPartitions checks that is_streaming_preagg_ is false.

This then causes a core dump:
```
*** Check failure stack trace: ***
    @          0x2aa48ad  google::LogMessage::Fail()
    @          0x2aa6734  google::LogMessage::SendToLog()
    @          0x2aa43d4  google::LogMessage::Flush()
    @          0x2aa7169  google::LogMessageFatal::~LogMessageFatal()
    @          0x24703be  doris::PartitionedAggregationNode::CreateHashPartitions()
    @          0x2468fd6  doris::PartitionedAggregationNode::open()
    @          0x1e3b153  doris::PlanFragmentExecutor::open_internal()
    @          0x1e3af4b  doris::PlanFragmentExecutor::open()
    @          0x1d81b92  doris::FragmentExecState::execute()
    @          0x1d840f7  doris::FragmentMgr::_exec_actual()
```

We should remove DCHECK(!is_streaming_preagg_).
2021-11-12 10:37:46 +08:00
667e8bdce3 [Bug] Fix NumberFormatException for partition cache (#6846)
Fix #6845
2021-11-12 10:36:58 +08:00
0ae6e92dd4 [Build] fix unused import (#7094)
Remove unused import introduced by #7065.
2021-11-11 19:59:43 +08:00
c9023acca4 [Bug] Use object to replace pointer to avoid BE crash (#7024)
use `NodeInfo _node_info` to replace `NodeInfo *_node_info`
2021-11-11 17:58:58 +08:00
4dd77f602f [Bug] Fix bug that NPE thrown when adding partition for table with MV (#7069)
The `defineExpr` in `Column` must be analyzed before calling its `treeToThrift` method.
And for CreateReplicaTask, there is no need to set `defineExpr` in TColumn.
2021-11-11 15:43:16 +08:00
108914db92 [Log] fix log error for ActionController (#7065) 2021-11-11 15:42:57 +08:00
8e9f36877c [Compile] Fix spark-connector compile problem (#7048)
Use `thrift` in thirdparty
2021-11-11 15:42:30 +08:00
58804d3570 [Colocate] Fix bug that colocate group can not be redistributed after dropping a backend (#7020)
Main changes:

1. Fix [Bug] Colocate group can not be redistributed after dropping a backend #7019
2. Add detailed messages about why a colocate group is unstable.
3. Add more suggestions for upgrading a Doris cluster.
2021-11-11 15:41:49 +08:00
cf085b8b1a [RoutineLoad] Add "runningTxns" field in SHOW ROUTINE LOAD result (#6986)
Add a new field `runningTxns` in the result of `SHOW ROUTINE LOAD`, e.g.:

```
                  Id: 11001
                Name: test4
          CreateTime: 2021-11-02 00:04:54
           PauseTime: NULL
             EndTime: NULL
              DbName: default_cluster:db1
           TableName: tbl1
               State: RUNNING
      DataSourceType: KAFKA
      CurrentTaskNum: 1
       JobProperties: {xxx}
    CustomProperties: {"kafka_default_offsets":"OFFSET_BEGINNING","group.id":"test4"}
           Statistic: {"receivedBytes":6,"runningTxns":[1001, 1002],"errorRows":0,"committedTaskNum":1,"loadedRows":2,"loadRowsRate":0,"abortedTaskNum":13,"errorRowsAfterResumed":0,"totalRows":2,"unselectedRows":0,"receivedBytesRate":0,"taskExecuteTimeMs":20965}
            Progress: {"0":"10"}
ReasonOfStateChanged:
        ErrorLogUrls:
            OtherMsg:
```

This lets users view the status of the job's corresponding transactions by executing `show transaction where id=xx`;
2021-11-11 15:41:13 +08:00
632f8fcc75 [libhdfs] Add errno for HDFS writer; create the directory when it is missing instead of failing to open (#7050)
1. Add an errno message when the HDFS writer fails.
2. When calling openWrite for HDFS, the directory is created if it doesn't exist.
2021-11-11 15:21:21 +08:00
c47beb4d3a [Website][Docs]Add author field to blog (#7086)
* Add author field to blog

Co-authored-by: 943155336 <wangyongfeng>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
2021-11-11 10:23:44 +08:00
4c6cbdf463 [Bug] Fix version nav button loaded multiple times in docs website header (#7062)
* Fix version nav button loaded multiple times

Co-authored-by: 943155336 <wangyongfeng>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
2021-11-09 18:23:44 +08:00
088a16d33b Chinese annotation modification (#6958)
* Modify Chinese comment (#6951)
2021-11-09 18:00:14 +08:00
906c305a19 [Bug] Fix docs website home page last news icon loading failure (#7057)
* Fix last news icon loading failure

Co-authored-by: 943155336 <wangyongfeng>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
2021-11-09 17:34:42 +08:00
5d946ccd5e [Docs] Add hdfs outfile example (#7052) 2021-11-09 10:02:28 +08:00
b54a12ef11 [Build]Compile and output the jar file, add Spark, Flink version and Scala version (#7051)
The names of the jar files built for the Flink and Spark connectors now include the corresponding Flink or Spark version
and the Scala version used at compile time, so that users can check whether the versions match when using them.

Example of output file name: doris-spark-1.0.0-spark-3.2.0_2.12.jar
2021-11-09 10:02:08 +08:00
34637589c5 [Website][Doc] Add the sharing blog function to the document site (#7047)
Add the sharing blog function to the document site, including the blog list and detail page. At the same time, a guide on how to share blogs has been added to the developer guide.
2021-11-09 10:01:23 +08:00
Pxl
fc62090558 [Bug] fix Log tags empty reference core dump (#7043)
key may have been destructed when key reference is called.
2021-11-09 10:00:08 +08:00
8ba2d79fe1 [Bug] Change DateTimeValue Memory Layout To Old (#7022)
Change the DateTimeValue memory layout back to the old one to fix compatibility problems
2021-11-08 21:56:14 +08:00
9c12060db3 [Compile] Fix FE compile problem (#7029)
Co-authored-by: morningman <chenmingyu@baidu.com>
2021-11-08 10:35:49 +08:00
Pxl
29ca77622f [Refactor] Refactor part of RuntimeFilter's code (#6998)
#6997
2021-11-07 17:40:45 +08:00
9b1a80114e [Bug] Fix some return logic error in init BE encoding_map (#6936)
In the original code, checking _encoding_map and returning early prevents some encoding methods from being added to default_encoding_type_map_ or value_seek_encoding_map_ in the EncodingInfoResolver constructor.
E.g.:
EncodingInfoResolver::EncodingInfoResolver() {
....
    _add_map<OLAP_FIELD_TYPE_BOOL, PLAIN_ENCODING>();
    _add_map<OLAP_FIELD_TYPE_BOOL, PLAIN_ENCODING, true>();
...
}
The second line of code is ineffective.
2021-11-07 17:40:18 +08:00
ca8268f1c9 [Feature] Extend logger interface, support structured log output (#6600)
Support structured logging.
2021-11-07 17:39:53 +08:00