Commit Graph

324 Commits

Author SHA1 Message Date
f1db289934 Fix compile error (#461) 2018-12-24 10:26:03 +08:00
90d71508ff Add UserFunctionCache to cache UDF's library (#453)
* Add UserFunctionCache to cache UDF's library

This patch replaces LibCache with UserFunctionCache. LibCache used an HDFS
URL to identify a UDF's library, so when the BE process restarted, all
downloaded libraries had to be downloaded again. Now a function id is
associated with each library, and when the process restarts, all downloaded
libraries can be loaded without downloading them again. A sketch of the
caching idea follows this entry.

* update
2018-12-21 22:07:21 +08:00
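A minimal, self-contained sketch of the caching idea above: key downloaded UDF libraries by function id so a restarted process can reuse what is already on local disk. The actual BE implementation is not reproduced here; the class shape, file naming, and method names below are illustrative assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only: cache UDF libraries by function id so a restarted
// process can re-register libraries already on disk instead of re-downloading.
public class UserFunctionCache {
    private final Path libDir;  // local directory holding downloaded libraries
    private final ConcurrentMap<Long, Path> loaded = new ConcurrentHashMap<>();

    public UserFunctionCache(Path libDir) {
        this.libDir = libDir;
    }

    // On startup, re-register every library already on disk; no re-download needed.
    public void loadExisting() throws Exception {
        try (var stream = Files.list(libDir)) {
            stream.filter(p -> p.toString().endsWith(".so"))
                  .forEach(p -> loaded.put(functionIdFromFileName(p), p));
        }
    }

    // Return the local path for a function's library, downloading only on a miss.
    public Path getOrDownload(long functionId, String sourceUrl) {
        return loaded.computeIfAbsent(functionId, id -> downloadLibrary(id, sourceUrl));
    }

    private long functionIdFromFileName(Path p) {
        // assumes files are named "<functionId>.so"
        String name = p.getFileName().toString();
        return Long.parseLong(name.substring(0, name.length() - ".so".length()));
    }

    private Path downloadLibrary(long functionId, String sourceUrl) {
        // placeholder: fetch sourceUrl and store it as "<functionId>.so" under libDir
        return libDir.resolve(functionId + ".so");
    }
}
```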
0341ffde67 Revert commit 'Add log to detect empty load file' (#447)
It looks like we still need to send push tasks without a file
to the Backend, or the load job will fail.
Fix it later.
2018-12-19 12:29:12 +08:00
5a6e5cfd07 Add log to detect empty load file (#445)
We found that a load file may not be generated for a rollup tablet;
add a log to observe this.
2018-12-18 12:44:36 +08:00
b9201ece0b Parse thrift port from cluster state (#443) 2018-12-18 11:28:42 +08:00
e99468c387 Add HttpClient class (#441)
Replace FileDownloader with HttpClient. This patch changes clone_copy and
the pusher's download.
2018-12-18 09:55:11 +08:00
e177c23787 Update the increased frequency of priority for remaining tasks in BlockingPriorityQueue (#434) 2018-12-17 21:05:57 +08:00
dc4cbab11e Report error when loading decimal value with scientific notation (#428)
Currently we do not support scientific notation for decimal values.
2018-12-17 21:04:18 +08:00
d291068b37 Fix memory_copy core dump in aggregate (#440) 2018-12-17 19:02:49 +08:00
7f014bdb11 Check meta context when update partition version (#438)
Partition.updateVisibleVersionAndVersionHash() is the only method that
may call Catalog.getCurrentCatalogJournalVersion() in a non-replay thread.

So we have to check whether MetaContext is null. If MetaContext is null, this is
a non-replay thread and we do not need to call Catalog.getCurrentCatalogJournalVersion().

Also modify the load logic to make delete jobs finish more quickly.
2018-12-17 18:46:27 +08:00
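A tiny, runnable illustration of the rule described above: a null MetaContext means a non-replay thread, so the journal-version lookup is skipped. It uses a plain ThreadLocal stand-in rather than the real FE classes; every name here is assumed for illustration.

```java
// Minimal self-contained sketch of the rule "null MetaContext => non-replay thread".
// The ThreadLocal below is a stand-in for the FE's MetaContext.
public class MetaContextCheckSketch {

    static final ThreadLocal<Integer> META_CONTEXT = new ThreadLocal<>(); // holds a journal version

    static void updatePartitionVersion(long version) {
        Integer journalVersion = META_CONTEXT.get();
        if (journalVersion == null) {
            // non-replay thread: no MetaContext was set, skip the journal-version lookup
            System.out.println("non-replay thread, applying version " + version);
        } else {
            // replay thread: behavior may depend on the journal version being replayed
            System.out.println("replay thread at journal version " + journalVersion
                    + ", applying version " + version);
        }
    }

    public static void main(String[] args) {
        updatePartitionVersion(10);   // no context set: non-replay path
        META_CONTEXT.set(45);
        updatePartitionVersion(11);   // context set: replay path
    }
}
```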
45e42bd003 Redesign the access to meta version (#436)
Because the meta version is only used when saving and loading the catalog,
this version is currently a field of the Catalog class, and we can get it
only by calling Catalog.getCurrentCatalogJournalVersion().

But in the restore process, we need to read metadata that was saved with
a specific meta version. So we need a flexible way to read metadata
using a specified meta version, not only the version from Catalog.

So we create a new class called MetaContext. Currently it has only one field,
'journalVersion', which stores the current journal version. It is a
thread-local variable, so we can create a MetaContext anywhere we want
and set the 'journalVersion' we want to use for reading meta.

Currently, there are 4 threads related to metadata saving and loading:

1. The Frontend starting thread, which calls Catalog.initialize() to load the image.
2. The Frontend state listener thread, which listens for state changes and calls
   transferToMaster() or transferToNonMaster().
3. The edit log replay thread, which is created when transferToNonMaster() is called.
   It replays the edit log.
4. The checkpoint thread, which is created when transferToMaster() is called. It does
   the checkpoint periodically.
Notice that we get the 'current meta version' only when READING the meta (not when
WRITING), so we only need to take care of the READING threads.
We create a MetaContext thread-local variable for these 4 threads, and the meta
contexts of threads 2, 3 and 4 inherit from thread 1's meta context, because thread 1
loads the original image file and gets the very first meta version.

We leave the name of Catalog.getCurrentCatalogJournalVersion() unchanged and only
change its implementation, because we don't want to change a lot of code this time.

On the other hand, we add the current meta version to the backup job info file when
doing a backup job, so that when restoring from a backup snapshot, we know which meta
version to use for reading the meta.
We also add a new property "meta_version" to the Restore stmt, so that we can specify
the meta version used for reading backup meta. It is for old backup snapshots
which do not have the meta version saved in the backup job info file.
2018-12-17 10:05:16 +08:00
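A small, runnable sketch of the thread-local MetaContext described above: the journal version lives in a per-thread context, and threads created later (the listener, replay, and checkpoint threads in the description) inherit the context set by the starting thread. Only the class name MetaContext and the 'journalVersion' field come from the commit message; the accessors and the use of InheritableThreadLocal are assumptions for illustration.

```java
// Sketch of a thread-local meta context whose journal version is inherited by child threads.
public class MetaContext {
    // InheritableThreadLocal lets threads spawned later see the starting thread's context.
    private static final InheritableThreadLocal<MetaContext> CURRENT = new InheritableThreadLocal<>();

    private int journalVersion;

    public static void setCurrent(MetaContext ctx) { CURRENT.set(ctx); }
    public static MetaContext getCurrent()         { return CURRENT.get(); }

    public int getJournalVersion()       { return journalVersion; }
    public void setJournalVersion(int v) { this.journalVersion = v; }

    public static void main(String[] args) throws InterruptedException {
        // Thread 1 (the "starting" thread) loads the image and records the meta version.
        MetaContext ctx = new MetaContext();
        ctx.setJournalVersion(45);
        MetaContext.setCurrent(ctx);

        // A "replay"-style child thread inherits the same context and reads its version.
        Thread replay = new Thread(() ->
                System.out.println("replay sees journal version "
                        + MetaContext.getCurrent().getJournalVersion()));
        replay.start();
        replay.join();
    }
}
```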
0bedd336b5 Support add new key column for LinkedSchemaChange (#432)
* Support add key column for LinkedSchemaChange

* Move ColumnMapping to single file
2018-12-14 20:25:25 +08:00
74cc5c5404 Write summary line in load error file anyway (#425)
The summary line should be written regardless of the error number limit.
2018-12-13 12:35:19 +08:00
850150896a Fix bug that load error log is empty sometimes (#424)
_num_print_error_rows is not initialized
2018-12-13 11:23:32 +08:00
371e3d18ca Fix base compaction bug. (#421)
Base compaction may currently choose a tablet that has missing versions.
After compaction, the tablet integrity check by versions will fail and the process will core dump.
Ignore such tablets when picking tablets for base_compaction.
2018-12-12 20:44:32 +08:00
e2bb86cf78 Add Md5Digest to util (#420) 2018-12-12 20:06:35 +08:00
842e943f56 Fix compaction and ingestion core (#417)
An error occurs when reading data during compaction and ingestion.
In that case, the two operations should stop and return an error.
2018-12-12 11:30:06 +08:00
548da0546a Fix compile error in run-fe-ut.sh (#415) 2018-12-11 17:46:13 +08:00
81ee15ed25 Fix compile failure in RLTaskTxnCommitAttachment (#414) 2018-12-11 16:00:07 +08:00
c403c4e999 Add JDK8 in the docker image (#413) 2018-12-11 15:42:07 +08:00
8913c23134 Fix compile failure in GlobalTransactionMgrTest (#412) 2018-12-11 13:53:38 +08:00
fc41842c18 Add a frontend interface for committing RoutineLoadTask (#368)
1. Add a needSchedulerTasksQueue in LoadManager: the RoutineLoadTaskScheduler will poll tasks from this queue and schedule them.
2. Add a frontend interface named rlTaskCommit: commit the txn, update the offset and renew a task for the same partitions.
3. Add an extra property in the transaction state: in rlTaskCommit, an extra property which looks like {"job_id": xxx, "progress": xxx}.
When the FE initializes routine load job meta from logs, all txn states related to a routine load job will be used to initialize the job's progress.

Add a TxnStateChangeListener interface for transactions (a sketch follows this entry):
1. onCommitted, onAborted and beforeAborted will be called for different types of txn.
2. RoutineLoadJob will update the job progress and renew a task in the onCommitted callback.
3. Add TxnStateChangeListener into TransactionState.
4. Setting the transactionState to committed will call the onCommitted callback if the callback is not null.
5. Setting the transactionState to aborted will call beforeAborted and onAborted.
6. beforeAborted in RoutineLoadJob will check whether there is a related task when the TxnStatusChangeReason is TIMEOUT. It prevents the abort when there is a related task by throwing a TransactionException.
7. Other abort reasons will not prevent the abort. onAborted will be called and the job state will be changed to paused.

Change extra to TxnCommitAttachment in TLoadTxnCommitRequest:
1. The KAFKA value of TTxnSourceType means that this is a routine load task commit, and the TRLTaskTxnCommitAttachment is the commit info of this task.
2. TRLTaskTxnCommitAttachment will be converted to RLTaskTxnCommitAttachment, which includes the progress of this task, the task id, numOfErrorData, etc.

Add a TxnCommitAttachment param to commitTransaction:
1. The TxnCommitAttachment will be updated in commitTransaction.
2018-12-11 11:06:25 +08:00
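A hedged, runnable sketch of the callback flow in the list above: committing a transaction fires onCommitted, while aborting first calls beforeAborted (which can veto by throwing TransactionException, e.g. for a TIMEOUT with a task still attached) and then onAborted. Only the interface and method names taken from the commit message are treated as given; everything else is simplified for illustration.

```java
// Sketch of the listener flow: commit fires onCommitted; abort first asks
// beforeAborted (which may veto by throwing) and then fires onAborted.
public class TxnListenerSketch {

    static class TransactionException extends Exception {
        TransactionException(String msg) { super(msg); }
    }

    interface TxnStateChangeListener {
        void onCommitted(long txnId);
        void beforeAborted(long txnId, String reason) throws TransactionException;
        void onAborted(long txnId, String reason);
    }

    static class TransactionState {
        private final long txnId;
        private final TxnStateChangeListener listener;   // may be null for ordinary txns

        TransactionState(long txnId, TxnStateChangeListener listener) {
            this.txnId = txnId;
            this.listener = listener;
        }

        void commit() {
            if (listener != null) {
                listener.onCommitted(txnId);              // e.g. update progress, renew task
            }
        }

        void abort(String reason) throws TransactionException {
            if (listener != null) {
                listener.beforeAborted(txnId, reason);    // may throw to prevent the abort
                listener.onAborted(txnId, reason);        // e.g. pause the job
            }
        }
    }

    public static void main(String[] args) throws TransactionException {
        TxnStateChangeListener job = new TxnStateChangeListener() {
            public void onCommitted(long id) { System.out.println("txn " + id + " committed"); }
            public void beforeAborted(long id, String r) throws TransactionException {
                if ("TIMEOUT".equals(r)) throw new TransactionException("task still running, keep txn " + id);
            }
            public void onAborted(long id, String r) { System.out.println("txn " + id + " aborted: " + r); }
        };
        new TransactionState(1, job).commit();
        new TransactionState(2, job).abort("OFFSET_OUT_OF_RANGE");
    }
}
```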
ac01da4984 Clear client pool when heartbeat failed (#408)
When a heartbeat fails, we should clear the connections cached
in the client pool, or we will get broken connections from the pool.
Since we don't have REOPEN logic (which would lead to ugly code),
a broken connection may cause an rpc to block and fail.
So clearing them all and recreating them when needed is a simple way to
resolve this problem.

We only clear connections in the backend and broker pools.
There is no need to clear the heartbeat pool, because heartbeats are so frequent
that those connections can be invalidated automatically.
2018-12-10 18:52:51 +08:00
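A minimal sketch of the strategy above, assuming a simple pool abstraction: on heartbeat failure, drop everything cached for the node and let later callers recreate connections on demand instead of adding REOPEN logic. The pool shape and names are illustrative, not the actual client-pool code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative pool: on heartbeat failure we drop all cached (possibly broken)
// connections; subsequent borrows simply create fresh ones on demand.
public class ClientPoolSketch {
    private final Deque<String> idleConnections = new ArrayDeque<>();

    synchronized String borrow(String host) {
        String conn = idleConnections.poll();
        return conn != null ? conn : ("new-connection-to-" + host);  // recreate on demand
    }

    synchronized void giveBack(String conn) {
        idleConnections.push(conn);
    }

    // Called when a heartbeat to the host fails: everything cached may be broken.
    synchronized void clearAll() {
        idleConnections.clear();
    }

    public static void main(String[] args) {
        ClientPoolSketch pool = new ClientPoolSketch();
        pool.giveBack(pool.borrow("be-1"));
        pool.clearAll();                          // heartbeat failed: drop cached connections
        System.out.println(pool.borrow("be-1"));  // a freshly created connection, not a stale one
    }
}
```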
5634d13dd2 Release push_lock when schema change failed (#407)
Release push_lock when schema change failed
2018-12-10 18:39:19 +08:00
b5737ee59a Refactor heartbeat logic (#403)
* Refactor heartbeat logic

Currently we only have Backend heartbeats. Without Frontend
or Broker heartbeats, we don't know the status of those nodes,
and thus can't do failover logic in some cases.

1. Add Frontend and Broker heartbeats.
    The Frontend heartbeat uses the BootstrapFinish HTTP REST API.
    The Broker heartbeat uses the ping() rpc.
2. All heartbeats are managed in HeartbeatMgr.
3. Rename BrokerAddress to FsBroker.
2018-12-10 14:41:12 +08:00
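A short, hedged sketch of the structure described above: one manager drives heartbeats for every node type, with the Frontend checked over HTTP and the Broker over a ping() rpc. Apart from the names HeartbeatMgr and FsBroker mentioned in the commit, the types and methods below are assumptions.

```java
import java.util.List;

// Sketch of the idea that a single manager drives heartbeats to every node type.
public class HeartbeatMgrSketch {

    interface HeartbeatTask {
        boolean beat();   // true if the node answered
    }

    static class FrontendHeartbeat implements HeartbeatTask {
        public boolean beat() { /* would call the bootstrap-finish REST API */ return true; }
    }

    static class BackendHeartbeat implements HeartbeatTask {
        public boolean beat() { /* would call the backend heartbeat rpc */ return true; }
    }

    static class BrokerHeartbeat implements HeartbeatTask {   // FsBroker in the real code
        public boolean beat() { /* would call the broker ping() rpc */ return true; }
    }

    // One pass over all nodes; a real manager would run this periodically and
    // record per-node status so failover decisions can use it.
    static void runOnce(List<HeartbeatTask> tasks) {
        for (HeartbeatTask task : tasks) {
            System.out.println(task.getClass().getSimpleName() + " alive=" + task.beat());
        }
    }

    public static void main(String[] args) {
        runOnce(List.of(new FrontendHeartbeat(), new BackendHeartbeat(), new BrokerHeartbeat()));
    }
}
```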
37636d38e4 Improve build scripts (#404) 2018-12-10 13:56:09 +08:00
530bdec020 Fix bug that null value will be loaded to non-nullable column (#401)
* Fix bug that null value will be loaded to non-nullable column

* Optimize performance
2018-12-06 19:55:55 +08:00
b4d89b19e8 Fix bug that ColumnType is no longer used (#400) 2018-12-06 19:19:29 +08:00
088a914e11 Support Colocate Join (#245) (#246)
* Support colocate join

Colocate join means two tables are distributed by the columns being joined,
so we can join them locally on each backend.

Colocate join requires no data movement and allows more concurrency.
2018-12-06 18:59:17 +08:00
e913e45343 Fix bug that null value will be loaded to non-nullable column (#397) 2018-12-06 10:09:34 +08:00
2622da1c5e Add 'fileNameOnly' param to broker's listPath method (#394) 2018-12-05 20:49:31 +08:00
7b2007f852 Revert 'Support 'NO_BACKSLASH_ESCAPES' sql_mode (#392) 2018-12-05 20:23:34 +08:00
5f9f01669b Fix building Docker image error (#390) 2018-12-05 18:27:50 +08:00
cb7e8ff2bb Fix compile failure in ScanNode (#384) 2018-12-04 16:51:48 +08:00
31d1630149 Support 'NO_BACKSLASH_ESCAPES' sql_mode (#382) 2018-12-04 11:33:04 +08:00
d9eb8a2ca1 Fix cast error in BrokerScanNode (#383) 2018-12-04 11:30:03 +08:00
6b4049e21c Unify Slice code path (#380) 2018-12-03 18:11:47 +08:00
603f4e0ca9 Fix a sending signal error when starting Doris BE (#367)
Redirect the output of kill to /dev/null.

Co-Authored-By: chalsliu <45041955+chalsliu@users.noreply.github.com>

ISSUE #365
2018-12-03 15:38:33 +08:00
ff95f23615 Remove OLAP_LOG_DEBUG AND OLAP_LOG_TRACE log format (#378)
Use VLOG(3) and VLOG(10) instead
2018-12-03 10:08:21 +08:00
fb8304123d Support _IP/_HOST in principal of kerberos (#373) 2018-12-03 10:03:57 +08:00
c556ed13f6 Support TRUNCATE TABLE stmt (#377)
* Support TRUNCATE TABLE stmt

Users can use the TRUNCATE TABLE stmt to empty a table
or partitions completely.
Unlike DELETE, it drops the tablets directly,
without any performance impact.

* Fix bug that a new partition should use a new ID

* Use equals() to compare Integer

* Fix compile bug

* Fix bug on single range partition

* Check table's state again after creating partition
2018-12-01 21:18:27 +08:00
5dea8bd3e6 Remove OLAP_LOG_FATAL log format. Use LOG(FATAL) instead (#376) 2018-12-01 19:26:08 +08:00
3d324e38ea Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#372) 2018-11-30 20:59:40 +08:00
49302955c8 Revert "Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370)" (#371)
This reverts commit a816925776de06dc7503ea7429802cad9042d0e4.
2018-11-30 20:56:51 +08:00
a816925776 Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370)
* Remove unused row-oriented format flags

* Remove unused row-oriented format flags

* Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead
2018-11-30 20:36:58 +08:00
9447a349ec Substitute ColumnType with Type (#366)
* Substitute ColumnType with Type
2018-11-30 16:30:30 +08:00
85d0996b35 Rename Rowset to SegmentGroup (#364)
* Rename Rowset to SegmentGroup

* Rename the protobuf-related Rowset to SegmentGroup
2018-11-29 17:30:41 +08:00
5694bcbd78 Fix stream load failure when target table contains HLL and insert failure when it contains subquery (#359) 2018-11-29 15:40:04 +08:00
aa27d0e056 Fix snapshot's making header bug (#362) 2018-11-28 18:58:21 +08:00
1ffc294833 Ubuntu llvm compile (#361) 2018-11-28 15:22:00 +08:00