Commit Graph

36 Commits

SHA1 Message Date
079141e14a Add disk usage percent in SHOW BACKEND stmt (#571) 2019-01-23 14:08:33 +08:00
09df294898 Fix some bugs (#566)
1. Backup obj should set state to NORMAL.
2. Replica with version 1-0 should be handled correctly.
2019-01-22 12:21:55 +08:00
f7155217bf Remove build rows counter in PartitionHashJoinNode (#557)
* Remove build rows counter in PartitionHashJoinNode
* Fix unit test fail in RuntimeProfileTest
* Add check for result type length in cast_to_string_val
2019-01-21 14:08:59 +08:00
6bef41633c Add DORIS_THIRDPARTY env in docker image (#539)
* Add a param to specify the thirdparty path
1. The thirdparty path can be specified via build.sh: ./build.sh --thirdparty /specified/path/to/thirdparty
2. If only the thirdparty param is given to build.sh, it will build both FE and BE
3. Add unit tests for the routine load stmt
4. Remove source code from the docker image

* Add DORIS_THIRDPARTY env in docker image
1. Set the DORIS_THIRDPARTY env in the docker image; build.sh will use /var/local/thirdparty instead of /source/code/thirdparty
2. Remove the --thirdparty param of build.sh

* Change image workdir to /root
2019-01-17 14:19:13 +08:00
0e5b193243 Add cpu and io indicators to audit log (#531) 2019-01-17 12:43:15 +08:00
33b133c6ff Fix bug that internal retry of stream load returns wrong results (#541)
Add an internally generated timestamp as a unique identifier, so that a request and its retry can be recognized as the same load
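A minimal sketch of the identifier idea (names here are hypothetical, not the actual Doris code): the retry carries the same internally generated stamp, so the server returns the original result instead of executing the load twice:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Supplier;

    public class LoadDedup {
        // stamp -> result of the first execution of that request
        private final ConcurrentHashMap<Long, String> results = new ConcurrentHashMap<>();

        // A retry reuses the original stamp, so it receives the first
        // result back instead of triggering a second, conflicting run.
        public String execute(long stamp, Supplier<String> loadFn) {
            return results.computeIfAbsent(stamp, s -> loadFn.get());
        }
    }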
2019-01-16 18:59:19 +08:00
798a66e6a0 Implement new tablet repair and balance framework (#336)
For more detail, see issue #540
2019-01-16 13:29:17 +08:00
d372b04e42 Revert "Add cpu and io indicators to audit log (#513)" (#520)
This reverts commit 5192e2f010308eefffa5271b0bdc947dfd9168ae.
2019-01-10 12:44:09 +08:00
5192e2f010 Add cpu and io indicators to audit log (#513)
Record query resource consumption in the FE audit log. It works as follows: one instance of each parent plan is responsible for accumulating its sub-plans' consumption and sending it to its own parent; the BE coordinator ends up with the total consumption because it is a single instance.
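A minimal sketch of that accumulation scheme (class and field names are illustrative): each node reports its own cpu/io consumption plus everything reported by its children, so the single coordinator instance at the root ends up with the query total:

    import java.util.ArrayList;
    import java.util.List;

    public class PlanNodeConsumption {
        long cpuMs;
        long ioBytes;
        final List<PlanNodeConsumption> children = new ArrayList<>();

        // Roll consumption up the plan tree toward the coordinator.
        long[] totalConsumption() {
            long cpu = cpuMs;
            long io = ioBytes;
            for (PlanNodeConsumption child : children) {
                long[] sub = child.totalConsumption();
                cpu += sub[0];
                io += sub[1];
            }
            return new long[] {cpu, io};
        }
    }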
2019-01-09 22:28:20 +08:00
69f9987abd EsTable without partition info (#511) 2019-01-09 11:14:19 +08:00
483c5a971e Add routine load statement (#456)
1. Add SQL parser and scanner rules for routine load stmts, including keywords such as KW_ROUTINE (routine) and KW_PAUSE.
2. Create routine load statement like
      CREATE ROUTINE LOAD name ON database.table
      (properties of routine load)
      [PROPERTIES (key1=value1, )]
      FROM [KAFKA](type of routine load)
      (properties of this type)

      properties of routine load:
          The load properties of CreateRoutineLoadStmt are order-independent: both 'LoadColumnsInfo, PartitionNames xxx' and 'PartitionNames, ColumnsInfo xxx' are valid.
          [COLUMNS TERMINATED BY separator ]
          [(col1, ...)]
          [SET (k1=f1(xx), k2=f2(xx))]
          WHERE
          [PARTITION (p1, p2)]

      type of routine load:
          KAFKA

      different type has different properties
      properties of this type:
          k1 = v1
          k2 = v2
3. Pause/Resume/Stop routine load statements like
      PAUSE/RESUME/STOP ROUTINE LOAD jobName
4. DdlExecutor supports CreateRoutineLoadStmt and Pause/Resume/StopRoutineLoadStmt
5. Pausing/stopping a routine load will immediately clear all tasks which belong to the job
      Tasks which have not been committed will be aborted.
6. Resuming a routine load will change the job state to need-scheduler
      The RoutineLoadJobScheduler will schedule it later.
7. Show routine load statement like
      SHOW ROUTINE LOAD jobName
8. All load properties implement LoadProperty, such as LoadColumnsInfo, PartitionNames, etc.
9. The SQL of LoadColumnsInfo is COLUMNS (c1, c2, c3) SET (c1, c2, c3=c1+c2)
10. Add a check that routineLoadName (db.routineLoadName) is unique within a database while the job state is not a final state.
2019-01-04 13:49:49 +08:00
a51ce03595 Enhance the usability of Load operation (#490)
1. Add broker load error hub
A broker load error hub collects error messages during the load process and saves them as a file to the specified remote storage via broker, for cases where, in the broker/mini/streaming load process, users may not be able to access the error log file on the Backend directly.
We also add a new header option 'enable_hub' to the streaming load request, defaulting to false. Enabling the broker load error hub significantly slows down streaming load, because of the access to remote storage via broker, so users can keep it disabled via this header option to avoid slowing down the load.

2. Show load error logs by using SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs: the 'SHOW LOAD WARNINGS ON 'url'' stmt shows load error logs directly. The 'url' is the one provided by the 'SHOW LOAD' stmt.
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";

3. Support now() function in broker load
Users can map a column to now() in a broker load stmt, which means the column will be filled with the time at which the ETL started.

4. Support more types of wildcard in broker load
Previously, only the wildcard '*' was supported for matching file names; wildcards like '/path/to/20190[1-4]*' were not supported.
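As an illustration only (not Doris's actual matcher), Java's built-in glob support covers both forms:

    import java.nio.file.FileSystems;
    import java.nio.file.PathMatcher;
    import java.nio.file.Paths;

    public class GlobDemo {
        public static void main(String[] args) {
            // 'glob' syntax supports '*' as well as ranges like [1-4]
            PathMatcher m = FileSystems.getDefault()
                    .getPathMatcher("glob:/path/to/20190[1-4]*");
            System.out.println(m.matches(Paths.get("/path/to/201903_data.csv"))); // true
            System.out.println(m.matches(Paths.get("/path/to/201906_data.csv"))); // false
        }
    }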
2019-01-03 19:07:27 +08:00
7d7934112f Fix fe ut (#469)
1. Fix StreamLoadScanNodeTest
2. Revert the fix for decimal values with scientific notation; this still needs to be fixed later
2018-12-25 20:07:03 +08:00
5b1e3d3f40 Optimize backup & restore process (#460)
1. Print broker address for debug.
2. Do not let a backup job be cancelled if it is already in state UPLOAD_INFO.
3. Cancel tasks on Backends when a job is cancelled.
4. Show detailed progress of backup and restore jobs.
5. Make 'show snapshot' result more readable.
6. Change upload and download thread num of backup and restore in Backend to 1.
2018-12-24 16:49:16 +08:00
45e42bd003 Redesign the access to meta version (#436)
The meta version used to be needed only in catalog saving and loading, so it is currently a field of the Catalog class, and we can get this version only by calling Catalog.getCurrentCatalogJournalVersion().

But in the restore process, we need to read metadata which was saved with a specific meta version. So we need a flexible way to read metadata using a specified meta version, not only the version from Catalog.

So we create a new class called MetaContext. Currently it has only one field, 'journalVersion', to save the current journal version. It is held in a thread-local variable, so that we can create a MetaContext anywhere we want and set the 'journalVersion' we want to use for reading meta.
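A minimal sketch of the MetaContext idea, assuming an InheritableThreadLocal to get the parent-to-child inheritance described below; the real field set and binding API may differ:

    public class MetaContext {
        // InheritableThreadLocal lets threads created later pick up the
        // context of the thread that created them (an assumption here).
        private static final ThreadLocal<MetaContext> CURRENT = new InheritableThreadLocal<>();

        private int journalVersion;

        public void setJournalVersion(int version) { this.journalVersion = version; }
        public int getJournalVersion() { return journalVersion; }

        // Bind this context to the calling thread before reading meta.
        public void setThreadLocalInfo() { CURRENT.set(this); }
        public static MetaContext get() { return CURRENT.get(); }
    }

A reading thread would create a MetaContext, set the journalVersion taken from the image (or, later, from the backup job info), bind it, and then deserialize the meta.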

Currently, there are 4 threads related to metadata saving and loading:

1. The Frontend starting thread, which calls Catalog.initialize() to load the image.
2. The Frontend state listener thread, which listens for state changes and calls transferToMaster() or transferToNonMaster().
3. The edit log replay thread, which is created when transferToNonMaster() is called and replays the edit log.
4. The checkpoint thread, which is created when transferToMaster() is called and performs the checkpoint periodically.

Notice that we need the 'current meta version' only when READING the meta (not WRITING), so we only need to take care of the reading threads. We create the MetaContext thread-local variable for these 4 threads; the meta contexts of threads 2, 3 and 4 inherit from thread 1's, because thread 1 loads the original image file and obtains the very first meta version.

We leave the name of Catalog.getCurrentCatalogJournalVersion() unchanged and just change its implementation, because we don't want to change a lot of code this time.

On the other hand, we add the current meta version to the backup job info file when doing a backup job, so that when restoring from a backup snapshot we know which meta version to use for reading the meta.
We also add a new property "meta_version" to the Restore stmt, so that the meta version used for reading the backup meta can be specified explicitly. It is for those old backup snapshots which do not have the meta version saved in the backup job info file.
2018-12-17 10:05:16 +08:00
548da0546a Fix compile error in run-fe-ut.sh (#415) 2018-12-11 17:46:13 +08:00
8913c23134 Fix compile failure in GlobalTransactionMgrTest (#412) 2018-12-11 13:53:38 +08:00
fc41842c18 Add a frontend interface for committing RoutineLoadTask (#368)
1. Add a needSchedulerTasksQueue in LoadManager: the RoutineLoadTaskScheduler will poll tasks from this queue and schedule them.
2. Add a frontend interface named rlTaskCommit: commit the txn, update the offset and renew a task for the same partitions.
3. Add an extra property to the transaction state: in rlTaskCommit, an extra property which looks like {"job_id": xxx, "progress": xxx}.
When the FE initializes routine load job meta from logs, all txn states related to a routine load job will be used to initialize the job's progress.

Add a TxnStateChangeListener interface for transactions (a sketch follows this list)
1. onCommitted, onAborted and beforeAborted will be called for different types of txns.
2. RoutineLoadJob will update the job progress and renew a task in the onCommitted callback.
3. Add the TxnStateChangeListener to TransactionState.
4. Setting a transactionState to committed will invoke the onCommitted callback if the callback is not null.
5. Setting a transactionState to aborted will invoke beforeAborted and onAborted.
6. beforeAborted in RoutineLoadJob will check whether there is a related task when the TxnStatusChangeReason is TIMEOUT; it prevents the abort when there is a related task by throwing a TransactionException.
7. Other abort reasons will not prevent the abort; onAborted will be called and the job state will be changed to paused.
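A hedged sketch of such a callback interface; the signatures are assumptions and the stub types merely stand in for the real classes:

    class TransactionState { /* stub for illustration */ }

    class TransactionException extends Exception {
        TransactionException(String msg) { super(msg); }
    }

    interface TxnStateChangeListener {
        // Called after the txn is set to committed.
        void onCommitted(TransactionState txnState);

        // Called before the txn is aborted; may veto the abort
        // (e.g. reason TIMEOUT with a task still running) by throwing.
        void beforeAborted(TransactionState txnState, String reason)
                throws TransactionException;

        // Called after the txn is set to aborted.
        void onAborted(TransactionState txnState, String reason);
    }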

Change extra to TxnCommitAttachment in TLoadTxnCommitRequest
1. The KAFKA source of TTxnSourceType means that this is a routine load task commit, and the TRLTaskTxnCommitAttachment is the commit info of this task.
2. TRLTaskTxnCommitAttachment will be converted to RLTaskTxnCommitAttachment, which includes the progress of this task, the task id, numOfErrorData, etc.

Add a TxnCommitAttachment param to commitTransaction
1. The TxnCommitAttachment will be updated in commitTransaction
2018-12-11 11:06:25 +08:00
b5737ee59a Refactor heartbeat logic (#403)
* Refactor heartbeat logic

Previously we only had Backend heartbeat. Without Frontend or Broker heartbeat, we don't know the status of these nodes, and thus can't do failover logic in some cases.

1. Add Frontend and Broker heartbeat (see the sketch after this list).
    Frontend heartbeat uses the BootstrapFinish HTTP REST API.
    Broker heartbeat uses the ping() RPC.
2. All heartbeats are managed in HeartbeatMgr.
3. Rename BrokerAddress to FsBroker.
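A minimal sketch of the manager shape, assuming a single target abstraction over the three node types (the interface and class names here are illustrative, not the actual HeartbeatMgr):

    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class HeartbeatMgrSketch {
        // FE is probed over HTTP, the broker answers ping() rpc, and the
        // BE keeps its existing heartbeat; all behind one interface.
        interface HeartbeatTarget {
            String name();
            boolean heartbeat();
        }

        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        public void start(List<HeartbeatTarget> targets, long intervalSec) {
            scheduler.scheduleAtFixedRate(() -> {
                for (HeartbeatTarget t : targets) {
                    boolean alive = t.heartbeat();
                    // failover logic would react to a dead node here
                    System.out.println(t.name() + (alive ? " alive" : " dead"));
                }
            }, 0, intervalSec, TimeUnit.SECONDS);
        }
    }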
2018-12-10 14:41:12 +08:00
9447a349ec Substitute ColumnType with Type (#366)
* Substitute ColumnType with Type
2018-11-30 16:30:30 +08:00
dedfccfaf5 Optimize the publish logic of streaming load (#350)
1. Only collect all error replicas if the publish task times out.
2. Add 2 metrics to monitor the success and failure of txns.
3. Change publish timeout to Config.load_straggler_wait_second
2018-11-26 19:01:50 +08:00
bbdf4fba4a Add a distributor which schedules tasks to BEs fairly, for routine load jobs (#333)
Step 1: updateBeIdTaskMaps: remove unavailable BEs and add newly alive BEs.
Step 2: Process timed-out tasks: if a task has already been allocated to a BE but has not finished before DEFAULT_TASK_TIMEOUT, it will be discarded.
       At the same time, the partitions belonging to the old tasks will be allocated to a new task. The new task, with a new signature, will be added to the needSchedulerRoutineLoadTask queue.
Step 3: Process all needSchedulerRoutineLoadTasks and allocate each task to a BE. The task will then be executed by that BE (a sketch of these steps follows below).
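A hedged sketch of the three steps (names and the timeout constant are illustrative; fairness here is plain round-robin):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class TaskDistributorSketch {
        static final long DEFAULT_TASK_TIMEOUT_MS = 60_000;

        final Queue<String> needSchedulerTasks = new ConcurrentLinkedQueue<>();
        final Map<String, Long> taskAllocTime = new ConcurrentHashMap<>(); // task -> alloc time
        final Map<String, Long> taskToBe = new ConcurrentHashMap<>();      // task -> BE id
        private int nextBe = 0;

        // Step 1 (refreshing the alive-BE set) is represented by the argument.
        void scheduleOnce(List<Long> aliveBackends) {
            long now = System.currentTimeMillis();

            // Step 2: discard timed-out tasks; their partitions re-enter the
            // queue as a renewed task with a new signature.
            for (Map.Entry<String, Long> e : new ArrayList<>(taskAllocTime.entrySet())) {
                if (now - e.getValue() > DEFAULT_TASK_TIMEOUT_MS) {
                    taskAllocTime.remove(e.getKey());
                    taskToBe.remove(e.getKey());
                    needSchedulerTasks.add(e.getKey() + "-renewed");
                }
            }

            // Step 3: allocate queued tasks round-robin over alive BEs.
            String task;
            while (!aliveBackends.isEmpty() && (task = needSchedulerTasks.poll()) != null) {
                long beId = aliveBackends.get(nextBe++ % aliveBackends.size());
                taskToBe.put(task, beId);
                taskAllocTime.put(task, now);
            }
        }
    }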
2018-11-23 10:35:10 +08:00
485db34f1e Modify partition's version name to what it means (#334)
* Modify partition's version name to what it means.

1. committedVersion(Hash) -> visibleVersion(Hash)
2. currentVersion(Hash) -> committedVersion(Hash)
3. add some comment to make the code more readable

* Check if editlog is null in CatalogIdGenerator
    To avoid unit test failure
2018-11-21 19:21:16 +08:00
791e89568e Change PaloMetrics' name and Catalog's Id generator (#329)
* Change PaloMetrics' name and Catalog's Id generator
1. Remove the 'Palo' prefix of the Metric classes.
2. Add a new CatalogIdGenerator to replace the old AtomicLong, to avoid too many edit logs (a sketch of the idea follows below).
3. Add a new histogram to monitor the write latency of edit log writes.

* Modify the next-id logic

* Fix a bug that Metric is not initialized before using HISTO_EDIT_LOG_WRITE_LATENCY

* Fix a problem
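A hedged sketch of the batched-allocation idea behind CatalogIdGenerator (the batch size and the logging hook are assumptions): persist only the upper bound of each batch, so one edit log entry covers many ids:

    public class BatchedIdGenerator {
        private static final int BATCH_SIZE = 1000;

        private long nextId;
        private long batchEnd; // highest id covered by the last edit log entry

        public synchronized long getNextId() {
            if (nextId >= batchEnd) {
                batchEnd = nextId + BATCH_SIZE;
                writeEditLog(batchEnd); // one log write per BATCH_SIZE ids
            }
            return nextId++;
        }

        // Placeholder for persisting the new upper bound to the edit log.
        private void writeEditLog(long newBatchEnd) {
            System.out.println("EDIT LOG: id upper bound = " + newBatchEnd);
        }
    }

After a restart, resuming from the last logged bound skips at most BATCH_SIZE ids but never hands out a duplicate.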
2018-11-20 18:59:18 +08:00
9a2ad18428 Add path info of replica in catalog (#327)

Also fix a bug: when calling check_none_row_oriented_table, the store is null and cannot be used to create the table; instead, OLAPHeader can be used to get the storage type information.
2018-11-19 17:42:46 +08:00
44029937e4 Add scheduling of routine load jobs for stream load (#313)
1. Fetch need_scheduler routine load jobs
2. Calculate the current concurrent task number of a job
3. Divide kafka partitions into tasks
2018-11-15 21:04:22 +08:00
d7ee57e881 Optimize quota unit (#309)
Originally, quota could only be set in bytes. This commit adds quota units (K/KB/M/MB/G/GB/T/TB/P/PB) for convenience.
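For illustration, a sketch of parsing these units into bytes (the suffix list follows the commit message; the parsing code itself is an assumption, not the Doris implementation):

    public class QuotaParser {
        // Parse a quota string such as "10GB", "512M" or plain "1024" (bytes).
        public static long parseQuota(String quota) {
            String s = quota.trim().toUpperCase();
            if (s.endsWith("B")) {
                s = s.substring(0, s.length() - 1); // KB -> K, MB -> M, ...
            }
            long multiplier = 1L;
            char unit = s.charAt(s.length() - 1);
            switch (unit) {
                case 'K': multiplier = 1L << 10; break;
                case 'M': multiplier = 1L << 20; break;
                case 'G': multiplier = 1L << 30; break;
                case 'T': multiplier = 1L << 40; break;
                case 'P': multiplier = 1L << 50; break;
                default: break; // no unit suffix, plain bytes
            }
            if (multiplier > 1L) {
                s = s.substring(0, s.length() - 1);
            }
            return Long.parseLong(s) * multiplier;
        }

        public static void main(String[] args) {
            System.out.println(parseQuota("10GB")); // 10737418240
            System.out.println(parseQuota("1024")); // 1024
        }
    }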
2018-11-15 14:03:52 +08:00
fc8f78d81c Fix unit test failure (#286) 2018-11-07 12:53:11 +08:00
2a290048c7 Change the lock type of Catalog lock (#265)
* Change the lock type of Catalog lock

Implement a QueryableReentrantLock to see which thread holds the lock (a sketch of the idea follows below).
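A sketch of one way to do this: java.util.concurrent's ReentrantLock already tracks its owner but keeps getOwner() protected, so a thin subclass can expose it:

    import java.util.concurrent.locks.ReentrantLock;

    public class QueryableReentrantLock extends ReentrantLock {
        public QueryableReentrantLock(boolean fair) {
            super(fair);
        }

        // Widen the protected ReentrantLock.getOwner() for debugging,
        // e.g. to log which thread is holding the Catalog lock.
        @Override
        public Thread getOwner() {
            return super.getOwner();
        }
    }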

* Keys in an empty version have no min/max value

They should be ignored when reconstructing min/max statistics upon restart.
2018-11-01 16:33:09 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
06413999e5 Fix some compile and script errors (#259)
1. Fix the wrong class name in start_fe.sh and start_broker.sh.
2. Add log4j2.xml in fe/src/test/resources/ to run fe ut without log4j warnings.
3. Reduce the test file size in be ut.
2018-10-31 19:45:21 +08:00
2be7991561 Change com.baidu.palo to org.apache.doris
Change some package names in the fe, fs_brokers and gensrc dirs.
2018-10-31 17:07:16 +08:00
051aced48d Missing many files in last commit
In the last commit, a lot of files were missed
2018-10-31 16:19:21 +08:00
5d3fc80067 Added:
* Add streaming load feature. You can execute 'help stream load;' to see more information.

Changed:
* The loading phase for a given table can be parallelized, to reduce load job execution time when multiple load jobs target a single table.
* Use RocksDB to save the header info of tablets in Backends, to reduce IO operations and speed up restarting.
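The Backend is C++; purely to illustrate the key-value idea (the key layout is hypothetical, and this assumes the rocksdbjni dependency), RocksDB's Java binding looks like this:

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    public class TabletHeaderStoreDemo {
        static { RocksDB.loadLibrary(); }

        public static void main(String[] args) throws RocksDBException {
            try (Options opts = new Options().setCreateIfMissing(true);
                 RocksDB db = RocksDB.open(opts, "/tmp/tablet_meta")) {
                byte[] key = "tablet_10001_header".getBytes(); // hypothetical key
                db.put(key, "serialized header bytes".getBytes());
                System.out.println(new String(db.get(key)));
            }
        }
    }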

Fixed:
* A lot of bugs fixed.
2018-10-31 14:46:22 +08:00
ae9ce81453 Changed: change build.sh to use an environment variable to get the thirdparty path, and change PALO_HOME to DORIS_HOME
2018-10-30 16:29:06 +08:00
68d663fe7a Changed: change the compilation method of Frontend and Apache hdfs broker from ANT to MAVEN 2018-10-27 11:35:20 +08:00