1. Use a data consumer group to share a single stream load pipe among multiple data consumers. This increases the Kafka message consumption speed and reduces the number of tasks per routine load job.
Test results:
* 1 consumer, 1 partition:
consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s
* 1 consumer, 3 partitions:
consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s
blocking get time(us): 12268241, blocking put time(us): 1886431
* 3 consumers, 3 partitions:
consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s
blocking get time(us): 1041639, blocking put time(us): 10356581
The last two cases show that we can achieve higher speed by adding more consumers, but the bottleneck then shifts from the Kafka consumer to Doris ingestion, so 3 consumers in a group is enough.
I also added a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group; the default value is 3.
In my test (1 Backend, 2 tablets, 1 replica), one routine load task can achieve 10M/s, which is the same as raw stream load (a rough sketch of the consumer-group pattern follows this list).
2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load
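As a rough illustration of the consumer-group pattern from item 1 (all names here are assumptions, and the real consumers live in the C++ Backend), several consumers push into one shared pipe that a single stream load sink drains:

```java
// Sketch only: several Kafka consumers in a group share one blocking pipe.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class DataConsumerGroup {
    static final int MAX_CONSUMER_NUM_PER_GROUP = 3; // mirrors the BE config default

    // the single stream load pipe shared by every consumer in the group
    private final BlockingQueue<byte[]> pipe = new ArrayBlockingQueue<>(1024);

    void startConsumer(int id) {
        new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                byte[] msg = pollFromKafka(id); // stub for the real Kafka poll
                try {
                    pipe.put(msg);              // blocking put into the shared pipe
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "consumer-" + id).start();
    }

    private byte[] pollFromKafka(int id) {
        return new byte[0]; // stub: real code consumes one Kafka message
    }
}
```

The blocking get/put times in the results above correspond to the two sides of this pipe: with 1 consumer the downstream mostly waits on get, while with 3 consumers the consumers mostly wait on put, showing the bottleneck has moved to ingestion.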
1. Add show proc "/routine_loads" to show statistics of all jobs and tasks
2. Add show proc "/routine_loads/jobname" to show info of all jobs named jobname
3. Add show proc "/routine_loads/jobname/jobid" to show the tasks belonging to jobid
4. Fix a bug in allocateBeToTask
1. ShowRoutineLoadStmt works as described in its class comment. It does not support showing all routine load jobs across all databases
2. ShowRoutineLoadTaskStmt works as described in its class comment. It does not support showing all routine load tasks across all jobs
3. Initialize partitionIdsToOffset in the constructor of KafkaProgress
4. Change Create/Pause/Resume/Stop routine load job to use a LabelName of the form [db.]name
5. Exclude jobs in a final state when updating jobs
6. Catch all exceptions when scheduling a job, so that one job's exception does not block the other jobs.
1. Stopped and cancelled jobs are cleaned up after the clean interval (in seconds)
2. A job is eligible for cleanup when current timestamp - end timestamp exceeds the clean interval * 1000
3. If a job cannot fetch topic metadata while in need_schedule state, the job will be cancelled
4. Fix a deadlock between job and txn: the txn lock must be acquired before the job lock (see the sketch after this list)
5. The job will be paused or cancelled depending on the txn abort reason
6. The job will be cancelled immediately if the abort reason is 'offsets out of range'
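A minimal sketch of the lock-ordering rule in item 4 (names assumed): every path that needs both locks takes the txn lock first, so two threads can never hold the locks in opposite orders.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockOrdering {
    final ReentrantLock txnLock = new ReentrantLock();
    final ReentrantLock jobLock = new ReentrantLock();

    void updateJobOnTxnChange() {
        txnLock.lock();          // always acquire the txn lock first...
        try {
            jobLock.lock();      // ...and the job lock second
            try {
                // update job progress under both locks
            } finally {
                jobLock.unlock();
            }
        } finally {
            txnLock.unlock();
        }
    }
}
```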
1. The stream load executor will abort the txn when there is no valid data in the task
2. Change the txn label to DebugUtil.print(UUID), which matches the task id printed by the BE
3. Change UUID printing to hi-lo format, as sketched below
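A minimal sketch of what the hi-lo format might look like (the helper name and exact hex formatting are assumptions): the UUID is printed as its most significant 64 bits, then its least significant 64 bits.

```java
import java.util.UUID;

public class UuidHiLo {
    static String printId(UUID id) {
        // hex of the high 64 bits, a dash, then the low 64 bits
        return Long.toHexString(id.getMostSignificantBits()) + "-"
                + Long.toHexString(id.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        System.out.println(printId(UUID.randomUUID()));
    }
}
```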
1. Initialize the committed offset in the stream load context
2. Initialize the default max error number to 5000 rows per 10000 rows
3. Add a log builder for routine load jobs and tasks
4. Clone the plan fragment params for every task
5. The BE does not raise a 'too many filtered rows' error while the initial max error ratio is 1 (see the sketch below)
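A minimal sketch (names and window size are assumptions) of the error-number rule in items 2 and 5: errors are judged per window of 10000 rows, and a max error ratio of 1 disables the 'too many filtered rows' failure entirely.

```java
class ErrorRatioChecker {
    static final long WINDOW_ROWS = 10000;  // assumed judging window
    long maxErrorNum = 5000;                // default: 5000 errors per 10000 rows

    boolean tooManyErrors(long totalRows, long filteredRows) {
        double maxErrorRatio = (double) maxErrorNum / WINDOW_ROWS;
        if (maxErrorRatio >= 1.0) {
            return false;                   // ratio 1: never fail on filtered rows
        }
        if (totalRows < WINDOW_ROWS) {
            return false;                   // wait for a full window before judging
        }
        return (double) filteredRows / totalRows > maxErrorRatio;
    }
}
```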
1. Check whether properties is null before checking the routine load properties
2. Change the transactionStateChange reason to a string
3. Calculate the current task number by beId
4. Add Kafka offset properties
5. Prefer to reuse the previous BE id
6. Add a before-commit listener for the txn: if the txn is committed after its task was aborted, the commit will be rejected (see the sketch after this list)
7. Set the queryId of the stream load plan to the taskId
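A minimal sketch (all names assumed) of the before-commit check in item 6: the listener runs just before a txn commits and rejects the commit if the originating task has already been aborted.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class BeforeCommitListener {
    // ids of tasks that were aborted while their txns were still in flight
    private final Set<Long> abortedTaskIds = ConcurrentHashMap.newKeySet();

    void onTaskAborted(long taskId) {
        abortedTaskIds.add(taskId);
    }

    // called by the txn manager just before commit; throwing rejects the commit
    void beforeCommitted(long taskId) {
        if (abortedTaskIds.contains(taskId)) {
            throw new IllegalStateException(
                    "task " + taskId + " was aborted, rejecting commit");
        }
    }
}
```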
* Change routine load task scheduler interval to 0
1. Change the routine load task scheduler interval to 0
2. Initialize progress when the routine load scheduler runs
3. Add unit tests and function tests for the routine load scheduler and task commit
* Add a checker for custom Kafka partitions
1. Rename 'need scheduler' to 'need schedule'
2. Add a checker for custom Kafka partitions when creating a routine load job (see the sketch after this list)
3. Fix unit test errors
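A minimal sketch (all names assumed) of the checker in item 2: every user-specified partition must exist in the topic's metadata, otherwise the CREATE ROUTINE LOAD statement is rejected.

```java
import java.util.List;
import java.util.Set;

class KafkaPartitionChecker {
    static void checkCustomPartitions(List<Integer> customPartitions,
                                      Set<Integer> partitionsInTopic) {
        for (Integer p : customPartitions) {
            if (!partitionsInTopic.contains(p)) {
                throw new IllegalArgumentException(
                        "partition " + p + " does not exist in the topic");
            }
        }
    }
}
```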
When a Backend reports an unused replica, which means this replica
is bad, the Frontend should mark this replica as bad and repair it.
Also, when a disk is reported unused, the Frontend should mark this
disk as OFFLINE, and no more replicas will be assigned to this
disk.
We also add 3 new metrics on the Frontend: disk_state, tablet_num and scheduled_tablet_num,
to monitor the disk state and the number of tablets on each Backend.
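A minimal sketch (all names assumed) of the report handling described above: unused replicas are marked bad for the repair scheduler, and unused disks are set OFFLINE so no new replicas land on them.

```java
import java.util.List;

class ReportHandler {
    void handleReplicaReport(long backendId, List<Long> unusedReplicaIds) {
        for (long replicaId : unusedReplicaIds) {
            markReplicaBad(backendId, replicaId);     // repaired later by the scheduler
        }
    }

    void handleDiskReport(long backendId, List<String> unusedDiskRoots) {
        for (String root : unusedDiskRoots) {
            setDiskState(backendId, root, "OFFLINE"); // no new replicas assigned here
        }
    }

    private void markReplicaBad(long backendId, long replicaId) { /* stub */ }
    private void setDiskState(long backendId, String root, String state) { /* stub */ }
}
```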
* Remove build rows counter in PartitionHashJoinNode
* Fix unit test failure in RuntimeProfileTest
* Add check for result type length in cast_to_string_val
* Add a param to specify the thirdparty path
1. The thirdparty path can be specified on build.sh: ./build.sh --thirdparty /specified/path/to/thirdparty
2. If only the thirdparty param is given to build.sh, it will build both fe and be
3. Add unit tests for the routine load stmts
4. Remove source code in docker image
* Add DORIS_THIRDPARTY env in docker image
1. Set the DORIS_THIRDPARTY env in the docker image. build.sh will use /var/local/thirdparty instead of /source/code/thirdparty
2. Remove the --thirdparty param from build.sh
* Change image workdir to /root
Record query consumption into the fe audit log. It works as follows: one instance of each parent plan is responsible for accumulating its sub plans' consumption and sending it to its own parent; the BE coordinator ends up with the total consumption because it is a single instance. A sketch follows.
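A minimal sketch (names assumed) of the accumulation scheme: each plan instance sums the consumption reported by its children, adds its own, and reports the total upward, so the single root instance on the coordinator sees the whole query's consumption.

```java
import java.util.ArrayList;
import java.util.List;

class PlanInstance {
    long ownConsumedBytes;
    final List<PlanInstance> children = new ArrayList<>();

    PlanInstance(long ownConsumedBytes) {
        this.ownConsumedBytes = ownConsumedBytes;
    }

    // what this instance reports to its parent: its own consumption plus
    // everything accumulated from the sub plans below it
    long totalConsumption() {
        long total = ownConsumedBytes;
        for (PlanInstance child : children) {
            total += child.totalConsumption();
        }
        return total;
    }
}
```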
1. Add a sql parser and sql scanner for the routine load stmts, with tokens such as KW_ROUTINE (routine) and KW_PAUSE.
2. Create routine load statement, like:
CREATE ROUTINE LOAD name ON database.table
(properties of routine load)
[PROPERTIES (key1=value1, )]
FROM [KAFKA](type of routine load)
(properties of this type)
properties of routine load:
The load properties of CreateRoutineLoadStmt are order-insensitive: both 'LoadColumnsInfo, PartitionNames xxx' and 'PartitionNames, ColumnsInfo xxx' are valid.
[COLUMNS TERMINATED BY separator ]
[(col1, ...)]
[SET (k1=f1(xx), k2=f2(xx))]
WHERE
[PARTITION (p1, p2)]
type of routine load:
KAFKA
different types have different properties
properties of this type:
k1 = v1
k2 = v2
3. Pause/Resume/Stop routine load statements, like:
PAUSE/RESUME/STOP ROUTINE LOAD jobName
4. DdlExecutor supports CreateRoutineLoadStmt and Pause/Resume/StopRoutineLoadStmt
5. Pause/Stop routine load will immediately clear all tasks belonging to the job
Tasks which have not been committed will be aborted.
6. Resume routine load will change the job state to need schedule
The RoutineLoadJobScheduler will schedule it later.
7. Show routine load statement, like:
SHOW ROUTINE LOAD jobName
8. Every load property implements LoadProperty, such as LoadColumnsInfo, PartitionNames, etc.
9. The SQL of LoadColumnsInfo is: COLUMNS (c1, c2, c3) SET (c1, c2, c3=c1+c2)
10. Add a check of routineLoadName: db.routineLoadName must be unique in the database while the job state is not a final state.
1. Add broker load error hub
A broker load error hub collects error messages during the load process and saves them as a file to the specified remote storage via broker. This is for the case where, in a broker/mini/streaming load process, the user may not be able to access the error log file on the Backend directly.
We also add a new header option, 'enable_hub', to the streaming load request; the default is false. Enabling the broker load error hub significantly slows down streaming load processing because of the visits to remote storage via broker, so users can disable the error hub via this header option to avoid slowing down the load.
2. Show load error logs by using SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs: the 'SHOW LOAD WARNINGS ON 'url'' stmt shows load error logs directly. The 'url' in the stmt is the one provided by the 'SHOW LOAD' stmt.
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";
3. Support now() function in broker load
Users can map a column to now() in a broker load stmt, which means this column will be filled with the time at which the ETL started.
4. Support more types of wildcard in broker load
Previously, we only supported the wildcard '*' to match file names; wildcards like '/path/to/20190[1-4]*' were not supported. This change adds support for them (see the sketch below).
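As a rough, self-contained illustration of the kind of matching this enables (using the JDK's glob PathMatcher purely for demonstration; the actual broker load matching code may differ), both '*' and character classes such as [1-4] are understood:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class WildcardDemo {
    public static void main(String[] args) {
        PathMatcher m = FileSystems.getDefault()
                .getPathMatcher("glob:/path/to/20190[1-4]*");
        System.out.println(m.matches(Paths.get("/path/to/201903_data.csv"))); // true
        System.out.println(m.matches(Paths.get("/path/to/201906_data.csv"))); // false
    }
}
```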
1. Print the broker address for debugging.
2. Do not let a backup job be cancelled if it is already in state UPLOAD_INFO.
3. Cancel tasks on Backends when a job is cancelled.
4. Show detailed progress of backup and restore jobs.
5. Make the 'show snapshot' result more readable.
6. Change the upload and download thread number of backup and restore in the Backend to 1.
The meta version is only used in catalog saving and loading, so currently
this version is a field of the Catalog class, and we can get it
only by calling Catalog.getCurrentCatalogJournalVersion().
But in the restore process, we need to read meta data which was saved with
a specific meta version. So we need a flexible way to read meta data
using a specified meta version, not only the version from Catalog.
So we create a new class called MetaContext. Currently it has only one field,
'journalVersion', which saves the current journal version. It is held in a
thread local variable, so we can create a MetaContext anywhere we want
and set the 'journalVersion' we want to use for reading meta. A sketch follows.
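A minimal sketch (simplified; accessor names are assumptions) of such a thread local MetaContext:

```java
public class MetaContext {
    private static final ThreadLocal<MetaContext> CURRENT =
            ThreadLocal.withInitial(MetaContext::new);

    private int journalVersion;

    public static MetaContext get() {
        return CURRENT.get();
    }

    public int getJournalVersion() {
        return journalVersion;
    }

    public void setJournalVersion(int journalVersion) {
        this.journalVersion = journalVersion;
    }
}
```

A reading thread would call MetaContext.get().setJournalVersion(...) before deserializing meta, and the deserialization code would read the version back from the same thread local.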
Currently, there are 4 threads related to meta data saving and loading:
1. The Frontend starting thread, which calls Catalog.initialize() to load the image.
2. The Frontend state listener thread, which listens for state changes and calls
transferToMaster() or transferToNonMaster().
3. The edit log replay thread, which is created when transferToNonMaster() is called.
It replays the edit log.
4. The checkpoint thread, which is created when transferToMaster() is called. It does
the checkpoint periodically.
Notice that we need the 'current meta version' only when READING the meta (not WRITING),
so we only need to take care of the READING threads.
We create a MetaContext thread local variable for these 4 threads, and threads 2, 3 and 4's
meta contexts inherit from thread 1's meta context, because thread 1 loads the original
image file and gets the very first meta version.
We leave the name of Catalog.getCurrentCatalogJournalVersion() unchanged and only
change its implementation, because we don't want to change a lot of code this time.
On the other hand, we add the current meta version to the backup job info file when doing
a backup job, so that when restoring from a backup snapshot, we know which meta
version to use for reading the meta.
We also add a new property "meta_version" for the Restore stmt, so that we can specify
the meta version used for reading the backup meta. This is for those old backup snapshots
which do not have the meta version saved in the backup job info file.
1. Add a needSchedulerTasksQueue in LoadManager: the RoutineLoadTaskScheduler polls tasks from this queue and schedules them (see the sketch below).
2. Add a frontend interface named rlTaskCommit: commit the txn, update the offset, and create a new task for the same partitions.
3. Add an extra property to the transaction state: in rlTaskCommit, the extra property looks like {"job_id": xxx, "progress": xxx}.
When the FE initializes routine load job meta from logs, all txn states related to a routine load job are used to initialize the job's progress.
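A minimal sketch (all names assumed) of the scheduling loop in item 1: the scheduler blocks on the queue of tasks that need scheduling and runs each one; rlTaskCommit would put the renewed task back onto the same queue.

```java
import java.util.concurrent.LinkedBlockingQueue;

class RoutineLoadTaskScheduler extends Thread {
    private final LinkedBlockingQueue<Runnable> needSchedulerTasksQueue =
            new LinkedBlockingQueue<>();

    void submitTask(Runnable task) {
        needSchedulerTasksQueue.offer(task);
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            try {
                // blocks until a task needs scheduling (so a 0 interval is fine)
                Runnable task = needSchedulerTasksQueue.take();
                task.run(); // the real scheduler assigns a BE and sends the task
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```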
Add a TxnStateChangeListener interface for transactions (a sketch follows the list below)
1. onCommitted, onAborted, and beforeAborted are called for different txn state changes
2. RoutineLoadJob updates the job progress and creates a new task in the onCommitted callback
3. Add the TxnStateChangeListener into TransactionState
4. Setting the transactionState to committed calls the onCommitted callback if the callback is not null
5. Setting the transactionState to aborted calls beforeAborted and onAborted
6. beforeAborted in RoutineLoadJob checks whether there is a related task when the TxnStatusChangeReason is TIMEOUT; it prevents the abort by throwing a TransactionException when a related task exists
7. Other abort reasons do not prevent the abort; onAborted is called and the job state is changed to paused
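A minimal sketch (simplified; signatures and the exception type are assumptions) of the listener interface: beforeAborted may veto the abort by throwing, while onCommitted and onAborted react after the state change.

```java
interface TxnStateChangeListener {
    // called before the txn is aborted; throw to prevent the abort (item 6)
    void beforeAborted(long txnId, String reason) throws Exception;

    // called after the txn is committed, e.g. to update progress and renew a task
    void onCommitted(long txnId);

    // called after the txn is aborted, e.g. to pause the job
    void onAborted(long txnId, String reason);
}
```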
Change extra to TxnCommitAttachment in TLoadTxnCommitRequest
1. The KAFKA value of TTxnSourceType means that this is a routine load task commit, and the TRLTaskTxnCommitAttachment is the commit info of this task.
2. TRLTaskTxnCommitAttachment is converted to RLTaskTxnCommitAttachment, which includes the progress of this task, the task id, numOfErrorData, etc.
Add a TxnCommitAttachment param to commitTransaction
1. The TxnCommitAttachment is updated in commitTransaction
* Refactor heartbeat logic
Currently we only have the Backend heartbeat. Without a Frontend
or Broker heartbeat, we don't know the status of these nodes,
and thus can't perform failover logic in some cases.
1. Add Frontend and Broker heartbeats (see the sketch below).
The Frontend heartbeat uses the BootstrapFinish http rest api;
the Broker heartbeat uses the ping() rpc.
2. All heartbeats are managed in HeartbeatMgr.
3. Rename BrokerAddress to FsBroker.
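A minimal sketch (all names assumed) of how a heartbeat manager might fan out the heartbeats described above, regardless of whether the target is a Frontend, Backend, or Broker:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class HeartbeatMgr {
    interface HeartbeatTarget {
        boolean sendHeartbeat(); // true if the node answered in time
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    void runOneRound(List<HeartbeatTarget> targets) {
        for (HeartbeatTarget t : targets) {
            pool.submit(() -> {
                boolean alive = t.sendHeartbeat();
                // the real manager would update node state and drive failover here
                System.out.println(t + " alive=" + alive);
            });
        }
    }
}
```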