doris

Author	SHA1	Message	Date
Mingyu Chen	15c9be4dfe	Fix bug that balance task always choose high usage path (#1143 )	2019-05-11 22:07:17 +08:00
kangkaisen	ae18cebe0b	Improve colocate table balance logic for backend added (#1139 ) 1. Improve colocate table balance logic for backend added 2. Add more comment 3. Break loop early	2019-05-11 21:49:51 +08:00
HangyuanLiu	1eeb5ea891	Add str_to_date function in fe (#1118 )	2019-05-09 17:20:44 +08:00
Mingyu Chen	a08170fd50	Enhance the usabilities (#1100 ) * Enhance the usabilities 1. Add metrics to monitor transactions and streaming load process in BE. 2. Modify BE config 'result_buffer_cancelled_interval_time' to 300s. 3. Modify FE config 'enable_metric_calculator' to true. 4. Add more log for tracing broker load process. 5. Modify the query report process, to cancel query immediately if some instance failed. * Fix bugs 1. Avoid NullPointer when enabling colocation join with broker load 2. Return immediately when pull load task coordinator execution failed	2019-05-07 15:55:04 +08:00
HangyuanLiu	588aa7bed3	Fix date_format function in fe (#1082 )	2019-05-01 22:20:49 +08:00
EmmyMiao87	1662d91877	Change the logic of RoutineLoadTaskScheduler (#1061 ) 1. TaskScheduler will process one task per round 2. TaskScheduler will be blocked till queue tasks a new task 3. TaskScheduler will submit tasks when queue is empty 4. Add an example of creating a broker table by BOS 5. Change syntax of show routine load job	2019-04-28 20:05:48 +08:00
Mingyu Chen	60df7cdb8d	fix ut bug (#1051 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	0adb150da7	Fix ut bugs (#1046 ) Also fix a metrics collection bug	2019-04-28 10:33:50 +08:00
EmmyMiao87	4a95c53f07	Fix bug of listener (#1017 ) * Fix bug of listener * Change txnStateChangeListener to txnStateChangeCallback * Fix the logic of beforeAborted 1. If the task does not belong to the job, the txn attachment will be set to null. * Txn will be aborted normally without attachment. * Job will not be updated by a task whose attachment is null.	2019-04-28 10:33:50 +08:00
Mingyu Chen	3409ed41ac	Reset commit offset if task aborted due to runtime error (#994 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	1b5643c6fb	Fix some bugs (#979 ) 1. Add Config.max_routine_load_concurrent_task_num instead of the old one 2. Fix a bug that SHOW ALTER TABLE COLUMN may throw Nullpointer exception 3. Fix some misspelling of docs	2019-04-28 10:33:50 +08:00
Mingyu Chen	b7b66527ce	Fix some load bugs (#961 ) 1. Use load job's timeout as its txn timeout 2. Add a new session variable 'forward_to_master' for SHOW PROC and ADMIN stmt	2019-04-28 10:33:50 +08:00
EmmyMiao87	e352a08339	Change tips of show routine load task (#959 ) 1. Add pauseTimestamp 2. It will be set when job is paused and it will be removed when job is resumed	2019-04-28 10:33:50 +08:00
Mingyu Chen	2b4d02b2fa	Add error load log url for routine load job (#938 )	2019-04-28 10:33:50 +08:00
EmmyMiao87	8e0512e88d	Move lock of routine load job (#934 ) 1. Moving lock of routine load job from inside of lock of txn to outside. 2. The process of routine load task commit or abort is as follows: * lock job check task lock txn commit txn unlock txn commit task * unlock job 3. The process of checking timeout txn will be ignored when there are related tasks of the txn. 4. The relationship between task and txn will be removed when the task times out.	2019-04-28 10:33:50 +08:00
EmmyMiao87	75674753c2	Add unit test for RoutineLoadManager and RoutineLoadJob (#881 ) 1. Add ut 2. Show history job when table has been deleted. Check auth whether tablename is null or not.	2019-04-28 10:33:50 +08:00
Mingyu Chen	400d8a906f	Optimize the consumer assignment of Kafka routine load job (#870 ) 1. Use a data consumer group to share a single stream load pipe with multi data consumers. This will increase the consuming speed of Kafka messages, as well as reducing the task number of routine load job. Test results: * 1 consumer, 1 partition: consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s * 1 consumer, 3 partitions: consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s blocking get time(us): 12268241, blocking put time(us): 1886431 * 3 consumers, 3 partitions: consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s blocking get time(us): 1041639, blocking put time(us): 10356581 The next 2 cases show that we can achieve higher speed by adding more consumers. But the bottleneck shifts from Kafka consumer to Doris ingestion, so 3 consumers in a group is enough. I also add a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group, and default value is 3. In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as raw stream load. 2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load	2019-04-28 10:33:50 +08:00
Mingyu Chen	cef2078cb8	Fix FE UT (#850 )	2019-04-28 10:33:50 +08:00
EmmyMiao87	e1c6ba8397	Add show proc of routine load and task (#818 ) 1. add show proc "/routine_loads" to show statistics of all jobs and tasks 2. add show proc "/routine_loads/jobname" to show info of all jobs named jobname 3. add show proc "/routine_loads/jobname/jobid" to show tasks belonging to jobid 4. fix bug of allocateBeToTask	2019-04-28 10:33:50 +08:00
EmmyMiao87	2e250482fd	Modify routine load fe unit test (#803 )	2019-04-28 10:33:50 +08:00
EmmyMiao87	d213f922be	Implement ShowRoutineLoadStmt and ShowRoutineLoadTaskStmt (#786 ) 1. ShowRoutineLoadStmt is the same as the class description. It does not support showing all routine load jobs in all dbs 2. ShowRoutineLoadTaskStmt is the same as the class description. It does not support showing all routine load tasks in all jobs 3. Init partitionIdsToOffset in constructor of KafkaProgress 4. Change Create/Pause/Resume/Stop routine load job to LabelName such as [db.]name 5. Exclude final job when updating job 6. Catch all exceptions when scheduling one job. The exception will not block other jobs.	2019-04-28 10:33:50 +08:00
Mingyu Chen	95d0186e18	Modify some task scheduler logic (#767 ) 1. add job id and cluster name to Task info 2. Simplify the logic of getting beIdToMaxConcurrentTaskNum	2019-04-28 10:33:50 +08:00
Mingyu Chen	aa7f4c82da	modify the replay logic of routine load job (#762 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	8f781f95c7	Add persist operations for routine load job (#754 )	2019-04-28 10:33:50 +08:00
EmmyMiao87	e1fb02d4c0	Add routine load job cleaner (#742 ) 1. the stopped and cancelled job will be cleaned after the interval of clean second 2. the interval of clean second * 1000 = current timestamp - end timestamp 3. if job could not fetch topic metadata when need_schedule, job will be cancelled 4. fix the deadlock of job and txn. the lock of txn must be in front of the lock of job 5. the job will be paused or cancelled depend on the abort reason of txn 6. the job will be cancelled immediately if the abort reason named offsets out of range	2019-04-28 10:33:50 +08:00
EmmyMiao87	8b52787114	Stream load with no data will abort txn (#735 ) 1. stream load executor will abort txn when no correct data in task 2. change txn label to DebugUtil.print(UUID) which is same as task id printed by be 3. change print uuid to hi-lo	2019-04-28 10:33:50 +08:00
EmmyMiao87	062f827b60	Add attachment in rollback txn (#725 ) 1. init cmt offset in stream load context 2. init default max error num = 5000 rows / per 10000 rows 3. add log builder for routine load job and task 4. clone plan fragment param for every task 5. be does not throw too many filter rows while the init max error ratio is 1	2019-04-28 10:33:50 +08:00
EmmyMiao87	fbbe0d19ba	Change the relationship between txn and task (#703 ) 1. Check if properties is null before check routine load properties 2. Change transactionStateChange reason to string 3. calculate current num by beId 4. Add kafka offset properties 5. Prefer to use previous be id 6. Add before commit listener of txn: if txn is committed after task is aborted, commit will be aborted 7. queryId of stream load plan = taskId	2019-04-28 10:33:50 +08:00
EmmyMiao87	2314a3ecd4	Put begin txn into task scheduler (#687 ) 1. fix the nesting lock of db and txn 2. the txn of task will be init in task scheduler before take task from queue	2019-04-28 10:33:50 +08:00
Mingyu Chen	20b2b2c37f	Modify interface (#684 ) 1. Add batch submit interface 2. Add Kafka Event callback to catch Kafka events	2019-04-28 10:33:50 +08:00
EmmyMiao87	152606fbd6	Submit routine load task immediately (#682 ) 1. Use submit_routine_load_task instead of agentTaskQueue 2. Remove thrift dependency in StreamLoadPlanner and StreamLoadScanNode	2019-04-28 10:33:50 +08:00
Mingyu Chen	0820a29b8d	Implement the routine load process of Kafka on Backend (#671 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	d3251a19f7	Modify the method to obtain some metrics (#904 )	2019-04-10 19:37:48 +08:00
kangkaisen	2a3bf5842f	Parallel fragment exec instance (#851 )	2019-04-09 10:05:39 +08:00
Mingyu Chen	d47600ed84	Modify the logic of setting password (#798 ) * Modify the logic of setting password 1. User can set password for current_user() or if it has GRANT priv 2. Add USER() function support	2019-03-25 09:27:40 +08:00
lide	c34b306b4f	Decimal optimize branch #695 (#727 )	2019-03-22 17:22:16 +08:00
kangkaisen	2a152e0943	Remove colocate table meta when drop db (#761 )	2019-03-17 15:23:38 +08:00
Mingyu Chen	5f9e82b0fa	Support calculate unix_timestamp() on Frontend (#732 ) #731	2019-03-13 09:58:29 +08:00
Mingyu Chen	584b4371e3	Fix balance with diff storage medium (#705 )	2019-03-11 09:22:30 +08:00
Mingyu Chen	4dbbd32a72	Remove sensitive info (#692 )	2019-03-06 17:29:11 +08:00
EmmyMiao87	ac818c2b7b	Optimize some scheduling policies of routine load in FE (#665 ) * Change routine load task scheduler interval to 0 1. change routine load task scheduler interval to 0 2. init progress when routine load scheduler 3. add unit test and function test of routine load scheduler and task commit * Add checker of custom kafka partition 1. need scheduler to need schedule 2. add checker of custom kafka partition when create routine load job 3. fix unit test error	2019-02-27 15:38:51 +08:00
Mingyu Chen	9252beca99	Simplify the delete stmt (#668 ) Remove the restriction that delete stmt must specify partition even for unpartitioned table	2019-02-27 12:46:36 +08:00
Mingyu Chen	d872f79496	Handle unused disks and tablets report (#633 ) When a Backend reports an unused replica, which means this replica is bad, Frontend should set this replica as bad and repair it. Also, when a disk is reported unused, Frontend should mark this disk as OFFLINE, and no more replicas will be assigned to this disk. We also add 3 new metrics: disk_state, tablet_num and scheduled_tablet_num on Frontend to monitor the disk state and number of tablets on each Backend.	2019-02-18 10:20:56 +08:00
Mingyu Chen	171eaa642f	Fix BackendsProcDirTest ut (#612 )	2019-01-31 20:09:15 +08:00
kangkaisen	100eeb18cd	Add test for colocate table (#587 )	2019-01-31 19:23:12 +08:00
Mingyu Chen	daa9d975ca	Fix bugs of Tablet Scheduler (#600 )	2019-01-29 15:35:07 +08:00
kangkaisen	cd7a2c3fd5	Refactor CreateTableTest (#579 )	2019-01-24 13:56:41 +08:00
Mingyu Chen	079141e14a	Add disk usage percent in SHOW BACKEND stmt (#571 )	2019-01-23 14:08:33 +08:00
Mingyu Chen	09df294898	Fix some bugs (#566 ) 1. Backup obj should set state to NORMAL. 2. Replica with version 1-0 should be handled correctly.	2019-01-22 12:21:55 +08:00
chenhao	f7155217bf	Remove build rows counter in PartitionHashJoinNode (#557 ) * Remove build rows counter in PartitionHashJoinNode * Fix unit test fail in RuntimeProfileTest * Add check for result type length in cast_to_string_val	2019-01-21 14:08:59 +08:00


83 Commits