doris

Author	SHA1	Message	Date
ZHAO Chun	9d03ba236b	Uniform Status (#1317 )	2019-06-14 23:38:31 +08:00
Mingyu Chen	ff0dd0d2da	Support SSL authentication with Kafka in routine load job (#1235 )	2019-06-07 16:29:01 +08:00
HangyuanLiu	9d19c6c315	Support arbitrary kafka properties (#1204 )	2019-05-28 10:03:50 +08:00
Mingyu Chen	722a9e71c7	Optimize json functions (#1177 ) 1. get_json_xxx() now supports using quotes to escape dots 2. Implement json_path_prepare() function to preprocess json_path. Running time of get_json_string() on 1000000 rows drops from 2.27s to 0.27s	2019-05-21 09:13:12 +08:00
Mingyu Chen	cf1e7aa844	Add close tablet writer log (#1014 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	2b4d02b2fa	Add error load log url for routine load job (#938 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	400d8a906f	Optimize the consumer assignment of Kafka routine load job (#870 ) 1. Use a data consumer group to share a single stream load pipe among multiple data consumers. This increases the consuming speed of Kafka messages and reduces the number of tasks per routine load job. Test results: * 1 consumer, 1 partition: consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s * 1 consumer, 3 partitions: consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s blocking get time(us): 12268241, blocking put time(us): 1886431 * 3 consumers, 3 partitions: consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s blocking get time(us): 1041639, blocking put time(us): 10356581 The last 2 cases show that we can achieve higher speed by adding more consumers. But the bottleneck then shifts from the Kafka consumer to Doris ingestion, so 3 consumers in a group is enough. I also add a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group; its default value is 3. In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as raw stream load. 2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load	2019-04-28 10:33:50 +08:00
Mingyu Chen	9d08be3c5f	Add metrics for routine load (#795 ) * Add metrics for routine load * Limit the max number of routine load tasks in backend to 10 * Fix bug that some partitions will not be assigned	2019-04-28 10:33:50 +08:00
Mingyu Chen	8d2de42b36	Fix some routine load bugs (#787 ) 1. Preserve the column order in load stmt. 2. Fix some replay bugs of routine load task.	2019-04-28 10:33:50 +08:00
Mingyu Chen	9fa5e1b768	Add a cleaner bg thread to clean idle data consumer (#776 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	8f781f95c7	Add persist operations for routine load job (#754 )	2019-04-28 10:33:50 +08:00
EmmyMiao87	8b52787114	Stream load with no data will abort txn (#735 ) 1. Stream load executor will abort txn when there is no correct data in the task 2. Change txn label to DebugUtil.print(UUID), which is the same as the task id printed by BE 3. Change print uuid to hi-lo	2019-04-28 10:33:50 +08:00
Mingyu Chen	8474061d63	Add some logs (#711 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	567d5de2de	Add a data consumer pool to reuse the data consumer (#691 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	20b2b2c37f	Modify interface (#684 ) 1. Add batch submit interface 2. Add Kafka Event callback to catch Kafka events	2019-04-28 10:33:50 +08:00
Mingyu Chen	9618d20a72	Add unit test (#675 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	0820a29b8d	Implement the routine load process of Kafka on Backend (#671 )	2019-04-28 10:33:50 +08:00

17 Commits