1. Use a data consumer group so that multiple data consumers share a single stream load pipe. This increases the consumption speed of Kafka messages and reduces the number of tasks in a routine load job.
Test results:
* 1 consumer, 1 partition:
consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s
* 1 consumer, 3 partitions:
consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s
blocking get time(us): 12268241, blocking put time(us): 1886431
* 3 consumers, 3 partitions:
consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s
blocking get time(us): 1041639, blocking put time(us): 10356581
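The last 2 cases show that adding more consumers gives higher speed. However, the bottleneck then moves from the Kafka consumer to Doris ingestion: with 3 consumers the blocking get time drops from about 12.3s to 1.0s, while the blocking put time rises from about 1.9s to 10.4s. So 3 consumers in a group is enough.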
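The throughput figures above can be recomputed from the raw counters. The sketch below assumes that "M/s" means 10^6 bytes per second and that both figures are truncated rather than rounded; that reading is an assumption, but it reproduces the reported numbers. The function name `throughput` is only for illustration.

```python
def throughput(rows, num_bytes, seconds):
    """Return (rows per second, megabytes per second), both truncated."""
    rows_per_sec = int(rows / seconds)
    # Assumption: "M" is read as 10**6 bytes, not 2**20.
    mb_per_sec = int(num_bytes / seconds / 1_000_000)
    return rows_per_sec, mb_per_sec


# Raw counters copied from the test results above.
cases = {
    "1 consumer, 1 partition": (990140, 128737139, 4.469),
    "1 consumer, 3 partitions": (2000143, 258631271, 12.765),
    "3 consumers, 3 partitions": (2000503, 258631576, 6.095),
}

for name, args in cases.items():
    print(name, throughput(*args))
```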
I also added a Backend config, `max_consumer_num_per_group`, which controls the number of consumers in a data consumer group. The default value is 3.
In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as a raw stream load.
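The idea of a consumer group feeding one pipe can be sketched as follows. This is a minimal Python illustration, not the Backend implementation: partitions are modeled as plain lists, the pipe as a bounded `queue.Queue`, and the round-robin partition assignment and the `pipe_capacity` parameter are assumptions made for the example. Only the name `max_consumer_num_per_group` comes from the description above.

```python
import queue
import threading


def run_consumer_group(partitions, max_consumer_num_per_group=3, pipe_capacity=8):
    """Drain all partitions through one shared, bounded pipe."""
    num_consumers = max(1, min(max_consumer_num_per_group, len(partitions)))
    pipe = queue.Queue(maxsize=pipe_capacity)  # the single shared pipe
    done = object()  # sentinel: one consumer has finished

    def consume(assigned):
        for partition in assigned:
            for message in partition:
                pipe.put(message)  # blocks while the pipe is full
        pipe.put(done)

    # Assumption: partitions are assigned to consumers round-robin.
    assignments = [partitions[i::num_consumers] for i in range(num_consumers)]
    threads = [threading.Thread(target=consume, args=(a,)) for a in assignments]
    for t in threads:
        t.start()

    received, finished = [], 0
    while finished < num_consumers:
        item = pipe.get()  # blocks while the pipe is empty
        if item is done:
            finished += 1
        else:
            received.append(item)

    for t in threads:
        t.join()
    return received
```

In this model, a slow reader makes `pipe.put` block (the "blocking put time" above), and slow consumers make `pipe.get` block (the "blocking get time").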
2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load
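As a rough illustration of what these two symbolic offsets mean, the sketch below resolves an offset specification against a partition's low and high watermarks. The function `resolve_start_offset` and its parameters are hypothetical and are not taken from the Doris code; they only show the intended semantics (start from the earliest available message, or from the end of the partition).

```python
def resolve_start_offset(spec, low_watermark, high_watermark):
    """Map an offset spec to a concrete starting offset (hypothetical helper)."""
    if spec == "OFFSET_BEGINNING":
        return low_watermark  # earliest message still available
    if spec == "OFFSET_END":
        return high_watermark  # only messages produced from now on
    offset = int(spec)  # otherwise an explicit numeric offset
    if offset < 0:
        raise ValueError(f"invalid offset: {spec!r}")
    return offset
```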
Philosophy
write once, use everywhere
Documentation is written once and converted to other formats according to the application scenario.
Implementation
              +---------------+
              | Documentation |
              +-------+-------+
                      |
              +-------+-------+
              |  Doc Builder  |
              +-------+-------+
                      |
        +-------------+-------------------+
        |             |                   |
    +---+---+     +---+----+        +-----+----+
    |  PDF  |     |  HTML  |  ....  | Help Doc |
    +-------+     +--------+        +----------+
- Documentation: Text content written by humans. This is the only place where documentation is maintained.
- Doc Builder: Tools that convert documentation to other formats, such as PDF or HTML. There can be many tools, and different tools can be used to convert documentation to different formats.
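The "write once, use everywhere" flow can be pictured as one source passed through several builders. The sketch below is only a toy model under that assumption: `to_html`, `to_help_doc`, `BUILDERS`, and `build_all` are hypothetical names, and the converters handle just level 1 titles and plain lines, not real Markdown.

```python
import html


def to_html(markdown_text):
    """Toy converter: level 1 titles become <h1>, other non-empty lines <p>."""
    out = []
    for line in markdown_text.splitlines():
        if line.startswith("# "):
            out.append(f"<h1>{html.escape(line[2:])}</h1>")
        elif line.strip():
            out.append(f"<p>{html.escape(line)}</p>")
    return "\n".join(out)


def to_help_doc(markdown_text):
    """Toy converter: upper-case the title and keep the other lines as-is."""
    return "\n".join(
        line[2:].upper() if line.startswith("# ") else line
        for line in markdown_text.splitlines()
    )


# One entry per output format; adding a format does not touch the source text.
BUILDERS = {"html": to_html, "help": to_help_doc}


def build_all(markdown_text):
    """Convert a single source document to every registered format."""
    return {fmt: build(markdown_text) for fmt, build in BUILDERS.items()}
```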
Organization
- `docs/documentation`: Root directory for documentation. Each language has its own root directory under it. For example, `docs/documentation/cn` is the root directory of the Chinese documentation.
- `docs/scripts`: Location of the Doc Builder.
- `docs/resources`: Resources that are referenced in documentation, such as pictures.
Constraints
- All documents are written in Markdown format, and file names end with ".md".
- All documents start with a level 1 title, `# Title`, and should have only one level 1 title.
- File and directory names are in lowercase letters and use dashes as separators.
- Documentation can be organized as a directory or as a single Markdown file; the two formats are logically equivalent. Hierarchy is represented by parent-child directories in the directory format, and by title levels in the file format. The directory format is recommended for large documentation because it is easier to maintain.
- A directory corresponds to a title, and readme.md in this directory is its content. Other documents in this directory are its sub-sections.
- For manual-like sections, such as function descriptions, documents should contain Description, Syntax, and Examples sections.
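Some of the constraints above are mechanical enough to check with a script. The following is a minimal sketch of such a check, not an existing Doc Builder tool: `check_document` is a hypothetical helper, and allowing digits in file names and ignoring `#` lines inside fenced code blocks are assumptions.

```python
import re

# Assumption: names consist of lowercase letters or digits separated by dashes.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.md$")


def check_document(file_name, text):
    """Return a list of constraint violations for one Markdown document."""
    problems = []
    if not NAME_RE.match(file_name):
        problems.append("file name must be lowercase, dash-separated, and end with .md")

    titles = []
    first_content = None
    in_fence = False
    for line in text.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # skip '#' lines inside code fences
        if first_content is None and line.strip():
            first_content = line
        if not in_fence and line.startswith("# "):
            titles.append(line)

    if first_content is None or not first_content.startswith("# "):
        problems.append("document must start with a level 1 title")
    if len(titles) != 1:
        problems.append("document must have exactly one level 1 title")
    return problems
```

For example, `check_document("getting-started.md", "# Getting Started\n")` returns an empty list.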
Level 1 directories
- doris-concepts
- installing
- getting-started
- administrator-guide
- sql-references
- best-practices
- internals
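The equivalence between the directory format and title levels described in the constraints can be sketched as a small mapping from a path to a title level. This assumes the documentation root's own readme.md is level 1, so each Level 1 directory listed above becomes a level 2 title; that numbering and the helper name `title_level` are assumptions, and `installing/compilation.md` in the usage note is a made-up file name.

```python
from pathlib import PurePosixPath


def title_level(relative_path):
    """Title level a document would get if the tree were one Markdown file."""
    path = PurePosixPath(relative_path)
    depth = len(path.parts) - 1  # number of parent directories
    if path.name == "readme.md":
        # readme.md is the content of its directory's own title.
        return depth + 1
    # Any other document is a sub-section of its directory.
    return depth + 2
```

Under these assumptions, `readme.md` maps to level 1, `installing/readme.md` to level 2, and `installing/compilation.md` to level 3.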