doris/docs
Mingyu Chen 400d8a906f Optimize the consumer assignment of Kafka routine load job (#870)
1. Use a data consumer group so that multiple data consumers share a single stream load pipe. This increases the consumption speed of Kafka messages and reduces the number of tasks in a routine load job.

Test results:

* 1 consumer, 1 partition:
    consume time: 4.469s, rows: 990140, bytes: 128737139.  221557 rows/s, 28M/s
* 1 consumer, 3 partitions:
    consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s
    blocking get time(us): 12268241, blocking put time(us): 1886431
* 3 consumers, 3 partitions:
    consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s
    blocking get time(us): 1041639, blocking put time(us): 10356581

The last 2 cases show that adding more consumers increases the consuming speed. With 3 consumers, however, the blocking put time dominates, which means the bottleneck has moved from the Kafka consumers to Doris ingestion, so 3 consumers in a group are enough.

I also added a Backend config, `max_consumer_num_per_group`, which sets the number of consumers in a data consumer group. The default value is 3.

In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can reach 10M/s, which is the same as a raw stream load.

2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load
2019-04-28 10:33:50 +08:00
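
The consumer group idea in item 1 can be sketched as follows. This is only an illustration under stated assumptions, not the Backend implementation: a bounded Python `queue.Queue` stands in for the stream load pipe, plain lists stand in for Kafka partitions, and `run_group` and `pipe_capacity` are hypothetical names. The two timers mirror the "blocking put" and "blocking get" figures reported in the test results.

```python
import queue
import threading
import time


def run_group(partitions, num_consumers, pipe_capacity=4):
    """Drain all partitions through one shared, bounded pipe.

    partitions: list of lists of messages (stand-ins for Kafka partitions).
    Returns (rows, blocking_get_seconds, blocking_put_seconds).
    """
    pipe = queue.Queue(maxsize=pipe_capacity)  # stand-in for the stream load pipe
    put_time = [0.0] * num_consumers           # time each consumer spends inside put()

    def consume(idx, assigned):
        for part in assigned:
            for msg in part:
                start = time.perf_counter()
                pipe.put(msg)                  # blocks while the pipe is full
                put_time[idx] += time.perf_counter() - start
        pipe.put(None)                         # sentinel: this consumer is done

    # Assign partitions to consumers round-robin (an assumption of this sketch).
    assignments = [partitions[i::num_consumers] for i in range(num_consumers)]
    threads = [
        threading.Thread(target=consume, args=(i, assigned))
        for i, assigned in enumerate(assignments)
    ]
    for t in threads:
        t.start()

    rows, finished, get_time = [], 0, 0.0
    while finished < num_consumers:
        start = time.perf_counter()
        msg = pipe.get()                       # blocks while the pipe is empty
        get_time += time.perf_counter() - start
        if msg is None:
            finished += 1
        else:
            rows.append(msg)

    for t in threads:
        t.join()
    return rows, get_time, sum(put_time)
```

In this model, a large get time means the reader is waiting for the consumers, and a large put time means the consumers are waiting for the reader, which is how the last two test cases are read above.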

Philosophy

write once, use everywhere

Documentation is written once and then converted to other formats according to the application scenario.

Implementation

         +---------------+
         | Documentation |
         +-------+-------+
                 |
         +-------+-------+
         |  Doc Builder  |
         +-------+-------+
                 |
    +--------------------------------+
    |            |                   |
+---+---+    +---+----+        +-----+----+
|  PDF  |    |  HTML  |  ....  | Help Doc |
+-------+    +--------+        +----------+

Documentation: Text content written by humans. This is the only place where documentation is maintained.

Doc Builder: Tools that convert documentation to other formats, such as PDF and HTML. There can be many tools, and different tools can be used to convert documentation to different formats.

Organization

docs/documentation: Root directory for documentation. Each language has its own root directory under it. For example, docs/documentation/cn is the root directory of the Chinese documentation.

docs/scripts: Location of the Doc Builder.

docs/resources: Resources referenced in the documentation, such as pictures.

Constraints

  1. All documents are written in Markdown, and file names end with ".md".
  2. Every document starts with a level 1 title (# Title) and has exactly one level 1 title.
  3. File and directory names use lowercase letters, with dashes as separators.
  4. Documentation can be organized as a directory or as a single Markdown file; the two formats are logically equivalent. Hierarchy is represented by parent-child directories in the directory format and by title levels in the file format. The directory format is recommended for large documentation because it is easier to maintain.
  5. A directory corresponds to a title, and the readme.md in that directory is its content. Other documents in the directory are its sub-sections.
  6. Manual-like sections, such as function descriptions, should contain Description, Syntax, and Examples sections.
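
Constraints 1 to 3 are mechanical enough to check automatically. The following is a minimal sketch of such a check, not an existing Doc Builder script: `check_name` and `check_markdown` are hypothetical helpers, and allowing digits in names is an assumption, since the constraint only mentions lowercase letters and dashes.

```python
import re

# Lowercase words separated by single dashes; digits are allowed by assumption.
NAME_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def check_name(name, is_dir):
    """Check constraints 1 and 3 for one file or directory name."""
    problems = []
    stem = name
    if not is_dir:
        if name.endswith(".md"):
            stem = name[: -len(".md")]
        else:
            problems.append(name + ": file name does not end with .md")
    if not NAME_RE.fullmatch(stem):
        problems.append(name + ": name is not lowercase with dashes as separators")
    return problems


def check_markdown(text):
    """Check constraint 2: starts with a level 1 title and has exactly one."""
    problems = []
    in_fence = False
    h1_count = 0
    first_content = None
    for line in text.splitlines():
        if line.lstrip().startswith("```"):
            # Fenced code may contain lines starting with "# "; skip its body.
            in_fence = not in_fence
            if first_content is None:
                first_content = line
            continue
        if in_fence:
            continue
        if first_content is None and line.strip():
            first_content = line
        if line.startswith("# "):
            h1_count += 1
    if first_content is None or not first_content.startswith("# "):
        problems.append("document does not start with a level 1 title")
    if h1_count != 1:
        problems.append("expected exactly one level 1 title, found %d" % h1_count)
    return problems
```

Both helpers return a list of problem descriptions, so an empty list means the name or document passes these checks.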

Top-level directories

  1. doris-concepts
  2. installing
  3. getting-started
  4. administrator-guide
  5. sql-references
  6. best-practices
  7. internals
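
Taken together with constraint 5, the directory list above is enough to derive an outline from a documentation tree in directory format. The sketch below shows one way to do that, assuming the list order is the intended chapter order; `TOP_LEVEL_DIRS`, `read_title`, `outline`, and `build_outline` are hypothetical names and not part of any existing Doc Builder.

```python
from pathlib import Path

# Order copied from the directory list above.
TOP_LEVEL_DIRS = [
    "doris-concepts",
    "installing",
    "getting-started",
    "administrator-guide",
    "sql-references",
    "best-practices",
    "internals",
]


def read_title(md_file):
    """Return the level 1 title of a Markdown file, or its stem if none is found."""
    for line in md_file.read_text(encoding="utf-8").splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return md_file.stem


def outline(directory, depth=1):
    """Return (level, title) pairs for one directory and everything below it."""
    readme = directory / "readme.md"
    # Constraint 5: the directory's title comes from its readme.md.
    title = read_title(readme) if readme.is_file() else directory.name
    entries = [(depth, title)]
    for child in sorted(directory.iterdir()):
        if child.is_dir():
            entries.extend(outline(child, depth + 1))
        elif child.suffix == ".md" and child.name != "readme.md":
            # Other documents in the directory are its sub-sections.
            entries.append((depth + 1, read_title(child)))
    return entries


def build_outline(lang_root):
    """Walk the known top-level directories of one language root, in order."""
    entries = []
    for name in TOP_LEVEL_DIRS:
        directory = Path(lang_root) / name
        if directory.is_dir():
            entries.extend(outline(directory))
    return entries
```

For example, `build_outline("docs/documentation/cn")` would return the outline of the Chinese documentation, skipping any top-level directory that does not exist yet.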