Move spec of experimental RTP header extensions to source repository

The specifications of experimental RTP header extensions have previously
been located on GitHub. Move the specs here and follow up by redirecting
the new website to this location, so that existing URLs of the form
webrtc.org/experiments/rtp-hdrext continue to work.

Bug: webrtc:11335
Change-Id: I7735e259a7dd6cd2fa7bbc09fa3c0ff460057e52
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/168126
Commit-Queue: Johannes Kron <kron@webrtc.org>
Reviewed-by: Mirko Bonadei <mbonadei@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#30447}
Johannes Kron
2020-02-03 14:16:58 +01:00
committed by Commit Bot
parent f2be3eff26
commit 73ff1ffd0f
10 changed files with 475 additions and 0 deletions

@ -33,8 +33,10 @@ bugs found in native code.
* [Development][webrtc-development]
* [Android][webtc-android-development]
* [iOS][webrtc-ios-development]
* [Experimental RTP header extensions][rtp_hdrext]
[webrtc-prerequitite-sw]: https://webrtc.googlesource.com/src/+/refs/heads/master/docs/native-code/development/prerequisite-sw/index.md
[webrtc-development]: https://webrtc.googlesource.com/src/+/refs/heads/master/docs/native-code/development/index.md
[webtc-android-development]: https://webrtc.googlesource.com/src/+/refs/heads/master/docs/native-code/android/index.md
[webrtc-ios-development]: https://webrtc.googlesource.com/src/+/refs/heads/master/docs/native-code/ios/index.md
[rtp_hdrext]: https://webrtc.googlesource.com/src/+/refs/heads/master/docs/native-code/rtp_hdrext/index.md

@ -0,0 +1,119 @@
The Absolute Capture Time extension is used to stamp RTP packets with an NTP
timestamp showing when the first audio or video frame in a packet was originally
captured. The intent of this extension is to provide a way to accomplish
audio-to-video synchronization when RTCP-terminating intermediate systems (e.g.
mixers) are involved.
**Name:**
"Absolute Capture Time"; "RTP Header Extension for Absolute Capture Time"
**Formal name:**
<http://www.webrtc.org/experiments/rtp-hdrext/abs-capture-time>
**Status:**
This extension is defined here to allow for experimentation. Once experience has
shown that it is useful, we intend to make a proposal based on it for
standardization in the IETF.
Contact <chxg@google.com> for more info.
## RTP header extension format
### Data layout overview
Data layout of the shortened version of `abs-capture-time` with a 1-byte header
\+ 8 bytes of data:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   | len=7 |     absolute capture timestamp (bit 0-23)     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |             absolute capture timestamp (bit 24-55)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ... (56-63)  |
    +-+-+-+-+-+-+-+-+

Data layout of the extended version of `abs-capture-time` with a 1-byte header +
16 bytes of data:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   | len=15|     absolute capture timestamp (bit 0-23)     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |             absolute capture timestamp (bit 24-55)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ... (56-63)  |   estimated capture clock offset (bit 0-23)   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           estimated capture clock offset (bit 24-55)          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ... (56-63)  |
    +-+-+-+-+-+-+-+-+

### Data layout details
#### Absolute capture timestamp
Absolute capture timestamp is the NTP timestamp of when the first frame in a
packet was originally captured. This timestamp MUST be based on the same clock
as the clock used to generate NTP timestamps for RTCP sender reports on the
capture system.
It's not always possible to do an NTP clock readout at the exact moment a media
frame is captured. A capture system MAY postpone the readout until a
more convenient time. A capture system SHOULD have known delays (e.g. from
hardware buffers) subtracted from the readout to make the final timestamp as
close to the actual capture time as possible.
This field is encoded as a 64-bit unsigned fixed-point number with the high 32
bits for the timestamp in seconds and low 32 bits for the fractional part. This
is also known as the UQ32.32 format and is what the RTP specification defines as
the canonical format to represent NTP timestamps.
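
As a rough illustration of the UQ32.32 encoding described above, the following Python sketch packs and unpacks the field. The helper names are hypothetical, and network (big-endian) byte order is an assumption, since this section does not state the byte order explicitly.

```python
import struct
from typing import Tuple


def pack_ntp_uq32_32(seconds: int, fraction: int) -> bytes:
    """Pack whole seconds and a 32-bit binary fraction into 8 bytes (UQ32.32)."""
    # Assumption: network byte order (big-endian).
    return struct.pack("!II", seconds, fraction)


def unpack_ntp_uq32_32(data: bytes) -> Tuple[int, int]:
    """Return (seconds, fraction) from an 8-byte UQ32.32 field."""
    seconds, fraction = struct.unpack("!II", data)
    return seconds, fraction


def uq32_32_to_seconds(seconds: int, fraction: int) -> float:
    """Convert to floating-point seconds (loses precision for large values)."""
    return seconds + fraction / (1 << 32)
```

Keeping the seconds and fraction as integers avoids the rounding that a floating-point representation would introduce for realistic NTP values.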
#### Estimated capture clock offset
Estimated capture clock offset is the sender's estimate of the offset between
its own NTP clock and the capture system's NTP clock. The sender is here defined
as the system that owns the NTP clock used to generate the NTP timestamps for
the RTCP sender reports on this stream. The sender system is typically either
the capture system or a mixer.
This field is encoded as a 64-bit two’s complement **signed** fixed-point number
with the high 32 bits for the seconds and low 32 bits for the fractional part.
It’s intended to make it easy for a receiver, that knows how to estimate the
sender system’s NTP clock, to also estimate the capture system’s NTP clock:
Capture NTP Clock = Sender NTP Clock + Capture Clock Offset
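
The formula above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the payload is assumed to be in network byte order, and all values are kept as raw 64-bit fixed-point integers.

```python
import struct

MASK64 = (1 << 64) - 1


def parse_abs_capture_time(payload: bytes):
    """Return (capture_timestamp, clock_offset) as raw fixed-point integers.

    clock_offset is None for the shortened 8-byte form.
    """
    if len(payload) == 8:
        (timestamp,) = struct.unpack("!Q", payload)
        return timestamp, None
    if len(payload) == 16:
        # "q" reads the offset as a two's complement signed 64-bit integer.
        timestamp, offset = struct.unpack("!Qq", payload)
        return timestamp, offset
    raise ValueError("abs-capture-time payload must be 8 or 16 bytes")


def capture_ntp_clock(sender_ntp_clock: int, clock_offset: int) -> int:
    """Capture NTP Clock = Sender NTP Clock + Capture Clock Offset (UQ32.32)."""
    return (sender_ntp_clock + clock_offset) & MASK64
```

For example, a sender clock reading of 10 s combined with an offset of -0.5 s yields a capture clock reading of 9.5 s.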
### Further details
#### Capture system
A receiver MUST treat the first CSRC in the CSRC list of a received packet as if
it belongs to the capture system. If the CSRC list is empty, then the receiver
MUST treat the SSRC as if it belongs to the capture system. Mixers SHOULD put
the most prominent CSRC as the first CSRC in a packet’s CSRC list.
#### Intermediate systems
An intermediate system (e.g. mixer) MAY adjust these timestamps as needed. It
MAY also choose to rewrite the timestamps completely, using its own NTP clock as
reference clock, if it wants to present itself as a capture system for A/V-sync
purposes.
#### Timestamp interpolation
A sender SHOULD save bandwidth by not sending `abs-capture-time` with every
RTP packet. It SHOULD still send them at regular intervals (e.g. every second)
to help mitigate the impact of clock drift and packet loss. Mixers SHOULD always
send `abs-capture-time` with the first RTP packet after changing capture system.
A receiver SHOULD memorize the capture system (i.e. CSRC/SSRC), capture
timestamp, and RTP timestamp of the most recently received `abs-capture-time`
packet on each received stream. It can then use that information, in combination
with RTP timestamps of packets without `abs-capture-time`, to extrapolate
missing capture timestamps.
Timestamp interpolation works fine as long as there’s reasonably low NTP/RTP
clock drift. This is not always true. A sender that detects a "jump" between its
NTP and RTP clock mappings SHOULD send `abs-capture-time` with the first RTP
packet after the jump.
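
The receiver-side extrapolation described in this section might look roughly like the Python sketch below. The function and parameter names are hypothetical, and the RTP clock rate (for example 90000 Hz for video) is assumed to be known from signaling.

```python
def extrapolate_capture_time(
    last_capture_ntp: int,
    last_rtp_timestamp: int,
    rtp_timestamp: int,
    rtp_clock_rate_hz: int,
) -> int:
    """Estimate a UQ32.32 capture timestamp for a packet without the extension.

    last_capture_ntp and last_rtp_timestamp come from the most recent packet
    on the same stream (same CSRC/SSRC) that carried abs-capture-time.
    """
    # Interpret the 32-bit RTP timestamp difference as signed, so that
    # wraparound and slightly older packets are both handled.
    delta = (rtp_timestamp - last_rtp_timestamp) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 1 << 32
    # Convert RTP ticks to UQ32.32 seconds and add to the known timestamp.
    return last_capture_ntp + (delta << 32) // rtp_clock_rate_hz
```

The result is only as good as the assumed RTP clock rate, which is why periodic retransmission of the extension is still needed to bound drift.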

@ -0,0 +1,29 @@
The Absolute Send Time extension is used to stamp RTP packets with a timestamp
showing the departure time from the system that put this packet on the wire
(or as close to this as we can manage). Contact <solenberg@google.com> for
more info.
Name: "Absolute Sender Time" ; "RTP Header Extension for Absolute Sender Time"
Formal name: <http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time>
SDP "a= name": "abs-send-time" ; this is also used in client/cloud signaling.
Not unlike [RTP with TFRC](http://tools.ietf.org/html/draft-ietf-avt-tfrc-profile-10#section-5)
Wire format: 1-byte extension, 3 bytes of data. Total 4 bytes extra per packet
(plus shared 4 bytes for all extensions present: 2 byte magic word 0xBEDE, 2
byte length of the extension block in 32-bit words). Will in practice replace
the "toffset" extension so we should see no long term increase in traffic as a
result.
Encoding: Timestamp is in seconds, 24 bit 6.18 fixed point, yielding 64s
wraparound and 3.8us resolution (one increment for each 477 bytes going out on
a 1Gbps interface).
Relation to NTP timestamps: abs_send_time_24 = (ntp_timestamp_64 >> 14) &
0x00ffffff ; NTP timestamp is 32 bits for whole seconds, 32 bits fraction of
second.
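
A minimal Python sketch of this conversion is shown below. The function names are hypothetical, and writing the 24-bit value in big-endian order is an assumption, since the byte order is not stated here.

```python
def ntp_to_abs_send_time(ntp_timestamp_64: int) -> int:
    """Convert a 64-bit NTP timestamp (32.32) to the 24-bit 6.18 format."""
    # Dropping 14 fraction bits leaves 18; the mask keeps 6 bits of seconds.
    return (ntp_timestamp_64 >> 14) & 0x00FFFFFF


def pack_abs_send_time(abs_send_time_24: int) -> bytes:
    """Serialize the 24-bit value into the 3 data bytes (assumed big-endian)."""
    return abs_send_time_24.to_bytes(3, "big")


def abs_send_time_to_seconds(abs_send_time_24: int) -> float:
    """Interpret the 6.18 fixed-point value as seconds modulo 64."""
    return abs_send_time_24 / (1 << 18)
```

Because only 6 bits of whole seconds survive, a receiver has to compare values modulo 64 seconds.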
Notes: Packets are time stamped when going out, preferably close to metal.
Intermediate RTP relays (entities possibly altering the stream) should remove
the extension or set their own timestamp.

@ -0,0 +1,86 @@
The color space extension is used to communicate color space information and
optionally also metadata that is needed in order to properly render a high
dynamic range (HDR) video stream. Contact <kron@google.com> for more info.
**Name:** "Color space" ; "RTP Header Extension for color space"
**Formal name:** <http://www.webrtc.org/experiments/rtp-hdrext/color-space>
**Status:** This extension is defined here to allow for experimentation. Once experience
has shown that it is useful, we intend to make a proposal based on it for standardization
in the IETF.
## RTP header extension format
### Data layout overview
Data layout without HDR metadata (one-byte RTP header extension)
1-byte header + 4 bytes of data:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   | L = 3 |   primaries   |   transfer    |    matrix     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |range+chr.sit. |
    +-+-+-+-+-+-+-+-+

Data layout of color space with HDR metadata (two-byte RTP header extension)
2-byte header + 28 bytes of data:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      ID       |   length=28   |   primaries   |   transfer    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    matrix     |range+chr.sit. |         luminance_max         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         luminance_min         |            mastering_metadata.|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |primary_r.x and .y             |            mastering_metadata.|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |primary_g.x and .y             |            mastering_metadata.|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |primary_b.x and .y             |            mastering_metadata.|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |white.x and .y                 |    max_content_light_level    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | max_frame_average_light_level |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

### Data layout details
The data is written in the following order,
Color space information (4 bytes):
* Color primaries value according to ITU-T H.273 Table 2.
* Transfer characteristic value according to ITU-T H.273 Table 3.
* Matrix coefficients value according to ITU-T H.273 Table 4.
* Range and chroma siting as specified at
https://www.webmproject.org/docs/container/#colour. Range (range), horizontal (horz)
and vertical (vert) siting are merged to one byte by the operation: (range << 4) +
(horz << 2) + vert.
The extension may optionally include HDR metadata written in the following order,
Mastering metadata (20 bytes):
* Luminance max, specified in nits, where 1 nit = 1 cd/m<sup>2</sup>.
(16-bit unsigned integer)
* Luminance min, scaled by a factor of 10000 and specified in the unit 1/10000
nits. (16-bit unsigned integer)
* CIE 1931 xy chromaticity coordinates of the primary red, scaled by a factor of 50000.
(2x 16-bit unsigned integers)
* CIE 1931 xy chromaticity coordinates of the primary green, scaled by a factor of 50000.
(2x 16-bit unsigned integers)
* CIE 1931 xy chromaticity coordinates of the primary blue, scaled by a factor of 50000.
(2x 16-bit unsigned integers)
* CIE 1931 xy chromaticity coordinates of the white point, scaled by a factor of 50000.
(2x 16-bit unsigned integers)
Followed by max light levels (4 bytes):
* Max content light level, specified in nits. (16-bit unsigned integer)
* Max frame average light level, specified in nits. (16-bit unsigned integer)
Note, the byte order for all integers is big endian.
See the standard SMPTE ST 2086 for more information about these entities.
Notes: Extension should be present only in the last packet of video frames. If attached
to other packets it should be ignored.
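
The data layout above could be serialized along the lines of the following Python sketch. The function name and the shape of the `hdr` dictionary are hypothetical, and the numeric code points in the usage note are illustrative only; real values come from ITU-T H.273 and the WebM colour documentation.

```python
import struct


def pack_color_space(primaries, transfer, matrix, range_, horz, vert, hdr=None):
    """Build the 4-byte (or 28-byte, with HDR metadata) extension payload."""
    # Range and chroma siting are merged into one byte.
    range_and_siting = (range_ << 4) + (horz << 2) + vert
    data = struct.pack("!BBBB", primaries, transfer, matrix, range_and_siting)
    if hdr is not None:
        # Eight xy coordinates in order: red, green, blue, white point.
        chromaticities = [round(c * 50000) for c in hdr["primaries_and_white_xy"]]
        data += struct.pack(
            "!2H8H2H",  # all integers are 16-bit unsigned, big endian
            hdr["luminance_max"],                 # nits
            round(hdr["luminance_min"] * 10000),  # units of 1/10000 nits
            *chromaticities,
            hdr["max_content_light_level"],
            hdr["max_frame_average_light_level"],
        )
    return data
```

Calling `pack_color_space(9, 16, 9, 1, 1, 2)` produces the 4-byte form, while passing an `hdr` dictionary appends the 24 bytes of mastering metadata and light levels.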

@ -0,0 +1,55 @@
**Name:** "Inband Comfort Noise" ; "RTP Header Extension to signal inband comfort noise"
**Formal name:** <http://www.webrtc.org/experiments/rtp-hdrext/inband-cn>
**Status:** This extension is defined here to allow for experimentation. Once experience has shown that it is useful, we intend to make a proposal based on it for standardization in the IETF.
## Introduction
Comfort noise \(CN\) is widely used in real time communication, as it significantly reduces the frequency of RTP packets, and thus saves network bandwidth, when participants in the communication are not constantly speaking.
One way of deploying CN is through \[RFC 3389\]. It defines CN as a special payload, which needs to be encoded and decoded independently from the codec\(s\) applied to active speech signals. This deployment is referred to as outband CN in this context.
Some codecs, for example RFC 6716: Definition of the Opus Audio Codec, implement their own CN schemes. Basically, the encoder can notify that a CN packet is issued and/or no packet needs to be transmitted.
Since CN packets have their particularities, cloud and client may need to identify them and treat them differently. Special treatments on CN packets include but are not limited to
* Upon receiving multiple streams of CN packets, choose only one to relay or mix.
* Adapt jitter buffer wisely according to the discontinuous transmission nature of CN packets.
While RTP packets that contain outband CN can be easily identified as they bear a different payload type, inband CN cannot. Some codecs may be able to extract the information by decoding the packet, but that depends on the codec implementation, and decoding packets is not always feasible. This document proposes using an RTP header extension to signal the inband CN.
## RTP header extension format
The inband CN extension can be encoded using either the one-byte or two-byte header defined in \[RFC 5285\]. Figures 1 and 2 show encodings with each of these header formats.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   | len=0 |N| noise level |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1. Encoding Using the One-Byte Header Format

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      ID       |     len=1     |N| noise level |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2. Encoding Using the Two-Byte Header Format
The noise level is optional. The bit "N" being 1 indicates that a noise level is present. The noise level is defined the same way as the audio level in \[RFC 6464\] and therefore makes the Audio Level Header Extension unnecessary on the same RTP packet. This also means that this level is defined the same way as the noise level in \[RFC 3389\] and can therefore be compared against outband CN.
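
The single data byte shown in Figures 1 and 2 could be built and parsed as in the Python sketch below. The function names are hypothetical, and leaving the 7 level bits at zero when N is 0 is an assumption.

```python
from typing import Optional


def pack_inband_cn(noise_level: Optional[int] = None) -> bytes:
    """Build the data byte: N flag in the top bit, 7-bit noise level below it."""
    if noise_level is None:
        # N = 0: no noise level present (remaining bits assumed to be zero).
        return bytes([0x00])
    if not 0 <= noise_level <= 0x7F:
        raise ValueError("noise level must fit in 7 bits")
    return bytes([0x80 | noise_level])


def unpack_inband_cn(payload: bytes) -> Optional[int]:
    """Return the noise level, or None if the N bit is not set."""
    byte = payload[0]
    return (byte & 0x7F) if byte & 0x80 else None
```

A receiver that only needs to know whether a packet is comfort noise can ignore the payload and simply test for the presence of the extension.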
## Further details
The existence of this header extension in an RTP packet indicates that it has inband CN, and therefore it will be used sparsely, and results in very small transmission cost.
The end receiver can utilize this RTP header extension to get notified about an upcoming discontinuous transmission. This can be useful for its jitter buffer management. Because this RTP header extension signals comfort noise, an audio mixer can also use it to mix streams wisely. As an example, it can avoid mixing multiple comfort noises together.
The cloud may benefit from this RTP header extension as an end receiver, if it does transcoding. It may also utilize this RTP header extension to prioritize RTP packets if it does packet filtering. In both cases, this RTP header extension should not be encrypted.
## References
* \[RFC 3389\] Zopf, R., "Real-time Transport Protocol \(RTP\) Payload for Comfort Noise \(CN\)", RFC 3389, September 2002.
* \[RFC 6464\] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time Transport Protocol \(RTP\) Header Extension for Client-to-Mixer Audio Level Indication", RFC 6464, December 2011.
* \[RFC 5285\] Singer, D. and H. Desineni, "A General Mechanism for RTP Header Extensions", RFC 5285, July 2008.

@ -0,0 +1,10 @@
The following subpages define experimental RTP header extensions:
* [abs-send-time](abs-send-time)
* [abs-capture-time](abs-capture-time)
* [color-space](color-space)
* [playout-delay](playout-delay)
* [transport-wide-cc-02](transport-wide-cc-02)
* [video-content-type](video-content-type)
* [video-timing](video-timing)
* [inband-cn](inband-cn)

@ -0,0 +1,52 @@
**Name:** "Playout Delay" ; "RTP Header Extension to control Playout Delay"
**Formal name:** <http://www.webrtc.org/experiments/rtp-hdrext/playout-delay>
**SDP "a= name":** "playout-delay" ; this is also used in client/cloud signaling.
**Status:** This extension is defined here to allow for experimentation. Once experience
has shown that it is useful, we intend to make a proposal based on it for standardization
in the IETF.
## Introduction
On WebRTC, the RTP receiver continuously measures inter-packet delay and evaluates packet jitter. Besides this, an estimated delay for decode and render at the receiver is computed. The jitter buffer, the local time extrapolation and the predicted render time (based on predicted decode and render time) impact the delay on a frame before it is rendered at the receiver.
This document proposes an RTP extension to enable the RTP sender to try and limit the amount of playout delay at the receiver in a certain range. A minimum and maximum delay from the sender provides guidance on the range over which the receiver can smooth out rendering.
Thus, this extension aims to provide the sender’s intent to the receiver on how quickly a frame needs to be rendered.
The following use cases are addressed by this extension:
* Interactive streaming (gaming, remote access): Interactive streaming is highly sensitive to end-to-end latency and any delay in render impacts the end-user experience. These use cases prioritize reducing delay over any smoothing done at the receiver. In these cases, the RTP sender would like to disable all smoothing at receiver (min delay = max delay = 0)
* Movie playback: In some scenarios, the user prefers smooth playback and adaptive delay impacts end-user experience (audio can speed up and slow down). In these cases the sender would like to have a fixed delay at all times (min delay = max delay = K)
* Interactive communication: These are the scenarios where the receiver is best suited to adjust the delay adaptively to minimize latency and at the same time add some smoothing based on jitter prevalent due to network conditions (min delay = K1, max delay = K2)
## MIN and MAX playout delay
The playout delay on a frame represents the amount of delay added to a frame from the time it is captured at the sender to the time it is expected to be rendered at the receiver. Thus playout delay is essentially:
Playout delay = ExpectedRenderTime(frame) - ExpectedCaptureTime(frame)
MIN and MAX playout delay in turn represent the minimum and maximum delay that can be seen on a frame. This restriction range is best effort. The receiver is expected to try and meet the range as best as it can.
A value of 0, for example, is meaningless from the perspective of actually meeting the suggested delay, but it indicates to the receiver that the frame should be rendered as soon as possible. It is up to the receiver to decide how to handle a frame when it arrives too late (i.e., whether to simply drop it or hand it over for rendering as soon as possible).
## RTP header extension format

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   | len=2 |       MIN delay       |       MAX delay       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

12 bits for Minimum and Maximum delay. This represents a range of 0 - 40950 milliseconds for minimum and maximum (with a granularity of 10 ms). A granularity of 10 ms is sufficient since we expect the following typical use cases:
* 0 ms: Certain gaming scenarios (likely without audio) where we will want to play the frame as soon as possible. Also, for remote desktop without audio where rendering a frame asap makes sense
* 100/150/200 ms: These could be the max target latency for interactive streaming use cases depending on the actual application (gaming, remoting with audio, interactive scenarios)
* 400 ms: Applications that want to ensure a network glitch has very little chance of causing a freeze can start with a minimum delay target that is high enough to deal with network issues. Video streaming is one example.
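
The two 12-bit fields with 10 ms granularity described above might be packed into the 3 data bytes as in this Python sketch. The function names are hypothetical, and rounding millisecond values down to the nearest 10 ms is an assumption.

```python
GRANULARITY_MS = 10
MAX_UNITS = 0xFFF  # 12 bits, so at most 40950 ms


def pack_playout_delay(min_ms: int, max_ms: int) -> bytes:
    """Pack MIN and MAX delay as two consecutive 12-bit values (3 bytes)."""
    min_units = min_ms // GRANULARITY_MS
    max_units = max_ms // GRANULARITY_MS
    if not 0 <= min_units <= max_units <= MAX_UNITS:
        raise ValueError("expected 0 <= min <= max <= 40950 ms")
    return ((min_units << 12) | max_units).to_bytes(3, "big")


def unpack_playout_delay(data: bytes):
    """Return (min_ms, max_ms) from the 3 data bytes."""
    value = int.from_bytes(data, "big")
    return (value >> 12) * GRANULARITY_MS, (value & MAX_UNITS) * GRANULARITY_MS
```

For the interactive streaming case, `pack_playout_delay(0, 0)` yields three zero bytes.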
The header is attached to the RTP packet by the RTP sender when it needs to change the min and max smoothing delay at the receiver. Once the sender is informed that at least one RTP packet which has the min and max details is delivered, it MAY stop providing details on all further RTP packets until another change warrants communicating the details to the receiver again. This is done as follows:
RTCP feedback to the RTP sender includes the highest sequence number that was seen on the RTP receiver. The RTP sender can track the sequence number of the packet that first carried the playout delay extension, and then stop sending the extension once the received sequence number is greater than the sequence number of the first packet containing the current playout delay values.
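
The sender-side bookkeeping described above could be sketched in Python as follows. The class and method names are hypothetical, and comparing 16-bit sequence numbers with wraparound is a simplifying assumption (RTCP reports carry an extended sequence number).

```python
def seq_newer(a: int, b: int) -> bool:
    """True if 16-bit sequence number a is newer than b, allowing wraparound."""
    return 0 < ((a - b) & 0xFFFF) < 0x8000


class PlayoutDelaySender:
    """Decides whether an outgoing packet should carry the extension."""

    def __init__(self):
        self.delay = None
        self._first_seq = None  # first packet carrying the current values
        self._pending = False   # True while the extension must still be sent

    def set_delay(self, min_ms: int, max_ms: int) -> None:
        """Record new values; they are attached until acknowledged."""
        self.delay = (min_ms, max_ms)
        self._first_seq = None
        self._pending = True

    def should_attach(self, seq: int) -> bool:
        """Call for each outgoing packet with its sequence number."""
        if self._pending and self._first_seq is None:
            self._first_seq = seq
        return self._pending

    def on_highest_seq_received(self, highest_seq: int) -> None:
        """Call with the highest sequence number reported via RTCP."""
        if self._first_seq is not None and seq_newer(highest_seq, self._first_seq):
            self._pending = False
```

Each call to `set_delay` restarts the cycle, so a later change to the values is again attached until the receiver reports a newer sequence number.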

@ -0,0 +1,60 @@
This RTP header extension is an extended version of the extension defined in
<https://tools.ietf.org/html/draft-holmer-rmcat-transport-wide-cc-extensions-01>
**Name:** "Transport-wide congestion control 02"
**Formal name:**
<http://www.webrtc.org/experiments/rtp-hdrext/transport-wide-cc-02>
**Status:** This extension is defined here to allow for experimentation. Once
experience has shown that it is useful, we intend to make a proposal based on
it for standardization in the IETF.
The original extension defines a transport-wide sequence number that is used in
feedback packets for congestion control. The original implementation sends these
feedback packets at a periodic interval. The extended version presented here has
two changes compared to the original version:
* Feedback is sent only on request by the sender. Therefore, the extension has
two optional bytes that signal that a feedback packet is requested.
* The sender determines if timing information should be included or not in the
feedback packet. The original version always includes timing information.
Contact <kron@google.com> or <sprang@google.com> for more info.
## RTP header extension format
### Data layout overview
Data layout of transport-wide sequence number
1-byte header + 2 bytes of data:

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   |  L=1  |transport-wide sequence number |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Data layout of transport-wide sequence number and optional feedback request
1-byte header + 4 bytes of data:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  ID   |  L=3  |transport-wide sequence number |T|  seq count  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |seq count cont.|
    +-+-+-+-+-+-+-+-+

### Data layout details
The data is written in the following order,
* transport-wide sequence number (16-bit unsigned integer)
* feedback request (optional) (16-bit unsigned integer)<br>
If the extension contains two extra bytes for feedback request, this means
that a feedback packet should be generated and sent immediately. The feedback
request consists of a one-bit field giving the flag value T and a 15-bit
field giving the sequence count as an unsigned number.
- If the bit T is set the feedback packet must contain timing information.
- seq count specifies how many packets of history that should be included in
the feedback packet. If seq count is zero no feedback should be
generated, which is equivalent to sending the two-byte extension above.
This is added as an option to allow for a fixed packet header size.
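
The two data layouts above could be produced and parsed as in the following Python sketch. The function names are hypothetical, and big-endian byte order is an assumption inferred from the bit diagram, in which T is the first bit of the feedback request.

```python
import struct


def pack_transport_wide_cc(seq: int, feedback_request=None) -> bytes:
    """Pack the sequence number and, optionally, a feedback request.

    feedback_request is None or a tuple (include_timing, seq_count).
    """
    data = struct.pack("!H", seq & 0xFFFF)
    if feedback_request is not None:
        include_timing, seq_count = feedback_request
        if not 0 <= seq_count <= 0x7FFF:
            raise ValueError("seq count must fit in 15 bits")
        # T flag in the top bit, 15-bit sequence count below it.
        data += struct.pack("!H", (int(include_timing) << 15) | seq_count)
    return data


def unpack_transport_wide_cc(data: bytes):
    """Return (seq, feedback_request) where feedback_request may be None."""
    (seq,) = struct.unpack("!H", data[:2])
    if len(data) < 4:
        return seq, None
    (request,) = struct.unpack("!H", data[2:4])
    return seq, (bool(request >> 15), request & 0x7FFF)
```

A sender that wants a fixed header size can always send the 4-byte form and pass a seq count of zero when no feedback is wanted.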

@ -0,0 +1,22 @@
The Video Content Type extension is used to communicate a video content type
from the sender to the receiver of an RTP video stream. Contact
<ilnik@google.com> for more info.
Name: "Video Content Type" ; "RTP Header Extension for Video Content Type"
Formal name: <http://www.webrtc.org/experiments/rtp-hdrext/video-content-type>
SDP "a= name": "video-content-type" ; this is also used in client/cloud signaling.
Wire format: 1-byte extension, 1 byte of data. Total 2 bytes extra per packet
(plus shared 4 bytes for all extensions present: 2 byte magic word 0xBEDE, 2
byte length of the extension block in 32-bit words).
Values:
* 0x00: *Unspecified*. Default value. Treated the same as an absence of an extension.
* 0x01: *Screenshare*. Video stream is of a screenshare type.
Notes: Extension should be present only in the last packet of key-frames. If
attached to other packets it should be ignored. If extension is absent,
*Unspecified* value is assumed.

@ -0,0 +1,40 @@
The Video Timing extension is used to communicate timing information on a
per-frame basis to the receiver of an RTP video stream. Contact
<ilnik@google.com> for more info. It may be generalized to audio frames as well
in the future.
Name: "Video Timing" ; "RTP Header Extension for Video timing"
Formal name: <http://www.webrtc.org/experiments/rtp-hdrext/video-timing>
SDP "a= name": "video-timing" ; this is also used in client/cloud signaling.
Wire format: 1-byte extension, 13 bytes of data. Total 14 bytes extra per packet
(plus 1-3 padding bytes in some cases, plus shared 4 bytes for all extensions
present: 2 byte magic word 0xBEDE, 2 byte length of the extension block in
32-bit words).
First byte is a flags field. Defined flags:
* 0x01 - extension is set due to timer.
* 0x02 - extension is set because the frame is larger than usual.
Both flags may be set at the same time. All remaining 6 bits are reserved and
should be ignored.
Next, 6 timestamps are stored as 16-bit values in big-endian order, representing
delta from the capture time of a packet in ms.
Timestamps are, in order:
* Encode start.
* Encode finish.
* Packetization complete.
* Last packet left the pacer.
* Reserved for network.
* Reserved for network (2).
Pacer timestamp should be updated inside the RTP packet by the pacer component
when the last packet (containing the extension) is sent to the network. The last
two, reserved timestamps are not set by the sender but are reserved in the
packet for any in-network RTP stream processor to modify.
Notes: Extension should be present only in the last packet of video frames. If
attached to other packets it should be ignored.
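
As an illustration of the 13-byte wire format described above, the Python sketch below packs the flags byte followed by the six 16-bit deltas. The function and constant names are hypothetical.

```python
import struct

FLAG_TRIGGERED_BY_TIMER = 0x01
FLAG_TRIGGERED_BY_SIZE = 0x02


def pack_video_timing(flags: int, deltas_ms) -> bytes:
    """Pack flags and six deltas (ms since capture) into 13 bytes.

    Order of deltas: encode start, encode finish, packetization complete,
    last packet left the pacer, reserved for network, reserved for network (2).
    """
    if len(deltas_ms) != 6:
        raise ValueError("expected exactly 6 timestamps")
    return struct.pack("!B6H", flags, *deltas_ms)


def unpack_video_timing(data: bytes):
    """Return (flags, deltas_ms); reserved flag bits are masked out."""
    flags, *deltas = struct.unpack("!B6H", data)
    return flags & 0x03, deltas
```

An in-network processor that fills in one of the reserved timestamps would only need to overwrite the corresponding two bytes in place.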