draft-ietf-rtcweb-rtp-usage-12.txt | draft-ietf-rtcweb-rtp-usage-13.txt | |||
---|---|---|---|---|
RTCWEB Working Group C. S. Perkins | RTCWEB Working Group C. Perkins | |||
Internet-Draft University of Glasgow | Internet-Draft University of Glasgow | |||
Intended status: Standards Track M. Westerlund | Intended status: Standards Track M. Westerlund | |||
Expires: August 18, 2014 Ericsson | Expires: October 25, 2014 Ericsson | |||
J. Ott | J. Ott | |||
Aalto University | Aalto University | |||
February 14, 2014 | April 23, 2014 | |||
Web Real-Time Communication (WebRTC): Media Transport and Use of RTP | Web Real-Time Communication (WebRTC): Media Transport and Use of RTP | |||
draft-ietf-rtcweb-rtp-usage-12 | draft-ietf-rtcweb-rtp-usage-13 | |||
Abstract | Abstract | |||
The Web Real-Time Communication (WebRTC) framework provides support | The Web Real-Time Communication (WebRTC) framework provides support | |||
for direct interactive rich communication using audio, video, text, | for direct interactive rich communication using audio, video, text, | |||
collaboration, games, etc. between two peers' web-browsers. This | collaboration, games, etc. between two peers' web-browsers. This | |||
memo describes the media transport aspects of the WebRTC framework. | memo describes the media transport aspects of the WebRTC framework. | |||
It specifies how the Real-time Transport Protocol (RTP) is used in | It specifies how the Real-time Transport Protocol (RTP) is used in | |||
the WebRTC context, and gives requirements for which RTP features, | the WebRTC context, and gives requirements for which RTP features, | |||
profiles, and extensions need to be supported. | profiles, and extensions need to be supported. | |||
skipping to change at page 1, line 39 | skipping to change at page 1, line 39 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on August 18, 2014. | This Internet-Draft will expire on October 25, 2014. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2014 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 17 | skipping to change at page 2, line 17 | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
4. WebRTC Use of RTP: Core Protocols . . . . . . . . . . . . . . 5 | 4. WebRTC Use of RTP: Core Protocols . . . . . . . . . . . . . . 5 | |||
4.1. RTP and RTCP . . . . . . . . . . . . . . . . . . . . . . 5 | 4.1. RTP and RTCP . . . . . . . . . . . . . . . . . . . . . . 5 | |||
4.2. Choice of the RTP Profile . . . . . . . . . . . . . . . . 6 | 4.2. Choice of the RTP Profile . . . . . . . . . . . . . . . . 7 | |||
4.3. Choice of RTP Payload Formats . . . . . . . . . . . . . . 7 | 4.3. Choice of RTP Payload Formats . . . . . . . . . . . . . . 7 | |||
4.4. Use of RTP Sessions . . . . . . . . . . . . . . . . . . . 8 | 4.4. Use of RTP Sessions . . . . . . . . . . . . . . . . . . . 9 | |||
4.5. RTP and RTCP Multiplexing . . . . . . . . . . . . . . . . 9 | 4.5. RTP and RTCP Multiplexing . . . . . . . . . . . . . . . . 9 | |||
4.6. Reduced Size RTCP . . . . . . . . . . . . . . . . . . . . 10 | 4.6. Reduced Size RTCP . . . . . . . . . . . . . . . . . . . . 10 | |||
4.7. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . . 10 | 4.7. Symmetric RTP/RTCP . . . . . . . . . . . . . . . . . . . 10 | |||
4.8. Choice of RTP Synchronisation Source (SSRC) . . . . . . . 10 | 4.8. Choice of RTP Synchronisation Source (SSRC) . . . . . . . 11 | |||
4.9. Generation of the RTCP Canonical Name (CNAME) . . . . . . 11 | 4.9. Generation of the RTCP Canonical Name (CNAME) . . . . . . 11 | |||
4.10. Handling of Leap Seconds . . . . . . . . . . . . . . . . 12 | ||||
5. WebRTC Use of RTP: Extensions . . . . . . . . . . . . . . . . 12 | 5. WebRTC Use of RTP: Extensions . . . . . . . . . . . . . . . . 12 | |||
5.1. Conferencing Extensions . . . . . . . . . . . . . . . . . 12 | 5.1. Conferencing Extensions and Topologies . . . . . . . . . 12 | |||
5.1.1. Full Intra Request (FIR) . . . . . . . . . . . . . . 13 | 5.1.1. Full Intra Request (FIR) . . . . . . . . . . . . . . 14 | |||
5.1.2. Picture Loss Indication (PLI) . . . . . . . . . . . . 13 | 5.1.2. Picture Loss Indication (PLI) . . . . . . . . . . . . 14 | |||
5.1.3. Slice Loss Indication (SLI) . . . . . . . . . . . . . 13 | 5.1.3. Slice Loss Indication (SLI) . . . . . . . . . . . . . 14 | |||
5.1.4. Reference Picture Selection Indication (RPSI) . . . . 14 | 5.1.4. Reference Picture Selection Indication (RPSI) . . . . 15 | |||
5.1.5. Temporal-Spatial Trade-off Request (TSTR) . . . . . . 14 | 5.1.5. Temporal-Spatial Trade-off Request (TSTR) . . . . . . 15 | |||
5.1.6. Temporary Maximum Media Stream Bit Rate Request | 5.1.6. Temporary Maximum Media Stream Bit Rate Request | |||
(TMMBR) . . . . . . . . . . . . . . . . . . . . . . . 14 | (TMMBR) . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
5.2. Header Extensions . . . . . . . . . . . . . . . . . . . . 14 | 5.2. Header Extensions . . . . . . . . . . . . . . . . . . . . 16 | |||
5.2.1. Rapid Synchronisation . . . . . . . . . . . . . . . . 15 | 5.2.1. Rapid Synchronisation . . . . . . . . . . . . . . . . 16 | |||
5.2.2. Client-to-Mixer Audio Level . . . . . . . . . . . . . 15 | 5.2.2. Client-to-Mixer Audio Level . . . . . . . . . . . . . 16 | |||
5.2.3. Mixer-to-Client Audio Level . . . . . . . . . . . . . 15 | 5.2.3. Mixer-to-Client Audio Level . . . . . . . . . . . . . 17 | |||
6. WebRTC Use of RTP: Improving Transport Robustness . . . . . . 16 | 6. WebRTC Use of RTP: Improving Transport Robustness . . . . . . 17 | |||
6.1. Negative Acknowledgements and RTP Retransmission . . . . 16 | 6.1. Negative Acknowledgements and RTP Retransmission . . . . 17 | |||
6.2. Forward Error Correction (FEC) . . . . . . . . . . . . . 17 | 6.2. Forward Error Correction (FEC) . . . . . . . . . . . . . 18 | |||
7. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . 17 | 7. WebRTC Use of RTP: Rate Control and Media Adaptation . . . . 19 | |||
7.1. Boundary Conditions and Circuit Breakers . . . . . . . . 18 | 7.1. Boundary Conditions and Circuit Breakers . . . . . . . . 20 | |||
7.2. RTCP Limitations for Congestion Control . . . . . . . . . 19 | 7.2. RTCP Limitations for Congestion Control . . . . . . . . . 20 | |||
7.3. Congestion Control Interoperability and Legacy Systems . 20 | 7.3. Congestion Control Interoperability and Legacy Systems . 22 | |||
8. WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 20 | 8. WebRTC Use of RTP: Performance Monitoring . . . . . . . . . . 23 | |||
9. WebRTC Use of RTP: Future Extensions . . . . . . . . . . . . 21 | 9. WebRTC Use of RTP: Future Extensions . . . . . . . . . . . . 24 | |||
10. Signalling Considerations . . . . . . . . . . . . . . . . . . 21 | 10. Signalling Considerations . . . . . . . . . . . . . . . . . . 24 | |||
11. WebRTC API Considerations . . . . . . . . . . . . . . . . . . 23 | 11. WebRTC API Considerations . . . . . . . . . . . . . . . . . . 25 | |||
12. RTP Implementation Considerations . . . . . . . . . . . . . . 25 | 12. RTP Implementation Considerations . . . . . . . . . . . . . . 28 | |||
12.1. Configuration and Use of RTP Sessions . . . . . . . . . 25 | 12.1. Configuration and Use of RTP Sessions . . . . . . . . . 28 | |||
12.1.1. Use of Multiple Media Flows Within an RTP Session . 25 | 12.1.1. Use of Multiple Media Sources Within an RTP Session 28 | |||
12.1.2. Use of Multiple RTP Sessions . . . . . . . . . . . . 26 | 12.1.2. Use of Multiple RTP Sessions . . . . . . . . . . . . 29 | |||
12.1.3. Differentiated Treatment of Flows . . . . . . . . . 31 | 12.1.3. Differentiated Treatment of RTP Packet Streams . . . 34 | |||
12.2. Source, Flow, and Participant Identification . . . . . . 32 | 12.2. Media Source, RTP Packet Streams, and Participant | |||
12.2.1. Media Streams . . . . . . . . . . . . . . . . . . . 33 | Identification . . . . . . . . . . . . . . . . . . . . . 35 | |||
12.2.2. Media Streams: SSRC Collision Detection . . . . . . 33 | 12.2.1. Media Source . . . . . . . . . . . . . . . . . . . . 36 | |||
12.2.3. Media Synchronisation Context . . . . . . . . . . . 34 | 12.2.2. SSRC Collision Detection . . . . . . . . . . . . . . 36 | |||
13. Security Considerations . . . . . . . . . . . . . . . . . . . 35 | 12.2.3. Media Synchronisation Context . . . . . . . . . . . 37 | |||
14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35 | 13. Security Considerations . . . . . . . . . . . . . . . . . . . 38 | |||
15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 | 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | |||
16. References . . . . . . . . . . . . . . . . . . . . . . . . . 36 | 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 39 | |||
16.1. Normative References . . . . . . . . . . . . . . . . . . 36 | 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
16.2. Informative References . . . . . . . . . . . . . . . . . 39 | 16.1. Normative References . . . . . . . . . . . . . . . . . . 39 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41 | 16.2. Informative References . . . . . . . . . . . . . . . . . 42 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44 | ||||
1. Introduction | 1. Introduction | |||
The Real-time Transport Protocol (RTP) [RFC3550] provides a framework | The Real-time Transport Protocol (RTP) [RFC3550] provides a framework | |||
for delivery of audio and video teleconferencing data and other real- | for delivery of audio and video teleconferencing data and other real- | |||
time media applications. Previous work has defined the RTP protocol, | time media applications. Previous work has defined the RTP protocol, | |||
along with numerous profiles, payload formats, and other extensions. | along with numerous profiles, payload formats, and other extensions. | |||
When combined with appropriate signalling, these form the basis for | When combined with appropriate signalling, these form the basis for | |||
many teleconferencing systems. | many teleconferencing systems. | |||
skipping to change at page 4, line 17 | skipping to change at page 4, line 19 | |||
(Section 13) and IANA considerations (Section 14). | (Section 13) and IANA considerations (Section 14). | |||
2. Rationale | 2. Rationale | |||
The RTP framework comprises the RTP data transfer protocol, the RTP | The RTP framework comprises the RTP data transfer protocol, the RTP | |||
control protocol, and numerous RTP payload formats, profiles, and | control protocol, and numerous RTP payload formats, profiles, and | |||
extensions. This range of add-ons has allowed RTP to meet various | extensions. This range of add-ons has allowed RTP to meet various | |||
needs that were not envisaged by the original protocol designers, and | needs that were not envisaged by the original protocol designers, and | |||
to support many new media encodings, but raises the question of what | to support many new media encodings, but raises the question of what | |||
extensions are to be supported by new implementations. The | extensions are to be supported by new implementations. The | |||
development of the WebRTC framework provides an opportunity for us to | development of the WebRTC framework provides an opportunity to review | |||
review the available RTP features and extensions, and to define a | the available RTP features and extensions, and to define a common | |||
common baseline feature set for all WebRTC implementations of RTP. | baseline feature set for all WebRTC implementations of RTP. This | |||
This builds on the past 20 years development of RTP to mandate the | builds on the past 20 years development of RTP to mandate the use of | |||
use of extensions that have shown widespread utility, while still | extensions that have shown widespread utility, while still remaining | |||
remaining compatible with the wide installed base of RTP | compatible with the wide installed base of RTP implementations where | |||
implementations where possible. | possible. | |||
Other RTP and RTCP extensions not discussed in this document can be | RTP and RTCP extensions that are not discussed in this document can | |||
implemented by WebRTC end-points if they are beneficial for new use | be implemented by WebRTC end-points if they are beneficial for new | |||
cases. However, they are not necessary to address the WebRTC use | use cases. However, they are not necessary to address the WebRTC use | |||
cases and requirements identified to date | cases and requirements identified in | |||
[I-D.ietf-rtcweb-use-cases-and-requirements]. | [I-D.ietf-rtcweb-use-cases-and-requirements]. | |||
While the baseline set of RTP features and extensions defined in this | While the baseline set of RTP features and extensions defined in this | |||
memo is targeted at the requirements of the WebRTC framework, it is | memo is targeted at the requirements of the WebRTC framework, it is | |||
expected to be broadly useful for other conferencing-related uses of | expected to be broadly useful for other conferencing-related uses of | |||
RTP. In particular, it is likely that this set of RTP features and | RTP. In particular, it is likely that this set of RTP features and | |||
extensions will be appropriate for other desktop or mobile video | extensions will be appropriate for other desktop or mobile video | |||
conferencing systems, or for room-based high-quality telepresence | conferencing systems, or for room-based high-quality telepresence | |||
applications. | applications. | |||
3. Terminology | 3. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [RFC2119]. The RFC | document are to be interpreted as described in [RFC2119]. The RFC | |||
2119 interpretation of these key words applies only when written in | 2119 interpretation of these key words applies only when written in | |||
ALL CAPS. Lower- or mixed-case uses of these key words are not to be | ALL CAPS. Lower- or mixed-case uses of these key words are not to be | |||
interpreted as carrying special significance in this memo. | interpreted as carrying special significance in this memo. | |||
We define the following terms: | We define the following additional terms: | |||
RTP Media Stream: A sequence of RTP packets, and associated RTCP | WebRTC MediaStream: The MediaStream concept defined by the W3C in | |||
packets, using a single synchronisation source (SSRC) that | the WebRTC API [W3C.WD-mediacapture-streams-20130903]. | |||
together carries part or all of the content of a specific Media | ||||
Type from a specific sender source within a given RTP session. | ||||
RTP Session: As defined by [RFC3550], the endpoints belonging to the | Transport-layer Flow: A uni-directional flow of transport packets | |||
same RTP Session are those that share a single SSRC space. That | that are identified by having a particular 5-tuple of source IP | |||
is, those endpoints can see an SSRC identifier transmitted by any | address, source port, destination IP address, destination port, | |||
one of the other endpoints. An endpoint can see an SSRC either | and transport protocol used. | |||
directly in RTP and RTCP packets, or as a contributing source | ||||
(CSRC) in RTP packets from a mixer. The RTP Session scope is | ||||
hence decided by the endpoints' network interconnection topology, | ||||
in combination with RTP and RTCP forwarding strategies deployed by | ||||
endpoints and any interconnecting middle nodes. | ||||
WebRTC MediaStream: The MediaStream concept defined by the W3C in | Bi-directional Transport-layer Flow: A bi-directional transport- | |||
the API [W3C.WD-mediacapture-streams-20130903]. | layer flow is a transport-layer flow that is symmetric. That is, | |||
the transport-layer flow in the reverse direction has a 5-tuple | ||||
where the source and destination address and ports are swapped | ||||
compared to the forward path transport-layer flow, and the | ||||
transport protocol is the same. | ||||
Other terms are used according to their definitions from the RTP | This document uses the terminology from | |||
Specification [RFC3550]. | [I-D.ietf-avtext-rtp-grouping-taxonomy]. Other terms are used | |||
according to their definitions from the RTP Specification [RFC3550]. | ||||
We especially note the following frequently used terms: RTP Packet | ||||
Stream, RTP Session, and End-point. | ||||
4. WebRTC Use of RTP: Core Protocols | 4. WebRTC Use of RTP: Core Protocols | |||
The following sections describe the core features of RTP and RTCP | The following sections describe the core features of RTP and RTCP | |||
that need to be implemented, along with the mandated RTP profiles. | that need to be implemented, along with the mandated RTP profiles. | |||
Also described are the core extensions providing essential features | Also described are the core extensions providing essential features | |||
that all WebRTC implementations need to implement to function | that all WebRTC implementations need to implement to function | |||
effectively on today's networks. | effectively on today's networks. | |||
4.1. RTP and RTCP | 4.1. RTP and RTCP | |||
skipping to change at page 6, line 9 | skipping to change at page 6, line 12 | |||
optimisations for multi-SSRC sessions defined in | optimisations for multi-SSRC sessions defined in | |||
[I-D.ietf-avtcore-rtp-multi-stream-optimisation] is RECOMMENDED. | [I-D.ietf-avtcore-rtp-multi-stream-optimisation] is RECOMMENDED. | |||
o Random choice of SSRC on joining a session; collision detection | o Random choice of SSRC on joining a session; collision detection | |||
and resolution for SSRC values (see also Section 4.8). | and resolution for SSRC values (see also Section 4.8). | |||
o Support for reception of RTP data packets containing CSRC lists, | o Support for reception of RTP data packets containing CSRC lists, | |||
as generated by RTP mixers, and RTCP packets relating to CSRCs. | as generated by RTP mixers, and RTCP packets relating to CSRCs. | |||
o Sending correct synchronisation information in the RTCP Sender | o Sending correct synchronisation information in the RTCP Sender | |||
Reports, to allow receivers to implement lip-sync, with support | Reports, to allow receivers to implement lip-synchronisation; | |||
for the rapid RTP synchronisation extensions (see Section 5.2.1) | support for the rapid RTP synchronisation extensions (see | |||
being RECOMMENDED. | Section 5.2.1) is RECOMMENDED. | |||
o Support for multiple synchronisation contexts. Participants that | o Support for multiple synchronisation contexts. Participants that | |||
send multiple simultaneous RTP media streams MAY do so as part of | send multiple simultaneous RTP packet streams SHOULD do so as part | |||
a single synchronisation context, using a single RTCP CNAME for | of a single synchronisation context, using a single RTCP CNAME for | |||
all streams and allowing receivers to play the streams out in a | all streams and allowing receivers to play the streams out in a | |||
synchronised manner, or they MAY use different synchronisation | synchronised manner. For compatibility with potential future | |||
contexts, and hence different RTCP CNAMEs, for some or all of the | versions of this specification, or for interoperability with non- | |||
streams. Receivers MUST support reception of multiple RTCP CNAMEs | WebRTC devices through a gateway, receivers MUST support multiple | |||
from each participant in an RTP session. See also Section 4.9. | synchronisation contexts, indicated by the use of multiple RTCP | |||
CNAMEs in an RTP session. This specification requires the usage | ||||
of a single CNAME when sending RTP Packet Streams in some | ||||
circumstances, see Section 4.9. | ||||
o Support for sending and receiving RTCP SR, RR, SDES, and BYE | o Support for sending and receiving RTCP SR, RR, SDES, and BYE | |||
packet types, with OPTIONAL support for other RTCP packet types | packet types, with OPTIONAL support for other RTCP packet types | |||
unless mandated by other parts of this specification; | unless mandated by other parts of this specification; | |||
implementations MUST ignore unknown RTCP packet types. Note that | implementations MUST ignore unknown RTCP packet types. Note that | |||
additional RTCP Packet types are needed by the RTP/SAVPF Profile | additional RTCP Packet types are used by the RTP/SAVPF Profile | |||
(Section 4.2) and the other RTCP extensions (Section 5). | (Section 4.2) and the other RTCP extensions (Section 5). | |||
o Support for multiple end-points in a single RTP session, and for | o Support for multiple end-points in a single RTP session, and for | |||
scaling the RTCP transmission interval according to the number of | scaling the RTCP transmission interval according to the number of | |||
participants in the session; support for randomised RTCP | participants in the session; support for randomised RTCP | |||
transmission intervals to avoid synchronisation of RTCP reports; | transmission intervals to avoid synchronisation of RTCP reports; | |||
support for RTCP timer reconsideration. | support for RTCP timer reconsideration. | |||
o Support for configuring the RTCP bandwidth as a fraction of the | o Support for configuring the RTCP bandwidth as a fraction of the | |||
media bandwidth, and for configuring the fraction of the RTCP | media bandwidth, and for configuring the fraction of the RTCP | |||
bandwidth allocated to senders, e.g., using the SDP "b=" line. | bandwidth allocated to senders, e.g., using the SDP "b=" line | |||
[RFC4566][RFC3556]. Support for the reduced minimum RTCP | ||||
reporting interval described in Section 6.2 of [RFC3550] is | ||||
RECOMMENDED. | ||||
It is known that a significant number of legacy RTP implementations, | It is known that a significant number of legacy RTP implementations, | |||
especially those targeted at VoIP-only systems, do not support all of | especially those targeted at VoIP-only systems, do not support all of | |||
the above features, and in some cases do not support RTCP at all. | the above features, and in some cases do not support RTCP at all. | |||
Implementers are advised to consider the requirements for graceful | Implementers are advised to consider the requirements for graceful | |||
degradation when interoperating with legacy implementations. | degradation when interoperating with legacy implementations. | |||
Other implementation considerations are discussed in Section 12. | Other implementation considerations are discussed in Section 12. | |||
4.2. Choice of the RTP Profile | 4.2. Choice of the RTP Profile | |||
The complete specification of RTP for a particular application domain | The complete specification of RTP for a particular application domain | |||
requires the choice of an RTP Profile. For WebRTC use, the Extended | requires the choice of an RTP Profile. For WebRTC use, the Extended | |||
Secure RTP Profile for RTCP-Based Feedback (RTP/SAVPF) [RFC5124], as | Secure RTP Profile for RTCP-Based Feedback (RTP/SAVPF) [RFC5124], as | |||
extended by [RFC7007], MUST be implemented. This builds on the basic | extended by [RFC7007], MUST be implemented. The RTP/SAVPF profile is | |||
RTP/AVP profile [RFC3551], the RTP profile for RTCP-based feedback | the combination of basic RTP/AVP profile [RFC3551], the RTP profile | |||
(RTP/AVPF) [RFC4585], and the secure RTP profile (RTP/SAVP) | for RTCP-based feedback (RTP/AVPF) [RFC4585], and the secure RTP | |||
[RFC3711]. | profile (RTP/SAVP) [RFC3711]. | |||
The RTCP-based feedback extensions [RFC4585] are needed for the | The RTCP-based feedback extensions [RFC4585] are needed for the | |||
improved RTCP timer model, that allows more flexible transmission of | improved RTCP timer model. This allows more flexible transmission of | |||
RTCP packets in response to events, rather than strictly according to | RTCP packets in response to events, rather than strictly according to | |||
bandwidth. This is vital for being able to report congestion events. | bandwidth, and is vital for being able to report congestion signals | |||
These extensions also allow saving RTCP bandwidth, and an endpoint | as well as media events. These extensions also allow saving RTCP | |||
will commonly only use the full RTCP bandwidth allocation if there | bandwidth, and an end-point will commonly only use the full RTCP | |||
are many events that require feedback. The timer rules are also | bandwidth allocation if there are many events that require feedback. | |||
needed to make use of the RTP conferencing extensions discussed in | The timer rules are also needed to make use of the RTP conferencing | |||
Section 5.1. | extensions discussed in Section 5.1. | |||
Note: The enhanced RTCP timer model defined in the RTP/AVPF | Note: The enhanced RTCP timer model defined in the RTP/AVPF | |||
profile is backwards compatible with legacy systems that implement | profile is backwards compatible with legacy systems that implement | |||
only the RTP/AVP or RTP/SAVP profile, given some constraints on | only the RTP/AVP or RTP/SAVP profile, given some constraints on | |||
parameter configuration such as the RTCP bandwidth value and "trr- | parameter configuration such as the RTCP bandwidth value and "trr- | |||
int" (the most important factor for interworking with RTP/(S)AVP | int" (the most important factor for interworking with RTP/(S)AVP | |||
end-points via a gateway is to set the trr-int parameter to a | end-points via a gateway is to set the trr-int parameter to a | |||
value representing 4 seconds). | value representing 4 seconds). | |||
The secure RTP (SRTP) profile [RFC3711] is needed to provide media | The secure RTP (SRTP) profile extensions [RFC3711] are needed to | |||
encryption, integrity protection, replay protection and a limited | provide media encryption, integrity protection, replay protection and | |||
form of source authentication. WebRTC implementations MUST NOT send | a limited form of source authentication. WebRTC implementations MUST | |||
packets using the basic RTP/AVP profile or the RTP/AVPF profile; they | NOT send packets using the basic RTP/AVP profile or the RTP/AVPF | |||
MUST employ the full RTP/SAVPF profile to protect all RTP and RTCP | profile; they MUST employ the full RTP/SAVPF profile to protect all | |||
packets that are generated (i.e., implementations MUST use SRTP and | RTP and RTCP packets that are generated (i.e., implementations MUST | |||
SRTCP). The RTP/SAVPF profile MUST be configured using the cipher | use SRTP and SRTCP). The RTP/SAVPF profile MUST be configured using | |||
suites, DTLS-SRTP protection profiles, keying mechanisms, and other | the cipher suites, DTLS-SRTP protection profiles, keying mechanisms, | |||
parameters described in [I-D.ietf-rtcweb-security-arch]. | and other parameters described in [I-D.ietf-rtcweb-security-arch]. | |||
4.3. Choice of RTP Payload Formats | 4.3. Choice of RTP Payload Formats | |||
The set of mandatory to implement codecs and RTP payload formats for | The set of mandatory to implement codecs and RTP payload formats for | |||
WebRTC is not specified in this memo, instead they are defined in | WebRTC is not specified in this memo, instead they are defined in | |||
separate specifications, such as [I-D.ietf-rtcweb-audio]. | separate specifications, such as [I-D.ietf-rtcweb-audio]. | |||
Implementations can support any codec for which an RTP payload format | Implementations can support any codec for which an RTP payload format | |||
and associated signalling is defined. Implementation cannot assume | and associated signalling is defined. Implementation cannot assume | |||
that the other participants in an RTP session understand any RTP | that the other participants in an RTP session understand any RTP | |||
payload format, no matter how common; the mapping between RTP payload | payload format, no matter how common; the mapping between RTP payload | |||
type numbers and specific configurations of particular RTP payload | type numbers and specific configurations of particular RTP payload | |||
formats MUST be agreed before those payload types/formats can be | formats MUST be agreed before those payload types/formats can be | |||
used. In an SDP context, this can be done using the "a=rtpmap:" and | used. In an SDP context, this can be done using the "a=rtpmap:" and | |||
"a=fmtp:" attributes associated with an "m=" line. | "a=fmtp:" attributes associated with an "m=" line, along with any | |||
other SDP attributes needed to configure the RTP payload format. | ||||
Endpoints can signal support for multiple RTP payload formats, or | End-points can signal support for multiple RTP payload formats, or | |||
multiple configurations of a single RTP payload format, as long as | multiple configurations of a single RTP payload format, as long as | |||
each unique RTP payload format configuration uses a different RTP | each unique RTP payload format configuration uses a different RTP | |||
payload type number. As outlined in Section 4.8, the RTP payload | payload type number. As outlined in Section 4.8, the RTP payload | |||
type number is sometimes used to associate an RTP media stream with a | type number is sometimes used to associate an RTP packet stream with | |||
signalling context. This association is possible provided unique RTP | a signalling context. This association is possible provided unique | |||
payload type numbers are used in each context. For example, an RTP | RTP payload type numbers are used in each context. For example, an | |||
media stream can be associated with an SDP "m=" line by comparing the | RTP packet stream can be associated with an SDP "m=" line by | |||
RTP payload type numbers used by the media stream with payload types | comparing the RTP payload type numbers used by the RTP packet stream | |||
signalled in the "a=rtpmap:" lines in the media sections of the SDP. | with payload types signalled in the "a=rtpmap:" lines in the media | |||
If RTP media streams are being associated with signalling contexts | sections of the SDP. If RTP packet streams are being associated with | |||
based on the RTP payload type, then the assignment of RTP payload | signalling contexts based on the RTP payload type, then the | |||
type numbers MUST be unique across signalling contexts; if the same | assignment of RTP payload type numbers MUST be unique across | |||
RTP payload format configuration is used in multiple contexts, then a | signalling contexts; if the same RTP payload format configuration is | |||
different RTP payload type number has to be assigned in each context | used in multiple contexts, then a different RTP payload type number | |||
to ensure uniqueness. If the RTP payload type number is not being | has to be assigned in each context to ensure uniqueness. If the RTP | |||
used to associated RTP media streams with a signalling context, then | payload type number is not being used to associate RTP packet streams | |||
the same RTP payload type number can be used to indicate the exact | with a signalling context, then the same RTP payload type number can | |||
same RTP payload format configuration in multiple contexts. | be used to indicate the exact same RTP payload format configuration | |||
in multiple contexts. A single RTP payload type number MUST NOT be | ||||
assigned to different RTP payload formats, or different | ||||
configurations of the same RTP payload format, within a single RTP | ||||
session (note that the different "m=" lines in an SDP bundle group | ||||
[I-D.ietf-mmusic-sdp-bundle-negotiation] form a single RTP session). | ||||
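The association between RTP payload type numbers and "m=" lines described above can be sketched as follows. This is a minimal illustration, not part of either draft: the function name payload_type_map and the example SDP fragment are hypothetical, and the sketch assumes that payload types are being used to associate packet streams with signalling contexts, so a payload type number that appears under more than one "m=" line is treated as an error.

```python
def payload_type_map(sdp: str) -> dict[int, int]:
    """Map each RTP payload type number to the index of the m= line declaring it.

    Hypothetical helper: raises ValueError if the same payload type number is
    declared under more than one m= line, since it could then no longer
    identify a single signalling context.
    """
    mapping: dict[int, int] = {}
    m_index = -1  # index of the m= line currently being scanned
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            m_index += 1
        elif line.startswith("a=rtpmap:") and m_index >= 0:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
            pt = int(line[len("a=rtpmap:"):].split()[0])
            if pt in mapping and mapping[pt] != m_index:
                raise ValueError(f"payload type {pt} appears in more than one m= section")
            mapping[pt] = m_index
    return mapping


# Made-up SDP fragment, for illustration only.
EXAMPLE_SDP = """\
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
m=video 9 UDP/TLS/RTP/SAVPF 96 97
a=rtpmap:96 VP8/90000
a=rtpmap:97 H264/90000
"""

pt_to_mline = payload_type_map(EXAMPLE_SDP)
```

A receiver could then look up the payload type of an incoming RTP packet in pt_to_mline to find the matching "m=" line. If payload types are not used for association, the uniqueness check would be dropped.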
An endpoint that has signalled support for multiple RTP payload | An end-point that has signalled support for multiple RTP payload | |||
formats SHOULD accept data in any of those payload formats at any | formats SHOULD be able to accept data in any of those payload formats | |||
time, unless it has previously signalled limitations on its decoding | at any time, unless it has previously signalled limitations on its | |||
capability. This requirement is constrained if several types of | decoding capability. This requirement is constrained if several | |||
media (e.g., audio and video) are sent in the same RTP session. In | types of media (e.g., audio and video) are sent in the same RTP | |||
such a case, a source (SSRC) is restricted to switching only between | session. In such a case, a source (SSRC) is restricted to switching | |||
the RTP payload formats signalled for the type of media that is being | only between the RTP payload formats signalled for the type of media | |||
sent by that source; see Section 4.4. To support rapid rate | that is being sent by that source; see Section 4.4. To support rapid | |||
adaptation by changing codec, RTP does not require advance signalling | rate adaptation by changing codec, RTP does not require advance | |||
for changes between RTP payload formats that were signalled during | signalling for changes between RTP payload formats used by a single | |||
session set-up. | SSRC that were signalled during session set-up. | |||
An RTP sender that changes between two RTP payload types that use | An RTP sender that changes between two RTP payload types that use | |||
different RTP clock rates MUST follow the recommendations in | different RTP clock rates MUST follow the recommendations in | |||
Section 4.1 of [I-D.ietf-avtext-multiple-clock-rates]. RTP receivers | Section 4.1 of [RFC7160]. RTP receivers MUST follow the | |||
MUST follow the recommendations in Section 4.3 of | recommendations in Section 4.3 of [RFC7160] in order to support | |||
[I-D.ietf-avtext-multiple-clock-rates], in order to support sources | sources that switch between clock rates in an RTP session (these | |||
that switch between clock rates in an RTP session (these | ||||
recommendations for receivers are backwards compatible with the case | recommendations for receivers are backwards compatible with the case | |||
where senders use only a single clock rate). | where senders use only a single clock rate). | |||
4.4. Use of RTP Sessions | 4.4. Use of RTP Sessions | |||
An association amongst a set of participants communicating using RTP | An association amongst a set of end-points communicating using RTP is | |||
is known as an RTP session. A participant can be involved in several | known as an RTP session [RFC3550]. An end-point can be involved in | |||
RTP sessions at the same time. In a multimedia session, each type of | several RTP sessions at the same time. In a multimedia session, each | |||
media has typically been carried in a separate RTP session (e.g., | type of media has typically been carried in a separate RTP session | |||
using one RTP session for the audio, and a separate RTP session using | (e.g., using one RTP session for the audio, and a separate RTP | |||
different transport addresses for the video). WebRTC implementations | session using a different transport-layer flow for the video). | |||
of RTP are REQUIRED to implement support for multimedia sessions in | WebRTC implementations of RTP are REQUIRED to implement support for | |||
this way, separating each session using different transport-layer | multimedia sessions in this way, separating each session using | |||
addresses (e.g., different UDP ports) for compatibility with legacy | different transport-layer flows for compatibility with legacy | |||
systems. | systems. | |||
In modern day networks, however, with the widespread use of network | In modern day networks, however, with the widespread use of network | |||
address/port translators (NAT/NAPT) and firewalls, it is desirable to | address/port translators (NAT/NAPT) and firewalls, it is desirable to | |||
reduce the number of transport-layer flows used by RTP applications. | reduce the number of transport-layer flows used by RTP applications. | |||
This can be done by sending all the RTP media streams in a single RTP | This can be done by sending all the RTP packet streams in a single | |||
session, which will comprise a single transport-layer flow (this will | RTP session, which will comprise a single transport-layer flow (this | |||
prevent the use of some quality-of-service mechanisms, as discussed | will prevent the use of some quality-of-service mechanisms, as | |||
in Section 12.1.3). Implementations are REQUIRED to support | discussed in Section 12.1.3). Implementations are therefore also | |||
transport of all RTP media streams, independent of media type, in a | REQUIRED to support transport of all RTP packet streams, independent | |||
single RTP session according to | of media type, in a single RTP session using a single transport layer | |||
[I-D.ietf-avtcore-multi-media-rtp-session]. If multiple types of | flow, according to [I-D.ietf-avtcore-multi-media-rtp-session]. If | |||
media are to be used in a single RTP session, all participants in | multiple types of media are to be used in a single RTP session, all | |||
that session MUST agree to this usage. In an SDP context, | participants in that RTP session MUST agree to this usage. In an SDP | |||
[I-D.ietf-mmusic-sdp-bundle-negotiation] can be used to signal this. | context, [I-D.ietf-mmusic-sdp-bundle-negotiation] can be used to | |||
signal such a bundle of RTP packet streams forming a single RTP | ||||
session. | ||||
Further discussion about when different RTP session structures and | Further discussion about the suitability of different RTP session | |||
multiplexing methods are suitable can be found in | structures and multiplexing methods to different scenarios can be | |||
[I-D.ietf-avtcore-multiplex-guidelines]. | found in [I-D.ietf-avtcore-multiplex-guidelines]. | |||
4.5. RTP and RTCP Multiplexing | 4.5. RTP and RTCP Multiplexing | |||
Historically, RTP and RTCP have been run on separate transport layer | Historically, RTP and RTCP have been run on separate transport layer | |||
addresses (e.g., two UDP ports for each RTP session, one port for RTP | flows (e.g., two UDP ports for each RTP session, one port for RTP and | |||
and one port for RTCP). With the increased use of Network Address/ | one port for RTCP). With the increased use of Network Address/Port | |||
Port Translation (NAPT) this has become problematic, since | Translation (NAT/NAPT) this has become problematic, since maintaining | |||
maintaining multiple NAT bindings can be costly. It also complicates | multiple NAT bindings can be costly. It also complicates firewall | |||
firewall administration, since multiple ports need to be opened to | administration, since multiple ports need to be opened to allow RTP | |||
allow RTP traffic. To reduce these costs and session set-up times, | traffic. To reduce these costs and session set-up times, support for | |||
support for multiplexing RTP data packets and RTCP control packets on | multiplexing RTP data packets and RTCP control packets on a single | |||
a single port for each RTP session is REQUIRED, as specified in | transport-layer flow for each RTP session is REQUIRED, provided it is | |||
negotiated in the signalling channel before use as specified in | ||||
[RFC5761]. For backwards compatibility, implementations are also | [RFC5761]. For backwards compatibility, implementations are also | |||
REQUIRED to support RTP and RTCP sent on separate transport-layer | REQUIRED to support RTP and RTCP sent on separate transport-layer | |||
addresses. | flows. | |||
Note that the use of RTP and RTCP multiplexed onto a single transport | Note that the use of RTP and RTCP multiplexed onto a single | |||
port ensures that there is occasional traffic sent on that port, even | transport-layer flow ensures that there is occasional traffic sent on | |||
if there is no active media traffic. This can be useful to keep NAT | that port, even if there is no active media traffic. This can be | |||
bindings alive, and is the recommended method for application level | useful to keep NAT bindings alive, and is the recommended method for | |||
keep-alives of RTP sessions [RFC6263]. | application level keep-alives of RTP sessions [RFC6263]. | |||
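When RTP and RTCP share a single transport-layer flow, a receiver has to tell the two apart from the packet contents. The sketch below shows one common way to do this, offered as an assumption rather than as text from either draft: it inspects the second octet of the datagram and treats values 192 to 223 as RTCP, which relies on the RFC 5761 restriction that RTP payload types 64 to 95 are not used when multiplexing. The function name classify_packet is hypothetical.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a datagram received on a multiplexed RTP/RTCP flow.

    Returns "rtcp" or "rtp". Assumes RTP payload types 64-95 are never
    negotiated, so the second octet of an RTP packet (marker bit plus 7-bit
    payload type) cannot fall in the range 192-223.
    """
    if len(datagram) < 2:
        raise ValueError("datagram too short to be RTP or RTCP")
    version = datagram[0] >> 6  # top two bits of the first octet
    if version != 2:
        raise ValueError("not an RTP version 2 packet")
    second_octet = datagram[1]
    return "rtcp" if 192 <= second_octet <= 223 else "rtp"


# RTCP Sender Report (packet type 200) versus RTP with payload type 96.
kind_sr = classify_packet(bytes([0x80, 200, 0, 6]))
kind_media = classify_packet(bytes([0x80, 96, 0, 1]))
```

Setting the marker bit on an RTP packet with payload type 96 gives a second octet of 224, which still falls outside the RTCP range.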
4.6. Reduced Size RTCP | 4.6. Reduced Size RTCP | |||
RTCP packets are usually sent as compound RTCP packets, and [RFC3550] | RTCP packets are usually sent as compound RTCP packets, and [RFC3550] | |||
requires that those compound packets start with a Sender Report (SR) | requires that those compound packets start with a Sender Report (SR) | |||
or Receiver Report (RR) packet. When using frequent RTCP feedback | or Receiver Report (RR) packet. When using frequent RTCP feedback | |||
messages under the RTP/AVPF Profile [RFC4585] these statistics are | messages under the RTP/AVPF Profile [RFC4585] these statistics are | |||
not needed in every packet, and unnecessarily increase the mean RTCP | not needed in every packet, and unnecessarily increase the mean RTCP | |||
packet size. This can limit the frequency at which RTCP packets can | packet size. This can limit the frequency at which RTCP packets can | |||
be sent within the RTCP bandwidth share. | be sent within the RTCP bandwidth share. | |||
To avoid this problem, [RFC5506] specifies how to reduce the mean | To avoid this problem, [RFC5506] specifies how to reduce the mean | |||
RTCP message size and allow for more frequent feedback. Frequent | RTCP message size and allow for more frequent feedback. Frequent | |||
feedback, in turn, is essential to make real-time applications | feedback, in turn, is essential to make real-time applications | |||
quickly aware of changing network conditions, and to allow them to | quickly aware of changing network conditions, and to allow them to | |||
adapt their transmission and encoding behaviour. Support for non- | adapt their transmission and encoding behaviour. Support for non- | |||
compound RTCP feedback packets [RFC5506] is REQUIRED, but MUST be | compound RTCP feedback packets [RFC5506] is REQUIRED, but MUST be | |||
negotiated using the signalling channel before use. For backwards | negotiated using the signalling channel before use. For backwards | |||
compatibility, implementations are also REQUIRED to support the use | compatibility, implementations are also REQUIRED to support the use | |||
of compound RTCP feedback packets if the remote endpoint does not | of compound RTCP feedback packets if the remote end-point does not | |||
agree to the use of non-compound RTCP in the signalling exchange. | agree to the use of non-compound RTCP in the signalling exchange. | |||
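The negotiation requirement above implies a simple acceptance rule at the receiver, sketched here under stated assumptions. The function name accept_rtcp_datagram and the reduced_size_negotiated flag are hypothetical; the only protocol facts used are the RTCP packet type values for SR (200) and RR (201). A real implementation would also validate lengths and parse every packet in the datagram.

```python
RTCP_SR = 200  # Sender Report packet type
RTCP_RR = 201  # Receiver Report packet type


def accept_rtcp_datagram(datagram: bytes, reduced_size_negotiated: bool) -> bool:
    """Decide whether an incoming RTCP datagram is acceptable.

    A compound RTCP packet starts with an SR or RR. A datagram that starts
    with any other packet type is accepted only if reduced-size RTCP was
    negotiated in the signalling channel.
    """
    if len(datagram) < 4 or datagram[0] >> 6 != 2:
        return False  # too short for an RTCP header, or wrong version
    starts_with_report = datagram[1] in (RTCP_SR, RTCP_RR)
    return starts_with_report or reduced_size_negotiated


# A bare payload-specific feedback packet (packet type 206, e.g. a PLI).
bare_feedback = bytes([0x81, 206, 0, 2])
accepted_when_negotiated = accept_rtcp_datagram(bare_feedback, True)
accepted_without_negotiation = accept_rtcp_datagram(bare_feedback, False)
```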
4.7. Symmetric RTP/RTCP | 4.7. Symmetric RTP/RTCP | |||
To ease traversal of NAT and firewall devices, implementations are | To ease traversal of NAT and firewall devices, implementations are | |||
REQUIRED to implement and use Symmetric RTP [RFC4961]. The reasons | REQUIRED to implement and use Symmetric RTP [RFC4961]. The reason | |||
for using symmetric RTP is primarily to avoid issues with NAT and | for using symmetric RTP is primarily to avoid issues with NATs and | |||
Firewalls by ensuring that the flow is actually bi-directional and | Firewalls by ensuring that the send and receive RTP packet streams, | |||
thus kept alive and registered as flow the intended recipient | as well as RTCP, are actually bi-directional transport-layer flows. | |||
actually wants. In addition, it saves resources, specifically ports | This will keep alive the NAT and firewall pinholes, and help indicate | |||
at the end-points, but also in the network as NAT mappings or | consent that the receive direction is a transport-layer flow the | |||
firewall state is not unnecessarily bloated. Also the amount of QoS | intended recipient actually wants. In addition, it saves resources, | |||
state is reduced. | specifically ports at the end-points, but also in the network as NAT | |||
mappings or firewall state is not unnecessarily bloated. The amount of | ||||
per flow QoS state kept in the network is also reduced. | ||||
4.8. Choice of RTP Synchronisation Source (SSRC) | 4.8. Choice of RTP Synchronisation Source (SSRC) | |||
Implementations are REQUIRED to support signalled RTP synchronisation | Implementations are REQUIRED to support signalled RTP synchronisation | |||
source (SSRC) identifiers, using the "a=ssrc:" SDP attribute defined | source (SSRC) identifiers, using the "a=ssrc:" SDP attribute defined | |||
in Section 4.1 and Section 5 of [RFC5576]. Implementations MUST also | in Section 4.1 and Section 5 of [RFC5576]. Implementations MUST also | |||
support the "previous-ssrc" source attribute defined in Section 6.2 | support the "previous-ssrc" source attribute defined in Section 6.2 | |||
of [RFC5576]. Other per-SSRC attributes defined in [RFC5576] MAY be | of [RFC5576]. Other per-SSRC attributes defined in [RFC5576] MAY be | |||
supported. | supported. | |||
Use of the "a=ssrc:" attribute to signal SSRC identifiers in an RTP | Use of the "a=ssrc:" attribute to signal SSRC identifiers in an RTP | |||
session is OPTIONAL. Implementations MUST be prepared to accept RTP | session is OPTIONAL. Implementations MUST be prepared to accept RTP | |||
and RTCP packets using SSRCs that have not been explicitly signalled | and RTCP packets using SSRCs that have not been explicitly signalled | |||
ahead of time. Implementations MUST support random SSRC assignment, | ahead of time. Implementations MUST support random SSRC assignment, | |||
and MUST support SSRC collision detection and resolution, according | and MUST support SSRC collision detection and resolution, according | |||
to [RFC3550]. When using signalled SSRC values, collision detection | to [RFC3550]. When using signalled SSRC values, collision detection | |||
MUST be performed as described in Section 5 of [RFC5576]. | MUST be performed as described in Section 5 of [RFC5576]. | |||
It is often desirable to associate an RTP media stream with a non-RTP | It is often desirable to associate an RTP packet stream with a non- | |||
context. For users of the WebRTC API a mapping between SSRCs and | RTP context. For users of the WebRTC API a mapping between SSRCs and | |||
MediaStreamTracks is provided per Section 11. For gateways or other | MediaStreamTracks is provided per Section 11. For gateways or other | |||
usages it is possible to associate an RTP media stream with an "m=" | usages it is possible to associate an RTP packet stream with an "m=" | |||
line in a session description formatted using SDP. If SSRCs are | line in a session description formatted using SDP. If SSRCs are | |||
signalled this is straightforward (in SDP the "a=ssrc:" line will be | signalled this is straightforward (in SDP the "a=ssrc:" line will be | |||
at the media level, allowing a direct association with an "m=" line). | at the media level, allowing a direct association with an "m=" line). | |||
If SSRCs are not signalled, the RTP payload type numbers used in an | If SSRCs are not signalled, the RTP payload type numbers used in an | |||
RTP media stream are often sufficient to associate that media stream | RTP packet stream are often sufficient to associate that packet | |||
with a signalling context (e.g., if RTP payload type numbers are | stream with a signalling context (e.g., if RTP payload type numbers | |||
assigned as described in Section 4.3 of this memo, the RTP payload | are assigned as described in Section 4.3 of this memo, the RTP | |||
types used by an RTP media stream can be compared with values in SDP | payload types used by an RTP packet stream can be compared with | |||
"a=rtpmap:" lines, which are at the media level in SDP, and so map to | values in SDP "a=rtpmap:" lines, which are at the media level in SDP, | |||
an "m=" line). | and so map to an "m=" line). | |||
4.9. Generation of the RTCP Canonical Name (CNAME) | 4.9. Generation of the RTCP Canonical Name (CNAME) | |||
The RTCP Canonical Name (CNAME) provides a persistent transport-level | The RTCP Canonical Name (CNAME) provides a persistent transport-level | |||
identifier for an RTP endpoint. While the Synchronisation Source | identifier for an RTP end-point. While the Synchronisation Source | |||
(SSRC) identifier for an RTP endpoint can change if a collision is | (SSRC) identifier for an RTP end-point can change if a collision is | |||
detected, or when the RTP application is restarted, its RTCP CNAME is | detected, or when the RTP application is restarted, its RTCP CNAME is | |||
meant to stay unchanged, so that RTP endpoints can be uniquely | meant to stay unchanged for the duration of an RTCPeerConnection | |||
identified and associated with their RTP media streams within a set | [W3C.WD-webrtc-20130910], so that RTP end-points can be uniquely | |||
of related RTP sessions. For proper functionality, each RTP endpoint | identified and associated with their RTP packet streams within a set | |||
needs to have at least one unique RTCP CNAME value. An endpoint MAY | of related RTP sessions. | |||
have multiple CNAMEs, as the CNAME also identifies a particular | ||||
synchronisation context, i.e. all SSRC associated with a CNAME share | Each RTP end-point MUST have at least one RTCP CNAME, and that RTCP | |||
a common reference clock, and if an endpoint have SSRCs associated | CNAME MUST be unique within the RTCPeerConnection. RTCP CNAMEs | |||
with different reference clocks it will need to use multiple CNAMEs. | identify a particular synchronisation context, i.e., all SSRCs | |||
This ought not be common, and if possible reference clocks ought to | associated with a single RTCP CNAME share a common reference clock. | |||
be mapped to each other and one chosen to be used with RTP and RTCP. | If an end-point has SSRCs that are associated with several | |||
unsynchronised reference clocks, and hence different synchronisation | ||||
contexts, it will need to use multiple RTCP CNAMEs, one for each | ||||
synchronisation context. | ||||
Taking the discussion in Section 11 into account, a WebRTC end-point | ||||
MUST NOT use more than one RTCP CNAME in the RTP sessions belonging | ||||
to a single RTCPeerConnection (that is, an RTCPeerConnection forms a | ||||
synchronisation context). RTP middleboxes MAY generate RTP packet | ||||
streams associated with more than one RTCP CNAME, to allow them to | ||||
avoid having to resynchronize media from multiple different end- | ||||
points that are part of a multi-party RTP session. | ||||
The RTP specification [RFC3550] includes guidelines for choosing a | The RTP specification [RFC3550] includes guidelines for choosing a | |||
unique RTP CNAME, but these are not sufficient in the presence of NAT | unique RTP CNAME, but these are not sufficient in the presence of NAT | |||
devices. In addition, long-term persistent identifiers can be | devices. In addition, long-term persistent identifiers can be | |||
problematic from a privacy viewpoint. Accordingly, support for | problematic from a privacy viewpoint (Section 13). Accordingly, a | |||
generating a short-term persistent RTCP CNAMEs following [RFC7022] is | WebRTC endpoint MUST generate a new, unique, short-term persistent | |||
RECOMMENDED. | RTCP CNAME for each RTCPeerConnection, following [RFC7022], with a | |||
single exception; if explicitly requested at creation an | ||||
RTCPeerConnection MAY use the same CNAME as an existing | ||||
RTCPeerConnection within their common same-origin context. | ||||
A WebRTC end-point MUST support reception of any CNAME that matches | A WebRTC end-point MUST support reception of any CNAME that matches | |||
the syntax limitations specified by the RTP specification [RFC3550] | the syntax limitations specified by the RTP specification [RFC3550] | |||
and cannot assume that any CNAME will be chosen according to the form | and cannot assume that any CNAME will be chosen according to the form | |||
suggested above. | suggested above. | |||
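Generating a short-term persistent RTCP CNAME can be sketched as below. This is an illustration only: it assumes the approach of base64-encoding a cryptographically random value of at least 96 bits, which is how the short-term method of [RFC7022] is commonly understood, and the function name generate_short_term_cname is hypothetical. An end-point would call it once per RTCPeerConnection and keep the result for the lifetime of that connection.

```python
import base64
import secrets


def generate_short_term_cname(num_bytes: int = 12) -> str:
    """Return a random RTCP CNAME built from num_bytes of secure randomness.

    Assumption: at least 96 bits (12 bytes) of randomness are required, and
    the value is carried as standard base64 text.
    """
    if num_bytes < 12:
        raise ValueError("at least 96 bits of randomness are expected")
    random_value = secrets.token_bytes(num_bytes)  # OS-backed CSPRNG
    return base64.b64encode(random_value).decode("ascii")


cname_a = generate_short_term_cname()
cname_b = generate_short_term_cname()
```

Because the value is random rather than derived from a host name or address, it carries no long-term identifier, which addresses the privacy concern noted above.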
4.10. Handling of Leap Seconds | ||||
The guidelines regarding handling of leap seconds to limit their | ||||
impact on RTP media playout and synchronization given in [RFC7164] | ||||
SHOULD be followed. | ||||
5. WebRTC Use of RTP: Extensions | 5. WebRTC Use of RTP: Extensions | |||
There are a number of RTP extensions that are either needed to obtain | There are a number of RTP extensions that are either needed to obtain | |||
full functionality, or extremely useful to improve on the baseline | full functionality, or extremely useful to improve on the baseline | |||
performance, in the WebRTC application context. One set of these | performance, in the WebRTC application context. One set of these | |||
extensions is related to conferencing, while others are more generic | extensions is related to conferencing, while others are more generic | |||
in nature. The following subsections describe the various RTP | in nature. The following subsections describe the various RTP | |||
extensions mandated or suggested for use within the WebRTC context. | extensions mandated or suggested for use within the WebRTC context. | |||
5.1. Conferencing Extensions | 5.1. Conferencing Extensions and Topologies | |||
RTP is inherently a group communication protocol. Groups can be | RTP is a protocol that inherently supports group communication. | |||
implemented using a centralised server, multi-unicast, or using IP | Groups can be implemented by having each endpoint send its RTP packet | |||
multicast. While IP multicast is popular in IPTV systems, overlay- | streams to an RTP middlebox that redistributes the traffic, by using | |||
based topologies dominate in interactive conferencing environments. | a mesh of unicast RTP packet streams between endpoints, or by using | |||
Such overlay-based topologies typically use one or more central | an IP multicast group to distribute the RTP packet streams. These | |||
servers to connect end-points in a star or flat tree topology. These | topologies can be implemented in a number of ways as discussed in | |||
central servers can be implemented in a number of ways as discussed | ||||
in the memo on RTP Topologies | ||||
[I-D.ietf-avtcore-rtp-topologies-update]. | [I-D.ietf-avtcore-rtp-topologies-update]. | |||
Not all of the possible the overlay-based topologies are suitable for | While the use of IP multicast groups is popular in IPTV systems, the | |||
use in the WebRTC environment. Specifically: | topologies based on RTP middleboxes are dominant in interactive video | |||
conferencing environments. Topologies based on a mesh of unicast | ||||
transport-layer flows to create a common RTP session have not seen | ||||
widespread deployment to date. Accordingly, WebRTC implementations | ||||
are not expected to support topologies based on IP multicast groups | ||||
or to support mesh-based topologies, such as a point-to-multipoint | ||||
mesh configured as a single RTP session (Topo-Mesh in the terminology | ||||
of [I-D.ietf-avtcore-rtp-topologies-update]). However, a point-to- | ||||
multipoint mesh constructed using several RTP sessions, implemented | ||||
in the WebRTC context using independent RTCPeerConnections, can be | ||||
expected to be utilised by WebRTC applications and needs to be | ||||
supported. | ||||
o The use of video switching MCUs makes the use of RTCP for | WebRTC implementations of RTP endpoints implemented according to this | |||
congestion control and quality of service reports problematic (see | memo are expected to support all the topologies described in | |||
Section 3.6.2 of [I-D.ietf-avtcore-rtp-topologies-update]). | [I-D.ietf-avtcore-rtp-topologies-update] where the RTP endpoints send | |||
and receive unicast RTP packet streams to and from some peer device, | ||||
provided that peer can participate in performing congestion control | ||||
on the RTP packet streams. The peer device could be another RTP | ||||
endpoint, or it could be an RTP middlebox that redistributes the RTP | ||||
packet streams to other RTP endpoints. This limitation means that | ||||
some of the RTP middlebox-based topologies are not suitable for use | ||||
in the WebRTC environment. Specifically: | ||||
o The use of content modifying MCUs with RTCP termination breaks RTP | o Video switching MCUs (Topo-Video-switch-MCU) SHOULD NOT be used, | |||
loop detection, and prevents receivers from identifying active | since they make the use of RTCP for congestion control and quality | |||
senders (see section 3.8 of | of service reports problematic (see Section 3.8 of | |||
[I-D.ietf-avtcore-rtp-topologies-update]). | [I-D.ietf-avtcore-rtp-topologies-update]). | |||
Accordingly, only Point to Point (Topo-Point-to-Point), Multiple | o The Relay-Transport Translator (Topo-PtM-Trn-Translator) topology | |||
concurrent Point to Point (Mesh) and RTP Mixers (Topo-Mixer) | SHOULD NOT be used because its safe use requires a congestion | |||
topologies are needed to achieve the use-cases to be supported in | control algorithm or RTP circuit breaker that handles point to | |||
WebRTC initially. These RECOMMENDED topologies are expected to be | multipoint, which has not yet been standardised. | |||
supported by all WebRTC end-points (these topologies require no | ||||
special RTP-layer support in the end-point if the RTP features | The following topology can be used, however it has some issues worth | |||
mandated in this memo are implemented). | noting: | |||
o Content modifying MCUs with RTCP termination (Topo-RTCP- | ||||
terminating-MCU) MAY be used. Note that in this RTP Topology, RTP | ||||
loop detection and identification of active senders is the | ||||
responsibility of the WebRTC application; since the clients are | ||||
isolated from each other at the RTP layer, RTP cannot assist with | ||||
these functions (see section 3.9 of | ||||
[I-D.ietf-avtcore-rtp-topologies-update]). | ||||
The RTP extensions described in Section 5.1.1 to Section 5.1.6 are | The RTP extensions described in Section 5.1.1 to Section 5.1.6 are | |||
designed to be used with centralised conferencing, where an RTP | designed to be used with centralised conferencing, where an RTP | |||
middlebox (e.g., a conference bridge) receives a participant's RTP | middlebox (e.g., a conference bridge) receives a participant's RTP | |||
media streams and distributes them to the other participants. These | packet streams and distributes them to the other participants. These | |||
extensions are not necessary for interoperability; an RTP endpoint | extensions are not necessary for interoperability; an RTP end-point | |||
that does not implement these extensions will work correctly, but | that does not implement these extensions will work correctly, but | |||
might offer poor performance. Support for the listed extensions will | might offer poor performance. Support for the listed extensions will | |||
greatly improve the quality of experience and, to provide a | greatly improve the quality of experience and, to provide a | |||
reasonable baseline quality, some of these extensions are mandatory | reasonable baseline quality, some of these extensions are mandatory | |||
to be supported by WebRTC end-points. | to be supported by WebRTC end-points. | |||
The RTCP conferencing extensions are defined in Extended RTP Profile | The RTCP conferencing extensions are defined in Extended RTP Profile | |||
for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/ | for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/ | |||
AVPF) [RFC4585] and the "Codec Control Messages in the RTP Audio- | AVPF) [RFC4585] and the memo on Codec Control Messages (CCM) in RTP/ | |||
Visual Profile with Feedback (AVPF)" (CCM) [RFC5104] and are fully | AVPF [RFC5104]; they are fully usable by the Secure variant of this | |||
usable by the Secure variant of this profile (RTP/SAVPF) [RFC5124]. | profile (RTP/SAVPF) [RFC5124]. | |||
5.1.1. Full Intra Request (FIR) | 5.1.1. Full Intra Request (FIR) | |||
The Full Intra Request is defined in Sections 3.5.1 and 4.3.1 of the | The Full Intra Request message is defined in Sections 3.5.1 and 4.3.1 | |||
Codec Control Messages [RFC5104]. This message is used to make the | of the Codec Control Messages [RFC5104]. It is used to make the | |||
mixer request a new Intra picture from a participant in the session. | mixer request a new Intra picture from a participant in the session. | |||
This is used when switching between sources to ensure that the | This is used when switching between sources to ensure that the | |||
receivers can decode the video or other predictive media encoding | receivers can decode the video or other predictive media encoding | |||
with long prediction chains. WebRTC senders MUST understand and | with long prediction chains. WebRTC senders MUST understand and | |||
react to the FIR feedback message since it greatly improves the user | react to FIR feedback messages they receive, since this greatly | |||
experience when using centralised mixer-based conferencing; support | improves the user experience when using centralised mixer-based | |||
for sending the FIR message is OPTIONAL. | conferencing. Support for sending FIR messages is OPTIONAL. | |||
5.1.2. Picture Loss Indication (PLI) | 5.1.2. Picture Loss Indication (PLI) | |||
The Picture Loss Indication is defined in Section 6.3.1 of the RTP/ | The Picture Loss Indication message is defined in Section 6.3.1 of | |||
AVPF profile [RFC4585]. It is used by a receiver to tell the sending | the RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the | |||
encoder that it lost the decoder context and would like to have it | sending encoder that it lost the decoder context and would like to | |||
repaired somehow. This is semantically different from the Full Intra | have it repaired somehow. This is semantically different from the | |||
Request above as there could be multiple ways to fulfil the request. | Full Intra Request above as there could be multiple ways to fulfil | |||
WebRTC senders MUST understand and react to this feedback message as | the request. WebRTC senders MUST understand and react to PLI | |||
a loss tolerance mechanism; receivers MAY send PLI messages. | feedback messages as a loss tolerance mechanism. Receivers MAY send | |||
PLI messages. | ||||
5.1.3. Slice Loss Indication (SLI) | 5.1.3. Slice Loss Indication (SLI) | |||
The Slice Loss Indicator is defined in Section 6.3.2 of the RTP/AVPF | The Slice Loss Indication message is defined in Section 6.3.2 of the | |||
profile [RFC4585]. It is used by a receiver to tell the encoder that | RTP/AVPF profile [RFC4585]. It is used by a receiver to tell the | |||
it has detected the loss or corruption of one or more consecutive | encoder that it has detected the loss or corruption of one or more | |||
macro blocks, and would like to have these repaired somehow. It is | consecutive macro blocks, and would like to have these repaired | |||
RECOMMENDED that receivers generate SLI feedback messages if slices | somehow. It is RECOMMENDED that receivers generate SLI feedback | |||
are lost when using a codec that supports the concept of macro | messages if slices are lost when using a codec that supports the | |||
blocks. A sender that receives an SLI feedback message SHOULD | concept of macro blocks. A sender that receives an SLI feedback | |||
attempt to repair the lost slice(s). | message SHOULD attempt to repair the lost slice(s). | |||
5.1.4. Reference Picture Selection Indication (RPSI) | 5.1.4. Reference Picture Selection Indication (RPSI) | |||
Reference Picture Selection Indication (RPSI) messages are defined in | Reference Picture Selection Indication (RPSI) messages are defined in | |||
Section 6.3.3 of the RTP/AVPF profile [RFC4585]. Some video encoding | Section 6.3.3 of the RTP/AVPF profile [RFC4585]. Some video encoding | |||
standards allow the use of older reference pictures than the most | standards allow the use of older reference pictures than the most | |||
recent one for predictive coding. If such a codec is in use, and if | recent one for predictive coding. If such a codec is in use, and if | |||
the encoder has learnt that encoder-decoder synchronisation has been | the encoder has learnt that encoder-decoder synchronisation has been | |||
lost, then a known-correct reference picture can be used as a base | lost, then a known-correct reference picture can be used as a base | |||
for future coding. The RPSI message allows this to be signalled. | for future coding. The RPSI message allows this to be signalled. | |||
Receivers that detect that encoder-decoder synchronisation has been | Receivers that detect that encoder-decoder synchronisation has been | |||
lost SHOULD generate an RPSI feedback message if the codec being used | lost SHOULD generate an RPSI feedback message if the codec being used | |||
supports reference picture selection. An RTP media stream sender that | supports reference picture selection. An RTP packet stream sender | |||
receives such an RPSI message SHOULD act on that message to change | that receives such an RPSI message SHOULD act on that message to | |||
the reference picture, if it is possible to do so within the | change the reference picture, if it is possible to do so within the | |||
available bandwidth constraints. | available bandwidth constraints, and with the codec being used. | |||
5.1.5. Temporal-Spatial Trade-off Request (TSTR) | 5.1.5. Temporal-Spatial Trade-off Request (TSTR) | |||
The temporal-spatial trade-off request and notification are defined | The temporal-spatial trade-off request and notification are defined | |||
in Sections 3.5.2 and 4.3.2 of [RFC5104]. This request can be used | in Sections 3.5.2 and 4.3.2 of [RFC5104]. This request can be used | |||
to ask the video encoder to change the trade-off it makes between | to ask the video encoder to change the trade-off it makes between | |||
temporal and spatial resolution, for example to prefer high spatial | temporal and spatial resolution, for example to prefer high spatial | |||
image quality but low frame rate. Support for TSTR requests and | image quality but low frame rate. Support for TSTR requests and | |||
notifications is OPTIONAL. | notifications is OPTIONAL. | |||
5.1.6. Temporary Maximum Media Stream Bit Rate Request (TMMBR) | 5.1.6. Temporary Maximum Media Stream Bit Rate Request (TMMBR) | |||
This feedback message is defined in Sections 3.5.4 and 4.2.1 of the | The TMMBR feedback message is defined in Sections 3.5.4 and 4.2.1 of | |||
Codec Control Messages [RFC5104]. This message and its notification | the Codec Control Messages [RFC5104]. This request and its | |||
message are used by a media receiver to inform the sending party that | notification message are used by a media receiver to inform the | |||
there is a current limitation on the amount of bandwidth available to | sending party that there is a current limitation on the amount of | |||
this receiver. There can be various reasons for this: for example, an | bandwidth available to this receiver. There can be various reasons | |||
RTP mixer can use this message to limit the media rate of the sender | for this: for example, an RTP mixer can use this message to limit the | |||
being forwarded by the mixer (without doing media transcoding) to fit | media rate of the sender being forwarded by the mixer (without doing | |||
the bottlenecks existing towards the other session participants. | media transcoding) to fit the bottlenecks existing towards the other | |||
WebRTC senders are REQUIRED to implement support for TMMBR messages, | session participants. WebRTC senders are REQUIRED to implement | |||
and MUST follow bandwidth limitations set by a TMMBR message received | support for TMMBR messages, and MUST follow bandwidth limitations set | |||
for their SSRC. The sending of TMMBR requests is OPTIONAL. | by a TMMBR message received for their SSRC. The sending of TMMBR | |||
requests is OPTIONAL. | ||||
5.2. Header Extensions | 5.2. Header Extensions | |||
The RTP specification [RFC3550] provides the capability to include | The RTP specification [RFC3550] provides the capability to include | |||
RTP header extensions containing in-band data, but the format and | RTP header extensions containing in-band data, but the format and | |||
semantics of the extensions are poorly specified. The use of header | semantics of the extensions are poorly specified. The use of header | |||
extensions is OPTIONAL in the WebRTC context, but if they are used, | extensions is OPTIONAL in the WebRTC context, but if they are used, | |||
they MUST be formatted and signalled following the general mechanism | they MUST be formatted and signalled following the general mechanism | |||
for RTP header extensions defined in [RFC5285], since this gives | for RTP header extensions defined in [RFC5285], since this gives | |||
well-defined semantics to RTP header extensions. | well-defined semantics to RTP header extensions. | |||
As noted in [RFC5285], the requirement from the RTP specification | As noted in [RFC5285], the requirement from the RTP specification | |||
that header extensions are "designed so that the header extension may | that header extensions are "designed so that the header extension may | |||
be ignored" [RFC3550] stands. To be specific, header extensions MUST | be ignored" [RFC3550] stands. To be specific, header extensions MUST | |||
only be used for data that can safely be ignored by the recipient | only be used for data that can safely be ignored by the recipient | |||
without affecting interoperability, and MUST NOT be used when the | without affecting interoperability, and MUST NOT be used when the | |||
presence of the extension has changed the form or nature of the rest | presence of the extension has changed the form or nature of the rest | |||
of the packet in a way that is not compatible with the way the stream | of the packet in a way that is not compatible with the way the stream | |||
is signalled (e.g., as defined by the payload type). Valid examples | is signalled (e.g., as defined by the payload type). Valid examples | |||
might include metadata that is additional to the usual RTP | of RTP header extensions might include metadata that is additional to | |||
information. | the usual RTP information, but that can safely be ignored without | |||
compromising interoperability. | ||||
5.2.1. Rapid Synchronisation | 5.2.1. Rapid Synchronisation | |||
Many RTP sessions require synchronisation between audio, video, and | Many RTP sessions require synchronisation between audio, video, and | |||
other content. This synchronisation is performed by receivers, using | other content. This synchronisation is performed by receivers, using | |||
information contained in RTCP SR packets, as described in the RTP | information contained in RTCP SR packets, as described in the RTP | |||
specification [RFC3550]. This basic mechanism can be slow, however, | specification [RFC3550]. This basic mechanism can be slow, however, | |||
so it is RECOMMENDED that the rapid RTP synchronisation extensions | so it is RECOMMENDED that the rapid RTP synchronisation extensions | |||
described in [RFC6051] be implemented in addition to RTCP SR-based | described in [RFC6051] be implemented in addition to RTCP SR-based | |||
synchronisation. The rapid synchronisation extensions use the | synchronisation. The rapid synchronisation extensions use the | |||
general RTP header extension mechanism [RFC5285], which requires | general RTP header extension mechanism [RFC5285], which requires | |||
signalling, but are otherwise backwards compatible. | signalling, but are otherwise backwards compatible. | |||
5.2.2. Client-to-Mixer Audio Level | 5.2.2. Client-to-Mixer Audio Level | |||
The Client to Mixer Audio Level extension [RFC6464] is an RTP header | The Client to Mixer Audio Level extension [RFC6464] is an RTP header | |||
extension used by a client to inform a mixer about the level of audio | extension used by an endpoint to inform a mixer about the level of | |||
activity in the packet to which the header is attached. This enables | audio activity in the packet to which the header is attached. This | |||
a central node to make mixing or selection decisions without decoding | enables an RTP middlebox to make mixing or selection decisions | |||
or detailed inspection of the payload, reducing the complexity in | without decoding or detailed inspection of the payload, reducing the | |||
some types of central RTP nodes. It can also save decoding resources | complexity in some types of mixer. It can also save decoding | |||
in receivers, which can choose to decode only the most relevant RTP | resources in receivers, which can choose to decode only the most | |||
media streams based on audio activity levels. | relevant RTP packet streams based on audio activity levels. | |||
The Client-to-Mixer Audio Level [RFC6464] extension is RECOMMENDED to | The Client-to-Mixer Audio Level [RFC6464] header extension is | |||
be implemented. If it is implemented, it is REQUIRED that the header | RECOMMENDED to be implemented. If this header extension is | |||
extensions are encrypted according to [RFC6904] since the information | implemented, it is REQUIRED that implementations are capable of | |||
contained in these header extensions can be considered sensitive. | encrypting the header extension according to [RFC6904] since the | |||
information contained in these header extensions can be considered | ||||
sensitive. It is further RECOMMENDED that this encryption is used, | ||||
unless the encryption has been explicitly disabled through API or | ||||
signalling. | ||||
5.2.3. Mixer-to-Client Audio Level | 5.2.3. Mixer-to-Client Audio Level | |||
The Mixer to Client Audio Level header extension [RFC6465] provides | The Mixer to Client Audio Level header extension [RFC6465] provides | |||
the client with the audio level of the different sources mixed into a | an endpoint with the audio level of the different sources mixed into | |||
common mix by an RTP mixer. This enables a user interface to indicate | a common mix by an RTP mixer. This enables a user interface to | |||
the relative activity level of each session participant, rather than | indicate the relative activity level of each session participant, | |||
just being included or not based on the CSRC field. This is a pure | rather than just being included or not based on the CSRC field. This | |||
optimisation of non-critical functions, and is hence OPTIONAL to | is a pure optimisation of non-critical functions, and is hence | |||
implement. If it is implemented, it is REQUIRED that the header | OPTIONAL to implement. If this header extension is implemented, it | |||
extensions are encrypted according to [RFC6904] since the information | is REQUIRED that implementations are capable of encrypting the header | |||
contained in these header extensions can be considered sensitive. | extension according to [RFC6904] since the information contained in | |||
these header extensions can be considered sensitive. It is further | ||||
RECOMMENDED that this encryption is used, unless the encryption has | ||||
been explicitly disabled through API or signalling. | ||||
6. WebRTC Use of RTP: Improving Transport Robustness | 6. WebRTC Use of RTP: Improving Transport Robustness | |||
There are tools that can make RTP media streams robust against packet | There are tools that can make RTP packet streams robust against | |||
loss and reduce the impact of loss on media quality. However, they | packet loss and reduce the impact of loss on media quality. However, | |||
all add extra bits compared to a non-robust stream. The overhead of | they all add overhead compared to a non-robust stream. The overhead | |||
these extra bits needs to be considered, and the aggregate bit-rate | needs to be considered, and the aggregate bit-rate MUST be rate | |||
MUST be rate controlled to avoid causing network congestion (see | controlled to avoid causing network congestion (see Section 7). As a | |||
Section 7). As a result, improving robustness might require a lower | result, improving robustness might require a lower base encoding | |||
base encoding quality, but has the potential to deliver that quality | quality, but has the potential to deliver that quality with fewer | |||
with fewer errors. The mechanisms described in the following sub- | errors. The mechanisms described in the following sub-sections can | |||
sections can be used to improve tolerance to packet loss. | be used to improve tolerance to packet loss. | |||
6.1. Negative Acknowledgements and RTP Retransmission | 6.1. Negative Acknowledgements and RTP Retransmission | |||
As a consequence of supporting the RTP/SAVPF profile, implementations | As a consequence of supporting the RTP/SAVPF profile, implementations | |||
can support negative acknowledgements (NACKs) for RTP data packets | can send negative acknowledgements (NACKs) for RTP data packets | |||
[RFC4585]. This feedback can be used to inform a sender of the loss | [RFC4585]. This feedback can be used to inform a sender of the loss | |||
of particular RTP packets, subject to the capacity limitations of the | of particular RTP packets, subject to the capacity limitations of the | |||
RTCP feedback channel. A sender can use this information to optimise | RTCP feedback channel. A sender can use this information to optimise | |||
the user experience by adapting the media encoding to compensate for | the user experience by adapting the media encoding to compensate for | |||
known lost packets, for example. | known lost packets. | |||
RTP Media Stream Senders are REQUIRED to understand the Generic NACK | RTP packet stream Senders are REQUIRED to understand the Generic NACK | |||
message defined in Section 6.2.1 of [RFC4585], but MAY choose to | message defined in Section 6.2.1 of [RFC4585], but MAY choose to | |||
ignore this feedback (following Section 4.2 of [RFC4585]). Receivers | ignore some or all of this feedback (following Section 4.2 of | |||
MAY send NACKs for missing RTP packets; [RFC4585] provides some | [RFC4585]). Receivers MAY send NACKs for missing RTP packets. | |||
guidelines on when to send NACKs. It is not expected that a receiver | Guidelines on when to send NACKs are provided in [RFC4585]. It is | |||
will send a NACK for every lost RTP packet, rather it needs to | not expected that a receiver will send a NACK for every lost RTP | |||
consider the cost of sending NACK feedback, and the importance of the | packet, rather it needs to consider the cost of sending NACK | |||
lost packet, to make an informed decision on whether it is worth | feedback, and the importance of the lost packet, to make an informed | |||
telling the sender about a packet loss event. | decision on whether it is worth telling the sender about a packet | |||
loss event. | ||||
The RTP Retransmission Payload Format [RFC4588] offers the ability to | The RTP Retransmission Payload Format [RFC4588] offers the ability to | |||
retransmit lost packets based on NACK feedback. Retransmission needs | retransmit lost packets based on NACK feedback. Retransmission needs | |||
to be used with care in interactive real-time applications to ensure | to be used with care in interactive real-time applications to ensure | |||
that the retransmitted packet arrives in time to be useful, but can | that the retransmitted packet arrives in time to be useful, but can | |||
be effective in environments with relatively low network RTT (an RTP | be effective in environments with relatively low network RTT (an RTP | |||
sender can estimate the RTT to the receivers using the information in | sender can estimate the RTT to the receivers using the information in | |||
RTCP SR and RR packets, as described at the end of Section 6.4.1 of | RTCP SR and RR packets, as described at the end of Section 6.4.1 of | |||
[RFC3550]). The use of retransmissions can also increase the forward | [RFC3550]). The use of retransmissions can also increase the forward | |||
RTP bandwidth, and can potentially worsen the problem if the packet | RTP bandwidth, and can potentially cause increased packet loss if | |||
loss was caused by network congestion. We note, however, that | the original packet loss was caused by network congestion. We note, | |||
retransmission of an important lost packet to repair decoder state | however, that retransmission of an important lost packet to repair | |||
can have lower cost than sending a full intra frame. It is not | decoder state can have lower cost than sending a full intra frame. | |||
appropriate to blindly retransmit RTP packets in response to a NACK. | It is not appropriate to blindly retransmit RTP packets in response | |||
to a NACK. The importance of lost packets and the likelihood of them | ||||
The importance of lost packets and the likelihood of them arriving in | arriving in time to be useful needs to be considered before RTP | |||
time to be useful needs to be considered before RTP retransmission is | retransmission is used. | |||
used. | ||||
Receivers are REQUIRED to implement support for RTP retransmission | Receivers are REQUIRED to implement support for RTP retransmission | |||
packets [RFC4588]. Senders MAY send RTP retransmission packets in | packets [RFC4588]. Senders MAY send RTP retransmission packets in | |||
response to NACKs if the RTP retransmission payload format has been | response to NACKs if the RTP retransmission payload format has been | |||
negotiated for the session, and if the sender believes it is useful | negotiated for the session, and if the sender believes it is useful | |||
to send a retransmission of the packet(s) referenced in the NACK. An | to send a retransmission of the packet(s) referenced in the NACK. An | |||
RTP sender does not need to retransmit every NACKed packet. | RTP sender does not need to retransmit every NACKed packet. | |||
6.2. Forward Error Correction (FEC) | 6.2. Forward Error Correction (FEC) | |||
The use of Forward Error Correction (FEC) can provide an effective | The use of Forward Error Correction (FEC) can provide an effective | |||
protection against some degree of packet loss, at the cost of steady | protection against some degree of packet loss, at the cost of steady | |||
bandwidth overhead. There are several FEC schemes that are defined | bandwidth overhead. There are several FEC schemes that are defined | |||
for use with RTP. Some of these schemes are specific to a particular | for use with RTP. Some of these schemes are specific to a particular | |||
RTP payload format, others operate across RTP packets and can be used | RTP payload format, others operate across RTP packets and can be used | |||
with any payload format. It needs to be noted that using redundant | with any payload format. It needs to be noted that using redundant | |||
encoding or FEC will lead to increased play out delay, which needs to | encoding or FEC will lead to increased play out delay, which needs to | |||
be considered when choosing the redundancy or FEC formats and their | be considered when choosing the redundancy or FEC formats and their | |||
respective parameters. | respective parameters. | |||
If an RTP payload format negotiated for use in a WebRTC session | If an RTP payload format negotiated for use in an RTCPeerConnection | |||
supports redundant transmission or FEC as a standard feature of that | supports redundant transmission or FEC as a standard feature of that | |||
payload format, then that support MAY be used in the WebRTC session, | payload format, then that support MAY be used in the | |||
subject to any appropriate signalling. | RTCPeerConnection, subject to any appropriate signalling. | |||
There are several block-based FEC schemes that are designed for use | There are several block-based FEC schemes that are designed for use | |||
with RTP independent of the chosen RTP payload format. At the time | with RTP independent of the chosen RTP payload format. At the time | |||
of this writing there is no consensus on which, if any, of these FEC | of this writing there is no consensus on which, if any, of these FEC | |||
schemes is appropriate for use in the WebRTC context. Accordingly, | schemes is appropriate for use in the WebRTC context. Accordingly, | |||
this memo makes no recommendation on the choice of block-based FEC | this memo makes no recommendation on the choice of block-based FEC | |||
for WebRTC use. | for WebRTC use. | |||
7. WebRTC Use of RTP: Rate Control and Media Adaptation | 7. WebRTC Use of RTP: Rate Control and Media Adaptation | |||
WebRTC will be used in heterogeneous network environments using a | WebRTC will be used in heterogeneous network environments using a | |||
variety of link technologies, including both wired and wireless | variety of link technologies, including both wired and wireless | |||
links, to interconnect potentially large groups of users around the | links, to interconnect potentially large groups of users around the | |||
world. As a result, the network paths between users can have widely | world. As a result, the network paths between users can have widely | |||
varying one-way delays, available bit-rates, load levels, and traffic | varying one-way delays, available bit-rates, load levels, and traffic | |||
mixtures. Individual end-points can send one or more RTP media | mixtures. Individual end-points can send one or more RTP packet | |||
streams to each participant in a WebRTC conference, and there can be | streams to each participant in a WebRTC conference, and there can be | |||
several participants. Each of these RTP media streams can contain | several participants. Each of these RTP packet streams can contain | |||
different types of media, and the type of media, bit rate, and number | different types of media, and the type of media, bit rate, and number | |||
of flows can be highly asymmetric. Non-RTP traffic can share the | of RTP packet streams as well as transport-layer flows can be highly | |||
network paths with RTP flows. Since the network environment is not | asymmetric. Non-RTP traffic can share the network paths with RTP | |||
predictable or stable, WebRTC endpoints MUST ensure that the RTP | transport-layer flows. Since the network environment is not | |||
predictable or stable, WebRTC end-points MUST ensure that the RTP | ||||
traffic they generate can adapt to match changes in the available | traffic they generate can adapt to match changes in the available | |||
network capacity. | network capacity. | |||
The quality of experience for users of WebRTC implementations is very | The quality of experience for users of WebRTC implementations is very | |||
dependent on effective adaptation of the media to the limitations of | dependent on effective adaptation of the media to the limitations of | |||
the network. End-points have to be designed so they do not transmit | the network. End-points have to be designed so they do not transmit | |||
significantly more data than the network path can support, except for | significantly more data than the network path can support, except for | |||
very short time periods, otherwise high levels of network packet loss | very short time periods, otherwise high levels of network packet loss | |||
or delay spikes will occur, causing media quality degradation. The | or delay spikes will occur, causing media quality degradation. The | |||
limiting factor on the capacity of the network path might be the link | limiting factor on the capacity of the network path might be the link | |||
bandwidth, or it might be competition with other traffic on the link | bandwidth, or it might be competition with other traffic on the link | |||
(this can be non-WebRTC traffic, traffic due to other WebRTC flows, | (this can be non-WebRTC traffic, traffic due to other WebRTC flows, | |||
or even competition with other WebRTC flows in the same session). | or even competition with other WebRTC flows in the same session). | |||
An effective media congestion control algorithm is therefore an | An effective media congestion control algorithm is therefore an | |||
essential part of the WebRTC framework. However, at the time of this | essential part of the WebRTC framework. However, at the time of this | |||
writing, there is no standard congestion control algorithm that can | writing, there is no standard congestion control algorithm that can | |||
be used for interactive media applications such as WebRTC flows. | be used for interactive media applications such as WebRTC's flows. | |||
Some requirements for congestion control algorithms for WebRTC | Some requirements for congestion control algorithms for | |||
sessions are discussed in [I-D.ietf-rmcat-cc-requirements], and it is | RTCPeerConnections are discussed in [I-D.ietf-rmcat-cc-requirements]. | |||
expected that a future version of this memo will mandate the use of a | It is expected that a future version of this memo will mandate the | |||
congestion control algorithm that satisfies these requirements. | use of a congestion control algorithm that satisfies these | |||
requirements. | ||||
7.1. Boundary Conditions and Circuit Breakers | 7.1. Boundary Conditions and Circuit Breakers | |||
In the absence of a concrete congestion control algorithm, all WebRTC | In the absence of a concrete congestion control algorithm, all WebRTC | |||
implementations MUST implement the RTP circuit breaker algorithm that | implementations MUST implement the RTP circuit breaker algorithm that | |||
is in described [I-D.ietf-avtcore-rtp-circuit-breakers]. The RTP | is described in [I-D.ietf-avtcore-rtp-circuit-breakers]. The RTP | |||
circuit breaker is designed to enable applications to recognise and | circuit breaker is designed to enable applications to recognise and | |||
react to situations of extreme network congestion. However, since | react to situations of extreme network congestion. However, since | |||
the RTP circuit breaker might not be triggered until congestion | the RTP circuit breaker might not be triggered until congestion | |||
becomes extreme, it cannot be considered a substitute for congestion | becomes extreme, it cannot be considered a substitute for congestion | |||
control, and applications MUST also implement congestion control to | control, and applications MUST also implement congestion control to | |||
allow them to adapt to changes in network capacity. Any future RTP | allow them to adapt to changes in network capacity. Any future RTP | |||
congestion control algorithms are expected to operate within the | congestion control algorithms are expected to operate within the | |||
envelope allowed by the circuit breaker. | envelope allowed by the circuit breaker. | |||
The session establishment signalling will also necessarily establish | The session establishment signalling will also necessarily establish | |||
skipping to change at page 19, line 21 | skipping to change at page 20, line 35 | |||
"b=AS:" or "b=CT:" lines, and the RTP/AVPF Temporary Maximum Media | "b=AS:" or "b=CT:" lines, and the RTP/AVPF Temporary Maximum Media | |||
Stream Bit Rate (TMMBR) Requests (see Section 5.1.6 of this memo). | Stream Bit Rate (TMMBR) Requests (see Section 5.1.6 of this memo). | |||
The combination of media codec choice and signalled bandwidth limits | The combination of media codec choice and signalled bandwidth limits | |||
SHOULD be used to limit traffic based on known bandwidth limitations, | SHOULD be used to limit traffic based on known bandwidth limitations, | |||
for example the capacity of the edge links, to the extent possible. | for example the capacity of the edge links, to the extent possible. | |||
7.2. RTCP Limitations for Congestion Control | 7.2. RTCP Limitations for Congestion Control | |||
Experience with the congestion control algorithms of TCP [RFC5681], | Experience with the congestion control algorithms of TCP [RFC5681], | |||
TFRC [RFC5348], and DCCP [RFC4341], [RFC4342], [RFC4828], has shown | TFRC [RFC5348], and DCCP [RFC4341], [RFC4342], [RFC4828], has shown | |||
that feedback on packet arrivals needs to be sent roughly once per | that feedback on packet arrivals needs to be sent frequently (roughly | |||
round trip time. We note that the real-time media traffic might not | once per round trip time is common). We note that the real-time | |||
have to adapt to changing path conditions as rapidly as needed for | media traffic might not be able to adapt to changing path conditions | |||
the elastic applications TCP was designed for, but frequent feedback | as rapidly as elastic applications using TCP, but frequent feedback, | |||
is still needed to allow the congestion control algorithm to track | perhaps on the order of once per video frame, is still needed to | |||
the path dynamics. | allow the congestion control algorithm to track the path dynamics. | |||
The total RTCP bandwidth is normally limited in its transmission rate | As an example of the type of RTCP congestion control feedback that is | |||
to a fraction of the nominal RTP traffic (by default 5%). RTCP | possible, consider one of the simplest scenarios for WebRTC: a point | |||
packets are larger than, e.g., TCP ACKs (even when non-compound RTCP | to point video call between two end systems. There will be four RTP | |||
packets are used). The RTP media stream bit rate thus limits the | flows in this scenario, two audio and two video, with all four flows | |||
maximum feedback rate as a function of the mean RTCP packet size. | being active for essentially all the time (the audio flows will | |||
likely use voice activity detection and comfort noise to reduce the | ||||
packet rate during silent periods, but this does not cause transmissions to | ||||
stop). Assume all four flows are sent in a single RTP session, each | ||||
using a separate SSRC. Further, assume each SSRC sends RTCP reports | ||||
for all other SSRCs in the session (i.e., the optimisations in | ||||
[I-D.ietf-avtcore-rtp-multi-stream-optimisation] are not used, giving | ||||
the worst case for the RTCP overhead). When all members are senders | ||||
like this, the RTCP timing rules in Sections 6.2 and 6.3 of [RFC3550] | ||||
and [RFC4585] reduce to: | ||||
Interactive communication might not be able to afford waiting for | rtcp_interval = avg_rtcp_size * n / rtcp_bw | |||
where avg_rtcp_size is measured in octets, and the rtcp_bw is the | ||||
bandwidth available for RTCP. The average RTCP size will depend on | ||||
the amount of feedback that is sent in each RTCP packet, on the | ||||
number of members in the session, and on the size of source | ||||
description (RTCP SDES) information sent. As a baseline, each RTCP | ||||
packet will be a compound RTCP packet that contains an RTCP SR and an | ||||
RTCP SDES packet. In the scenario above, each RTCP SR packet will | ||||
contain three report blocks, one for each of the other RTP SSRCs | ||||
sending data, for a total of 100 octets (this is 8 octets header, 20 | ||||
octets sender info, and 3 * 24 octets report blocks). The RTCP SDES | ||||
packet will comprise a header (4 octets), an originating SSRC (4 | ||||
octets), a CNAME chunk, and padding. If the CNAME follows [RFC7022] | ||||
then it will be 19 octets in size, and require 1 octet of padding. | ||||
The resulting compound RTCP packet will be 128 octets in size. If | ||||
sent in UDP/IPv4 with no IP options and using Secure RTP, which adds | ||||
20 (IPv4) + 8 (UDP) + 14 (SRTP with 80 bit Authentication tag), the | ||||
avg_rtcp_size will therefore be 170 octets, including the header | ||||
overhead. The value n in this scenario is 4, and the rtcp_bw is | ||||
assumed to be 5% of the session bandwidth. | ||||
If it is desired to send RTCP feedback packets on average 30 times | ||||
per second, to correspond to one RTCP report every frame for 30fps | ||||
video, we can invert the above rtcp_interval calculation to get an | ||||
rtcp_bw that gives an interval of 1/30th of a second or lower. This | ||||
corresponds to an rtcp_bw of 20400 octets per second (since 1/30 = | ||||
170 * 4 / 20400). This is 163200 bits per second, which if 5% of the | ||||
session bandwidth, gives a session bandwidth of approximately 3.3Mbps | ||||
(i.e., 3.3Mbps media rate, plus an additional 5% for RTCP, to give a | ||||
total data rate of approximately 3.4Mbps). That is, RTCP can report | ||||
on every frame of video provided the session bandwidth is 3.3Mbps or | ||||
larger, when every SSRC sends a report for every video frame. Please | ||||
note that the actual RTCP transmission intervals will be within the | ||||
interval [0.0135, 0.0406]s, but maintaining an average RTCP | ||||
transmission interval of 0.033s. | ||||
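The arithmetic above can be checked with a short script. This is a minimal sketch in Python; the function and constant names are illustrative only and do not appear in the draft. The randomisation bounds assume the [0.5, 1.5] random factor and the (e - 1.5) compensation divisor from Section 6.3.1 of [RFC3550].

```python
import math

# Packet sizes in octets, taken from the scenario described above.
RTCP_SR = 8 + 20 + 3 * 24      # header + sender info + 3 report blocks = 100
RTCP_SDES = 4 + 4 + 19 + 1     # header + SSRC + CNAME chunk + padding = 28
LOWER_LAYERS = 20 + 8 + 14     # IPv4 + UDP + SRTP (80-bit auth tag) = 42


def avg_rtcp_size():
    """Size of the baseline compound RTCP packet including header overhead."""
    return RTCP_SR + RTCP_SDES + LOWER_LAYERS


def required_rtcp_bw(reports_per_second, n, size):
    """Invert rtcp_interval = size * n / rtcp_bw; result in octets/second."""
    return size * n * reports_per_second


def session_bandwidth_bps(rtcp_bw_octets, rtcp_fraction=0.05):
    """Session bandwidth (bits/second) whose RTCP fraction equals rtcp_bw."""
    return rtcp_bw_octets * 8 / rtcp_fraction


def randomised_bounds(avg_interval):
    """Range of actual intervals after RFC 3550 randomisation (assumed)."""
    comp = math.e - 1.5
    return (0.5 * avg_interval / comp, 1.5 * avg_interval / comp)


if __name__ == "__main__":
    size = avg_rtcp_size()
    bw = required_rtcp_bw(30, 4, size)
    print("avg_rtcp_size (octets):", size)
    print("rtcp_bw (octets/s):", bw)
    print("session bandwidth (Mbps): %.2f" % (session_bandwidth_bps(bw) / 1e6))
    print("interval bounds (s): %.4f, %.4f" % randomised_bounds(1 / 30))
```

Changing the report rate, the number of SSRCs, or the packet size in the calls above shows how quickly the required session bandwidth grows with each of them.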
Note: To achieve the RTCP transmission intervals above the RTP/ | ||||
SAVPF profile with T_rr_interval=0 is used, since even when using | ||||
the reduced minimal transmission interval, the RTP/SAVP profile | ||||
would only allow sending RTCP at most every 0.11s (every third | ||||
frame of video). Using RTP/SAVPF with T_rr_interval=0 however is | ||||
capable of fully utilizing the configured 5% RTCP bandwidth | ||||
fraction. | ||||
If additional feedback beyond the standard report block is needed, | ||||
the session bandwidth needed will increase. For example, with an | ||||
additional 20 octets data being reported in each RTCP packet, the | ||||
session bandwidth needed increases to 3.5Mbps for every SSRC to be | ||||
able to report on every frame. However, the above baseline might not | ||||
be the most appropriate usage of the RTCP bandwidth. Depending on | ||||
needs, a less frequent usage of regular RTCP compound packets, | ||||
controlled by T_rr_interval combined with using the reduced size RTCP | ||||
packets, can achieve more frequent and useful reporting. Also the | ||||
reporting requirements defined in | ||||
[I-D.ietf-avtcore-rtp-multi-stream-optimisation] will reduce the | ||||
amount of bandwidth consumed for reporting when each endpoint has | ||||
multiple SSRCs. | ||||
Calculations such as these show that RTCP cannot be used to send per- | ||||
packet congestion feedback. RTCP can, however, be used to send | ||||
congestion feedback on each frame of video sent in an interactive | ||||
video conferencing scenario, provided the RTCP parameters are | ||||
correctly configured and the overall session bandwidth exceeds a | ||||
couple of megabits per second (the exact rate depending on the number | ||||
of session participants, the RTCP bandwidth fraction, and whether | ||||
audio and video are sent in one or two RTP sessions). Using similar | ||||
calculations, it can be shown that RTCP can likely also be used to | ||||
send feedback on a per-RTT basis, provided the RTT is not too low. | ||||
Interactive communication might not be able to afford to wait for | ||||
packet losses to occur to indicate congestion, because an increase in | packet losses to occur to indicate congestion, because an increase in | |||
play out delay due to queuing (most prominent in wireless networks) | play out delay due to queuing (most prominent in wireless networks) | |||
can easily lead to packets being dropped due to late arrival at the | can easily lead to packets being dropped due to late arrival at the | |||
receiver. Therefore, more sophisticated cues might need to be | receiver. Therefore, more sophisticated cues might need to be | |||
reported -- to be defined in a suitable congestion control framework | reported -- to be defined in a suitable congestion control framework | |||
as noted above -- which, in turn, increase the report size again. | as noted above -- which, in turn, increase the report size again. | |||
For example, different RTCP XR report blocks (jointly) provide the | For example, different RTCP XR report blocks (jointly) provide the | |||
necessary details to implement a variety of congestion control | necessary details to implement a variety of congestion control | |||
algorithms, but the (compound) report size grows quickly. | algorithms, but the (compound) report size grows quickly. | |||
In group communication, the RTCP bandwidth needs to be | ||||
shared by all group members, reducing the capacity and thus the | ||||
reporting frequency per node. | ||||
Example: assuming 512 kbit/s video yields 3200 bytes/s RTCP | ||||
bandwidth, split across two entities in a point-to-point session. An | ||||
endpoint could thus send a report of 100 bytes about every 70ms or | ||||
for every other frame in a 30 fps video. | ||||
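The figures in this example can be reproduced as follows. This is a minimal sketch in Python, assuming the default 5% RTCP fraction and an even split of the RTCP bandwidth between the two endpoints; the function name is hypothetical.

```python
def report_interval_s(media_rate_bps, report_size_octets, endpoints,
                      rtcp_fraction=0.05):
    """Average time between reports from one endpoint, in seconds."""
    rtcp_bw_octets = media_rate_bps * rtcp_fraction / 8   # octets per second
    per_endpoint = rtcp_bw_octets / endpoints             # even split (assumed)
    return report_size_octets / per_endpoint


if __name__ == "__main__":
    interval = report_interval_s(512_000, 100, 2)
    print("interval (ms): %.1f" % (interval * 1000))
    print("frames at 30 fps: %.2f" % (interval * 30))
```

The computed interval is a little under two frame times at 30 fps, which is in line with the "every other frame" estimate above.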
7.3. Congestion Control Interoperability and Legacy Systems | 7.3. Congestion Control Interoperability and Legacy Systems | |||
There are legacy implementations that do not implement RTCP, and | There are legacy RTP implementations that do not implement RTCP, and | |||
hence do not provide any congestion feedback. Congestion control | hence do not provide any congestion feedback. Congestion control | |||
cannot be performed with these end-points. WebRTC implementations | cannot be performed with these end-points. WebRTC implementations | |||
that need to interwork with such end-points MUST limit their | that need to interwork with such end-points MUST limit their | |||
transmission to a low rate, equivalent to a VoIP call using a low | transmission to a low rate, equivalent to a VoIP call using a low | |||
bandwidth codec, that is unlikely to cause any significant | bandwidth codec, that is unlikely to cause any significant | |||
congestion. | congestion. | |||
When interworking with legacy implementations that support RTCP using | When interworking with legacy implementations that support RTCP using | |||
the RTP/AVP profile [RFC3551], congestion feedback is provided in | the RTP/AVP profile [RFC3551], congestion feedback is provided in | |||
RTCP RR packets every few seconds. Implementations that have to | RTCP RR packets every few seconds. Implementations that have to | |||
skipping to change at page 21, line 4 | skipping to change at page 23, line 40 | |||
both directions. If the IETF were to standardise both sender- and | both directions. If the IETF were to standardise both sender- and | |||
receiver-based congestion control algorithms for WebRTC traffic in | receiver-based congestion control algorithms for WebRTC traffic in | |||
the future, the issues of interoperability, control, and ensuring | the future, the issues of interoperability, control, and ensuring | |||
that both directions of media flow are congestion controlled would | that both directions of media flow are congestion controlled would | |||
also need to be considered. | also need to be considered. | |||
8. WebRTC Use of RTP: Performance Monitoring | 8. WebRTC Use of RTP: Performance Monitoring | |||
As described in Section 4.1, implementations are REQUIRED to generate | As described in Section 4.1, implementations are REQUIRED to generate | |||
RTCP Sender Report (SR) and Reception Report (RR) packets relating to | RTCP Sender Report (SR) and Reception Report (RR) packets relating to | |||
the RTP media streams they send and receive. These RTCP reports can | the RTP packet streams they send and receive. These RTCP reports can | |||
be used for performance monitoring purposes, since they include basic | be used for performance monitoring purposes, since they include basic | |||
packet loss and jitter statistics. | packet loss and jitter statistics. | |||
A large number of additional performance metrics are supported by the | A large number of additional performance metrics are supported by the | |||
RTCP Extended Reports (XR) framework [RFC3611][RFC6792]. It is not | RTCP Extended Reports (XR) framework [RFC3611][RFC6792]. At the time | |||
yet clear what extended metrics are appropriate for use in the WebRTC | of this writing, it is not clear what extended metrics are suitable | |||
context, so there is no requirement that implementations generate | for use in the WebRTC context, so there is no requirement that | |||
RTCP XR packets. However, implementations that can use detailed | implementations generate RTCP XR packets. However, implementations | |||
performance monitoring data MAY generate RTCP XR packets as | that can use detailed performance monitoring data MAY generate RTCP | |||
appropriate; the use of such packets SHOULD be signalled in advance. | XR packets as appropriate; the use of such packets SHOULD be | |||
signalled in advance. | ||||
All WebRTC implementations MUST be prepared to receive RTCP XR report | All WebRTC implementations MUST be prepared to receive RTCP XR report | |||
packets, whether or not they were signalled. There is no requirement | packets, whether or not they were signalled. There is no requirement | |||
that the data contained in such reports be used, or exposed to the | that the data contained in such reports be used, or exposed to the | |||
Javascript application, however. | Javascript application, however. | |||
9. WebRTC Use of RTP: Future Extensions | 9. WebRTC Use of RTP: Future Extensions | |||
It is possible that the core set of RTP protocols and RTP extensions | It is possible that the core set of RTP protocols and RTP extensions | |||
specified in this memo will prove insufficient for the future needs | specified in this memo will prove insufficient for the future needs | |||
skipping to change at page 22, line 11 | skipping to change at page 24, line 48 | |||
RTP/AVP [RFC3551] and RTP/AVPF [RFC4585] profiles can interoperate | RTP/AVP [RFC3551] and RTP/AVPF [RFC4585] profiles can interoperate | |||
on a basic level, as can their secure variants RTP/SAVP [RFC3711] | on a basic level, as can their secure variants RTP/SAVP [RFC3711] | |||
and RTP/SAVPF [RFC5124]. The secure variants of the profiles do | and RTP/SAVPF [RFC5124]. The secure variants of the profiles do | |||
not directly interoperate with the non-secure variants, due to the | not directly interoperate with the non-secure variants, due to the | |||
presence of additional header fields for authentication in SRTP | presence of additional header fields for authentication in SRTP | |||
packets and cryptographic transformation of the payload. WebRTC | packets and cryptographic transformation of the payload. WebRTC | |||
requires the use of the RTP/SAVPF profile, and this MUST be | requires the use of the RTP/SAVPF profile, and this MUST be | |||
signalled if SDP is used. Interworking functions might transform | signalled if SDP is used. Interworking functions might transform | |||
this into the RTP/SAVP profile for a legacy use case, by | this into the RTP/SAVP profile for a legacy use case, by | |||
indicating to the WebRTC end-point that the RTP/SAVPF is used, and | indicating to the WebRTC end-point that the RTP/SAVPF is used, and | |||
limiting the usage of the "a=rtcp:" attribute to indicate a trr- | limiting the usage of the "a=rtcp-fb:" attribute to indicate a | |||
int value of 4 seconds. | trr-int value of 4 seconds. | |||
Transport Information: Source and destination IP address(es) and | Transport Information: Source and destination IP address(es) and | |||
ports for RTP and RTCP MUST be signalled for each RTP session. In | ports for RTP and RTCP MUST be signalled for each RTP session. In | |||
WebRTC these transport addresses will be provided by ICE that | WebRTC these transport addresses will be provided by ICE that | |||
signals candidates and arrives at nominated candidate address | signals candidates and arrives at nominated candidate address | |||
pairs. If RTP and RTCP multiplexing [RFC5761] is to be used, such | pairs. If RTP and RTCP multiplexing [RFC5761] is to be used, such | |||
that a single port is used for RTP and RTCP flows, this MUST be | that a single port, i.e. transport-layer flow, is used for RTP and | |||
signalled (see Section 4.5). | RTCP flows, this MUST be signalled (see Section 4.5). | |||
RTP Payload Types, media formats, and format parameters: The mapping | RTP Payload Types, media formats, and format parameters: The mapping | |||
between media type names (and hence the RTP payload formats to be | between media type names (and hence the RTP payload formats to be | |||
used), and the RTP payload type numbers MUST be signalled. Each | used), and the RTP payload type numbers MUST be signalled. Each | |||
media type MAY also have a number of media type parameters that | media type MAY also have a number of media type parameters that | |||
MUST also be signalled to configure the codec and RTP payload | MUST also be signalled to configure the codec and RTP payload | |||
format (the "a=fmtp:" line from SDP). Section 4.3 of this memo | format (the "a=fmtp:" line from SDP). Section 4.3 of this memo | |||
discusses requirements for uniqueness of payload types. | discusses requirements for uniqueness of payload types. | |||
RTP Extensions: The RTP extensions to be used SHOULD be agreed upon, | RTP Extensions: The RTP extensions to be used SHOULD be agreed upon, | |||
skipping to change at page 23, line 22 | skipping to change at page 25, line 51 | |||
11. WebRTC API Considerations | 11. WebRTC API Considerations | |||
The WebRTC API [W3C.WD-webrtc-20130910] and the Media Capture and | The WebRTC API [W3C.WD-webrtc-20130910] and the Media Capture and | |||
Streams API [W3C.WD-mediacapture-streams-20130903] define and use | Streams API [W3C.WD-mediacapture-streams-20130903] define and use | |||
the concept of a MediaStream that consists of zero or more | the concept of a MediaStream that consists of zero or more | |||
MediaStreamTracks. A MediaStreamTrack is an individual stream of | MediaStreamTracks. A MediaStreamTrack is an individual stream of | |||
media from any type of media source like a microphone or a camera, | media from any type of media source like a microphone or a camera, | |||
but also conceptual sources, like an audio mix or a video composition, | but also conceptual sources, like an audio mix or a video composition, | |||
are possible. The MediaStreamTracks within a MediaStream need to be | are possible. The MediaStreamTracks within a MediaStream need to be | |||
capable of synchronised playout. The text below uses the | capable of synchronised playout. | |||
terminology from [I-D.ietf-avtext-rtp-grouping-taxonomy]. | ||||
A MediaStreamTrack's realisation in RTP in the context of an | A MediaStreamTrack's realisation in RTP in the context of an | |||
RTCPeerConnection consists of a source packet stream identified with | RTCPeerConnection consists of a source packet stream identified with | |||
an SSRC within an RTP session part of the RTCPeerConnection. The | an SSRC within an RTP session part of the RTCPeerConnection. The | |||
MediaStreamTrack can also result in additional packet streams, and | MediaStreamTrack can also result in additional packet streams, and | |||
thus SSRCs, in the same RTP session. These can be dependent packet | thus SSRCs, in the same RTP session. These can be dependent packet | |||
streams from scalable encoding of the source stream associated with | streams from scalable encoding of the source stream associated with | |||
the MediaStreamTrack, if such a media encoder is used. They can also | the MediaStreamTrack, if such a media encoder is used. They can also | |||
be redundancy packet streams, these are created when applying Forward | be redundancy packet streams, these are created when applying Forward | |||
Error Correction (Section 6.2) or RTP retransmission (Section 6.1) to | Error Correction (Section 6.2) or RTP retransmission (Section 6.1) to | |||
skipping to change at page 24, line 8 | skipping to change at page 26, line 35 | |||
for multiple source packet streams to share encoded streams (but not | for multiple source packet streams to share encoded streams (but not | |||
packet streams), but this is an implementation choice to try to | packet streams), but this is an implementation choice to try to | |||
utilise such optimisations. Note that such optimizations would need | utilise such optimisations. Note that such optimizations would need | |||
to take into account that the constraints for one of the | to take into account that the constraints for one of the | |||
MediaStreamTracks can at any moment change, meaning that the encoding | MediaStreamTracks can at any moment change, meaning that the encoding | |||
configurations might no longer be identical. | configurations might no longer be identical. | |||
The same MediaStreamTrack can also be included in multiple | The same MediaStreamTrack can also be included in multiple | |||
MediaStreams, thus multiple sets of MediaStreams can implicitly need | MediaStreams, thus multiple sets of MediaStreams can implicitly need | |||
to use the same synchronisation base. To ensure that this works in | to use the same synchronisation base. To ensure that this works in | |||
all cases, and does not force an endpoint to change synchronisation base | all cases, and does not force an end-point to change synchronisation | |||
and CNAME in the middle of an ongoing delivery of any packet streams, | base and CNAME in the middle of an ongoing delivery of any packet | |||
which would cause media disruption, all MediaStreamTracks and their | streams, which would cause media disruption, all MediaStreamTracks | |||
associated SSRCs originating from the same endpoint MUST be sent | and their associated SSRCs originating from the same end-point need | |||
using the same CNAME within one RTCPeerConnection as well as across | to be sent using the same CNAME within one RTCPeerConnection. This | |||
all RTCPeerConnections part of the same communication session | motivates the strong recommendation in Section 4.9 to only use a | |||
context, which for a browser is a single origin. | single CNAME. | |||
Note: It is important that the same CNAME is not used in different | The requirement on using the same CNAME for all SSRCs that | |||
communication session contexts or origins, as that could enable | originate from the same end-point does not require middleboxes | |||
tracking of a user and its device usage of different services. | that forward traffic from multiple end-points to only use a | |||
See Section 4.4.1 of Security Considerations for WebRTC | single CNAME. | |||
[I-D.ietf-rtcweb-security] for further discussion. | ||||
The reasons to require the same CNAME across multiple | Different CNAMEs normally need to be used for different | |||
RTCPeerConnections is to enable synchronisation of different | RTCPeerConnection instances, as specified in Section 4.9. Having two | |||
MediaStreamTracks originating from one endpoint despite them being | communication sessions with the same CNAME could enable tracking of a | |||
transported over different RTCPeerConnections. | user or device across different services (see Section 4.4.1 of | |||
[I-D.ietf-rtcweb-security] for details). A web application can | ||||
request that the CNAMEs used in different RTCPeerConnections within a | ||||
same-origin context be the same; this allows for synchronization of | ||||
the endpoint's RTP packet streams across the different | ||||
RTCPeerConnections. | ||||
The above will currently force a WebRTC endpoint that receives an | Note: this doesn't result in a tracking issue, since the creation | |||
of matching CNAMEs depends on existing tracking. | ||||
The above will currently force a WebRTC end-point that receives an | ||||
MediaStreamTrack on one RTCPeerConnection and adds it as an outgoing | MediaStreamTrack on one RTCPeerConnection and adds it as an outgoing | |||
on any RTCPeerConnection to perform resynchronisation of the stream. | on any RTCPeerConnection to perform resynchronisation of the stream. | |||
This is because the sending party needs to change the CNAME, which implies | This is because the sending party needs to change the CNAME, which implies | |||
that it has to use a locally available system clock as timebase for | that it has to use a locally available system clock as timebase for | |||
the synchronisation. Thus, the relative relation between the | the synchronisation. Thus, the relative relation between the | |||
timebase of the incoming stream and the system sending out needs to | timebase of the incoming stream and the system sending out needs to | |||
be defined. This relation also needs monitoring for clock drift and | be defined. This relation also needs monitoring for clock drift and | |||
likely adjustments of the synchronisation. The sending entity is | likely adjustments of the synchronisation. The sending entity is | |||
also responsible for congestion control for its sent streams. In | also responsible for congestion control for its sent streams. In | |||
cases of packet loss the loss of incoming data also needs to be | cases of packet loss the loss of incoming data also needs to be | |||
handled. This leads to the observation that the method that is least | handled. This leads to the observation that the method that is least | |||
likely to cause issues or interruptions in the outgoing source packet | likely to cause issues or interruptions in the outgoing source packet | |||
stream is a model of full decoding, including repair etc followed by | stream is a model of full decoding, including repair etc followed by | |||
encoding of the media again into the outgoing packet stream. | encoding of the media again into the outgoing packet stream. | |||
Optimisations of this method are clearly possible and implementation | Optimisations of this method are clearly possible and implementation | |||
specific. | specific. | |||
A WebRTC endpoint MUST support receiving multiple MediaStreamTracks, | A WebRTC end-point MUST support receiving multiple MediaStreamTracks, | |||
where each of the different MediaStreamTracks (and their sets of | where each of the different MediaStreamTracks (and their sets of | |||
associated packet streams) uses different CNAMEs. However, | associated packet streams) uses different CNAMEs. However, | |||
MediaStreamTracks that are received with different CNAMEs have no | MediaStreamTracks that are received with different CNAMEs have no | |||
defined synchronisation. | defined synchronisation. | |||
Note: The motivation for supporting reception of multiple CNAMEs | Note: The motivation for supporting reception of multiple CNAMEs | |||
is to allow for forward compatibility with any future changes | is to allow for forward compatibility with any future changes | |||
that enable more efficient stream handling when endpoints relay/ | that enable more efficient stream handling when end-points relay/ | |||
forward streams. It also ensures that endpoints can interoperate | forward streams. It also ensures that end-points can interoperate | |||
with certain types of multi-stream middleboxes or endpoints that | with certain types of multi-stream middleboxes or end-points that | |||
are not WebRTC. | are not WebRTC. | |||
The binding between the WebRTC MediaStreams, MediaStreamTracks and | The binding between the WebRTC MediaStreams, MediaStreamTracks and | |||
the SSRC is done as specified in "Cross Session Stream Identification | the SSRC is done as specified in "Cross Session Stream Identification | |||
in the Session Description Protocol" [I-D.ietf-mmusic-msid]. This | in the Session Description Protocol" [I-D.ietf-mmusic-msid]. This | |||
document [I-D.ietf-mmusic-msid] also defines, in section 4.1, how to | document [I-D.ietf-mmusic-msid] also defines, in section 4.1, how to | |||
map unknown source packet stream SSRCs to MediaStreamTracks and | map unknown source packet stream SSRCs to MediaStreamTracks and | |||
MediaStreams. Commonly the RTP Payload Type of any incoming packets | MediaStreams. Commonly the RTP Payload Type of any incoming packets | |||
will reveal if the packet stream is a source stream or a redundancy | will reveal if the packet stream is a source stream or a redundancy | |||
or dependent packet stream. The association to the correct source | or dependent packet stream. The association to the correct source | |||
skipping to change at page 25, line 36 | skipping to change at page 28, line 23 | |||
12. RTP Implementation Considerations | 12. RTP Implementation Considerations | |||
The following discussion provides some guidance on the implementation | The following discussion provides some guidance on the implementation | |||
of the RTP features described in this memo. The focus is on a WebRTC | of the RTP features described in this memo. The focus is on a WebRTC | |||
end-point implementation perspective, and while some mention is made | end-point implementation perspective, and while some mention is made | |||
of the behaviour of middleboxes, that is not the focus of this memo. | of the behaviour of middleboxes, that is not the focus of this memo. | |||
12.1. Configuration and Use of RTP Sessions | 12.1. Configuration and Use of RTP Sessions | |||
A WebRTC end-point will be a simultaneous participant in one or more | A WebRTC end-point will be a simultaneous participant in one or more | |||
RTP sessions. Each RTP session can convey multiple media flows, and | RTP sessions. Each RTP session can convey multiple media sources, | |||
can include media data from multiple end-points. In the following, | and can include media data from multiple end-points. In the | |||
we outline some ways in which WebRTC end-points can configure and use | following, we outline some ways in which WebRTC end-points can | |||
RTP sessions. | configure and use RTP sessions. | |||
12.1.1. Use of Multiple Media Flows Within an RTP Session | 12.1.1. Use of Multiple Media Sources Within an RTP Session | |||
RTP is a group communication protocol, and in a WebRTC context every | RTP is a group communication protocol, and every RTP session can | |||
RTP session can potentially contain multiple media flows. There are | potentially contain multiple RTP packet streams. There are several | |||
several reasons why this might be desirable: | reasons why this might be desirable: | |||
Multiple media types: Outside of WebRTC, it is common to use one RTP | Multiple media types: Outside of WebRTC, it is common to use one RTP | |||
session for each type of media (e.g., one RTP session for audio | session for each type of media source (e.g., one RTP session for | |||
and one for video, each sent on a different UDP port). However, | audio sources and one for video sources, each sent over different | |||
to reduce the number of UDP ports used, the default in WebRTC is | transport layer flows). However, to reduce the number of UDP | |||
to send all types of media in a single RTP session, as described | ports used, the default in WebRTC is to send all types of media in | |||
in Section 4.4, using RTP and RTCP multiplexing (Section 4.5) to | a single RTP session, as described in Section 4.4, using RTP and | |||
further reduce the number of UDP ports needed. This RTP session | RTCP multiplexing (Section 4.5) to further reduce the number of | |||
then uses only one UDP flow, but will contain multiple RTP media | UDP ports needed. This RTP session then uses only one bi- | |||
streams, each containing a different type of media. A common | directional transport-layer flow, but will contain multiple RTP | |||
example might be an end-point with a camera and microphone that | packet streams, each containing a different type of media. A | |||
sends two RTP streams, one video and one audio, into a single RTP | common example might be an end-point with a camera and microphone | |||
session. | that sends two RTP packet streams, one video and one audio, into a | |||
single RTP session. | ||||
Multiple Capture Devices: A WebRTC end-point might have multiple | Multiple Capture Devices: A WebRTC end-point might have multiple | |||
cameras, microphones, or other media capture devices, and so might | cameras, microphones, or other media capture devices, and so might | |||
want to generate several RTP media streams of the same media type. | want to generate several RTP packet streams of the same media | |||
Alternatively, it might want to send media from a single capture | type. Alternatively, it might want to send media from a single | |||
device in several different formats or quality settings at once. | capture device in several different formats or quality settings at | |||
Both can result in a single end-point sending multiple RTP media | once. Both can result in a single end-point sending multiple RTP | |||
streams of the same media type into a single RTP session at the | packet streams of the same media type into a single RTP session at | |||
same time. | the same time. | |||
Associated Repair Data: An end-point might send a media stream that | Associated Repair Data: An end-point might send an RTP packet stream | |||
is somehow associated with another stream. For example, it might | that is somehow associated with another stream. For example, it | |||
send an RTP stream that contains FEC or retransmission data | might send an RTP packet stream that contains FEC or | |||
relating to another stream. Some RTP payload formats send this | retransmission data relating to another stream. Some RTP payload | |||
sort of associated repair data as part of the original media | formats send this sort of associated repair data as part of the | |||
stream, while others send it as a separate stream. | source packet stream, while others send it as a separate packet | |||
stream. | ||||
Layered or Multiple Description Coding: An end-point can use a | Layered or Multiple Description Coding: An end-point can use a | |||
layered media codec, for example H.264 SVC, or a multiple | layered media codec, for example H.264 SVC, or a multiple | |||
description codec, that generates multiple media flows, each with | description codec, that generates multiple RTP packet streams, | |||
a distinct RTP SSRC, within a single RTP session. | each with a distinct RTP SSRC, within a single RTP session. | |||
RTP Mixers, Translators, and Other Middleboxes: An RTP session, in | RTP Mixers, Translators, and Other Middleboxes: An RTP session, in | |||
the WebRTC context, is a point-to-point association between an | the WebRTC context, is a point-to-point association between an | |||
end-point and some other peer device, where those devices share a | end-point and some other peer device, where those devices share a | |||
common SSRC space. The peer device might be another WebRTC end- | common SSRC space. The peer device might be another WebRTC end- | |||
point, or it might be an RTP mixer, translator, or some other form | point, or it might be an RTP mixer, translator, or some other form | |||
of media processing middlebox. In the latter cases, the middlebox | of media processing middlebox. In the latter cases, the middlebox | |||
might send mixed or relayed RTP streams from several participants, | might send mixed or relayed RTP streams from several participants, | |||
that the WebRTC end-point will need to render. Thus, even though | that the WebRTC end-point will need to render. Thus, even though | |||
a WebRTC end-point might only be a member of a single RTP session, | a WebRTC end-point might only be a member of a single RTP session, | |||
the peer device might be extending that RTP session to incorporate | the peer device might be extending that RTP session to incorporate | |||
other end-points. WebRTC is a group communication environment and | other end-points. WebRTC is a group communication environment and | |||
end-points need to be capable of receiving, decoding, and playing | end-points need to be capable of receiving, decoding, and playing | |||
out multiple RTP media streams at once, even in a single RTP | out multiple RTP packet streams at once, even in a single RTP | |||
session. | session. | |||
12.1.2. Use of Multiple RTP Sessions | 12.1.2. Use of Multiple RTP Sessions | |||
In addition to sending and receiving multiple media streams within a | In addition to sending and receiving multiple RTP packet streams | |||
single RTP session, a WebRTC end-point might participate in multiple | within a single RTP session, a WebRTC end-point might participate in | |||
RTP sessions. There are several reasons why a WebRTC end-point might | multiple RTP sessions. There are several reasons why a WebRTC end- | |||
choose to do this: | point might choose to do this: | |||
To interoperate with legacy devices: The common practice in the non- | To interoperate with legacy devices: The common practice in the non- | |||
WebRTC world is to send different types of media in separate RTP | WebRTC world is to send different types of media in separate RTP | |||
sessions, for example using one RTP session for audio and another | sessions, for example using one RTP session for audio and another | |||
RTP session, on a different UDP port, for video. All WebRTC end- | RTP session, on a separate transport layer flow, for video. All | |||
points need to support the option of sending different types of | WebRTC end-points need to support the option of sending different | |||
media on different RTP sessions, so they can interwork with such | types of media on different RTP sessions, so they can interwork | |||
legacy devices. This is discussed further in Section 4.4. | with such legacy devices. This is discussed further in | |||
Section 4.4. | ||||
To provide enhanced quality of service: Some network-based quality | To provide enhanced quality of service: Some network-based quality | |||
of service mechanisms operate on the granularity of UDP 5-tuples. | of service mechanisms operate on the granularity of transport | |||
If it is desired to use these mechanisms to provide differentiated | layer flows. If it is desired to use these mechanisms to provide | |||
quality of service for some RTP flows, then those RTP flows need | differentiated quality of service for some RTP packet streams, | |||
to be sent in a separate RTP session using a different UDP port | then those RTP packet streams need to be sent in a separate RTP | |||
number, and with appropriate quality of service marking. This is | session using a different transport-layer flow, and with | |||
discussed further in Section 12.1.3. | appropriate quality of service marking. This is discussed further | |||
in Section 12.1.3. | ||||
To separate media with different purposes: An end-point might want | To separate media with different purposes: An end-point might want | |||
to send media streams that have different purposes on different | to send RTP packet streams that have different purposes on | |||
RTP sessions, to make it easy for the peer device to distinguish | different RTP sessions, to make it easy for the peer device to | |||
them. For example, some centralised multiparty conferencing | distinguish them. For example, some centralised multiparty | |||
systems display the active speaker in high resolution, but show | conferencing systems display the active speaker in high | |||
low resolution "thumbnails" of other participants. Such systems | resolution, but show low resolution "thumbnails" of other | |||
might configure the end-points to send simulcast high- and low- | participants. Such systems might configure the end-points to send | |||
resolution versions of their video using separate RTP sessions, to | simulcast high- and low-resolution versions of their video using | |||
simplify the operation of the central mixer. In the WebRTC | separate RTP sessions, to simplify the operation of the RTP | |||
context this is currently possible to accomplish by establishing | middlebox. In the WebRTC context this is currently possible to | |||
multiple WebRTC MediaStreamTracks that have the same media source | accomplish by establishing multiple WebRTC MediaStreamTracks | |||
in one (or more) RTCPeerConnection. Each MediaStreamTrack is then | that have the same media source in one (or more) | |||
configured to deliver a particular media quality and thus media | RTCPeerConnection. Each MediaStreamTrack is then configured to | |||
bit-rate, and will produce an independently encoded version with | deliver a particular media quality and thus media bit-rate, and | |||
the codec parameters agreed specifically in the context of that | will produce an independently encoded version with the codec | |||
RTCPeerConnection. The central mixer can distinguish packets | parameters agreed specifically in the context of that | |||
RTCPeerConnection. The RTP middlebox can distinguish packets | ||||
corresponding to the low- and high-resolution streams by | corresponding to the low- and high-resolution streams by | |||
inspecting their SSRC, RTP payload type, or some other information | inspecting their SSRC, RTP payload type, or some other information | |||
contained in RTP payload, RTP header extension or RTCP packets, | contained in RTP payload, RTP header extension or RTCP packets, | |||
but it can be easier to distinguish the flows if they arrive on | but it can be easier to distinguish the RTP packet streams if they | |||
separate RTP sessions on separate UDP ports. | arrive on separate RTP sessions on separate transport-layer flows. | |||
To directly connect with multiple peers: A multi-party conference | To directly connect with multiple peers: A multi-party conference | |||
does not need to use a central mixer. Rather, a multi-unicast | does not need to use an RTP middlebox. Rather, a multi-unicast | |||
mesh can be created, comprising several distinct RTP sessions, | mesh can be created, comprising several distinct RTP sessions, | |||
with each participant sending RTP traffic over a separate RTP | with each participant sending RTP traffic over a separate RTP | |||
session (that is, using an independent RTCPeerConnection object) | session (that is, using an independent RTCPeerConnection object) | |||
to every other participant, as shown in Figure 1. This topology | to every other participant, as shown in Figure 1. This topology | |||
has the benefit of not requiring a central mixer node that is | has the benefit of not requiring an RTP middlebox node that is | |||
trusted to access and manipulate the media data. The downside is | trusted to access and manipulate the media data. The downside is | |||
that it increases the used bandwidth at each sender by requiring | that it increases the used bandwidth at each sender by requiring | |||
one copy of the RTP media streams for each participant that is | one copy of the RTP packet streams for each participant that is | |||
part of the same session beyond the sender itself. | part of the same session beyond the sender itself. | |||
+---+ +---+ | +---+ +---+ | |||
| A |<--->| B | | | A |<--->| B | | |||
+---+ +---+ | +---+ +---+ | |||
^ ^ | ^ ^ | |||
\ / | \ / | |||
\ / | \ / | |||
v v | v v | |||
+---+ | +---+ | |||
skipping to change at page 28, line 50 | skipping to change at page 31, line 46 | |||
peers, thus not forcing B to endure the same quality reductions if | peers, thus not forcing B to endure the same quality reductions if | |||
there are limitations in the transport from A to C as C will. It | there are limitations in the transport from A to C as C will. It | |||
is believed that these advantages outweigh the limitations in | is believed that these advantages outweigh the limitations in | |||
debugging power. | debugging power. | |||
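The bandwidth trade-off of the multi-unicast mesh can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the draft: the function name and the per-peer bit-rate figures are hypothetical. It assumes the sender encodes one copy of its media per peer and may choose a different bit-rate for each peer.

```python
def mesh_upstream_bps(per_peer_bps):
    """Total upstream bit-rate for one sender in a multi-unicast mesh.

    per_peer_bps maps each remote peer (hypothetical labels) to the
    bit-rate of the copy sent to that peer. One copy is sent per peer,
    so the sender's cost is simply the sum over all peers.
    """
    return sum(per_peer_bps.values())


# A sends to B at full rate, but reduces the rate towards C because
# the path from A to C is constrained. B is unaffected by C's limit.
rates_from_a = {"B": 1_000_000, "C": 300_000}
total_from_a = mesh_upstream_bps(rates_from_a)
```

With a central middlebox, A would send a single copy instead, at the cost of trusting the middlebox with the media.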
To indirectly connect with multiple peers: A common scenario in | To indirectly connect with multiple peers: A common scenario in | |||
multi-party conferencing is to create indirect connections to | multi-party conferencing is to create indirect connections to | |||
multiple peers, using an RTP mixer, translator, or some other type | multiple peers, using an RTP mixer, translator, or some other type | |||
of RTP middlebox. Figure 2 outlines a simple topology that might | of RTP middlebox. Figure 2 outlines a simple topology that might | |||
be used in a four-person centralised conference. The middlebox | be used in a four-person centralised conference. The middlebox | |||
acts to optimise the transmission of RTP media streams from | acts to optimise the transmission of RTP packet streams from | |||
certain perspectives, either by only sending some of the received | certain perspectives, either by only sending some of the received | |||
RTP media stream to any given receiver, or by providing a combined | RTP packet stream to any given receiver, or by providing a | |||
RTP media stream out of a set of contributing streams. | combined RTP packet stream out of a set of contributing streams. | |||
+---+ +-------------+ +---+ | +---+ +-------------+ +---+ | |||
| A |<---->| |<---->| B | | | A |<---->| |<---->| B | | |||
+---+ | RTP mixer, | +---+ | +---+ | RTP mixer, | +---+ | |||
| translator, | | | translator, | | |||
| or other | | | or other | | |||
+---+ | middlebox | +---+ | +---+ | middlebox | +---+ | |||
| C |<---->| |<---->| D | | | C |<---->| |<---->| D | | |||
+---+ +-------------+ +---+ | +---+ +-------------+ +---+ | |||
Figure 2: RTP mixer with only unicast paths | Figure 2: RTP mixer with only unicast paths | |||
There are various methods of implementation for the middlebox. If | There are various methods of implementation for the middlebox. If | |||
implemented as a standard RTP mixer or translator, a single RTP | implemented as a standard RTP mixer or translator, a single RTP | |||
session will extend across the middlebox and encompass all the | session will extend across the middlebox and encompass all the | |||
end-points in one multi-party session. Other types of middlebox | end-points in one multi-party session. Other types of middlebox | |||
might use separate RTP sessions between each end-point and the | might use separate RTP sessions between each end-point and the | |||
middlebox. A common aspect is that these central nodes can use a | middlebox. A common aspect is that these RTP middleboxes can use | |||
number of tools to control the media encoding provided by a WebRTC | a number of tools to control the media encoding provided by a | |||
end-point. This includes functions like requesting that the | WebRTC end-point. This includes functions like requesting that | |||
encoder break the encoding chain and produce a so-called Intra | the encoder break the encoding chain and produce a so-called | |||
frame. Another is limiting the bit-rate of a given stream to | Intra frame. Another is limiting the bit-rate of a given | |||
better suit the mixer view of the multiple down-streams. Others | stream to better suit the mixer view of the multiple down-streams. | |||
are controlling the most suitable frame-rate, picture resolution, | Others are controlling the most suitable frame-rate, picture | |||
and the trade-off between frame-rate and spatial quality. The | resolution, and the trade-off between frame-rate and spatial | |||
middlebox gets the significant responsibility to correctly perform | quality. The middlebox gets the significant responsibility to | |||
congestion control, source identification, and synchronisation | correctly perform congestion control, source identification, and | |||
management while providing the application with suitable media | synchronisation management while providing the application with | |||
optimizations. The middlebox also has to be a trusted node when | suitable media optimizations. The middlebox also has to be a | |||
it comes to security, since it manipulates either the RTP header | trusted node when it comes to security, since it manipulates | |||
or the media itself (or both) received from one end-point, before | either the RTP header or the media itself (or both) received from | |||
sending it on towards the end-point(s), thus it needs to be able | one end-point, before sending it on towards the end-point(s), thus | |||
to decrypt and then encrypt it before sending it out. | it needs to be able to decrypt and then encrypt it before sending | |||
it out. | ||||
RTP Mixers can create a situation where an end-point experiences a | RTP Mixers can create a situation where an end-point experiences a | |||
situation in-between a session with only two end-points and | situation in-between a session with only two end-points and | |||
multiple RTP sessions. Mixers are expected to not forward RTCP | multiple RTP sessions. Mixers are expected to not forward RTCP | |||
reports regarding RTP media streams across themselves. This is | reports regarding RTP packet streams across themselves. This is | |||
due to the difference in the RTP media streams provided to the | due to the difference in the RTP packet streams provided to the | |||
different end-points. The original media source lacks information | different end-points. The original media source lacks information | |||
about a mixer's manipulations prior to sending it to the different | about a mixer's manipulations prior to sending it to the different | |||
receivers. This scenario also means that an end-point's | receivers. This scenario also means that an end-point's | |||
feedback or requests go to the mixer. When the mixer can't act | feedback or requests go to the mixer. When the mixer can't act | |||
on this by itself, it is forced to go to the original media source | on this by itself, it is forced to go to the original media source | |||
to fulfil the receiver's request. This will not necessarily be | to fulfil the receiver's request. This will not necessarily be | |||
explicitly visible in any RTP and RTCP traffic, but the interactions | explicitly visible in any RTP and RTCP traffic, but the interactions | |||
and the time to complete them will indicate such dependencies. | and the time to complete them will indicate such dependencies. | |||
Providing source authentication in multi-party scenarios is a | Providing source authentication in multi-party scenarios is a | |||
skipping to change at page 30, line 26 | skipping to change at page 33, line 22 | |||
each others' master keys, and can thus inject packets claimed to | each others' master keys, and can thus inject packets claimed to | |||
come from another end-point in the session. Any node performing | come from another end-point in the session. Any node performing | |||
relay can perform non-cryptographic mitigation by preventing | relay can perform non-cryptographic mitigation by preventing | |||
forwarding of packets that have SSRC fields that came from other | forwarding of packets that have SSRC fields that came from other | |||
end-points before. For cryptographic verification of the source, | end-points before. For cryptographic verification of the source, | |||
SRTP would require additional security mechanisms, for example | SRTP would require additional security mechanisms, for example | |||
TESLA for SRTP [RFC4383], that are not part of the base WebRTC | TESLA for SRTP [RFC4383], that are not part of the base WebRTC | |||
standards. | standards. | |||
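The non-cryptographic mitigation mentioned above can be sketched as a small filter in the relay. This is an illustrative sketch only: the class and method names are hypothetical, and it assumes the relay can tell which end-point each packet arrived from (for example, by transport-layer flow). It deliberately ignores SSRC collisions, SSRC changes, and state time-outs, which a real relay would have to handle.

```python
class SsrcRelayFilter:
    """Remember which end-point first used each SSRC and refuse to
    forward packets that reuse that SSRC from a different end-point."""

    def __init__(self):
        # Maps SSRC -> identifier of the end-point first seen using it.
        self._owner = {}

    def should_forward(self, endpoint_id, ssrc):
        # setdefault records the first user of the SSRC and returns the
        # recorded owner on every later call.
        owner = self._owner.setdefault(ssrc, endpoint_id)
        return owner == endpoint_id


relay = SsrcRelayFilter()
first = relay.should_forward("A", 0x1234ABCD)    # A claims the SSRC
spoofed = relay.should_forward("B", 0x1234ABCD)  # B reuses A's SSRC
```

This only prevents one end-point from impersonating another through the relay; it provides no cryptographic proof of origin.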
To forward media between multiple peers: It is sometimes desirable | To forward media between multiple peers: It is sometimes desirable | |||
for an end-point that receives an RTP media stream to be able to | for an end-point that receives an RTP packet stream to be able to | |||
forward that media stream to a third party. There are some obvious | forward that RTP packet stream to a third party. There are some | |||
security and privacy implications in supporting this, but also | obvious security and privacy implications in supporting this, but | |||
potential uses. This is supported in the W3C API by taking the | also potential uses. This is supported in the W3C API by taking | |||
received and decoded media and using it as a media source that is | the received and decoded media and using it as a media source that | |||
re-encoded and transmitted as a new stream. | is re-encoded and transmitted as a new stream. | |||
At the RTP layer, media forwarding acts as a back-to-back RTP | At the RTP layer, media forwarding acts as a back-to-back RTP | |||
receiver and RTP sender. The receiving side terminates the RTP | receiver and RTP sender. The receiving side terminates the RTP | |||
session and decodes the media, while the sender side re-encodes | session and decodes the media, while the sender side re-encodes | |||
and transmits the media using an entirely separate RTP session. | and transmits the media using an entirely separate RTP session. | |||
The original sender will only see a single receiver of the media, | The original sender will only see a single receiver of the media, | |||
and will not be able to tell that forwarding is happening based on | and will not be able to tell that forwarding is happening based on | |||
RTP-layer information since the RTP session that is used to send | RTP-layer information since the RTP session that is used to send | |||
the forwarded media is not connected to the RTP session on which | the forwarded media is not connected to the RTP session on which | |||
the media was received by the node doing the forwarding. | the media was received by the node doing the forwarding. | |||
The end-point that is performing the forwarding is responsible for | The end-point that is performing the forwarding is responsible for | |||
producing an RTP media stream suitable for onwards transmission. | producing an RTP packet stream suitable for onwards transmission. | |||
The outgoing RTP session that is used to send the forwarded media | The outgoing RTP session that is used to send the forwarded media | |||
is entirely separate to the RTP session on which the media was | is entirely separate to the RTP session on which the media was | |||
received. This will require media transcoding for congestion | received. This will require media transcoding for congestion | |||
control purposes to produce a suitable bit-rate for the outgoing | control purposes to produce a suitable bit-rate for the outgoing | |||
RTP session, reducing media quality and forcing the forwarding | RTP session, reducing media quality and forcing the forwarding | |||
end-point to spend the resource on the transcoding. The media | end-point to spend the resource on the transcoding. The media | |||
transcoding does result in a separation of the two different legs | transcoding does result in a separation of the two different legs | |||
removing almost all dependencies, and allowing the forwarding end- | removing almost all dependencies, and allowing the forwarding end- | |||
point to optimize its media transcoding operation. The cost is | point to optimize its media transcoding operation. The cost is | |||
greatly increased computational complexity on the forwarding node. | greatly increased computational complexity on the forwarding node. | |||
Receivers of the forwarded stream will see the forwarding device | Receivers of the forwarded stream will see the forwarding device | |||
as the sender of the stream, and will not be able to tell from the | as the sender of the stream, and will not be able to tell from the | |||
RTP layer that they are receiving a forwarded stream rather than | RTP layer that they are receiving a forwarded stream rather than | |||
an entirely new media stream generated by the forwarding device. | an entirely new RTP packet stream generated by the forwarding | |||
device. | ||||
12.1.3. Differentiated Treatment of Flows | 12.1.3. Differentiated Treatment of RTP Packet Streams | |||
There are use cases for differentiated treatment of RTP media | There are use cases for differentiated treatment of RTP packet | |||
streams. Such differentiation can happen at several places in the | streams. Such differentiation can happen at several places in the | |||
system. First of all is the prioritization within the end-point | system. First of all is the prioritization within the end-point | |||
sending the media, which controls both which RTP media streams | sending the media, which controls both which RTP packet streams | |||
will be sent and their allocation of bit-rate out of the current | will be sent and their allocation of bit-rate out of the current | |||
available aggregate as determined by the congestion control. | available aggregate as determined by the congestion control. | |||
It is expected that the WebRTC API [W3C.WD-webrtc-20130910] will | It is expected that the WebRTC API [W3C.WD-webrtc-20130910] will | |||
allow the application to indicate relative priorities for different | allow the application to indicate relative priorities for different | |||
MediaStreamTracks. These priorities can then be used to influence | MediaStreamTracks. These priorities can then be used to influence | |||
the local RTP processing, especially when it comes to congestion | the local RTP processing, especially when it comes to congestion | |||
control response in how to divide the available bandwidth between the | control response in how to divide the available bandwidth between the | |||
RTP flows. Any changes in relative priority will also need to be | RTP packet streams. Any changes in relative priority will also need | |||
considered for RTP flows that are associated with the main RTP flows, | to be considered for RTP packet streams that are associated with the | |||
such as RTP retransmission streams and FEC. The importance of such | main RTP packet streams, such as redundant streams for RTP | |||
associated RTP traffic flows is dependent on the media type and codec | retransmission and FEC. The importance of such redundant RTP packet | |||
used, in regards to how robust that codec is to packet loss. | streams is dependent on the media type and codec used, in regards to | |||
However, a default policy might be to use the same priority for | how robust that codec is to packet loss. However, a default policy | |||
associated RTP flows as for the primary RTP flow. | might be to use the same priority for redundant RTP packet streams | |||
as for the source RTP packet stream. | ||||
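One way to picture how relative priorities could influence the division of bandwidth is a proportional split. The sketch below is an assumption made for illustration: the draft does not specify an allocation algorithm, and the stream names, numeric weights, and function name are hypothetical. It applies the default policy described above, where a redundant stream inherits the priority of its source stream unless a priority is given explicitly.

```python
def divide_bandwidth(available_bps, priorities, associated=None):
    """Split available_bps between streams in proportion to priority.

    priorities: stream name -> relative weight (hypothetical values).
    associated: redundant stream name -> name of its source stream.
    """
    weights = dict(priorities)
    for redundant, source in (associated or {}).items():
        # Default policy: same priority as the source stream.
        weights.setdefault(redundant, priorities[source])
    total = sum(weights.values())
    # Integer division keeps the result within the available aggregate.
    return {name: available_bps * w // total for name, w in weights.items()}


shares = divide_bandwidth(
    1_000_000,
    {"audio": 2, "video": 1},
    associated={"video-fec": "video"},
)
```

The available aggregate itself would come from congestion control; this sketch only covers the division step.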
Secondly, the network can prioritize packet flows, including RTP | Secondly, the network can prioritize transport-layer flows and sub- | |||
media streams. Typically, differential treatment includes two steps, | flows, including RTP packet streams. Typically, differential | |||
the first being identifying whether an IP packet belongs to a class | treatment includes two steps, the first being identifying whether an | |||
that has to be treated differently, the second the actual mechanism | IP packet belongs to a class that has to be treated differently, the | |||
to prioritize packets. This is done according to three methods: | second the actual mechanism to prioritize packets. This is done | |||
according to three methods: | ||||
DiffServ: The end-point marks a packet with a DiffServ code point to | DiffServ: The end-point marks a packet with a DiffServ code point to | |||
indicate to the network that the packet belongs to a particular | indicate to the network that the packet belongs to a particular | |||
class. | class. | |||
Flow based: Packets that need to be given a particular treatment are | Flow based: Packets that need to be given a particular treatment are | |||
identified using a combination of IP address and port. | identified using a combination of IP address and port. | |||
Deep Packet Inspection: A network classifier (DPI) inspects the | Deep Packet Inspection: A network classifier (DPI) inspects the | |||
packet and tries to determine if the packet represents a | packet and tries to determine if the packet represents a | |||
particular application and type that is to be prioritized. | particular application and type that is to be prioritized. | |||
Flow-based differentiation will provide the same treatment to all | Flow-based differentiation will provide the same treatment to all | |||
packets within a flow, i.e., relative prioritization is not possible. | packets within a transport-layer flow, i.e., relative prioritization | |||
Moreover, if the resources are limited it might not be possible to | is not possible. Moreover, if the resources are limited it might not | |||
provide differential treatment compared to best-effort for all the | be possible to provide differential treatment compared to best-effort | |||
flows in a WebRTC application. When flow-based differentiation is | for all the RTP packet streams in a WebRTC application. When flow- | |||
available the WebRTC application needs to know about it so that it | based differentiation is available the WebRTC application needs to | |||
can provide the separation of the RTP media streams onto different | know about it so that it can provide the separation of the RTP packet | |||
UDP flows to enable a more granular usage of flow based | streams onto different UDP flows to enable a more granular usage of | |||
differentiation. This at least allows different | flow based differentiation. This at least allows different | |||
prioritization of audio and video if desired by the application. | prioritization of audio and video if desired by the application. | |||
DiffServ assumes that either the end-point or a classifier can mark | DiffServ assumes that either the end-point or a classifier can mark | |||
the packets with an appropriate DSCP so that the packets are treated | the packets with an appropriate DSCP so that the packets are treated | |||
according to that marking. If the end-point is to mark the traffic | according to that marking. If the end-point is to mark the traffic | |||
two requirements arise in the WebRTC context: 1) The WebRTC | two requirements arise in the WebRTC context: 1) The WebRTC | |||
application or browser has to know which DSCP to use and that it can | application or browser has to know which DSCP to use and that it can | |||
use them on some set of RTP media streams. 2) The information needs | use them on some set of RTP packet streams. 2) The information needs | |||
to be propagated to the operating system when transmitting the | to be propagated to the operating system when transmitting the | |||
packet. Details of this process are outside the scope of this memo | packet. Details of this process are outside the scope of this memo | |||
and are further discussed in "DSCP and other packet markings for | and are further discussed in "DSCP and other packet markings for | |||
RTCWeb QoS" [I-D.dhesikan-tsvwg-rtcweb-qos]. | RTCWeb QoS" [I-D.ietf-tsvwg-rtcweb-qos]. | |||
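As an illustration of the second requirement, propagating the marking to the operating system, the sketch below sets the DSCP on a UDP socket through the IP_TOS socket option. It is a minimal sketch for IPv4 on platforms that honour IP_TOS; the helper names are hypothetical, and the choice of code point is outside the scope of this memo, so the value 46 (Expedited Forwarding) is used purely as an example.

```python
import socket


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP in the upper six bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def mark_socket(sock, dscp):
    """Ask the operating system to mark packets sent on sock.

    Whether the marking survives is up to the OS and the network;
    some platforms ignore or restrict this option.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


EXAMPLE_DSCP = 46  # illustrative only, not a recommendation
tos_byte = dscp_to_tos(EXAMPLE_DSCP)
```

A browser would perform the equivalent step internally; web applications have no direct access to socket options.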
For packet based marking schemes it might be possible to mark | For packet based marking schemes it might be possible to mark | |||
individual RTP packets differently based on the relative priority of | individual RTP packets differently based on the relative priority of | |||
the RTP payload. For example video codecs that have I, P, and B | the RTP payload. For example video codecs that have I, P, and B | |||
pictures could prioritise any payloads carrying only B frames less, | pictures could prioritise any payloads carrying only B frames less, | |||
as these are less damaging to lose. However, depending on the QoS | as these are less damaging to lose. However, depending on the QoS | |||
mechanism and what markings are applied, this can result in not | mechanism and what markings are applied, this can result in not | |||
only different packet drop probabilities but also packet reordering, | only different packet drop probabilities but also packet reordering, | |||
see [I-D.dhesikan-tsvwg-rtcweb-qos] for further discussion. As a | see [I-D.ietf-tsvwg-rtcweb-qos] for further discussion. As a default | |||
default policy all RTP packets related to a media stream ought to be | policy all RTP packets related to an RTP packet stream ought to be | |||
provided with the same prioritization; per-packet prioritization is | provided with the same prioritization; per-packet prioritization is | |||
outside the scope of this memo, but might be specified elsewhere in | outside the scope of this memo, but might be specified elsewhere in | |||
future. | future. | |||
It is also important to consider how RTCP packets associated with a | It is also important to consider how RTCP packets associated with a | |||
particular RTP media flow need to be marked. RTCP compound packets | particular RTP packet stream need to be marked. RTCP compound | |||
with Sender Reports (SR) ought to be marked with the same priority | packets with Sender Reports (SR) ought to be marked with the same | |||
as the RTP media flow itself, so the RTCP-based round-trip time (RTT) | priority as the RTP packet stream itself, so the RTCP-based round- | |||
measurements are done using the same flow priority as the media flow | trip time (RTT) measurements are done using the same transport-layer | |||
experiences. RTCP compound packets containing RR packets ought to be | flow priority as the RTP packet stream experiences. RTCP compound | |||
sent with the priority used by the majority of the RTP media flows | packets containing RR packets ought to be sent with the priority used | |||
reported on. RTCP packets containing time-critical feedback packets | by the majority of the RTP packet streams reported on. RTCP packets | |||
can use higher priority to improve the timeliness and likelihood of | containing time-critical feedback packets can use higher priority to | |||
delivery of such feedback. | improve the timeliness and likelihood of delivery of such feedback. | |||
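The RTCP marking guidance above can be summarised as a small decision function. This is a sketch under stated assumptions: the function name, the packet kind labels "SR", "RR", and "FEEDBACK", and the marking labels in the example are hypothetical, and the fall-back used when an RR reports on no streams is an assumption rather than something the text specifies.

```python
from collections import Counter


def rtcp_marking(kind, own_marking, reported_markings=(), urgent_marking=None):
    """Choose a priority marking for an RTCP compound packet.

    kind: "SR", "RR", or "FEEDBACK" (hypothetical labels).
    own_marking: marking of the sender's own RTP packet stream.
    reported_markings: markings of the streams being reported on.
    urgent_marking: optional higher priority for time-critical feedback.
    """
    if kind == "SR":
        # Same priority as the stream, so RTT is measured on that class.
        return own_marking
    if kind == "RR":
        if not reported_markings:
            return own_marking  # assumption: nothing reported on
        # Priority used by the majority of the streams reported on.
        return Counter(reported_markings).most_common(1)[0][0]
    if kind == "FEEDBACK":
        return urgent_marking if urgent_marking is not None else own_marking
    raise ValueError("unknown RTCP packet kind: %r" % (kind,))


rr_choice = rtcp_marking("RR", "video", ["video", "audio", "audio"])
```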
12.2. Source, Flow, and Participant Identification | 12.2. Media Source, RTP Packet Streams, and Participant Identification | |||
12.2.1. Media Streams | 12.2.1. Media Source | |||
Each RTP media stream is identified by a unique synchronisation | Each RTP packet stream is identified by a unique synchronisation | |||
source (SSRC) identifier. The SSRC identifier is carried in the RTP | source (SSRC) identifier. The SSRC identifier is carried in each of | |||
data packets comprising a media stream, and is also used to identify | the RTP packets comprising an RTP packet stream, and is also used to | |||
that stream in the corresponding RTCP reports. The SSRC is chosen as | identify that stream in the corresponding RTCP reports. The SSRC is | |||
discussed in Section 4.8. The first stage in demultiplexing RTP and | chosen as discussed in Section 4.8. The first stage in | |||
RTCP packets received at a WebRTC end-point is to separate the media | demultiplexing RTP and RTCP packets received on a single transport | |||
layer flow at a WebRTC end-point is to separate the RTP packet | ||||
streams based on their SSRC value; once that is done, additional | streams based on their SSRC value; once that is done, additional | |||
demultiplexing steps can determine how and where to render the media. | demultiplexing steps can determine how and where to render the media. | |||
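As a minimal sketch of the first demultiplexing stage described above, the following groups received RTP packets by the SSRC field of the fixed RTP header (bytes 8 to 11, per RFC 3550). The function names are hypothetical, only RTP packets are handled (RTCP packets carry the sender SSRC at a different offset), and SRTP processing is assumed to have happened already.

```python
import struct
from collections import defaultdict


def rtp_ssrc(packet: bytes) -> int:
    """Return the SSRC from the 12-byte fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP fixed header")
    if packet[0] >> 6 != 2:
        raise ValueError("not RTP version 2")
    # SSRC is a 32-bit big-endian integer at byte offset 8.
    return struct.unpack_from("!I", packet, 8)[0]


def demux_by_ssrc(packets):
    """Group packets from one transport-layer flow into per-SSRC lists."""
    streams = defaultdict(list)
    for packet in packets:
        streams[rtp_ssrc(packet)].append(packet)
    return dict(streams)
```

Later stages (not shown) would map each SSRC to a decoder and rendering target.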
RTP allows a mixer, or other RTP-layer middlebox, to combine media | RTP allows a mixer, or other RTP-layer middlebox, to combine encoded | |||
flows from multiple sources to form a new media flow. The RTP data | streams from multiple media sources to form a new encoded stream from | |||
packets in that new flow can include a Contributing Source (CSRC) | a new media source (the mixer). The RTP packets in that new RTP | |||
list, indicating which original SSRCs contributed to the combined | packet stream can include a Contributing Source (CSRC) list, | |||
packet. As described in Section 4.1, implementations need to support | indicating which original SSRCs contributed to the combined source | |||
stream. As described in Section 4.1, implementations need to support | ||||
reception of RTP data packets containing a CSRC list and RTCP packets | reception of RTP data packets containing a CSRC list and RTCP packets | |||
that relate to sources present in the CSRC list. The CSRC list can | that relate to sources present in the CSRC list. The CSRC list can | |||
change on a packet-by-packet basis, depending on the mixing operation | change on a packet-by-packet basis, depending on the mixing operation | |||
being performed. Knowledge of what sources contributed to a | being performed. Knowledge of what media sources contributed to a | |||
particular RTP packet can be important if the user interface | particular RTP packet can be important if the user interface | |||
indicates which participants are active in the session. Changes in | indicates which participants are active in the session. Changes in | |||
the CSRC list included in packets need to be exposed to the WebRTC | the CSRC list included in packets need to be exposed to the WebRTC | |||
application using some API, if the application is to be able to track | application using some API, if the application is to be able to track | |||
changes in session participation. It is desirable to map CSRC values | changes in session participation. It is desirable to map CSRC values | |||
back into WebRTC MediaStream identities as they cross this API, to | back into WebRTC MediaStream identities as they cross this API, to | |||
avoid exposing the SSRC/CSRC name space to JavaScript applications. | avoid exposing the SSRC/CSRC name space to JavaScript applications. | |||
If the mixer-to-client audio level extension [RFC6465] is being used | If the mixer-to-client audio level extension [RFC6465] is being used | |||
in the session (see Section 5.2.3), the information in the CSRC list | in the session (see Section 5.2.3), the information in the CSRC list | |||
is augmented by audio level information for each contributing source. | is augmented by audio level information for each contributing source. | |||
This information can usefully be exposed in the user interface. | This information can usefully be exposed in the user interface. | |||
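To illustrate how an implementation might track the per-packet CSRC list discussed above, the sketch below reads the CSRC count from the low four bits of the first RTP header byte and the CSRC identifiers that follow the 12-byte fixed header (layout per RFC 3550). The helper names are hypothetical, and the mapping from CSRC values to MediaStream identities is left out.

```python
def csrc_list(packet: bytes):
    """Return the list of CSRC identifiers carried in one RTP packet."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP fixed header")
    count = packet[0] & 0x0F  # CC field: number of CSRC entries (0-15)
    if len(packet) < 12 + 4 * count:
        raise ValueError("packet truncated inside the CSRC list")
    # Each CSRC is a 32-bit big-endian value following the fixed header.
    return [int.from_bytes(packet[12 + 4 * i:16 + 4 * i], "big")
            for i in range(count)]


def csrc_changes(previous, current):
    """Return (added, removed) CSRCs between two consecutive packets."""
    prev, cur = set(previous), set(current)
    return sorted(cur - prev), sorted(prev - cur)
```

An application-facing layer could call `csrc_changes` on successive packets from a mixer and raise an event only when either returned list is non-empty.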
12.2.2. Media Streams: SSRC Collision Detection | 12.2.2. SSRC Collision Detection | |||
The RTP standard [RFC3550] requires any RTP implementation to have | The RTP standard [RFC3550] requires any RTP implementation to have | |||
support for detecting and handling SSRC collisions, i.e., resolve the | support for detecting and handling SSRC collisions, i.e., resolve the | |||
conflict when two different end-points use the same SSRC value. This | conflict when two different end-points use the same SSRC value. This | |||
requirement also applies to WebRTC end-points. There are several | requirement also applies to WebRTC end-points. There are several | |||
scenarios where SSRC collisions can occur. | scenarios where SSRC collisions can occur: | |||
In a point-to-point session where each SSRC is associated with either | o In a point-to-point session where each SSRC is associated with | |||
of the two end-points and where the main media carrying SSRC | either of the two end-points and where the main media carrying | |||
identifier will be announced in the signalling channel, a collision | SSRC identifier will be announced in the signalling channel, a | |||
is less likely to occur due to the information about used SSRCs | collision is less likely to occur due to the information about | |||
provided by Source-Specific SDP Attributes [RFC5576]. Still if both | used SSRCs provided by Source-Specific SDP Attributes [RFC5576]. | |||
end-points start using a new SSRC identifier prior to having | ||||
signalled it to the peer and received acknowledgement on the | ||||
signalling message, there can be collisions. The Source-Specific SDP | ||||
Attributes [RFC5576] contain no mechanism to resolve SSRC collisions | ||||
or reject an end-point's usage of an SSRC. | ||||
There could also appear SSRC values that are not signalled. This is | Still, collisions can occur if both end-points start using a new | |||
more likely than it appears as certain RTP functions need extra SSRCs | SSRC identifier prior to having signalled it to the peer and | |||
to provide functionality related to another (the "main") SSRC, for | received acknowledgement on the signalling message. The Source- | |||
example, SSRC multiplexed RTP retransmission [RFC4588]. In those | Specific SDP Attributes [RFC5576] contain no mechanism to resolve | |||
cases, an end-point can create a new SSRC that strictly doesn't need | SSRC collisions or reject an end-point's usage of an SSRC. | |||
to be announced over the signalling channel to function correctly on | ||||
o SSRC values that have not been signalled could also appear in an | ||||
RTP session. This is more likely than it appears, since some RTP | ||||
functions use extra SSRCs to provide their functionality. For | ||||
example, retransmission data might be transmitted using a separate | ||||
RTP packet stream that requires its own SSRC, separate to the SSRC | ||||
of the source RTP packet stream [RFC4588]. In those cases, an | ||||
end-point can create a new SSRC that strictly doesn't need to be | ||||
announced over the signalling channel to function correctly on | ||||
both RTP and RTCPeerConnection level. | both RTP and RTCPeerConnection level. | |||
The more likely case for SSRC collision is that multiple end-points | o Multiple end-points in a multiparty conference can create new | |||
in a multiparty conference create new sources and signals those | sources and signal those towards the RTP middlebox. In cases | |||
towards the central server. In cases where the SSRC/CSRC are | where the SSRC/CSRC are propagated between the different end- | |||
propagated between the different end-points from the central node | points from the RTP middlebox collisions can occur. | |||
collisions can occur. | ||||
Another scenario is when the central node manages to connect an end- | o An RTP middlebox could connect an end-point's RTCPeerConnection to | |||
point's RTCPeerConnection to another RTCPeerConnection the end-point | another RTCPeerConnection from the same end-point, thus forming a | |||
already has, thus forming a loop where the end-point will receive its | loop where the end-point will receive its own traffic. While this | |||
own traffic. While this is clearly considered a bug, it is important | is clearly considered a bug, it is important that the end-point is | |||
that the end-point is able to recognise and handle the case when it | able to recognise and handle the case when it occurs. This case | |||
occurs. This case becomes even more problematic when media mixers, | becomes even more problematic when media mixers, and so on, are | |||
and so on, are involved, where the stream received is a different | involved, where the stream received is a different stream but | |||
stream but still contains this client's input. | still contains this client's input. | |||
These SSRC/CSRC collisions can only be handled on RTP level as long | These SSRC/CSRC collisions can only be handled on RTP level as long | |||
as the same RTP session is extended across multiple | as the same RTP session is extended across multiple | |||
RTCPeerConnections by an RTP middlebox. To resolve the more generic | RTCPeerConnections by an RTP middlebox. To resolve the more generic | |||
case where multiple RTCPeerConnections are interconnected, the | case where multiple RTCPeerConnections are interconnected, the | |||
identification of the media source(s) that are part of a MediaStreamTrack | identification of the media source(s) that are part of a MediaStreamTrack | |||
being propagated across multiple interconnected RTCPeerConnections | being propagated across multiple interconnected RTCPeerConnections | |||
needs to be preserved across these interconnections. | needs to be preserved across these interconnections. | |||
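A much-simplified sketch of RTP-level collision detection follows. It only classifies incoming packets and picks a replacement SSRC; the full procedure in Section 8.2 of [RFC3550] (sending an RTCP BYE for the old SSRC, keeping a list of conflicting addresses to tell loops from collisions, and timing out entries) is deliberately omitted. The class and method names are hypothetical, and comparing source transport addresses is an assumption that may be of limited use when all traffic arrives from a single RTP middlebox.

```python
import secrets


class SsrcTable:
    """Track SSRCs seen in one RTP session (illustrative only)."""

    def __init__(self, own_ssrc=None):
        self.own_ssrc = (own_ssrc if own_ssrc is not None
                         else secrets.randbits(32))
        self.remote = {}  # SSRC -> transport address where first seen

    def on_packet(self, ssrc, source_addr):
        """Return 'own-collision', 'remote-collision', or 'ok'."""
        if ssrc == self.own_ssrc:
            # Another end-point chose our SSRC, or our traffic looped back.
            return "own-collision"
        known = self.remote.setdefault(ssrc, source_addr)
        if known != source_addr:
            # Same SSRC arriving from two different transport addresses.
            return "remote-collision"
        return "ok"

    def pick_new_ssrc(self):
        """Choose a random SSRC that is not currently known to be in use."""
        while True:
            candidate = secrets.randbits(32)
            if candidate != self.own_ssrc and candidate not in self.remote:
                self.own_ssrc = candidate
                return candidate
```

On an `'own-collision'` result, an end-point would typically stop using the old SSRC and continue with the value returned by `pick_new_ssrc()`.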
12.2.3. Media Synchronisation Context | 12.2.3. Media Synchronisation Context | |||
When an end-point sends media from more than one media source, it | When an end-point sends media from more than one media source, it | |||
needs to consider if (and which of) these media sources are to be | needs to consider if (and which of) these media sources are to be | |||
synchronized. In RTP/RTCP, synchronisation is provided by having a | synchronized. In RTP/RTCP, synchronisation is provided by having a | |||
set of RTP media streams be indicated as coming from the same | set of RTP packet streams be indicated as coming from the same | |||
synchronisation context and logical end-point by using the same RTCP | synchronisation context and logical end-point by using the same RTCP | |||
CNAME identifier. | CNAME identifier. | |||
The next provision is that the internal clocks of all media sources, | The next provision is that the internal clocks of all media sources, | |||
i.e., what drives the RTP timestamp, can be correlated to a system | i.e., what drives the RTP timestamp, can be correlated to a system | |||
clock that is provided in RTCP Sender Reports encoded in an NTP | clock that is provided in RTCP Sender Reports encoded in an NTP | |||
format. By correlating all RTP timestamps to a common system clock | format. By correlating all RTP timestamps to a common system clock | |||
for all sources, the timing relation of the different RTP media | for all sources, the timing relation of the different RTP packet | |||
streams, also across multiple RTP sessions can be derived at the | streams, also across multiple RTP sessions can be derived at the | |||
receiver and, if desired, the streams can be synchronized. The | receiver and, if desired, the streams can be synchronized. The | |||
requirement is for the media sender to provide the correlation | requirement is for the media sender to provide the correlation | |||
information; it is up to the receiver to use it or not. | information; it is up to the receiver to use it or not. | |||
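The correlation described above can be sketched as follows: given the (NTP timestamp, RTP timestamp) pair from the most recent RTCP Sender Report for a stream, a receiver can place any RTP timestamp of that stream on the sender's system clock. The function names are hypothetical, the media clock rate is assumed to be known from the payload format, and clock skew between Sender Reports is ignored.

```python
def ntp_to_seconds(ntp_msw: int, ntp_lsw: int) -> float:
    """Convert the 64-bit NTP timestamp of an SR (two 32-bit words)."""
    return ntp_msw + ntp_lsw / 2**32


def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate):
    """Map an RTP timestamp to sender wallclock time, in seconds.

    The difference is taken as a signed 32-bit value so that RTP
    timestamp wrap-around and packets slightly older than the SR
    are both handled.
    """
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_ntp_seconds + delta / clock_rate
```

For two RTP packet streams that share a CNAME, the difference between their mapped wallclock times gives the offset a receiver could apply to synchronize playout, if it chooses to.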
13. Security Considerations | 13. Security Considerations | |||
The overall security architecture for WebRTC is described in | The overall security architecture for WebRTC is described in | |||
[I-D.ietf-rtcweb-security-arch], and security considerations for the | [I-D.ietf-rtcweb-security-arch], and security considerations for the | |||
WebRTC framework are described in [I-D.ietf-rtcweb-security]. These | WebRTC framework are described in [I-D.ietf-rtcweb-security]. These | |||
considerations apply to this memo also. | considerations also apply to this memo. | |||
The security considerations of the RTP specification, the RTP/SAVPF | The security considerations of the RTP specification, the RTP/SAVPF | |||
profile, and the various RTP/RTCP extensions and RTP payload formats | profile, and the various RTP/RTCP extensions and RTP payload formats | |||
that form the complete protocol suite described in this memo apply. | that form the complete protocol suite described in this memo apply. | |||
We do not believe there are any new security considerations resulting | We do not believe there are any new security considerations resulting | |||
from the combination of these various protocol extensions. | from the combination of these various protocol extensions. | |||
The Extended Secure RTP Profile for Real-time Transport Control | The Extended Secure RTP Profile for Real-time Transport Control | |||
Protocol (RTCP)-Based Feedback [RFC5124] (RTP/SAVPF) provides | Protocol (RTCP)-Based Feedback [RFC5124] (RTP/SAVPF) provides | |||
handling of fundamental issues by offering confidentiality, integrity | handling of fundamental issues by offering confidentiality, integrity | |||
and partial source authentication. A mandatory-to-implement media | and partial source authentication. A mandatory-to-implement media | |||
security solution is created by combining this secured RTP profile and | security solution is created by combining this secured RTP profile and | |||
DTLS-SRTP keying [RFC5764] as defined by Section 5.5 of | DTLS-SRTP keying [RFC5764] as defined by Section 5.5 of | |||
[I-D.ietf-rtcweb-security-arch]. | [I-D.ietf-rtcweb-security-arch]. | |||
RTCP packets convey a Canonical Name (CNAME) identifier that is used | RTCP packets convey a Canonical Name (CNAME) identifier that is used | |||
to associate media flows that need to be synchronised across related | to associate RTP packet streams that need to be synchronised across | |||
RTP sessions. Inappropriate choice of CNAME values can be a privacy | related RTP sessions. Inappropriate choice of CNAME values can be a | |||
concern, since long-term persistent CNAME identifiers can be used to | privacy concern, since long-term persistent CNAME identifiers can be | |||
track users across multiple WebRTC calls. Section 4.9 of this memo | used to track users across multiple WebRTC calls. Section 4.9 of | |||
provides guidelines for generation of untraceable CNAME values that | this memo provides guidelines for generation of untraceable CNAME | |||
alleviate this risk. | values that alleviate this risk. | |||
The guidelines in [RFC6562] apply when using variable bit rate (VBR) | The guidelines in [RFC6562] apply when using variable bit rate (VBR) | |||
audio codecs such as Opus (see Section 4.3 for discussion of mandated | audio codecs such as Opus (see Section 4.3 for discussion of mandated | |||
audio codecs). These guidelines in [RFC6562] also apply, but are of | audio codecs). The guidelines in [RFC6562] also apply, but are of | |||
lesser importance, when using the client-to-mixer audio level header | lesser importance, when using the client-to-mixer audio level header | |||
extensions (Section 5.2.2) or the mixer-to-client audio level header | extensions (Section 5.2.2) or the mixer-to-client audio level header | |||
extensions (Section 5.2.3). | extensions (Section 5.2.3). The use of encryption for the header | |||
extensions is RECOMMENDED, unless there are known reasons, like RTP | ||||
middleboxes or third party monitoring that will greatly benefit from | ||||
the information, and this has been expressed using API or signalling. | ||||
If further evidence is produced to show that information leakage is | ||||
significant from audio level indications, then use of encryption | ||||
needs to be mandated at that time. | ||||
14. IANA Considerations | 14. IANA Considerations | |||
This memo makes no request of IANA. | This memo makes no request of IANA. | |||
Note to RFC Editor: this section is to be removed on publication as | Note to RFC Editor: this section is to be removed on publication as | |||
an RFC. | an RFC. | |||
15. Acknowledgements | 15. Acknowledgements | |||
The authors would like to thank Bernard Aboba, Harald Alvestrand, | The authors would like to thank Bernard Aboba, Harald Alvestrand, | |||
Cary Bran, Charles Eckel, Cullen Jennings, Dan Romascanu, and the | Cary Bran, Charles Eckel, Christian Groves, Cullen Jennings, Dan | |||
other members of the IETF RTCWEB working group for their valuable | Romascanu, Martin Thomson, and the other members of the IETF RTCWEB | |||
feedback. | working group for their valuable feedback. | |||
16. References | 16. References | |||
16.1. Normative References | 16.1. Normative References | |||
[I-D.ietf-avtcore-multi-media-rtp-session] | [I-D.ietf-avtcore-multi-media-rtp-session] | |||
Westerlund, M., Perkins, C., and J. Lennox, "Sending | Westerlund, M., Perkins, C., and J. Lennox, "Sending | |||
Multiple Types of Media in a Single RTP Session", draft- | Multiple Types of Media in a Single RTP Session", draft- | |||
ietf-avtcore-multi-media-rtp-session-04 (work in | ietf-avtcore-multi-media-rtp-session-05 (work in | |||
progress), January 2014. | progress), February 2014. | |||
[I-D.ietf-avtcore-rtp-circuit-breakers] | [I-D.ietf-avtcore-rtp-circuit-breakers] | |||
Perkins, C. and V. Singh, "Multimedia Congestion Control: | Perkins, C. and V. Singh, "Multimedia Congestion Control: | |||
Circuit Breakers for Unicast RTP Sessions", draft-ietf- | Circuit Breakers for Unicast RTP Sessions", draft-ietf- | |||
avtcore-rtp-circuit-breakers-04 (work in progress), | avtcore-rtp-circuit-breakers-05 (work in progress), | |||
January 2014. | February 2014. | |||
[I-D.ietf-avtcore-rtp-multi-stream-optimisation] | [I-D.ietf-avtcore-rtp-multi-stream-optimisation] | |||
Lennox, J., Westerlund, M., Wu, Q., and C. Perkins, | Lennox, J., Westerlund, M., Wu, W., and C. Perkins, | |||
"Sending Multiple Media Streams in a Single RTP Session: | "Sending Multiple Media Streams in a Single RTP Session: | |||
Grouping RTCP Reception Statistics and Other Feedback ", | Grouping RTCP Reception Statistics and Other Feedback ", | |||
draft-ietf-avtcore-rtp-multi-stream-optimisation-00 (work | draft-ietf-avtcore-rtp-multi-stream-optimisation-02 (work | |||
in progress), July 2013. | in progress), February 2014. | |||
[I-D.ietf-avtcore-rtp-multi-stream] | [I-D.ietf-avtcore-rtp-multi-stream] | |||
Lennox, J., Westerlund, M., Wu, W., and C. Perkins, | Lennox, J., Westerlund, M., Wu, W., and C. Perkins, | |||
"Sending Multiple Media Streams in a Single RTP Session", | "Sending Multiple Media Streams in a Single RTP Session", | |||
draft-ietf-avtcore-rtp-multi-stream-02 (work in progress), | draft-ietf-avtcore-rtp-multi-stream-03 (work in progress), | |||
January 2014. | February 2014. | |||
[I-D.ietf-avtext-multiple-clock-rates] | ||||
Petit-Huguenin, M. and G. Zorn, "Support for Multiple | ||||
Clock Rates in an RTP Session", draft-ietf-avtext- | ||||
multiple-clock-rates-11 (work in progress), November 2013. | ||||
[I-D.ietf-rtcweb-security-arch] | [I-D.ietf-rtcweb-security-arch] | |||
Rescorla, E., "WebRTC Security Architecture", draft-ietf- | Rescorla, E., "WebRTC Security Architecture", draft-ietf- | |||
rtcweb-security-arch-08 (work in progress), January 2014. | rtcweb-security-arch-09 (work in progress), February 2014. | |||
[I-D.ietf-rtcweb-security] | [I-D.ietf-rtcweb-security] | |||
Rescorla, E., "Security Considerations for WebRTC", draft- | Rescorla, E., "Security Considerations for WebRTC", draft- | |||
ietf-rtcweb-security-06 (work in progress), January 2014. | ietf-rtcweb-security-06 (work in progress), January 2014. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP | [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP | |||
Payload Format Specifications", BCP 36, RFC 2736, December | Payload Format Specifications", BCP 36, RFC 2736, December | |||
skipping to change at page 37, line 28 | skipping to change at page 40, line 36 | |||
July 2003. | July 2003. | |||
[RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth | [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth | |||
Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC | Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC | |||
3556, July 2003. | 3556, July 2003. | |||
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. | [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. | |||
Norrman, "The Secure Real-time Transport Protocol (SRTP)", | Norrman, "The Secure Real-time Transport Protocol (SRTP)", | |||
RFC 3711, March 2004. | RFC 3711, March 2004. | |||
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session | ||||
Description Protocol", RFC 4566, July 2006. | ||||
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, | [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, | |||
"Extended RTP Profile for Real-time Transport Control | "Extended RTP Profile for Real-time Transport Control | |||
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July | Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July | |||
2006. | 2006. | |||
[RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. | [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. | |||
Hakenberg, "RTP Retransmission Payload Format", RFC 4588, | Hakenberg, "RTP Retransmission Payload Format", RFC 4588, | |||
July 2006. | July 2006. | |||
[RFC4961] Wing, D., "Symmetric RTP / RTP Control Protocol (RTCP)", | [RFC4961] Wing, D., "Symmetric RTP / RTP Control Protocol (RTCP)", | |||
skipping to change at page 38, line 44 | skipping to change at page 42, line 9 | |||
[RFC7007] Terriberry, T., "Update to Remove DVI4 from the | [RFC7007] Terriberry, T., "Update to Remove DVI4 from the | |||
Recommended Codecs for the RTP Profile for Audio and Video | Recommended Codecs for the RTP Profile for Audio and Video | |||
Conferences with Minimal Control (RTP/AVP)", RFC 7007, | Conferences with Minimal Control (RTP/AVP)", RFC 7007, | |||
August 2013. | August 2013. | |||
[RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla, | [RFC7022] Begen, A., Perkins, C., Wing, D., and E. Rescorla, | |||
"Guidelines for Choosing RTP Control Protocol (RTCP) | "Guidelines for Choosing RTP Control Protocol (RTCP) | |||
Canonical Names (CNAMEs)", RFC 7022, September 2013. | Canonical Names (CNAMEs)", RFC 7022, September 2013. | |||
[RFC7160] Petit-Huguenin, M. and G. Zorn, "Support for Multiple | ||||
Clock Rates in an RTP Session", RFC 7160, April 2014. | ||||
[RFC7164] Gross, K. and R. Brandenburg, "RTP and Leap Seconds", RFC | ||||
7164, March 2014. | ||||
[W3C.WD-mediacapture-streams-20130903] | [W3C.WD-mediacapture-streams-20130903] | |||
Burnett, D., Bergkvist, A., Jennings, C., and A. | Burnett, D., Bergkvist, A., Jennings, C., and A. | |||
Narayanan, "Media Capture and Streams", World Wide Web | Narayanan, "Media Capture and Streams", World Wide Web | |||
Consortium WD WD-mediacapture-streams-20130903, September | Consortium WD WD-mediacapture-streams-20130903, September | |||
2013, <http://www.w3.org/TR/2013/WD-mediacapture- | 2013, <http://www.w3.org/TR/2013/ | |||
streams-20130903>. | WD-mediacapture-streams-20130903>. | |||
[W3C.WD-webrtc-20130910] | [W3C.WD-webrtc-20130910] | |||
Bergkvist, A., Burnett, D., Jennings, C., and A. | Bergkvist, A., Burnett, D., Jennings, C., and A. | |||
Narayanan, "WebRTC 1.0: Real-time Communication Between | Narayanan, "WebRTC 1.0: Real-time Communication Between | |||
Browsers", World Wide Web Consortium WD WD- | Browsers", World Wide Web Consortium WD WD- | |||
webrtc-20130910, September 2013, | webrtc-20130910, September 2013, | |||
<http://www.w3.org/TR/2013/WD-webrtc-20130910>. | <http://www.w3.org/TR/2013/WD-webrtc-20130910>. | |||
16.2. Informative References | 16.2. Informative References | |||
[I-D.dhesikan-tsvwg-rtcweb-qos] | ||||
Dhesikan, S., Druta, D., Jones, P., and J. Polk, "DSCP and | ||||
other packet markings for RTCWeb QoS", draft-dhesikan- | ||||
tsvwg-rtcweb-qos-04 (work in progress), January 2014. | ||||
[I-D.ietf-avtcore-multiplex-guidelines] | [I-D.ietf-avtcore-multiplex-guidelines] | |||
Westerlund, M., Perkins, C., and H. Alvestrand, | Westerlund, M., Perkins, C., and H. Alvestrand, | |||
"Guidelines for using the Multiplexing Features of RTP to | "Guidelines for using the Multiplexing Features of RTP to | |||
Support Multiple Media Streams", draft-ietf-avtcore- | Support Multiple Media Streams", draft-ietf-avtcore- | |||
multiplex-guidelines-02 (work in progress), January 2014. | multiplex-guidelines-02 (work in progress), January 2014. | |||
[I-D.ietf-avtcore-rtp-topologies-update] | [I-D.ietf-avtcore-rtp-topologies-update] | |||
Westerlund, M. and S. Wenger, "RTP Topologies", draft- | Westerlund, M. and S. Wenger, "RTP Topologies", draft- | |||
ietf-avtcore-rtp-topologies-update-01 (work in progress), | ietf-avtcore-rtp-topologies-update-01 (work in progress), | |||
October 2013. | October 2013. | |||
[I-D.ietf-avtext-rtp-grouping-taxonomy] | [I-D.ietf-avtext-rtp-grouping-taxonomy] | |||
Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro, | Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro, | |||
"A Taxonomy of Grouping Semantics and Mechanisms for Real- | "A Taxonomy of Grouping Semantics and Mechanisms for Real- | |||
Time Transport Protocol (RTP) Sources", draft-ietf-avtext- | Time Transport Protocol (RTP) Sources", draft-ietf-avtext- | |||
rtp-grouping-taxonomy-00 (work in progress), November | rtp-grouping-taxonomy-01 (work in progress), February | |||
2013. | 2014. | |||
[I-D.ietf-mmusic-msid] | [I-D.ietf-mmusic-msid] | |||
Alvestrand, H., "WebRTC MediaStream Identification in the | Alvestrand, H., "WebRTC MediaStream Identification in the | |||
Session Description Protocol", draft-ietf-mmusic-msid-04 | Session Description Protocol", draft-ietf-mmusic-msid-05 | |||
(work in progress), February 2014. | (work in progress), March 2014. | |||
[I-D.ietf-mmusic-sdp-bundle-negotiation] | [I-D.ietf-mmusic-sdp-bundle-negotiation] | |||
Holmberg, C., Alvestrand, H., and C. Jennings, | Holmberg, C., Alvestrand, H., and C. Jennings, | |||
"Multiplexing Negotiation Using Session Description | "Negotiating Media Multiplexing Using the Session | |||
Protocol (SDP) Port Numbers", draft-ietf-mmusic-sdp- | Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle- | |||
bundle-negotiation-05 (work in progress), October 2013. | negotiation-07 (work in progress), April 2014. | |||
[I-D.ietf-payload-rtp-howto] | [I-D.ietf-payload-rtp-howto] | |||
Westerlund, M., "How to Write an RTP Payload Format", | Westerlund, M., "How to Write an RTP Payload Format", | |||
draft-ietf-payload-rtp-howto-13 (work in progress), | draft-ietf-payload-rtp-howto-13 (work in progress), | |||
January 2014. | January 2014. | |||
[I-D.ietf-rmcat-cc-requirements] | [I-D.ietf-rmcat-cc-requirements] | |||
Jesup, R., "Congestion Control Requirements For RMCAT", | Jesup, R., "Congestion Control Requirements For RMCAT", | |||
draft-ietf-rmcat-cc-requirements-02 (work in progress), | draft-ietf-rmcat-cc-requirements-04 (work in progress), | |||
February 2014. | April 2014. | |||
[I-D.ietf-rtcweb-audio] | [I-D.ietf-rtcweb-audio] | |||
Valin, J. and C. Bran, "WebRTC Audio Codec and Processing | Valin, J. and C. Bran, "WebRTC Audio Codec and Processing | |||
Requirements", draft-ietf-rtcweb-audio-05 (work in | Requirements", draft-ietf-rtcweb-audio-05 (work in | |||
progress), February 2014. | progress), February 2014. | |||
[I-D.ietf-rtcweb-overview] | [I-D.ietf-rtcweb-overview] | |||
Alvestrand, H., "Overview: Real Time Protocols for Brower- | Alvestrand, H., "Overview: Real Time Protocols for Brower- | |||
based Applications", draft-ietf-rtcweb-overview-08 (work | based Applications", draft-ietf-rtcweb-overview-09 (work | |||
in progress), September 2013. | in progress), February 2014. | |||
[I-D.ietf-rtcweb-use-cases-and-requirements] | [I-D.ietf-rtcweb-use-cases-and-requirements] | |||
Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- | Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- | |||
Time Communication Use-cases and Requirements", draft- | Time Communication Use-cases and Requirements", draft- | |||
ietf-rtcweb-use-cases-and-requirements-14 (work in | ietf-rtcweb-use-cases-and-requirements-14 (work in | |||
progress), February 2014. | progress), February 2014. | |||
[I-D.ietf-tsvwg-rtcweb-qos] | ||||
Dhesikan, S., Druta, D., Jones, P., and J. Polk, "DSCP and | ||||
other packet markings for RTCWeb QoS", draft-ietf-tsvwg- | ||||
rtcweb-qos-00 (work in progress), April 2014. | ||||
[RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control | [RFC3611] Friedman, T., Caceres, R., and A. Clark, "RTP Control | |||
Protocol Extended Reports (RTCP XR)", RFC 3611, November | Protocol Extended Reports (RTCP XR)", RFC 3611, November | |||
2003. | 2003. | |||
[RFC4341] Floyd, S. and E. Kohler, "Profile for Datagram Congestion | [RFC4341] Floyd, S. and E. Kohler, "Profile for Datagram Congestion | |||
Control Protocol (DCCP) Congestion Control ID 2: TCP-like | Control Protocol (DCCP) Congestion Control ID 2: TCP-like | |||
Congestion Control", RFC 4341, March 2006. | Congestion Control", RFC 4341, March 2006. | |||
[RFC4342] Floyd, S., Kohler, E., and J. Padhye, "Profile for | [RFC4342] Floyd, S., Kohler, E., and J. Padhye, "Profile for | |||
Datagram Congestion Control Protocol (DCCP) Congestion | Datagram Congestion Control Protocol (DCCP) Congestion | |||
End of changes. 161 change blocks. | ||||
615 lines changed or deleted | 790 lines changed or added | |||