Do we really need QCN for FCoE?
The IEEE P802.1Qau/D2.4 draft standard, also known as the Quantized Congestion Notification (QCN) protocol, is one of the primary components of DCB. It has received less attention than the other components of DCB because its purpose is to manage long-term congestion and congestion propagation. As we all know, the first and main application to leverage DCB technology has been FCoE. Most current FCoE implementations are so-called “top of rack” (TOR) setups, which involve only short distances (< 10m) and a simple network structure. In a TOR scenario, priority flow control (PFC) mechanisms are good enough to manage short-term congestion, and there is little need for the long-term capabilities of QCN. This is the main reason that development has focused on PFC and enhanced transmission selection (ETS), and that we have seen very limited QCN implementation since its introduction. However, with FCoE applications moving deeper into the core network, and with new applications emerging that can leverage DCB technologies, we are seeing rising interest in implementing QCN as well.
QCN specifies how to support congestion management of long-lived data flows within network domains of limited bandwidth-delay product, especially those supporting higher-layer protocols that are particularly sensitive to packet loss or latency. QCN enables bridging mechanisms that signal increasing congestion to end stations capable of limiting their transmission rate or pausing to avoid frame loss. The protocol also gives bridges the ability to use priority remapping to automatically “defend” a congestion notification domain against sources that are not aware of, or do not support, congestion notification. The result is that long-lived data flows have a significantly reduced chance of frame loss compared to networks without some mechanism for congestion notification.
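To make the signaling loop above concrete, here is a minimal sketch in Python of how a congested bridge might compute a feedback value and how an end station might reduce its rate in response. It follows the commonly described form of the QCN algorithm, but the names and constants (`w`, `gd`, the cap of one half, `min_rate`) are illustrative assumptions rather than values taken from this post, and the rate-recovery phases are left out entirely.

```python
def congestion_feedback(q_len, q_old, q_eq, w=2.0):
    """Bridge side: feedback for a sampled frame (negative means congestion).

    q_len: current queue length, q_old: length at the previous sample,
    q_eq: desired operating point. All in the same arbitrary unit.
    The weight w is an assumed value for illustration.
    """
    q_off = q_len - q_eq      # how far the queue sits above the target
    q_delta = q_len - q_old   # how quickly the queue is growing
    return -(q_off + w * q_delta)


def react(current_rate, fb, gd=1 / 128, min_rate=10.0):
    """End-station side: multiplicative rate decrease on negative feedback.

    gd, the 50% cap, and min_rate are hypothetical parameters.
    Non-negative feedback leaves the rate unchanged in this sketch.
    """
    if fb >= 0:
        return current_rate
    decrease = min(gd * abs(fb), 0.5)  # never cut by more than half at once
    return max(current_rate * (1 - decrease), min_rate)


if __name__ == "__main__":
    fb = congestion_feedback(q_len=30, q_old=26, q_eq=26)
    print(fb, react(10000.0, fb))
```

The point of the sketch is that the feedback grows with both the size of the backlog and its rate of growth, so a source is slowed more aggressively when a queue is filling quickly.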
There is some debate as to the need for QCN, given that ETS and PFC together can make Enhanced Ethernet completely lossless. Link-by-link guaranteed bandwidth is provided by ETS and, with PFC, the lossless nature of Fibre Channel can be achieved over Ethernet using DCB. The fact that native Fibre Channel has no equivalent of QCN suggests that QCN may not be necessary for FCoE. From a deployment perspective, QCN requires that every device in the data path support QCN. Since QCN is implemented in hardware, not firmware, network administrators must purchase new equipment to implement it.
The delay in implementing QCN in FCoE devices is mainly because Fibre Channel SANs tend to be very flat, with only 2 to 3 hops. In contrast, Ethernet networks tend to have a hierarchical structure, and 5 or more hops are not uncommon. This extra complexity can aggravate congestion issues and is the type of problem QCN is designed to address. ETS and PFC do very well across a few hops but, for a reliable end-to-end FCoE link, QCN may be needed to manage congestion where the network is oversubscribed.
Is QCN essential for FCoE deployments? This may be the wrong question. Converged networks will bring together a wider variety of traffic types and applications than just those required to support FCoE, and QCN, as part of the Enhanced Ethernet enabled by DCB, plays an important role in improving overall network performance and reliability. For example, while PFC and ETS can manage the congestion that can arise from oversubscription, the manner in which they do so can aggravate that congestion across every link along a data path. QCN, in contrast, pushes congestion out to the edge of the network and slows the rate at which incoming packets are introduced until the congestion is relieved. With this approach, a point of congestion is prevented from spreading to other links.
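The way PFC can spread congestion back along a data path can be illustrated with a toy model. The sketch below is not a model of any real switch: the function name, buffer limit, and rates are all hypothetical, and each hop is reduced to a single queue that pauses its upstream neighbor once it reaches the limit. A slow link at the far end is enough to make every hop assert a pause, one after another, back toward the source.

```python
def pfc_pause_spread(hops=4, buffer_limit=8, inflow=3, bottleneck_drain=1, steps=40):
    """Toy model: return the first step at which each hop pauses its upstream.

    queues[0] is nearest the source; queues[-1] feeds the slow (bottleneck) link.
    All parameters are hypothetical and chosen only for illustration.
    """
    queues = [0] * hops
    first_pause = [None] * hops
    for step in range(steps):
        # A hop whose queue has reached the limit pauses its upstream neighbor.
        # The pause takes effect during this step, so a queue may overshoot slightly.
        paused = [q >= buffer_limit for q in queues]
        for i in range(hops):
            if paused[i] and first_pause[i] is None:
                first_pause[i] = step
        # The last hop drains slowly onto the bottleneck link.
        queues[-1] -= min(queues[-1], bottleneck_drain)
        # Frames move one hop forward unless the next hop has paused us.
        for i in range(hops - 2, -1, -1):
            if not paused[i + 1]:
                moved = min(queues[i], inflow)
                queues[i] -= moved
                queues[i + 1] += moved
        # The source keeps sending unless the first hop has paused it.
        if not paused[0]:
            queues[0] += inflow
    return first_pause


if __name__ == "__main__":
    print(pfc_pause_spread())
```

In this model the hop at the bottleneck pauses first and the hop nearest the source pauses last, so a single congested link ends up stalling every link behind it. Rate limiting at the source, as QCN does, avoids filling those intermediate buffers in the first place.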
Another important role played by QCN is to support multiple disparate SANs that are aggregated over the same converged network. While multiple virtual SANs can be mapped to a single PFC class (and thus treated as a single priority flow that requires no QCN), disparate SANs need to be configured into their own separate, parallel PFC priorities. QCN is then required to manage congestion events between the various PFC classes. In fact, QCN is needed to manage congestion between any disparate lossless traffic classes. This is especially true given that when the T11 committee moved from FC-BB-5 to FC-BB-6 almost a year ago, part of its focus was to expand the current scope of FCoE into long-reach applications. As FCoE technology moves deeper into core Ethernet networking, we should see more QCN implementation for FCoE applications as well.
In addition to FCoE, DCB enhances converged Ethernet technology in a way that attracts other applications onto the unified network. For example, iSCSI over DCB and RDMA over CEE are two emerging technologies that run over the DCB network. I will talk more about RDMA over CEE in a later blog post. As for iSCSI over DCB, QCN plays a key role in maintaining the overall reliability of long-distance communications. The testing of QCN interoperability at the DCB Plugfest shows that manufacturers are working hard to bring the maturity of QCN up to the same level as PFC and ETS.
