From 31771bef7b9a3c5e9d8eb875a21f404b87ecd106 Mon Sep 17 00:00:00 2001 From: Suyog Sainkar Date: Thu, 7 May 2015 16:21:30 +1000 Subject: [PATCH] Remove passive voice from chap 5, arch guide Change-Id: Iaaa9d2a052c9f81cefebd18bdc866d96d4fed64e Closes-Bug: #1427935 --- doc/arch-design/ch_network_focus.xml | 184 ++++----- .../section_architecture_network_focus.xml | 372 ++++++++---------- ...erational_considerations_network_focus.xml | 122 +++--- ...on_prescriptive_examples_network_focus.xml | 175 ++++---- ...tion_tech_considerations_network_focus.xml | 232 +++++------ ...ection_user_requirements_network_focus.xml | 205 +++++----- 6 files changed, 588 insertions(+), 702 deletions(-) diff --git a/doc/arch-design/ch_network_focus.xml b/doc/arch-design/ch_network_focus.xml index 586ed8a0fc..b3721cc94c 100644 --- a/doc/arch-design/ch_network_focus.xml +++ b/doc/arch-design/ch_network_focus.xml @@ -5,167 +5,141 @@ version="5.0" xml:id="network_focus"> Network focused - - All OpenStack deployments are dependent, to some extent, on - network communication in order to function properly due to a - service-based nature. In some cases, however, use cases - dictate that the network is elevated beyond simple - infrastructure. This chapter is a discussion of architectures - that are more reliant or focused on network services. These - architectures are heavily dependent on the network - infrastructure and need to be architected so that the network - services perform and are reliable in order to satisfy user and - application requirements. + All OpenStack deployments depend on network communication in order + to function properly due to their service-based nature. In some cases, + however, the network becomes more than simple + infrastructure. This chapter discusses architectures that are more + reliant or focused on network services.
These architectures depend + on the network infrastructure and require + network services that perform reliably in order to satisfy user and + application requirements. Some possible use cases include: Content delivery network - This could include - streaming video, photographs or any other cloud based - repository of data that is distributed to a large - number of end users. Mass market streaming video will - be very heavily affected by the network configurations - that would affect latency, bandwidth, and the - distribution of instances. Not all video streaming is - consumer focused. For example, multicast videos (used - for media, press conferences, corporate presentations, - web conferencing services, and so on) can also utilize a - content delivery network. Content delivery will be - affected by the location of the video repository and - its relationship to end users. Performance is also - affected by network throughput of the back-end systems, - as well as the WAN architecture and the cache - methodology. + This includes streaming video, viewing photographs, or + accessing any other cloud-based data repository distributed to + a large number of end users. Network configuration affects + latency, bandwidth, and the distribution of instances. Therefore, + it impacts video streaming. Not all video streaming is + consumer-focused. For example, multicast videos (used for media, + press conferences, corporate presentations, and web conferencing + services) can also use a content delivery network. + The location of the video repository and its relationship to end + users affects content delivery. Network throughput of the back-end + systems, as well as the WAN architecture and the cache methodology, + also affect performance. Network management functions - A cloud that provides - network service functions would be built to support - the delivery of back-end network services such as DNS, - NTP or SNMP and would be used by a company for - internal network management. 
+ Use this cloud to provide network service functions built to + support the delivery of back-end network services such as DNS, + NTP, or SNMP. A company can use these services for internal + network management. Network service offerings - A cloud can be used to - run customer facing network tools to support services. - For example, VPNs, MPLS private networks, GRE tunnels - and others. + Use this cloud to run customer-facing network tools to + support services. Examples include VPNs, MPLS private networks, + and GRE tunnels. Web portals or web services - Web servers are a common - application for cloud services and we recommend - an understanding of the network requirements. - The network will need to be able to scale out to meet - user demand and deliver webpages with a minimum of - latency. Internal east-west and north-south network - bandwidth must be considered depending on the details - of the portal architecture. + Web servers are a common application for cloud services, + and we recommend an understanding of their network requirements. + The network requires scaling out to meet user demand and deliver + web pages with minimal latency. Depending on the details of + the portal architecture, consider the internal east-west and + north-south network bandwidth. High speed and high volume transactional systems - These types of applications are very sensitive to - network configurations. Examples include many - financial systems, credit card transaction - applications, trading and other extremely high volume - systems. These systems are sensitive to network jitter - and latency. They also have a high volume of both - east-west and north-south network traffic that needs - to be balanced to maximize efficiency of the data - delivery. Many of these systems have large high - performance database back ends that need to be - accessed. + These types of applications are sensitive to network + configurations.
Examples include financial systems, + credit card transaction applications, and trading and other + extremely high volume systems. These systems are sensitive + to network jitter and latency. They must balance a high volume + of east-west and north-south network traffic to + maximize efficiency of the data delivery. + Many of these systems must access large, high performance + database back ends. High availability - These types of use cases are - highly dependent on the proper sizing of the network - to maintain replication of data between sites for high - availability. If one site becomes unavailable, the - extra sites will be able to serve the displaced load - until the original site returns to service. It is - important to size network capacity to handle the loads - that are desired. + These types of use cases are dependent on the proper sizing + of the network to maintain replication of data between sites for + high availability. If one site becomes unavailable, the extra + sites can serve the displaced load until the original site + returns to service. It is important to size network capacity + to handle the desired loads. Big data - Clouds that will be used for the - management and collection of big data (data ingest) - will have a significant demand on network resources. - Big data often uses partial replicas of the data to - maintain data integrity over large distributed clouds. - Other big data applications that require a large - amount of network resources are Hadoop, Cassandra, - NuoDB, RIAK and other No-SQL and distributed - databases. + Clouds used for the management and collection of big data + (data ingest) have a significant demand on network resources. + Big data often uses partial replicas of the data to maintain + integrity over large distributed clouds. Other big data + applications that require a large amount of network resources + are Hadoop, Cassandra, NuoDB, Riak, and other NoSQL and + distributed databases.
Virtual desktop infrastructure (VDI) - This use case - is very sensitive to network congestion, latency, - jitter and other network characteristics. Like video - streaming, the user experience is very important - however, unlike video streaming, caching is not an - option to offset the network issues. VDI requires both - upstream and downstream traffic and cannot rely on - caching for the delivery of the application to the end - user. + This use case is sensitive to network congestion, latency, + jitter, and other network characteristics. Like video streaming, + the user experience is important. However, unlike video + streaming, caching is not an option to offset the network issues. + VDI requires both upstream and downstream traffic and cannot rely + on caching for the delivery of the application to the end user. Voice over IP (VoIP) - This is extremely sensitive to - network congestion, latency, jitter and other network - characteristics. VoIP has a symmetrical traffic - pattern and it requires network quality of service - (QoS) for best performance. It may also require an - active queue management implementation to ensure - delivery. Users are very sensitive to latency and - jitter fluctuations and can detect them at very low - levels. + This is sensitive to network congestion, latency, jitter, + and other network characteristics. VoIP has a symmetrical traffic + pattern and it requires network quality of service (QoS) for best + performance. In addition, you can implement active queue management + to deliver voice and multimedia content. Users are sensitive to + latency and jitter fluctuations and can detect them at very low + levels. Video Conference or web conference - This also is - extremely sensitive to network congestion, latency, - jitter and other network flaws. Video Conferencing has - a symmetrical traffic pattern, but unless the network - is on an MPLS private network, it cannot use network - quality of service (QoS) to improve performance. 
- Similar to VOIP, users will be sensitive to network - performance issues even at low levels. + This is sensitive to network congestion, latency, jitter, + and other network characteristics. Video Conferencing has a + symmetrical traffic pattern, but unless the network is on an + MPLS private network, it cannot use network quality of service + (QoS) to improve performance. Similar to VoIP, users are + sensitive to network performance issues even at low levels. High performance computing (HPC) - This is a complex - use case that requires careful consideration of the - traffic flows and usage patterns to address the needs - of cloud clusters. It has high East-West traffic - patterns for distributed computing, but there can be - substantial North-South traffic depending on the - specific application. + This is a complex use case that requires careful + consideration of the traffic flows and usage patterns to address + the needs of cloud clusters. It has high east-west traffic + patterns for distributed computing, but there can be substantial + north-south traffic depending on the specific application. diff --git a/doc/arch-design/network_focus/section_architecture_network_focus.xml b/doc/arch-design/network_focus/section_architecture_network_focus.xml index c92ab7448d..5314680dd2 100644 --- a/doc/arch-design/network_focus/section_architecture_network_focus.xml +++ b/doc/arch-design/network_focus/section_architecture_network_focus.xml @@ -5,222 +5,190 @@ version="5.0" xml:id="architecture-network-focus"> Architecture - Network focused OpenStack architectures have many - similarities to other OpenStack architecture use cases. There - are a number of very specific considerations to keep in mind when - designing for a network-centric or network-heavy application - environment. - Networks exist to serve as a medium of transporting data - between systems. 
It is inevitable that an OpenStack design - has inter-dependencies with non-network portions of OpenStack - as well as on external systems. Depending on the specific - workload, there may be major interactions with storage systems - both within and external to the OpenStack environment. For - example, if the workload is a content delivery network, then - the interactions with storage will be two-fold. There will be - traffic flowing to and from the storage array for ingesting - and serving content in a north-south direction. In addition, - there is replication traffic flowing in an east-west - direction. - Compute-heavy workloads may also induce interactions with - the network. Some high performance compute applications - require network-based memory mapping and data sharing and, as - a result, will induce a higher network load when they transfer - results and data sets. Others may be highly transactional and - issue transaction locks, perform their functions and rescind - transaction locks at very high rates. This also has an impact - on the network performance. - Some network dependencies are going to be external to - OpenStack. While OpenStack Networking is capable of providing network - ports, IP addresses, some level of routing, and overlay - networks, there are some other functions that it cannot - provide. For many of these, external systems or equipment may - be required to fill in the functional gaps. Hardware load - balancers are an example of equipment that may be necessary to - distribute workloads or offload certain functions. Note that, - as of the Kilo release, dynamic routing is currently in - its infancy within OpenStack and may need to be implemented - either by an external device or a specialized service instance - within OpenStack. Tunneling is a feature provided by OpenStack Networking, - however it is constrained to a Networking-managed region. 
If the - need arises to extend a tunnel beyond the OpenStack region to - either another region or an external system, it is necessary - to implement the tunnel itself outside OpenStack or by using a - tunnel management system to map the tunnel or overlay to an - external tunnel. OpenStack does not currently provide quotas - for network resources. Where network quotas are required, it - is necessary to implement quality of service management - outside of OpenStack. In many of these instances, similar - solutions for traffic shaping or other network functions will - be needed. + Network-focused OpenStack architectures have many similarities to + other OpenStack architecture use cases. There are several factors + to consider when designing for a network-centric or network-heavy + application environment. + Networks exist to serve as a medium of transporting data between + systems. It is inevitable that an OpenStack design has inter-dependencies + with non-network portions of OpenStack as well as with external systems. + Depending on the specific workload, there may be major interactions with + storage systems both within and external to the OpenStack environment. + For example, in the case of a content delivery network, there is twofold + interaction with storage. Traffic flows to and from the storage array for + ingesting and serving content in a north-south direction. In addition, + there is replication traffic flowing in an east-west direction. + Compute-heavy workloads may also induce interactions with the + network. Some high performance compute applications require network-based + memory mapping and data sharing and, as a result, induce a higher network + load when they transfer results and data sets. Others may be highly + transactional and issue transaction locks, perform their functions, and + revoke transaction locks at high rates. This also has an impact on the + network performance. + Some network dependencies are external to OpenStack.
While + OpenStack Networking is capable of providing network ports, IP addresses, + some level of routing, and overlay networks, there are some other + functions that it cannot provide. For many of these, you may require + external systems or equipment to fill in the functional gaps. Hardware + load balancers are an example of equipment that may be necessary to + distribute workloads or offload certain functions. As of the Kilo + release, dynamic routing is currently in its infancy within OpenStack and + you may require an external device or a specialized service instance + within OpenStack to implement it. OpenStack Networking provides a + tunneling feature; however, it is constrained to a Networking-managed + region. If the need arises to extend a tunnel beyond the OpenStack region + to either another region or an external system, implement the tunnel + itself outside OpenStack or use a tunnel management system to map the + tunnel or overlay to an external tunnel. OpenStack does not currently + provide quotas for network resources. Where network quotas are required, + implement quality of service management outside of OpenStack. In many of + these instances, similar solutions for traffic shaping or other network + functions are needed. Depending on the selected design, Networking itself might not - even support the required - layer-3 + support the required layer-3 network functionality. If you choose to use the provider networking mode without running the layer-3 agent, you must install an external router to provide layer-3 connectivity to outside systems. Interaction with orchestration services is inevitable in - larger-scale deployments. The Orchestration module is capable of allocating - network resource defined in templates to map to tenant - networks and for port creation, as well as allocating floating - IPs.
If there is a requirement to define and manage network - resources in using orchestration, we recommend that the - design include the Orchestration module to meet the demands of - users. + larger-scale deployments. The Orchestration module is capable of + allocating network resources defined in templates to map to tenant + networks and for port creation, as well as allocating floating IPs. + If there is a requirement to define and manage network resources when + using orchestration, we recommend that the design include the + Orchestration module to meet the demands of users.
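To make the orchestration paragraph concrete, the following is a minimal, illustrative Heat Orchestration Template (HOT) sketch of the network resources the Orchestration module can allocate: a tenant network, a subnet, a port, and a floating IP. The resource names, the CIDR, and the external network name `public` are assumptions for this example, not part of the guide:

```yaml
heat_template_version: 2013-05-23

description: Sketch only - a tenant network with one port and a floating IP.

resources:
  app_net:
    type: OS::Neutron::Net
    properties:
      name: app-net                 # assumed name

  app_subnet:
    type: OS::Neutron::Subnet
    properties:
      network_id: { get_resource: app_net }
      cidr: 192.0.2.0/24            # documentation range, assumed

  app_port:
    type: OS::Neutron::Port
    properties:
      network_id: { get_resource: app_net }

  app_floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network: public      # assumed external network name
      port_id: { get_resource: app_port }
```

Defining these resources in a template, rather than creating them manually, keeps tenant networks reproducible across stacks.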
Design impacts - A wide variety of factors can affect a network focused - OpenStack architecture. While there are some considerations - shared with a general use case, specific workloads related to - network requirements will influence network design - decisions. - One decision includes whether or not to use Network Address - Translation (NAT) and where to implement it. If there is a - requirement for floating IPs to be available instead of using - public fixed addresses then NAT is required. This can be seen - in network management applications that rely on an IP - endpoint. An example of this is a DHCP relay that needs to - know the IP of the actual DHCP server. In these cases it is - easier to automate the infrastructure to apply the target IP - to a new instance rather than reconfigure legacy or external - systems for each new instance. - NAT for floating IPs managed by Networking will reside within - the hypervisor but there are also versions of NAT that may be - running elsewhere. If there is a shortage of IPv4 addresses - there are two common methods to mitigate this externally to - OpenStack. The first is to run a load balancer either within - OpenStack as an instance, or use an external load balancing - solution. In the internal scenario, load balancing software, - such as HAproxy, can be managed with Networking's - Load-Balancer-as-a-Service (LBaaS). This is specifically to - manage the - Virtual IP (VIP) while a dual-homed connection from the - HAproxy instance connects the public network with the tenant - private network that hosts all of the content servers. In the - external scenario, a load balancer would need to serve the VIP - and also be joined to the tenant overlay network through - external means or routed to it via private addresses. - Another kind of NAT that may be useful is protocol NAT. 
In - some cases it may be desirable to use only IPv6 addresses on - instances and operate either an instance or an external - service to provide a NAT-based transition technology such as - NAT64 and DNS64. This provides the ability to have a globally - routable IPv6 address while only consuming IPv4 addresses as - necessary or in a shared manner. - Application workloads will affect the design of the - underlying network architecture. If a workload requires - network-level redundancy, the routing and switching - architecture will have to accommodate this. There are - differing methods for providing this that are dependent on the - network hardware selected, the performance of the hardware, - and which networking model is deployed. Some examples of this - are the use of Link aggregation (LAG) or Hot Standby Router - Protocol (HSRP). There are also the considerations of whether - to deploy OpenStack Networking or legacy networking (nova-network) - and which plug-in to select - for OpenStack Networking. If using an external system, Networking will need to - be configured to run - layer 2 - with a provider network - configuration. For example, it may be necessary to implement - HSRP to terminate layer-3 connectivity. - Depending on the workload, overlay networks may or may not - be a recommended configuration. Where application network - connections are small, short lived or bursty, running a - dynamic overlay can generate as much bandwidth as the packets - it carries. It also can induce enough latency to cause issues - with certain applications. There is an impact to the device - generating the overlay which, in most installations, will be - the hypervisor. This will cause performance degradation on - packet per second and connection per second rates. - Overlays also come with a secondary option that may or may - not be appropriate to a specific workload. 
While all of them - will operate in full mesh by default, there might be good - reasons to disable this function because it may cause - excessive overhead for some workloads. Conversely, other - workloads will operate without issue. For example, most web - services applications will not have major issues with a full - mesh overlay network, while some network monitoring tools or - storage replication workloads will have performance issues - with throughput or excessive broadcast traffic. - Many people overlook an important design decision: The choice - of layer-3 - protocols. While OpenStack was initially built with only IPv4 + A wide variety of factors can affect a network-focused OpenStack + architecture. While there are some considerations shared with a general + use case, specific workloads related to network requirements influence + network design decisions. + One decision includes whether or not to use Network Address + Translation (NAT) and where to implement it. If there is a requirement + for floating IPs instead of public fixed addresses then you must use + NAT. An example of this is a DHCP relay that must know the IP of the + DHCP server. In these cases it is easier to automate the infrastructure + to apply the target IP to a new instance rather than to reconfigure + legacy or external systems for each new instance. + NAT for floating IPs managed by Networking resides within the + hypervisor but there are also versions of NAT that may be running + elsewhere. If there is a shortage of IPv4 addresses there are two common + methods to mitigate this externally to OpenStack. The first is to run a + load balancer either within OpenStack as an instance, or use an external + load balancing solution. In the internal scenario, Networking's + Load-Balancer-as-a-Service (LBaaS) can manage load balancing + software, for example HAproxy. 
This is specifically to manage the + Virtual IP (VIP) while a dual-homed connection from the HAproxy instance + connects the public network with the tenant private network that hosts + all of the content servers. In the external scenario, a load balancer + needs to serve the VIP and also connect to the tenant overlay + network through external means or through private addresses. + Another kind of NAT that may be useful is protocol NAT. In some + cases it may be desirable to use only IPv6 addresses on instances and + operate either an instance or an external service to provide a NAT-based + transition technology such as NAT64 and DNS64. This provides the ability + to have a globally routable IPv6 address while only consuming IPv4 + addresses as necessary or in a shared manner. + Application workloads affect the design of the underlying network + architecture. If a workload requires network-level redundancy, the + routing and switching architecture has to accommodate this. There + are differing methods for providing this that are dependent on the + selected network hardware, the performance of the hardware, and which + networking model you deploy. Examples include + Link aggregation (LAG) and Hot Standby Router Protocol (HSRP). Also + consider whether to deploy OpenStack Networking or + legacy networking (nova-network), and which plug-in to select for + OpenStack Networking. If using an external system, configure Networking + to run layer 2 + with a provider network configuration. For example, implement HSRP + to terminate layer-3 connectivity. + Depending on the workload, overlay networks may not be the best + solution. Where application network connections are + small, short-lived, or bursty, running a dynamic overlay can generate + as much bandwidth as the packets it carries. It also can induce enough + latency to cause issues with certain applications. There is an impact + to the device generating the overlay which, in most installations, + is the hypervisor.
This causes performance degradation on packet + per second and connection per second rates. + Overlays also come with a secondary option that may not be + appropriate to a specific workload. While all of them operate in full + mesh by default, there might be good reasons to disable this function + because it may cause excessive overhead for some workloads. Conversely, + other workloads operate without issue. For example, most web services + applications do not have major issues with a full mesh overlay network, + while some network monitoring tools or storage replication workloads + have performance issues with throughput or excessive broadcast + traffic. + Many people overlook an important design decision: The choice of + layer-3 protocols. While OpenStack was initially built with only IPv4 support, Networking now supports IPv6 and dual-stacked networks. - Note that, as of the Icehouse release, this only includes - stateless address auto configuration but work is in - progress to support stateless and stateful DHCPv6 as well as - IPv6 floating IPs without NAT. Some workloads become possible - through the use of IPv6 and IPv6 to IPv4 reverse transition - mechanisms such as NAT64 and DNS64 or 6to4, - because these - options are available. This will alter the requirements for - any address plan as single-stacked and transitional IPv6 - deployments can alleviate the need for IPv4 addresses. - As of the Kilo release, OpenStack has limited support - for dynamic routing, however there are a number of options - available by incorporating third party solutions to implement - routing within the cloud including network equipment, hardware - nodes, and instances. Some workloads will perform well with - nothing more than static routes and default gateways - configured at the layer-3 termination point. In most cases - this will suffice, however some cases require the addition of - at least one type of dynamic routing protocol if not multiple - protocols. 
Having a form of interior gateway protocol (IGP) - available to the instances inside an OpenStack installation - opens up the possibility of use cases for anycast route - injection for services that need to use it as a geographic - location or failover mechanism. Other applications may wish to - directly participate in a routing protocol, either as a - passive observer as in the case of a looking glass, or as an - active participant in the form of a route reflector. Since an - instance might have a large amount of compute and memory - resources, it is trivial to hold an entire unpartitioned - routing table and use it to provide services such as network - path visibility to other applications or as a monitoring - tool. - - Path maximum transmission unit (MTU) failures are lesser known - but harder to diagnose. The MTU must be large enough to handle - normal traffic, overhead from an overlay network, and the - desired layer-3 protocol. When you add externally built tunnels, - the MTU packet size is reduced. In this case, you must pay - attention to the fully calculated MTU size because some systems - are configured to ignore or drop path MTU discovery packets. - + As of the Icehouse release, this only includes stateless + address auto configuration but work is in progress to support stateless + and stateful DHCPv6 as well as IPv6 floating IPs without NAT. Some + workloads are possible through the use of IPv6 and IPv6 to IPv4 + reverse transition mechanisms such as NAT64 and DNS64 or + 6to4. + This alters the requirements for any address plan as single-stacked and + transitional IPv6 deployments can alleviate the need for IPv4 + addresses. + As of the Kilo release, OpenStack has limited support for + dynamic routing; however, there are a number of options available by + incorporating third party solutions to implement routing within the + cloud including network equipment, hardware nodes, and instances. Some + workloads perform well with nothing more than static routes and default + gateways configured at the layer-3 termination point. In most cases this + is sufficient; however, some cases require the addition of at least one + type of dynamic routing protocol if not multiple protocols.
Having a + form of interior gateway protocol (IGP) available to the instances + inside an OpenStack installation opens up the possibility of use cases + for anycast route injection for services that need to use it as a + geographic location or failover mechanism. Other applications may wish + to directly participate in a routing protocol, either as a passive + observer, as in the case of a looking glass, or as an active participant + in the form of a route reflector. Since an instance might have a large + amount of compute and memory resources, it is trivial to hold an entire + unpartitioned routing table and use it to provide services such as + network path visibility to other applications or as a monitoring tool. - - Path maximum transmission unit (MTU) failures are lesser known - but harder to diagnose. The MTU must be large enough to handle - normal traffic, overhead from an overlay network, and the - desired layer-3 protocol. When you add externally built tunnels, - the MTU packet size is reduced. In this case, you must pay - attention to the fully calculated MTU size because some systems - are configured to ignore or drop path MTU discovery packets. - + Path maximum transmission unit (MTU) failures are lesser known but + harder to diagnose. The MTU must be large enough to handle normal + traffic, overhead from an overlay network, and the desired layer-3 + protocol. Adding externally built tunnels reduces the MTU packet size. + In this case, you must pay attention to the fully + calculated MTU size because some systems ignore or + drop path MTU discovery packets.
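The MTU bookkeeping described above can be sketched in a few lines. The overhead constants are commonly cited figures (VXLAN is usually quoted as 50 bytes when the outer Ethernet header is counted; GRE over IPv4 as 24 bytes) and are assumptions to verify against your own encapsulation stack:

```python
# Commonly cited encapsulation overheads in bytes (assumptions; verify
# against your own overlay and tunnel configuration).
VXLAN_OVERHEAD = 50   # outer Ethernet 14 + IPv4 20 + UDP 8 + VXLAN 8
GRE_OVERHEAD = 24     # outer IPv4 20 + GRE 4

def effective_mtu(physical_mtu, *overheads):
    """Payload MTU left for instances after each encapsulation layer
    takes its share of the physical path MTU."""
    return physical_mtu - sum(overheads)

# A standard 1500-byte path leaves 1450 bytes inside a VXLAN overlay.
assert effective_mtu(1500, VXLAN_OVERHEAD) == 1450
# Jumbo frames preserve headroom even with an external GRE tunnel
# stacked on top of the overlay.
assert effective_mtu(9000, VXLAN_OVERHEAD, GRE_OVERHEAD) == 8926
```

When any hop silently drops path MTU discovery packets, the fully calculated value has to be configured explicitly on the instances rather than discovered at run time.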
- Tunable networking components - Consider configurable networking components related to an - OpenStack architecture design when designing for network intensive - workloads include MTU and QoS. Some workloads will require a larger - MTU than normal based on a requirement to transfer large blocks of - data. When providing network service for applications such as video - streaming or storage replication, it is recommended to ensure that - both OpenStack hardware nodes and the supporting network equipment - are configured for jumbo frames where possible. This will allow for - a better utilization of available bandwidth. Configuration of jumbo - frames should be done across the complete path the packets will - traverse. If one network component is not capable of handling jumbo - frames then the entire path will revert to the default MTU. - Quality of Service (QoS) also has a great impact on network - intensive workloads by providing instant service to packets which - have a higher priority due to their ability to be impacted by poor - network performance. In applications such as Voice over IP (VoIP) - differentiated services code points are a near requirement for - proper operation. QoS can also be used in the opposite direction for - mixed workloads to prevent low priority but high bandwidth - applications, for example backup services, video conferencing or - file sharing, from blocking bandwidth that is needed for the proper - operation of other workloads. It is possible to tag file storage - traffic as a lower class, such as best effort or scavenger, to allow - the higher priority traffic through. In cases where regions within a - cloud might be geographically distributed it may also be necessary - to plan accordingly to implement WAN optimization to combat latency - or packet loss. 
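As a small illustration of the differentiated services code points discussed in this section (a sketch, not an OpenStack API; it assumes a Linux host and a network configured to honor the marking), an application can mark its own packets for Expedited Forwarding:

```python
import socket

DSCP_EF = 46            # Expedited Forwarding, typical for VoIP media
TOS_EF = DSCP_EF << 2   # DSCP sits in the upper six bits of the TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Mark outgoing datagrams; QoS-aware switches and routers can then
# queue them ahead of best-effort or scavenger-class traffic.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_EF)
assert sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS) == 0xB8
sock.close()
```

Marking alone does nothing unless the switching and routing path is configured to act on the code point, which is why QoS planning spans both the instances and the supporting network equipment.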
+ Tunable networking components + Consider configurable networking components related to an + OpenStack architecture design when designing for network intensive + workloads. These components include MTU and QoS. Some workloads require a larger MTU + than normal due to the transfer of large blocks of data. + When providing network service for applications such as video + streaming or storage replication, we recommend that you configure + both OpenStack hardware nodes and the supporting network equipment + for jumbo frames where possible. This allows for better use of + available bandwidth. Configure jumbo frames + across the complete path the packets traverse. If one network + component is not capable of handling jumbo frames then the entire + path reverts to the default MTU. + Quality of Service (QoS) also has a great impact on network + intensive workloads as it gives priority treatment to packets that + are sensitive to poor + network performance. In applications such as Voice over IP (VoIP), + differentiated services code points are a near requirement for proper + operation. You can also use QoS in the opposite direction for mixed + workloads to prevent low priority but high bandwidth applications, + for example backup services, video conferencing, or file sharing, + from blocking bandwidth that is needed for the proper operation of + other workloads. It is possible to tag file storage traffic as a + lower class, such as best effort or scavenger, to allow the higher + priority traffic through. In cases where regions within a cloud might + be geographically distributed it may also be necessary to plan + accordingly to implement WAN optimization to combat latency or + packet loss.
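The benefit of jumbo frames can be made concrete with a short efficiency calculation: the fixed per-packet header cost is amortized over a larger payload. The header sizes below are standard Ethernet/IPv4/TCP values used as assumptions for the arithmetic:

```python
# Illustrative sketch of why jumbo frames improve effective bandwidth.
# Header sizes are standard values, assumed here for the calculation.

ETHERNET_OVERHEAD = 18 + 20  # header + FCS (18) plus preamble and inter-frame gap (20)
IP_TCP_HEADERS = 20 + 20     # IPv4 (20) + TCP without options (20)

def efficiency(mtu: int) -> float:
    """Fraction of wire bandwidth that carries application payload."""
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + ETHERNET_OVERHEAD
    return payload / wire_bytes

print(f"MTU 1500: {efficiency(1500):.1%}")  # ~94.9%
print(f"MTU 9000: {efficiency(9000):.1%}")  # ~99.1%
```

The gain looks modest per packet, but for sustained storage replication or streaming traffic it also reduces per-packet processing load on hosts and switches.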
diff --git a/doc/arch-design/network_focus/section_operational_considerations_network_focus.xml b/doc/arch-design/network_focus/section_operational_considerations_network_focus.xml index 3c71df6274..b27d42fb4f 100644 --- a/doc/arch-design/network_focus/section_operational_considerations_network_focus.xml +++ b/doc/arch-design/network_focus/section_operational_considerations_network_focus.xml @@ -6,67 +6,63 @@ xml:id="operational-considerations-networking-focus"> Operational considerations - Network focused OpenStack clouds have a number of - operational considerations that will influence the selected - design. Topics including, but not limited to, dynamic routing - of static routes, service level agreements, and ownership of - user management all need to be considered. - One of the first required decisions is the selection of a - telecom company or transit provider. This is especially true - if the network requirements include external or site-to-site - network connectivity. - Additional design decisions need to be made about monitoring - and alarming. These can be an internal responsibility or the - responsibility of the external provider. In the case of using - an external provider, SLAs will likely apply. In addition, - other operational considerations such as bandwidth, latency, - and jitter can be part of a service level agreement. - The ability to upgrade the infrastructure is another subject - for consideration. As demand for network resources increase, - operators will be required to add additional IP address blocks - and add additional bandwidth capacity. Managing hardware and - software life cycle events, for example upgrades, - decommissioning, and outages while avoiding service - interruptions for tenants, will also need to be - considered. - Maintainability will also need to be factored into the - overall network design. 
This includes the ability to manage - and maintain IP addresses as well as the use of overlay - identifiers including VLAN tag IDs, GRE tunnel IDs, and MPLS - tags. As an example, if all of the IP addresses have to be - changed on a network, a process known as renumbering, then the - design needs to support the ability to do so. - Network focused applications themselves need to be addressed - when concerning certain operational realities. For example, - the impending exhaustion of IPv4 addresses, the migration to - IPv6 and the utilization of private networks to segregate - different types of traffic that an application receives or - generates. In the case of IPv4 to IPv6 migrations, - applications should follow best practices for storing IP - addresses. It is further recommended to avoid relying on IPv4 - features that were not carried over to the IPv6 protocol or - have differences in implementation. - When using private networks to segregate traffic, - applications should create private tenant networks for - database and data storage network traffic, and utilize public - networks for client-facing traffic. By segregating this - traffic, quality of service and security decisions can be made - to ensure that each network has the correct level of service - that it requires. - Finally, decisions must be made about the routing of network - traffic. For some applications, a more complex policy - framework for routing must be developed. The economic cost of - transmitting traffic over expensive links versus cheaper - links, in addition to bandwidth, latency, and jitter - requirements, can be used to create a routing policy that will - satisfy business requirements. - How to respond to network events must also be taken into - consideration. As an example, how load is transferred from one - link to another during a failure scenario could be a factor in - the design. 
If network capacity is not planned correctly, - failover traffic could overwhelm other ports or network links - and create a cascading failure scenario. In this case, traffic - that fails over to one link overwhelms that link and then - moves to the subsequent links until the all network traffic - stops. + Network-focused OpenStack clouds have a number of operational + considerations that influence the selected design, including: + + + Dynamic routing of static routes + + + Service level agreements (SLAs) + + + Ownership of user management + + + An initial network consideration is the selection of a telecom + company or transit provider. + Make additional design decisions about monitoring and alarming. + This can be an internal responsibility or the responsibility of the + external provider. In the case of using an external provider, service + level agreements (SLAs) likely apply. In addition, other operational + considerations such as bandwidth, latency, and jitter can be part of an + SLA. + Consider the ability to upgrade the infrastructure. As demand for + network resources increases, operators add additional IP address blocks + and add additional bandwidth capacity. In addition, consider managing + hardware and software life cycle events, for example upgrades, + decommissioning, and outages, while avoiding service interruptions for + tenants. + Factor maintainability into the overall network design. This + includes the ability to manage and maintain IP addresses as well as the + use of overlay identifiers including VLAN tag IDs, GRE tunnel IDs, and + MPLS tags. As an example, if you need to change all of the IP + addresses on a network, a process known as renumbering, then the design + must support this function. + Address network-focused applications when considering certain + operational realities.
For example, consider the impending exhaustion + of IPv4 addresses, the migration to IPv6, and the use of private + networks to segregate different types of traffic that an application + receives or generates. In the case of IPv4 to IPv6 migrations, + applications should follow best practices for storing IP addresses. + We recommend you avoid relying on IPv4 features that did not carry over + to the IPv6 protocol or have differences in implementation. + To segregate traffic, allow applications to create a private tenant + network for database and storage network traffic. Use a public network + for services that require direct client access from the internet. Upon + segregating the traffic, consider quality of service (QoS) and security + to ensure each network has the required level of service. + Finally, consider the routing of network traffic. + For some applications, develop a complex policy framework for + routing. To create a routing policy that satisfies business requirements, + consider the economic cost of transmitting traffic over expensive links + versus cheaper links, in addition to bandwidth, latency, and jitter + requirements. + Additionally, consider how to respond to network events. As an + example, how load transfers from one link to another during a + failure scenario could be a factor in the design. If you do not plan + network capacity correctly, failover traffic could overwhelm other ports + or network links and create a cascading failure scenario. In this case, + traffic that fails over to one link overwhelms that link and then moves + to the subsequent links until all network traffic stops. 
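The cascading-failure risk described above can be checked with simple capacity arithmetic during design: verify that the surviving links can absorb the traffic of any single failed link. A minimal sketch, where the link names and the Gbps figures are hypothetical examples:

```python
# Hypothetical sketch: check that if any single link fails, its traffic
# can fail over to the remaining links without exceeding their spare
# capacity (an N-1 check). Link names and numbers are invented examples.

def survives_single_failure(links: dict[str, tuple[float, float]]) -> bool:
    """links maps name -> (capacity_gbps, current_load_gbps)."""
    for failed in links:
        load_to_move = links[failed][1]
        spare = sum(cap - load
                    for name, (cap, load) in links.items()
                    if name != failed)
        if load_to_move > spare:
            return False  # failover traffic would overwhelm the survivors
    return True

# Two 10 Gbps uplinks each carrying 6 Gbps: a failure moves 6 Gbps onto
# a link with only 4 Gbps of headroom, so this design fails the check.
print(survives_single_failure({"uplink-a": (10, 6), "uplink-b": (10, 6)}))  # False
```

Running at or below 50 percent utilization per redundant link pair is the usual way to make this check pass.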
diff --git a/doc/arch-design/network_focus/section_prescriptive_examples_network_focus.xml b/doc/arch-design/network_focus/section_prescriptive_examples_network_focus.xml index cf674499d3..d66b029d49 100644 --- a/doc/arch-design/network_focus/section_prescriptive_examples_network_focus.xml +++ b/doc/arch-design/network_focus/section_prescriptive_examples_network_focus.xml @@ -6,34 +6,34 @@ xml:id="prescriptive-example-large-scale-web-app"> Prescriptive examples - A large-scale web application has been designed with cloud - principles in mind. The application is designed to scale - horizontally in a bursting fashion and will generate a high + An organization designs a large-scale web application with cloud + principles in mind. The application scales + horizontally in a bursting fashion and generates a high instance count. The application requires an SSL connection to secure data and must not lose connection state to individual servers. - An example design for this workload is depicted in the - figure below. In this example, a hardware load balancer is - configured to provide SSL offload functionality and to connect + The figure below depicts an example design for this workload. + In this example, a hardware load balancer provides SSL offload + functionality and connects to tenant networks in order to reduce address consumption. - This load balancer is linked to the routing architecture as it - will service the VIP for the application. The router and load - balancer are configured with GRE tunnel ID of the - application's tenant network and provided an IP address within + This load balancer links to the routing architecture as it + services the VIP for the application. The router and load + balancer use the GRE tunnel ID of the + application's tenant network and an IP address within the tenant subnet but outside of the address pool.
This is to ensure that the load balancer can communicate with the application's HTTP servers without requiring the consumption of a public IP address. - Because sessions persist until they are closed, the routing and - switching architecture is designed for high availability. - Switches are meshed to each hypervisor and each other, and + Because sessions persist until closed, the routing and + switching architecture provides high availability. + Switches mesh to each hypervisor and each other, and also provide an MLAG implementation to ensure that layer-2 - connectivity does not fail. Routers are configured with VRRP - and fully meshed with switches to ensure layer-3 connectivity. - Since GRE is used as an overlay network, Networking is installed - and configured to use the Open vSwitch agent in GRE tunnel + connectivity does not fail. Routers use VRRP + and fully mesh with switches to ensure layer-3 connectivity. + Since GRE provides the overlay network, Networking uses + the Open vSwitch agent in GRE tunnel mode. This ensures all devices can reach all other devices and - that tenant networks can be created for private addressing + that you can create tenant networks for private addressing links to the load balancer. @@ -44,9 +44,9 @@ A web service architecture has many options and optional components. Due to this, it can fit into a large number of - other OpenStack designs however a few key components will need + other OpenStack designs. A few key components, however, need to be in place to handle the nature of most web-scale - workloads. The user needs the following components: + workloads. You require the following components: OpenStack Controller services (Image, Identity, @@ -66,59 +66,59 @@ Telemetry module - - Beyond the normal Identity, Compute, Image service and Object - Storage components, the Orchestration module is a recommended - component to handle properly scaling the workloads to adjust to - demand.
Due to the requirement for auto-scaling, - the design includes the Telemetry module. Web services - tend to be bursty in load, have very defined peak and valley - usage patterns and, as a result, benefit from automatic scaling - of instances based upon traffic. At a network level, a split - network configuration will work well with databases residing on - private tenant networks since these do not emit a large quantity - of broadcast traffic and may need to interconnect to some - databases for content. + Beyond the normal Identity, Compute, Image service, and Object + Storage components, we recommend the Orchestration module + to handle the proper scaling of workloads to adjust to + demand. Due to the requirement for auto-scaling, + the design includes the Telemetry module. Web services + tend to be bursty in load, have very defined peak and valley + usage patterns and, as a result, benefit from automatic scaling + of instances based upon traffic. At a network level, a split + network configuration works well with databases residing on + private tenant networks since these do not emit a large quantity + of broadcast traffic and may need to interconnect to some + databases for content.
Load balancing - Load balancing was included in this design to spread - requests across multiple instances. This workload scales well - horizontally across large numbers of instances. This allows - instances to run without publicly routed IP addresses and - simply rely on the load balancer for the service to be - globally reachable. Many of these services do not require + Load balancing spreads requests across multiple instances. + This workload scales well horizontally across large numbers of + instances. This enables instances to run without publicly + routed IP addresses and instead to rely on the load + balancer to provide a globally reachable service. + Many of these services do not require direct server return. This aids in address planning and utilization at scale since only the virtual IP (VIP) must be - public.
- + public.
+
Overlay networks The overlay functionality design includes OpenStack Networking in Open vSwitch GRE tunnel mode. - In this case, the layer-3 external routers are paired with - VRRP and switches should be paired with an implementation of - MLAG running to ensure that you do not lose connectivity with + In this case, the layer-3 external routers pair with + VRRP, and switches pair with an implementation of + MLAG to ensure that you do not lose connectivity with the upstream routing infrastructure.
Performance tuning - Network level tuning for this workload is minimal. - Quality-of-Service (QoS) will be applied to these workloads + Network level tuning for this workload is minimal. + Quality-of-Service (QoS) applies to these workloads for a middle ground Class Selector depending on existing - policies. It will be higher than a best effort queue but lower + policies. It is higher than a best effort queue but lower than an Expedited Forwarding or Assured Forwarding queue. Since this type of application generates larger packets with - longer-lived connections, bandwidth utilization can be - optimized for long duration TCP. Normal bandwidth planning + longer-lived connections, you can optimize bandwidth utilization + for long duration TCP. Normal bandwidth planning applies here with regards to benchmarking a session's usage multiplied by the expected number of concurrent sessions with - overhead.
+ overhead.
+
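The bandwidth planning rule described above (a benchmarked session's usage, multiplied by the expected number of concurrent sessions, plus overhead) can be sketched in a few lines. The 20 percent overhead factor and the example figures are assumptions for illustration:

```python
# Sketch of the bandwidth planning rule described above. The overhead
# factor and the example session figures are assumed values.

def required_bandwidth_mbps(per_session_mbps: float,
                            concurrent_sessions: int,
                            overhead_factor: float = 0.20) -> float:
    """Benchmarked per-session usage x concurrent sessions, plus headroom."""
    return per_session_mbps * concurrent_sessions * (1 + overhead_factor)

# e.g. 5 Mbps streaming sessions, 2000 concurrent viewers:
print(required_bandwidth_mbps(5, 2000))  # 12000.0 Mbps
```

For long-duration TCP flows like these, the benchmark should measure steady-state throughput rather than burst rates, since connections have time to reach full window size.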
Network functions - Network functions is a broad category but encompasses + Network functions is a broad category but encompasses workloads that support the rest of a system's network. These workloads tend to consist of large amounts of small packets that are very short lived, such as DNS queries or SNMP traps. @@ -134,63 +134,57 @@ The supporting network for this type of configuration needs to have a low latency and evenly distributed availability. This workload benefits from having services local to the - consumers of the service. A multi-site approach is used as + consumers of the service. Use a multi-site approach as well as deploying many copies of the application to handle load as close as possible to consumers. Since these applications function independently, they do not warrant running overlays to interconnect tenant networks. Overlays also have the drawback of performing poorly with rapid flow setup and may incur too much overhead with large quantities of - small packets and are therefore not recommended. - QoS is desired for some workloads to ensure delivery. DNS + small packets and therefore we do not recommend them. + QoS is desirable for some workloads to ensure delivery. DNS has a major impact on the load times of other services and - needs to be reliable and provide rapid responses. It is to - configure rules in upstream devices to apply a higher Class + needs to be reliable and provide rapid responses. Configure rules + in upstream devices to apply a higher Class Selector to DNS to ensure faster delivery or a better spot in - queuing algorithms.
+ queuing algorithms.
+
Cloud storage - - Another common use case for OpenStack environments is to provide - a cloud-based file storage and sharing service. You might - consider this a storage-focused use case, but its network-side - requirements make it a network-focused use case. - - For example, consider a cloud backup application. This workload - has two specific behaviors that impact the network. Because this - workload is an externally-facing service and an - internally-replicating application, it has both north-south and - east-west traffic - considerations, as follows: - + Another common use case for OpenStack environments is providing + a cloud-based file storage and sharing service. You might + consider this a storage-focused use case, but its network-side + requirements make it a network-focused use case. + For example, consider a cloud backup application. This workload + has two specific behaviors that impact the network. Because this + workload is an externally-facing service and an + internally-replicating application, it has both north-south and + east-west traffic + considerations: north-south traffic - - When a user uploads and stores content, that content moves + When a user uploads and stores content, that content moves into the OpenStack installation. When users download this - content, the content moves from the OpenStack - installation. Because this service is intended primarily + content, the content moves out from the OpenStack + installation. Because this service operates primarily as a backup, most of the traffic moves southbound into the environment. In this situation, it benefits you to configure a network to be asymmetrically downstream because the traffic that enters the OpenStack installation - is greater than the traffic that leaves the installation. - + is greater than the traffic that leaves the installation. east-west traffic - - Likely to be fully symmetric. Because replication + Likely to be fully symmetric. 
Because replication originates from any node and might target multiple other nodes algorithmically, it is less likely for this traffic to have a larger volume in any specific direction. However - this traffic might interfere with north-south traffic. - + this traffic might interfere with north-south traffic. @@ -201,16 +195,15 @@ /> - - This application prioritizes the north-south traffic over + This application prioritizes the north-south traffic over east-west traffic: the north-south traffic involves - customer-facing data. - + customer-facing data. The network design in this case is less dependent on - availability and more dependent on being able to handle high - bandwidth. As a direct result, it is beneficial to forego - redundant links in favor of bonding those connections. This - increases available bandwidth. It is also beneficial to - configure all devices in the path, including OpenStack, to - generate and pass jumbo frames.
+ availability and more dependent on being able to handle high + bandwidth. As a direct result, it is beneficial to forgo + redundant links in favor of bonding those connections. This + increases available bandwidth. It is also beneficial to + configure all devices in the path, including OpenStack, to + generate and pass jumbo frames. + diff --git a/doc/arch-design/network_focus/section_tech_considerations_network_focus.xml b/doc/arch-design/network_focus/section_tech_considerations_network_focus.xml index 5f83256a55..66c4be329a 100644 --- a/doc/arch-design/network_focus/section_tech_considerations_network_focus.xml +++ b/doc/arch-design/network_focus/section_tech_considerations_network_focus.xml @@ -13,27 +13,23 @@ involve those made about the protocol layer and the point when IP comes into the picture. As an example, a completely internal OpenStack network can exist at layer 2 and ignore - layer 3 however, in order for any traffic to go outside of - that cloud, to another network, or to the Internet, a layer-3 - router or switch must be involved. - - The past few years have seen two competing trends in + layer 3. In order for any traffic to go outside of + that cloud, to another network, or to the Internet, however, you must + use a layer-3 router or switch. + The past few years have seen two competing trends in networking. One trend leans towards building data center network architectures based on layer-2 networking. Another trend treats the cloud environment essentially as a miniature version of the Internet. This approach is radically different from the network - architecture approach that is used in the staging environment: - the Internet is based entirely on layer-3 routing rather than - layer-2 switching. - - - A network designed on layer-2 protocols has advantages over one + architecture approach in the staging environment: + the Internet only uses layer-3 routing rather than + layer-2 switching. 
+ A network designed on layer-2 protocols has advantages over one designed on layer-3 protocols. In spite of the difficulties of using a bridge to perform the network role of a router, many vendors, customers, and service providers choose to use Ethernet in as many parts of their networks as possible. The benefits of - selecting a layer-2 design are: - + selecting a layer-2 design are: Ethernet frames contain all the essentials for @@ -47,13 +43,13 @@ protocol. - More layers added to the Ethernet frame only slow + Adding more layers to the Ethernet frame only slows the networking process down. This is known as 'nodal processing delay'. - Adjunct networking features, for example class of - service (CoS) or multicasting, can be added to + You can add adjunct networking features, for + example class of service (CoS) or multicasting, to Ethernet as readily as IP networks. @@ -62,45 +58,37 @@ Most information starts and ends inside Ethernet frames. - Today this applies to data, voice (for example, VoIP) and - video (for example, web cameras). The concept is that, if more - of the end-to-end transfer of information from a source to a - destination can be done in the form of Ethernet frames, more - of the benefits of Ethernet can be realized on the network. - Though it is not a substitute for IP networking, networking at - layer 2 can be a powerful adjunct to IP networking. - + Today this applies to data, voice (for example, VoIP), and + video (for example, web cameras). The concept is that, if you can + perform more of the end-to-end transfer of information from + a source to a destination in the form of Ethernet frames, the network + benefits more from the advantages of Ethernet. + Although it is not a substitute for IP networking, networking at + layer 2 can be a powerful adjunct to IP networking. Layer-2 Ethernet usage has these advantages over layer-3 IP network usage: - - Speed - + Speed - - Reduced overhead of the IP hierarchy. 
- + Reduced overhead of the IP hierarchy. - - No need to keep track of address configuration as systems - are moved around. Whereas the simplicity of layer-2 + No need to keep track of address configuration as systems + move around. Whereas the simplicity of layer-2 protocols might work well in a data center with hundreds of physical machines, cloud data centers have the additional burden of needing to keep track of all virtual machine addresses and networks. In these data centers, it is not uncommon for one physical node to support 30-40 - instances. - + instances. - - Networking at the frame level says nothing + Networking at the frame level says nothing about the presence or absence of IP addresses at the packet level. Almost all ports, links, and devices on a network of LAN switches still have IP addresses, as do all the source and @@ -125,8 +113,8 @@ limited. - The need to maintain a set of layer-4 devices to - handle traffic control must be accommodated. + You must accommodate the need to maintain a set of + layer-4 devices to handle traffic control. MLAG, often used for switch redundancy, is a @@ -138,21 +126,20 @@ without IP addresses and ICMP. - - Configuring Configuring ARP - is considered complicated on large layer-2 networks. + can be complicated on large layer-2 networks. All network devices need to be aware of all MACs, even instance MACs, so there is constant churn in MAC - tables and network state changes as instances are - started or stopped. + tables and network state changes as instances start and + stop. Migrating MACs (instance migration) to different - physical locations are a potential problem if ARP - table timeouts are not set properly. + physical locations are a potential problem if you do not + set ARP table timeouts properly. It is important to know that layer 2 has a very limited set @@ -173,14 +160,15 @@ with the new location of the instance. In a layer-2 network, all devices are aware of all MACs, even those that belong to instances. 
The network state - information in the backbone changes whenever an instance is - started or stopped. As a result there is far too much churn in - the MAC tables on the backbone switches. + information in the backbone changes whenever an instance starts + or stops. As a result there is far too much churn in + the MAC tables on the backbone switches. +
Layer-3 architecture advantages In the layer 3 case, there is no churn in the routing tables due to instances starting and stopping. The only time there - would be a routing state change would be in the case of a Top + would be a routing state change is in the case of a Top of Rack (ToR) switch failure or a link failure in the backbone itself. Other advantages of using a layer-3 architecture include: @@ -194,15 +182,15 @@ straightforward. - Layer 3 can be configured to use You can configure layer 3 to use BGP confederation for scalability so core routers have state proportional to the number of racks, not to the number of servers or instances. - Routing ensures that instance MAC and IP addresses - out of the network core reducing state churn. Routing + Routing takes instance MAC and IP addresses + out of the network core, reducing state churn. Routing state changes only occur in the case of a ToR switch failure or backbone link failure. @@ -211,7 +199,7 @@ example ICMP, to monitor and manage traffic. - Layer-3 architectures allow for the use of Quality + Layer-3 architectures enable the use of Quality of Service (QoS) to manage network performance. @@ -220,17 +208,16 @@ The main limitation of layer 3 is that there is no built-in isolation mechanism comparable to the VLANs in layer-2 networks. Furthermore, the hierarchical nature of IP addresses - means that an instance will also be on the same subnet as its - physical host. This means that it cannot be migrated outside + means that an instance is on the same subnet as its + physical host. This means that you cannot migrate it outside of the subnet easily. For these reasons, network virtualization needs to use IP encapsulation - and software at - the end hosts for both isolation, as well as for separation of - the addressing in the virtual layer from addressing in the + and software at the end hosts for isolation and the separation of + the addressing in the virtual layer from the addressing in the physical layer. 
Other potential disadvantages of layer 3 include the need to design an IP addressing scheme rather than - relying on the switches to automatically keep track of the MAC - addresses and to configure the interior gateway routing + relying on the switches to keep track of the MAC + addresses automatically and to configure the interior gateway routing protocol in the switches.
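The IP addressing scheme that a layer-3 design demands up front can be prototyped with the standard library. The /16 supernet and the one-/24-per-rack sizing here are hypothetical design choices, not recommendations from this guide:

```python
import ipaddress

# Hypothetical sketch: carve a data center supernet into one subnet per
# rack, the kind of up-front addressing design a layer-3 fabric requires
# instead of relying on switch MAC learning. Prefix sizes are assumed.
SUPERNET = ipaddress.ip_network("10.0.0.0/16")  # assumed allocation
RACK_PREFIX = 24                                # assumed: one /24 per rack

rack_subnets = list(SUPERNET.subnets(new_prefix=RACK_PREFIX))

print(len(rack_subnets))    # 256 racks fit in the /16
print(rack_subnets[0])      # 10.0.0.0/24
print(rack_subnets[3][1])   # first host address in rack 3: 10.0.3.1
```

Encoding the rack number into one octet, as here, keeps routing state proportional to the number of racks, which is exactly the BGP-confederation scalability property mentioned above.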
@@ -242,13 +229,13 @@ Data in an OpenStack cloud moves both between instances across the network (also known as East-West), as well as in and out of the system (also known as North-South). Physical server - nodes have network requirements that are independent of those - used by instances which need to be isolated from the core - network to account for scalability. It is also recommended to - functionally separate the networks for security purposes and - tune performance through traffic shaping. - A number of important general technical and business factors - need to be taken into consideration when planning and + nodes have network requirements that are independent of instance + network requirements. Isolate instance networks from the core + network to account for scalability. We recommend + functionally separating the networks for security purposes and + tuning performance through traffic shaping. + You must consider a number of important general technical + and business factors when planning and designing an OpenStack network. They include: @@ -286,11 +273,10 @@ future production environments. - Keeping all of these in mind, the following network design - recommendations can be made: + Bearing in mind these considerations, we recommend the following: - Layer-3 designs are preferred over layer-2 + Layer-3 designs are preferable to layer-2 architectures. @@ -327,16 +313,16 @@
Additional considerations - There are numerous topics to consider when designing a + There are several further considerations when designing a network-focused OpenStack cloud.
OpenStack Networking versus legacy networking (nova-network) considerations - Selecting the type of networking technology to implement + Selecting the type of networking technology to implement depends on many factors. OpenStack Networking (neutron) and - legacy networking (nova-network) both have their advantages and disadvantages. - They are both valid and supported options that fit different - use cases as described in the following table. + legacy networking (nova-network) both have their advantages and + disadvantages. They are both valid and supported options that fit + different use cases: @@ -375,79 +361,75 @@ Redundant networking: ToR switch high availability risk analysis A technical consideration of networking is the idea that - switching gear in the data center that should be installed + you should install switching gear in a data center with backup switches in case of hardware failure. - - Research into the mean time between failures (MTBF) on switches + Research indicates the mean time between failures (MTBF) on switches is between 100,000 and 200,000 hours. This number is dependent on the ambient temperature of the switch in the data center. When properly cooled and maintained, this translates to between 11 and 22 years before failure. Even in the worst case of poor ventilation and high ambient temperatures in the data - center, the MTBF is still 2-3 years. This is based on published - research found at http://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf - and http://www.n-tron.com/pdf/network_availability.pdf. - - In most cases, it is much more economical to only use a + for further information. + In most cases, it is much more economical to use a single switch with a small pool of spare switches to replace failed units than it is to outfit an entire data center with - redundant switches. 
Applications should also be able to - tolerate rack level outages without affecting normal - operations since network and compute resources are easily - provisioned and plentiful.
+ redundant switches. Applications should tolerate rack level + outages without affecting normal + operations, since network and compute resources are easily + provisioned and plentiful.
+
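The MTBF arithmetic quoted above is easy to verify. The following sketch (plain Python, not part of any OpenStack tooling) converts the published MTBF range into years of continuous operation:

```python
# Convert switch MTBF figures (in hours) into years of continuous
# operation, matching the 11-22 year range quoted in the text.

HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years

def mtbf_years(mtbf_hours):
    """Expected years of continuous operation for a given MTBF in hours."""
    return mtbf_hours / HOURS_PER_YEAR

low = mtbf_years(100_000)   # roughly 11.4 years
high = mtbf_years(200_000)  # roughly 22.8 years
print(f"MTBF range: {low:.1f} to {high:.1f} years")
```

The same arithmetic gives the worst-case figure: a switch failing every 2-3 years corresponds to an MTBF of roughly 18,000-26,000 hours.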
Preparing for the future: IPv6 support - - One of the most important networking topics today is the - impending exhaustion of IPv4 addresses. In early 2014, ICANN - announced that they started allocating the final IPv4 address - blocks to the Regional Internet Registries (http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/). - This means the IPv4 address space is close to being fully - allocated. As a result, it will soon become difficult to - allocate more IPv4 addresses to an application that has - experienced growth, or is expected to scale out, due to the lack - of unallocated IPv4 address blocks. - For network focused applications the future is the IPv6 + One of the most important networking topics today is the + impending exhaustion of IPv4 addresses. In early 2014, ICANN + announced that they started allocating the final IPv4 address + blocks to the Regional Internet Registries (http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/). + This means the IPv4 address space is close to being fully + allocated. As a result, it will soon become difficult to + allocate more IPv4 addresses to an application that has + experienced growth, or that you expect to scale out, due to the lack + of unallocated IPv4 address blocks. + For network-focused applications, the future is the IPv6 protocol. IPv6 increases the address space significantly, fixes long standing issues in the IPv4 protocol, and will become essential for network focused applications in the future. - OpenStack Networking supports IPv6 when configured to take advantage of - the feature. To enable it, simply create an IPv6 subnet in + OpenStack Networking supports IPv6 when configured to take + advantage of it. To enable IPv6, create an IPv6 subnet in Networking and use IPv6 prefixes when creating security groups.
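As a concrete sketch of the IPv6 steps described above, the following neutron CLI commands create an IPv6 subnet using SLAAC and add an IPv6 security group rule. The network name, subnet name, and prefix are hypothetical placeholders (`2001:db8::/32` is the reserved documentation prefix), and exact flags may vary by release:

```console
# Create an IPv6 subnet on an existing tenant network using SLAAC.
# "demo-net" and "demo-v6" are placeholder names; adjust to your deployment.
neutron subnet-create --name demo-v6 --ip-version 6 \
    --ipv6-ra-mode slaac --ipv6-address-mode slaac \
    demo-net 2001:db8:1234::/64

# Permit inbound SSH over IPv6 in the default security group.
neutron security-group-rule-create --ethertype IPv6 \
    --direction ingress --protocol tcp \
    --port-range-min 22 --port-range-max 22 default
```

These are provisioning commands that require a running cloud; they are shown here only to illustrate the subnet and security-group steps the text mentions.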
Asymmetric links - When designing a network architecture, the traffic patterns - of an application will heavily influence the allocation of - total bandwidth and the number of links that are used to send + When designing a network architecture, the traffic patterns + of an application heavily influence the allocation of + total bandwidth and the number of links that you use to send and receive traffic. Applications that provide file storage - for customers will allocate bandwidth and links to favor - incoming traffic, whereas video streaming applications will - allocate bandwidth and links to favor outgoing traffic.
+ for customers allocate bandwidth and links to favor + incoming traffic, whereas video streaming applications + allocate bandwidth and links to favor outgoing traffic. +
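The ingress/egress skew described above can be made concrete with a small sketch (plain Python, illustrative only; the link budget and ratios are assumptions, not measured values):

```python
# Split a total link budget between ingress and egress according to
# an application's traffic profile. The numbers below are assumed
# inputs for illustration, not measurements.

def split_bandwidth(total_gbps, ingress_share):
    """Return (ingress_gbps, egress_gbps) for a given ingress share (0-1)."""
    if not 0.0 <= ingress_share <= 1.0:
        raise ValueError("ingress_share must be between 0 and 1")
    ingress = total_gbps * ingress_share
    return ingress, total_gbps - ingress

# A file-storage service skews toward incoming traffic:
print(split_bandwidth(40, 0.8))  # (32.0, 8.0)
# A video-streaming service skews toward outgoing traffic:
print(split_bandwidth(40, 0.2))  # (8.0, 32.0)
```

In practice the split is realized with asymmetric link counts or QoS policy rather than a single shared figure, but the sizing exercise is the same.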
Performance - It is important to analyze the applications' tolerance for + It is important to analyze the applications' tolerance for latency and jitter when designing an environment to support network focused applications. Certain applications, for example VoIP, are less tolerant of latency and jitter. Where latency and jitter are concerned, certain applications may require tuning of QoS parameters and network device queues to - ensure that they are queued for transmit immediately or - guaranteed minimum bandwidth. Since OpenStack currently does - not support these functions, some considerations may need to - be made for the network plug-in selected. - The location of a service may also impact the application or - consumer experience. If an application is designed to serve - differing content to differing users it will need to be - designed to properly direct connections to those specific - locations. Use a multi-site installation for these situations, - where appropriate. - Networking can be implemented in two separate - ways. The legacy networking (nova-network) provides a flat DHCP network + ensure that they transmit immediately or + receive guaranteed minimum bandwidth. Since OpenStack currently does + not support these functions, consider your selected + network plug-in carefully. + The location of a service may also impact the application or + consumer experience. If an application serves + differing content to different users it must properly direct + connections to those specific locations. Where appropriate, + use a multi-site installation for these situations. + You can implement networking in two separate + ways. Legacy networking (nova-network) provides a flat DHCP network with a single broadcast domain. This implementation does not support tenant isolation networks or advanced plug-ins, but it is currently the only way to implement a distributed layer-3 @@ -457,15 +439,15 @@ variety of network methods. 
Some of these include a layer-2 only provider network model, external device plug-ins, or even OpenFlow controllers. - Networking at large scales becomes a set of boundary + Networking at large scales becomes a set of boundary questions. The determination of how large a layer-2 domain - needs to be is based on the amount of nodes within the domain + must be is based on the number of nodes within the domain and the amount of broadcast traffic that passes between instances. Breaking layer-2 boundaries may require the implementation of overlay networks and tunnels. This decision is a balancing act between the need for a smaller overhead or a need for a smaller domain. - When selecting network devices, be aware that making this + When selecting network devices, be aware that making this decision based on the greatest port density often comes with a drawback. Aggregation switches and routers have not all kept pace with Top of Rack switches and may induce bottlenecks on diff --git a/doc/arch-design/network_focus/section_user_requirements_network_focus.xml b/doc/arch-design/network_focus/section_user_requirements_network_focus.xml index 5544811418..624264ebca 100644 --- a/doc/arch-design/network_focus/section_user_requirements_network_focus.xml +++ b/doc/arch-design/network_focus/section_user_requirements_network_focus.xml @@ -6,187 +6,160 @@ xml:id="user-requirements-network-focus"> User requirements - Network focused architectures vary from the general purpose - designs. They are heavily influenced by a specific subset of - applications that interact with the network in a more - impacting way. Some of the business requirements that will - influence the design include: + Network-focused architectures vary from the general-purpose + architecture designs. Certain network-intensive applications influence + these architectures. 
Some of the business requirements that influence + the design include: - User experience: User experience is impacted by - network latency through slow page loads, degraded - video streams, and low quality VoIP sessions. Users - are often not aware of how network design and - architecture affects their experiences. Both - enterprise customers and end-users rely on the network - for delivery of an application. Network performance - problems can provide a negative experience for the - end-user, as well as productivity and economic loss. + User experience: Network latency through slow page loads, + degraded video streams, and low quality VoIP sessions impacts the + user experience. Users are often not aware of how network design and + architecture affects their experiences. Both enterprise customers + and end-users rely on the network for delivery of an application. + Network performance problems can result in a negative experience + for the end-user, as well as productivity and economic loss. - Regulatory requirements: Networks need to take into - consideration any regulatory requirements about the - physical location of data as it traverses the network. - For example, Canadian medical records cannot pass - outside of Canadian sovereign territory. Another - network consideration is maintaining network - segregation of private data flows and ensuring that - the network between cloud locations is encrypted where - required. + Regulatory requirements: Consider regulatory + requirements about the physical location of data as it traverses + the network. For example, Canadian medical records cannot pass + outside of Canadian sovereign territory. In addition, maintain + network segregation of private data flows while ensuring an + encrypted network between cloud locations where required. 
Regulatory requirements for encryption + and protection of data in flight affect network architectures as + the data moves through various networks. - Many jurisdictions have legislative and regulatory - requirements governing the storage and management of data in - cloud environments. Common areas of regulation include: + Many jurisdictions have legislative and regulatory requirements + governing the storage and management of data in cloud environments. + Common areas of regulation include: - Data retention policies ensuring storage of - persistent data and records management to meet data - archival requirements. + Data retention policies ensuring storage of persistent data + and records management to meet data archival requirements. Data ownership policies governing the possession and - responsibility for data. + responsibility for data. - Data sovereignty policies governing the storage of - data in foreign countries or otherwise separate - jurisdictions. + Data sovereignty policies governing the storage of data in + foreign countries or otherwise separate jurisdictions. - Data compliance policies governing where information - needs to reside in certain locations due to regular - issues and, more importantly, where it cannot reside - in other locations for the same reason. + Data compliance policies govern where information can and + cannot reside in certain locations. - Examples of such legal frameworks include the data - protection framework of the European Union (http://ec.europa.eu/justice/data-protection/) - and the requirements of the Financial Industry Regulatory - Authority (http://www.finra.org/Industry/Regulation/FINRARules) - in the United States. Consult a local regulatory body for more - information. 
+ Examples of such legal frameworks include the data protection + framework of the European Union + (http://ec.europa.eu/justice/data-protection/) + and the requirements of the Financial Industry Regulatory Authority + (http://www.finra.org/Industry/Regulation/FINRARules) + in the United States. Consult a local regulatory body for more + information.
High availability issues - OpenStack installations with high demand on network - resources have high availability requirements that are - determined by the application and use case. Financial - transaction systems will have a much higher requirement for - high availability than a development application. Forms of - network availability, for example quality of service (QoS), - can be used to improve the network performance of sensitive - applications, for example VoIP and video streaming. - Often, high performance systems will have SLA requirements - for a minimum QoS with regard to guaranteed uptime, latency - and bandwidth. The level of the SLA can have a significant - impact on the network architecture and requirements for - redundancy in the systems.
+ Depending on the application and use case, network-intensive + OpenStack installations can have high availability requirements. + Financial transaction systems have a much higher requirement for high + availability than a development application. Use network availability + technologies, for example quality of service (QoS), to improve the + network performance of sensitive applications such as VoIP and video + streaming. + High performance systems have SLA requirements for a minimum + QoS with regard to guaranteed uptime, latency, and bandwidth. The level + of the SLA can have a significant impact on the network architecture and + requirements for redundancy in the systems. +
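To make the SLA discussion concrete, an uptime percentage translates directly into a yearly downtime budget. A quick sketch (ordinary arithmetic in Python, nothing OpenStack-specific):

```python
# Translate an uptime SLA percentage into allowed downtime per year.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (non-leap year)

def downtime_minutes_per_year(uptime_pct):
    """Minutes of downtime per year permitted by an uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_pct / 100)

for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% uptime -> {downtime_minutes_per_year(sla):.1f} min/year")
```

A 99.99% SLA allows under an hour of downtime per year, which is why such targets usually force redundant links and switches rather than a spare-pool strategy.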
Risks Network misconfigurations - Configuring incorrect IP - addresses, VLANs, and routes can cause outages to - areas of the network or, in the worst-case scenario, - the entire cloud infrastructure. Misconfigurations can - cause disruptive problems and should be automated to - minimize the opportunity for operator error. + Configuring incorrect IP addresses, VLANs, and routes + can cause outages to areas of the network or, in the worst-case + scenario, the entire cloud infrastructure. Automate network + configurations to minimize the opportunity for operator error, + which can cause disruptive problems. Capacity planning - Cloud networks need to be managed - for capacity and growth over time. There is a risk - that the network will not grow to support the - workload. Capacity planning includes the purchase of - network circuits and hardware that can potentially - have lead times measured in months or more. + Cloud networks require management for capacity and growth + over time. There is a risk that the network does not grow to + support the workload. Capacity planning includes the purchase + of network circuits and hardware that can potentially have lead + times measured in months or more. Network tuning - Cloud networks need to be configured - to minimize link loss, packet loss, packet storms, - broadcast storms, and loops. + Configure cloud networks to minimize link loss, packet loss, + packet storms, broadcast storms, and loops. Single Point Of Failure (SPOF) - High availability - must be taken into account even at the physical and - environmental layers. If there is a single point of - failure due to only one upstream link, or only one - power supply, an outage becomes unavoidable. + Consider high availability at the physical and environmental + layers. If there is a single point of failure due to only one + upstream link, or only one power supply, an outage can become + unavoidable. Complexity - An overly complex network design becomes - difficult to maintain and troubleshoot. 
While - automated tools that handle overlay networks or device - level configuration can mitigate this, non-traditional - interconnects between functions and specialized - hardware need to be well documented or avoided to - prevent outages. + An overly complex network design can be difficult to + maintain and troubleshoot. While automated tools that handle + overlay networks or device-level configuration can mitigate + this complexity, avoid or document non-traditional interconnects + between functions and specialized hardware to prevent + outages. Non-standard features - There are additional risks - that arise from configuring the cloud network to take - advantage of vendor specific features. One example is - multi-link aggregation (MLAG) that is being used to - provide redundancy at the aggregator switch level of - the network. MLAG is not a standard and, as a result, - each vendor has their own proprietary implementation - of the feature. MLAG architectures are not - interoperable across switch vendors, which leads to - vendor lock-in, and can cause delays or inability when - upgrading components. + There are additional risks that arise from configuring the + cloud network to take advantage of vendor specific features. + One example is multi-link aggregation (MLAG) used to provide + redundancy at the aggregator switch level of the network. MLAG + is not a standard and, as a result, each vendor has their own + proprietary implementation of the feature. MLAG architectures + are not interoperable across switch vendors, which leads to + vendor lock-in, and can delay or prevent the upgrade of + components.
Security - Security is often overlooked or added after a design has - been implemented. Consider security implications and - requirements before designing the physical and logical network - topologies. Some of the factors that need to be addressed - include making sure the networks are properly segregated and - traffic flows are going to the correct destinations without - crossing through locations that are undesirable. Some examples - of factors that need to be taken into consideration are: + Designers often overlook security or add it after implementing + the design. Consider security implications and requirements before + designing the physical and logical network topologies. Make sure that + the networks are properly segregated and that traffic flows to the + correct destinations without crossing through undesirable locations. + Consider the following example factors: Firewalls - Overlay interconnects for joining separated tenant - networks + Overlay interconnects for joining separated tenant networks Routing through or avoiding specific networks - Another security vulnerability that must be taken into - account is how networks are attached to hypervisors. If a - network must be separated from other systems at all costs, it - may be necessary to schedule instances for that network onto - dedicated compute nodes. This may also be done to mitigate - against exploiting a hypervisor breakout allowing the attacker - access to networks from a compromised instance. + How networks attach to hypervisors can expose security + vulnerabilities. To mitigate the risk of a hypervisor breakout, + separate sensitive networks from other systems and schedule + instances for those networks onto dedicated compute nodes. This + prevents attackers from reaching the networks from a compromised + instance.
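One common way to realize the dedicated-compute-node scheduling mentioned above is a nova host aggregate keyed to a flavor extra spec, with the AggregateInstanceExtraSpecsFilter scheduler filter enabled. The aggregate, host, and flavor names below are hypothetical, and the extra-spec syntax varies by release:

```console
# Group the dedicated compute nodes into an aggregate and tag it.
nova aggregate-create secure-hosts
nova aggregate-add-host secure-hosts compute-01
nova aggregate-set-metadata secure-hosts pinned=true

# Create a flavor whose extra spec matches the aggregate metadata,
# so instances booted with it land only on the dedicated hosts.
nova flavor-create m1.secure auto 4096 40 2
nova flavor-key m1.secure set pinned=true
```

These are provisioning commands against a running cloud, shown only as a sketch of the scheduling approach; consult the scheduler filter documentation for your release before relying on it for isolation.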