Microsoft Infrastructure as a Service Foundations—Non-Converged Architecture Pattern

This document discusses one of several design patterns that you can use for building out your Infrastructure as a Service fabric infrastructure. This document covers the non-converged architecture pattern, which is defined by creating separate networks for each of the traffic profiles used in a Microsoft IaaS solution.

Table of Contents

1 Introduction
2 Non-Converged Architecture Pattern
   2.1 Compute
   2.2 Network
   2.3 Storage

This document is part of a yet-to-be-completed collection of documents that will constitute the Microsoft Infrastructure as a Service Foundations series. These documents will provide information about design elements and recommended options for a Microsoft Infrastructure as a Service infrastructure. The infrastructure could support on-premises IaaS, public cloud provider (Azure) IaaS, hosting service provider IaaS, or any combination of these as part of a hybrid IaaS solution. These documents are intended for IaaS cloud services architects and system designers, but they will be of interest to anyone seeking detailed information about what Microsoft has to offer in the Infrastructure as a Service space.

Contributors:
Adam Fazio – Microsoft
David Ziembicki – Microsoft
Joel Yoker – Microsoft
Artem Pronichkin – Microsoft
Jeff Baker – Microsoft
Michael Lubanski – Microsoft
Robert Larson – Microsoft
Steve Chadly – Microsoft
Alex Lee – Microsoft
Yuri Diogenes – Microsoft
Carlos Mayol Berral – Microsoft
Ricardo Machado – Microsoft
Sacha Narinx – Microsoft
Tom Shinder – Microsoft
Jim Dial – Microsoft

1 Introduction

The goal of the Infrastructure as a Service (IaaS) Foundations series is to help enterprise IT and cloud service providers understand, develop, and implement IaaS infrastructures.
This series provides comprehensive conceptual background, a reference architecture, and a reference implementation that combine Microsoft software, consolidated guidance, and validated configurations with partner technologies such as compute, network, and storage architectures, in addition to value-added software features. The IaaS Foundations series utilizes the core capabilities of the Windows Server operating system, Hyper-V, System Center, Windows Azure Pack, and Microsoft Azure to deliver on-premises and hybrid cloud Infrastructure as a Service offerings.

As part of the Microsoft IaaS Foundations series, this document discusses one of several design patterns that you can use for building out your Infrastructure as a Service fabric infrastructure. It covers the non-converged architecture pattern, which is defined by creating separate networks for each of the traffic profiles used in a Microsoft IaaS solution.

2 Non-Converged Architecture Pattern

This section contains an architectural example that defines the non-converged cloud infrastructure pattern. The example provides guidance about the hardware that is required to build the non-converged pattern reference architecture by using high-level, non-OEM-specific system models. The non-converged pattern comprises traditional blade or non-blade servers that utilize a standard network and storage network infrastructure to support a highly available Hyper-V failover cluster fabric infrastructure. This infrastructure pattern provides the performance of a large-scale Hyper-V host infrastructure and the flexibility of utilizing existing infrastructure investments at a lower cost than a converged architecture. Figure 1 outlines the logical structure of components that follow this architectural pattern.

Figure 1 Non-converged architecture pattern

2.1 Compute

The compute infrastructure is one of the primary elements that must scale to support a large number of workloads.
In a non-converged fabric infrastructure, a set of hosts that have the Hyper-V role enabled provides the fabric with the capability to achieve scale in the form of a large-scale failover cluster. Figure 2 provides an overview of the compute layer of the private cloud fabric infrastructure.

Figure 2 Compute minimum configuration

2.1.1 Hyper-V Host Infrastructure

The Hyper-V host infrastructure comprises a minimum of four hosts and a maximum of 64 hosts in a single Hyper-V failover-cluster instance. Although Windows Server 2012 R2 failover clustering supports a minimum of two nodes, a configuration at that scale does not provide sufficient reserve capacity to achieve cloud attributes such as elasticity and resource pooling. As with any failover-cluster configuration, reserve capacity must be accounted for in the host infrastructure. Adopting a simple n-1 methodology does not always provide a sufficient amount of reserve capacity to support the workloads that are running on the fabric infrastructure. For true resilience to outages, we recommend that you size the reserve capacity within a single scale unit to one or more hosts. This is critical for delivering availability within a private cloud infrastructure, and it is a key consideration when you are advertising the potential workload capacity of the fabric infrastructure.

Note: Equally important to the overall density of the fabric is the amount of physical memory that is available to each fabric host. For service provider and enterprise configurations, a minimum of 192 GB of memory is required. As the demand for memory within workloads increases, memory becomes the second largest factor for scale and density in the compute fabric architecture. Hyper-V provides Dynamic Memory to support higher densities of workloads through a planned oversubscription model.
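The reserve-capacity sizing described earlier (one or more reserve hosts per scale unit, rather than a blanket n-1 assumption) can be sketched as a quick calculation. This is a hypothetical helper for illustration; the per-host VM counts are made up, while the one-host minimum comes from the guidance above:

```python
def advertised_capacity(total_hosts, vms_per_host, reserve_hosts=1):
    """Workload capacity a scale unit can safely advertise after setting
    aside reserve hosts to absorb outages (at least one, per the guidance)."""
    if reserve_hosts < 1:
        raise ValueError("size reserve capacity to one or more hosts")
    usable_hosts = total_hosts - reserve_hosts
    if usable_hosts < 1:
        raise ValueError("scale unit too small for the requested reserve")
    return usable_hosts * vms_per_host

# A 16-host scale unit reserving 2 hosts, at an assumed 30 VMs per host:
print(advertised_capacity(16, 30, reserve_hosts=2))  # 420
```

Advertising only the post-reserve figure is what keeps a host failure from overcommitting the surviving nodes.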
Although it is safe to assume that this feature will provide increased density for the fabric, a private cloud infrastructure should carefully consider the use of Hyper-V Dynamic Memory as part of the compute design, due to supportability limitations and performance requirements in certain workloads. Always refer to the vendor workload recommendations and support guidelines when you enable Hyper-V Dynamic Memory. Additional considerations that should be accounted for in density calculations include:

- The amount of startup RAM that is required for each operating system
- The minimum RAM that is allocated to the virtual machine after startup for normal operations
- The maximum RAM that is assigned to the system to prevent oversubscription scenarios when memory demand is high

The Hyper-V host must have sufficient memory to provide services such as I/O virtualization, snapshot, and management to support the guest virtual machines. Previous guidance recommended tuning the host system reserve manually; however, when Dynamic Memory is used, the root reserve is calculated automatically (based on the root physical memory and the NUMA architecture of the host) and no longer requires manual configuration.

Although guidance about network connectivity that uses onboard network connections is provided in the following section, you should make sure that out-of-band (OOB) network-management connectivity is provided to support the remote management and provisioning capabilities found in System Center. To address these capabilities, the compute infrastructure should support a minimum of one OOB management interface, with support for Intelligent Platform Management Interface (IPMI) 1.5/Data Center Management Interface (DCMI) 1.0 or Systems Management Architecture for Server Hardware (SMASH) 1.0 over WS-Man. Failure to include this component will result in a compute infrastructure that cannot utilize the automated provisioning and management capabilities of the private cloud solution.
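The three Dynamic Memory inputs listed above can be combined into a rough back-of-the-envelope density estimate. This is a hypothetical sketch, not a sizing tool: the 4 GB host reserve merely stands in for the automatically calculated root reserve, and all VM memory figures are illustrative:

```python
def density_bounds(host_memory_gb, startup_gb, minimum_gb, maximum_gb,
                   host_reserve_gb=4):
    """Bound the number of Dynamic Memory VMs one host can carry.

    boot_bound:   every VM starting at once (startup RAM)
    steady_state: every VM idling at its minimum RAM
    worst_case:   every VM demanding its maximum RAM (no oversubscription)
    """
    available = host_memory_gb - host_reserve_gb
    return {
        "boot_bound": available // startup_gb,
        "steady_state": available // minimum_gb,
        "worst_case": available // maximum_gb,
    }

# The 192 GB minimum host, with 2 GB startup / 1 GB minimum / 8 GB maximum VMs:
print(density_bounds(192, 2, 1, 8))
# {'boot_bound': 94, 'steady_state': 188, 'worst_case': 23}
```

The spread between `worst_case` and `steady_state` is exactly the oversubscription window that the vendor supportability guidance above asks you to evaluate per workload.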
Note: It should be assumed that customers will also require multiple types (or classifications) of resource pools to support a number of scenarios and associated workloads. These types of resource pools are expected to be evaluated as part of the capabilities that the resulting fabric will be required to provide. For example, a resource pool that is intended for VDI resources might have different hardware, such as specialized graphics cards, to support RemoteFX capabilities within Hyper-V. For these reasons, options for a compute infrastructure that provide advanced resource pool capabilities, such as the RemoteFX resource pool, should be available to address these needs and provide a complete solution.

2.2 Network

When you are designing the fabric network for the Hyper-V failover cluster in Windows Server 2012 R2, it is important to provide the necessary hardware and network throughput to provide resiliency and Quality of Service (QoS). Resiliency can be achieved through availability mechanisms, and QoS can be provided through dedicated network interfaces or through a combination of hardware and software QoS capabilities. Figure 3 provides an overview of the network layer of the private cloud fabric infrastructure.

Figure 3 Network minimum configuration

2.2.1 Host Connectivity

During the design of the network topology and associated network components of the private cloud infrastructure, the following key considerations apply:

- Provide adequate network port density: Designs should contain top-of-rack switches with sufficient density to support all host network interfaces.
- Provide adequate interfaces to support network resiliency: Designs should contain a sufficient number of network interfaces to establish redundancy through NIC Teaming.
- Provide network Quality of Service (QoS): Although dedicated cluster networks are an acceptable way to achieve QoS, utilizing high-speed network connections in combination with hardware- or software-defined network QoS policies provides a more flexible solution.

For Microsoft IaaS pattern designs, a minimum of two 10 GbE network interfaces and one OOB management connection is assumed as the baseline of network connectivity for the fabric architecture. Two interfaces are used for cluster traffic, and the third is available as a management interface. To provide resiliency, additional interfaces can be added and teamed by using the NIC Teaming feature in Windows Server 2012 R2. We recommend redundant network communication between all private cloud cluster nodes. Host connectivity in a private cloud infrastructure should support the following types of communication that are required by Hyper-V and the failover clusters that make up the fabric:

- Host management
- Virtual machine
- Live migration
- iSCSI (if required)
- Intra-cluster communication and CSV

Host management consists of isolated network traffic used to manage the host, and virtual machine traffic is carried on a network that clients can access to reach the virtual machines. The volume of virtual machine traffic is highly dependent on the running workload and the interaction of clients with that application or service. Live migration traffic is intermittent and is used during virtual machine mobility scenarios, such as planned failover events; it has the potential to generate a large amount of network traffic over short periods during transitions between nodes. Live migration defaults to the network with the second-lowest metric if three or more networks are configured in failover clustering. When iSCSI is used, a dedicated storage network should be deployed within the fabric (because this is the non-converged pattern, storage traffic has a dedicated network).
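The default live-migration network choice mentioned above (the second-lowest cluster network metric when three or more networks exist) can be illustrated with a small sketch. The network names and metric values are invented for the example:

```python
def live_migration_network(networks):
    """Pick the network that live migration uses by default: the
    second-lowest metric when three or more cluster networks are
    configured (the lowest is typically the cluster/CSV network),
    otherwise the lowest metric."""
    by_metric = sorted(networks, key=lambda nm: nm[1])  # (name, metric)
    chosen = by_metric[1] if len(by_metric) >= 3 else by_metric[0]
    return chosen[0]

# Hypothetical cluster networks and metrics:
nets = [("Cluster", 1000), ("LiveMigration", 1100), ("Management", 10000)]
print(live_migration_network(nets))  # LiveMigration
```

In practice you would assign metrics (or configure preferred live-migration networks) so that this default lands on the interface with the headroom to absorb the bursty transfer traffic described above.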
These iSCSI interfaces should be disabled for cluster use, because cluster traffic can contribute to storage latency. Intra-cluster communication and CSV traffic consist of the following traffic types:

- Network health monitoring
- Intra-cluster communication
- CSV I/O redirection

Network health monitoring traffic consists of heartbeats that are sent to monitor the health status of network interfaces in a full-mesh manner. This lightweight unicast traffic (approximately 134 bytes) is sent between cluster nodes over all cluster-enabled networks.

Note: Because of its sensitivity to latency, Quality of Service, as opposed to bandwidth, is important for this traffic: if heartbeat traffic becomes blocked due to network saturation, fabric nodes could be removed from cluster membership. By default, nodes exchange heartbeats every second, and a node is considered down if it does not respond to five heartbeats.

Intra-cluster communication is variable (based on workload), and it is responsible for sending database updates and synchronizing state changes between the nodes in the fabric cluster. This lightweight traffic communicates over a single interface. As with network health monitoring, latency rather than bandwidth is the primary concern, because this type of traffic is sensitive to delays during state changes such as failover.

CSV I/O redirection traffic consists of lightweight metadata updates, and it can communicate over the same interface as the intra-cluster communication mentioned previously, but it requires a defined Quality of Service to function properly. CSV uses SMB to route I/O over the network between nodes during failover events, so sufficient bandwidth is required to handle the forwarded I/O between cluster nodes. Additionally, CSV traffic will utilize SMB Multichannel and advanced network adapter capabilities such as RDMA; however, the use of Jumbo Frames has shown little increase in performance.
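The health-monitoring defaults above (heartbeats every second, a node considered down after five unanswered heartbeats) can be modeled with a minimal sketch. The class and its interface are invented for illustration and are not the actual clustering implementation:

```python
class HeartbeatMonitor:
    """Track consecutive missed heartbeats per peer, declaring a peer
    down once it misses `threshold` heartbeats in a row (default 5,
    matching the failover clustering default described above)."""

    def __init__(self, peers, threshold=5):
        self.missed = {peer: 0 for peer in peers}
        self.threshold = threshold

    def tick(self, responded):
        """Record one heartbeat interval; `responded` is the set of peers
        that answered. Returns the peers currently considered down."""
        for peer in self.missed:
            self.missed[peer] = 0 if peer in responded else self.missed[peer] + 1
        return [p for p, n in self.missed.items() if n >= self.threshold]

monitor = HeartbeatMonitor(["node2", "node3"])
for _ in range(5):                 # node3 stays silent for five intervals
    down = monitor.tick({"node2"})
print(down)  # ['node3']
```

With one-second intervals, the model makes the Note's point concrete: five seconds of blocked heartbeat traffic is enough to evict an otherwise healthy node, which is why QoS protection matters more than raw bandwidth here.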
2.3 Storage

Storage provides the final component for workload scaling, and as with any workload, storage must be designed properly to provide the required performance and capacity for overall fabric scale. In a non-converged fabric infrastructure, traditional SAN infrastructures that are connected over Fibre Channel or iSCSI provide the fabric with sufficient capacity to achieve storage scale. Figure 4 provides an overview of the storage infrastructure for the non-converged pattern.

Figure 4 Storage minimum configuration

2.3.1 Storage Connectivity

For the operating system volume of a host system that uses direct-attached storage, an internal SATA or SAS controller is required, unless the design utilizes SAN for all system storage requirements, including boot from SAN for the host operating system (Fibre Channel and iSCSI boot are supported in Windows Server 2012 R2). Depending on the storage protocol and devices that are used in the non-converged storage design, the following adapters are required to allow shared storage access:

- If using Fibre Channel SAN: two or more host bus adapters (HBAs)
- If using iSCSI: two or more 10 GbE network adapters or HBAs

Hyper-V in Windows Server 2012 R2 supports presenting SAN storage to the guest workloads that are hosted on the fabric infrastructure by using virtual Fibre Channel adapters. Virtual SANs are the logical equivalents of virtual network switches within Hyper-V, and each virtual SAN maps to a single physical Fibre Channel uplink. To support multiple HBAs, a separate virtual SAN must be created per physical Fibre Channel HBA and mapped exactly to its corresponding physical topology.

Note: When configurations use multiple HBAs, MPIO must be enabled within the virtual machine workload. A virtual SAN assignment should follow a pattern similar to a Hyper-V virtual switch assignment: if there are different classifications of service within the SAN, they should be reflected within the fabric.
As discussed in earlier sections, all physical Fibre Channel equipment must support NPIV. Hardware vendors must also provide drivers that display the Designed for Windows logo for all Fibre Channel HBAs, unless the drivers are provided in Windows Server 2012 R2. If zoning that is based on physical Fibre Channel switch ports is part of the fabric design, all physical ports must be added to allow for virtual machine mobility scenarios across hosts in the fabric cluster. Although virtual machines can support iSCSI boot, boot from SAN is not supported over the virtual Fibre Channel adapter and should not be considered as part of workload design.

2.3.2 Storage Infrastructure

The key attribute of the storage infrastructure for the non-converged pattern is the use of a traditional SAN infrastructure to provide access to storage for the fabric, fabric management, and workload layers. As discussed earlier, the primary reasons to adopt or maintain this design are to preserve existing investments in SAN or to maintain the current level of flexibility and capabilities that a SAN-based storage-array architecture provides. For Hyper-V failover cluster and workload operations in a non-converged infrastructure, the fabric components utilize the following types of storage:

- Operating system: Non-shared physical boot disks (direct-attached storage or SAN) for the fabric management host servers
- Cluster witness: Shared witness disk or file share to support the failover cluster quorum
- Cluster Shared Volumes (CSV): One or more CSV LUNs for virtual machines (Fibre Channel or iSCSI), as presented by the SAN
- Guest clustering [optional]: Shared Fibre Channel, shared VHDX, or shared iSCSI LUNs for guest clustering

Figure 5 provides a conceptual view of the storage architecture for the non-converged pattern.
Figure 5 Non-converged architecture pattern

Although boot from SAN by using Fibre Channel or iSCSI storage is supported in Windows Server 2012 R2, it is common practice, given the configuration of standard non-converged servers, to configure local storage on each server to provide these capabilities. In these cases, local storage should include a minimum of two disks configured as RAID 1 (mirror), with an optional global hot spare. To provide quorum for the server infrastructure, we recommend a quorum configuration of Node and Disk Majority; a cluster witness disk is required to support this quorum model. In non-converged pattern configurations, we recommend that you provide a 1 GB witness disk, formatted as NTFS, for all fabric and fabric management clusters. This provides resiliency and prevents partition-in-time scenarios within the cluster.

Windows Server 2012 R2 provides multiple hosts access to a shared disk infrastructure through CSV. For non-converged patterns, the SAN should be configured to provide adequate storage for virtual machine workloads. Given that workload virtual disks often exceed multiple gigabytes in size, we recommend that, where supported by the workload, dynamically expanding disks be used to provide higher density and more efficient use of storage.

Note: Additional SAN capabilities, such as thin provisioning of LUNs, can reduce the consumption of physical space. However, this functionality should be evaluated to help make sure that workload performance is not negatively affected.

For the purposes of Hyper-V failover clustering, a CSV must be configured in Windows as a basic disk that is formatted as NTFS (FAT and FAT32 are not supported for CSV). A CSV cannot be used as a witness disk, and it cannot have Windows Data Deduplication enabled. Although supported, ReFS should not be used with CSV for Hyper-V workloads.
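The Node and Disk Majority model recommended above can be sketched as a simple vote count: each node and the witness disk contribute one vote, and the cluster retains quorum while a majority of all votes is present. This function is a simplified illustration of the voting arithmetic, not the actual clustering algorithm:

```python
def has_quorum(nodes_up, nodes_total, witness_online=True):
    """Node and Disk Majority sketch: nodes_total node votes plus one
    witness disk vote; quorum requires more than half of all votes."""
    total_votes = nodes_total + 1                       # nodes + witness disk
    votes_present = nodes_up + (1 if witness_online else 0)
    return votes_present > total_votes // 2

# A 4-node fabric cluster (5 votes in total, so a majority is 3):
print(has_quorum(2, 4, witness_online=True))   # True
print(has_quorum(2, 4, witness_online=False))  # False
```

The example shows why the witness disk matters for even node counts: with half the nodes down, the witness vote is what keeps the surviving partition in quorum and prevents the partition-in-time scenario mentioned above.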
There is no restriction on the number of virtual machines that an individual CSV can support, because metadata updates on a CSV are orchestrated on the server side and run in parallel, providing uninterrupted operation and increased scalability. Performance considerations fall primarily on the IOPS that the SAN provides, given that multiple servers in the Hyper-V failover cluster stream I/O to a commonly shared LUN. Providing more than one CSV to the Hyper-V failover cluster within the fabric can increase performance, depending on the SAN configuration. To support guest clustering, LUNs can be presented to the guest operating system through iSCSI or Fibre Channel. Configurations for the non-converged pattern should include sufficient space on the SAN to support the number of LUNs that are needed for workloads with high-availability requirements that must be satisfied within the guest virtual machines and associated applications.