Cray CS-Storm 500GT 3U Installation Instructions

CS-Storm™ 500GT 3U Server Hardware Guide
(Rev C)
H-6150

Contents
About the CS-Storm 500GT 3U Server Hardware Guide
System Description
Server Components
  Controls and Indicators
  Drive Support and Configuration
System Interconnect Diagram
PCIe Architecture
PCIe Connections and Cabling
Power Distribution
  Power Supplies
Hydra Fan Control Utility
Management Daughter Card (MDC)
  MDC Control Panel
  MDC DIP Switch Configuration
PCIe Bifurcation of the 4 PCIe Switch Board
Environmental Specifications
S2600BP Motherboard Description
  S2600BP Component Locations
  S2600BP Processor Socket Assembly
  S2600BP Architecture
  S2600BP Processor Population Rules
  S2600BP Memory Support and Population Rules
  S2600BP Configuration and Recovery Jumpers
  S2600BP BIOS Features

About the CS-Storm 500GT 3U Server Hardware Guide
The Cray® CS-Storm 500GT™ 3U Server Hardware Guide H-6150 describes the 3U server (Model 7201)
components and features. This guide does not include information about peripheral switches or network fabric
components. Refer to the manufacturer's documentation for peripheral equipment.
Document Versions
Table 1. Record of Revision
Publication Title                                        Date        Updates
CS-Storm™ 500GT 3U Server Hardware Guide H-6150 Rev C    Feb 2018    Volta 100 GPU.
CS-Storm™ 500GT Hardware Guide H-6150 Rev B              Oct 2017    Technical updates.
CS-Storm™ 500GT Hardware Guide H-6150 Rev A              Sept 2017   Original publication.
Scope and Audience
This document provides information about the CS-Storm 500GT 3U server. Installation and service information is
provided for users who have experience maintaining high performance computing (HPC) equipment. Installation
and maintenance tasks should be performed by experienced technicians in accordance with the service
agreement.
Related Publications
● CS-Storm 500GT Hardware Replacement Procedures H-6159
Acronyms and Terms
The following table lists the acronyms and their definitions used in this guide.
Acronym Definition
Accelerator Specialized hardware that performs some functions more efficiently than is possible
with software running on a more general-purpose CPU. GPU-accelerated computing is
the use of a GPU together with a CPU to accelerate scientific, analytics, engineering,
consumer, and enterprise applications. In use, GPU accelerator is often shortened to
GPU.
ASHRAE American Society of Heating, Refrigerating and Air-Conditioning Engineers.
BIOS Basic Input/Output System. Non-volatile firmware used to perform hardware
initialization during the booting process, and to provide runtime services for the
operating system.
Bridge board A PCI board/card that provides front panel control signals from the
motherboard to the power backplane and SATA signals from the motherboard to the
disk backplane.
FPGA Field Programmable Gate Array. An integrated circuit designed to be configured by a
customer after it is manufactured.
GPU Graphics Processing Unit (GPU). A processor chip that performs rapid mathematical
calculations, primarily for the purpose of rendering images. GPUs perform parallel
operations on multiple sets of data.
KVM Keyboard Video Mouse (KVM). A rackmounted drawer unit with display screen,
keyboard, and mouse or touch pad used to control multiple computers in a data
center.
I²C Inter-Integrated Circuit. A multi-master, multi-slave, packet switched, single-ended,
serial computer bus. It is typically used for attaching lower-speed peripheral ICs to
processors and microcontrollers in short-distance, intra-board communication. I²C is
often spelled I2C and pronounced I-two-C.
IFB Interface board. A printed circuit board (PCB) assembly used for the transmission of
signals between different components/systems within the server.
MDC Management daughter card. A printed circuit board (PCB) assembly with IO interface
used to configure, monitor, and manage server subsystems and components.
NVMe Non-Volatile Memory Express (NVMe). A logical device interface specification for
accessing non-volatile storage media attached through a PCI Express (PCIe) bus.
NVMe is commonly flash memory that comes in the form of solid-state drives (SSDs).
PCIe 3.0 Peripheral Component Interconnect Express, 3rd generation I/O.
PCIe switch board A PCIe expansion backplane with 10 PCIe x16 Gen3 slots that expand the
motherboard PCIe lanes and computing resources.
PLX PLX Technology, Inc. is the manufacturer of the PEX8796 PCIe 3.0 multiple-host
switching integrated circuit (IC) chips used on the PCIe switch board.
RU Rack unit. Abbreviated RU or U, a height measurement defined as 44.45 mm (1.75
in). Most often refers to the overall height of 19-inch and 23-inch rack frames, as
well as the height of servers and equipment mounted in these frames.
SATA Serial AT Attachment (SATA). A computer bus interface that connects host bus
adapters to mass storage devices such as hard disk drives and solid-state drives.
SMBus System Management Bus. A single-ended, simple, two-wire bus used for lightweight
communication. It is typically used in computer motherboards for on/off communication
with the power source.
SSD Solid-state storage device (SSD). SSDs use integrated circuit assemblies as memory
to store data persistently so the data can continue to be accessed. SSDs have no
moving mechanical components as do traditional electromechanical magnetic disks
such as hard disk drives (HDDs).
U.2 U.2, formerly known as SFF-8639, is a computer interface for connecting SSDs to a
computer. It uses up to four PCI Express lanes.
UPI Intel® UltraPath® Interconnect. UPI is a point-to-point processor interconnect capable
of up to 10.4 GT/s. With the Intel Xeon Scalable processor family (formerly code-
named Skylake-SP), UPI replaces the Intel QuickPath Interconnect (QPI).
Product EMC Compliance
● FCC Part 15 (USA)
● EN55022 (Europe)
● ICES-003 Emissions (Canada)
● VCCI Emissions (Japan)
● KC Certification (Korea)
Product Regulatory Compliance Markings
The CS-Storm 500GT model 7201 chassis and system components are marked with the following regulatory and
certification markings.
Regulatory Compliance Country Marking
FCC Marking (Class A)
USA INFORMATION TO THE USER
This equipment has been tested and found to comply with the limits for a
Class A digital device, pursuant to part 15 of the FCC Rules. These limits
are designed to provide reasonable protection against harmful
interference when the equipment is operated in a commercial
environment. This equipment generates, uses, and can radiate radio
frequency energy and, if not installed and used in accordance with the
instruction manual, may cause harmful interference to radio
communications.
Operation of this equipment in a residential area is likely to cause harmful
interference in which case the user will be required to correct the
interference at his own expense.
WARNING
Changes or modifications not expressly approved by the manufacturer
could void the user’s authority to operate the equipment.
NRTL (Nationally Recognized Testing Laboratory)
USA/Canada
CE Mark Europe
WARNING
This is a class A product. In a domestic environment this product may
cause radio interference in which case the user may be required to take
adequate measures.
EMC Marking (Class A)
Canada This Class [A] digital apparatus complies with Canadian ICES-003.
Cet appareil numerique de la classe [A] est conforme a la norme
NMB-003 du Canada.
VCCI Marking (Class A)
Japan この裝置は, 情報處理裝置等電波障害自主規制協議會 (VCCI) の基準に基
づくクラス A情報技術裝置です. この裝置を家庭環境で使用すると電波妨
害を引き起こすことがあります.
この場合には使用者が適切な對策を講ずるよう要求されることがありま
す.
C-Tick Marking (Class A)
Australia
Replaceable Lithium battery Warning Information
UL Safety CAUTION
RISK OF EXPLOSION IF BATTERY IS REPLACED BY AN INCORRECT TYPE.
DISPOSE OF USED BATTERIES ACCORDING TO THE INSTRUCTIONS.
Low Altitude Use China Only use at altitudes not exceeding 2000m.
AC Symbol All IEC 60417-5032
Alternating current
Stand-by Symbol All IEC 60417-5009
Stand-by
Trademarks
The following are trademarks of Cray Inc. and are registered in the United States and other countries: CRAY and design, SONEXION, Urika-GX, and
YARCDATA. The following are trademarks of Cray Inc.: APPRENTICE2, CHAPEL, CLUSTER CONNECT, ClusterStor, CRAYDOC, CRAYPAT, CRAYPORT,
DATAWARP, ECOPHLEX, LIBSCI, NODEKARE. The following system family marks, and associated model number marks, are trademarks of Cray Inc.: CS, CX,
XC, XE, XK, XMT, and XT. The registered trademark LINUX is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the
mark on a worldwide basis. Other trademarks used in this document are the property of their respective owners.

System Description
The CS-Storm™ 500GT system is a dense 3U or 4U 19-inch wide rackmount server that is optimized to support
today’s highest power GPU or FPGA accelerator cards.
Each 500GT server contains two Intel® Xeon® Scalable processors, up to 1536GB of memory, eight 2.5-in drive
bays, and up to 16 DIMMs. For optimal memory performance, 12 DIMMs (one DIMM per memory channel, six
DIMMs per CPU) are recommended.
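The memory figures quoted here and in Table 2 can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of any Cray tool; the constant and variable names are illustrative, and the values are the ones stated in this guide.

```python
# Cross-check of the memory figures stated in this guide.
# All names below are illustrative; only the numbers come from the guide.
CPUS = 2                       # two Intel Xeon Scalable processors
RECOMMENDED_DIMMS_PER_CPU = 6  # one DIMM per memory channel
LARGEST_DIMM_GB = 128          # largest DIMM listed in Table 2
DIMM_SLOTS = 16                # DIMM slots available on the motherboard

recommended_dimms = CPUS * RECOMMENDED_DIMMS_PER_CPU
max_memory_gb = recommended_dimms * LARGEST_DIMM_GB
unused_slots = DIMM_SLOTS - recommended_dimms

print(f"Recommended DIMMs: {recommended_dimms}")       # 12
print(f"Maximum memory:    {max_memory_gb} GB")        # 1536 GB
print(f"Slots left empty:  {unused_slots}")            # 4
```

The result matches the stated 1536GB maximum, which is reached with the recommended 12-DIMM population rather than with all 16 slots filled.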
Each CS-Storm 500GT server supports up to 10 PCIe GPU or FPGA accelerator cards.
Figure 1. CS-Storm 500GT Server
(Figure callouts: switches and indicators; GPU/accelerator card status LEDs; front cover; rear cover; cover
latches; front grill (air intake); air vents (exhaust). Chassis shown: Model 7201, 19in 3U chassis; Model 7201A,
19in 4U chassis.)
Server Configuration Options:
● Balanced PCIe Configuration
○ GPU host-to-peer optimized server.
○ Balanced PCIe CPU-to-GPU bandwidth. The balanced PCIe architecture offers balanced performance for
codes that have high data parallelism and use both the CPUs and GPUs in workload processing.
● Custom Accelerator Card Configuration
○ A 4U chassis balanced PCIe server implements the same system PCIe architecture and hardware
components but supports extended height custom-sized FPGA accelerator cards.

Table 2. CS-Storm 500GT Server Specifications
Feature Description
Rack options 19in rack, 42RU and 48RU options
Chassis ● 19-inch wide, 3U or 4U rackmounted chassis
● Up to 15 server chassis in a 48RU rack
● Chassis weight:
○ Up to 76 lb (34kg) without PCIe cards
○ Up to 135 lb (62kg) fully loaded
● 3U Dimensions: (HxWxD) 5.1 x 17.7 x 36.4in (130 x 449 x 925mm)
● 4U Dimensions: (HxWxD) 6.8 x 17.7 x 36.4in (173 x 449 x 925mm)
Accelerators Up to 10 PCIe accelerators (up to 400W continuous each):
● NVIDIA® Tesla® P40 or P100
● NVIDIA® Tesla® V100
● Custom extended height full-size FPGA accelerators (4U chassis only)
Custom 4U Chassis ● Up to eight 425W custom cards
● N+1 power supply redundancy
● Not ASHRAE compliant
● Balanced PCIe configuration
Motherboard Intel® S2600BP
Processors Two Intel Xeon Scalable family processors (up to 165W TDP)
Memory Capacity Up to 12 of 16 available DIMM slots
Up to 1536GB DDR4 (12 x 128GB DIMMs)
For optimal memory performance, 12 DIMMs (1 DIMM per channel, 6 DIMMs per
CPU) are highly recommended.
Storage 2.5in drive bays
NVMe U.2 drive configuration depends on PCIe topology.
Spinning disks are not supported (all SATA disks must be SSDs).
● Up to 8 SATA SSDs in external drive bays (hot swap)
● 4 NVMe SSDs (external bays 4-7)
● 1 or 2 fixed internal SATA SSDs
Some configurations require an additional add-in storage controller
Total number and type of drives vary with configuration and PCIe topology.
Expansion slots ● 2 PCIe 3.0 x16 slots
● 2 additional PCIe 3.0 x16 slots can be added with 8 GPUs
Network adapter cards ● Omni-Path (100Gb/s)
● InfiniBand™ EDR (100Gb/s) or HDR (200Gb/s)
● Ethernet (100Gb/s)
Cooling Air cooled (front to rear air flow)
● Seven fans
○ Three 120mm fans (front)
○ Four 80mm fans (middle)
○ Active/manual fan speed control through MDC or hydrad daemon
● Built-in air duct
● Passive processor heatsinks
● Passive GPU/FPGA heatsinks
● Two in-line fans in each power supply unit
Power Supplies Support for both N+1 and N+N power configurations. Up to four 2200W AC power
supplies, 200-277VAC (gold level efficiency)
● 2+2 redundancy with 10 (300W) accelerators
● 3+1 redundancy with 10 (400W) accelerators (3+1 PSUs required)
Server supports multiple PCIe topologies and configuration options
Node management ● Integrated Baseboard Management Controller (BMC) (IPMI 2.0)
● Management daughter card (MDC)
● MDC supports hydrad daemon to manage fans, GPUs, and PSUs
● Intel remote management module 4 (RMM4)
● RMM4 supports remote KVM and Intel Dedicated Server Management NIC
● On-board RJ45 management port
● Support for Intel System Management Software
IO Ports ● 2 RJ45 10GBase-T LAN ports
● 1 RJ45 dedicated management LAN port
● 2 USB 3.0 ports
● Optional: VGA or serial port
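The power supply redundancy modes listed in Table 2 (2+2 with ten 300W accelerators, 3+1 with ten 400W accelerators) can be approximated with a rough load estimate. This guide does not describe how those modes were derived, so the sketch below is only an illustration: the function name `psu_split` is hypothetical, and `OTHER_LOAD_WATTS` is an assumed placeholder for fans, memory, drives, and conversion losses, not a figure from this guide.

```python
import math

PSU_WATTS = 2200        # rating of each AC power supply (Table 2)
INSTALLED_PSUS = 4      # maximum supplies per chassis (Table 2)
CPU_COUNT = 2           # processors per server (Table 2)
CPU_TDP_WATTS = 165     # maximum processor TDP (Table 2)
OTHER_LOAD_WATTS = 500  # ASSUMPTION: fans, memory, drives, losses (not from this guide)


def psu_split(accelerators, watts_each, other_load=OTHER_LOAD_WATTS):
    """Return (active, spare) supplies for a rough estimated load."""
    load = accelerators * watts_each + CPU_COUNT * CPU_TDP_WATTS + other_load
    active = math.ceil(load / PSU_WATTS)
    if active > INSTALLED_PSUS:
        raise ValueError("estimated load exceeds installed PSU capacity")
    return active, INSTALLED_PSUS - active


print(psu_split(10, 300))  # (2, 2)
print(psu_split(10, 400))  # (3, 1)
```

Under the assumed overhead, the estimate reproduces the 2+2 and 3+1 splits in Table 2. Actual sizing should follow the power distribution information in this guide rather than this approximation.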