Telemetry Streaming

Introduction

Telemetry Streaming, a new feature in iDRAC9 v4.0 enabled by the new Datacenter License, allows you to discover trends, fine-tune operations, and create predictive analytics to optimize your infrastructure. Using tools such as Splunk or ELK Stack, you can perform deep analysis of server telemetry, including storage, networking, and memory parametric data, for proactive decision making and decreased downtime. Telemetry streaming can also be used for system customization, optimization, and risk management.

There is a huge amount of untapped machine data in your IT infrastructure: use iDRAC9 Telemetry Streaming and analytics to leverage that data to optimize your server management and operations.

With iDRAC9 Telemetry Streaming, time-series and detailed statistics reports are delivered directly to a variety of analytics collection tools. Efficiency is higher because there is no need to issue an individual command for each piece of data. The streaming configuration is flexible: users can adjust the number of metrics they require, set the report interval (30 seconds, for example), and have reports sent immediately when a critical event is detected in the server (a PSU failure, for example).
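The two delivery modes described above, periodic reports and reports sent immediately on a critical event, can be illustrated with a minimal sketch. Everything here is hypothetical and chosen for illustration: the ReportConfig class, the should_send function, the metric names, and the PSU_FAILURE event label are assumptions, not the iDRAC9 configuration schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReportConfig:
    """Hypothetical description of one telemetry report (not the iDRAC9 schema)."""
    metrics: list[str]                                      # metrics to include
    interval_s: int = 30                                    # periodic interval, seconds
    trigger_events: set[str] = field(default_factory=set)   # events that force a report


def should_send(cfg: ReportConfig, now_s: float, last_sent_s: float,
                event: Optional[str] = None) -> bool:
    """Return True when a report is due, on schedule or because of a trigger event."""
    if event is not None and event in cfg.trigger_events:
        return True  # critical event: send immediately, regardless of the interval
    return now_s - last_sent_s >= cfg.interval_s


# Illustrative names only; real metric and event identifiers may differ.
cfg = ReportConfig(
    metrics=["PowerConsumption", "InletTemperature"],
    interval_s=30,
    trigger_events={"PSU_FAILURE"},
)

print(should_send(cfg, now_s=10, last_sent_s=0))                       # interval not yet elapsed
print(should_send(cfg, now_s=30, last_sent_s=0))                       # interval elapsed
print(should_send(cfg, now_s=10, last_sent_s=0, event="PSU_FAILURE"))  # event-triggered
```

Keeping the trigger check separate from the interval check means an event-driven report never has to wait for the next scheduled slot.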

The advanced agent-free architecture in iDRAC9 provides over 180 data metrics (with more coming) related to server and peripherals operations that are precisely time-stamped and internally buffered to allow highly efficient data stream collection and processing with minimal network loading. This comprehensive telemetry can be fed to popular analytics tools to predict failure events, optimize server operation, and enhance cyber-resiliency.

New iDRAC9 Telemetry Streaming Feature

  • iDRAC9 Telemetry Streaming provides high-performance streaming of server data
  • The goal is to extract high-value data that customers’ analytics tools can leverage and that can enhance Dell customer support
  • Over 180 unique server and peripheral metrics can be streamed or pulled from iDRAC9, due to our industry-leading agent-free architecture
  • Provides precise, time-series data for monitoring power, temperatures, performance (CUPS*) and statistics (NICs, GPUs, Storage SMART attributes, etc.)
  • Plug-in support is available (staged on GitHub) for integrating iDRAC Telemetry Streaming into popular analytics solutions such as Splunk, ELK Stack, and InfluxDB

Types of Telemetry Data

The types of telemetry that iDRAC9 provides are summarized below:

New Telemetry Data with iDRAC9 4.0:

  • Serial Data Log messages
  • GPU Accelerator Inventory & Monitoring
  • Advanced CPU Metrics
  • Storage Drive SMART logs
  • Advanced Memory Monitoring
  • SFP+ Optical Transceiver Inventory & Monitoring

Existing Telemetry Data:

  • Configuration: comprehensive settings for all devices (BIOS, iDRAC, NICs, RAID, etc.)
  • Inventory: comprehensive server hardware and firmware reporting
  • Performance: CPU, memory bandwidth and I/O usage (Compute Usage Per Second or CUPS)
  • Performance and diagnostic statistics: PERC, NICs, Fibre Channel
  • Sensors: voltage, temperature, power, connectivity status, intrusion detection
  • Logs: System Event Log (SEL), iDRAC diagnostics, Lifecycle Controller Log

Figure: Telemetry Data

Telemetry and Analytics

Telemetry has been around for decades and has been used in various business applications, from hospitals monitoring patients to oil and gas drilling systems to weather balloons transmitting meteorological data. Telemetry is defined as an “automated communications process by which measurements are made, and other data collected at remote or inaccessible points are transmitted to receiving equipment for monitoring.”

Some of the use cases for data center analytics are:

Predictive analytics: Customers can perform an in-depth analysis of server telemetry, including device parametric data, to proactively replace failing devices. In one case, an IT team used analytics on telemetry from memory devices to develop an algorithm that predicted eventual failure. This allowed proactive replacement of suspect devices during scheduled maintenance windows, significantly improving uptime and SLA quality.
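As a rough illustration of the idea, the sketch below fits a least-squares trend line to a series of error counts and flags a device whose count is rising quickly. This is not the algorithm the IT team in the example used, which the source does not describe. The function names, the threshold, and the sample counts are all assumptions made up for illustration.

```python
def trend_slope(counts):
    """Least-squares slope of counts against sample index (count per interval)."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(counts))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_suspect(counts, slope_threshold=1.0):
    """Flag a device whose error count grows faster than the (assumed) threshold."""
    return trend_slope(counts) > slope_threshold


# Made-up correctable-error counts per reporting interval for two memory devices.
healthy = [0, 0, 1, 0, 1, 0]
degrading = [0, 1, 3, 6, 10, 15]

print(is_suspect(healthy))    # flat trend, not flagged
print(is_suspect(degrading))  # steadily rising trend, flagged
```

A production model would likely combine several parametric signals and be validated against real failure history; the slope test only shows how time-stamped counts can feed a replacement decision.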

Optimized IT operations: You can perform time-series analysis of vital server metrics, such as power, temperature, CPU, and I/O performance, to gain insights into optimizing server operation. One industry that makes extensive use of analytics is High-Frequency Trading, where every millisecond of compute counts in accelerating automated trades. Detailed telemetry is commonly used to discover ways to squeeze out more performance from servers, which becomes a key competitive advantage in this industry.
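One of the simplest forms of time-series analysis is a rolling mean, which smooths short spikes so that the underlying trend in a metric becomes visible. The sketch below is a minimal example under stated assumptions: the function name and the power readings (in watts, one per report interval) are invented for illustration.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the mean of the most recent `window` samples at each position."""
    buf = deque(maxlen=window)
    total = 0.0
    means = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]   # this sample is evicted by the append below
        buf.append(v)
        total += v
        means.append(total / len(buf))
    return means


# Made-up power readings in watts.
power_w = [200, 210, 220, 400, 230, 225]
smoothed = rolling_mean(power_w, window=3)

for raw, avg in zip(power_w, smoothed):
    print(f"raw={raw:4d} W  rolling mean={avg:6.1f} W")
```

Comparing each raw sample with its rolling mean is a common first step before looking for sustained shifts in power or temperature.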

Security: AI-based analytics can respond far faster to security events. You can enhance security AI and forensics by monitoring for unusual user login activity or physical intrusion events on your servers.

Integrating iDRAC9 Telemetry Streaming with Popular Analytics Solutions

The data is formatted using JSON (JavaScript Object Notation) and can be easily adapted to connect to many analytics solutions on the market.
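Because the reports are JSON, adapting them for an analytics tool usually comes down to parsing each report and flattening it into one record per sample. The sketch below shows that step. The payload is invented, and its field names (Id, MetricValues, MetricId, Timestamp, MetricValue) are an assumption modeled on the DMTF Redfish MetricReport schema; check an actual report before relying on them.

```python
import json

# Illustrative payload only: field names are assumed, values are made up.
sample_report = """
{
  "Id": "PowerMetrics",
  "MetricValues": [
    {"MetricId": "SystemInputPower", "Timestamp": "2021-01-01T00:00:00Z", "MetricValue": "212"},
    {"MetricId": "SystemInputPower", "Timestamp": "2021-01-01T00:00:30Z", "MetricValue": "218"}
  ]
}
"""


def flatten_report(raw):
    """Turn one JSON report into flat records, one per metric sample."""
    report = json.loads(raw)
    records = []
    for sample in report.get("MetricValues", []):
        records.append({
            "report": report.get("Id"),
            "metric": sample.get("MetricId"),
            "timestamp": sample.get("Timestamp"),
            # Assumes a numeric reading encoded as a string; non-numeric
            # values would need separate handling.
            "value": float(sample["MetricValue"]),
        })
    return records


for record in flatten_report(sample_report):
    print(record)
```

Flat records of this kind map directly onto the event or point formats that tools such as Splunk, ELK Stack, and InfluxDB ingest.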

Figure: Integrating iDRAC Telemetry into Analytics Solutions

2021 MD11 ict engineering & consulting