Chaos Minions: Harassing the World’s Largest Supercomputer

10/06/2017
12:00 pm - 12:20 pm
OCCC W315A

Objective: Guidance
Audience Level: Beginner/Intermediate
Session Type: Presentation

As HPC systems increase in size and complexity, there is a growing need for resilience validation. Chaos Minions is a framework that combines fault injection and recovery into an automated solution. The framework runs harassers on targeted components and subjects the system to randomly generated harassment. A workload can run in parallel with the framework to identify bottlenecks in resiliency.
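
To make the idea of a harasser concrete, here is a minimal sketch of a randomized inject-and-recover loop. It is not the Chaos Minions implementation: the names `Component`, `Harasser`, and `run_harassment` are hypothetical, and a "fault" is modeled only as a health flag being toggled on an in-memory object.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a targeted system component."""
    name: str
    healthy: bool = True


@dataclass
class Harasser:
    """Hypothetical harasser: injects one fault into its target, then recovers it."""
    target: Component
    log: list = field(default_factory=list)

    def inject(self):
        # Assumption: a fault is modeled as marking the component unhealthy.
        self.target.healthy = False
        self.log.append(("inject", self.target.name))

    def recover(self):
        # Recovery restores the component to its healthy state.
        self.target.healthy = True
        self.log.append(("recover", self.target.name))


def run_harassment(components, rounds, seed=None):
    """Pick a random component each round, inject a fault, then recover it."""
    rng = random.Random(seed)  # a seed makes a harassment run repeatable
    events = []
    for _ in range(rounds):
        harasser = Harasser(rng.choice(components))
        harasser.inject()
        harasser.recover()
        events.extend(harasser.log)
    return events


if __name__ == "__main__":
    parts = [Component("compute-node"), Component("storage"), Component("fabric")]
    for action, name in run_harassment(parts, rounds=3, seed=42):
        print(action, name)
```

In a real deployment the inject and recover steps would act on actual hardware or services, and a workload would run alongside the loop so that slowdowns during harassment can be observed.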

Speaker(s)

, HPC Systems Engineer, Intel