Mellanox ClusterKit: A Single Application for Evaluating Your InfiniBand Cluster Health
Mellanox ClusterKit is an MPI-based application for evaluating your InfiniBand cluster health at multiple levels.
It provides multiple benchmarks:
- Network-wide: random/ordered ring performance
- Per node: system memory bandwidth
- Pairwise network tests: latency, bandwidth
- GPU tests (bandwidth and latency):
  - GPU to host memory bandwidth
  - GPU to remote GPU (pairwise, GPU0 -> GPU0, GPU1 -> GPU1, GPUn -> GPUn)
  - GPU to remote GPU (neighbor test, each GPU talks to each GPU)
- Per rack: collectives (Barrier, Bcast, Allreduce)
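To make the pairing patterns above concrete, here is a minimal Python sketch of how ring neighbors and GPU pairings could be enumerated. It is an illustration only, not ClusterKit's implementation: the function names are hypothetical, and reading the "neighbor test" as every local GPU paired with every remote GPU is an assumption.

```python
import random


def ring_neighbors(ranks):
    """Return {rank: (left, right)} for a ring laid out in the given order."""
    n = len(ranks)
    return {ranks[i]: (ranks[(i - 1) % n], ranks[(i + 1) % n]) for i in range(n)}


def ordered_ring(num_ranks):
    """Ordered ring: rank i talks to ranks i-1 and i+1 (with wraparound)."""
    return ring_neighbors(list(range(num_ranks)))


def random_ring(num_ranks, seed=0):
    """Random ring: same ring pattern over a shuffled rank order.

    The seed parameter is a hypothetical convenience for repeatable runs.
    """
    ranks = list(range(num_ranks))
    random.Random(seed).shuffle(ranks)
    return ring_neighbors(ranks)


def gpu_pairs_same_index(gpus_per_node):
    """Pairwise GPU test: local GPUn is paired with remote GPUn."""
    return [(g, g) for g in range(gpus_per_node)]


def gpu_pairs_all(gpus_per_node):
    """Neighbor GPU test (assumed reading): every local GPU with every remote GPU."""
    return [(a, b) for a in range(gpus_per_node) for b in range(gpus_per_node)]
```

An ordered ring mostly exercises links between adjacent ranks, while a random ring spreads traffic across the fabric, which is why the two are usually reported separately.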
Main Features:
- Runs multiple benchmarks as a single MPI job
- Automatic results reporting and visualization, ready for analysis
Differentiators:
- Per node/rack reports
Example: Pairwise Latency Matrix
- Latency between node pairs
- Raw data from a subset of Summit nodes (TOP500 #1 cluster)
- Data includes switch and cable latency
- Topology is clearly visible
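As a rough illustration of why topology shows up in such a matrix, the Python sketch below builds a symmetric latency matrix from per-pair measurements and groups nodes whose mutual latency falls under a threshold. The function names, the threshold, and the sample values are all hypothetical; this is not ClusterKit's report format and the numbers are not Summit data.

```python
def build_latency_matrix(nodes, samples):
    """Build a symmetric matrix from {(src, dst): latency_usec} samples."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0.0] * n for _ in range(n)]
    for (src, dst), usec in samples.items():
        i, j = index[src], index[dst]
        matrix[i][j] = usec
        matrix[j][i] = usec  # assumption: latency is symmetric per pair
    return matrix


def group_by_latency(nodes, matrix, threshold_usec):
    """Group nodes linked by pairs whose latency is below the threshold."""
    groups = []
    unseen = list(range(len(nodes)))
    while unseen:
        stack = [unseen.pop(0)]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            near = [j for j in unseen if matrix[i][j] < threshold_usec]
            for j in near:
                unseen.remove(j)
            stack.extend(near)
        groups.append(sorted(nodes[i] for i in group))
    return groups


# Hypothetical values: n0/n1 and n2/n3 are assumed to share a switch each.
nodes = ["n0", "n1", "n2", "n3"]
samples = {
    ("n0", "n1"): 1.0, ("n2", "n3"): 1.0,
    ("n0", "n2"): 1.6, ("n0", "n3"): 1.6,
    ("n1", "n2"): 1.6, ("n1", "n3"): 1.6,
}
matrix = build_latency_matrix(nodes, samples)
groups = group_by_latency(nodes, matrix, threshold_usec=1.3)
```

Because each measurement includes switch and cable latency, pairs that cross more switch hops tend to show higher values, so low-latency groups like these typically line up with nodes attached to the same switch.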
Interested in seeing more?
Contact us today and ask for a live demo