Apache Hadoop is an open-source suite of software tools and frameworks for processing and analyzing large data sets across clusters of machines, each with its own local processing and storage. It pairs a distributed file system with simple programming models such as MapReduce, and workloads can scale from a single server to thousands. Rather than relying on specialized fault-tolerant hardware, the framework is designed to detect and handle server failures at the application layer, on the assumption that failures in a large cluster are routine.
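To illustrate the "simple programming model" mentioned above, here is a minimal sketch of the MapReduce pattern that Hadoop popularized, simulated locally in plain Python. No Hadoop cluster is involved; the function names (`mapper`, `reducer`, `run_job`) and the word-count task are illustrative choices, not part of any Hadoop API. On a real cluster, the map, shuffle/sort, and reduce phases run in parallel across many machines.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line.
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce phase: combine all counts seen for a single word.
    return (word, sum(counts))

def run_job(lines):
    # Shuffle/sort: collect intermediate pairs and group them by key,
    # mimicking what the framework does between the map and reduce phases.
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )

counts = run_job(["big data big clusters", "data at scale"])
```

The appeal of the model is that the programmer writes only the two small, stateless functions; the framework handles distribution, scheduling, and recovery from failed workers.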