Hadoop in the cloud

Leverage big data analytics easily and cost-effectively with IBM InfoSphere BigInsights

Published August 2014

One of the hottest technologies in the big data space is Apache Hadoop, an open source software framework used to reliably manage large volumes of data. Designed to scale from a single server to thousands of machines with a high degree of fault tolerance, Hadoop enables organizations to extract valuable insight from large volumes of structured, unstructured and semi-structured data.
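To make the processing model concrete, the classic illustration is word count expressed as map and reduce steps. The sketch below mimics the style of Hadoop Streaming, a standard Hadoop utility that lets any program reading stdin and writing stdout serve as a map or reduce task; here the two steps are plain Python functions so the logic can be run without a cluster (the sample input is, of course, illustrative).

```python
from itertools import groupby

def mapper(lines):
    """The 'map' step: emit a (word, 1) pair for every word seen."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """The 'reduce' step: sum the counts for each word.
    Input must be grouped/sorted by key, which on a real cluster
    the Hadoop shuffle phase guarantees."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Hypothetical sample input standing in for files in HDFS.
    lines = ["big data needs big clusters", "data at scale"]
    counts = dict(reducer(sorted(mapper(lines))))
    print(counts)  # e.g. 'big' -> 2, 'data' -> 2
```

On a real cluster, Hadoop runs many mapper instances in parallel across the nodes holding the data, then shuffles and sorts the intermediate pairs before the reducers run, which is what lets the same simple logic scale to very large inputs.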

Large upfront investments, concerns about flexibility, and the challenges of evaluating the technology and developing Hadoop skills often prevent organizations from adopting and deploying Hadoop across the enterprise. Hadoop can also be impractical for occasional, high-impact projects that do not require continuous processing.

There is good news, though: cloud computing can overcome these barriers. The cloud model of paying only for the resources you need - and only when you need them - supports experimentation and evaluation and is ideal for building skills. It is also a great fit for short-term or occasional projects where investing in a dedicated cluster would be cost prohibitive.