Challenges

Promise of Big Data

Big Data systems such as NoSQL stores and Hadoop are seeing tremendous growth, and companies are adopting them with the promise of:

  • Storing larger quantities of data
    The ability to scale data beyond what fits on a single server. With petabytes of data generated by sensors, web traffic, and social media interactions, storing more than a year of data without a Big Data system is impractical.
  • Cost-effective storage of data
    The ability to cheaply store data redundantly for potential analysis.
  • Availability among companies of various sizes
    Elastic and cloud computing give organizations of all sizes access to the massive computational resources necessary for handling Big Data.
  • Advanced analytic capabilities
    A rich set of processing tools for complex, comprehensive analysis. Understanding complex networks, such as those in social media interactions, requires intensive data storage and processing at scale.

Big Data systems have delivered:
  • Massive, petabyte-scale systems
  • Inexpensive, reliable storage of massive amounts of data
  • Reliable, fault-tolerant, scalable processing

However, analytic solutions built on Big Data, which the business could use to understand its data, customers, and operations, have fallen well short of expectations. Building analytics on top of Big Data poses several key challenges:

  • Slow Queries
    Big Data systems process huge amounts of data; processing a petabyte of data in 5 minutes is fantastic. However, dashboard and report users expect results in seconds; a minute goes by and they’re off to get a cup of coffee.
  • Expensive Report Authoring
    Building charts and reports on Big Data systems typically requires expensive developers to write and run programs that extract the data.
  • IT Backlog
    With developers building reports and charts, changes to formatting, filters, or slight adjustments to reports can take weeks to get prioritized alongside existing IT projects.
  • Little to no interactivity
    Big Data reporting applications aren’t designed to be rerun with different parameters and data ranges. Easy, quick, AdHoc access to the data, where users can create their own filters and customizations, is crucial to supporting business decision making.
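
The gap behind the Slow Queries challenge can be made concrete with some back-of-the-envelope arithmetic. The sketch below takes the figure above (a petabyte processed in 5 minutes) and asks how much data the same system could scan inside a dashboard-style response time. The 2-second budget, the decimal units, and the function names are assumptions chosen for illustration, not measurements of any particular system.

```python
# Back-of-the-envelope latency arithmetic (decimal units: 1 PB = 10**15 bytes).
PETABYTE = 10**15
TERABYTE = 10**12


def scan_throughput(bytes_scanned: int, seconds: float) -> float:
    """Bytes per second needed to scan `bytes_scanned` in `seconds`."""
    return bytes_scanned / seconds


def scannable_bytes(throughput: float, budget_seconds: float) -> float:
    """Bytes that can be scanned within a latency budget at a given throughput."""
    return throughput * budget_seconds


# The batch job from the text: one petabyte in five minutes.
batch_throughput = scan_throughput(PETABYTE, 5 * 60)

# Assumed interactive budget: "results in seconds" taken here as 2 seconds.
dashboard_budget = 2.0
interactive_bytes = scannable_bytes(batch_throughput, dashboard_budget)

print(f"Batch throughput: {batch_throughput / TERABYTE:.2f} TB/s")
print(f"Scannable in {dashboard_budget:.0f} s: {interactive_bytes / TERABYTE:.2f} TB")
print(f"Fraction of the petabyte: {interactive_bytes / PETABYTE:.2%}")
```

Under these assumptions the batch job sustains roughly 3.3 TB/s, yet a 2-second budget covers less than 1% of the petabyte. Raw scan speed alone therefore does not make a full-scan report interactive.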

Big Data has solved the INFRASTRUCTURE problem of raw/core data storage but, to date, has delivered little of what business users need.

Current approaches to these Big Data challenges involve:
  • Writing lots of code when building BI (especially AdHoc) directly on top of NoSQL systems.
  • Expensive and complicated ETL development to extract data and push it into a conventional data warehouse, which enables ONLY historical analysis with no live access.
  • Building layers that connect BI tools directly to the Big Data systems, which gives direct access to Big Data but produces SLOW reports.
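
To illustrate the first approach, the sketch below answers one report question two ways: with hand-written filter, group, and sum logic over raw documents, and with a single declarative query of the kind a BI tool would generate against a warehouse. The order records, field names, and function names are hypothetical, and an in-memory SQLite table merely stands in for a data warehouse; this is not the API of any specific NoSQL system.

```python
import sqlite3
from collections import defaultdict

# Hypothetical order documents, as they might come back from a document store.
orders = [
    {"region": "EMEA", "status": "shipped", "amount": 120.0},
    {"region": "EMEA", "status": "returned", "amount": 40.0},
    {"region": "APAC", "status": "shipped", "amount": 75.5},
    {"region": "EMEA", "status": "shipped", "amount": 30.0},
]


def revenue_by_region(docs, status):
    """Hand-coded filter + group + sum over raw documents."""
    totals = defaultdict(float)
    for doc in docs:
        if doc.get("status") == status:
            totals[doc["region"]] += doc["amount"]
    return dict(sorted(totals.items()))


def revenue_by_region_sql(docs, status):
    """The same question as one declarative query (SQLite stands in for a warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, status TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (:region, :status, :amount)", docs)
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders "
        "WHERE status = ? GROUP BY region ORDER BY region",
        (status,),
    ).fetchall()
    conn.close()
    return dict(rows)


hand_coded = revenue_by_region(orders, "shipped")
declarative = revenue_by_region_sql(orders, "shipped")
print(hand_coded)
print(declarative)
```

Both functions return the same totals. The difference is that every new filter or grouping in the hand-coded version means another function for a developer to write, while the declarative version needs only a different query string, which is what makes AdHoc reporting practical.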