Our Product

Open source. Built for developers.

Cask Data Application Platform

The Cask Data Application Platform (CDAP) is an open source, integrated platform that developers and organizations use to build, deploy, and manage data applications.



    Streams for data ingestion
  • Supports Kafka, Flume, REST, and custom-implemented protocols
  • Time-stamped, ordered and horizontally scalable
    Reusable libraries for common Big Data access patterns
  • Secondary indexes, Time Series, Key-Value, Objects, Geospatial, OLAP Cube and more
  • Libraries expose each pattern as RPCs, Batch Scans, and SQL Tables
    Data available to multiple applications and different paradigms
  • Unified batch and real-time processing with the same data used concurrently by MapReduce, Hive, Spark, Flows and more
  • Expose data as REST services to quickly enable data as a service


  • Simplify data ingestion and Extract Transform Load (ETL) to accelerate time to value
  • Maximize the value of data by making it easy to find and easy to explore through multiple query methods
  • Protect the data through security, audit, lineage, and reporting



    Framework level guarantees
  • Integrated transactions mean applications aren’t required to be idempotent
  • Ingestion capabilities and processing engines provide partitioning, ordering and exactly-once execution
    Full development lifecycle and production deployment
  • Portable and scalable from laptop to cluster with support for testing and continuous integration
  • Logging, metrics, security, and management with low developer overhead
    Standardization of applications across programming paradigms
  • Take advantage of Spark, Cascading, Hive, and similar systems through their user APIs without worrying about the details of how to integrate with each one
  • Real-time and batch applications can be packaged, deployed, and managed together
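
To illustrate the first guarantee above (transactions removing the need for idempotent application code), here is a minimal Python sketch. It is not the CDAP API: `TransactionalStore`, `run_in_transaction`, and `make_flaky_increment` are hypothetical names, and the store is a toy in-memory dictionary. The sketch only shows the general idea that writes from a failed attempt are discarded, so a retried increment is counted once.

```python
class TransactionalStore:
    """Toy in-memory key-value store with all-or-nothing commits (hypothetical, not CDAP)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=0):
        return self._data.get(key, default)

    def run_in_transaction(self, operation, retries=3):
        """Run operation(table) on a private copy; commit only if it does not raise."""
        last_error = RuntimeError("no attempts were made")
        for _ in range(retries):
            working = dict(self._data)  # stage writes on a private copy
            try:
                operation(working)
            except Exception as err:
                last_error = err        # staged writes are dropped, then retry
                continue
            self._data = working        # commit all staged writes at once
            return
        raise last_error


def make_flaky_increment(key, fail_times):
    """Build a non-idempotent increment that fails after writing, fail_times times."""
    state = {"failures_left": fail_times}

    def increment(table):
        table[key] = table.get(key, 0) + 1  # not idempotent: every run adds 1
        if state["failures_left"] > 0:
            state["failures_left"] -= 1
            raise RuntimeError("simulated failure after write")

    return increment


store = TransactionalStore()
store.run_in_transaction(make_flaky_increment("page_views", fail_times=2))
print(store.get("page_views"))  # 1: the two failed attempts left no trace
```

Without the transactional wrapper, the same retried increment would leave the counter at 3, which is why non-transactional pipelines usually require idempotent operations.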


  • Developers can build a broader range of apps by focusing on business logic instead of writing integration code or building core system services
  • Speeds up the path from development through testing to production deployment
  • Take advantage of new technology with less need for training and expertise

Use Cases

Extract Transform Load (ETL)

ETL is often a tedious and complex task, but it is a critical first step for organizations seeking to gain value from their data. CDAP can help, from day one to data lake.


Unified real-time and batch processing

Many Big Data solutions demand that insights from retrospective data be applied to real-time streams of data, but batch and real-time systems are often separate. CDAP enables developers to unify batch and real-time processing to achieve better business results.
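
As a rough illustration of the pattern described above, the following Python sketch computes a per-sensor baseline from historical records (the batch step) and then applies it to incoming events (the real-time step). It does not use CDAP; the function names, the sample data, and the 1.5x threshold are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def compute_baselines(history):
    """Batch step: average historical value per sensor."""
    grouped = defaultdict(list)
    for sensor, value in history:
        grouped[sensor].append(value)
    return {sensor: mean(values) for sensor, values in grouped.items()}


def flag_anomalies(events, baselines, threshold=1.5):
    """Real-time step: yield events whose value exceeds threshold x baseline."""
    for sensor, value in events:
        baseline = baselines.get(sensor)
        # Sensors with no history are skipped rather than flagged.
        if baseline is not None and value > threshold * baseline:
            yield sensor, value


# Retrospective data, processed in bulk.
history = [("a", 10), ("a", 12), ("b", 100), ("b", 110)]
baselines = compute_baselines(history)

# Incoming events, processed one at a time against the batch result.
live = [("a", 11), ("a", 30), ("b", 120), ("c", 5)]
alerts = list(flag_anomalies(live, baselines))
print(alerts)  # [('a', 30)]
```

In separate systems, the baselines would have to be exported from the batch side and loaded into the streaming side; the point of a unified platform is that both steps read the same data.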

Getting Started Options
  • Download CDAP Standalone

    Develop and deploy Hadoop data and apps with a fully functional CDAP environment designed to run on your laptop.
    Native: requires Java and Node.js (Mac, Linux)
    VM: requires VirtualBox (Windows, Mac, Linux)

  • Download CDAP Distributed

    Run the fully distributed, highly available version of CDAP, based on YARN, HDFS, and HBase, for full-scale testing and production.
    Requires Hadoop 2.0+ (Linux)

  • Spin up CDAP in the cloud

    Use your own cloud credentials to create a CDAP environment with a single click using Coopr Cloud (http://coo.pr).
    Requires only your browser