Our Products

Open source. Built for developers.


Cask Data Application Platform

Virtualization for Hadoop data and apps.

CDAP is an open source application development platform for the Hadoop ecosystem that provides developers with data and application virtualization to accelerate application development, address a broader range of real-time and batch use cases, and deploy applications into production while satisfying enterprise requirements.

Data Virtualization

Logical representations of physical data as CDAP Datasets within the CDAP Runtime Environment


    Streams for data ingestion
  • Supports Kafka, Flume, REST, and custom-implemented protocols
  • Time-stamped, ordered and horizontally scalable
    Reusable libraries for common Big Data access patterns
  • Secondary indexes, Time Series, Key-Value, Objects, Geospatial, OLAP Cube and more
  • Libraries expose each pattern as RPCs, Batch Scans, and SQL Tables
    Data available to multiple applications and different paradigms
  • Unified batch and real-time processing with the same data used concurrently by MapReduce, Hive, Spark, Flows and more
  • Expose data as REST services to quickly enable data as a service
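
To make the idea of one logical dataset served through several access patterns more concrete, here is a minimal in-memory sketch. It is not CDAP code: `IndexedTable` and all of its methods are hypothetical names invented for illustration, and a real dataset would sit on distributed storage rather than a Python dict.

```python
from collections import defaultdict


class IndexedTable:
    """Hypothetical in-memory table with one secondary index (not a CDAP class)."""

    def __init__(self, index_field):
        self.index_field = index_field
        self.rows = {}                    # primary key -> row dict
        self.index = defaultdict(set)     # indexed value -> set of primary keys

    def put(self, key, row):
        # Drop the stale index entry if this key is being overwritten.
        old = self.rows.get(key)
        if old is not None:
            self.index[old[self.index_field]].discard(key)
        self.rows[key] = dict(row)
        self.index[row[self.index_field]].add(key)

    def get(self, key):
        """Point lookup, the kind of call an RPC endpoint would serve."""
        return self.rows.get(key)

    def get_by_index(self, value):
        """Secondary-index lookup: all rows whose indexed field equals value."""
        return [self.rows[k] for k in sorted(self.index.get(value, ()))]

    def scan(self, start=None, stop=None):
        """Batch scan over primary keys in sorted order, range [start, stop)."""
        for key in sorted(self.rows):
            if (start is None or key >= start) and (stop is None or key < stop):
                yield key, self.rows[key]


table = IndexedTable(index_field="city")
table.put("u1", {"name": "Ada", "city": "London"})
table.put("u2", {"name": "Lin", "city": "Paris"})
table.put("u3", {"name": "Sam", "city": "London"})
```

The same stored rows answer a point lookup (`table.get("u2")`), an index query (`table.get_by_index("London")`), and a range scan (`table.scan(start="u2")`), which is the sense in which a library can expose one pattern to several kinds of consumers.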


  • Simplify data ingestion and Extract Transform Load (ETL) to accelerate time to value
  • Maximize value of data by making it easy to find and easy to explore through multiple query methods
  • Protect the data through security, audit, lineage, and reporting

App Virtualization

Applications deployed as CDAP Containers within the CDAP Runtime Environment


    Framework level guarantees
  • Integrated transactions mean applications aren’t required to be idempotent
  • Ingestion capabilities and processing engines provide partitioning, ordering and exactly-once execution
    Full development lifecycle and production deployment
  • Portable and scalable from laptop to cluster with support for testing and continuous integration
  • Logging, metrics, security, and management with low developer overhead
    Standardization of applications across programming paradigms
  • Take advantage of Spark, Cascading, Hive, etc. and their User APIs without worrying about the details of how to integrate with each system
  • Real-time and batch applications can be packaged, deployed, and managed together
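
The claim that integrated transactions remove the need for idempotent application code can be illustrated with a small sketch. It assumes a toy, single-process store in which the read offset and the derived state commit together; `TransactionalStore` and `process_next` are hypothetical names, not part of any CDAP API.

```python
import copy


class TransactionalStore:
    """Hypothetical store: changes made inside run() commit all-or-nothing."""

    def __init__(self):
        self.state = {"offset": 0, "counts": {}}

    def run(self, work):
        draft = copy.deepcopy(self.state)  # work on a private copy
        work(draft)                        # if this raises, nothing is committed
        self.state = draft                 # commit: swap in all changes at once


def process_next(store, events, fail=False):
    """Count the next unread event and advance the offset in one transaction."""
    def work(draft):
        event = events[draft["offset"]]
        draft["counts"][event] = draft["counts"].get(event, 0) + 1
        if fail:
            raise RuntimeError("simulated crash before commit")
        draft["offset"] += 1
    store.run(work)


events = ["click", "view", "click"]
store = TransactionalStore()

process_next(store, events)                 # commits the first event
try:
    process_next(store, events, fail=True)  # counted in the draft, then crashes
except RuntimeError:
    pass
process_next(store, events)                 # retry of the same event
process_next(store, events)                 # last event
```

The counting logic is a plain increment, which is not idempotent, yet the retried event is counted only once because the failed attempt's increment was discarded together with its offset update.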


  • Developers can build a broader range of apps focusing on business logic, not writing integration code or building core system services
  • Shortens the path from development through testing to production deployment
  • Take advantage of new technology with less need for training and expertise

CDAP Architecture


Provides a single point of access for data, app, service, and management APIs with integrated discovery, load balancing, and horizontal scalability

Transaction engine

Enables ACID properties for data operations from within any program container, real-time or batch

Runtime services

Includes services for apps and data like security, discovery, and management throughout the app and data lifecycles

Use Cases

Extract Transform Load (ETL)

ETL is often a tedious and complex task, but it is a critical first step for organizations seeking to gain value from their data. CDAP can help, from day one to data lake.


Unified real-time and batch processing

Many Big Data solutions demand that insights from retrospective data be applied to real-time streams of data, but the batch and real-time systems involved are often separate. CDAP enables developers to unify batch and real-time processing to achieve better business results.
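
As a rough illustration of applying retrospective insight to a live stream, the sketch below has a batch function and a per-event function share the same data. All names, the sample data, and the threshold `factor` are assumptions made for the example; it runs in one process and does not use any CDAP API.

```python
from statistics import mean

# Shared, hypothetical "dataset": both paths read and write the same structures.
history = {"alice": [20.0, 25.0, 22.0], "bob": [200.0, 180.0]}
baselines = {}


def batch_recompute_baselines():
    """Batch path: scan all retrospective data and store per-user averages."""
    for user, amounts in history.items():
        baselines[user] = mean(amounts)


def on_event(user, amount, factor=3.0):
    """Real-time path: score one event against the batch baseline, then record it."""
    baseline = baselines.get(user)
    unusual = baseline is not None and amount > factor * baseline
    history.setdefault(user, []).append(amount)  # visible to the next batch run
    return unusual


batch_recompute_baselines()
flag_small = on_event("alice", 30.0)    # close to the baseline
flag_large = on_event("alice", 500.0)   # far above the baseline
```

Because both paths use the same `history`, the next call to `batch_recompute_baselines()` picks up the events the real-time path just recorded, with no copy step between two separate systems.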

Product Options: CDAP Free, CDAP Cloud, CDAP Enterprise

Based 100% on open source CDAP under Apache license

Pricing model
  • CDAP Free: Free/unlimited use
  • CDAP Cloud: Annual subscription based on node count
  • CDAP Enterprise: Annual subscription based on node count


Includes Big Flow streaming
(Tigon in future release)

Full support for data virtualization

Full support for app virtualization

Services and tools include:
Logging, metrics, security, transaction support

Included support
  • CDAP Free: Community support
  • CDAP Cloud: 24×7 support portal and knowledge base; 8×5 live support with 1 day response
  • CDAP Enterprise: 24×7 support portal and knowledge base; 8×5 live support with 4 hour response; custom support options

Updates: Maintenance releases, patches
  • CDAP Cloud: Updates installed by Cask
  • CDAP Enterprise: Updates installed by Cask


Getting Started options
  • Download CDAP Standalone

    Develop and deploy Hadoop data and apps with a fully functional CDAP environment designed to run on your laptop.
    Native: requires Java and Node.js (Mac, Linux)
    VM: requires VirtualBox (Windows, Mac, Linux)

  • Download CDAP Distributed

    Run the fully distributed, highly available YARN, HDFS, and HBase based version of CDAP for full scale testing and production.
    Requires Hadoop 2.0+ (Linux)

  • Spin up CDAP in the cloud

    Use your own cloud credentials to create a CDAP environment with a single click using Coopr Cloud (http://coo.pr).
    Requires only your browser