The Cask Data Application Platform (CDAP) is an open-source framework for rapidly delivering solutions on Hadoop. It integrates and abstracts the underlying Hadoop technologies to provide a simple, consistent platform for building, deploying, and managing complex data analytics applications in the cloud or on-premises.

CDAP Provides Containers on Hadoop

CDAP provides a container architecture for your data and applications on Hadoop. Simplified abstractions and deep integrations with diverse Hadoop technologies increase developer productivity and application quality, which shortens development time and gets your Hadoop projects into production faster.

Data Containers

CDAP Datasets provide a standardized, logical container and runtime framework for data in varied storage engines. They integrate with other systems so that data is immediately accessible, and they allow the creation of complex, reusable data patterns.

Program Containers

CDAP Programs provide a standardized, logical container and runtime framework for compute in varied processing engines. They simplify testing and operations with standard lifecycle and operational APIs, and they can interact consistently with any data container.

Application Containers

CDAP Applications provide a standardized packaging system and runtime framework for Datasets and Programs. They manage the lifecycle of data and applications, and they simplify the otherwise painful integration and operations work across heterogeneous infrastructure.
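
The relationship between the three container types can be illustrated with a small conceptual model. The sketch below is plain Python and does not use or reproduce CDAP's actual API; the names InMemoryDataset, Program, Application, and count_words are hypothetical and exist only for this illustration. It assumes a dataset is a named key-value store, a program is a named unit of compute with a simple lifecycle state, and an application is the package that wires the two together.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class InMemoryDataset:
    """Hypothetical data container: hides the storage engine behind read/write."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._rows: Dict[str, int] = {}  # stand-in for a real storage engine

    def read(self, key: str) -> int:
        return self._rows.get(key, 0)

    def write(self, key: str, value: int) -> None:
        self._rows[key] = value


@dataclass
class Program:
    """Hypothetical program container: named compute with a lifecycle state."""

    name: str
    logic: Callable[[Dict[str, InMemoryDataset], List[str]], None]
    state: str = "STOPPED"

    def run(self, datasets: Dict[str, InMemoryDataset], events: List[str]) -> None:
        self.state = "RUNNING"
        try:
            self.logic(datasets, events)
        finally:
            self.state = "STOPPED"  # always return to a known state


@dataclass
class Application:
    """Hypothetical application container: packages datasets and programs."""

    name: str
    datasets: Dict[str, InMemoryDataset] = field(default_factory=dict)
    programs: Dict[str, Program] = field(default_factory=dict)

    def add_dataset(self, name: str) -> None:
        self.datasets[name] = InMemoryDataset(name)

    def add_program(self, program: Program) -> None:
        self.programs[program.name] = program

    def run(self, program_name: str, events: List[str]) -> None:
        # Every program sees the same datasets through the same interface.
        self.programs[program_name].run(self.datasets, events)


def count_words(datasets: Dict[str, InMemoryDataset], events: List[str]) -> None:
    """Business logic only: no storage- or engine-specific calls."""
    counts = datasets["wordCounts"]
    for line in events:
        for word in line.split():
            counts.write(word, counts.read(word) + 1)


app = Application("WordCount")
app.add_dataset("wordCounts")
app.add_program(Program("counter", count_words))
app.run("counter", ["hello hadoop", "hello cdap"])
print(app.datasets["wordCounts"].read("hello"))  # prints 2
```

In this model, swapping the dictionary inside InMemoryDataset for another storage backend would not require any change to count_words, which is the kind of separation between business logic and infrastructure that the container architecture aims for.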

CDAP Benefits


CDAP provides a higher-level, integrated framework that frees developers and operations teams from learning, integrating, and managing each open-source project individually. Applications built on CDAP separate business logic from infrastructure APIs, which reduces complexity and total cost of ownership.


CDAP enables developers to get started quickly with built-in data ingestion, exploration, and transformation capabilities, available through a rich user interface and an interactive shell. Reusable abstractions expose simple APIs that let developers build data-centric applications quickly and move them into production.


CDAP makes all data in Hadoop available for real-time access, batch processing, and ad-hoc SQL analysis without the need to write code, manage metadata, or copy any data. Scale-out, high-throughput real-time ingestion and transactional event processing, both of which maintain data consistency, enable new use cases.