Integrated, code-free CDAP extension for data discovery, metadata tracking, data lineage, and usage analytics.


Store all metadata about your Data Lake.


Quickly and easily locate the data you need.


Improve data quality through lineage analysis.


Learn how datasets are being used and which applications access them most.

Cask Data Application Platform (CDAP) 4 is now in Preview! It includes Data Dictionary, part of Cask Tracker, for detailed information about the columns within your data.

Cask Tracker is a self-service CDAP extension that gives users visibility into how data flows into, within, and out of a Data Lake. It lets them perform impact and root cause analysis, and provides an audit trail for compliance. It enables IT to oversee changes while delivering trusted, secure data in a complex Data Lake environment. Tracker provides access to structured information that describes, explains, and locates datasets, making them easier to retrieve, use, and manage.

Features & Benefits

Harvest, Index and Track Datasets

  • Immediate, seamless capture of technical, business, and operational metadata, enabling faster and better traceability of all datasets
  • Quick, reliable, and accurate indexing of technical, business, and operational metadata to easily locate datasets
  • Use lineage to understand the impact of changing a dataset on other datasets, processing, and queries
  • Track flow of data across enterprise systems and data lakes, no matter which process or application is moving or transforming your data
  • Trusted and complete metadata on datasets provides easy traceability to resolve any data issues and improve data quality

Support Standardization, Governance and Compliance Needs

  • Provides IT with the traceability needed to govern datasets, and easily applies compliance rules through seamless integration with other extensions
  • Consistent definitions of metadata containing information about data to reconcile differences in terminology
  • Empowers business users to understand the lineage of business-critical data

Blend Metadata Analytics and Integrations

  • Gain deep insights into how your datasets are being created, accessed, and processed with built-in usage analytics capabilities
  • Valuable multi-dimensional usage analytics to understand complex interactions between users, applications, and datasets
  • Deep, extensible integrations with enterprise-grade MDM systems such as Cloudera Navigator to centralize the metadata repository and deliver accurate, complete data to all

CDAP accelerates time to value from Hadoop through standardized APIs, configurable templates, and visual interfaces, and it increases efficiencies through reusable and portable components. CDAP removes barriers to innovation as an extensible and future-proof platform that provides consistency across environments and easily integrates with existing MDM, BI, and security solutions.

Learn More

Cask Hydrator, powered by CDAP, is a code-free visual application for building complex data pipelines and managing them on your Data Lake. With Cask Hydrator, you can ingest data from varied sources and formats (CSV, XML, Excel, and others), cleanse, normalize, and transform data, build machine learning models on the fly, perform aggregations, run custom scripts, and more.

Learn More

Want to see Tracker in action? Request a demo.