Integrated, code-free CDAP extension for data discovery, metadata tracking, data lineage, and usage analytics.
Store all metadata about your Data Lake.
Quickly and easily locate the data you need.
Improve data quality through lineage analysis.
Learn how datasets are being used and which applications access them most.
Cask Tracker is a self-service CDAP extension that gives users visibility into how data flows into, within, and out of a Data Lake. It lets them perform impact and root cause analysis, and provides an audit trail for compliance. It enables IT to oversee changes while delivering trusted, secured data in a complex Data Lake environment. Tracker provides access to structured information that describes, explains, and locates datasets, making them easier to retrieve, use, and manage.
Features & Benefits
Harvest, Index and Track Datasets
- Immediate, timely, and seamless capture of technical, business, and operational metadata enabling faster and better traceability of all datasets
- Quick, reliable, and accurate indexing of technical, business, and operational metadata to easily locate datasets
- Use lineage to understand the impact of changing a dataset on other datasets, processing, and queries
- Track flow of data across enterprise systems and data lakes, no matter which process or application is moving or transforming your data
- Trusted and complete metadata on datasets provides easy traceability to resolve any data issues and improve data quality
Support Standardization, Governance and Compliance Needs
- Provide IT with the traceability needed to govern datasets, and easily apply compliance rules through seamless integration with other extensions
- Consistent metadata definitions to reconcile differences in terminology
- Empower business users to understand the lineage of business-critical data
- A data dictionary allows users to define and describe columns that apply across all datasets in a namespace, enforcing a common naming convention and type and indicating whether a column contains PII
Blend Metadata Analytics and Integrations
- Gain deep insights into how your datasets are being created, accessed, and processed with built-in usage analytics capabilities
- Valuable multi-dimensional usage analytics to understand complex interactions between users, applications, and datasets
- Deep, extensible integrations with enterprise-grade MDM systems such as Cloudera Navigator to centralize the metadata repository and deliver accurate, complete, and correct data to all
Other Cask Products
Cask Hydrator, powered by CDAP, is a data ingestion service that simplifies and automates the difficult and time-consuming task of building, running, and managing data pipelines. The studio allows you to drag and drop various sources, transforms, analytics, sinks, and actions.
Cask Wrangler, powered by CDAP, provides an easy and interactive way to visualize, transform, and cleanse data. It helps data scientists and data engineers derive new schemas and operationalize the data preparation with a few clicks.
CDAP accelerates time to value from Hadoop through standardized APIs, configurable templates, and visual interfaces, and it increases efficiencies through reusable and portable components. CDAP removes barriers to innovation as an extensible and future-proof platform that provides consistency across environments and easily integrates with existing MDM, BI, and security solutions.
Cask Market is Cask’s “Big Data App Store” with push button deployment for applications, use cases, data pipelines, sample datapacks, and plugins from within CDAP. It provides step-by-step wizards to help configure and deploy new entities within the platform.