The cost of maintaining a traditional Enterprise Data Warehouse (EDW) is skyrocketing as legacy systems buckle under the weight of exponentially growing data and increasingly complex processing needs. Hadoop offers massive horizontal scalability along with system-level services that allow developers to free up storage and processing power from premium EDW platforms, but it is not a complete ETL (Extract-Transform-Load) solution. In most cases, the gaps must be filled with large amounts of complex, hand-written code, which slows Hadoop adoption and frustrates organizations eager to deliver results.
Cask’s EDW Offload solution, powered by CDAP and available in Cask Market, fills the gaps between Hadoop and ETL needs. It comes with pre-built pipelines that can be edited in a studio environment consisting of drag-and-drop sources, transforms, analytics, sinks, and actions. The Cask Change Data Capture (CDC) solution, a pre-built solution that uses Spark Streaming for real-time data integration, makes extracting data from the source systems fast and efficient.
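To make the idea of change data capture concrete, here is a minimal sketch in plain Python of what applying captured changes to an offloaded copy of a table can look like. It is an illustration only, not the Cask CDC solution or any CDAP API: the `ChangeEvent` shape and the `apply_changes` function are hypothetical names chosen for this example, and a real deployment would read the events from a stream rather than a list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change from a source table (hypothetical shape)."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row values; None for deletes


def apply_changes(target: Dict[int, dict],
                  events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Apply change events in order and return an updated copy of the target."""
    result = dict(target)  # leave the caller's table untouched
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row")
            result[event.key] = event.row
        elif event.op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            result.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return result


if __name__ == "__main__":
    offloaded_copy = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
    captured = [
        ChangeEvent("update", 1, {"name": "Ada L."}),
        ChangeEvent("delete", 2),
        ChangeEvent("insert", 3, {"name": "Cy"}),
    ]
    print(apply_changes(offloaded_copy, captured))
```

Because only the changed rows travel from the source system, the copy in Hadoop can stay current without repeated full extracts, which is the efficiency the CDC approach is aiming for.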
After data and workloads have been migrated to Hadoop, CDAP provides security and governance across your datasets, including access controls that meet the typical security needs of the enterprise.
Rapid Time to Value
Pre-configured pipelines in Cask Hydrator spare developers the complex, time-consuming work of writing custom data pipelines from scratch, enabling quick deployment of EDW offload solutions.
Cask Hydrator provides a graphical drag-and-drop interface for building pipelines from the EDW into Hadoop.
Cask Tracker provides metadata auditing, which simplifies tracking data flows and makes it easier to retrieve, use, and manage datasets.