Cask Data Application Platform (CDAP)

Four steps to Analytics on Hadoop

Step 1

Start CDAP

Unzip the downloaded SDK ZIP or start the downloaded VM. For other artifacts, please visit our site.

Change directory
cd <install-dir>
Start CDAP Standalone
bin/ start
Go to the CDAP UI

Step 2

Ingest Data

Start by creating a Stream.

bin/ "create stream logEventStream"

Then, load some data into the Stream:

bin/ "load stream logEventStream examples/resources/accesslog.txt"
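
Before loading, it can help to glance at what the sample file contains. The sketch below is not part of CDAP: `count_status_codes` is a hypothetical helper, and it assumes the file is in Apache Common/Combined Log Format, where the HTTP status code is the 9th whitespace-separated field. Run it from `<install-dir>`.

```shell
# Hypothetical helper (not a CDAP command): count requests per HTTP status code.
# Assumption: Apache Common/Combined Log Format, status code in field 9.
count_status_codes() {
  awk '{ counts[$9]++ } END { for (code in counts) print code, counts[code] }' "$1" | sort
}

LOG=examples/resources/accesslog.txt
if [ -f "$LOG" ]; then
  head -n 3 "$LOG"            # preview the first few events
  wc -l < "$LOG"              # number of lines that will be loaded
  count_status_codes "$LOG"   # requests per status code
fi
```

If the status counts look implausible, the file is probably not in the assumed format and the field number needs adjusting.
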

Now that you have loaded data into the Stream, you can run Hive SQL queries to explore the data. You can find more information here.

Step 3

Generate Analytics

Process the ingested weblogs with the ETL Batch Adapter to generate analytics, and store the results in an OLAP Cube Dataset.

bin/ -s examples/resources/weblog-adapter.txt
Find out more about Application Templates and ETL Adapters here. Take a look at the CLI script below for details.

Step 4

Serve Results

The analytics generated in the previous step can be served using an HTTP Service.

bin/ -s examples/resources/weblog-service.txt

Query the Cube results (note: the Workflow triggers every minute):

bin/ -s examples/resources/weblog-query.txt
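
Because the Workflow triggers once a minute, a query issued right after setup may run before any results exist. A minimal sketch of a retry loop follows. `retry_until_output` is a hypothetical helper, not a CDAP command, and it assumes that an empty response means the Workflow has not run yet, which may not match what the query actually prints.

```shell
# Hypothetical helper: rerun a command until it prints something, or give up.
# usage: retry_until_output <attempts> <sleep-seconds> <command> [args...]
retry_until_output() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    out=$("$@" 2>/dev/null)
    if [ -n "$out" ]; then
      printf '%s\n' "$out"   # pass the first non-empty result through
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt remains.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1                   # no output after all attempts
}
```

To use it, pass the query command from this step as the trailing arguments, for example with 5 attempts spaced 30 seconds apart.
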

Try configuring the same pipeline through the CDAP UI.

Now that you have set up CDAP and understand the basics, here is some more information to help you explore it further.