Big Data Analysis with Hadoop Workshop

Event time: 
Thursday, July 28, 2016 - 2:00pm to 3:00pm
Location: 
Yale Institute for Network Science
17 Hillhouse Avenue
New Haven, CT
Event description: 

This recorded online workshop, led by LinkedIn Hadoop developer Jack Dintruff, teaches how to use Hadoop utilities to set up, manage, and analyze large, distributed datasets. We will learn how to work with:

  • HDFS
  • YARN
  • MapReduce
  • Hive
  • Pig

Topics include:

  • setting up and administering clusters
  • ingesting data
  • selecting and aggregating large datasets
  • defining limits, unions, filters, and joins
  • writing custom user-defined functions (UDFs)
  • creating queries and lookups

Recommended for anyone working with big data or interested in parallel computing.