Big Data Analysis with Hadoop Workshop
Event time:
Thursday, July 28, 2016 - 2:00pm to 3:00pm
Location:
Yale Institute for Network Science
17 Hillhouse Avenue
New Haven, CT
Event description:
This is a recorded online workshop in which LinkedIn Hadoop developer Jack Dintruff teaches how to use Hadoop utilities to set up, manage, and analyze large, distributed datasets. We will learn how to work with:
- HDFS
- YARN
- MapReduce
- Hive
- Pig
Topics include:
- setting up and administering clusters
- ingesting data
- selecting and aggregating large datasets
- defining limits, unions, filters, and joins
- writing custom user-defined functions (UDFs)
- creating queries and lookups
Recommended for anyone working with big data or interested in parallel computing.