
COURSE OVERVIEW

    Course Content

    Big Data

    • The problem space and example applications
    • Why don’t traditional approaches scale?
    • Requirements

    Hadoop Background

    • Hadoop History
    • The ecosystem and stack: HDFS, MapReduce, Hive, Pig…
    • Cluster architecture overview

    Development Environment

    • Hadoop distribution and basic commands
    • Eclipse development

    HDFS Introduction

    • The HDFS command line and web interfaces
    • The HDFS Java API (lab)

    MapReduce Introduction

    • Key philosophy: move computation, not data
    • Core concepts: Mappers, reducers, drivers
    • The MapReduce Java API (lab)

    Real-World MapReduce

    • Optimizing with Combiners and Partitioners (lab)
    • More common algorithms: sorting, indexing and searching (lab)
    • Testing with MRUnit

    Higher-level Tools

    • Patterns to abstract “thinking in MapReduce”
    • The Cascading library (lab)
    • The Hive database (lab)