Course Info

DSC 333: Introduction to Big Data Processing

This course will explore different approaches to, and a framework for, performing data analytics on a dynamic, heterogeneous cluster of computing nodes. The course will begin by studying the principles behind MapReduce and the implementation of custom distributed queries using Hadoop. It will then expand to cover higher-level languages and tools within the Hadoop ecosystem (e.g., Pig, Hive) and cluster configuration techniques. Finally, the course will delve into a comparative evaluation of several NoSQL and NewSQL databases that make fundamentally different assumptions about data processing (e.g., OLAP vs. OLTP, disk-bound vs. in-memory, or real-time streaming data). The primary focus of the course will be hands-on implementation and performance tuning for large-scale clusters and data sets.

CSC 355 is a prerequisite for this class.

Fall 2024-2025

Section: 701
Class number: 14949
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00228 at Loop Campus
Instructor:

Spring 2023-2024

Section: 901
Class number: 35328
Meeting time: Th 5:45PM - 9:00PM
Location: CDM 00220 at Loop Campus
Instructor: Ahmed Abid

Winter 2023-2024

Section: 801
Class number: 23101
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00224 at Loop Campus
Instructor: Tanu Malik
CLOSED

Fall 2023-2024

Section: 701
Class number: 13414
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00228 at Loop Campus
Instructor: Tanu Malik

Winter 2022-2023

Section: 801
Class number: 29121
Meeting time: W 5:45PM - 9:00PM
Location: CDM 00218 at Loop Campus

Fall 2022-2023

Section: 701
Class number: 19144
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00228 at Loop Campus

Spring 2021-2022

Section: 901
Class number: 37539
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00226 at Loop Campus

Winter 2021-2022

Section: 801
Class number: 28843
Meeting time: W 5:45PM - 9:00PM
Location: CDM 00218 at Loop Campus

Fall 2021-2022

Section: 701
Class number: 18790
Meeting time: Tu 5:45PM - 9:00PM
Location: CDM 00228 at Loop Campus

Fall 2020-2021

Section: 401
Class number: 16273
Meeting time: Tu 5:45PM - 9:00PM
Location: Online: Sync

Section: 410
Class number: 17309
Meeting time: -
Location: Online: Async (Sync-Option)

Spring 2019-2020

Section: 901
Class number: 30934
Meeting time: Tu 5:45PM - 9:00PM
Location: REMOT E0000