ClassInfo

CSC 478 Programming Machine Learning Applications

Bamshad Mobasher

Office: CDM 833
Spring 2015-2016
Class number: 37388
Section number: 901
M 5:45PM - 9:00PM
LEWIS 01216 Loop Campus

Summary

The course will focus on the implementation of various data mining and machine learning techniques and their applications in several domains. The primary tools used in the class are the Python programming language and several associated libraries. Additional open source machine learning and data mining tools may also be used as part of the class material and assignments. Students will gain hands-on experience developing supervised and unsupervised machine learning algorithms and will learn how to employ these techniques in the context of popular applications such as automatic classification, recommender systems, searching and ranking, text mining, group and community discovery, and social media analytics.



Texts

Please see: Course Syllabus


Grading

The structure and grading in the class will be centered around 4-5 assignments and a final project. The assignments will involve Python implementations of selected data mining techniques and their applications in various domains. The assignments will typically involve both programming components and problems related to the material covered in class. Some assignments may also involve the use of other open source data mining tools. These assignments must be done individually, unless otherwise specified. Late assignments will be penalized 10% per day (with weekends counting as one day).
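
As an illustration of the late policy, the following Python sketch counts late days with a Saturday and Sunday together counting as one day. It assumes the 10% is deducted from the earned score for each late day and that the score cannot drop below zero; neither detail is stated above, and the function names `late_days` and `penalized_score` are hypothetical.

```python
from datetime import date, timedelta


def late_days(due: date, submitted: date) -> int:
    """Count late days, treating each Saturday/Sunday pair as a single day."""
    if submitted <= due:
        return 0
    weekdays = 0
    weekends = set()  # one entry per weekend, keyed by that weekend's Saturday
    day = due + timedelta(days=1)
    while day <= submitted:
        wd = day.weekday()  # Monday = 0 ... Sunday = 6
        if wd >= 5:
            weekends.add(day - timedelta(days=wd - 5))
        else:
            weekdays += 1
        day += timedelta(days=1)
    return weekdays + len(weekends)


def penalized_score(raw: float, due: date, submitted: date, rate: float = 0.10) -> float:
    """Apply the per-day penalty (assumed here to be a share of the earned score)."""
    factor = max(0.0, 1.0 - rate * late_days(due, submitted))
    return raw * factor
```

Under this reading, an assignment due on a Friday and submitted the following Monday counts as two late days: one for the weekend and one for Monday.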

The final project will be a more complex programming/implementation assignment that will involve integrating multiple concepts and techniques. Students will be able to choose from among several possible project ideas or propose their own. More details on the final project are available on the class Web site. The final grade will be determined (tentatively) based on the following components:

    Assignments = 65%
    Final Project = 35%

The general grading scheme will be based on a curve. At the end of the quarter, some adjustments may be made based on overall class performance as well as signs of individual effort. Pluses and minuses will be given at the high/low ends of each grade range.
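
The component weights above can be read as a weighted sum. The sketch below is only an illustration: it assumes both components are expressed as percentages on a 0-100 scale, it takes the assignment average as given (how individual assignments are combined is not specified here), and it does not model the curve or end-of-quarter adjustments. The function name `weighted_score` is hypothetical.

```python
# Tentative weights from the syllabus: assignments 65%, final project 35%.
ASSIGNMENT_WEIGHT = 0.65
PROJECT_WEIGHT = 0.35


def weighted_score(assignment_avg: float, project: float) -> float:
    """Combine the two components into a raw score, before any curve is applied."""
    return ASSIGNMENT_WEIGHT * assignment_avg + PROJECT_WEIGHT * project


if __name__ == "__main__":
    # Example inputs are made up for illustration only.
    print(round(weighted_score(90.0, 80.0), 2))
```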



Prerequisites

CSC 401 and IS 467 (formerly IS 567)



Tentative List of Topics

The following issues and topics will be covered throughout the course. Many of these topics will be revisited several times during the course in a variety of contexts.

  • Data Science and the Knowledge Discovery Process
    • The KDD process and methodology
    • Data preparation for machine learning
    • Overview of data mining and Machine Learning techniques
    • Review of Python and overview of Python tools for Data Science
  • Supervised Techniques
    • Classification and Prediction using K-Nearest-Neighbor
    • Classifying with Probability Theory; Naïve Bayes
    • Building Decision Trees
    • Forecasting and Regression models
    • Stochastic Gradient Descent
    • Ensemble models: Random Forest; AdaBoost
    • Support Vector Machines
    • Evaluating and optimizing predictive models
    • Feature and Model Selection Strategies
  • Unsupervised Learning
    • Clustering using K-Means
    • Association Rule discovery
    • Sequential Pattern Analysis
    • Principal Component Analysis
    • Dimensionality Reduction via Singular Value Decomposition
  • Possible Applications (covered throughout the course)
    • Collaborative Recommender Systems
    • Content Based personalization
    • Predictive User Modeling
    • Text Mining and Concept Discovery from Documents
    • Finding groups using social or behavioral data
    • Image Analysis
    • Building predictive models for target marketing
    • Customer or user segmentation
  • Advanced Topics (if time permits)
    • Matrix Factorization
    • Search and Optimization Techniques
    • Markov Models
    • Topic Modeling with Latent Dirichlet Allocation


School policies:

Changes to Syllabus

This syllabus is subject to change as necessary during the quarter. If a change occurs, it will be thoroughly addressed during class, posted under Announcements in D2L and sent via email.

Online Course Evaluations

Evaluations are a way for students to provide valuable feedback regarding their instructor and the course. Detailed feedback will enable the instructor to continuously tailor teaching methods and course content to meet the learning goals of the course and the academic needs of the students. Evaluations are a requirement of the course and are key to continuing to provide you with the highest quality of teaching. The evaluations are anonymous; the instructor and administration do not track who entered what responses. A program is used to check if the student completed the evaluations, but the evaluation is completely separate from the student’s identity. Since 100% participation is our goal, students are sent periodic reminders over three weeks. Students do not receive reminders once they complete the evaluation. Students complete the evaluation online in CampusConnect.

Academic Integrity and Plagiarism

This course will be subject to the university's academic integrity policy. More information can be found at http://academicintegrity.depaul.edu/. If you have any questions, be sure to consult with your professor.

All students are expected to abide by the University's Academic Integrity Policy, which prohibits cheating and other misconduct in student coursework. Publicly sharing or posting online any prior or current materials from this course (including exam questions or answers) is considered to be providing unauthorized assistance prohibited by the policy. Both students who share/post and students who access or use such materials are considered to be cheating under the Policy and will be subject to sanctions for violations of Academic Integrity.

Academic Policies

All students are required to manage their class schedules each term in accordance with the deadlines for enrolling and withdrawing as indicated in the University Academic Calendar. Information on enrollment, withdrawal, grading and incompletes can be found at http://www.cdm.depaul.edu/Current%20Students/Pages/PoliciesandProcedures.aspx.

Students with Disabilities

Students who feel they may need an accommodation based on the impact of a disability should contact the instructor privately to discuss their specific needs. All discussions will remain confidential.
To ensure that you receive the most appropriate accommodation based on your needs, contact the instructor as early as possible in the quarter (preferably within the first week of class), and make sure that you have contacted the Center for Students with Disabilities (CSD) at:
Lewis Center 1420, 25 East Jackson Blvd.
Phone number: (312)362-8002
Fax: (312)362-6544
TTY: (773)325-7296