ECT 584 Web Data Mining
Summary
Web mining refers to the
automatic discovery of interesting and useful patterns from the data associated
with the usage, content, and the linkage structure of Web resources. It has
quickly become one of the most popular areas in computing and information
systems because of its direct applications in e-commerce, e-CRM, Web analytics,
information retrieval/filtering, Web personalization, and recommender systems.
People knowledgeable about Web mining techniques and their applications are
highly sought after by major Web companies such as Google, Amazon, Yahoo, MSN, and
others that need to understand user behavior and use patterns discovered in
terabytes of user profile data to design more intelligent applications. The
primary focus of this course is on Web usage mining and its
applications to e-commerce and business intelligence. Specifically, we will
consider techniques from machine learning, data mining, text mining, and
databases to extract useful knowledge from Web data that can be used for site
management, automatic personalization, recommendation, and user profiling. The
first half of the course will focus on a detailed overview of the data
mining process and techniques, specifically those that are most relevant to Web
mining. The second half will concentrate on the applications of these techniques
to Web and e-commerce data, and their use in Web analytics, user profiling and
personalization.
Texts
- Data Mining Techniques for Marketing, Sales, and Customer Relationship Management, Third Edition, by Michael Berry and Gordon Linoff, John Wiley, 2011.
Recommended Books:
- Data Mining: Practical Machine Learning Tools and Techniques, by Ian Witten and Eibe Frank, 3rd Ed., Morgan Kaufmann, 2011.
- Web Data Mining: Exploring Hyperlinks, Content, and Usage Data, by Bing Liu, 2nd Edition, Springer, 2011. (Note: parts of the 1st edition will be available electronically for the reading assignments.)
Grading
The final grade will be determined (tentatively) based on the following components:
- Assignments = 65%
- Final Project = 35%
Prerequisites
Some background in basic statistics and data structures; basic knowledge of database design and programming.
Project
For the class project, students can choose to do an implementation project, a data analysis project, or a research paper. Implementation and data analysis projects may be done individually or in groups of up to 3 people (depending on the complexity and the type of the project). Research paper projects must be done individually. Each group or individual will submit a specific project proposal to be approved. More details about the possible project options, as well as due dates for the proposal and the final submission, will be available on the class Web site.
Tentative List of Topics
The following issues and topics will be covered throughout the course. Many of these topics will be revisited several times during the course in a variety of contexts.
Data Mining and Knowledge Discovery
- The KDD process and methodology
- Data preparation for knowledge discovery
- Overview of data mining techniques
- Market basket analysis
- Classification and prediction
- Clustering
- Memory-based reasoning
- Evaluation and interpretation
Web Usage Mining Process and Techniques
- Data collection and sources of data
- Data preparation for usage mining
- Mining navigational patterns
- Integrating e-commerce data
- Leveraging site content and structure
- User tracking and profiling
- E-Metrics: measuring success in e-commerce
- Privacy issues
Web Mining Applications and Other Topics
- Data integration for e-commerce
- Web personalization and recommender systems
- Web content and structure mining
- Web data warehousing
This syllabus is subject to change as necessary during the quarter. If a change occurs, it will be thoroughly addressed during class, posted under Announcements in D2L and sent via email.
Evaluations are a way for students to provide valuable feedback regarding their instructor and the course. Detailed feedback will enable the instructor to continuously tailor teaching methods and course content to meet the learning goals of the course and the academic needs of the students. Evaluations are a requirement of the course and are key to continuing to provide you with the highest quality of teaching. The evaluations are anonymous; the instructor and administration do not track who entered what responses. A program is used to check whether the student completed the evaluations, but the evaluation is completely separate from the student’s identity. Since 100% participation is our goal, students are sent periodic reminders over three weeks. Students do not receive reminders once they complete the evaluation. Students complete the evaluation online in CampusConnect.
This course will be subject to the university's academic integrity policy. More information can be found at http://academicintegrity.depaul.edu/. If you have any questions, be sure to consult with your professor.
All students are expected to abide by the University's Academic Integrity Policy, which prohibits cheating and other misconduct in student coursework. Publicly sharing or posting online any prior or current materials from this course (including exam questions or answers) is considered to be providing unauthorized assistance prohibited by the policy. Both students who share/post and students who access or use such materials are considered to be cheating under the Policy and will be subject to sanctions for violations of Academic Integrity.
All students are required to manage their class schedules each term in accordance with the deadlines for enrolling and withdrawing as indicated in the University Academic Calendar. Information on enrollment, withdrawal, grading and incompletes can be found at http://www.cdm.depaul.edu/Current%20Students/Pages/PoliciesandProcedures.aspx.
Students who feel they may need an accommodation based on the impact of a disability should contact the instructor privately to discuss their specific needs. All discussions will remain confidential.
To ensure that you receive the most appropriate accommodation based on your needs, contact the instructor as early as possible in the quarter (preferably within the first week of class), and make sure that
you have contacted the Center for Students with Disabilities (CSD) at:
Lewis Center 1420, 25 East Jackson Blvd.
Phone number: (312)362-8002
Fax: (312)362-6544
TTY: (773)325-7296