Current Catalog Description

Overview of modern data mining techniques: data cleaning; attribute and subset selection; model construction, evaluation and application. Fundamental mathematics and algorithms for decision trees, covering algorithms, association mining, statistical modeling, linear models, neural networks, instance-based learning and clustering. Practical design, implementation, application and evaluation of data mining techniques in class projects. Credit will not be given for both CSE 347 and CSE 447. Prerequisites: CSE 017 and (CSE 160 or CSE 326) and (MATH 231 or ECO 045 or ISE 121).

Instructor: Lifang He (Spring 2022)

Textbook

Tan, Steinbach, Karpatne & Kumar, "Introduction to Data Mining", 2nd Ed., Pearson (2018), ISBN 978-0133128901

References

None

COURSE OUTCOMES

Students will:

  1. Understand the principles of data mining.
  2. Be aware of the challenges that arise in data mining.
  3. Know a range of techniques for data mining and where they can be applied.
  4. Become aware of ethical issues that are present in data mining applications.

RELATIONSHIP BETWEEN COURSE OUTCOMES AND STUDENT ENABLED CHARACTERISTICS

CSE 347 substantially supports the following student enabled characteristics:

A. An ability to apply knowledge of computing and mathematics appropriate to the discipline

B. An ability to analyze a problem and identify and define the computing requirements appropriate to its solution

I. An ability to use current techniques, skills, and tools necessary for computing practices

J. An ability to apply mathematical foundations, algorithmic principles, and computer science theory in the modeling and design of computer-based systems in a way that demonstrates comprehension of the tradeoffs involved in design choices

K. An ability to apply design and development principles in the construction of software systems of varying complexity

Major Topics Covered in the Course

  • Machine learning (both shallow and deep learning) for data mining
  • Statistical modeling
  • Data preprocessing and cleaning
  • Data mining knowledge representation 
  • Attribute and subset selection
  • Instance-based learning
  • Association rule mining
  • Clustering
  • Classification/regression
  • Linear/nonlinear models
  • Outlier and anomaly detection
  • Social network analysis
  • Community detection
  • Model evaluation
  • Neural networks (for image and graph data)

Assessment Plan for the Course

Students are given six homework assignments that relate to the assigned readings and the material presented in lectures. There is one open-book midterm exam about halfway through the course, on the topics covered up to that point. There is a final course project in which students identify and tackle a "real-life" data mining problem. Students are required to submit a write-up and give a short oral presentation describing their work on the final project.

How Data in the Course are Used to Assess Program Outcomes (unless adequately covered already in the assessment discussion under Criterion 4):

Each semester, I include the above data from the assessment plan in my self-assessment of the course. The Curriculum Committee, in turn, reviews this self-assessment.