M.S., Data Science Curriculum

This program is a 30-credit master's degree program offered through the P.C. Rossin College of Engineering and Applied Science. The curriculum is designed to prepare graduates for a range of career outcomes, including data-specific industry positions, research and development, and doctoral programs. The program welcomes students from a wide variety of disciplines and is designed to train students who are relatively new to the field.

Students of the program will gain academic and practical literacy in data science, develop and apply methods to extract relevant information from data, and recognize and resolve legal and ethical considerations in data science applications. Through coursework and direct exposure to real-world data and challenges, students graduate prepared for professional success or doctoral research in foundational or applied data science.

The program offers both live in-person instruction and a fully online mode, with a path to completion within 12 months or over an extended timeframe, depending on student needs.

Students are required to successfully complete the following core courses totaling 21 credits, as well as 9 credits of electives, at least 6 of which must be at the 400 level.

  • DSCI 310: Introduction to Data Science (3 credits)
     
  • DSCI 311: Optimization and Mathematical Foundations for Data Science (3 credits)
     
  • DSCI 321: Algorithms and Software Foundations for Data Science (3 credits)
     
  • DSCI 411: Data Management for Big Data (3 credits)
    (OR)
    DSCI 421: Accelerated Computing for Machine Learning (3 credits)
     
  • DSCI 431: Introduction to Statistical Modeling (3 credits)
     
  • DSCI 441: Statistical and Machine Learning (3 credits)
     
  • DSCI 451: Ethics in Data Science (3 credits)

 

Course Listing

Core Courses
  • DSCI 310: Introduction to Data Science (3 credits) - The computational analysis of data to extract knowledge and insight. Exploration and manipulation of data. Introduction to data collection and cleaning, reproducibility, code and data management, statistical inference, modeling, ethics, and visualization.

  • DSCI 311: Optimization and Mathematical Foundations for Data Science (3 credits) - This course briefly reviews mathematical structures, linear modeling and matrix computation, and probabilistic thinking and modeling, and covers optimization with an eye towards the algorithms and techniques most commonly used in data analysis.

  • DSCI 321: Algorithms and Software Foundations for Data Science (3 credits) - A data scientist needs to study foundational computer science topics and be able to develop software. Topics include discrete structures, algorithm design, programming concepts and data structures, tools and environments, and scaling for big data.

  • DSCI 411: Data Management for Big Data (3 credits) - Data management for big data goes beyond traditional data management and includes (distributed) systems that support the volume and velocity attributed to big data (SQL, NoSQL, Hadoop, Spark, etc.). The course also covers data collection, cleaning, provenance, and structuring and transforming data. Prerequisites: DSCI 310 and DSCI 321.

  • DSCI 421: Accelerated Computing for Machine Learning (3 credits) - This course provides an introduction to hardware architectures and parallel computing systems that facilitate high-speed machine learning. Topics include Graphics Processing Unit (GPU) versus Central Processing Unit (CPU), hardware architecture of parallel computers, memory allocation and data parallelism, multidimensional kernel configuration, kernel-based parallel programming, principles and patterns of parallel algorithms, and application of parallel computing to machine learning. Prerequisites: DSCI 310 and DSCI 321.

  • DSCI 431: Introduction to Statistical Modeling (3 credits) - Building on the introductory courses, this course introduces statistical analysis of data and linear models. Other topics include exploratory data analysis, graphical data analysis, estimation and hypothesis testing, Bayesian methods, simulation and resampling, linear, multivariate and generalized linear models, algorithmic modeling, clustering, model selection and performance evaluation. Prerequisites: DSCI 310 and DSCI 311.

  • DSCI 441: Statistical and Machine Learning (3 credits) - Covers common machine learning methods, algorithmic analysis of models for scalability and implementation, data transformations (including dimension reduction, smoothing, aggregation), supervised and unsupervised learning, and ensemble methods. Prerequisites: DSCI 310, DSCI 321, and DSCI 431.

  • DSCI 451: Ethics in Data Science (3 credits) - Legal and ethical considerations including privacy, reproducibility, bias, and fairness that are central to data science efforts, as well as ethical principles in information and technology research. This course raises these issues in real-world contexts and develops methods to ameliorate the problems. Prerequisite: DSCI 310.

Approved Electives

In addition to the core requirements, students are required to complete a minimum of 9 credits from a list of approved electives maintained by the program, at least 6 of which must be at the 400 level. Up to 9 credits from other programs can be applied towards the MS degree. Additional courses are offered within the Data Science program and may be taken by students interested in foundational or project courses. 

  • CSE 327: Artificial Intelligence: Theory and Practice (3 credits) - Detailed analysis of a broad range of artificial intelligence (AI) algorithms and systems. Problem solving, knowledge representation, reasoning, planning, uncertainty and machine learning. Applications of AI to areas such as natural language processing, vision, and robotics.
     
  • CSE 407: Structural Bioinformatics (3 credits) - Computational techniques and principles of structural biology used to examine molecular structure, function, and evolution. Topics include: protein structure alignment and prediction; molecular surface analysis; statistical modeling; QSAR; computational drug design; influences on binding specificity; protein-ligand, -protein, and -DNA interactions; molecular simulation and electrostatics. Consent of instructor required.
     
  • CSE 408: Bioinformatics: Issues and Algorithms (3 credits) - Computational problems and their associated algorithms arising from the creation, analysis, and management of bioinformatics data. Genetic sequence comparison and alignment, physical mapping, genome sequencing and assembly, clustering of DNA microarray results in gene expression studies, computation of genomic rearrangements and evolutionary trees. This course, a version of 308 for graduate students, requires advanced assignments.
     
  • CSE 419: Image Processing and Graphics (3 credits) - State-of-the-art techniques for fundamental image analysis tasks: feature extraction, segmentation, registration, tracking, recognition, search (indexing and retrieval). Related computer graphics techniques: modeling (geometry, physically-based, statistical), simulation (data-driven, interactive), animation, 3D image visualization, and rendering.
     
  • CSE 425: Natural Language Processing (3 credits) - Overview of modern natural language processing techniques: text normalization, language model, part-of-speech tagging, hidden Markov model, syntactic and dependency parsing, semantics, word sense, reference resolution, dialog agent, machine translation. Three projects to design, implement and evaluate classic NLP algorithms.
     
  • CSE 428: Semantic Web Topics (3 credits) - Theory, architecture and applications of the Semantic Web. Issues in designing distributed knowledge representation languages, ontology development, knowledge acquisition, scalable reasoning, integrating heterogeneous data sources, and web-based agents.
     
  • CSE 445: WWW Search Engines (3 credits) - Study of algorithms, architectures, and implementations of WWW search engines. Information retrieval (IR) models; performance evaluation; properties of hypertext crawling, indexing, searching and ranking; link analysis; parallel and distributed IR; user interfaces.
     
  • CSE 447: Data Mining (3 credits) - Modern data mining techniques: data cleaning; attribute and subset selection; model construction, evaluation and application. Algorithms covered include decision trees, covering algorithms, association rule mining, statistical modeling, model and regression trees, neural networks, instance-based learning, and clustering.
     
  • CSE 471: Principles of Mobile Computing (3 credits) - Course topics include fundamental concepts and technology underlying mobile computing and current research in these areas. Examples drawn from a variety of application domains such as health monitoring, energy management, commerce, and travel. Issues of system efficiency will be studied, including efficient handling of large data such as images and effective use of cloud storage. Recent research papers will be discussed.
     
  • CSE 475: Principles and Practice of Parallel Computing (3 credits) - Parallel computer architectures, parallel languages, parallelizing compilers and operating systems. Design, implementation, and analysis of parallel algorithms for scientific and data-intensive computing.
     
  • CSE/BIOE 420: Biomedical Image Computing and Modeling (3 credits) - Biomedical image modalities, image computing techniques, and imaging informatics systems. Understanding, using, and developing algorithms and software to analyze biomedical image data and extract useful quantitative information: Biomedical image modalities and formats; image processing and analysis; geometric and statistical modeling; image informatics systems in biomedicine. This course, a graduate version of BIOE 320, requires additional advanced assignments. Credit will not be given for both BIOE 320 and BIOE 420.
     
  • ECE 345: Introduction to Data Networks (3 credits) - Analytical foundations in the design and evaluation of data communication networks. Fundamental mathematical models underlying network design with their applications in practical network algorithms. Layered network architecture, queuing models with applications in network delay analysis, Markov chain theory with applications in packet radio networks and dynamic programming with applications to network routing algorithms. Background on stochastic processes and dynamic programming will be reviewed.
     
  • ECE 401: Advanced Computer Architecture (3 credits) - Design, analysis and performance of computer architectures; high-speed memory systems; cache design and analysis; modeling cache performance; principle of pipeline processing, performance of pipelined computers; scheduling and control of a pipeline; classification of parallel architectures; systolic and data flow architectures; multiprocessor performance; multiprocessor interconnections and cache coherence.
     
  • ECE 403: Accelerated Computing for Deep Learning (3 credits) - Graphics Processing Unit (GPU) versus Central Processing Unit (CPU), hardware architecture of parallel computers, memory allocation and data parallelism, multidimensional kernel configuration, kernel-based parallel programming, principles and patterns of parallel algorithms, application of parallel computing to deep learning neural networks. Deep Learning (DL) algorithms, such as Convolutional Neural Networks (CNN), Stochastic Gradient Descent, and backpropagation algorithms.
     
  • ECE 414: Machine Learning and Statistical Decision Making (3 credits) - Overview of Machine Learning. Overview of decision making based on data. Overview of discovery of unknown quantities based on data. Description of the popular algorithms for decision making and for discovery of unknown quantities based on data. Performance analysis via comparison to optimum methods and bounds on optimum performance for assumed models. The emphasis is on statistical analysis of various algorithms using well established statistical theory. Exposure to probability and random process theory is assumed.
     
  • ECE 440: Introduction to Online and Reinforcement Learning (3 credits) - Review of probability and random processes, basic reinforcement learning framework, learning from streaming data, actions in response to changing environment through Markov Decision Processes, elements of artificial intelligence. Exploration-exploitation trade-offs through bandit problems, and different methods for reinforcement learning including dynamic programming, Monte Carlo methods, temporal difference and Q-learning. Approximate solutions for very large state space systems, policy iteration and actor-critic methods, introduction to deep reinforcement learning.
     
  • ECE 464: Cryptography and Network Security (3 credits) - Introduction to cryptography, classical cipher systems, cryptanalysis, perfect secrecy and the one-time pad, DES and AES, public key cryptography covering systems based on discrete logarithms, the RSA and the knapsack systems, and various applications of cryptography.
     
  • ECE/BIOE 466: Neural Engineering (3 credits) - Neural system interfaces for scientific and health applications. Basic properties of neurons, signal detection and stimulation, instrumentation and microfabricated electrode arrays. Fundamentals of peripheral and central neural signals and EEG, and applications such as neural prostheses, implants and brain-computer interfaces.
     
  • ECE/CHE/ME 436: Systems Identification (3 credits) - The determination of model parameters from time-history and frequency response data by graphical, deterministic and stochastic methods. Examples and exercises taken from process industries, communications and aerospace testing. Regression, quasilinearization and invariant-imbedding techniques for nonlinear system parameter identification included.
     
  • ISE 362: Logistics and Supply Chain Management (3 credits) - Modeling and analysis of supply chain design, operations, and management. Analytical framework for logistics and supply chains, demand and supply planning, inventory control and warehouse management, transportation, logistics network design, supply chain coordination, and financial factors. Students complete case studies and a comprehensive final project.
     
  • ISE 404: Simulation (3 credits) - Applications of discrete and continuous simulation techniques in modeling industrial systems. Simulation using a high level simulation language. Design of simulation experiments. This course is a version of IE 305 for graduate students, with research projects and advanced assignments.
     
  • ISE 409: Time Series Analysis (3 credits) - Theory and applications of an approach to process modeling, analysis, prediction, and control based on an ordered sequence of observed data. Single or multiple time series are used to obtain scalar or vector difference/differential equations describing a variety of physical and economic systems.
     
  • ISE 410: Design of Experiments (3 credits) - Experimental procedures for sorting out important causal variables, finding optimum conditions, continuously improving processes, and troubleshooting. Applications to laboratory, pilot plant and factory. Students must have some statistical background and experimentation in prospect.
     
  • ISE 417: Nonlinear Optimization (3 credits) - Advanced topics in mathematical optimization with emphasis on modeling and analysis of nonlinear problems. Convex analysis, unconstrained and constrained optimization, duality theory, Lagrangian relaxation, and methods for solving nonlinear optimization problems, including descent methods, Newton methods, conjugate gradient methods, and penalty and barrier methods.
     
  • ISE 444: Optimization Methods in Machine Learning (3 credits) - Machine learning models and advanced optimization tools that are used to apply these models in practice. Machine learning paradigm, machine learning models, convex optimization models, basic and advanced methods for modern convex optimization.
     
  • ISE 455: Optimization Algorithms and Software (3 credits) - Basic concepts of large families of optimization algorithms for both continuous and discrete optimization problems. Pros and cons of the various algorithms when applied to specific types of problems; information needed; whether local or global optimality can be expected. Participants practice with corresponding software tools to gain hands-on experience.
     
  • ISE 467: Mining of Large Datasets (3 credits) - Explores how large datasets are extracted and analyzed. Discusses suitable algorithms for high dimensional data, graphs, and machine learning. Introduces the use of modern distributed programming models for large-scale data processing.

Other classes will be considered through discussion with the program chairs.
