DATA_ENG 200 Foundations of Data Science
Offered: Winter (TTh 9:30-10:50 a.m.) and Spring (TTh 12:30-1:50 p.m.)
Foundations of Data Science covers the fundamentals of data science and the context within which the field operates. The course introduces the steps of the data science lifecycle and the associated tools and techniques, implemented in languages such as Python. This course is reserved for students pursuing the McCormick Machine Learning and Data Science Minor. We encourage students to take this course early in their studies for the minor. It is the first part of a two-part sequence with DATA_ENG 300.
Prerequisite: COMP_SCI 150
Learning Objectives
(General overview)
- Students will understand the core concepts and scope of data science.
- Students will understand the stages of the data science lifecycle and the common tools and techniques used.
- Students will be able to formulate and scope innovative, relevant, or scientific questions that can be addressed with data.
- Students will be able to utilize computational thinking for problem-solving in data science.
- Students will be able to present data findings in writing and with visual aids, practiced in homework assignments and a project presentation.
(Related to specific topics)
- Students will be able to conduct exploratory data analysis to uncover insights.
- Students will know and be able to apply principles of data cleaning and manipulation.
- Students will know and be able to apply the principles of algorithmic data collection and joining of multiple data sources.
- Students will be able to identify and avoid common pitfalls in data analytics, such as algorithmic bias.
- Students will be able to construct reproducible data science pipelines so that analyses can be replicated.
(If time permits)
- Students will understand and apply best practices for handling and protecting sensitive data.
- Students will be able to implement version control to manage and track changes in data projects.
Topics
- Introduction to data science
- Data exploration and visualization
- Data manipulation, transformation, and standardization
- Algorithmic data retrieval methods
- Statistical modeling and machine learning
- Introduction to cloud computing
- Ethics and algorithmic bias
(If time permits)
- Data security and privacy
- Version control
More information on required materials will be coming soon.