Ranked #8 in the nation for “Best Online Graduate Computer Information Technology Programs” in 2021 by U.S. News & World Report.
ONLINE Master in Data Science (MSDS)
Deliver Innovation Through Data
MSDS OVERVIEW
The M.S. in Data Science online program at the Stevens Institute of Technology prepares students for careers in fintech, business intelligence and analytics, academia, and database management, as well as government positions requiring strong skills in data analysis.
Python and R, SQL, Hadoop, Hive, and TensorFlow
Machine learning and deep learning
Predictive modeling
Advanced statistical and optimization methods
Supervised/unsupervised learning
Neural networks
Natural language processing (NLP)
Data visualization
By the Numbers
Coursework
Below are Traditional and Advanced course sequences for the M.S. in Data Science program. Students will engage in coursework on the following topics to develop skills as data scientists who can glean insights and aid in informed decision-making. The MSDS program consists of 30 credit hours, with 10 courses, and is 100% online.
Term 1
This course provides students with the essential background in calculus and linear algebra needed to pursue the study of Data Science. Topics include limits, derivatives and integrals of (multivariable) functions; vectors and matrices; vector spaces and subspaces; norms and projections; basis and dimension; eigenvalues and eigenvectors; singular values; continuous optimization; and maps between Euclidean spaces and Jacobians. Throughout, various applications to Data Science will be considered, with hands-on numerical and coding exercises supplementing the theory.
This course provides the theoretical basis for studying the properties of modern statistical and machine learning methods. Students will learn the definitions and properties of probability spaces, random variables, distributions, expectations and limit theorems. Students will work with density functions, conditional expectations, and convergence of random variables. After successful completion of this course, students will be able to determine the probability distribution function and density of random variables/vectors, use the properties of expectations and higher-order moments in computations, and examine the appropriate convergence of random variables given specific situations. Students will also be able to apply results from probability theory to study the properties of sample statistics such as estimators.
Term 2
This course offers an introduction to exploratory data analysis and the use of basic statistical tools. Topics will include data collection; descriptive statistics, and graphical and tabular treatment of quantitative, qualitative, and count data; detecting relations between variables; confidence intervals and hypothesis testing for one and two samples; simple and multiple linear regression; analysis of variance; design of experiments; and nonparametric methods. Selected topics, such as quality control and time series analysis, may also be included. Statistical software will be used throughout the course and statistical inference will be based on examples using real data. Students will participate in group projects of data analysis. They will be trained in the different phases of the professional statistician’s work, namely: data collection, description, analysis, testing, and presentation of the conclusions. Prerequisite: MA 540.
This course will introduce foundational ideas as well as advanced techniques in linear algebra that are employed in the computational science of Big Data. Students will work with vector-matrix representations of various types of structured and unstructured data and learn how different models and processes can be understood in terms of linear algebra operations and algorithms. Efficient implementation of algorithms for high-dimensional data using Randomized Numerical Linear Algebra will be one of the focal points. Students will develop and improve their coding skills in Python and MATLAB by implementing several algorithms. In addition, students will read past and current literature in machine learning and data science to familiarize themselves with current trends and challenges in linear algebra for solving real-life problems. Prerequisites: MA 123, MA 124 or equivalent, MA 232 or equivalent, MA 222 or equivalent, and basic knowledge of MATLAB (FE 516) or Python (FE 520).
Term 3
The objective of this course is to introduce students to the theory and methods of optimization used in data science. The first portion of the class focuses on elements of convex analysis and subgradient calculus for non-smooth functions, optimality conditions for differentiable and non-smooth optimization problems, and Lagrangian duality. The main part of the class discusses numerical methods for optimization, with a focus on problems arising in data science. Approaches to large-scale/big-data optimization include decomposition methods, the design of distributed and parallel optimization methods, and stochastic approximation methods. Examples of optimization models in classification, clustering, statistical learning, and compressed sensing will be discussed to illustrate the theoretical and numerical challenges and to demonstrate the scope of applications.
This course provides an introduction to machine learning theory, algorithms, and applications. Content aims to provide students with the knowledge to understand key elements of how to design algorithms/systems that automatically learn, improve, and accumulate knowledge with experience. Topics covered in this course include decision tree learning, neural networks, Bayesian learning, reinforcement learning, ensembling multiple learning algorithms, and various application problems. Students will be provided opportunities to simulate their algorithms in a programming language and apply them to solve real-world problems. Cross-listed with: EE 695.
Term 4
Deep learning (DL) is a family of the most powerful and popular machine learning (ML) methods, with wide real-world applications including face recognition, machine translation, self-driving cars, recommender systems, and playing the game of Go. This course is designed for students with or without an ML background. The course will cover fundamental ML, computer vision, and natural language problems, along with DL tools for solving them. Students will be able to use DL methods to solve real-world ML problems. The homework is mostly implementation and programming using the Python language and popular DL frameworks such as TensorFlow and Keras. Knowledge and skills in Python programming and linear algebra are strictly required. Knowledge of probability theory, statistics, and numerical analysis is recommended but not required. Knowledge of machine learning and artificial intelligence is helpful but not necessary.
This course provides a broad and systematic introduction to time series models and their applications to modeling and prediction. It utilizes real-world examples to apply a variety of time series models and methods. After successful completion of this course, students will be able to work with stationarity and measures of dependency, time series regression, graphical analysis, trend and seasonality detection and removal, and moving-average filtering. Students will also be able to apply linear time series analysis, spectral analysis, and multivariate time series methods. Additional topics that will be covered include long-memory processes, unit root testing, volatility modeling, state space models and Kalman filtering.
Term 5
This course uses advanced technologies, such as IBM's Bluemix and Google's TensorFlow, as building blocks, allowing student teams to exercise their ingenuity to develop applications that use AI and machine learning in entirely new business application areas. The products of cognitive computing are beginning to appear in the marketplace, while so-called "deep-learning" AI applications are finding their way into healthcare, energy management, security, marketing, and financial services.
The field of Big Data is emerging as one of the transformative business processes of recent times. It utilizes classic techniques from Business Intelligence & Analysis, along with new tools and processes to deal with the volume, velocity, and variety associated with big data. As they enter the workforce, a significant percentage of BI&A students will be directly involved with big data as technologists, managers, or users. This course will build on their understanding of the basic concepts of BI&A to provide them with the background to succeed in the evolving data-centric world, not only from the point of view of the technologies required, but also in terms of management, governance, and organization. Tools will include Hadoop, HBase, and related software.
Term 1
This course will provide an introduction to exploratory data analysis and the use of basic statistical tools. Topics will include: data collection; descriptive statistics, and graphical and tabular treatment of quantitative, qualitative, and count data; detecting relations between variables; confidence intervals and hypothesis testing for one and two samples; simple and multiple linear regression; analysis of variance; design of experiments; and nonparametric methods. Selected topics, such as quality control and time series analysis, may also be included. Statistical software will be used throughout the course and statistical inference will be based on examples using real data. Students will participate in group projects of data analysis. They will be trained in the different phases of the professional statistician’s work, namely: data collection, description, analysis, testing, and presentation of the conclusions. Prerequisite: MA 540.
This course will introduce foundational ideas as well as advanced techniques in linear algebra that are employed in the computational science of Big Data. Students will work with vector-matrix representations of various types of structured and unstructured data and learn how different models and processes can be understood in terms of linear algebra operations and algorithms. Efficient implementation of algorithms for high-dimensional data using Randomized Numerical Linear Algebra will be one of the focal points. Students will develop and improve their coding skills in Python and MATLAB by implementing several algorithms. In addition, students will read past and current literature in machine learning and data science to familiarize themselves with current trends and challenges in linear algebra for solving real-life problems. Prerequisites: MA 123, MA 124 or equivalent, MA 232 or equivalent, MA 222 or equivalent, and basic knowledge of MATLAB (FE 516) or Python (FE 520).
Term 2
The objective of this course is to introduce students to the theory and methods of optimization used in data science. The first portion of the class focuses on elements of convex analysis and subgradient calculus for non-smooth functions, optimality conditions for differentiable and non-smooth optimization problems, and Lagrangian duality. The main part of the class discusses numerical methods for optimization, with a focus on problems arising in data science. Approaches to large-scale/big-data optimization include decomposition methods, the design of distributed and parallel optimization methods, and stochastic approximation methods. Examples of optimization models in classification, clustering, statistical learning, and compressed sensing will be discussed to illustrate the theoretical and numerical challenges and to demonstrate the scope of applications.
This course will provide an introduction to machine learning theory, algorithms, and applications. Content aims to provide students with the knowledge to understand key elements of how to design algorithms/systems that automatically learn, improve, and accumulate knowledge with experience. Topics covered in this course include decision tree learning, neural networks, Bayesian learning, reinforcement learning, ensembling multiple learning algorithms, and various application problems. Students will be provided opportunities to simulate their algorithms in a programming language and apply them to solve real-world problems. Cross-listed with: EE 695.
Term 3
Deep learning (DL) is a family of the most powerful and popular machine learning (ML) methods, with wide real-world applications including face recognition, machine translation, self-driving cars, recommender systems, and playing the game of Go. This course is designed for students with or without an ML background. The course will cover fundamental ML, computer vision, and natural language problems, along with DL tools for solving them. Students will be able to use DL methods to solve real-world ML problems. The homework is mostly implementation and programming using the Python language and popular DL frameworks such as TensorFlow and Keras. Knowledge and skills in Python programming and linear algebra are strictly required. Knowledge of probability theory, statistics, and numerical analysis is recommended but not required. Knowledge of machine learning and artificial intelligence is helpful but not necessary.
This course provides a broad and systematic introduction to time series models and their applications to modeling and prediction. It utilizes real-world examples to apply a variety of time series models and methods. After successful completion of this course, students will be able to work with stationarity and measures of dependency, time series regression, graphical analysis, trend and seasonality detection and removal, and moving-average filtering. Students will also be able to apply linear time series analysis, spectral analysis, and multivariate time series methods. Additional topics that will be covered include long-memory processes, unit root testing, volatility modeling, state space models and Kalman filtering.
Term 4
This course uses advanced technologies, such as IBM's Bluemix and Google's TensorFlow, as building blocks, allowing student teams to exercise their ingenuity to develop applications that use AI and machine learning in entirely new business application areas. The products of cognitive computing are beginning to appear in the marketplace, while so-called "deep-learning" AI applications are finding their way into healthcare, energy management, security, marketing, and financial services.
The field of Big Data is emerging as one of the transformative business processes of recent times. It utilizes classic techniques from Business Intelligence & Analysis, along with new tools and processes to deal with the volume, velocity, and variety associated with big data. As they enter the workforce, a significant percentage of BI&A students will be directly involved with big data as technologists, managers, or users. This course will build on their understanding of the basic concepts of BI&A to provide them with the background to succeed in the evolving data-centric world, not only from the point of view of the technologies required, but also in terms of management, governance, and organization. Tools will include Hadoop, HBase, and related software.
Term 5
This course will provide an introduction to dynamic programming as the most popular methodology for learning and control of dynamic stochastic systems. The course discusses basic models, some theoretical results, and numerical methods for these problems. These will be developed starting from basic models of dynamical systems, through finite-horizon stochastic problems, to infinite-horizon stochastic models of fully or partially observable systems. Throughout the class, special attention will be paid to the application of dynamic programming to statistical learning. The class will include an introduction to approximate dynamic programming techniques used in statistical learning, such as tree-based methods for classification and Bayesian learning, among others. The concepts and methods will be illustrated by various applications, including learning in stochastic networks, engineering, business, and finance. Prerequisites: MA 547, MA 623.
In this course, students will learn through hands-on experience how to extract data from the web and analyze web-scale data using distributed computing. Students will learn different analysis methods that are widely used across the range of internet companies, from start-ups to online giants like Amazon or Google. At the end of the course, students will apply these methods to answer a real scientific question.
Career Outlook
National employment and job postings statistics for data science careers.
JOB TITLE | EMPLOYED | AVERAGE INCOME |
---|---|---|
Data Scientist | 33,200 | $126,800 |
Software Developers | 1,500,000 | $107,000 |
Information Security Analysts | 137,000 | $99,700 |
Computer and Information Systems Managers | 461,000 | $146,000 |
Computer Systems Analyst | 627,000 | $91,000 |
Source: Emsi Labor Market Data, 2021
MSDS ALUMNI HAVE GONE ON TO BE EMPLOYED AT THE FOLLOWING ORGANIZATIONS:
- Amazon
- Cityblock Health
- Bank of America
- Disney Streaming Services
- New York University
- Two Sigma Investments
PROGRAM ADMISSION REQUIREMENTS
- Bachelor's Degree
Minimum GPA of 3.0 from an accredited institution. Degree required to begin program; completion not required at time of application.
- ACADEMIC TRANSCRIPTS
You may submit unofficial transcripts during the application process. After admission, you will be required to submit official transcripts.
- Two Letters of Recommendation
Faculty members and/or professional colleagues.
- TOEFL/IELTS/Duolingo Scores
Required for international students.
- Statement of Purpose
Optional, but strongly recommended.
- Resume
Optional, but strongly recommended.
TEST SCORE ACCOMMODATIONS DURING THE NOVEL CORONAVIRUS OUTBREAK
Due to the impacts of the coronavirus (COVID-19) on testing centers around the world, Stevens has made the following accommodations available to all students for the fall 2022 admissions cycle:
TOEFL/IELTS/DUOLINGO: Affected applicants may submit Duolingo English Test (DET) results in lieu of TOEFL/IELTS exam results.
Tuition & Cost
Key Dates & Deadlines
Term | Early Submit | Priority Submit | Final Submit | Start of Classes |
---|---|---|---|---|
| | $250 Deposit Waiver* and Application Fee Waiver Available. | Application Fee Waiver Available and Early Application Review. | | |
Fall 2022 | | | August 29, 2022 | September 12, 2022 |
Spring 2023 | October 10, 2022 | November 21, 2022 | December 21, 2022 | January 23, 2023 |
* Applicants who apply by the early submit deadline and are admitted may be eligible for a $250 deposit waiver. Applicants who receive education assistance from employers or other tuition discounts are not eligible. Other eligibility conditions may apply.
Upcoming Webinars
"What School is Right For You?"
Online MSDS Program and Application Forum
Faculty
Our faculty includes five National Science Foundation (NSF) CAREER winners as well as researchers who consult with companies such as Microsoft, IBM, Google, Bell Labs and other top industry firms.

MICHAEL ZABARANKIN
ASSOCIATE PROFESSOR AND INTERIM CHAIR OF THE DEPARTMENT OF MATHEMATICAL SCIENCES

EDUARDO BONELLI
TEACHING PROFESSOR

DARINKA DENTCHEVA
PROFESSOR

HADI SAFARI KATESARI
TEACHING ASSISTANT PROFESSOR

UPENDRA PRASAD
LECTURER
