
ONLINE Master in Data Science (MSDS)

Deliver Innovation Through Data

Ranked #8 in the nation for “Best Online Graduate Computer Information Technology Programs” in 2021 by U.S. News & World Report.


MSDS OVERVIEW

The M.S. in Data Science online program at the Stevens Institute of Technology prepares students for careers in fintech, business intelligence and analytics, academia, and database management, as well as government positions requiring strong skills in data analysis.

  • Python and R, SQL, Hadoop, Hive, and TensorFlow

  • Machine learning and deep learning

  • Predictive modeling

  • Advanced statistical and optimization methods

  • Supervised/unsupervised learning

  • Neural networks

  • Natural language processing (NLP)

  • Data visualization

By the Numbers

  • #21 in U.S.: Best Online Graduate Engineering Programs by U.S. News & World Report (2020)
  • Top 25 in U.S.: Among the Top 25 STEM colleges by Forbes (2018)
  • 7x Winner: 21st Century Award for Best Practices in Distance Learning by The U.S. Distance Learning Association
  • #9 in Nation: Recognized for 'Top 20 Best Career Placement (Private Schools)' by The Princeton Review (2021)
  • #3 in N.J.: Best School of Engineering & Science Graduate Programs by U.S. News & World Report (2022)

Coursework

Below are Traditional and Advanced course sequences for the M.S. in Data Science program. Students will engage in coursework on the following topics to develop skills as data scientists who can glean insights and aid in informed decision-making. The MSDS program consists of 30 credit hours, with 10 courses, and is 100% online.

Traditional Sequence

Term 1

This course provides students with the essential background in calculus and linear algebra needed to pursue the study of Data Science. Topics include limits, derivatives and integrals of (multivariable) functions; vectors and matrices; vector spaces and subspaces; norms and projections; basis and dimension; eigenvalues and eigenvectors; singular values; continuous optimization; and maps between Euclidean spaces and Jacobians. Throughout, various applications to Data Science will be considered, with hands-on numerical and coding exercises supplementing the theory.

This course provides the theoretical basis for studying the properties of modern statistical and machine learning methods. Students will learn the definitions and properties of probability spaces, random variables, distributions, expectations and limit theorems. Students will work with density functions, conditional expectations, and convergence of random variables. After successful completion of this course, students will be able to determine the probability distribution function and density of random variables/vectors, use the properties of expectations and higher-order moments in computations, and examine the appropriate convergence of random variables given specific situations. Students will also be able to apply results from probability theory to study the properties of sample statistics such as estimators.

Term 2

This course offers an introduction to exploratory data analysis and the use of basic statistical tools. Topics will include data collection; descriptive statistics, and graphical and tabular treatment of quantitative, qualitative, and count data; detecting relations between variables; confidence intervals and hypothesis testing for one and two samples; simple and multiple linear regression; analysis of variance; design of experiments; and nonparametric methods. Selected topics, such as quality control and time series analysis, may also be included. Statistical software will be used throughout the course and statistical inference will be based on examples using real data. Students will participate in group projects of data analysis. They will be trained in the different phases of the professional statistician’s work, namely: data collection, description, analysis, testing, and presentation of the conclusions. Prerequisite: MA 540.

This course will introduce foundational ideas as well as advanced techniques in linear algebra that are employed in the computational science of Big Data. Students will work with vector-matrix representations of various types of structured and unstructured data and learn how different models and processes can be understood in terms of linear algebra operations and algorithms. Efficient implementation of algorithms for high-dimensional data using Randomized Numerical Linear Algebra will be one of the focal points. Students will develop and improve their coding skills in Python and MATLAB by implementing several algorithms. In addition, students will read past and current literature in machine learning and data science to familiarize themselves with current trends and challenges in linear algebra for solving real-life problems. Prerequisites: MA 123, MA 124 or equivalent, MA 232 or equivalent, MA 222 or equivalent, and basic knowledge of MATLAB (FE 516) or Python (FE 520).

Term 3

The objective of this course is to introduce students to the theory and methods of optimization used in data science. The first portion of the class focuses on elements of convex analysis and subgradient calculus for non-smooth functions, optimality conditions for differentiable and non-smooth optimization problems, and Lagrangian duality. The main part of the class discusses numerical methods for optimization, with a focus on problems arising in data science. Approaches to large-scale/big-data optimization include decomposition methods, the design of distributed and parallel optimization methods, and stochastic approximation methods. Examples of optimization models in classification, clustering, statistical learning, and compressed sensing will be discussed to illustrate the theoretical and numerical challenges and to demonstrate the scope of applications.

This course provides an introduction to machine learning theory, algorithms, and applications. It aims to give students the knowledge to understand the key elements of designing algorithms and systems that automatically learn, improve, and accumulate knowledge with experience. Topics covered in this course include decision tree learning, neural networks, Bayesian learning, reinforcement learning, ensembling multiple learning algorithms, and various application problems. Students will have opportunities to simulate their algorithms in a programming language and apply them to solve real-world problems. Cross-listed with: EE 695.

Term 4

Deep learning (DL) is a family of the most powerful and popular machine learning (ML) methods, with wide real-world applications including face recognition, machine translation, self-driving cars, recommender systems, and playing the game of Go. This course is designed for students with or without an ML background. The course will cover fundamental ML, computer vision, and natural language problems and the DL tools for solving them. Students will be able to use DL methods to solve real-world ML problems. The homework is mostly implementation and programming using Python and popular DL frameworks such as TensorFlow and Keras. Knowledge and skills in Python programming and linear algebra are strictly required. Knowledge of probability theory, statistics, and numerical analysis is recommended but not required. Knowledge of machine learning and artificial intelligence is helpful but not necessary.

This course provides a broad and systematic introduction to time series models and their applications to modeling and prediction. It utilizes real-world examples to apply a variety of time series models and methods. After successful completion of this course, students will be able to work with stationarity and measures of dependency, time series regression, graphical analysis, trend and seasonality detection and removal, and moving-average filtering. Students will also be able to apply linear time series analysis, spectral analysis, and multivariate time series methods. Additional topics that will be covered include long-memory processes, unit root testing, volatility modeling, state space models and Kalman filtering.

Term 5

This course uses advanced technologies, such as IBM's Bluemix and Google's TensorFlow, as building blocks, allowing student teams to exercise their ingenuity to develop applications that use AI and machine learning in entirely new business application areas. The products of cognitive computing are beginning to appear in the marketplace, while so-called "deep-learning" AI applications are finding their way into healthcare, energy management, security, marketing, and financial services.

The field of Big Data is emerging as one of the transformative business processes of recent times. It utilizes classic techniques from Business Intelligence & Analysis, along with new tools and processes to deal with the volume, velocity, and variety associated with big data. As they enter the workforce, a significant percentage of BIA students will be directly involved with big data as technologists, managers, or users. This course will build on their understanding of the basic concepts of BI&A to provide them with the background to succeed in the evolving data-centric world, not only in terms of the technologies required but also in terms of management, governance, and organization. Tools will include Hadoop, HBase, and related software.

Advanced Sequence

Term 1

This course will provide an introduction to exploratory data analysis and the use of basic statistical tools. Topics will include: data collection; descriptive statistics, and graphical and tabular treatment of quantitative, qualitative, and count data; detecting relations between variables; confidence intervals and hypothesis testing for one and two samples; simple and multiple linear regression; analysis of variance; design of experiments; and nonparametric methods. Selected topics, such as quality control and time series analysis, may also be included. Statistical software will be used throughout the course and statistical inference will be based on examples using real data. Students will participate in group projects of data analysis. They will be trained in the different phases of the professional statistician’s work, namely: data collection, description, analysis, testing, and presentation of the conclusions. Prerequisite: MA 540.

This course will introduce foundational ideas as well as advanced techniques in linear algebra that are employed in the computational science of Big Data. Students will work with vector-matrix representations of various types of structured and unstructured data and learn how different models and processes can be understood in terms of linear algebra operations and algorithms. Efficient implementation of algorithms for high-dimensional data using Randomized Numerical Linear Algebra will be one of the focal points. Students will develop and improve their coding skills in Python and MATLAB by implementing several algorithms. In addition, students will read past and current literature in machine learning and data science to familiarize themselves with current trends and challenges in linear algebra for solving real-life problems. Prerequisites: MA 123, MA 124 or equivalent, MA 232 or equivalent, MA 222 or equivalent, and basic knowledge of MATLAB (FE 516) or Python (FE 520).

Term 2

The objective of this course is to introduce students to the theory and methods of optimization used in data science. The first portion of the class focuses on elements of convex analysis and subgradient calculus for non-smooth functions, optimality conditions for differentiable and non-smooth optimization problems, and Lagrangian duality. The main part of the class discusses numerical methods for optimization, with a focus on problems arising in data science. Approaches to large-scale/big-data optimization include decomposition methods, the design of distributed and parallel optimization methods, and stochastic approximation methods. Examples of optimization models in classification, clustering, statistical learning, and compressed sensing will be discussed to illustrate the theoretical and numerical challenges and to demonstrate the scope of applications.

This course will provide an introduction to machine learning theory, algorithms, and applications. Content aims to provide students with the knowledge to understand key elements of how to design algorithms/systems that automatically learn, improve, and accumulate knowledge with experience. Topics covered in this course include decision tree learning, neural networks, Bayesian learning, reinforcement learning, ensembling multiple learning algorithms, and various application problems. Students will be provided opportunities to simulate their algorithms in a programming language and apply them to solve real-world problems. Cross-listed with: EE 695.

Term 3

Deep learning (DL) is a family of the most powerful and popular machine learning (ML) methods, with wide real-world applications including face recognition, machine translation, self-driving cars, recommender systems, and playing the game of Go. This course is designed for students with or without an ML background. The course will cover fundamental ML, computer vision, and natural language problems and the DL tools for solving them. Students will be able to use DL methods to solve real-world ML problems. The homework is mostly implementation and programming using Python and popular DL frameworks such as TensorFlow and Keras. Knowledge and skills in Python programming and linear algebra are strictly required. Knowledge of probability theory, statistics, and numerical analysis is recommended but not required. Knowledge of machine learning and artificial intelligence is helpful but not necessary.

This course provides a broad and systematic introduction to time series models and their applications to modeling and prediction. It utilizes real-world examples to apply a variety of time series models and methods. After successful completion of this course, students will be able to work with stationarity and measures of dependency, time series regression, graphical analysis, trend and seasonality detection and removal, and moving-average filtering. Students will also be able to apply linear time series analysis, spectral analysis, and multivariate time series methods. Additional topics that will be covered include long-memory processes, unit root testing, volatility modeling, state space models and Kalman filtering.

Term 4

This course uses advanced technologies, such as IBM's Bluemix and Google's TensorFlow, as building blocks, allowing student teams to exercise their ingenuity to develop applications that use AI and machine learning in entirely new business application areas. The products of cognitive computing are beginning to appear in the marketplace, while so-called "deep-learning" AI applications are finding their way into healthcare, energy management, security, marketing, and financial services.

The field of Big Data is emerging as one of the transformative business processes of recent times. It utilizes classic techniques from Business Intelligence & Analysis, along with new tools and processes to deal with the volume, velocity, and variety associated with big data. As they enter the workforce, a significant percentage of BIA students will be directly involved with big data as technologists, managers, or users. This course will build on their understanding of the basic concepts of BI&A to provide them with the background to succeed in the evolving data-centric world, not only in terms of the technologies required but also in terms of management, governance, and organization. Tools will include Hadoop, HBase, and related software.

Term 5

This course will provide an introduction to dynamic programming as the most popular methodology for learning and control of dynamic stochastic systems. The course discusses basic models, some theoretical results, and numerical methods for these problems, progressing from basic models of dynamical systems, through finite-horizon stochastic problems, to infinite-horizon stochastic models of fully or partially observable systems. Throughout the class, special attention will be paid to the application of dynamic programming to statistical learning. The class will include an introduction to approximate dynamic programming techniques that are used in statistical learning, such as tree-based methods for classification and Bayesian learning, among others. The concepts and methods will be illustrated by various applications, including learning in stochastic networks, engineering, business, and finance. Prerequisites: MA 547, MA 623.

In this course, students will learn through hands-on experience how to extract data from the web and analyze web-scale data using distributed computing. Students will learn different analysis methods that are widely used across the range of internet companies, from start-ups to online giants like Amazon or Google. At the end of the course, students will apply these methods to answer a real scientific question.

Career Outlook

National employment and job postings statistics for data science careers.

JOB TITLE | EMPLOYED | AVERAGE INCOME
Data Scientist | 33,200 | $126,800
Software Developers | 1,500,000 | $107,000
Information Security Analysts | 137,000 | $99,700
Computer and Information Systems Managers | 461,000 | $146,000
Computer Systems Analyst | 627,000 | $91,000

Source: Emsi Labor Market Data, 2021

MSDS ALUMNI HAVE GONE ON TO BE EMPLOYED AT THE FOLLOWING ORGANIZATIONS:

  • Amazon
  • Cityblock Health
  • Bank of America
  • Disney Streaming Services
  • New York University
  • Two Sigma Investments

PROGRAM ADMISSION REQUIREMENTS

  • Bachelor's Degree

    Minimum GPA of 3.0 from an accredited institution. Degree required to begin program; completion not required at time of application.

  • ACADEMIC TRANSCRIPTS

    You may submit unofficial transcripts during the application process. After admission, you will be required to submit official transcripts.

  • Two Letters of Recommendation

    Faculty members and/or professional colleagues.

  • TOEFL/IELTS/Duolingo Scores

    Required for international students.

  • Statement of Purpose

    Optional, but strongly recommended.

  • Resume

    Optional, but strongly recommended.

TEST SCORE ACCOMMODATIONS DURING THE NOVEL CORONAVIRUS OUTBREAK

Due to the impacts of the coronavirus (COVID-19) on testing centers around the world, Stevens has made the following accommodations available to all students for the fall 2022 admissions cycle:

  • TOEFL/IELTS/DUOLINGO: Affected applicants may submit Duolingo English Test (DET) results in lieu of TOEFL/IELTS exam results.

Tuition & Cost

  • Per Credit (30 credits): $1,776*
  • Application Fee: $60 (fee waivers available)
  • Enrollment Deposit: $250

*Tuition rates based on Fall 2022 tuition rate effective September 2022. Tuition and fees are subject to change annually.
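
As a rough illustration of the arithmetic behind these figures, the sketch below multiplies the listed per-credit rate by the 30 credits the program requires. It assumes the Fall 2022 rate holds for the whole program, which is not guaranteed since rates are subject to change annually. It also leaves out the application fee and enrollment deposit, because this page does not say how they are applied. The function name `estimated_tuition` is hypothetical and used only for this example.

```python
# Figures taken from the Tuition & Cost list above (Fall 2022 rate).
PER_CREDIT_USD = 1776
PROGRAM_CREDITS = 30


def estimated_tuition(per_credit: int = PER_CREDIT_USD,
                      credits: int = PROGRAM_CREDITS) -> int:
    """Return tuition in whole dollars, assuming a constant per-credit rate."""
    return per_credit * credits


if __name__ == "__main__":
    # Assumption: the rate does not change over the course of the program.
    print(f"Estimated tuition: ${estimated_tuition():,}")  # Estimated tuition: $53,280
```

Passing a different `per_credit` value gives a quick estimate if the rate changes in a later year.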

Key Dates & Deadlines

Term | Early Submit | Priority Submit | Final Submit | Start of Classes
Fall 2022 | May 23, 2022 | June 27, 2022 | August 29, 2022 | September 12, 2022
Spring 2023 | October 10, 2022 | November 21, 2022 | December 21, 2022 | January 23, 2023

  • Early Submit: $250 Deposit Waiver* and Application Fee Waiver Available.
  • Priority Submit: Application Fee Waiver Available and Early Application Review.

* Applicants who apply by the early submit deadline and are admitted may be eligible for a $250 deposit waiver. Applicants who receive education assistance from employers or other tuition discounts are not eligible. Other eligibility conditions may apply.

Upcoming Webinars

  • Thursday, August 18th, 7:00 PM ET: "What School is Right for You?"
  • Thursday, September 15th, 7:00 PM ET: Online MSDS Program and Application Forum
  • Thursday, September 22nd, 7:00 PM ET: "What School is Right for You?"
  • Wednesday, October 19th, 7:00 PM ET: Online MSDS Program and Application Forum
  • Thursday, October 27th, 7:00 PM ET: "What School is Right for You?"
  • Wednesday, November 9th, 7:00 PM ET: Online MSDS Program and Application Forum
  • Thursday, November 17th, 7:00 PM ET: "What School is Right for You?"

Faculty

Our faculty includes five National Science Foundation (NSF) CAREER winners as well as researchers who consult with companies such as Microsoft, IBM, Google, Bell Labs and other top industry firms.

  • Michael Zabarankin, Associate Professor and Interim Chair of the Department of Mathematical Sciences
  • Eduardo Bonelli, Teaching Professor
  • Darinka Dentcheva, Professor
  • Hadi Safari Katesari, Teaching Assistant Professor
  • Upendra Prasad, Lecturer
  • Pedro Andres Vilanova, Teaching Assistant Professor


GRE/GMAT Not Required