The objective of this course is to introduce students to the theory and methods of optimization used in data science. The first portion of the class covers elements of convex analysis and subgradient calculus for non-smooth functions, optimality conditions for differentiable and non-smooth optimization problems, and Lagrangian duality. The main part of the class discusses numerical methods for optimization, with a focus on applications to problems arising in data science. Approaches to large-scale (big-data) optimization include decomposition methods, the design of distributed and parallel optimization methods, and stochastic approximation methods. Examples of optimization models in classification, clustering, statistical learning, and compressed sensing will be discussed to illustrate the theoretical and numerical challenges and to demonstrate the scope of applications.
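To give a flavor of the methods the course covers, the following is a minimal sketch of a stochastic subgradient method applied to a non-smooth problem from classification: minimizing a hinge-loss objective with quadratic regularization, f(w) = (1/n) Σᵢ max(0, 1 − yᵢ⟨xᵢ, w⟩) + (λ/2)‖w‖². The function name, the toy data, and the 1/(λt) step-size schedule are illustrative choices, not material from the course itself.

```python
import random

def stochastic_subgradient_svm(data, lam=0.01, steps=5000, seed=0):
    """Minimize f(w) = (1/n) * sum_i max(0, 1 - y_i <x_i, w>) + (lam/2)||w||^2
    by stochastic subgradient descent with step size 1/(lam * t).

    `data` is a list of (x, y) pairs with x a feature list and y in {-1, +1}.
    """
    rng = random.Random(seed)
    d = len(data[0][0])
    w = [0.0] * d
    for t in range(1, steps + 1):
        x, y = data[rng.randrange(len(data))]      # sample one training point
        eta = 1.0 / (lam * t)                      # diminishing step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        for j in range(d):
            # A subgradient of the hinge term is -y*x when the margin is
            # violated (margin < 1) and 0 otherwise; the regularizer
            # contributes the ordinary gradient lam * w.
            g = lam * w[j] - (y * x[j] if margin < 1 else 0.0)
            w[j] -= eta * g
    return w

# Toy linearly separable data: the label is the sign of the first feature.
data = [([2.0, 1.0], 1), ([1.5, -0.5], 1), ([-2.0, 0.5], -1), ([-1.0, -1.0], -1)]
w = stochastic_subgradient_svm(data)
```

Because the hinge loss is non-smooth at the margin boundary, an ordinary gradient does not exist everywhere; the subgradient calculus developed in the first part of the course is exactly what justifies updates of this kind.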