A three-week boot camp of in-depth, hands-on training
Call us at 1.855.LEARN.AI for more information.
START DATE | Monday, March 20th, 2023
DURATION | Three weeks
LAB SESSION DAYS | Every Monday, Tuesday, Wednesday, and Thursday
LAB SESSION TIMING | 10 AM – 5 PM PST (includes a one-hour lunch break)
CLINIC/HELP SESSIONS | Sunday morning PST (optional, if you need help)
Workshop Overview
The Spring Session Starts Monday Morning, March 20th, 2023
This workshop is a gentle but in-depth and comprehensive introduction to the field of Machine Learning. It places equal emphasis on understanding the theoretical foundations and on hands-on experience with real-world data analyses on the Google Cloud Platform. The workshop comprises over 100 hours of training across twelve all-day sessions over three weeks: in-depth lecture/theory sessions, guided labs for building AI models, quizzes, projects on real-world datasets, guided readings of influential research papers, and discussion groups.
The teaching faculty for this workshop comprises the instructor, a supportive staff of teaching assistants, and a workshop coordinator. Together they facilitate learning through close guidance and one-on-one sessions when needed.
You can attend the workshop in person or remotely. State-of-the-art facilities and instructional equipment ensure that the learning experience is the same in either mode. Of course, you can also mix the two: attend the workshop in person when you can, and remotely when you cannot. All sessions are live-streamed, as well as recorded and made available on the workshop portal.
NOTE: This comprehensive workshop merges the two previously separate data science workshops: ML100 (Introduction to ML) and ML200 (Intermediate ML).
- Lectures
- Labs
- Quizzes
- Projects
- Papers
Overview of weekly activities
Each week will focus on one outcome: mastering a specific topic in data science. To achieve this outcome, we will cover the relevant theory in simplified but extensive depth. We will follow it with hands-on labs. There will also be a guided reading of an important research paper on the topic.
Finally, we will assess our progress with a quiz that covers the topic, as well as a hands-on data science project, applying what we have learned to various real-world datasets.
To summarize, each lab day will include
- 10 AM to 1 PM: Theory/lecture session (3 hours)
- 1 PM to 2 PM: Lunch break
- 2 PM to 5 PM: Hands-on guided lab (3 hours)
In addition, each week:
- THURSDAY MORNING: A quiz is released online to review the theory and the practical aspects learned
- SATURDAY, 10 AM: The weekly project is released, a hands-on AI project based on the topic of the week, with technical help and support from the teaching assistants
- SUNDAY NOON: (Optional) Guided reading of an important research paper in the field
- SUNDAY, 4 PM: (Optional) Review of the quiz; help with the data science projects
Capstone Project
Each participant starts on a capstone project that spans the three-week duration of the workshop. The teaching staff will work closely with you, providing guidance as you pursue the project's milestones.
A capstone project can be done in groups of no more than 4 participants, and must fulfill the following criteria:
- Original work: it must present either a machine-learning model of a new dataset, or a new model or approach to an existing dataset. It may contain the group's own code implementation of an algorithm or approach suggested in a recent research paper.
- Blog: optionally, a blog describing the work and the experience of the project.
- Presentation: an end-of-workshop technical presentation to the batch of participants.
Target Audience
Theory/Lecture Topics
Covariance, Correlation & Causation
We study the covariance between two variables and its geometric intuition. Next, we learn about feature standardization, Pearson correlation, and its relationship to linear regression with a single predictor. We also study the phenomenon of regression towards the mean. Correlation does not imply causation, though conflating the two is a common fallacy; we will delve into this more deeply.
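As a taste of the labs, here is a minimal sketch (on synthetic data, not a workshop dataset) of how covariance and the Pearson correlation relate through standardization:

```python
import numpy as np

# Synthetic data: y is a noisy linear function of x
rng = np.random.default_rng(seed=0)
x = rng.normal(size=500)
y = 2.0 * x + rng.normal(scale=0.5, size=500)

# Covariance matrix: the off-diagonal entry is cov(x, y)
cov_xy = np.cov(x, y)[0, 1]

# Pearson correlation = covariance of the standardized variables
x_std = (x - x.mean()) / x.std()
y_std = (y - y.mean()) / y.std()
r = np.mean(x_std * y_std)

print(cov_xy, r, np.corrcoef(x, y)[0, 1])  # r matches np.corrcoef
```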
Regression
We will study linear regression, the concept of least squares, and gradient descent to minimize the sum of squared errors. Then we will study ordinary least-squares linear regression, polynomial regression and the Runge phenomenon, nonlinear least squares, and Box-Cox transformations. We will also learn about residual analysis and other model-diagnostic techniques, and get introduced to alternative loss functions.
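A minimal scikit-learn sketch of the first two ideas, on synthetic data rather than a workshop dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a mild nonlinearity
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = 1.5 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.3, size=200)

# Ordinary least-squares fit
ols = LinearRegression().fit(X, y)

# Degree-2 polynomial regression: expand the features, then fit OLS
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

# Residuals of the linear fit reveal the curvature the model missed
residuals = y - ols.predict(X)

print(ols.score(X, y), poly.score(X, y))  # R^2 improves with the quadratic term
```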
Regularization
Regularization reduces overfitting and high variance by adding a penalty term to the regression loss function, whose strength is controlled by a regularization hyperparameter. Additionally, we see a geometric interpretation of this term in terms of the Minkowski distance. Lasso (L1), Ridge (L2), and Elastic-Net regularization are covered in the data-science lab exercises.
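As a hedged preview of those lab exercises, a minimal scikit-learn sketch on synthetic data (alpha is the regularization hyperparameter):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 20))   # more features than the signal needs
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

# Larger alpha means a stronger penalty; standardize features first
ridge = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)
lasso = make_pipeline(StandardScaler(), Lasso(alpha=0.1)).fit(X, y)

# Lasso's L1 penalty drives irrelevant coefficients exactly to zero;
# Ridge's L2 penalty only shrinks them
print((lasso[-1].coef_ != 0).sum(), "nonzero Lasso coefficients out of 20")
print((ridge[-1].coef_ != 0).sum(), "nonzero Ridge coefficients out of 20")
```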
Classification
We will study classification and the learning of decision boundaries in the predictor space. In particular, we will study the Logistic Regression classifier and the Linear and Quadratic Discriminant Analysis classifiers. We will study goodness-of-fit diagnostics such as the confusion matrix, precision, recall, accuracy, the ROC curve, and the area under the ROC curve.
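A minimal hedged sketch with scikit-learn, using a synthetic dataset rather than the workshop's own data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
prob = clf.predict_proba(X_te)[:, 1]   # scores underlying the ROC curve

print(confusion_matrix(y_te, pred))
print(precision_score(y_te, pred), recall_score(y_te, pred),
      roc_auc_score(y_te, prob))
```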
Clustering
We study three different approaches to clustering data in the feature space:
- K-Means clustering and its close variants; selecting the optimal number of clusters through scree plots
- Agglomerative (hierarchical) clustering, dendrograms, and various linkage functions
- Density-based clustering techniques such as DBSCAN, OPTICS, and DENCLUE
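All three families share the same scikit-learn interface; a minimal sketch on synthetic blobs (not a workshop dataset):

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=0)

# K-Means: inertia_ is the quantity plotted on a scree (elbow) plot
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Agglomerative clustering with a chosen linkage function
agg = AgglomerativeClustering(n_clusters=4, linkage="ward").fit(X)

# Density-based clustering: the label -1 marks noise points
db = DBSCAN(eps=0.5, min_samples=5).fit(X)

print(km.inertia_, set(db.labels_))
```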
Dimensionality Reduction
We will study a few techniques for dimensionality reduction. Primarily we will focus on Principal Component Analysis and its geometrical interpretation. We will relate it to the covariance matrix and discuss the class of datasets on which PCA works best. We will also cover simpler approaches such as backward and forward selection, and the Lasso.
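A minimal hedged sketch of PCA in scikit-learn, using the built-in iris dataset rather than a workshop dataset:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data
X_std = StandardScaler().fit_transform(X)   # standardize before PCA

# PCA diagonalizes the covariance matrix of the standardized data
pca = PCA(n_components=2).fit(X_std)
X_2d = pca.transform(X_std)

# Each ratio is the fraction of total variance along a principal axis
print(pca.explained_variance_ratio_)
```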
Art of Feature Engineering
The predictive efficacy of machine-learning algorithms is greatly amplified by engineering features from the raw input feature space. A meticulous process of exploratory data analysis (EDA), in search of clues for meaningful feature extraction, can often transform a simple algorithm that performed poorly on the original feature space into an extraordinarily effective one in the new feature space.
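A tiny illustrative sketch in pandas; the column names here are hypothetical, not drawn from any workshop dataset:

```python
import numpy as np
import pandas as pd

# A tiny illustrative frame; these column names are made up for the example
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2023-03-20 09:15", "2023-03-21 18:40"]),
    "price": [120000.0, 95000.0],
    "sqft": [1500, 1100],
})

# Extracted features often expose structure the raw columns hide
df["hour"] = df["timestamp"].dt.hour            # time-of-day effects
df["dayofweek"] = df["timestamp"].dt.dayofweek  # weekly seasonality
df["log_price"] = np.log1p(df["price"])         # tame a skewed target
df["price_per_sqft"] = df["price"] / df["sqft"] # a ratio feature

print(df.head())
```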
Approximate & Kernel k-NN
Approximate & Kernel k-NN is a lazy-learning, non-parametric method that proves remarkably effective in certain situations. We will learn about the interesting notions of distance or similarity that underlie the search for nearest neighbors. We will learn about the Curse of Dimensionality, its origin, and the methods to deal with it. We will study how the choice of "k" governs the bias-variance tradeoff. Finally, we will learn about a rich collection of distance kernels that mitigate the need for tuning the hyperparameter "k".
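A minimal sketch of plain k-NN (the approximate and kernel variants covered in class go beyond this), showing how "k" moves the bias-variance tradeoff on synthetic data:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)

# Small k: low bias, high variance; large k: the reverse
for k in (1, 5, 25):
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(k, round(score, 3))
```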
Decision Trees
Decision trees provide a versatile means to adapt to non-linearities in the feature space, and are employed for classification as well as regression. Their strength resides in the very intuitive tree-structured representation of how they iteratively partition the feature space. Thus they are valued for their interpretability while at the same time being remarkably powerful.
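That interpretability is easy to see in practice; a minimal sketch on the built-in iris dataset prints the learned partitions as readable rules:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(iris.data, iris.target)

# The printed rules show the axis-aligned partitions of the feature space
print(export_text(tree, feature_names=list(iris.feature_names)))
```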
Ensemble Methods: Bagging, Boosting & Stacking
Among the most powerful concepts to have emerged in machine learning is ensemble learning. Rather than using a single learner and making it as powerful as possible so that it arrives at a good hypothesis, a crowd of weak learners is assembled, each forming its own hypothesis. These hypotheses are then methodically synthesized into predictions with much greater overall predictive power.
RandomForest & Extremely Randomized Trees
RandomForest is a powerful approach to both classification and regression that employs an ensemble of decision trees and bagging to arrive at a consensus vote or score. Extremely Randomized Trees are a variation of the same approach that exhibits lower variance.
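Both ensembles share the same scikit-learn interface; a minimal hedged sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Both aggregate many randomized trees; Extra-Trees additionally randomizes
# the split thresholds, trading a little bias for lower variance
rf = RandomForestClassifier(n_estimators=200, random_state=0)
et = ExtraTreesClassifier(n_estimators=200, random_state=0)

print(cross_val_score(rf, X, y, cv=5).mean())
print(cross_val_score(et, X, y, cv=5).mean())
```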
Gradient Boosting Methods
Gradient Boosting is a powerful ensemble method that has proven remarkably effective in recent years. As a result, there is a great deal of research activity devoted to creating ever better-performing and more predictive implementations.
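A minimal sketch using scikit-learn's own implementation (XGBoost and CatBoost, covered in the labs, expose a similar fit/predict interface):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each new tree fits the gradient of the loss w.r.t. the current predictions;
# learning_rate shrinks each tree's contribution
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3)
gbm.fit(X_tr, y_tr)
print(gbm.score(X_te, y_te))
```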
Support Vector Machines
Support Vector Machines provide a systematic way to linearize the decision boundary by transforming the original feature space into another where the various classes are separated by a maximal-margin linear classifier.
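A minimal hedged sketch with scikit-learn on a deliberately non-linear synthetic dataset:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)

# The RBF kernel implicitly maps points into a space where a maximal-margin
# linear separator exists; C trades margin width against training errors
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
print(cross_val_score(svm, X, y, cv=5).mean())
```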
No Free Lunch Theorems
The No Free Lunch theorems, despite their frivolous-sounding name, are a fundamental guideline: no one algorithm is more "powerful" than any other on average. Different algorithms make different underlying assumptions about the ground truth and are therefore well suited to different datasets; averaged over all possible datasets, they perform equivalently.
Capstone ML-200 Project
A Capstone project where each participant presents an extensive data-science journey over a non-trivial dataset of choice.
- Google Cloud Platform basics for data-science exploratory notebooks
- Linear, Polynomial, and Non-Linear Regression, along with power-transforms and basis changes to linearize datasets, Principal Components Regression
- Classifiers: Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis
- Sampling, Bootstrapping, and Cross-Validation
- Regularization: Ridge and Lasso for Regression
- Clustering: Agglomerative, the K-Means family of clusterers, density-based clusterers (DBSCAN, OPTICS, DENCLUE), and Expectation Maximization
- Dimensionality reduction with Principal Component Analysis, matrix factorization, t-SNE, and UMAP
- Geometrical background and intuition behind the major algorithms
- Bias-variance trade-off, model vs. data complexity, and hyperparameter tuning through grid search
- Getting started with Deep Neural Networks
- Kernel Methods and Support Vector Machines
- Ensemble Methods: Decision Trees, RandomForest, and XGBoost
- Interpretability of machine-learning models
- Hyperparameter optimizations and an introduction to automated machine learning
Guided Labs and Projects
The Basics
- Setting up the AI development environment in Google Cloud (Jupyter notebooks, Colab, Kubernetes)
- Introduction to Pandas and Scikit-learn for data manipulation and model diagnostics, respectively
- Creating interactive data-science applications with Streamlit
- Data visualization techniques
- Kubeflow: Model development life cycle
- Models as a service
Google Cloud AI Platform
- GKE (Google Kubernetes Engine)
- Selecting the right compute-instances and containers for deep learning
- Colab and Notebooks in GCP
- Going to production in GCP
- Recommendations AI (if time permits)
Core Topics
- Exploring Numpy and SciPy
- Linear regression with Scikit-Learn
- Model diagnostics with Scikit-Learn and Yellowbrick
- Residual analysis
- Power transforms (Box-Cox, etc.)
- Polynomial regression
- Regularization methods
- Classification with Logistic Regression
- LDA and QDA
- Dimensionality reduction
- Clustering algorithms (k-Means, EM, hierarchical, and density-based methods)
Explainable AI
- Interpretability of AI models
- LIME (Local Interpretable Model-agnostic Explanations)
- Shapley additive explanations (SHAP)
- Partial dependence plots
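Of these, partial dependence plots are the quickest to try. A minimal hedged sketch with scikit-learn's inspection module, using the built-in diabetes dataset rather than a workshop dataset (matplotlib required):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

data = load_diabetes()
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

# Partial dependence: the model's average prediction as one feature varies,
# marginalizing over the rest (feature index 2 is 'bmi' in this dataset)
PartialDependenceDisplay.from_estimator(model, data.data, features=[2])
plt.show()
```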
Ensemble Methods
- Decision trees and pruning
- Bagging and Boosting
- RandomForest and its variants
- Gradient Boosting, XGBoost, CatBoost, etc.
Hyperparameter Optimization
- Grid search
- Randomized search
- Basic introduction to Bayesian optimization
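A minimal hedged sketch of the first two search strategies listed above, using scikit-learn on synthetic data (the parameter grid here is purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=15, random_state=0)

param_grid = {"n_estimators": [100, 200], "max_depth": [3, 6, None]}

# Grid search: exhaustive, cross-validated sweep over the parameter grid
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))

# Randomized search: samples a fixed budget of candidates instead
rand = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                          param_grid, n_iter=4, cv=5, random_state=0)
rand.fit(X, y)
print(rand.best_params_)
```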
AI Recommendation Systems
(Using Surprise, etc.)
- Memory based recommenders
- Model based recommenders
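Since these labs use the Surprise library, here is a minimal hedged sketch of a model-based (matrix-factorization) recommender; it assumes the built-in MovieLens-100k dataset, which Surprise downloads on first use:

```python
# Requires: pip install scikit-surprise
from surprise import SVD, Dataset
from surprise.model_selection import cross_validate

# MovieLens 100k is fetched on first use
data = Dataset.load_builtin("ml-100k")

# SVD is a model-based recommender: it factorizes the user-item rating matrix
algo = SVD(n_factors=50, random_state=0)
cross_validate(algo, data, measures=["RMSE", "MAE"], cv=5, verbose=True)
```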
Teaching Faculty
Asif Qamar
University of Illinois at Urbana-Champaign (UIUC)
About the instructor
Background
Over more than two decades, Asif's career has spanned two parallel tracks: as a deeply technical architect and as a passionate educator. While he primarily spends his time technically leading research and development efforts, he finds expression for his love of teaching in the workshops he offers over the weekends. Through these, he aims to mentor and cultivate the next generation of great technical craftsmen.
Educator
He has also been an educator for the last 29 years, teaching various subjects in programming, machine learning, and physics. He has taught at the University of California, Berkeley Extension, the University of Illinois at Urbana-Champaign (UIUC), and Syracuse University. Besides this, he has given a large number of workshops, seminars, and talks at technical workplaces.
He has been honored with various excellence-in-teaching awards at universities and technical workplaces.
Teaching assistants
There will be a staff of teaching assistants helping and guiding you with the labs, as well as helping you understand the concepts. They will monitor the discussion groups, and many will be available on campus to answer your questions. You can also reach out to them individually.
- Kate Amon
- Kunal Lall
- Harini Datla
- Dennis Shen
- Shefali Qamar
Schedule
The workshop starts on Monday, March 20th, 2023, at 10 AM Pacific Time.
Lab-session attendance (Monday through Thursday) is essential; Sunday activities are optional. The schedule for each lab day:
- Theory: 10 AM to 1 PM
- Guided Lab: 2 PM to 5 PM
- (Optional attendance) Paper reading, quiz review, and project help: Sunday, noon onwards
For in-person participation
Venue
Prerequisites
It would help if you have basic fluency in Python. If you do not have the necessary Python background, you should attend the (free and optional) Python programming sessions at SupportVectors before this workshop starts. We will use Python as the primary programming language, and offer optional labs in R for those who would like to master data science in both languages.
No other programming or mathematical background is required, though a mathematical background can give you a better appreciation for some of the things you will learn in the workshop.
Financial Aid and Tuition Discounts
Financial aid is available to three students as a work-study program. Reach out to the teaching staff if you are interested, and to see if you qualify.
- Participants with a disability: 25% to 100% discount, based on the disability.
- Living in a developing nation: $500 discount for participants living in a developing nation such as India, Sri Lanka, Bangladesh, China, or an African nation.
- Veterans or currently serving members of the US military: $500 discount.
In-Person vs Remote Participation
What workshop participants have to say…
The instructor is exceptionally well versed in topics and has the best didactic approach of any teacher/instructor I have had in 30+ years of post-graduate studies . . . reminds me of Richard Feynman and his great books on physics. Asif's geometric approach is profoundly … illuminating and prods me to learn more.
…explained from a geometric perspective that is not easily found in books…It is a big facility that can seat about 50 people. Breakfast, lunch, and snacks are provided. I think the greatest part of this (or any other class that Asif offers for that matter) is that Asif makes all complex math behind algorithms look extremely intuitive by going into the geometry… I would recommend this course or any course by Asif to anyone.
…offers a very thorough, intensive, and yet remarkably beginner-friendly way for students of varying expertise to study Machine Learning. Asif Qamar, with his decades of experience in both Machine Learning and mathematics, masterfully teaches difficult and intricate topics in a way that students can easily understand complex real-world applications. Asif achieves this through his inextinguishable passion for both the field and academia. He has created resources that are excellent for reference far beyond the scope of the class. There is very little doubt that I would recommend Asif to a friend, and I will definitely be continuing on to the higher course.
Asif is very good at teaching, and his explanations linking geometry to the math were very impressive. I wish all engineering math teachers in India could have his guidance.
Chapter-wise, the content covered a broad spectrum, which was good. It gave me a strong background in Machine Learning and made my fundamentals strong.
The facility is amazing hands down.
Note to Instructor: You have so much knowledge about every field that gives me a lot of motivation and inspiration. Machine Learning was covered in good depth. I loved the way of teaching.
I am definitely coming back. Yes, I will recommend it to friends.
Excellent facility, unlimited access, very neat, clean, and quiet training room and lab environment. Course material: very comprehensive, with an excellent mix and coverage of theory and labs. Instructor: very knowledgeable and passionate about teaching, and makes sure that one really understands the concept … very good analogies and examples to explain the subject. Overall, the best instructor I have ever experienced.