Data Science

info

title       Data Science
term        Spring 2020
credits     4
time        Tues/Fri 1:30 - 2:50pm
level       Intermediate
place       Brown Science / Sci 217
faculty     Jim Mahoney
repeat      no, cannot be repeated for credit
prereq      previous programming experience and some facility with math

textbook

Data Science from Scratch: First Principles with Python, 2nd Edition, by Joel Grus, ISBN 1492041130

blurb

Data science combines data analysis, computing, and numerical methods to analyze and understand large collections of numbers from all sorts of sources. It's been gaining popularity lately as a paradigm for interpreting everything from movie recommendations to image recognition. Using the Python programming language, this course will explore the basics of data science through statistics, numerical visualization, and machine learning.

schedule

  week  0  Jan 23 : chap 1              jupyter, terminal, getting started
  week  1  Jan 28 : chap 2, 9           coding review & practice
  week  2  Feb  4 : chap 3, 4           visualization ; matrices
  week  3      11 : chap 5, 6           statistics & probability
  week  4      18 :      6, 7           ... 
  week  5      25 : chap 10, 11         machine learning intro ; look at kaggle
  week  6  Mar  3 : chap 12                * k neighbors 
  week  7      10 : chap 13                * naive bayes : spam filter
   -- spring break --
  week  8      31 : chap 14, start 8       * regression
  week  9  Apr  7 : chap 15, finish 8        ...       
  week 10      14 : chap 18                * neural nets
  week 11      21 : projects 1               
  week 12      28 : projects 2
  week 13  May  5 : presentations

textbook chapters - topics summary

summary of chapters
   1  intro
   2  python                                  | coding background
   3  visualize            | math background
   4  linear algebra       |
   5  statistics           |
   6  probability          |
   7  hypothesis tests     |
   8  gradient descent           | math aside
   9  data input                             | more coding background
  10  data exploring         | getting off the ground
  11  machine learning       | overview of methods
  12  k-nearest neighbors      | method 1
  13  naive bayes              | method 2
  14  linear regression        | method 3, part 1  (needs gradient descent)
  15  multiple regression      |   method 3, part 2
  16  logistic regression      |   method 3, variation
  17  decision trees           | method 4
  18  neural networks          | method 5, part 1
  19  deep learning            |   method 5, part 2
  20  clustering               | method 6
  21  natural language       | problem type 1
  22  network analysis       | problem type 2
  23  recommender systems    | problem type 3
  24  databases and SQL                | related topic 1
  25  MapReduce                        | related topic 2
  26  ethics                           | related topic 3
  27  epilog
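
As a rough illustration of why chapter 14 is listed as needing chapter 8, here is a minimal from-scratch sketch that fits a straight line by gradient descent on the mean squared error. It is not taken from the textbook: the function names (mse_gradient, fit_line), the toy data, the learning rate, and the step count are all assumptions chosen for illustration.

```python
# Sketch only: fit y ~ slope * x + intercept by gradient descent.
# All names and numbers here are hypothetical, picked for illustration.

def mse_gradient(xs, ys, slope, intercept):
    """Return the partial derivatives of the mean squared error
    with respect to slope and intercept."""
    n = len(xs)
    errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
    grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_intercept = 2 * sum(errors) / n
    return grad_slope, grad_intercept


def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Start from slope = intercept = 0 and repeatedly step downhill."""
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        g_slope, g_intercept = mse_gradient(xs, ys, slope, intercept)
        slope -= learning_rate * g_slope
        intercept -= learning_rate * g_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(round(slope, 3), round(intercept, 3))
```

With noisy or larger data the learning rate and number of steps would likely need tuning; the values above are only known to behave on this small example.
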
https://cs.marlboro.college/cours/spring2020/data/syllabus
last modified Sun September 27 2020 3:44 am