Data Science

Spring 2020

Jan 24

Questions?

I've posted a tentative schedule on the syllabus page.

jupyter

Getting started with jupyter.marlboro.college:

How do you find documentation for all this stuff?

google search
-------------
jupyter tutorial
jupyter docs
github data science from scratch
linux terminal tutorial

setting up your workspace

In a terminal :

 $ mkdir data_science
 $ cd data_science
 $ git clone https://github.com/joelgrus/data-science-from-scratch.git
 $ ln -s data-science-from-scratch/scratch scratch

In the jupyter hub:

 * look at what you have
 * click on "scratch"
 * click on "introduction.py"

In a new notebook (name it "chap1"):

 from scratch.introduction import users, friends_of_friends

exploring

Let's try to do some of what he does in chapter 1 on our own.
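
As a starting point, here is a minimal sketch of the kind of exploration chapter 1 does: counting friends per user and tallying friends of friends. The tiny network below is made up for illustration (it is not the book's data), and these versions of number_of_friends and friends_of_friends are my own small rewrites, not necessarily what is in scratch/introduction.py.

```python
from collections import Counter

# Hypothetical toy network: each (i, j) pair means users i and j are friends.
friendship_pairs = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]

# Build {user_id: [friend_ids]}; friendship goes both ways.
friendships = {}
for i, j in friendship_pairs:
    friendships.setdefault(i, []).append(j)
    friendships.setdefault(j, []).append(i)

def number_of_friends(user_id):
    """How many friends does this user have?"""
    return len(friendships[user_id])

def friends_of_friends(user_id):
    """Count mutual friends for each friend-of-a-friend who is not already a friend."""
    return Counter(
        foaf_id
        for friend_id in friendships[user_id]
        for foaf_id in friendships[friend_id]
        if foaf_id != user_id and foaf_id not in friendships[user_id]
    )

# User ids ordered from most friends to fewest.
by_popularity = sorted(friendships, key=number_of_friends, reverse=True)

print(number_of_friends(1))     # 3
print(friends_of_friends(0))    # Counter({3: 2})
print(by_popularity)            # [1, 2, 3, 0, 4]
```

Once the import from scratch.introduction works in your notebook, the same questions can be asked of the book's data instead of this toy list.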

... depending on time ...

I've put what we did in class into the ../code/jan24/ folder.

python

First: review some python 3 basics ... check out the resources page for docs & cheatsheet.

Second: chap 2 summary. Much of this you've seen before ... but some you haven't.

The author has his own coding style. Your mileage may vary.

 * zen of python  (beautiful; explicit; simple)
 * getting python / virtual environments (we'll use jupyter.marlboro.college)
 * whitespace / modules / functions (all seen in 'intro programming')
 * strings (new: f-strings)
 * exceptions / lists / tuples / (all seen in 'intro programming')
 * dictionaries (new: defaultdict)
 * Counter object (convenient way to create {thing:count} dictionaries)
 * sets : {1,2,3} or set()
 * control flow: for, while, if
 * truthiness: bool(), and, or, all(), any()
 * sorting: list.sort() in place; sorted()
 * comprehensions:
   see https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Comprehensions.html
 * tests: assert (I used doctests in 'intro programming'.)
 * objects: (seen in 'intro programming')
 * iterables and generators:
   efficient (i.e. 'lazy') but may be awkward to invoke or appear unexpectedly
 * randomness (seen in 'intro programming')
 * regular expressions: powerful but complicated
 * functional programming: Often this is a style choice; "map", "filter",
   and other "data flow" operations. What the author calls 'pythonic'
   is typically not 'functional' per se in the way that
   some languages (e.g. Haskell) are.
 * types: a new and awkwardly handled recent python feature,
   but an important concept to understand, much more central
   (for good reasons) in many other languages. The python language
   sacrifices type safety for coding ease and convenience. As the
   size of coding projects grows, this trade-off becomes more problematic.
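
A few of the items flagged as new in the list above (f-strings, defaultdict, Counter) along with sorting, comprehensions, and truthiness can be tried in a few lines. This is a small sketch with a made-up word list, not code from the book.

```python
from collections import Counter, defaultdict

# Hypothetical sample data.
words = ["data", "science", "from", "scratch", "data", "data", "science"]

# defaultdict: a missing key gets a default value (here int() == 0).
counts = defaultdict(int)
for word in words:
    counts[word] += 1

# Counter does the same {thing: count} bookkeeping in one step.
word_counts = Counter(words)
top_word, top_count = word_counts.most_common(1)[0]

# f-string: expressions inside {} are evaluated and formatted in place.
summary = f"'{top_word}' appears {top_count} times"

# sorted() returns a new list; list.sort() would change a list in place.
by_length = sorted(set(words), key=len)
lengths = [len(word) for word in by_length]       # list comprehension
long_words = {w for w in words if len(w) > 4}     # set comprehension

# truthiness: any() and all() combine many tests into one bool.
has_long = any(len(w) > 6 for w in words)
all_lower = all(w.islower() for w in words)

print(summary)    # 'data' appears 3 times
print(lengths)    # [4, 4, 7, 7]
```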

We walk through some code examples.
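
One possible example to walk through, covering generators, assert-style tests, and type annotations. The function names squares and first_n are mine, chosen for illustration.

```python
from itertools import islice
from typing import Iterator, List

def squares() -> Iterator[int]:
    """Lazily yield 0, 1, 4, 9, ... forever; nothing is computed until asked."""
    n = 0
    while True:
        yield n * n
        n += 1

def first_n(values: Iterator[int], n: int) -> List[int]:
    """Pull just the first n items from a (possibly infinite) iterator."""
    return list(islice(values, n))

# assert as a lightweight test: silent if true, AssertionError if false.
first_five = first_n(squares(), 5)
assert first_five == [0, 1, 4, 9, 16]

# A generator is used up as you go: a second pass gets nothing.
evens = (x for x in range(10) if x % 2 == 0)
total = sum(evens)        # 0 + 2 + 4 + 6 + 8 == 20
leftover = list(evens)    # [] because evens is already exhausted

# Type annotations are not enforced at run time: this call passes
# strings where ints were promised, and python runs it anyway.
surprise = first_n(iter("abc"), 2)    # ['a', 'b']
```

The last line is the sort of thing a separate type checker would flag; python itself does not.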

https://cs.marlboro.college/cours/spring2020/data/notes/jan24
last modified Fri January 24 2025 7:52 pm