Sep 6
discuss homework
Notes:
- Public facing web pages: please include contact info, date, and context. (And discuss why that's a good habit to get into.)
- Code: likewise, please always have author, date, and context in the header. Good habits to get into.
- Think in terms of functions (or methods) when you write code; those are the testable, reusable pieces.
- Repetitive coding structures are usually bad: put the repeated work in a loop, and pull the data out of the logic.
- Show explicitly in the comments, a README.txt, or a screenshot what your code does.
- Comments are for things you can't infer from the code. Give examples of good & bad.
The assignment for next Tues is up. (There may be too much verbiage in it. The main point is to understand POST, GET, forms, and cookies, and to see explicitly the information being passed in the headers and/or body, from the command line and/or with tools like the Chrome browser's Developer mode.)
HTTP (Hypertext Transfer Protocol) and all that
(Continue the conversation we started Tuesday.)
Big picture so far:
- Essential technologies in "client-side" browser:
- html - page content
- css - appearance
- js - actions, written in the JavaScript programming language, using the DOM (tags in the page are objects)
We're assuming you've had some exposure to the first two. We'll be
looking at the third more closely soon.
But first, here is the key question we want to understand next :
What - exactly - happens when a browser requests a web page?
* What is the data?
* In what form?
* Sent from what?
* Caught by what?
Your mission over the weekend is to explore the answer to that question
using some interactive tools (browser with developer features, command prompt "telnet host 80",
wget|curl, etc) and by online reading|googling.
Here's the starting point for the answer :
Visit the Wikipedia page on HTTP. Point to the related articles. (This is in the week's reading assignment.)
Telnet session example from command prompt :
# connect to port 80 (web server)
$ telnet cs.marlboro.edu 80
# Do you speak HTTP ?
> GET / HTTP/1.1
> Host: cs.marlboro.edu
# The server answers:
> HTTP/1.1 200 OK
> Date: ...
> Server: ...
# Other (request ... response) pairs
# (all with 2nd line 'Host: cs.marlboro.edu', followed by a blank line)
GET foo HTTP/1.1 ... HTTP/1.1 400 Bad Request
GET /foo HTTP/1.1 ... HTTP/1.1 404 Not Found
HEAD / HTTP/1.1 ... HTTP/1.1 200 OK
# To www.marlboro.edu
HEAD / HTTP/1.1                      # request
Host: marlboro.edu
HTTP/1.1 301 Moved Permanently       # response
Location: http://www.marlboro.edu/
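The telnet session above can also be scripted. Here is a minimal sketch in Python, assuming plain HTTP on port 80 and that the host is still reachable; the function names (build_request, parse_status_line, fetch) are made up for this example. It shows that a request is nothing more than a few lines of text ending in a blank line, and that the reply starts with a status line.

```python
import socket


def build_request(method, path, host):
    """Return the raw bytes of a minimal HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to hang up when it is done
        "",                   # the blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a reply into (version, status code, reason)."""
    first_line = response.split(b"\r\n", 1)[0].decode("iso-8859-1")
    parts = first_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason


def fetch(method, path, host, port=80):
    """Send one request over a plain TCP socket and return the raw reply."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(method, path, host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:      # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Example (needs network access):
#   print(parse_status_line(fetch("HEAD", "/", "cs.marlboro.edu")))
```

Printing build_request("GET", "/", "cs.marlboro.edu") shows exactly the bytes that were typed by hand in the telnet session, plus one extra "Connection: close" header so the script knows when the reply is finished.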
Concepts:
- caching
- one web page = many files (.html, .jpg, .css, ...)
- "dynamic" pages
- simple page : one static (unchanging) page
- complex page : maps.google.com AJAX fancy stuff
Use Chrome developer features and/or Firefox + extensions to watch this happen.
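To make "one web page = many files" concrete, here is a rough sketch that lists the extra files a page asks the browser to fetch. The sample page and the ResourceLister class are invented for illustration, and the tag list is deliberately short (real browsers follow many more kinds of references).

```python
from html.parser import HTMLParser


class ResourceLister(HTMLParser):
    """Collect the URLs of extra files referenced by a page."""

    # tag name -> attribute holding the URL (a deliberately short list)
    URL_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append(value)


# A made-up page, standing in for whatever the server sent back.
SAMPLE_PAGE = """
<html><head>
  <link rel="stylesheet" href="style.css">
  <script src="app.js"></script>
</head><body>
  <img src="photo.jpg" alt="a photo">
</body></html>
"""

lister = ResourceLister()
lister.feed(SAMPLE_PAGE)
print(lister.resources)  # ['style.css', 'app.js', 'photo.jpg']
```

Each entry in that list is a separate HTTP request, which is what shows up as separate rows in the browser's network panel.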
Buzz words:
- HTML form tag : send data along with web page request
- HTTP GET : "idempotent" - server isn't supposed to change (much) ; data in URL
- HTTP POST : upload new data to server ; data in body of request
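The difference between "data in URL" and "data in body" is easy to see by writing out both requests for the same form fields. This is only a sketch: the host example.com, the path /signup, the field names, and the form_request helper are all hypothetical.

```python
from urllib.parse import urlencode


def form_request(method, host, path, fields):
    """Return the text of the HTTP request a form submission would produce."""
    data = urlencode(fields)  # e.g. "name=Ada+Lovelace&color=blue"
    if method == "GET":
        # GET: the form data rides along in the URL, after the "?"
        lines = [f"GET {path}?{data} HTTP/1.1", f"Host: {host}", "", ""]
    elif method == "POST":
        # POST: the form data goes in the body, after the blank line
        lines = [
            f"POST {path} HTTP/1.1",
            f"Host: {host}",
            "Content-Type: application/x-www-form-urlencoded",
            f"Content-Length: {len(data.encode('ascii'))}",
            "",
            data,
        ]
    else:
        raise ValueError("this sketch only handles GET and POST")
    return "\r\n".join(lines)


fields = {"name": "Ada Lovelace", "color": "blue"}
print(form_request("GET", "example.com", "/signup", fields))
print(form_request("POST", "example.com", "/signup", fields))
```

Compare the printed text with what the browser's developer tools show when a real form is submitted with method="get" versus method="post".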
Peeking under the hood :
- look at HTTP POST & GET requests with Tamper Data in Firefox
- ditto with forms (both post & get) & URL passed parameters
- ditto with cookies
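Cookies are just more headers: the server sends "Set-Cookie:" lines, and the browser echoes the name=value pairs back in a "Cookie:" line on later requests. A small sketch using Python's standard http.cookies module follows; the cookie names and values are invented, and the cookie_header_for helper is hypothetical. It ignores the path, domain, and expiry rules a real browser applies.

```python
from http.cookies import SimpleCookie


def cookie_header_for(set_cookie_values):
    """Build the Cookie: header a browser would send back.

    set_cookie_values is a list of the values of the Set-Cookie headers
    from an earlier response (made-up values in this example).
    """
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # parses name, value, and attributes such as Path
    pairs = [f"{name}={morsel.value}" for name, morsel in jar.items()]
    return "Cookie: " + "; ".join(pairs)


received = ["session=abc123; Path=/; HttpOnly", "theme=dark"]
print(cookie_header_for(received))  # Cookie: session=abc123; theme=dark
```

Note that attributes such as Path and HttpOnly are instructions to the browser and are not sent back; only the name=value pairs are.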
Web form demo :
Note also that 'wget' and 'curl' can fetch a web page from the command line, and (see the man pages) do some header/cookie stuff.
And maybe way under the hood :
- Matt Dailey's packet capture example
- use Wireshark to capture our own packets
Related concepts not quite in our curriculum this term:
- What is a URL? A URI? Is there a difference?
- DNS and host name lookup
- MAC addresses vs IP addresses
- intra- vs inter- nets
- (All of these may be part of the "get a web page" cycle, but at lower or parallel protocol layers.)