Sep 9
Jim's notes from our first meeting:
Talked with Dylan on Sep 9
R files:
RDS.r (respondent driven sampling),
NSM.r (network sampling with memory)
These two are both looking at this problem:
given a population of 2 types of people (male/female, say), find the % of
each by sampling. The sampling procedure is to start with a small number
of "seeds", and then use some sort of network to find more people in
successive "waves" that are added to the sample.
RDS does this with a given network, adding 1 new person per (or a small number)
NSM does this with each person listing others, and some sort of
filter on combined lists
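
A rough sketch of the seed/wave idea, to pin down the terms. This is not
what RDS.r does. It is a toy version in base R, assuming a random network
and assuming each newly sampled person recruits one not-yet-sampled
neighbour per wave. pop.size and avg.degree are the constants named in
these notes; the function name, the other arguments, and all default
values are made up for illustration.

```r
# Toy seed-and-wave sampler on a random network (base R only).
# All names and parameter values here are illustrative assumptions.
simulate_waves <- function(pop.size = 500, avg.degree = 6, frac.a = 0.3,
                           n.seeds = 5, n.waves = 10) {
  # Assign each person one of two types, "A" or "B".
  type <- ifelse(runif(pop.size) < frac.a, "A", "B")

  # Random symmetric adjacency matrix with roughly avg.degree ties per person.
  p.tie <- avg.degree / (pop.size - 1)
  adj <- matrix(FALSE, pop.size, pop.size)
  upper <- upper.tri(adj)
  adj[upper] <- runif(sum(upper)) < p.tie
  adj <- adj | t(adj)

  sampled <- sample(pop.size, n.seeds)   # wave 0: the seeds
  current <- sampled
  for (w in seq_len(n.waves)) {
    recruits <- integer(0)
    for (person in current) {
      # Neighbours of this person who are not yet in the sample.
      candidates <- setdiff(which(adj[person, ]), c(sampled, recruits))
      if (length(candidates) > 0) {
        # Assumed rule: each person recruits one new neighbour.
        pick <- candidates[sample.int(length(candidates), 1)]
        recruits <- c(recruits, pick)
      }
    }
    if (length(recruits) == 0) break     # recruitment chains died out
    sampled <- c(sampled, recruits)
    current <- recruits                  # next wave recruits from this one
  }

  list(sample = sampled,
       est.frac.a = mean(type[sampled] == "A"),   # naive sample proportion
       true.frac.a = mean(type == "A"))
}
```

Usage would be something like set.seed(1); res <- simulate_waves() and then
comparing res$est.frac.a with res$true.frac.a. The estimate here is just the
raw sample proportion, with no weighting by degree.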
----
SES_simulation.r, T-square_and_quadrat.r
These two are both trying to count how many things there
are in a 2D plane, either with a grid (quadrat) or
by moving in a T direction from previous find & line (?)
SES = scaled estimation sampling
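
A minimal sketch of the quadrat (grid) idea only, not the T-square one and
not the code in T-square_and_quadrat.r. It assumes the things being counted
are points in the unit square, and that we count inside a random subset of
equal-sized grid cells and scale up. The function and argument names are
hypothetical.

```r
# Quadrat-count sketch: estimate how many points lie in the unit square
# by counting inside a random subset of grid cells. Values are illustrative.
quadrat_estimate <- function(x, y, n.side = 10, n.sampled = 20) {
  # Grid cell containing each point (1..n.side in each direction).
  col <- pmin(floor(x * n.side) + 1, n.side)
  row <- pmin(floor(y * n.side) + 1, n.side)
  cell <- (row - 1) * n.side + col

  counts <- tabulate(cell, nbins = n.side^2)   # points per cell, zeros included
  chosen <- sample(n.side^2, n.sampled)        # quadrats actually surveyed
  mean(counts[chosen]) * n.side^2              # scale mean count up to whole area
}
```

For example, with x <- runif(1000) and y <- runif(1000), the call
quadrat_estimate(x, y) gives an estimate of the 1000 points from 20 of the
100 cells. n.side and n.sampled are the kind of parameters one might want
to tune.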
---
$ R
> install.packages("network")   # required
> library(network)
The RDS.r one (only one we looked at together)
has some missing constants (pop.size, avg.degree)
and looks like it needs some debugging.
Indentation needs to be cleaned up,
and I'd like to see it put into functions
with clear inputs/outputs/test_cases
and some sort of really short description of each
with definitions of terms (seed, wave, ...) would be good.
---
"Variance" - of what in RDS ?
(I don't know what that means in this context.
What is being repeated?)
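
One possible reading (a guess, to be checked with Dylan): the whole
sampling procedure is what gets repeated, and "variance" is the spread of
the estimated % across those repeated runs. A sketch under that assumption,
with toy_run as a made-up stand-in for one run of the sampler:

```r
# Assumed meaning of "variance": spread of the estimate over repeated runs.
estimate_variance <- function(run_once, n.rep = 200) {
  # run_once() must return a single numeric estimate.
  estimates <- replicate(n.rep, run_once())
  c(mean = mean(estimates), variance = var(estimates))
}

# Stand-in for one run: sample proportion from 50 draws (illustrative only).
toy_run <- function() mean(runif(50) < 0.3)

set.seed(3)
estimate_variance(toy_run)
```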
---
For quadrat and NSM, Dylan wants to find optimum
parameters to get best result. I suggested abstracting
this into a function
result = f(input1, input2, ...)
that we can then use formal search approaches on inputs
(e.g. hill climbing, exhaustive search on a grid, ...)
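
A sketch of what those two search approaches could look like in R. The f
below is a toy stand-in with a known minimum, not the quadrat or NSM
simulation, and it assumes smaller results are better. grid_search,
hill_climb, and their arguments are hypothetical names.

```r
# Toy stand-in for result = f(input1, input2); smaller is better here.
f <- function(input1, input2) (input1 - 3)^2 + (input2 + 1)^2

# Exhaustive search on a grid of input values.
grid_search <- function(f, values1, values2) {
  grid <- expand.grid(input1 = values1, input2 = values2)
  grid$result <- mapply(f, grid$input1, grid$input2)
  grid[which.min(grid$result), ]
}

# Simple hill climbing: move to the best neighbouring point until none improves.
hill_climb <- function(f, start, step = 0.5, max.iter = 1000) {
  best <- start
  best.val <- f(best[1], best[2])
  moves <- rbind(c(step, 0), c(-step, 0), c(0, step), c(0, -step))
  for (i in seq_len(max.iter)) {
    vals <- apply(moves, 1, function(m) f(best[1] + m[1], best[2] + m[2]))
    if (min(vals) >= best.val) break      # no neighbour is better: local optimum
    best <- best + moves[which.min(vals), ]
    best.val <- min(vals)
  }
  list(inputs = best, result = best.val)
}
```

Usage: grid_search(f, seq(0, 5, by = 1), seq(-3, 3, by = 1)) or
hill_climb(f, c(0, 0)). With a real simulation f would be random, so it
would probably need to average several runs per input before either
search can compare results reliably.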