Definitions and variables for code:
Network/incidence matrix: an N x N matrix (an adjacency matrix in graph terms), where N is the
size of the population. The (i, j)th entry is 1 if individual i knows (could potentially
recruit) individual j and 0 otherwise. This matrix is used to create samples for NSM and RDS.
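
As a minimal illustration of this structure, the sketch below builds a small random network matrix in Python. It is not the project's code: the function name make_network, the tie probability p, and the choice to make ties symmetric with a zero diagonal are all assumptions made for the example.

```python
import random

def make_network(n, p, seed=None):
    """Return an n x n 0/1 matrix; entry [i][j] is 1 if i knows j.

    Assumptions (not from the source): ties are symmetric, each pair is
    tied with probability p, and nobody is tied to themselves.
    """
    rng = random.Random(seed)
    net = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                net[i][j] = net[j][i] = 1
    return net

network = make_network(n=6, p=0.5, seed=1)
# Row sums give each individual's degree (number of people they know).
degrees = [sum(row) for row in network]
```
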
Network Sampling with Memory (NSM): a sampling method where a pool of individuals from
a population nominate others in the population. The uniquely nominated individuals
are considered and one is chosen at random, replacing a member from the initial pool.
The process continues until a sample of desired size is obtained.
Panel: In NSM, the group that nominates potential respondents.
Nominees: In NSM, the potential respondents listed by the panel.
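
The NSM loop described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the project's implementation: the name nsm_sample is hypothetical, the initial pool is counted as part of the sample, and the panel member who gets replaced is picked at random.

```python
import random

def nsm_sample(network, panel_size, sample_size, seed=None):
    """Sketch of NSM on a 0/1 network matrix; returns sampled indices in order."""
    rng = random.Random(seed)
    n = len(network)
    panel = rng.sample(range(n), panel_size)   # initial pool of nominators
    sample = list(panel)                       # assumption: the pool counts toward the sample
    while len(sample) < sample_size:
        # Unique nominees: everyone a panel member knows who is not yet sampled.
        nominees = sorted(
            {j for i in panel for j in range(n) if network[i][j] == 1} - set(sample)
        )
        if not nominees:
            break  # nobody left to nominate; stop early
        chosen = rng.choice(nominees)
        # Assumption: the replaced panel member is picked at random.
        panel[rng.randrange(panel_size)] = chosen
        sample.append(chosen)
    return sample

# Toy network in which everyone knows everyone else.
complete = [[int(i != j) for j in range(8)] for i in range(8)]
nsm_result = nsm_sample(complete, panel_size=2, sample_size=5, seed=0)
```

On a sparse network the loop can stop before reaching sample_size, because the current panel may have no unsampled contacts left to nominate.
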
Simple Random Sample (SRS): a sample in which all members of the population have an
equal probability of being selected.
Bootstrapping: a resampling method for variance calculation in which the sample is
treated as a population and a number of random samples are drawn from it with replacement.
The variance of the estimate across these resamples is the bootstrapped variance.
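
To make the two definitions above concrete, the following Python sketch draws a simple random sample and then bootstraps the variance of its mean. The names bootstrap_variance and n_boot, the simulated population, and the choice of the mean as the estimator are assumptions for illustration only.

```python
import random
import statistics

def bootstrap_variance(sample, estimator=statistics.mean, n_boot=1000, seed=None):
    """Variance of an estimator across resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Each resample has the same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        estimates.append(estimator(resample))
    return statistics.variance(estimates)

rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(1000)]  # made-up population
srs = rng.sample(population, 100)   # SRS: every member equally likely, no repeats
boot_var = bootstrap_variance(srs, n_boot=500, seed=1)
```
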
Respondent Driven Sampling (RDS): A sampling method where an initial group from a population
recruits new sample members (between 0 and 3 in my code), who in turn recruit more
respondents. The process continues through a set number of waves.
Seed: In RDS, the initial recruiters. Because they aren't selected through the recruitment
chains, they are not considered in the final calculations.
Wave: In RDS, one round of recruitment, in which the current respondents recruit
new individuals into the sample.
Degree: the number of people an individual knows in a given population. This is used
to calculate selection probabilities in RDS.
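
The RDS terms above (seeds, waves, degree) fit together roughly as in the Python sketch below. It is an illustration under assumptions, not the project's code: rds_sample and degree are hypothetical names, each respondent tries to recruit a uniformly random number of contacts between 0 and max_recruits, and nobody can be recruited twice.

```python
import random

def rds_sample(network, n_seeds, n_waves, max_recruits=3, seed=None):
    """Return (seeds, recruits); seeds are kept separate so they can be excluded."""
    rng = random.Random(seed)
    n = len(network)
    seeds = rng.sample(range(n), n_seeds)
    in_sample = set(seeds)
    current = seeds
    recruits = []
    for _ in range(n_waves):
        next_wave = []
        for person in current:
            # Contacts of this respondent who are not yet in the sample.
            contacts = [j for j in range(n)
                        if network[person][j] == 1 and j not in in_sample]
            k = min(rng.randint(0, max_recruits), len(contacts))
            for j in rng.sample(contacts, k):
                in_sample.add(j)
                next_wave.append(j)
        recruits.extend(next_wave)
        current = next_wave  # the new recruits do the recruiting in the next wave
    return seeds, recruits

def degree(network, i):
    """Number of people individual i knows (row sum of the network matrix)."""
    return sum(network[i])

# Toy network in which everyone knows everyone else.
toy_net = [[int(i != j) for j in range(12)] for i in range(12)]
rds_seeds, rds_recruits = rds_sample(toy_net, n_seeds=2, n_waves=3, seed=0)
```

Only rds_recruits would enter the final calculations, with degree(toy_net, i) available for each recruit when selection probabilities are computed.
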
Scaled Estimation Sampling: A sampling method where the user estimates the sizes of
nonoverlapping groups in the population, then samples some of those groups to get a scale
factor for how far "off" the estimates are. This factor is then applied to all estimates.
actual.pops: the actual population sizes of all groups in the population
SES.estimates: the estimated population sizes of these groups
SES.sample.df: A data frame with the estimated population size and index of all sampled
units
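
One way to read the scale-factor idea is as a ratio of totals over the sampled groups, as in the Python sketch below. This is an assumption: the source does not say how the factor is computed or how the groups are sampled, and the names scaled_estimate, actual_pops, and ses_estimates are Python stand-ins for illustration.

```python
import random

def scaled_estimate(actual_pops, ses_estimates, n_sampled, seed=None):
    """Return (scaled total, scale factor) from a random subset of groups."""
    rng = random.Random(seed)
    sampled = rng.sample(range(len(ses_estimates)), n_sampled)
    counted = sum(actual_pops[i] for i in sampled)    # true sizes of sampled groups
    guessed = sum(ses_estimates[i] for i in sampled)  # estimates for the same groups
    scale = counted / guessed                         # how far "off" the estimates are
    return scale * sum(ses_estimates), scale

# Made-up example in which every estimate is half the true size.
actual_pops = [100, 200, 300, 400]
ses_estimates = [50, 100, 150, 200]
ses_total, ses_scale = scaled_estimate(actual_pops, ses_estimates, n_sampled=2, seed=0)
```
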
Probability proportional to size (PPS): a method for creating a sample where each unit's
probability of selection is proportional to its size, so units with larger populations
are more likely to be chosen.
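
A short Python sketch of PPS selection follows. It draws with replacement, which is a simplifying assumption (the source does not say whether units can be chosen more than once), and the name pps_sample and the sizes are made up.

```python
import random
from collections import Counter

def pps_sample(sizes, k, seed=None):
    """Draw k unit indices with probability proportional to size (with replacement)."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=k)

sizes = [10, 30, 60]                 # made-up unit sizes
draws = pps_sample(sizes, k=10000, seed=0)
counts = Counter(draws)              # larger units should be drawn more often
```
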
T-square: a method of area sampling where a series of random points are chosen in the
sample area. For each point, two distances are found: the distance from the point to the
nearest house, and the distance from that house to its nearest neighboring house outside of
a 'T'. The 'T' is formed by the line between the point and the house and the line
perpendicular to it through the house, so the neighbor must lie on the far side of the
perpendicular line from the point. These distances are used to estimate the area occupied
per house (the housing density), and the population size is then calculated from this and
the total size of the area.
sample.win: the window where all houses and points are generated
houses: the houses that the function is estimating
starting.points: the randomly chosen points
S: the houses nearest each random point
X.dist.vec: the distances between each S house and starting point
Y.dist.vec: the distances between each S house and its nearest house outside the T.
There may be NAs if there are no houses outside of the T.
T.square.df: a data frame with X.dist.vec and Y.dist.vec, with rows containing NAs removed.
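
The geometry behind X.dist.vec and Y.dist.vec can be sketched in Python as below. The function names are hypothetical, houses are treated as (x, y) points, and None stands in for NA. The density formula is a Byth-style T-square estimator, n^2 / (2 * sum(x) * sqrt(2) * sum(y)); the source does not say which formula the code uses, so treat it as an assumption.

```python
import math

def t_square_distances(point, houses):
    """Return (x, y) for one starting point; y is None if no house lies beyond the T."""
    s = min(houses, key=lambda h: math.dist(point, h))   # nearest house S
    x = math.dist(point, s)
    dx, dy = s[0] - point[0], s[1] - point[1]            # direction from point to S
    # Keep only houses on the far side of the line through S perpendicular to that direction.
    beyond = [h for h in houses
              if (h[0] - s[0]) * dx + (h[1] - s[1]) * dy > 0]
    if not beyond:
        return x, None
    return x, min(math.dist(s, h) for h in beyond)

def t_square_density(pairs):
    """Byth-style density estimate (houses per unit area), skipping missing pairs."""
    complete = [(x, y) for x, y in pairs if y is not None]
    n = len(complete)
    sum_x = sum(x for x, _ in complete)
    sum_y = sum(y for _, y in complete)
    return n ** 2 / (2 * sum_x * math.sqrt(2) * sum_y)

# Made-up layout: (0.5, 1) is close to S = (1, 0) but lies on the point's side of the T.
houses = [(1, 0), (0.5, 1), (3, 0), (1, 5)]
x_dist, y_dist = t_square_distances((0, 0), houses)
```

Multiplying the estimated density by the area of the sampling window would give an estimated number of houses in the window.
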
Quadrat method: an area sampling method where the sample area is divided into square blocks
called quadrats. A number of these quadrats are sampled, and the average population per
sampled quadrat is used to estimate the total population size.
actual.pops: the population size in all quadrats
quad.sample: the population size in sampled quadrats
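
A minimal Python sketch of the quadrat estimate is given below, assuming equal-sized quadrats chosen by simple random sampling. The name quadrat_estimate and the counts are illustrative and are not taken from the project's code.

```python
import random
import statistics

def quadrat_estimate(actual_pops, n_sampled, seed=None):
    """Return (estimated total, sampled counts) from a random subset of quadrats."""
    rng = random.Random(seed)
    quad_sample = rng.sample(actual_pops, n_sampled)   # counts in the sampled quadrats
    # Scale the mean count per sampled quadrat up to all quadrats.
    return statistics.mean(quad_sample) * len(actual_pops), quad_sample

quadrat_pops = [3, 0, 5, 2, 4, 1, 6, 3, 2]   # made-up counts for a 3 x 3 grid
quad_total, quad_sample = quadrat_estimate(quadrat_pops, n_sampled=4, seed=0)
```
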