Empirical
Science
Workshop

course

schedule

Jan 24 - Feb 7 : data analysis with Jim and Matt

your groups

what to expect

stuff you should know by the end

your assignment

... has two parts. For each part, your group should make a lab write-up, either as a web page or a PowerPoint-style presentation, complete with the appropriate tables of data and graphics with error bars. We'll discuss more of what's expected in class. The two parts are:
  1. Measure the distance from the science building to the campus center.
  2. Test the following hypothesis: "There's a better than even chance that at least one five will come up when I roll these four dice."
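If you want a quick sanity check on the arithmetic behind the hypothesis in part 2, here is a minimal sketch in Python (assuming four fair six-sided dice; the number of simulated trials is arbitrary):

  import random

  # Analytic check: P(at least one five in four rolls) = 1 - (5/6)^4
  p_exact = 1 - (5 / 6) ** 4
  print(f"exact probability: {p_exact:.4f}")  # about 0.518, just better than even

  # Monte Carlo check: simulate many sets of four rolls
  trials = 100_000
  hits = sum(
      any(random.randint(1, 6) == 5 for _ in range(4))
      for _ in range(trials)
  )
  print(f"simulated probability: {hits / trials:.4f}")

Of course, the assignment is to test the claim with real dice and real statistics; this is only a check on what the probability theory predicts.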

Feb 14 - Feb 21 : Measure the acceleration of gravity, g

your groups

what to expect

stuff you should know by the end

your assignment

...is simple: measure the acceleration of gravity, g, using some procedure to be decided on by your group. Since the point here is to get you to think about and correct systematic errors, make sure you do some of both and make that clear in your write-up. You'll want to read the background essay I wrote up to get you thinking in the right direction. Then you'll want to spend some time as a group thinking seriously about the best way to do this. Figure out what materials you need; we should have most of them already, and we'll arrange to get other things you think you need (within reason).

Some general-interest comments on the lab write-ups...

First off, the write-ups this time were significantly better than the ones from the first lab. They all showed a level of effort that was at least reasonable. That effort is noted and appreciated. I'm going to list a bunch of pervasive or interesting errors/confusions that showed up in the write-ups. Hopefully these will be clarifying and help you understand what we, the faculty, are looking for in these reports. (...something, by the way, we're just figuring out for the first time too!) But let me just state that you shouldn't be too bummed out by the long list of negative comments, nor the similar ones that are scribbled on your individual reports. It's only when you put some effort into both the lab and the write-up that it becomes possible for us to say "aha, you misunderstood this point", etc. So in a way the long list of criticisms here is a testament to the much improved level of effort.
The write-ups were typically (perhaps universally) too short. Most groups included at least some prose to lay out clearly the goal of the lab, and then the specific methods/procedures used. But there wasn't enough detail on these. Remember, the goal of any scientific paper is to convince rational readers of your conclusions. If someone reads your paper but (a) can't tell what you did or (b) can't tell how you did it or (c) can't tell how you analyzed your data (etc.), then (if they are rational) they will not be convinced. So in the future, try to lay everything out in a clear enough way that they will be convinced!
All the groups included their "raw data" in some form. Typically this meant that their report was about one page of actual report, followed by 3 or 5 or 8 pages of just plain numbers. On the one hand, it's good to include the raw data. But nobody is ever going to look at those pages and pages of numbers, and even if someone did, they'd be hard pressed to learn anything useful from them. Hmmm... if only there was a way to present data in a way that (a) didn't take up so much space on the page and (b) was easier to look at and actually learn something from, i.e., interpret. Hey! There is such a thing! It's called a graph! So in the future, it would be good to include your data in the form of a graph. For example, if you have 500 different values for some quantity that you measured 500 times, show a histogram (a plot showing the number of times various values (grouped into some number of small intervals or "bins") showed up in your 500 trials). Or you could even just display a regular graph of result vs. "run number". This is often a very revealing thing to look at -- you can almost always see just by looking at the graph if there is some kind of funny systematic effect going on. One of the groups, for example, measured "g" a certain way by doing 5 runs of 20 trials each. Well, their raw data made it look as if the values were all pretty similar within each group of 20, but there appeared to be rather large and systematic differences between the values in the 5 different groups. I don't know if they noticed this; I don't even know if there really was such an effect, because it's hard to tell for sure just by staring at 5 pages of numbers. But a graph would reveal this (and save space to boot).
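Here is a minimal sketch of both kinds of plot in Python, assuming your repeated measurements live in a plain text file with one value per line (the filename is just a placeholder):

  import numpy as np
  import matplotlib.pyplot as plt

  # Load repeated measurements, one value per line ("g_trials.txt" is a placeholder name).
  values = np.loadtxt("g_trials.txt")

  fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

  # Histogram: how often values in each small interval ("bin") showed up
  ax1.hist(values, bins=20)
  ax1.set_xlabel("measured g (m/s^2)")
  ax1.set_ylabel("number of trials")

  # Result vs. run number: drifts or jumps between runs show up immediately
  ax2.plot(np.arange(1, len(values) + 1), values, ".")
  ax2.set_xlabel("run number")
  ax2.set_ylabel("measured g (m/s^2)")

  plt.tight_layout()
  plt.savefig("g_data.png")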
There was frequent inconsistency with regard to the level of precision or "care". Here's what I mean: lots of people took literally hundreds of measurements and carefully calculated means and standard deviations and kurtoses and whatnot with them; this care was typically evident in the write ups. But often the same write-ups just blazed at light-speed past statements like "the length of our pendulum was 1.9 meters" or "the ball was dropped from 3.01 meters". But how did you measure this? What is the uncertainty? If your fancy statistical analysis suggests that you've measured "g" with a 1% uncertainty, but you don't tell me how accurately you know any of the measured parameters that went into the calculations for g, why should I believe your 1% error bar? As an extreme example, if you just eyeballed the length of the pendulum (as in, "Well, it looks like a bit less than two meters... call it 1.9 meters... too bad we don't have a meter stick to actually measure it") then maybe your value for the length is good to, say, 10%. But if that's the case, your calculation of g -- which is based on the length -- is also immediately afflicted by this same 10% uncertainty (or thereabouts). This goes back to the general goal of convincing the reader to believe your conclusion. All the care in the world on some points is simply wasted if you're not equally careful (and equally explicit in the write-up) about other points.
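To make that concrete, here is a rough sketch of how a 10% length uncertainty feeds through to g for a pendulum measurement, using the simple relation g = 4 pi^2 L / T^2 and standard propagation of independent uncertainties (all the numbers are made up for illustration):

  import numpy as np

  # Made-up example numbers for a pendulum measurement.
  L, dL = 1.9, 0.19   # length and its uncertainty (m): a 10% eyeball estimate
  T, dT = 2.77, 0.01  # period and its uncertainty (s): carefully timed

  g = 4 * np.pi**2 * L / T**2

  # For g = 4 pi^2 L / T^2 with independent uncertainties:
  # (dg/g)^2 = (dL/L)^2 + (2*dT/T)^2
  dg = g * np.sqrt((dL / L) ** 2 + (2 * dT / T) ** 2)

  print(f"g = {g:.2f} +/- {dg:.2f} m/s^2  ({100 * dg / g:.0f}% uncertainty)")

The carefully timed period contributes almost nothing here; the eyeballed length completely dominates the error bar.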
A few groups seemed overly concerned with (or even overly aware of) the "nominal" value of g -- 9.8 m/s^2 or whatever. There's of course nothing wrong with knowing that other people have measured this and that that's what they got. But that knowledge should play essentially no role whatsoever in your experiment. Here are some examples: one group claimed that their experiment was successful because their result was consistent with the nominal value. It's fine to note that, but that's definitely not what it means for an experiment to be successful -- indeed, defining success that way means you can never learn anything new from a successful experiment, and, conversely, a surprising result is always by definition a failure! So... try to force yourself to redefine "success" -- for this class at least, a successful experiment is one which would convince a rational (appropriately skeptical!) reader that your result (including its uncertainty, whatever that turns out to be) should be believed.
Every write-up included at least some discussion of possible systematic errors. That's good. But most of them were all talk and no game. In fact, only one group actually identified, analyzed, and corrected for a systematic error. That's not so good (I mean for the rest of you). Talk is cheap. If you think your experiment is afflicted with some particular systematic error (and you all did), you can't just mention that and go on. You also can't say: "well, there's probably some friction, which would be a systematic error, but it's negligible." How do you know it's negligible? The answer is: you don't know, not unless you measure it or calculate it or estimate it or something. So... any time you recognize the presence of a systematic error, you should make some real, concerted effort to isolate it, to understand it, to estimate its "size", and, if possible, to correct for it.
One group's report included the statement: "Our results revealed a large amount of systematic error." I think what they meant is that the nominal value (9.8 m/s^2) was outside their 95% confidence interval -- hence, there must have been some systematic errors. Well, that's probably true. But I think the statement might also have been based on a confusion about the nature of systematic errors -- namely, systematic errors are something you (at most) identify, but you can't really do anything about them; but that's OK because you get to find out in the end how big they were when you compare your result to "the truth." Well, that works fine if you already know the truth. But if you already know the truth, why are you wasting time doing the experiment? (That's a rhetorical question... probably best you don't answer it!) Bottom line: the time to analyze and correct systematic errors is before your experiment is done. If you do a good job of this, you will actually be able to believe your results. Always carry out your experiment as if you have no idea what the outcome will be, and try your hardest to be sufficiently careful that you believe your own results (including the error bars) more than you believe what you read in some book or magazine.
Be wary of formulas. A couple of groups got into trouble by having just a little too much faith in formulas. For example, the period of a pendulum is only "T = 2 pi sqrt(L/g)" if the amplitude is very, very tiny, and if the "bob" is infinitely heavier than the string which supports it. Or: the equation for the distance a dropped object has fallen in time t is only "d = 1/2 g t^2" if there is no friction or air resistance. I think all the groups that used formulas like these recognized the possibility that there might be systematic errors associated with measuring, say, "d" and "t" (e.g., reaction-time delays). But several groups didn't seem to recognize that, in addition, the equation itself might simply be wrong, because the equation is derived under some assumptions which weren't applicable to the actual situation at hand.
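To give one illustration of how big such an effect can be, here is a sketch of the standard first-order finite-amplitude correction for a simple pendulum, T = 2 pi sqrt(L/g) * (1 + theta0^2/16 + ...), where theta0 is the release angle in radians (the length is a made-up example, and the nominal g is used only to size the effect):

  import numpy as np

  L = 1.9                          # pendulum length (m), made-up example
  g = 9.8                          # nominal value, used only to size the effect
  T0 = 2 * np.pi * np.sqrt(L / g)  # small-amplitude ("textbook") period

  for theta0_deg in (5, 15, 30):
      theta0 = np.radians(theta0_deg)
      # First-order finite-amplitude correction for a simple pendulum
      T = T0 * (1 + theta0**2 / 16)
      print(f"amplitude {theta0_deg:2d} deg: period is {100 * (T / T0 - 1):.2f}% "
            "longer than the small-angle formula predicts")

At a 30-degree amplitude the period is already almost 2% longer than the textbook formula says, which is a genuine systematic error if you're aiming for a 1% measurement of g.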
Only one group looked at the AmJPhys paper I referenced. =( But that one group got, I think, a lot out of it. Let that be a lesson to the rest of you! If you're stuck or confused or looking for good advice, it's not at all crazy to see how people have done the thing before and learn something from their attempts.

Feb. 28 & Mar. 4: Juglone extraction and assay with Todd

Your groups

Things you should know

Assignment

Notes on analysis of experimental data

For this experiment, I hope that all of the seeds grow, and that some of the treatments (e.g., high and low concentrations of juglone) result in seedlings that don’t grow as tall as the control seedlings. If we do get such results then one way to analyze the data is with an Analysis of Variance (ANOVA - see below).
Another possibility is that any plants that grow will be of similar size, but that the different treatments affect germination differently, so that some treatments contain lots of plants, and other treatments contain only a few plants that grew. How would you analyze this type of data?
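One possible approach (just a sketch, not the only reasonable choice): treat the outcome as counts of germinated vs. non-germinated seeds in each treatment and test whether germination depends on treatment with a chi-square test. The counts below are invented purely for illustration:

  import numpy as np
  from scipy.stats import chi2_contingency

  # Rows: treatments (control, low juglone, high juglone).
  # Columns: germinated, not germinated. Counts are invented for illustration.
  counts = np.array([
      [18,  2],
      [12,  8],
      [ 5, 15],
  ])

  chi2, p, dof, expected = chi2_contingency(counts)
  print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
  # A small p-value suggests germination rate differs between treatments.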

Analysis of variance – ANOVA

The point of the ANOVA is to test for significant differences between means. This analysis partitions the total variance in the data set into two pieces: the variance between groups (due to the different treatments) and the variance within each group (experimental error).
If more of the total variance is due to differences between groups, rather than to variation within a group, then there’s a high probability that the means are from different populations – that the treatments used had an effect on the dependent variable – and the null hypothesis (H0) is rejected.
In our experiment, what is the independent variable? It’s the different chemicals applied to tomato seeds. And the dependent variable? Well, we expect to measure plant height as the dependent variable. Is this a continuous or discrete dependent variable?
Once the seeds germinate I’ll pick a day and measure their height. Then you can calculate mean plant height for each treatment. You’ll probably find that the means are not the same. But is the difference in mean plant height due to random variation (experimental error), or some systematic effect of the independent variable (treatment effects)? We can use an ANOVA to try and answer this type of question.

Using ANOVA

To analyze the result from our experiment we will use a ‘1-way ANOVA’, or a ‘one factor ANOVA’. Our ‘one factor’ is the different chemicals applied to the seeds. A two-factor, or two-way ANOVA would add another variable. For example, if we wanted to test for an interaction between juglone and soil composition, we would apply different concentrations of juglone to seeds in different types of soil. But for our one-way ANOVA the null hypothesis is that all of our calculated population means, the treatment means, are equal. To test this hypothesis ANOVA partitions the total variance (as described above) and calculates an F statistic.
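To see what that partitioning looks like numerically, here is a small sketch that computes the F statistic by hand from made-up plant heights (three treatments, four plants each):

  import numpy as np

  # Made-up plant heights (cm) for three treatments, just to show the arithmetic.
  groups = [
      np.array([10.1, 9.8, 10.5, 10.2]),  # control
      np.array([8.9, 9.2, 8.7, 9.0]),     # low juglone
      np.array([7.1, 7.5, 6.8, 7.3]),     # high juglone
  ]

  all_data = np.concatenate(groups)
  grand_mean = all_data.mean()

  # Between-group sum of squares: how far each group mean sits from the grand mean
  ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
  # Within-group sum of squares: scatter of the data around their own group means
  ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

  df_between = len(groups) - 1
  df_within = len(all_data) - len(groups)

  F = (ss_between / df_between) / (ss_within / df_within)
  print(f"F = {F:.1f} with ({df_between}, {df_within}) degrees of freedom")

A large F means most of the variance comes from differences between the treatment means rather than from scatter within each treatment.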
For ANOVAs performed using computer software, the program compares this F statistic against the F distribution, which gives the probability of obtaining an F statistic at least this large if the null hypothesis were true. This probability is the p-value.
When the p-value is less than 0.05 and the null hypothesis (H0) is rejected, at least one mean is different from at least one other mean. But the ANOVA can’t tell you where this difference lies – it can’t tell you which means are different from which other means. To find this difference we use multiple comparison tests, or post-hoc tests. One method is to use a series of pair-wise comparisons – say, t-tests. But what risk do you run when you start performing lots of t-tests? You increase your chances of committing a type I error: a true H0 is incorrectly rejected. To control for this risk, multiple comparison tests require a lower p-value for rejecting H0 – i.e., lower than our normal critical value of 0.05 (5%) – where the exact value depends on the number of comparisons you perform.
To summarize, you will use an ANOVA to analyze your data. The software you use to perform the ANOVA will calculate an F statistic and a p-value. If p ≤ 0.05 you reject H0. Next you examine the results of the multiple comparison test to determine which means are statistically different from which other means.
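If you want to check your software's output, or prefer to script the analysis yourself, here is a minimal sketch in Python using SciPy; the heights are placeholders, and the Bonferroni correction (dividing 0.05 by the number of comparisons) is one common way to lower the per-comparison cutoff:

  from itertools import combinations
  from scipy.stats import f_oneway, ttest_ind

  # Placeholder data: plant heights (cm) keyed by treatment.
  data = {
      "control":      [10.1, 9.8, 10.5, 10.2],
      "low juglone":  [8.9, 9.2, 8.7, 9.0],
      "high juglone": [7.1, 7.5, 6.8, 7.3],
  }

  # One-way ANOVA across all treatments
  F, p = f_oneway(*data.values())
  print(f"ANOVA: F = {F:.1f}, p = {p:.4f}")

  if p <= 0.05:
      # Pairwise t-tests with a Bonferroni-adjusted cutoff
      pairs = list(combinations(data, 2))
      cutoff = 0.05 / len(pairs)
      for a, b in pairs:
          t, p_pair = ttest_ind(data[a], data[b])
          verdict = "different" if p_pair < cutoff else "not distinguishable"
          print(f"{a} vs {b}: p = {p_pair:.4f} ({verdict} at the {cutoff:.3f} level)")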

Performing ANOVA using JMP

Computers in the computer lab have a program called JMP. I recommend using this software for your analysis. Excel can also perform ANOVAs, but has no provision for performing multiple comparison tests.
After launching JMP you get a dialogue box called JMP Starter.
From the menu at the very top of the screen, select Analyze.

March 28 - April 4 : the wonders of biology with Bob

your groups

what to expect

stuff you should know by the end

your assignment

April 11 - 18 : Correlation & Causation with Jenny

your groups

SEE:
http://akbar.marlboro.edu/~allisont/correlation.xls
for an example spreadsheet with a scatter-diagram and with the calculation of the correlation coefficient. You may have to cut & paste that URL; the link isn't covering the whole address, for some reason. You will need Microsoft Excel to open the file.
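If you'd rather not use Excel, here is a rough equivalent sketch in Python (the x and y values are made up just to show the mechanics):

  import numpy as np
  import matplotlib.pyplot as plt

  # Made-up paired observations, standing in for whatever two quantities you record.
  x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
  y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])

  # Pearson correlation coefficient
  r = np.corrcoef(x, y)[0, 1]
  print(f"correlation coefficient r = {r:.3f}")

  # Scatter diagram
  plt.scatter(x, y)
  plt.xlabel("x (first quantity)")
  plt.ylabel("y (second quantity)")
  plt.title(f"r = {r:.3f}")
  plt.savefig("scatter.png")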
4/18/05, what Jenny wrote on the board (your assignment):
80% group
Statement of possible correlation
Basis for developing correlation
Methods for collecting data for correlation
Data presentation for correlation
Interpretation of correlation analysis
Design an experiment/set of observations to test causality
20% individual
Present one possible correlation to analyze (natural science-oriented)
Due: Monday May 2nd

April 25 & May 2 : Student Presentations

your groups

This portion of the course is not done in groups; you will be presenting your data from one of the previous labs (your choice) in more detail, as well as providing a write-up.

schedule

Presenting on Monday, April 25th are:
Presenting on Monday, May 2nd are: