Put differently, just because you see a bump, or just because you see a trend, does not mean it is significant and real. It might just be a fluctuation.
When we make empirically grounded claims about society, in public policy or social science, we will rarely get 5-sigma quality (too few observations, too little theory, too little precision). But, in general, you want to be assured that the claims make sense. Hence you must always attach error bars to your points or claims, where the bars might be plus or minus 1 sigma. Moreover, if you are claiming a trend or a shape, you need to fit the data to see whether constancy or a straight line is already a reasonable zeroth-order description. And if you are making a claim about when something began or the like, there are subtle tests for this (change-point tests) in the statistical literature.
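To make the fitting advice concrete, here is a minimal sketch in plain Python of comparing a constant against a straight line for points with 1-sigma error bars. The data values, the error size of 0.05, and the function names `fit_constant` and `fit_line` are all invented for illustration, and the formulas assume independent, roughly Gaussian errors.

```python
import math

def fit_constant(y, sigma):
    """Weighted mean of y, and the chi-square of the data against that constant."""
    weights = [1.0 / s ** 2 for s in sigma]
    mean = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    chi2 = sum(((yi - mean) / s) ** 2 for yi, s in zip(y, sigma))
    return mean, chi2

def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns intercept, slope, the 1-sigma error on the slope, and chi-square.
    """
    weights = [1.0 / s ** 2 for s in sigma]
    s0 = sum(weights)
    sx = sum(w * xi for w, xi in zip(weights, x))
    sy = sum(w * yi for w, yi in zip(weights, y))
    sxx = sum(w * xi * xi for w, xi in zip(weights, x))
    sxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    delta = s0 * sxx - sx ** 2
    intercept = (sxx * sy - sx * sxy) / delta
    slope = (s0 * sxy - sx * sy) / delta
    slope_err = math.sqrt(s0 / delta)
    chi2 = sum(((yi - intercept - slope * xi) / s) ** 2
               for xi, yi, s in zip(x, y, sigma))
    return intercept, slope, slope_err, chi2

# Hypothetical yearly measurements, each with an assumed 1-sigma error of 0.05.
years = [0, 1, 2, 3, 4]
values = [0.52, 0.58, 0.49, 0.61, 0.55]
errors = [0.05] * 5

mean, chi2_const = fit_constant(values, errors)
intercept, slope, slope_err, chi2_line = fit_line(years, values, errors)

print(f"constant: {mean:.3f}, chi2 = {chi2_const:.2f} for {len(values) - 1} dof")
print(f"line: slope = {slope:.4f} +/- {slope_err:.4f}, "
      f"chi2 = {chi2_line:.2f} for {len(values) - 2} dof")
```

With these made-up numbers the slope comes out at well under one standard error from zero, and the constant already has a chi-square close to its degrees of freedom. So the apparent "trend" is no better supported than a flat line.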
Moreover, Bayesian ideas should be on your mind. Even if you have rough measures and less-than-ideal statistics, can your measurements be seen in the light of what we take as priors, and used to revise them? Often, in the policy arena, poor data may still allow you to improve practice. You will not have the assurance you would like, but at least you are now doing better than you would with no data and only your presumptions and priors.
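One simple way to picture revising a prior with rough data is the textbook update for a Gaussian prior and a Gaussian measurement, sketched below. This is only an illustration under those assumptions: the function name `update_normal` and the numbers (a prior of 0.50 plus or minus 0.10, a noisy measurement of 0.60 plus or minus 0.20) are invented.

```python
def update_normal(prior_mean, prior_sd, obs_mean, obs_sd):
    """Combine a Gaussian prior with a Gaussian measurement.

    Each is weighted by its precision (1 / variance), so a noisy
    measurement moves the prior only a little, but it still moves it.
    """
    prior_prec = 1.0 / prior_sd ** 2
    obs_prec = 1.0 / obs_sd ** 2
    post_prec = prior_prec + obs_prec
    post_mean = (prior_prec * prior_mean + obs_prec * obs_mean) / post_prec
    post_sd = post_prec ** -0.5
    return post_mean, post_sd

# Hypothetical numbers: a fairly firm prior and a rough measurement.
post_mean, post_sd = update_normal(0.50, 0.10, 0.60, 0.20)
print(f"posterior: {post_mean:.3f} +/- {post_sd:.3f}")
```

Here the poor measurement shifts the estimate only slightly, from 0.50 to about 0.52, and shrinks the uncertainty from 0.10 to about 0.09. That is modest, but better than the prior alone.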
Also, never draw a line connecting points unless it is a "fit" to the data. Of course, in the case of railroads you can link stations with lines, since you know that trains go from A to B to C to... And even there the tracks may not follow straight lines between stations. However, in studying time-dependent data, your straight lines presume trends when what you may have is random fluctuation.
Finally, if you want to claim changes from one time to another, be sure to normalize those changes by the standard deviations of the data, so that, again, it is apparent whether a change is larger than the fluctuations. And if you plot data whose scale begins at, say, zero, do not show just the range from, say, 0.5 to 0.6. Present it as 0 to 0.7, or else put a zig-zag on the y-axis to indicate that you are skipping a stretch of it: the y-axis begins at zero, you put in the zig-zag at, say, 0.1, and you resume at 0.4 in the above case.
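As a rough sketch of normalizing a change by the standard deviations, the snippet below expresses the difference between two measurements in units of their combined 1-sigma uncertainty. It assumes the two measurements are independent, so their variances add. The function name `change_in_sigmas` and the standard deviation of 0.08 are invented for illustration.

```python
import math

def change_in_sigmas(before, before_sd, after, after_sd):
    """Change between two independent measurements, in units of the
    combined standard deviation (variances add for independent errors)."""
    combined_sd = math.sqrt(before_sd ** 2 + after_sd ** 2)
    return (after - before) / combined_sd

# Hypothetical: a rise from 0.5 to 0.6, each value known to about 0.08.
z = change_in_sigmas(0.5, 0.08, 0.6, 0.08)
print(f"change = {z:.2f} sigma")
```

With these made-up numbers, the rise from 0.5 to 0.6 is less than one combined standard deviation, so it could easily be a fluctuation.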
-----------------------------------
What motivated the above: Apropos of yesterday’s seminar on economic conditions and social capital, I wrote this post. I enjoyed the talk and, unusually for me, I was not much concerned with what the punchline was. It seemed clear: to provide some evidence about a common belief. Jenny Schuetz asked incisive questions about causation. The speaker responded that he was trying to find out the facts of the situation, and that the connection seemed to be causal given the time frames and some of the trends in the disaggregated data. I woke up this morning thinking some more. None of what I say here diminishes my interest in the seminar, but all of these things are needed to calm various objections, none of which is necessarily fatal but all of which need to be dealt with. A seminar may not be the place to make sure all is perfect, but there is no reason to leave out obvious practices even if it is “just” a talk. You want people to concentrate on your substance, not go crazy over your statistics.