Saturday, March 8, 2014

Precision in Social Science Research

For many good reasons, the statistics and numbers of social science research function differently from those of many natural sciences. This is not to impugn such research, which is often of very great, and appropriate, import.

1. When numbers are quoted, they likely have between 1 and 2 significant figures at best. That is, even if the quoted error or dispersion is small, one would not bet on the actual error (whatever that might mean) being so small. So 0.21 and 0.24 are likely indistinguishable in actuality, even if their quoted errors are quite small.

2. Hence, when one compares two numbers, it is unlikely that a difference in, say, the second figure is credible. So it is quite unlikely that 0.21 differs from 0.24 in actuality, no matter what the errors.

3. Rarely is research repeated in such a way that the numbers reported in one project are checked to better than one significant figure. That is, 0.2 and 0.3 are the same, no matter what the errors.

4. Dispersions, such as standard errors or standard deviations, are measures either of variation in the data (variation you have no way of accounting for) or of the reliability of, say, regression coefficients. They have no intrinsic significance. In finance, by contrast, dispersions are measures of volatility, and so of risk, in the context of some equilibrium.

5. Explanations, even if they have good reason to be taken as causal, will explain only some fraction of the variation (the R-squared). Enormous effort goes into showing that the unexplained variation has little influence on the explained fraction and is merely noise.

6. What you are aiming for is a sense of what's influential, what is much less influential, what's worth attending to in making a decision. Rarely is the actual number, even with its claimed precision, so crucial.

7. When a particle physicist or an atomic physicist quotes a number, it often matters at the level of 3-10 significant figures, and if done properly, the error is in the last figure or so. Often there is good reason for the number to be not only precise but accurate, in the sense that you have good reason to believe what the number should be in the context of theory and other numbers. Differences of two means, for example, may represent mass differences, or masses that are or are not zero; these differences or tests for nonzero have great importance, and may well be a matter of differences in the last digit of three significant figures. Often the differences can be measured with much greater precision than can the two numbers being compared. Research is repeated very often, if not often enough, in the sense that a particular claimed number is checked by a different method, one that claims to investigate the same fact. And dispersions are not always measures of variation, but often have deep meaning, much as in finance. In general, physicists would bet the farm on the claimed number, with its precision, being accurate.
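To make the arithmetic behind the 0.21 versus 0.24 comparison concrete, here is a minimal sketch in Python. It is my own illustration, not a standard procedure: the function name, the quoted standard error of 0.005, the unreported uncertainty of 0.03, and the correlation of 0.9 are all assumed numbers chosen for the example. It uses the usual formula for the variance of a difference, var(b - a) = var(a) + var(b) - 2*rho*sd(a)*sd(b).

```python
import math

def z_for_difference(a, se_a, b, se_b, rho=0.0, extra_sd=0.0):
    """z-score for the difference b - a between two estimates.

    rho      -- assumed correlation between the errors of the two estimates
    extra_sd -- hypothetical unreported uncertainty, added in quadrature
                to each quoted standard error (an assumption of this sketch)
    """
    var_a = se_a ** 2 + extra_sd ** 2
    var_b = se_b ** 2 + extra_sd ** 2
    var_diff = var_a + var_b - 2 * rho * math.sqrt(var_a * var_b)
    return (b - a) / math.sqrt(var_diff)

# Taking the quoted errors at face value: the difference looks clear-cut.
z_formal = z_for_difference(0.21, 0.005, 0.24, 0.005)

# Allowing for an unreported uncertainty of 0.03 in each number:
# the same difference is no longer distinguishable from zero.
z_realistic = z_for_difference(0.21, 0.005, 0.24, 0.005, extra_sd=0.03)

# Strongly correlated errors, as when a difference is measured directly:
# the difference is known better than either number alone.
z_correlated = z_for_difference(0.21, 0.005, 0.24, 0.005, rho=0.9)

print(round(z_formal, 2), round(z_realistic, 2), round(z_correlated, 2))
```

With these made-up inputs, the formal z-score is a bit over 4, it drops below 1 once the unreported uncertainty is included, and it rises to about 13 when the errors are strongly correlated. The first two cases are the social science situation of points 1 and 2; the third is the physicist's situation, where a difference can be measured more precisely than the two numbers being compared.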
