Thursday, March 13, 2014

Credible Claims in Statistical Analyses

You have a data set. Let us for the moment assume that there is no measurement error in any individual measurement or data point.

1. You might find the mean and the standard deviation of some measurement. If N is large, the standard error of the mean is likely to be small. But it is hard for me to believe that the mean +- SE should be taken too seriously if the standard deviation is substantial. That is, the location of the mean may be statistically sharp, but given the substantial dispersion of the measurements, I would not bet on more than two significant figures, often just one. Put differently, you measure 6 +- 2, and the SE is 0.02. Given the spread of the measured values, I would find it hard to distinguish 6 from, say, 7. That is, another study might have gotten 7 +- 2, with a similarly small SE. I would not believe that 6 is different from 7 in this context. (A small sketch of this situation appears at the end of this post.)

2. When I say that I would not believe it, what I am saying is that there is enough noise, non-Gaussian intermixture, and junk in the data that the spread given by the SD prevents me from making very sharp claims about the difference between two means.

3. You do a variety of statistical studies. Before you do them, you might estimate what you think the statistics will be: the mean, the SD, the regression coefficients. Roughly estimate them, maybe only their relative sizes. Perhaps previous research gives you a decent idea.

Then do your studies. Are you surprised by any of the statistics that come out? Keep in mind that it is hard to believe more than two significant figures, often just one, whatever the statistical error.

4. Say you are trying to measure the effect of an intervention or the like. I suspect that any effect smaller than 1% is not credible, again whatever its statistical error; the threshold may well be 10%, maybe 30%. Your problem is that typically your model accounts for only a fraction of the variation (the R-squared), and you have to assure yourself that the rest is truly random noise or has been randomized by your research design. A small amount of impurity or contamination in the data will be problematic. (The second sketch at the end of this post shows how little contamination it takes.)

5. Whatever you measure, can you think of a mechanism that would lead, roughly, to the number you measure? This is after you have done your analysis; before, see #3 above. Could you eliminate a range of mechanisms by your statistical work?

6. If you make claims that depend on a discount rate, say, why should I believe your claim? Have you done sensitivity analyses with different rates, to see whether your conclusions are robust? And how many years ahead would you be willing to apply such a notion as a discount rate and still believe it credibly reflects our attitude toward the more distant future? (See the last sketch at the end of this post.)

7. In the natural sciences, measurements are almost always about actual objects and their properties, say their mass, their energy, etc. Usually those properties are connected to measurements of other properties, and perhaps to theories that predict values or connect the value of one property to that of another. In the natural sciences you also believe that, for some particular measurements, you could have many significant figures (high precision and accuracy). In the social sciences, as far as I can tell, that is rarely the case, and I don't see such a belief in social science.
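
A small sketch of the point in #1, with invented numbers (the code, the samples, and the values 6, 7, and 2 are mine, purely for illustration): the means come out many standard errors apart, yet the distributions overlap so much that I would not bet on the difference.

```python
# Illustration for #1: large N makes the standard error of the mean tiny,
# but the spread (SD) of the measurements governs how seriously I would
# take a difference of 6 vs. 7. All numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
study_a = rng.normal(6, 2, n)   # "6 +- 2"
study_b = rng.normal(7, 2, n)   # a second study "measuring" 7 +- 2

for name, x in (("A", study_a), ("B", study_b)):
    se = x.std(ddof=1) / np.sqrt(n)
    print(f"study {name}: mean = {x.mean():.2f}, SD = {x.std(ddof=1):.2f}, SE = {se:.3f}")

# The means are dozens of standard errors apart, yet the two distributions
# share most of their mass: judged by the spread, 6 and 7 are hard to tell apart.
wrong_side = (np.mean(study_a > 6.5) + np.mean(study_b < 6.5)) / 2
print(f"fraction of measurements on the 'wrong' side of the midpoint: {wrong_side:.2f}")
```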
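
A second sketch, for the contamination worry in #4 (again, entirely synthetic numbers): the true effect here is zero, yet contaminating one percent of one group manufactures an "effect" several standard errors from zero.

```python
# Illustration for #4: a little impurity in the data can manufacture
# (or hide) a small "effect". Entirely synthetic numbers.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
control = rng.normal(100, 15, n)
treated = rng.normal(100, 15, n)      # the true effect of the intervention is zero

# Contaminate 1% of the treated group with junk readings.
bad = n // 100
treated[:bad] += 300

effect = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / n + control.var(ddof=1) / n)
print(f"estimated effect = {effect:.2f} (SE {se:.2f})")
# About 3 units, several standard errors from zero -- an "effect" made
# entirely by 1% impurity in the data.
```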
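
And a last sketch, for the sensitivity check in #6 (the cash flows and the rates are made up): the present value of the same stream of benefits swings by roughly a factor of three across plausible discount rates, so a conclusion that depends on the exact number is only as credible as the chosen rate.

```python
# Present value of the same stream of benefits under different discount rates.
# The cash flows (10 per year for 50 years) and the rates are invented.
def npv(rate, cashflows):
    """Net present value of cashflows[t] received t years from now."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

benefits = [0] + [10] * 50
for r in (0.01, 0.03, 0.05, 0.07):
    print(f"discount rate {r:.0%}: NPV = {npv(r, benefits):.0f}")
# The answer swings by roughly a factor of three across these rates; a
# conclusion that hangs on the exact NPV hangs on the choice of rate.
```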

Saturday, March 8, 2014

Precision in Social Science Research

For many good reasons, social science research and its statistics and numbers function differently from what many natural scientists are able to do. This is not to impugn such research, for often it is of great, and appropriate, import.

1. When numbers are quoted, it is likely that they have between one and two significant figures at best. That is, even if the quoted errors or dispersions are small, one would not bet on the actual values (whatever that might mean) being known so precisely--so 0.21 and 0.24 are likely indistinguishable in actuality, even if their errors are quite small.

2. Hence, when one compares two numbers, it is unlikely that a difference in say the second figure is credible. So it is quite unlikely that 0.21 differs from 0.24, no matter what the errors, in actuality.

3. Rarely is research repeated in such a way that the numbers reported in one project are checked to better than one significant figure. That is, 0.2 and 0.3 are effectively the same, no matter what the errors.

4. Dispersions, say standard errors or standard deviations, are measures either of variation in the data (variation you have no way of accounting for) or of the reliability of, say, regression coefficients. They have no intrinsic significance. In finance, by contrast, dispersions are measures of volatility and so of risk in the context of some equilibrium.

5. Explanations, even if there is good reason to take them as causal, will explain only some fraction of the variation (the R-squared). Enormous effort goes into showing that the unexplained variation does not much influence the explained fraction, and is merely noise. (A sketch at the end of this post illustrates the point.)

6. What you are aiming for is a sense of what's influential, what is much less influential, what's worth attending to in making a decision. Rarely is the actual number, even with its claimed precision, so crucial.

7. When a particle physicist or an atomic physicist quotes a number, it often matters at the level of 3-10 significant figures, and if done properly, the error is in the last figure or so. Often there is good reason to have the number be not only precise but accurate, in the sense that you have good reason to believe what the number should be in the context of theory and other numbers. Differences of two means, for example, may represent mass differences, or masses that are or are not zero, and these differences or tests for nonzero have great importance, and may well be a matter of differences in the last digit of three significant figures. Often the differences can be measured with much greater precision than can the two numbers being compared. Research is repeated very often, if not often enough, in the sense that a particular claimed number is checked by a different method, a method that claims to investigate the same fact. And dispersions are not always just measures of variation, but often have deep meaning, much as in finance. In general, physicists would bet the farm on the claimed number, with its precision, as being accurate.
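
A small sketch of the point in #5, with synthetic data (the numbers and the half-unit "effect" are mine, purely for illustration): a genuinely causal effect that explains only a modest fraction of the variation, so that everything rests on the residual really being noise.

```python
# Illustration for #5: a real (by construction) effect that accounts for
# only a small share of the variation. Synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n = 2_000
x = rng.normal(0, 1, n)                  # the explanatory variable
y = 0.5 * x + rng.normal(0, 1.5, n)      # a true effect of 0.5, plus a lot of noise

slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # OLS slope
residual = y - slope * x
r_squared = 1 - residual.var() / y.var()
print(f"estimated slope = {slope:.2f}, R-squared = {r_squared:.2f}")   # about 0.50 and 0.10
# The slope is real here by construction, but it accounts for only ~10% of the
# variation; the claim stands or falls on the other 90% really being noise
# (no omitted variables correlated with x, no selection).
```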

Thursday, March 6, 2014

What you are measuring...

I was trained as a physicist. When we made a single measurement, of an event or whatever it might be, we had a sense of the measurement error, what we called systematic error. It usually depended on the quality of our apparatus and its limits. In the case of rare events, we might even have a sense of a Poisson process and so say that we saw an event at time T. In any case, whatever it was that we were measuring was as real as anything, not an artifact.

When we combined measurements--a combination of the data, that is, we were being statistical--we might expect to get something like a Gaussian with a mean and standard deviation (or some other expected or surprising distribution). The more events we had, the better known were those statistics. And we believed that those statistics were referring to something as real as anything--say the mass of the particle and its lifetime (inversely related to the width of the distribution)--with an unavoidable systematic error that affected our statement of the standard deviation and perhaps that of the mean.

If you wondered whether the mean were different from zero, you wondered whether some real thing were different from zero--say the mass of the tau neutrino. If you wondered whether the means of two different measured quantities were different, you were wondering whether, say, the masses of the two K particles were different. You would take the width or standard deviation seriously, because in fact there was surely some systematic error and there was an intrinsic width of the measured quantity, tied to its lifetime; but still the mean, the mass, was something real, and you would quote the mean with an error and the standard deviation with an error (systematic and statistical).

In social science studies, as far as I can tell (and I have become acutely aware of this only recently), again we make a single measurement and have a sense of the measurement error (if that makes sense in this context: sometimes it may be a matter of whether people are reliable reporters in a survey, whether the data are dirty, ...). Again, we might say that whatever we are measuring or surveying is as real as anything.

Again, "When we combined measurements, a combination of the data, that is we were being statistical, we might expect to get something like a gaussian with a mean and standard deviation (or some other expected or suprising distribution). The more events we had, the better known were those statistics."  And we act as if those statistics were referring to something as real as anything--say the average height of a population. But almost always there is no reason to take those statistics as real, they were artifacts of our combination and we had no theory that gave that number a deep reality--or so is my observation of actual practice. And of course there might well be "an unavoidable systematic error [ore measurement] that affected our statement of the standard deviation and perhaps that of the mean."

If you wondered whether the mean were different from zero, you would check the power of your statistic and see how well measured it was (the standard error), and so you might get a good sense of whether the mean were different from zero. But there is nothing real about it in the sense that a particle mass is real. It is a statistical measure of an artifact, the average height of a population. Presumably the width is substantial but not overwhelming, and it shows the dispersion of heights.

However, say you wanted to check whether the difference in heights of two populations were significant. Surely you can do much as the physicists would do, and see whether the difference of the means were statistically significant (and say that the systematic or measurement errors were not important). But say as well that the distributions overlapped substantially. You could surely say something about whether the means were different. Still, I would find it hard to take such a statement very seriously: the distributions overlap so much that any problems in the data would make me skeptical that the measured difference were credible in actuality. (A small sketch of this situation follows.)
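
A rough sketch of that last worry, with made-up heights (the populations, gap, and spread are mine, purely for illustration): the difference of means comes out many standard errors from zero, yet it is only about a tenth of a standard deviation, so most of each population sits inside the other.

```python
# Two invented height samples: the mean difference is "significant" by the
# usual large-sample comparison, but the distributions overlap heavily.
import numpy as np

rng = np.random.default_rng(3)
n = 4_000
group_1 = rng.normal(170.0, 7.0, n)   # heights in cm, invented
group_2 = rng.normal(170.8, 7.0, n)   # a "true" gap of 0.8 cm

gap = group_2.mean() - group_1.mean()
se_gap = np.sqrt(group_1.var(ddof=1) / n + group_2.var(ddof=1) / n)
print(f"difference of means = {gap:.2f} cm, about {gap / se_gap:.1f} standard errors from zero")
print(f"difference as a fraction of the SD = {gap / group_1.std(ddof=1):.2f}")
# The test says "significant", but the gap is ~0.1 SD: the distributions
# overlap almost completely, and a little dirt in either sample could
# produce or erase the whole difference.
```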

Wednesday, February 26, 2014

Skating to where the puck will be... +Other ideas.

Referring to a strong neuroscientist, someone said, "He is really good in that Wayne Gretzky way of skating to where the puck will be."

It seems that in Brecht's Mother Courage, the title character says something to the effect that you need courage when someone has failed to plan correctly. So you need courageous soldiers when the generals screw up.


Saturday, February 22, 2014

Jury-rigged and Jerry-built

Jury-rigged refers to an improvised repair. [It comes from nautical use, jury-rigging a sail.]

Jerry-built refers to shoddy workmanship.

Monday, February 17, 2014

What you can do in a classroom--and why there's room for a person in the classroom

I have been thinking about what made the difference in my higher education, college and graduate school.  What mattered was the interaction with my teachers, and watching them talk or lecture or comment. Actually, a good fraction of the time, nothing much mattered, but when it mattered it was that person thinking about a subject.

I studied physics and did my advanced work in elementary particle physics. My teachers were didactic to varying degrees, but what mattered was watching them think, work out problems, and explain stuff. You could learn most of physics from textbooks and doing problems, but what you needed to learn was to think like a physicist. Now the Feynman textbooks convey that rather well, but I know of no other textbooks in physics, at any level, that really teach you this. So I was stuck with my teachers (about 7 or 8 of them had or eventually received Nobel Prizes in physics). Watching them think is how I learned what I learned: how to think like a physicist about the world. I imagine that watching videos or films of those lectures might have done the work, but often it was the less polished, more impromptu moves that made the difference.

I also had a heavy dose of the great books and great thinkers, in literature, social science, and politics. Again, what I needed to master was how to read and think about the world of imagination and ideas, of culture and society. Actually, my teachers were not in general so good at teaching me this, and it was only in my later years, ten to thirty years post-PhD, that I began to learn to think in the ways humanities scholars do. As for the social sciences, for those that emulate something like physics, I can see what they are doing. Where there is something more subtle, say fieldwork in sociology, my humanities training was what was crucial. Trained as a physicist, I was never much diverted by people pulling out equations or models--for I knew that what mattered was the basic ideas behind them--and I discovered that many of my colleagues in the social sciences were so involved with the formalism that the ideas escaped their consideration.

In other words, what mattered to me was to learn to think. I imagine that if I studied a field where substantive detailed knowledge was crucial, I would not have been very adept, and perhaps those fields benefit from distance learning technologies and other didactic methods. But if you want to learn to think, you have to watch people do it, and model yourself after them. And it matters if you are in the room with them, engaged with them, and having a sense of what is at stake.

I do not do much didactic teaching. I don't know what to teach. Rather I teach people how to think about matters of public policy and city planning, about methodology and reliable knowledge, about critical analysis of scholarly work. I can write down some rules. But what matters, I believe, is watching me in action, and having me take on a student's work and try to make it better.  I will discuss reading and try to give people a sense of what matters in the text we are analyzing. I will take on questions and try to see how the question relates to what we are studying. I am a performer, an intellectual performer, and I live and die by my ability to think.

Monday, February 10, 2014

Technical Yiddish and Formal Mathematics: Bupkis and Unbelievable Chutzpah, Nonstandard Analysis

My colleague Ed Kleinbard of the USC Gould School of Law referred to the comparatively small amount of money you can raise in taxes from the ultra-rich as bupkis ("This is what is known in public finance circles as bupkis.")--that is, goat droppings, the Yiddish term for nothing or for something quite small.

In an earlier quote, he referred to some of Apple's tax strategy as Unbelievable Chutzpah (“There is a technical term economists like to use for behavior like this, Unbelievable Chutzpah.”), whereby locating economic activity in low-tax nations reduces taxes--with a very large number of dollars involved.

Such technical Yiddish has a theoretical structure in mathematics. There is in mathematics a whole theory of very small and very large quantities (the large being 1/small)--nonstandard analysis. It arose out of mathematical logic, set theory, and what is called model theory. It turns out to be very useful for talking about infinitesimals, etc., where 4 x epsilon is still epsilon, and for many other mathematical notions.
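
For the record, a compact statement of the idea (my paraphrase of the standard definitions, not something from the post or from Kleinbard):

```latex
% Infinitesimals in the hyperreals, stated compactly (standard definitions,
% paraphrased): an infinitesimal is smaller than every ordinary fraction,
% and multiplying it by an ordinary finite number leaves it infinitesimal.
\varepsilon \text{ is infinitesimal} \iff |\varepsilon| < \tfrac{1}{n} \text{ for every standard } n \in \mathbb{N};
\quad \text{then } 4\,\varepsilon \text{ is still infinitesimal, and } \tfrac{1}{\varepsilon} \text{ is unlimited (larger than every standard } n\text{)}.
```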

Friday, February 7, 2014

I give up... You can't tell them if they don't want to hear.

Briefly: I will advise a colleague or a student about how to be more effective, with suitable disclaimers that I might well be wrong. They then defend themselves, in effect ignoring what I said about their behavior, justifying what they did without paying attention to matters of decorum or style. I have in the past tried to get back to them, to make them realize what I was trying to say.

I give up...

Thursday, February 6, 2014

Potshots vs. Deep Questions

Often one goes to a seminar, and there is a continuing series of questions that are ok, but in fact are potshots at the speaker's work. What you want to do is to ask deep questions, questions that allow the speaker's work to be rescued from idiocy. But only some scholars are capable of rising to the occasion. If you are one of them, you want to avoid potshots. Another way of putting it is that a great scholar takes a dumb question asked by a student or colleague and converts it into an interesting and deep question and answer.

To a really talented scholar, I would say: You are too good to spend your time or effort shooting things down. You ought to launch your own rockets. That is, when someone is giving an unsatisfactory talk, you want to ask a question that allows them to grow, and in effect that allows your deep intelligence to inform the conversation. I realize that what I am saying is not so sweet, but since I have respect for you, I will keep saying it. Of course, all of what I am saying could just be wrong, but at least I am putting myself on the line.

Your job, given your reasonable request for theory and mechanism, is to present your observations in terms of a theory rather than as a series of scattershot observations. You might have said, "I just want to be sure we understand that your work tells us nothing much about the motivations or scheming of these actors. In fact I can imagine a whole variety of innocent explanations for your observations. Here are some..." I realize this is hard to do, but you don't want to be too smart by half. The ability to discern such innocent explanations is nice, but it gets you nowhere unless you also start thinking about how you would go about checking those explanations out. That is why I mentioned fieldwork as well as statistical studies. The reason I am so sharp here is that you are more than too smart by half; you are really smart and deep. You don't want to diminish your power by being that half.

What makes Chicago/Moscow-style objections work in a seminar is that the objections are never cheap. They reveal the depth of the questioner. You have all that depth.

Now of course, I understand that you could be frustrated by a talk that does not do what you think it should, and that may have driven you off course. What I usually do, about 15 minutes in, is try to ask a question to find out what is going on. That question almost always puts me at risk, since I am making a positive claim: "Is what you are doing X, Y, and Z?" Potshots are a waste of your depth.

Monday, February 3, 2014

When a Project Fizzles

Right now I am thinking about my next project. I wrote an outline for a presumed book, and in the last day or so I expanded it into an essay of about 2500 words. As I was writing, the book seemed rather more pedestrian than I had imagined; the ideas seemed rather more obvious. Of course, the book would have had detailed examples and cases. But for the moment, these 2500 words do not look so promising. I have said what I wanted to say, and perhaps no more should be said.

I won't know until a few days from now, when I reread what I have written, fix it up a bit, and share it with a few people who are more expert in the field than I am.

If it fizzles, I will be grateful that I have discovered this well before I have put in much more time and effort. Of course, one is disappointed, but at least the air is cleared.

We'll see.