The Design Of Experiments vs the design of experiments

1. Jan 23, 2012

Stephen Tashi

There are always a large number of posters in the statistics sections of math forums (in physicsforums, but also on other math websites) who present their situation as: "I've collected all this data, how do I analyze it?"

These posts are not a random sample of the population of all people who analyze data. I'd think that they overrepresent those who are confused about the procedure. However, when I think back on my own education, statistics courses didn't emphasize that one should plan the statistical analysis of data before the decisions about how to collect the data are finalized. (The only nod to that problem was the topic of how large sample sizes should be to give certain levels of "confidence".)
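As an illustration of that one "nod", here is a minimal sketch of a sample-size calculation done before any data is collected. It assumes the goal is a two-sided z-interval for a mean with a known standard deviation; the function name `sample_size_for_mean` and the example numbers are hypothetical and do not come from the thread.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n whose z-interval half-width is at most `margin`.

    Assumes the population standard deviation `sigma` is known (or a
    planning guess) and that a normal approximation is acceptable.
    """
    if sigma <= 0 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("sigma and margin must be positive, confidence in (0, 1)")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve z * sigma / sqrt(n) <= margin for n and round up.
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical planning values: sd guessed at 10 units, want +/- 2 units.
    print(sample_size_for_mean(sigma=10.0, margin=2.0))
```

The point of the sketch is only that the choice of analysis (here, a z-interval) has to be made first, because it determines how much data is needed.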

There is a discipline called "The Design Of Experiments", but it appears to focus on a very specific kind of model for the data, one where the dependent variables are multinomial functions of the independent variables. (Is my characterization unfair?)

I'm curious whether modern university curricula include courses about the design of experiments in the general sense of the word "design". Or are students still encouraged to take the shock treatment: rush to get the data collected and then... hmmm... how am I going to analyze this?

2. Jan 24, 2012

chiro

Hey Stephen Tashi.

I took a subject titled "Sample Surveys and Experimental Design", an upper undergraduate/graduate course covering these two topics over a 13-week semester. I will refer to this course in my discussion.

In the "Experimental Design" part of the course we focused mainly on the standard linear model treatment of analysis, but we did do some GLM analysis with computer packages. We also had to design our own experiment and analyze it correctly.

A lot of the course involved designing experiments and understanding the implications of a design in a physical context, a mathematical (model) context, and an analytic (analysis/results/interpretation) context.

So with regard to your question, we had to make sure we could understand the data and how to properly categorize it in the context of an appropriate design. We had to think about how to analyze both the context of the data (the experiment) and the actual data in the right manner.

This was a harder course for me because it required a lot more thinking than some of my other subjects, and I am far from having a decent understanding of experimental design in statistics at this very moment.

We also had to identify fixed and random effects and put them into context.

So in conclusion, I think the course was pretty thorough given the amount of time that was spent on the material (about 6 weeks). Note, however, that for an honors or graduate year there is another course that goes into material that isn't covered here, and I think it's important to note that statistics (as I am finding out) takes a little while to learn.

I remember the professor who taught the subject telling us how much of a shock it was when he went out into the 'real world', where he had to really learn this kind of thing because the training in university was of a different type (more mathematical). Since I intend to get into this kind of area, I will probably have a lot of trouble too, given that I took a more theoretical course.

Also, before I forget, you have to remember that I am a statistics major. I highly doubt that a non-statistics major would allocate enough time to even a fraction of the stuff we had to cover. It would be interesting to note how many of the threads in question come from standard science majors or engineers who take the very basic probability/hypothesis testing class that just pumps out formulas and statistical tests without really explaining what they are doing.

3. Jan 24, 2012

Stephen Tashi

That's something that interests me, too.

I suppose many people in the "hard sciences" doubt the importance of statistical studies done by social scientists, psychologists, economists etc. Perhaps I share such prejudices! So it always surprises me when posters working in physics, astronomy, electrical engineering etc. reveal that they have collected data and only now are thinking about what statistical methods should be used to analyze it.

There are situations where one can only get data from experiments done by others, and it's understandable that a person can't pre-plan the statistical analysis of such data. However, when the person sets up the experiment or exerts himself to pick out certain information from a vast collection of information, I'm surprised when they don't think about the statistical analysis before they do the data collection.

4. Jan 25, 2012

chiro

Just to set the context: I took a course in Bayesian Inference last semester (as with the design course, I could have done better).

One thing I noticed when I took the course is that a lot of the algorithms for generating very complex distributions using the conditional statistics of the Bayesian approach are very recent. The dates on the papers were on the order of ten years ago.
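The post does not name the algorithms, so the following is only a guess at the kind of method meant: a minimal Gibbs sampler, which draws from a joint distribution by repeatedly sampling each variable from its conditional distribution given the others. The target here (a standard bivariate normal with correlation `rho`) and the names `gibbs_bivariate_normal` and `sample_correlation` are assumptions chosen for illustration, not anything taken from the course described.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Draw (x, y) pairs from a standard bivariate normal via Gibbs sampling.

    Uses the full conditionals x | y ~ N(rho * y, 1 - rho^2) and
    y | x ~ N(rho * x, 1 - rho^2).
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    rng = random.Random(seed)
    cond_sd = math.sqrt(1 - rho * rho)
    x = y = 0.0
    samples = []
    for i in range(burn_in + n_samples):
        # Update each coordinate from its conditional given the other.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if i >= burn_in:  # discard early draws that depend on the start point
            samples.append((x, y))
    return samples


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pairs)
    sxx = sum((p[0] - mean_x) ** 2 for p in pairs)
    syy = sum((p[1] - mean_y) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
    print(round(sample_correlation(draws), 2))
```

In this toy case the joint distribution could be sampled directly; the conditional approach matters when only the conditionals are tractable, which is the situation the post seems to describe.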

I have noticed that these methods seem to be used everywhere from health all the way up to particle physics to analyze accelerator results data.

The thing is though, that I would imagine that really knowing these algorithms inside out in the context of a proper statistical analysis (relevant assumptions, context of inference and so on) would take quite a long time to really grasp solidly.

For me, I've taken a full course on this and I don't yet have the proper insight to do the above although I understand what is being done and how I can use the technique to answer things in a more limited statistical context.

I can't imagine a scientist really having this kind of insight as well unless they either a) spent a lot of time with an experienced statistician or b) had to do very specific research that required them to do a lot of solid statistics.

Another thing is that I think a lot of these people see statistics as a tool that is "secondary" to their work. To me, it looks like a minor distraction from what they are really concerned about, which ultimately depends on their field. If this is the case, I can understand where they are coming from.