Science has progressed greatly through the experimental method: it lets us control the important variables, including the key independent variable, and it lets us randomly assign subjects to a treatment or control group, which controls for confounding variables. I will give a short introduction to the experimental method below. Another key element of this process is the sample size of a study. If the sample size is too small, the effects we find are highly untrustworthy, which is extremely disruptive to science. However, I think the ‘publish or perish’ mentality of modern science forces scientists to publish multiple papers per year. This means scientists cannot possibly be expected to conduct research with sufficiently large samples, because such research is too time consuming.
Experiments work by randomly assigning subjects to a control group and a treatment group. Treatment can mean anything here: treating a cancer patient with a new drug, teaching pupils with a new method, giving people a piece of information that the control group does not receive to see whether it changes their behaviour, or even bombarding a sub-atomic particle with another sub-atomic particle. In physics and chemistry random assignment is often not very important, because the subjects are all identical, or differ on sufficiently few variables that we know what to control for. However, when studying more complex things, such as plants, animals and people, random assignment is vital. People can differ on so many things that we can never control for all of their differences. For instance, suppose we know someone is depressed, we gave that person a certain drug, and he became less depressed. The drug may have worked, but he may also have changed his diet, or perhaps he met a wonderful, smart, kind and inspirational woman. By randomly assigning subjects to the control and treatment groups we can ‘randomize away’ these differences: on average the treatment group and the control group will be the same. The control group will also contain subjects who may have changed for other reasons, and probably roughly as many as in the treatment group. If the treatment group then shows a different result than the control group, we can be fairly confident that it was because of the treatment, and not because of all those other things changing.
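The logic of ‘randomizing away’ differences can be sketched in a few lines of Python. Everything here is invented for illustration: the subjects, the unobserved ‘life change’ trait, and its 30% base rate are assumptions, standing in for any confounder we cannot measure.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical setup: each subject carries an unobserved trait -- say, a
# recent positive life change -- that would also lift his or her mood,
# independently of any drug. We assume 30% of subjects have it.
n = 1000
subjects = [{"life_change": random.random() < 0.3} for _ in range(n)]

# Random assignment: shuffle the subjects, then split them in half.
random.shuffle(subjects)
treatment, control = subjects[: n // 2], subjects[n // 2:]

def share(group):
    """Fraction of a group that has the unobserved trait."""
    return sum(s["life_change"] for s in group) / len(group)

# With enough subjects the confounder ends up roughly balanced across the
# two groups, so any outcome difference can be credited to the treatment.
print(f"life change: treatment {share(treatment):.2f}, control {share(control):.2f}")
```

Note that the balance is only approximate, and only reliable because the groups are large; with ten subjects per group the two shares could easily differ wildly, which is exactly the point the next paragraph develops.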
However, randomized assignment is not the only important factor when conducting an experiment. For randomized assignment to function properly, the treatment and control groups must also be fairly large. The human brain, for instance, is immensely complex (scientists recently described a normally functioning man with an unusually small brain); to properly randomize away the many variables that matter, you need enough cases to have individuals at different values of a significant number of those variables. Studies with a small sample size are more likely to miss an effect of a treatment that is actually there (a Type II error), and more likely to find an effect of a treatment that is actually not there (a Type I error). This is because they do not include enough subjects to ensure that the control group and the treatment group would have behaved the same on average. Unfortunately, the sample size of many investigations is too small. Researchers have noted that in neuroscience small sample sizes fundamentally undercut the reliability of the research. This problem is widespread in science, and it is exacerbated because scientists and journals tend to publish confirmations and not falsifications. This means that a study with an insufficiently large sample that finds an effect gets published, while a study that finds nothing is discarded. Whole branches of science face replication crises, including biomedicine, psychology, and experimental economics. Much of what we thought we knew in those areas turns out to have been a statistical artifact.
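The link between sample size and Type II errors can be illustrated with a small simulation. The numbers are assumptions chosen for illustration: a real treatment effect of 0.5 standard deviations, a simple z-style test at the conventional 5% threshold, and three hypothetical study sizes.

```python
import random

random.seed(0)  # fixed seed for reproducibility

def experiment(n, effect=0.5):
    """Simulate one experiment with n subjects per group. The treatment
    genuinely shifts the mean outcome by `effect` standard deviations.
    Return True if a simple two-sample z-style test detects it."""
    control = [random.gauss(0, 1) for _ in range(n)]
    treated = [random.gauss(effect, 1) for _ in range(n)]
    diff = sum(treated) / n - sum(control) / n
    se = (2 / n) ** 0.5  # standard error, assuming known unit variance
    return abs(diff / se) > 1.96  # the conventional 5% significance bar

def power(n, trials=2000):
    """Fraction of simulated experiments that detect the (real) effect."""
    return sum(experiment(n) for _ in range(trials)) / trials

# Small studies miss a genuinely real effect most of the time.
for n in (10, 30, 100):
    print(f"n = {n:3d} per group -> effect detected in {power(n):.0%} of runs")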
While this is problematic in itself, and extremely disruptive to the scientific process, I think it is made even worse by the ‘publish or perish’ mentality in modern science. Nowadays, many scientists must publish a certain number of articles per year to keep their jobs or to get promoted. However, conducting research with a large sample is time consuming. How can a scientist be expected to study 1000 individuals for a single experiment and still publish multiple articles per year? These demands are obviously impossible to satisfy simultaneously. So, even though recommendations to increase sample sizes are quite common in psychology, sample sizes in psychological research have not actually increased since 1955. It is not hard to find a culprit: by forcing scientists to publish multiple papers per year, universities and politicians do not give them the time to perform at the best of their ability.
Bottom Line: Experiments in complex environments require a large sample size to yield trustworthy results. I hypothesize that because scientists have to publish multiple articles per year, they cannot possibly conduct research with sufficiently large samples.
Image: Schulz KF, Altman DG, Moher D; for the CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMJ. 2010 Mar 23;340:c332. doi: 10.1136/bmj.c332. (http://www.bmj.com/cgi/content/full/340/mar23_1/c332)