Science

Tweaking your research

Flimsy science can be dangerous science

Leaving out anomalies. Adding extra participants. Not replicating a study, or repeating it until a certain result shows up. Many researchers use techniques like these to ‘optimise’ their data: they tweak it until they find something statistically significant.
By Emily Howard / Illustration by Kalle Wolters

Stephan Schleim, associate professor of theoretical psychology, was taken aback when he saw the data from a joint project with a colleague. To his dismay, the data had clearly been manipulated. ‘He was very quick with removing data of subjects who were behaving in contradiction with our hypothesis’, says Schleim. ‘Of course, then we had a significant behavioral effect.’

Schleim wasn’t surprised: he knows that these practices are commonplace. Still, he wasn’t comfortable working with the researcher. ‘I stopped the collaboration with this researcher and the paper was, to my knowledge, never published,’ he says.

However, many papers like this are published. Only 37 out of every 100 psychology studies published in 2008 could be reproduced – and even in those, the effects were much smaller than originally reported. It is believed that many of the researchers ‘optimised’ their data.

Saintly behaviour

‘You have at one extreme saintly behaviour, which is non-existent. On the other extreme you have the wilful manipulation of information for the purpose of deceiving. But in between is where most researchers find themselves’, says psychology lecturer Anastasios Sarampalis.

There’s no space for failure in science

Researchers may exclude or include variables that skew results in a certain direction. Or they run statistical tests while the data is still coming in: as soon as the effect turns significant, the experiment is over.

There are thousands of ways to get the results you want, explains statistician Rink Hoekstra. ‘And many of them seem perfectly normal to most people, like adding a few participants. But it’s really like throwing dice multiple times and then saying: “Hey, I have a six!”’
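Hoekstra’s dice analogy can be made concrete with a little arithmetic. The sketch below (not from the article; the function name and roll counts are illustrative) computes the chance of seeing at least one six when you are allowed to keep rolling:

```python
# A quick sketch of the dice analogy: one roll gives a six 1/6 of the time,
# but if you may keep rolling and report only the first six, "success" is
# almost guaranteed. Re-testing tweaked data works the same way.
from fractions import Fraction

def chance_of_a_six(rolls: int) -> Fraction:
    """Probability of at least one six in `rolls` fair die rolls."""
    return 1 - Fraction(5, 6) ** rolls

print(chance_of_a_six(1))          # the honest, single-shot chance: 1/6
print(float(chance_of_a_six(10)))  # after ten tries: "Hey, I have a six!"
```

With ten attempts, the probability of a ‘six’ climbs to roughly 84 percent – which is why a result obtained this way says little.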

No space for failure

One explanation for engaging in problematic research practices is the way publishing works. In order to publish, researchers must produce results that pass the statistical significance threshold. ‘A p-value less than 0.05 means that you publish, and a p-value greater than 0.05 means that you don’t publish’, explains Nick Brown, a PhD candidate in the Faculty of Medical Sciences.

‘There is so much pressure on publishing statistically significant data so you can get into the journals that are expected if you want to make a career in science’, says Schleim. He explains that grants and tenure are only given to researchers who have published enough studies. ‘The competition is so fierce; if you said that the study didn’t work, maybe you’d be out of the game. There’s very little space for failure in science.’

Brown agrees. ‘You don’t have to be a bad person at all; people will do a lot to guarantee the future of their monthly paycheck,’ he says.

Paid per patient

Questionable research practices are not limited to the social sciences. ‘There have been some cases of medical doctors making up data, because they are paid per patient and so for each new recorded patient, they get money,’ explains Hans Burgerhof from the Faculty of Medicine. But he is quick to add that this is not common practice.

We’ve got a scientific publishing system that doesn’t really care if your results are true or not

Tamalika Banerjee teaches scientific integrity in the Faculty of Science and Engineering, and says that the problem is less common than it used to be. In 2002, German physicist Jan Hendrik Schön was exposed as a fraud. ‘That was a significant reason why people started speaking up, and why we are at quite an advanced stage of recognising misconduct,’ explains Banerjee.

Now, physics students attend mandatory scientific integrity courses. They learn about their ethical responsibilities and how to recognise scientific misconduct.

Statistical struggles

But problematic behaviours aren’t always intentional. Statistician Casper Albers suggests that researchers don’t have enough statistical knowledge. ‘It does sound very sensible to say, “let me just collect ten more participants to see if I get to the significance level”,’ Albers explains. ‘But you have to be a statistician to know that actually, if you do that, the computation of the p-value becomes different.’
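The effect Albers describes can be seen in a small simulation. The sketch below is a minimal illustration (the sample sizes, batch size, and normal approximation to the t-test are assumptions for the sake of brevity): even when there is truly no effect, peeking at the p-value after every extra batch of participants and stopping at p < 0.05 pushes the false-positive rate well above the nominal 5 percent.

```python
# Sketch of "optional stopping": under a true null hypothesis, checking the
# p-value after each batch of extra participants and stopping as soon as
# p < 0.05 inflates the false-positive rate above the nominal 5%.
import math
import random
import statistics

def one_sample_p(xs):
    """Two-sided one-sample test p-value against mean 0 (normal approx)."""
    n = len(xs)
    s = statistics.stdev(xs)
    if s == 0:
        return 1.0
    t = statistics.mean(xs) / (s / math.sqrt(n))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

def experiment(rng, start=20, step=10, max_n=100):
    """Collect null data; peek after every extra batch of participants."""
    xs = [rng.gauss(0, 1) for _ in range(start)]
    while True:
        if one_sample_p(xs) < 0.05:
            return True          # "significant" -- stop and write it up
        if len(xs) >= max_n:
            return False         # give up at the maximum sample size
        xs += [rng.gauss(0, 1) for _ in range(step)]

rng = random.Random(42)
hits = sum(experiment(rng) for _ in range(2000))
print(f"false-positive rate with peeking: {hits / 2000:.1%}")  # well above 5%
```

With nine ‘looks’ at the data, the simulated rate of spurious significant findings is roughly double the 5 percent a researcher thinks they are working with.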

‘Doing statistics properly requires skills that are more at the level of flying a plane than driving a car,’ adds Brown.

What’s more, the over-reliance on statistical significance doesn’t necessarily result in interesting research. ‘This statistical significance threshold has become a fetish, but something that’s statistically significant might not be interesting at all’, says Schleim.

Fatal consequences

Regardless of intentions, manipulating data has detrimental impacts on the academic community. ‘As someone who is a recent newcomer to science, I found it absolutely terrifying that we’ve got a scientific publishing system that doesn’t really care if your results are true or not’, says Brown. Even more grave are the impacts that these studies might have on the wider world.

‘Very flimsy scientific findings end up being picked up by university press agencies, media and the public, and can become a feedback-loop for extremely dangerous science’, explains Sarampalis. He refers to the Wakefield case, where a British medical doctor claimed to find a link between vaccines and autism. As a result, thousands of parents decided not to have their children vaccinated – and diseases which had almost been eradicated reappeared.

‘The world is not just cherries’

Is there a way to prevent researchers from optimising their data? Hoekstra thinks more careful publishing policies might work. An example is pre-registration, where journals agree in advance to publish a study regardless of its results. ‘In a situation where journals can pick studies after looking at the results, they pick out the cherries and forget about the rest. But the world is not just cherries’, says Hoekstra. Pre-registration is common in medicine, where studies are often expensive and have to be financed in advance.

Open access papers are cited roughly eight times more

Another solution is publishing the data itself. ‘If science is about publication – to make things public – it’s very odd that we have to discuss whether or not people are sharing their data,’ says Schleim.

Data sharing is normal in physics, where other researchers often base their studies on techniques used in previous data sets. Now, across all fields, the RUG’s policy states that researchers must store their data on an internal hard drive accessible to other researchers.

Sharing data has benefits for researchers too, explains Albers. ‘Studies show that if you also share the data behind the paper, you are cited more often. Open access papers are cited roughly eight times more,’ he says.

Remove the pressure

But Schleim doesn’t think that these measures are a cure. ‘More rules won’t solve the problem,’ he says. He thinks there will always be loopholes for data optimisation. ‘One thing that we cannot ignore anymore is that we must remove the pressure from the system. It’s not just harming scientists, it’s harming science.’

The future generation of researchers is becoming aware of the issues of tweaking data. In behavioural and social sciences, courses such as the theory of science teach students about the validity of studies and the dangers of misrepresenting data. Instead of just learning statistical methods, students are also taught how to use them in different contexts – and the risks of misusing them.

‘When I’m optimistic, I say that there is a resurgence in this introspective understanding of what it means to do good-quality research’, says Sarampalis. ‘As with a lot of things, I put my money on the new generation to take this new way of thinking and embrace it.’
