Excuse me, could you repeat that again?
No? Then how is this even being published in a scientific journal?
Many are unaware, but science – particularly the social sciences and psychology – is facing a replication crisis: a widespread problem in which attempts to reproduce past studies fail. In one large survey, more than seventy percent of researchers reported having tried and failed to reproduce another scientist's experiments. This is alarming when you consider how often science, along with the often-flawed argument from consensus, is invoked by politicians making significant, history-altering decisions.
In-depth reviews of several highly influential studies are uncovering cracks in science’s framework and highlighting flaws in its methodology, revealing that we can’t always trust what we read, even in reputable scientific journals.
10. It's a Deep and Widespread Issue

In 2018, a group of social scientists, including psychologists and economists, set out to replicate 21 experiments published in the prestigious journals 'Nature' and 'Science'. The review included influential, widely publicized studies, such as a 2011 study examining whether search engine usage impacts human memory, and a study about how reading affects children's ability to understand different viewpoints, also known as 'theory of mind.'
With the benefit of hindsight, the scientists running the replications made their retests far more stringent than the original experiments. To ensure complete transparency, they preregistered their study and analysis plans, a safeguard against any researcher revising conclusions to salvage their original findings. In some cases, they even increased the number of participants fivefold, applying the principle that more data yields more reliable results, thus reducing the risk of isolated evidence being accepted as scientific fact.
Disturbingly, only 13 of the 21 studies – roughly 62 percent, one study shy of two-thirds – replicated successfully under the additional scrutiny. That is far from an acceptable rate for studies published in such renowned and rigorous scientific journals. The takeaway is clear: questionable science is increasingly being accepted as truth.
9. Even Well-Regarded Studies Are Now Under Scrutiny

Picture two students receiving their midterm results. One gets an A, the other a D. Both have passed, but the gap between those two grades is enormous.
Nearly as troubling as the failure to replicate some studies is that even those that pass the test of scrutiny often show dramatic drops in effect size—the difference between the experimental group and the control group—when examined under more rigorous conditions. This was evident in several of the 13 studies successfully replicated out of the original 21. In some cases, the effect size was halved, a significant reduction that suggests the original findings may have exaggerated the impact of the experimental variables.
Here’s a thought experiment: What if science conclusively proved that an apple a day does, in fact, keep the doctor away? The next question would naturally be, 'Well, to what extent?' Many studies present themselves as definitive answers, when in reality, they barely meet the minimum standards. In science, the magnitude of cause and effect is crucial.
The researchers behind the replication studies acknowledged this, noting that the bias in published findings stemmed 'partly from false positives and partly from inflated effect sizes of true positives.' In other words, take the apple with a pinch of skepticism.
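The researchers' point about inflated effect sizes can be made concrete with a toy simulation (the parameters below are illustrative assumptions, not figures from the replication project). If only the small-sample studies that cross a significance-like threshold get "published," the average published effect dramatically overstates the true one:

```python
import random
import statistics

random.seed(42)

TRUE_EFFECT = 0.3    # assumed true group difference, in standard-deviation units
N_PER_GROUP = 20     # a typical small-sample "original" study
N_EXPERIMENTS = 5000

def run_study(n, effect):
    """Simulate one two-group study and return the observed difference in means."""
    control = [random.gauss(0, 1) for _ in range(n)]
    treatment = [random.gauss(effect, 1) for _ in range(n)]
    return statistics.mean(treatment) - statistics.mean(control)

# Roughly the smallest difference that looks "significant" with 20 per group
# (an approximation standing in for p < 0.05).
THRESHOLD = 0.63

results = [run_study(N_PER_GROUP, TRUE_EFFECT) for _ in range(N_EXPERIMENTS)]
published = [d for d in results if d > THRESHOLD]

print(f"true effect:                   {TRUE_EFFECT}")
print(f"mean effect among 'published': {statistics.mean(published):.2f}")
```

Under these assumptions, the "published" studies report an average effect more than twice the true value, which is exactly why a rigorous, high-powered replication tends to find a much smaller effect than the original.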
8. Replication Itself Has Its Own Limitations

Replications are rarely identical. Often, there are complex factors that make a comprehensive analysis of each study challenging, if not outright unfeasible.
In some cases, the replicators themselves exhibit the same carelessness as the original researchers. For example, the previously mentioned study that identified a link between search engine usage and human memory was one of the eight that couldn’t be replicated. However, the replication involved only a word-priming task – testing whether the mere thought of the internet's availability could impair memory – rather than a real-world experiment with actual trivia-based quizzes. The replication also ignored everyday evidence, such as the fact that smartphones have left fewer people able to remember phone numbers – a sign that technological dependence makes fact recall both less frequent and more difficult.
Some studies fail to replicate because of changes in the participants themselves. In 2014, MIT psychologist David Rand published a study on human cooperation. In this study, participants played an online economics game designed to observe collaboration and selfish behavior.
When the experiment didn’t replicate, Rand suggested that the typical pool of online participants had grown too familiar with his game, reducing its effectiveness in simulating real-world behaviors. While this may seem like a stretch, his broader point stands: when an experiment becomes too well-known, it can lose its value.
However, more frequently, experiments fail to replicate simply because they were flawed from the start, as we will explore in the following examples.
7. Some Studies Are Just Downright Absurd

It can be downright unsettling what passes as legitimate science. One experiment that failed to replicate attempted to test whether challenging people's rationality would make them less religious. Though it may seem a bit insulting at first glance – implying that religious beliefs are inherently irrational – the goal was to observe whether people would be more inclined to seek cause and effect in the physical world as opposed to the spiritual realm.
The experiment itself was, frankly, ridiculous. In one phase, participants were asked to gaze at Rodin’s well-known sculpture, The Thinker, for several minutes. In essence, the study sought to determine if looking at a nude statue of a man with his fist thoughtfully resting under his chin would cause people to abandon their belief in a deity.
Sound science? Hardly. In an ironic twist, the study's architect believes the reason it couldn’t be replicated was the small sample size, rather than the absurdity of the experiment itself. "When we asked them a single question on whether they believe in God, it was a really tiny sample size, and barely significant," said University of Kentucky psychologist Will Gervais. "I’d like to think it wouldn’t get published today," he added, in what must be the understatement of the scientific century.
6. Soft Logic: The Marshmallow Test & The Oversimplification of Social Science

Many studies fail to replicate their results because the factors they initially considered were either incomplete or insufficient. A notable example of this is the “marshmallow test”, which originally linked the ability to resist temptation in childhood with future success in adolescence or adulthood.
The experiment itself is both compelling and cruel. A child has a marshmallow placed in front of them, and the researcher offers a choice: If the child can wait for them to leave the room and return, they will receive an additional marshmallow. If the child can’t resist, no second marshmallow is given. Instant gratification on one side, double the reward on the other.
Years later, the children who were able to wait for the second marshmallow scored higher on the SATs, had lower rates of substance abuse, were less likely to become obese, and exhibited better social skills. The conclusion seemed clear: Children who display self-control early in life are more likely to become disciplined, successful adults.
However, a child’s life is far more complex than a simple snack test. When researchers revisited the study, considering additional factors like family background, the correlation vanished. The most probable explanation is that the children who waited had the advantages of proper upbringing, nutrition, and support. They weren’t inherently better—they were simply raised in a better environment.
5. Think Again: The Replication Crisis in Psychology

The replication crisis stands out even more in psychology, perhaps because tracing the origins of thoughts is harder still than identifying causes and effects in behaviors or achievements.
One specific event sparked the psychological community’s much-needed crisis. In 2010, a paper employing well-established experimental methods claimed evidence of something widely considered scientifically impossible: it found that people were capable of perceiving the future.
The issue wasn’t so much the absurdity of the study’s conclusion as how rigorously it was conducted: the work spanned 10 years, during which Cornell University psychology professor Daryl Bem carried out nine experiments, eight of which produced statistically significant results. When the study was published, Daniel Engber of Slate summed up the aftermath: “The paper posed a very difficult dilemma,” he wrote. “It was both methodologically sound and logically insane.”
Unlike the future, the outcome was predictable: the study led to a major reassessment of practices such as the use of small sample sizes, which, compared to larger studies, are more likely to yield results based on pure chance. The best ways to achieve effective randomization – making sure biases and external influences don’t skew the results – are also under scrutiny in a scientific revolution that, though less than a decade old, is still unfolding.
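The sample-size point can be illustrated with a quick sketch (made-up parameters, not data from Bem's experiments). Draw two groups from the same distribution, so any observed difference is pure noise, and see how often each sample size produces an effect large enough to look real:

```python
import random
import statistics

random.seed(0)

def observed_effect(n):
    """Two groups from the SAME distribution: any difference is pure chance."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    return abs(statistics.mean(a) - statistics.mean(b))

TRIALS = 2000
LOOKS_REAL = 0.5  # an arbitrary cutoff for a "noteworthy" effect size

big_small = sum(observed_effect(10) > LOOKS_REAL for _ in range(TRIALS))
big_large = sum(observed_effect(500) > LOOKS_REAL for _ in range(TRIALS))

print(f"n=10 per group:  {big_small / TRIALS:.0%} of no-effect studies look 'noteworthy'")
print(f"n=500 per group: {big_large / TRIALS:.0%}")
```

With 10 participants per group, roughly a quarter of these no-effect studies clear the cutoff by luck alone; with 500 per group, essentially none do.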
4. Psychology’s Replication Reckoning

Unless “Dr.” Peter Venkman from Ghostbusters was onto something, extrasensory perception remains, at best, unproven and, at worst, entirely disproven. So when Daryl Bem employed accepted experimental methods to 'prove' the impossible, psychology was forced to reckon with how studies were being conducted.
This discipline-wide reevaluation led to a 2015 paper published in *Science* magazine. The report highlighted a significant issue: When 270 psychologists attempted to replicate 100 experiments published in leading journals, only about 40% were successfully repeated. The rest either failed or produced inconclusive results. Additionally, many studies that did replicate showed weaker effects than the original studies.
The report’s conclusion was as strong as any statement in a scientific journal. 'After an intensive effort... how many of the effects have we established as true? Zero. And how many of the effects have we established as false? Zero.'
This wasn’t about placing blame but about acknowledging that science is not as straightforward as it was once thought to be. 'It is the reality of doing science,' the conclusion continues, 'even if it is not appreciated in daily practice.'
At the heart of the matter was the conclusion: 'Humans desire certainty, and science infrequently provides it.' The report suggests that there is room for improvement in psychological studies—additional steps and considerations that may lead to more reproducible results. However, the larger point is clear: there is no simple, magical solution.
3. Don’t Bring Me Down (Ego Depletion, Part 1)

Sometimes, it’s the very reasonableness of a study that enables its spread, despite a flawed foundation. To illustrate the complexity and cascading effects of the replication crisis, two consecutive list entries are needed (part 2 follows).
One notable failed replication involves a widely-cited experiment with a key finding that has been referenced over 3,000 times: ego depletion. The concept appears logical, as researchers seemed to show that a hit to one's ego could affect a range of subsequent tasks, such as self-control, decision-making, and problem-solving abilities.
In the experiment, psychologists Roy Baumeister and Dianne Tice placed freshly-baked cookies (yum) next to a bowl of radishes (yuck). Some participants were instructed to eat only radishes, while others could have cookies. Afterward, each volunteer attempted to solve an intentionally impossible puzzle. The cookie group spent an average of 19 minutes on the puzzle, similar to those in a control group who had no snacks. The radish group, however, gave up after an average of only eight minutes.
Baumeister and Tice suggested that this demonstrated a crucial insight: humans possess a limited reservoir of willpower, which diminishes with excessive use. Resisting the urge to indulge in cookies while eating a radish instead is a taxing act of self-control, one that negatively affects our ability to tackle future challenges.
This discovery quickly gained traction, being widely cited to explain a broad range of cause-and-effect relationships, essentially arguing that our willpower reserves significantly influence our ability to successfully complete tasks. In 2011, Baumeister furthered the idea with his best-selling book, 'Willpower: Rediscovering the Greatest Human Strength.'
Accepted wisdom, right? Not quite…
2. Ego Deflation (Ego Depletion, Part 2)

Then, much like its signature cookies, ego depletion fell apart. A 2016 study published in *Perspectives on Psychological Science* detailed a massive 2,000-subject attempt to replicate the effects of ego depletion. The research was rigorous, with two dozen labs across multiple continents taking part and leaving no assumption unchecked (as science should).
The study’s conclusion was blunt, and the results were essentially nothing: “Results from the current multilab registered replication of the ego-depletion effect provide evidence that, if there is any effect, it is close to zero.” Researchers found no clear connection between a blow to one’s ego and the ability to perform subsequent tasks. For example, if you’re good at crossword puzzles, getting kicked in the groin won’t stop you from solving 22 Across.
Just like that, a study that had been cited 3,000 times—and replicated in many forms—was now considered, at best, questionable, and at worst, fraudulent. It’s hard to emphasize enough how deeply ingrained ego depletion was in psychological theory, to the point of being near-canon.
The real issue—and a central challenge of the replication crisis—is why ego depletion became so widely accepted before it collapsed. On the surface, ego depletion seems logical, and we still see it referenced today, such as when a soccer or baseball player in a slump is said to have that struggle carry over into their defensive performance.
Seemingly logical theories often receive the benefit of the doubt—a bias that researchers carry into subsequent experiments. The replications end up being based, in part, on these assumptions, and instead of the house of cards collapsing, it gains another layer, which in turn further supports the original, albeit incorrect, theory.
1. A Simple Correction to Common Sense?

Fortunately, the practice of conducting replications offers researchers across various fields valuable insights into which experiments are likely to succeed in future replications – and, by excluding the rest, helps identify those that are conceptually flawed.
With the replication crisis now widely recognized, more thoughtful replication studies can help refine scientists' instincts about which hypotheses are truly worth testing and which are not. In this way, the careful process of replication can lead to a more practical and grounded approach to new theories and experiments.
Take, for example, a replication study led by psychologist Brian Nosek, Director of the Center for Open Science. This study featured a prediction element, where a group of scientists made bets on which studies they believed would replicate and which would not.
Interestingly, the predictions largely matched the final outcomes, demonstrating that the scientists had a sort of professional 'bullshit detector.' Among the experiments they predicted would fail to replicate was a study claiming that simply washing one's hands could alleviate 'post-decisional dissonance' – a fancy term for the lingering doubt we feel, and try to rationalize away, after making tough decisions.
The encouraging takeaway is that resolving a complex crisis can be greatly aided by a simple quality: common sense. If a study sounds too good to be true, it probably is.
