Lachlan J. Gunn et alia argue in a paper, “Too good to be true: when overwhelming evidence fails to convince,” that in some situations, as you apparently gather evidence in favor of a theory, it becomes less and less likely to be true:
In this paper, for the first time, we perform a Bayesian mathematical analysis to explore the question of multiple confirmatory measurements or observations for showing when they can—surprisingly—disimprove confidence in the final outcome. We choose the striking example that increasing confirmatory identifications in a police line-up or identity parade can, under certain conditions, reduce our confidence that a perpetrator has been correctly identified.
Imagine that as a court case drags on, witness after witness is called. Let us suppose thirteen witnesses have testified to having seen the defendant commit the crime. Witnesses may be notoriously unreliable, but the sheer magnitude of the testimony is apparently overwhelming. Anyone can make a misidentification but intuition tells us that, with each additional witness in agreement, the chance of them all being incorrect will approach zero. Thus one might naïvely believe that the weight of as many as thirteen unanimous confirmations leaves us beyond reasonable doubt.
However, this is not necessarily the case and more confirmations can surprisingly disimprove our confidence that the defendant has been correctly identified as the perpetrator. This type of possibility was recognised intuitively in ancient times. Under ancient Jewish law, one could not be unanimously convicted of a capital crime—it was held that the absence of even one dissenting opinion among the judges indicated that there must remain some form of undiscovered exculpatory evidence.
This is an interesting situation, and someone might suppose that it is one where the evidence changes sides.
But that description is not quite accurate. Suppose you flip a coin thirty times, and get heads every time. Each time you get heads, you receive evidence in favor of the hypothesis, “I am having a really lucky streak.” But each time you also receive evidence, and stronger evidence, in favor of the hypothesis, “This coin is biased.” After flipping it thirty times, you are thus likely to become very convinced of the latter hypothesis, and therefore convinced that the former is mistaken. This did not happen because the evidence switched sides at some point. It happened because the same evidence supported two different theories, and supported one of them more strongly than the other.
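The coin example can be made concrete with a small Bayesian calculation. The following is only a rough sketch under assumptions that are not in the original: it reduces the question to two hypotheses (a fair coin versus a coin that always lands heads), and it assumes a hypothetical prior of one in a million that the coin is biased. The function name `posterior_fair` and the parameter `prior_biased` are made up for illustration.

```python
def posterior_fair(n_heads, prior_biased=1e-6):
    """Probability that the coin is fair after n_heads heads in a row.

    Assumptions (hypothetical, for illustration only):
    - only two hypotheses: a fair coin, or a coin that always lands heads
    - prior_biased is the prior probability of the always-heads coin
    """
    likelihood_fair = 0.5 ** n_heads   # chance of the streak with a fair coin
    likelihood_biased = 1.0            # an always-heads coin gives heads every time
    weight_fair = (1 - prior_biased) * likelihood_fair
    weight_biased = prior_biased * likelihood_biased
    return weight_fair / (weight_fair + weight_biased)


if __name__ == "__main__":
    for n in (0, 10, 20, 30):
        print(n, posterior_fair(n))
```

Under these assumptions each additional head doubles the odds of "biased" relative to "fair", so even a very small prior for bias eventually dominates. No single flip points the other way; the streak simply favors one hypothesis more than the other.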
However, regardless of exactly how we describe the situation here, this does have important consequences for the evaluation of multiple instances of testing the same thing. In the case of the police line-up discussed in the article, we know from experience that people are far from infallible in their identification. So if you have a large number of people who identify the same person, without any exception, that is strong evidence that the process is biased; perhaps the police encouraged the people to identify a particular person, for example. Likewise, eyewitness testimony tends not to be perfectly accurate. Consequently, if we take the testimony of many eyewitnesses to the same complex events, and there is not the slightest discrepancy, this is good evidence that their testimony is biased. Perhaps someone created a story and instructed them not to deviate from it, for example. And on the other hand, minor discrepancies in such accounts do not weaken their testimony, but strengthen it (although not to the discrepant point itself), by making it less likely that they are biased in such a way.
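The same reasoning can be sketched numerically for the line-up case. This is a toy model, not the one used in the paper, and every number in it is a hypothetical assumption: a 50% prior that the suspect is guilty, a 1% chance that the line-up is biased, a fair-line-up witness who picks the suspect 60% of the time if he is guilty and 10% of the time if he is innocent, and a biased line-up in which each witness picks the suspect 90% of the time regardless of guilt.

```python
def posterior_guilty(n_witnesses, n_identify,
                     prior_guilty=0.5, p_biased=0.01,
                     hit_if_guilty=0.6, hit_if_innocent=0.1,
                     hit_if_biased=0.9):
    """P(guilty | n_identify of n_witnesses pick the suspect).

    All default values are hypothetical, chosen only for illustration.
    The binomial coefficient is omitted because it cancels in the ratio.
    """
    misses = n_witnesses - n_identify

    def likelihood(hit_rate):
        # Probability of this pattern of picks if each witness
        # independently picks the suspect with probability hit_rate.
        return hit_rate ** n_identify * (1 - hit_rate) ** misses

    # In a biased line-up the outcome does not depend on guilt.
    biased = p_biased * likelihood(hit_if_biased)
    guilty = (1 - p_biased) * likelihood(hit_if_guilty) + biased
    innocent = (1 - p_biased) * likelihood(hit_if_innocent) + biased
    return (prior_guilty * guilty
            / (prior_guilty * guilty + (1 - prior_guilty) * innocent))


if __name__ == "__main__":
    for n in (1, 3, 5, 13, 30):
        print(n, "unanimous:", round(posterior_guilty(n, n), 3))
    print("12 of 13:", round(posterior_guilty(13, 12), 3))
```

With these assumed numbers, confidence in guilt rises over the first few unanimous identifications and then falls back toward the prior, because a long unanimous run is better explained by a biased process than by reliable witnesses. A single dissent among thirteen witnesses leaves the posterior higher than unanimity does, which matches the point about minor discrepancies.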