A couple of years ago AVclub.com published an article about the film 12 Angry Men. It argued that juror no. 8, who convinces the other, initially pro-guilt, jurors to vote not-guilty in a murder case, persuaded them by fallacious logic and ensured they came to the wrong verdict:
Rose [Reginald, the screenwriter], an expert at dramatic construction, has his hero, Juror No. 8 (Fonda in the movie), undermine each of these pieces of evidence individually, assisted along the way by those who’ve defected to the Not Guilty camp...
None of this ultimately matters, however, because determining whether a defendant should be convicted or acquitted isn’t—or at least shouldn’t be—a matter of examining each piece of evidence in a vacuum. “Well, there’s some bit of doubt attached to all of them, so I guess that adds up to reasonable doubt.” No. What ensures The Kid’s guilt for practical purposes, though neither the prosecutor nor any of the jurors ever mentions it (and Rose apparently never considered it), is the sheer improbability that all the evidence is erroneous. You’d have to be the jurisprudential inverse of a national lottery winner to face so many apparently damning coincidences and misidentifications. Or you’d have to be framed... But there’s no reason offered in 12 Angry Men for why, say, the police would be planting switchblades.
We know what the logic is for combining separate items of probabilistic evidence into an overall estimate of the probability of guilt: Bayes' Theorem. It's explained here and here. I've used it a couple of times, to analyse the Lockerbie and Pistorius cases. In the words of this article defending the use of Bayes' Theorem in court:
Bayes theorem is a basic rule, akin to any other proven maths theorem, for updating the probability of a hypothesis given evidence. Probabilities are either combined by this rule, or they are combined wrongly.
We have a theory to test: the defendant is guilty. Call this theory h for 'hypothesis'.
What we do, for each item of evidence, is to estimate the probability of the evidence being the outcome of events if h is true.
Then we estimate the probability of the same item of evidence being the outcome of events if h is not true (i.e. if ~h is true).
We put the probability of each piece of evidence on h and ~h into a ratio.
We multiply the ratios together into a single overall ratio of guilty to not-guilty.
Voilà, that ratio gives us the odds of guilt against innocence, and from the odds we get the estimated probability of the defendant being guilty.
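The steps above can be written compactly as the odds form of Bayes' Theorem. What follows is a sketch rather than anything taken from the sources: the labels e_1, ..., e_n for the items of evidence and O for the overall odds are introduced here for convenience, and multiplying the ratios assumes that the items are independent of one another given h (and given ~h), which the text does not state.

```latex
O \;=\; \frac{P(h \mid e_1, \dots, e_n)}{P(\lnot h \mid e_1, \dots, e_n)}
  \;=\; \frac{P(h)}{P(\lnot h)} \times \prod_{i=1}^{n} \frac{P(e_i \mid h)}{P(e_i \mid \lnot h)},
\qquad
P(h \mid e_1, \dots, e_n) \;=\; \frac{O}{1 + O}
```

The first factor is the prior odds, which equal 1 when both theories start out equally likely, so in that case O is simply the product of the individual ratios.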
(For present purposes, I will ignore 'prior probability'. This is the probability we estimate of a theory being true, even before considering the specific, detailed evidence, based just on what sort of thing it proposes. Prior probability is the reason why 'extraordinary claims require extraordinary evidence': the less inherently probable a claim or theory, the better specific evidence is required to overcome this inherent improbability. The existence of magic, for example, would require extremely strong evidence to overcome the initial improbability of phenomena existing that violate known laws of physics. Since we do not know anything about the world of the film, e.g. how many murder defendants brought to trial are in fact guilty, I will assume 'indifferent' priors, i.e. a 50% chance of guilt.)
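To see why the prior matters, here is a minimal Python sketch. The function name `posterior_probability` and the numbers (a likelihood ratio of 100, priors of 50% and 0.1%) are hypothetical, chosen only for illustration and not taken from the film or the sources.

```python
def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability of h by a likelihood ratio P(e|h) / P(e|~h)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence: 100 times more expected on h than on ~h.
STRONG_EVIDENCE = 100.0

# The same evidence applied to an indifferent prior and to a very low prior.
indifferent = posterior_probability(0.5, STRONG_EVIDENCE)
extraordinary = posterior_probability(0.001, STRONG_EVIDENCE)

print(f"prior 50%:  posterior {indifferent:.1%}")
print(f"prior 0.1%: posterior {extraordinary:.1%}")
```

With an indifferent prior this evidence leaves h about 99% probable; with a prior of 1 in 1,000 the same evidence leaves it at only about 9%, which is the sense in which an inherently improbable claim needs stronger evidence.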
The evidence (taken from the screenplay)
1. The old man in the apartment below the crime-scene heard loud noises through his open window at 12:10 a.m. that he said sounded like a fight. He heard the defendant shout, 'I'm gonna kill you', then heard a body fall. He ran to look outside and saw the defendant running down the stairs and away. He called the police, who found the defendant's father knifed to death with the knife in his chest. The old man picked out the defendant's voice by hearing alone from among four others in court. He knows the defendant well. However, juror 8 proposes that the El train was roaring by as the murder took place (as per point 6), and so the old man could not have heard, or heard clearly, what was going on upstairs. He came into court in dilapidated clothes, and appeared to juror 9 to be hiding his limp out of shame; juror 9 suggested he exaggerated his testimony for the sake of having a moment in the limelight.
2. Juror 8 simulates the old man limping from his bed to the window, taking 42 seconds to do so, suggesting that if he heard the body fall while in bed and then heard the murderer running down the stairs 15 seconds later, then he could not in fact have seen the murderer out the window.
3. The coroner determined the time of death as around 12 a.m.
4. The defendant claimed to have been at the cinema at 12 a.m., yet failed to remember what films he saw.
5. There is no witness to the defendant entering or exiting the cinema.
6. The woman across the street looked through the window onto the crime scene; she said she saw the defendant stab his father to death sixty feet away just as she looked out. However, she only saw the vital moments through the windows of a passing, darkened El train. Famously, juror 8 recalls that she had indentations on her nose due to habitually wearing glasses; he argues that if she saw the murder just as she looked out the window while tossing and turning in bed, then she could not have been wearing her glasses.
7. Witnesses heard the defendant and his father arguing at 8 p.m. They heard the father hit the boy twice, and saw the boy walk out of the building in an angry mood.
8. The defendant was regularly beaten by his father while growing up.
9. The defendant has several violent crimes on his record (is the jury allowed to know this?).
10. The murder weapon was a distinctive kind of knife known to be owned by the defendant. He bought it shortly after leaving the house, witnessed by the shopkeeper. Witnesses saw it in his possession at 9:45 p.m. The defendant arrived home at 10 p.m. He claims his knife slipped through a hole in his pocket between then and returning from the cinema at 3:15 a.m. The 8th juror shows the others that he has procured exactly the same sort of knife for himself from a shop near the crime-scene, showing that it is not unique or unavailable (surely grounds for a mistrial, as the jury is considering evidence not presented in court?).
11. Juror 5 says that people handy with switch-blades, like the defendant, would stab with an underhand grip, but the victim was stabbed overhand, to judge from the coroner's assessment of the wound.
12. The father was a tough man and compulsive gambler, known for a propensity to get into bar-fights, particularly over women.
13. The defendant returned home at 3:15 a.m., where he was arrested by police.
My estimates of probabilities are just that: my subjective ideas about what is likely or unlikely. This is hardly scientific, since we do not have the objective data for that. It is fair to criticise Bayesian reasoning for the uncertainty and subjectivity of the estimates used, as long as the critic understands that we cannot escape these problems simply by forswearing the use of numbers and going back to vague words. If subjectivity causes a problem for Bayesian reasoning, then it will cause the same problem for reasoning from evidence in general, including the sort of reasoning that jurors have to perform. On the positive side, using Bayes' Theorem will at least ensure that, whatever estimates are made regarding the separate pieces of evidence, they are logically combined into a view of the evidence as a whole. It should be noted, then, that conclusions derived by Bayesian reasoning are not better than verbal reasoning by virtue of a more scientific appreciation of the premises, but may be more logical in drawing conclusions from those premises, valid or invalid as they may be. Perhaps a juror's decision is not so tricky, since they can use the inherent uncertainty to justify the default not-guilty vote whenever guilt is not rigorously proven. I discuss why we should use Bayes' Theorem to do History here.
For each piece of evidence, 1-13, I will assign a probability that it would be the outcome of events on the theory of the defendant being guilty (h) or not-guilty (~h), then express the two probabilities as a ratio. Then when we multiply the ratios together, we will have a conditional probability of guilt. (N.B. I'm unsure about points 7 and 8 below; I'd like someone experienced with Bayesian statistics to tell me if I've got them right or wrong.)
1. I will allow that the old man could not have clearly heard events on the floor above due to the noise of the train. It would therefore seem that he did indeed invent or exaggerate his evidence, a finding corroborated by his inability to get to the window quickly enough to see the murderer flee. I will therefore assign equal probabilities to the old man's evidence on either theory; in other words, his evidence is worthless.
2. See (1).
3. Time of death needs to correspond with the defendant being at home.
4. When interrogated, the defendant cannot remember the films he says he saw. Allowing for a possible defect of memory or attention, I will allow a 50% probability of this happening even if he really saw them. The failure is 100% expected if he were the murderer, and thus not at the cinema at all. h: 100%, ~h: 50%, 2:1 for h.
5. No alibi witness for the cinema. Certain if he were the murderer, but possible if he were there but simply forgotten or not noticed. h: 100%, ~h: 50%, 2:1 for h.
6. The woman who saw the murder may need glasses due to being long-sighted rather than short-sighted. Or she may habitually wear sunglasses. I'll allow a generous 80% chance that she could not see what happened clearly but testified to it anyway. h: 100%, ~h: 80%, 5:4 for h.
7. There was an earlier argument that angered the boy, in which his father hit him twice, if we can trust the witnesses' hearing. This supplies a possible motive, which makes him more likely to be the murderer than someone who had not argued. At first I reasoned that, given how many arguments take place, even when somebody is hit, that do not lead to revenge murders, this argument having happened does not greatly increase the chance of the defendant being the murderer in absolute terms, and put it at h: 5%, ~h: 4%, 5:4 for h. The right statistical thought-process here, however, is not to ask how likely it is that an argument is followed by a murder, but rather how likely it is that a murder is preceded by an argument. So, on the assumption of h, how predictable was it that the defendant would turn out to have argued with and been hit by the victim, or in some other way been given cause for violence? Highly likely: let's say 90%. Whereas, on the assumption of ~h, how likely was the defendant to have argued with and been hit by his father, while not being the murderer? This depends on how regularly such an event happens. Given that (8) tells us such paternal violence was a regular occurrence, let's guess at it happening once a week, giving a probability of 14% on any given day. So: h: 90%, ~h: 14%, 6.4:1 for h.
8. Similar issue to (7). At first I allowed more significance to regular beating as a probable background factor than to a one-off argument and a couple of hits, and said: h: 10%, ~h: 8%, 5:4 for h. Again, though, the question should be: how likely was it that the murder would turn out to be preceded by a history of regular beatings of the defendant, if the defendant was the murderer? Probably quite high, since violence begets violence, and murderers are more likely than the average, I suppose, to have been subjected to violence. But probably not as high as the probability of a recent bout of violence having occasioned a murder. Let's say 50%. And how likely was it that the defendant would turn out to have been regularly beaten, if he were not the murderer? That would be the general rate at which non-murdering young men are subject to childhoods full of beatings. Let's say, for 1957, 1 in 10, so 10%. Thus: h: 50%, ~h: 10%, 5:1 for h.
9. The jury should not take his past record into consideration.
10. What is the chance of the defendant buying a replica of the future murder weapon shortly before the murder, then losing it, while somebody else commits the murder with such a weapon? This would be very unlucky! This is just the sort of excuse the defendant would have to make up if he were the murderer. Let's say: h: 95%, ~h: 5%, 19:1 for h.
11. I'll allow this assessment of the evidence: it was improbable the defendant would stab with this technique. Let's say: h: 10%, ~h: 50%, 1:5 against h.
12. Other people may have had a motive to kill the father. What are the chances of this being the case if the defendant were guilty? Obviously higher than for most people, given the father's behaviour. What are the chances if the defendant were innocent? A little bit higher still, since, on the hypothesis that the defendant was not guilty, somebody else did in fact commit the murder. On the other hand, there are motiveless murders. The fact that there were alternative potential murderers does not help the defendant much unless it was more likely that one of them would be the murderer, i.e. unless a plausible alternative culprit and series of events could be suggested by the defence. But it helps a little to have unspecified alternatives. h: 40%, ~h: 50%, 4:5 against h.
13. The defendant returning home looks good for his innocence, as he might expect to be arrested if he were not in fact out at the cinema and thus ignorant of what had occurred. Or he might have returned to retrieve the murder weapon. I would say the former argument is stronger: h: 10%, ~h: 100%, 1:10 against h.
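The percentage pairs in the list above can be turned into ratios mechanically. The following Python sketch does so; the names `ESTIMATES` and `likelihood_ratio` are made up for this illustration. For items 7 and 8 it uses the later of the two estimates given (the ones that appear in the multiplication that follows), and it leaves out items 1, 2, 3 and 9, for which no figures are given.

```python
# (P(evidence | h), P(evidence | ~h)) in percent, keyed by item number.
# Figures are copied from the list above; items without figures are omitted.
ESTIMATES = {
    4: (100, 50),    # cannot recall the films
    5: (100, 50),    # no alibi witness
    6: (100, 80),    # eyewitness across the street
    7: (90, 14),     # earlier argument (later estimate)
    8: (50, 10),     # history of beatings (later estimate)
    10: (95, 5),     # the knife
    11: (10, 50),    # overhand stab
    12: (40, 50),    # other people with motives
    13: (10, 100),   # returned home
}


def likelihood_ratio(p_given_h, p_given_not_h):
    """How many times more expected the evidence is on h than on ~h."""
    return p_given_h / p_given_not_h


ratios = {item: likelihood_ratio(*pair) for item, pair in ESTIMATES.items()}

for item, ratio in ratios.items():
    side = "for h" if ratio > 1 else "against h"
    print(f"item {item:>2}: {ratio:.2f} ({side})")
```

Note that 90/14 comes out at about 6.43, which the text rounds to 6.4:1.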
Now we multiply the ratios for each piece of significant evidence together:
(2 × 2 × 5 × 6.4 × 5 × 19 × 1 × 4 × 1) / (1 × 1 × 4 × 1 × 1 × 1 × 5 × 5 × 10) = 48,640 / 1,000 = 48.64 : 1.
Thus I estimate the defendant was over 48 times more likely to be guilty than innocent.
Expressed as a percentage (48.64 / 49.64, given the 50% prior), I rate him as 98% likely to be guilty.
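The arithmetic can be checked with a few lines of Python. This is only a sketch: the names `RATIOS`, `posterior_odds` and `odds_to_probability` are invented for the purpose, the ratios are copied from the multiplication above, and the calculation assumes indifferent priors and that the items of evidence are independent of one another.

```python
from math import prod

# (weight for h, weight for ~h) for each item that enters the multiplication.
# Items 1, 2, 3 and 9 do not appear in it, which amounts to a ratio of 1:1.
RATIOS = {
    4: (2, 1),
    5: (2, 1),
    6: (5, 4),
    7: (6.4, 1),
    8: (5, 1),
    10: (19, 1),
    11: (1, 5),
    12: (4, 5),
    13: (1, 10),
}


def posterior_odds(ratios, prior_odds=1.0):
    """Multiply the prior odds by every ratio (assumes independent items)."""
    return prior_odds * prod(for_h / for_not_h for for_h, for_not_h in ratios.values())


def odds_to_probability(odds):
    """Convert odds of o:1 in favour of h into a probability of h."""
    return odds / (1.0 + odds)


odds = posterior_odds(RATIOS)
probability = odds_to_probability(odds)

print(f"odds of guilt: {odds:.2f} : 1")
print(f"probability of guilt: {probability:.1%}")
```

Run as written, this reproduces the figures above: odds of 48.64 : 1 and a probability of guilt of about 98%. Passing a different `prior_odds` shows how far the conclusion depends on the 50% starting assumption.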
So even once you throw out the old man's evidence as false witness, there really is a case beyond reasonable doubt.
So the 12 Angry Men were probably wrong to let the defendant go free!