ABSTRACT
The inability of judges to understand mathematics has led them to abdicate their responsibility to make important policy choices. This article explores the surprisingly wide-ranging effects of the judicial failure to understand the meaning of numbers. For example, largely unaware of both the meaning and effect of statistical analysis, courts have permitted prosecutors to highlight the race of criminal defendants when race is irrelevant to the case. Judges have also allowed prejudice to reduce tort awards for African Americans and women. In cases involving securities fraud, courts have unwittingly created a system that exaggerates the harm of the risk of error to a proven wrongdoer and denigrates the harm to an innocent investor. In these cases, as well as others, identifying the policies inherent in mathematical decisions is the first step. Once that step is taken, however, implementing corrective solutions can be relatively simple.
Significant Statistics:
The Unwitting Policy Making of Mathematically Ignorant Judges
I. Introduction
What could cause a judge to permit prosecutors to highlight the race of criminal defendants when there is no indication that race is relevant to the case? Why would a court require a jury to consider a suspect as likely to be guilty as innocent, even when there is no evidence other than an accusation? And what could possibly induce a judge to permit African Americans and women to receive lower damages than identically situated white men on the unspoken expectation that racism and sexism will continue for the foreseeable future? These, as well as other, perversions of justice are the direct, though unthinking, result of judicial mathematical illiteracy.
The fact that many judges suffer from an estrangement from, resistance to, and incapacity in mathematics should not be surprising. This condition, after all, afflicts most lawyers, as it does most Americans. One need not conduct a study to know that law students are typically smart people who do not like math. Law professors are of little help to their students, since legal academics tend not to have a background in, or use, statistical analysis, and are often unfamiliar with empirical data collection. Indeed, it is clear that there is a prevalent (and disgraceful) math-block that afflicts the legal profession.
Mathematical illiteracy is especially worrisome, as the analysis of numbers has become such an important component of the legal system. In particular, the use of statistical testimony at trial has increased dramatically. Statistical evidence is now an essential element of cases spanning the legal universe. Statistics are regularly used to prove or disprove issues as diverse as causation of injuries in toxic torts cases, breach of contracts, discrimination in employment and voting, DNA identification in criminal and family law, trademark and patent violations, environmental harm, securities fraud, and loss of future earnings.
The Supreme Court has declared that in all such cases, federal courts have a general gatekeeping obligation. Beginning with its 1993 decision in Daubert v. Merrell Dow Pharmaceuticals, Inc., the Court has declared that the Federal Rules of Evidence assign to the trial judge the task of ensuring that an expert's testimony both rests on a reliable foundation and is relevant to the task at hand. Bluntly put, the Court has told judges that they must ascertain the difference between good science and junk science.
This evaluation appears to be a substantially more subtle task than that formerly performed by federal judges. Under the previous regime, pursuant to the so-called Frye test, courts would admit statistical and other technical evidence if it could be shown that such evidence was derived from a well-recognized scientific principle or discovery which had gained general acceptance in the particular field in which it belongs.
The Supreme Court has explained that the Daubert test is less rigid than Frye. The Court stated that, without the exclusive emphasis on general acceptance, federal judges would be able to admit a somewhat broader range of scientific testimony than would have been admissible under Frye. Nonetheless, the Court has stressed that trial judges must still screen evidence to determine its scientific validity. According to the Court, this inquiry is a flexible one. The Court also proposed several factors that might bear on this examination. These factors include:
Whether a theory or technique . . . can be (and has been) tested;
Whether it has been subjected to peer review and publication;
Whether, in respect to a particular technique, there is a high known or potential rate of error and there are standards controlling the technique's operation; and
Whether the theory or technique enjoys general acceptance within a relevant scientific community.
Daubert's guidance has not provided much comfort for some judges, who have complained that they have been placed in the uncomfortable position of evaluating the quality of scientific and statistical evidence far beyond their own fields of expertise:
[T]hough we are largely untrained in science and certainly no match for any of the witnesses whose testimony we are reviewing, it is our responsibility to determine whether those experts' proposed testimony amounts to scientific knowledge, constitutes good science, and was derived by the scientific method.
Our responsibility, then, unless we badly misread the Supreme Court's opinion, is to resolve disputes among respected, well-credentialed scientists about matters squarely within their expertise, in areas where there is no scientific consensus as to what is and what is not good science, and occasionally to reject such expert testimony because it was not derived by the scientific method. Mindful of our position in the hierarchy of the federal judiciary, we take a deep breath and proceed with this heady task.
Such admirable humility might not be necessary were judges truly to limit themselves to keeping out blatant junk science, such as the testimony of a phrenologist who would purport to prove a defendant's future dangerousness based on the contours of the defendant's skull. Similarly easy evidence to exclude would be, as the Court explained, theories grounded in any so-called generally accepted principles of astrology or necromancy.
Even obviously bad mathematics should be readily observable to the average judge. Consider the case of Estate of Boncher v. Brown County. At a trial alleging that a jail had been deliberately indifferent to the risk of prisoner suicide, a criminologist testified that the risk was particularly acute since there had been five suicides at the jail during the previous five years. But how big a risk does the number five convey? As the court noted, the simple number five does not disclose whether the risk was acute unless we also know the size of the prison population, the rate of suicides in other prisons, and the rate of suicides in the general geographic area from which the jail draws its inmates. After all, five suicides in a small prison population obviously demonstrate a far greater level of risk than were that same number to occur in a very large population. The criminologist's numeric evidence was indeed useless and properly excluded under Daubert.
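The court's point is ordinary arithmetic: a raw count becomes meaningful only once it is divided by the population at risk. A toy comparison in Python makes this concrete (the jail populations below are invented purely for illustration and are not drawn from the case):

```python
# Hypothetical figures, for illustration only -- not from Boncher.
suicides = 5
small_jail_inmates = 200       # cumulative inmates over the five years
large_jail_inmates = 20_000

# Convert the raw count into rates per 100,000 inmates.
rate_small = suicides * 100_000 / small_jail_inmates
rate_large = suicides * 100_000 / large_jail_inmates

print(rate_small)   # 2500.0 per 100,000 -- alarmingly high
print(rate_large)   # 25.0 per 100,000 -- unremarkable
```

The identical count of five thus supports opposite conclusions depending on the denominator, which is exactly the comparison the criminologist failed to supply.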
On more subtle points of mathematics or science, it would be appropriate for judges to turn to experts to help identify whether proposed evidence is good science. As the editors of the New England Journal of Medicine suggested, Judges should be strongly encouraged to make greater use of their inherent authority . . . to appoint experts.
But non-legal experts have been permitted to expand their role to the point where they are now making normative policy decisions. Sometimes this has happened when courts have confused scientific validity with logical relevance; the science and mathematics may be undisputed, but the question really involves the logical and legal connection of the numbers to the legal point being proven. At other times courts have been unable to distinguish the methods of the scientist and statistician from the values of their disciplines. While those who use these methods must respect the methodology of the disciplines from which they arise, we must also recognize that the values of those disciplines often differ markedly from those of the legal system. As the Carnegie Commission report, Science and Technology in Judicial Decision Making, observed: In the courts, scientific knowledge must inform the choice, but abdication to the scientist is incompatible with the judge's responsibility to decide the law. In other words, while striving to avoid accepting junk science into evidence, too many judges have permitted statisticians and others to allow junk law into the courts.
Many have assumed that the problem is that judges lacked the scientific literacy to evaluate evidence properly. Thus, there are complaints that those of a scientific bent certainly can take issue with whether the judges and lawyers have the education or training to engage in scientific testing. The critical judicial deficit, however, is not with science, but with mathematics. Too many judges do not speak math and do not understand what numbers communicate. They also fail to see that the meaning to be given to mathematical results is frequently not a matter of scientific necessity, but a reflection of specific value judgments. By ignoring those judgments that are inherent in the mathematical choices, judges have acquiesced to values that are at odds with our system of justice.
This article will explore several areas in which judges, hampered by their mathematical ignorance, have permitted numerical analysis to subvert the goals of our legal system. In Part II, I will examine the perversion of the presumption of innocence in paternity cases, where courts make the counter-factual assumption that regardless of the evidence, prior to DNA testing, a suspect has a 50/50 chance of being the father. In Part III, I will explore the unnecessary injection of race into trials involving the statistics of DNA matching, even when race is entirely irrelevant to the particular case. Next, in Part IV, I will discuss how courts use race- and gender-based statistics to reduce damages in tort cases for women and racial minorities, and silently assert that past racism and sexism will continue. In the final section, I will examine how judges have improperly allocated the risk of error in cases such as securities fraud, so as to reward those who have attempted to manipulate stock prices illegally.
II. Snake Eyes and the Power of Numbers
To understand both the uses and abuses of statistical evidence, I will present a simple story, a murder mystery, called Snake Eyes. I will then demonstrate the limits of pure statistical analysis and the way that courts have permitted statisticians and scientists to warp the legal process and pervert traditional legal values.
SNAKE EYES
Victor is an elderly millionaire who has decided to bring his family together to give away his possessions, which range in value from one extremely rare antique to several mundane items. He invites his 11 closest relatives to his house. He seats them at a long table and tells them that they will be rolling dice to determine the order in which they will select which of his possessions each will receive, with the highest roll choosing first. Victor opens a fresh set of dice from the Trustworthy Dice Company and hands them to his guests. Each guest simultaneously places two dice in a dice cup, tosses the dice, looks at the result, and covers the dice with the cup.
Sitting at one end of the table are Al and Dennis. Al, immediately after looking at his dice, runs out of the house. All the other guests race to the window to watch Al get into his car and drive away. Suddenly, they hear a loud sound and turn around to see Victor on the floor, bludgeoned to death with a candlestick. By Victor's body is a note saying: I killed him because I rolled snake eyes.
The guests rush back to their seats and find that Victor's fall had knocked all of the dice cups onto the floor, scattering all of the dice except for Al's and Dennis's. Both dice cups are lifted, revealing that each had rolled two ones. Dennis concedes that indeed he had rolled snake eyes, but he denies having killed Victor.
Assuming the note was accurate, the dice were fair, and that there is no other evidence, what can statistics tell us about the identity of the murderer? Can it tell us how likely it is that Dennis is the murderer?
We can calculate easily the probability of Dennis rolling the murderer's two ones if he were innocent. The probability that Dennis matched the murderer's roll by pure random chance is 1/36, which is about 2.8%. What that tells us is that Dennis's roll matched that of the murderer and the probability of a random match is 2.8%. Another way to think about this is that if Dennis were not the murderer, the probability of seeing this match is 2.8%.
But that does not tell us what we want to know, which is the probability that Dennis is the murderer. It is incorrect to say that:
(1) Because the probability of Dennis matching the murderer's snake eyes as a matter of random chance was only 2.8%, then
(2) The probability of Dennis not being the murderer, given that match, is the same small 2.8%.
This error in sentence (2) is called the prosecutor's fallacy, because it incorrectly reverses events in a conditional probability to create a direct statement about the defendant's probability of guilt which is not implied by the evidence. In logical reasoning, such an error is called transposing the conditional. It is the same mistake as saying: Because lawyers tend to be literate people, literate people tend to be lawyers.
To understand what other information is needed to calculate the probability of guilt, we have to keep in mind that the likelihood of Dennis's guilt depends in large measure on whether other people could have committed the crime. If no one else in the room rolled snake eyes, then there is no other suspect, and we can be 100% certain that Dennis is guilty. If one other person in the room also rolled snake eyes, however, the probability that Dennis is the murderer is 50%, since it would be equally likely for either to be guilty. The key question, then, is not how unlikely it was for Dennis to have rolled the tell-tale snake eyes, but how many other people in the room also did.
Mathematically, that means we need to account for both the probability that other people at the gathering rolled snake eyes and the probability that Dennis was indeed the murderer despite the existence of others who matched the evidence. In this story, there were 10 people at the party who could have committed the crime, since Al was not in the house when the murder happened. In such a case, the probability of Dennis's guilt is 88.4%. But if there were fewer suspects, say only 5, the likelihood of finding other suspects rolling snake eyes goes down, and the probability of Dennis's guilt would increase to 94.6%. On the other hand, if there were many more members of the family, if, say, 100 other relatives were in the room, there would then be many more possible suspects, and the probability of Dennis's guilt would drop to 33.8%. The probability of Dennis having rolled snake eyes has not changed with each scenario, but the probability that he is the murderer, the only issue we care about, has varied greatly depending on how many other possible suspects there are.
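These figures can be reproduced with a short calculation. The sketch below (mine, not the article's) assumes each person in the room was equally likely a priori to be the murderer, and averages over the number of other suspects who would also roll snake eyes by chance:

```python
from math import comb

def p_guilty(other_suspects, p_match=1/36):
    """Probability that the matching suspect (Dennis) is the murderer,
    averaging over k, the number of the other suspects who would also
    roll snake eyes by chance (k is binomially distributed)."""
    n, p = other_suspects, p_match
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k) / (k + 1)
        for k in range(n + 1)
    )

print(f"{p_guilty(9):.1%}")   # 10 suspects in the room -> 88.4%
print(f"{p_guilty(4):.1%}")   # only 5 suspects -> 94.6%
```

With many more suspects the same function falls toward the article's one-in-three figure, even though Dennis's 2.8% chance of rolling snake eyes never changes.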
It was not difficult to calculate the extent of this variation for these examples, because the number of other possible suspects was fixed and known. One problem with applying this approach in the real world is that we usually do not know how many other possible suspects there are. If a murder occurs on a street in West Baltimore, the number of possible suspects would turn, in part, on whether the universe of suspects includes only those who live in the neighborhood, residents of the city, or all who might conceivably have visited Baltimore that day. Each choice will lead to a different probability that a particular suspect is guilty. Accordingly, how we answer that unanswerable question will determine the result of any statistical analysis.
This dilemma has been called the problem of reference classes. Resolving the ultimate question of the probability of a particular suspect's guilt requires us to choose the particular population group with which to compare our evidence. Unfortunately, there are an infinite number of reference classes, the boundary conditions of which can be gerrymandered in countless ways [and] nothing in the natural world privileges or picks out one of the classes as the right one. To do the math, we must choose the appropriate reference class, but we are then making a subjective judgment not mandated by objective analysis.
We face, therefore, one of the sad realities of statistics. The easiest number to calculate, the probability that the defendant matches particular evidence, does not give us the information we really want. Even if it were very unlikely that the defendant matched particularly damning evidence, the numbers would not tell us how likely it was that the evidence came from the defendant.
We encounter this same problem when trying to determine paternity using DNA. The facts of N.J. v. Spann will illustrate the issue. The defendant, a corrections officer at the county jail, was charged with sexually assaulting an inmate. To prove that the defendant had had sex with the victim, the prosecution wanted to prove that he was the father of the victim's child.
Blood tests were entered into evidence. They showed that the child had phenotypes A2 A28 B45 B53. This corresponds to two genes, located next to one another on the same chromosome. Of these four phenotypes, a child receives two sets of A and B genes, one from the mother and one from the father. The mother had HLA types A28 A30 B53 B61, which meant that the child's A2, B45 set had to come from the father. Thus, if a person did not have the A2, B45 genes, that person could not be the father. [Analogously, in the story Snake Eyes, if one guest, Iris, had rolled two fives, we would know that she was definitely not the murderer.] Accordingly, a blood test can eliminate a suspect as a possible father. As the Supreme Court has noted, It is a negative rather than an affirmative test with the potential to scientifically exclude the paternity of a falsely accused putative father.
In Spann, however, the defendant's phenotype was A2 A28 B35 B45. Because he had the A2, B45 genes, he could not be excluded as the possible source of the child's genes.
Just as we could calculate, in Snake Eyes, the probability of Dennis matching the murderer's snake eyes as a matter of random chance, it is possible to calculate the probability of the defendant in Spann matching the child's genes as a matter of random chance. Genetic tables showed that only 1% of the relevant male population had the requisite blood and tissue type. But, as in Snake Eyes, the fact that there is a small probability that the defendant would match the child's genes as a matter of random chance does not establish a similarly small probability that the defendant was the actual source of the child's genes.
What that 1% figure does tell us is what is known as the probability of exclusion: 99% of the relevant male population can be excluded from suspicion. That still leaves 1%. And in a male population of 100,000, that would mean 1,000 people had not been excluded. The probability of exclusion also cannot tell us who among those 1,000 is most likely to be the father.
A statistic that is related to the probability of exclusion, which can also be derived from blood tests, is called the paternity index [the PI]. The PI compares the probability that the genetic makeup of the child could result from the mating of the mother and a particular suspect with the probability that it could result from the mating of the mother with some person randomly selected from the general population. The less frequently a particular genetic marker appears in the general population, the lower the probability that a person chosen at random could be the father, and hence the greater the PI.
Note that the PI, like the probability of exclusion, still does not tell us how likely it is that a particular suspect is the father. Consider the following example, drawn from a classic article by Prof. David Kaye.
Assume that the probability a defendant in a paternity suit would transmit the particular genes in question is 0.12 and that the probability for a randomly selected man is 0.0062. The PI for the defendant would be obtained by dividing his probability by that of the randomly selected man, which would equal 19.4. A PI of 19.4 means that someone with the defendant's exact genetic makeup would produce (with the mother) a child with the requisite phenotype more than 19 times as frequently as would a randomly selected man.
But we still do not know the probability that the defendant is the father. If there were a relevant population of 100,000 men, we might expect that approximately 620 men other than the defendant would also be capable of transmitting the genes that created the child in question. Based on the paternity index, that would mean that the probability that the defendant was the father was only a minuscule 0.019%.
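Kaye's numbers are easy to reproduce. The sketch below is my own illustration; it assumes, for the final step, that every one of the 100,000 men starts out equally likely to be the father, which is where both the 620 and the 0.019% come from:

```python
p_defendant = 0.12    # probability the defendant would transmit the genes
p_random = 0.0062     # probability a randomly selected man would

paternity_index = p_defendant / p_random
print(round(paternity_index, 1))      # 19.4

population = 100_000
others = p_random * population
print(round(others))                  # about 620 other men could also match

# Bayes' theorem in odds form, with an assumed uniform 1-in-100,000 prior.
prior_odds = 1 / (population - 1)
posterior_odds = paternity_index * prior_odds
posterior = posterior_odds / (1 + posterior_odds)
print(f"{posterior:.3%}")             # about 0.019%
```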
In the real world, however, we cannot assume that just because 620 people have the same matching genetic profile as the defendant, everyone is equally likely to be guilty. This has been termed the defendant's fallacy; it uses the numbers to make guilt appear unlikely while ignoring all of the other evidence -- such as how many people actually knew the mother, how many of those ever had the opportunity to have sexual relations with her at a time when she was able to conceive, and how many men were infertile -- all of which would exclude some people and render others more plausible.
There is a mathematical solution to this problem, but it is one in which the legal system's mathematical ignorance has led to a very disturbing trend. It begins with a mathematical formula, known as Bayes' theorem. On its most basic level, the formula is nothing more than a mathematical way of representing how we incorporate new information into our reasoning: When an observer receives new evidence relevant to the truth of the proposition at issue, she adjusts her probability assessment to take that evidence into account.
For example, a new restaurant opens and I read an excellent review. I would think that it is likely that this is a good restaurant. Then assume that a friend whose tastes I trust tells me that she ate at the restaurant and the food was terrible. Obviously my assessment of the probability that the restaurant was good will change with this new information. The degree to which my assessment changes will depend on how much I value the opinion of both the restaurant reviewer and my friend.
That reality is captured by Bayes' theorem. To utilize this theorem, we need to know that the Bayesian analysis of evidence relies on six concepts: hypothesis, information, prior probability, likelihood, likelihood ratio, and posterior probability.
An hypothesis could be thought of as a theory of the case; it addresses the question, what happened? There will always be a main hypothesis, the theory being considered. In paternity litigation, the main hypothesis is that the suspect was the father.
Information refers to everything we know. Our aim is to determine the probability that our main hypothesis is true after taking into account new information. In the paternity case, the new information is the DNA match.
Before the new information was obtained, an hypothesis had a prior probability of being true. Prior probability refers to the probability that a particular hypothesis was true based on everything we knew before the arrival of the new information.
Likelihood refers to the probability of having obtained the new information under the assumption that a particular hypothesis was true. For example, assuming the suspect were the father [our main hypothesis], what is the probability he would match the DNA [our new evidence]? Since we know that the father definitely matched the DNA, that probability is 1. Thus we would say the likelihood is 1.
To calculate the likelihood ratio, one must divide the likelihood by the probability that the event would occur by random chance. The likelihood ratio shows the effect the new evidence has on our hypothesis. As Richard Friedman noted: A likelihood ratio greater than 1 means that the proposition appears more probable in light of the new evidence; a likelihood ratio less than 1 means that the new evidence makes the proposition appear less probable; and a likelihood ratio of precisely 1 means that the new evidence leaves the probability unchanged.
Finally we have our goal, the posterior probability, which is the probability of our main hypothesis being true after we have obtained the new information. (Note that this is the transposition of the likelihood.) The posterior probability in a paternity suit is the probability that the suspect is the father after we know of the DNA match.
There are many different ways to express Bayes' theorem. For our purposes, we can use the odds form:

Posterior odds of hypothesis = Prior odds of hypothesis x Likelihood ratio

Since the paternity index, PI, tells us the same information as the likelihood ratio, we can write Bayes' formula for paternity litigation as:

Posterior odds suspect is the father = Prior odds suspect is the father x Paternity Index

In other words, the odds that the defendant is the father equal the paternity index multiplied by the prior odds of the defendant being the father; converting those posterior odds back into a probability yields what is generally known as the probability of paternity. Since we have already seen that the paternity index can be calculated readily, the only other issue is how to calculate the prior probability.
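In code, the conversion from prior to posterior is only a few lines. The sketch below uses the odds form of Bayes' theorem and, purely for illustration, the 19.4 paternity index from the Kaye example:

```python
def probability_of_paternity(paternity_index, prior_probability):
    """Combine a paternity index with a prior probability using the
    odds form of Bayes' theorem; return the posterior probability."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = paternity_index * prior_odds
    return posterior_odds / (1 + posterior_odds)

# With the courts' standard 50/50 prior, a PI of 19.4 yields about 95%.
print(f"{probability_of_paternity(19.4, 0.5):.1%}")    # 95.1%

# With a more skeptical 1-in-100 prior, the same PI yields about 16%.
print(f"{probability_of_paternity(19.4, 0.01):.1%}")   # 16.4%
```

The same blood-test evidence thus supports wildly different bottom lines; everything turns on the prior.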
In simplistic cases like the one described in Snake Eyes, this prior probability is not difficult to calculate. If there are ten guests, prior to knowing the results of rolling the dice, each had an equal chance of being the murderer, so that the prior probability that Dennis, or any of the others, was the murderer was 1/10 = 10%. If we already knew that Iris had rolled two 5s, she would be eliminated as a possibility, and the prior probability that Dennis was the murderer would increase to 1/9, or about 11.1%.
The real world, of course, is far messier. In the paternity context, for example, the prior probability of the suspect being the father depends both on whether he had sexual relations with the mother and on how many other men she had sexual relations with, as well as the timing of each sexual encounter. Those facts will rarely be known by the fact finder with any degree of certainty, but will be conveyed by a wide range of incomplete, uncertain, and often disputed pieces of evidence. Accordingly, any determination of prior probability will necessarily be based on the subjective judgment of the fact finder. The prior probability that the suspect is the father will be, in other words, an imprecise approximation, rather than a nice, neat number. The problem that then arises is that Bayes' theorem is no longer usable; one cannot do the calculation with only a subjective sense of what might have happened.
Without Bayes' formula, all that is left is the Paternity Index, which can tell us how much of the population to exclude, but does not answer the critical question of how likely it is that the suspect is the father. There is nothing inherently wrong with such a situation. In many trials, jurors are given information that excludes large portions of the population, and must then figure out if the defendant, who was not excluded, was the guilty party. For example, a jury may be shown photographic evidence that the robber was a 6 foot 6 inch Caucasian male. That would exclude all who do not meet that description, but even if the defendant meets that description, the prosecutor would still need to show more evidence (such as that the stolen goods were found in the defendant's possession) to obtain a conviction.
The power of numbers and the promise of an objective determination, however, have blinded many judges and legislators when it comes to paternity testing. In courts throughout the country, the probability of paternity is calculated using Bayes' theorem by taking the Paternity Index and inserting a standard prior probability of 0.5 regardless of any other factors, which indicates a fifty percent chance that the alleged father actually had sexual intercourse with the mother.
What the use of a prior probability of 0.5 means is that genetic experts present to a jury the probability of paternity as a fixed number on the unproven (and often unspoken) assumption that, prior to the genetic testing, there was a 50/50 chance that the suspect was the father. Courts have justified this use of a fictitious number on the dubious grounds that the 50/50 assumption was completely neutral. The claim that this is a neutral probability is based on the argument that a prior probability of 0.5 assumes that the defendant is just as likely not to be the father of the child as to be the father.
While 50/50 may appear fair at first glance, the 50/50 assumption of prior probability is demonstrably not a neutral assumption. To see why, let us return to Snake Eyes. If Dennis had been accused of being the murderer prior to the disclosure of his dice roll, it would have been a tremendous injustice to assume that he was equally as likely the murderer as not. There were, after all, 9 other guests with an equal chance of being guilty. Thus, a 50/50 assumption would be tantamount to saying that Dennis was as likely as everyone else combined to be the murderer. This is hardly a fair or accurate statement, as the odds were not 50/50 but actually 9 to 1 against his guilt.
One can also recognize this concept by looking at sports betting in competitions with numerous contestants. For example, when there are 65 teams in the NCAA Men's Basketball tournament, it would be ludicrous to say that it is as likely for one team to win as all the others, that the odds are 50/50. Indeed, for the 2008 tournament, even the favorite, UCLA, was given odds by bookmakers of 7-2, while long shot Coppin State was given odds of 2500-1. Similarly, at the 2008 Kentucky Derby, heavy favorite, and eventual winner, Big Brown went to the post with 5-2 odds in his race against 19 other horses.
In these sporting venues, the reason that the prior probability of a particular team or horse winning is not 50/50 is obvious. The choice is not Either A will win or A will not win. Rather, the choice is Either A will win, or B will win, or C will win, etc. Similarly, in the paternity context, if all we know is an accusation, the choice is not Either A is the father or A is not the father, but Either A is the father, or B is the father, or C is the father, etc. In order for 50/50 to represent the actual prior odds of paternity, there would have to be exactly one person other than the suspect who had sexual relations with the mother during the appropriate time span. To reject the use of the 50/50 prior probability is not to say that there are endless groups of possible fathers in every case. Rather, it is to assert the simple proposition that the automatic use of the 50/50 prior probability is inappropriate because, without knowing other evidence, it is impossible to know how many possible fathers there are.
Another problem with the seemingly neutral assumption of 50/50 prior probability is that it can easily lead to ridiculous, counter-factual results. In Snake Eyes, for example, if we used the 50/50 prior probability, Al, who was not in the house when the murder occurred, would be deemed to have the same likelihood of committing the crime as Dennis. This leads to the perverse results that a) two different people are each given a probability of more than 80% of being the only murderer; and b) someone who is definitely innocent is perceived as far more likely to be guilty than innocent.
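The absurdity can be made concrete. In Snake Eyes the likelihood ratio for a revealed roll of snake eyes is 36 (the murderer matches with certainty; anyone else matches with probability 1/36). Feeding the "neutral" 50/50 prior into Bayes' theorem for both Al and Dennis, in a sketch of my own, gives:

```python
def posterior(prior, likelihood_ratio):
    """Odds-form Bayes: prior probability + likelihood ratio -> posterior."""
    odds = (prior / (1 - prior)) * likelihood_ratio
    return odds / (1 + odds)

# Both Al and Dennis revealed snake eyes; give each the "neutral" 50/50 prior.
for suspect in ("Al", "Dennis"):
    print(suspect, f"{posterior(0.5, 36):.1%}")   # 97.3% each

# Two people cannot each be 97.3% likely to be the sole murderer,
# and Al was demonstrably outside the house when Victor was killed.
```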
Lest one think this is a fanciful case, consider the plight of Donald Cole. A North Carolina district court judge found him to be the biological father of Jonathan Cole, based on evidence showing that the probability of paternity was 95.98%. The judge found that the numerical value for the probability of paternity was more probative than the fact that the purported father had had a vasectomy before Jonathan was born, and that tests showed a sperm count of zero both before and after the birth. The finding of paternity was reversed on appeal, but the lure of an easy number has convinced all too many others to opt for counter-factual certainty.
Even while conceding that the assumption of a 50/50 prior probability will not correspond to the facts in most cases of disputed paternity, the Joint AMA-ABA Guidelines recommended use of the 50/50 assumption as "a useful working hypothesis." The Uniform Parentage Act similarly creates a rebuttable presumption of paternity when there is a probability of paternity of 99%, "using a prior probability of 0.50." Many states have specifically adopted this language of the Uniform Parentage Act, specifying use of a prior probability of 0.50. Among those states whose laws do not stipulate the 50/50 prior probability, virtually every state still allows its use in creating a probability of paternity.
Some have proposed that, rather than rely on a standard 50/50 prior probability, a more accurate assessment can be made by having jurors determine the prior probability for themselves. To assist with the calculation, the jurors would be given a range of different prior probabilities and the probability of paternity associated with each. One court stated that "[t]he expert should present calculations based on assumed prior probabilities of 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100 percent."
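Mechanically, such a table is trivial to produce. The sketch below computes the probability of paternity for each prior the court listed; the Paternity Index of 100 is an assumed value for illustration only.

```python
# Probability of paternity across a range of assumed prior probabilities.
# The Paternity Index (PI) of 100 is assumed purely for illustration.

def prob_paternity(prior, paternity_index):
    """Bayes' theorem: probability of paternity given a prior and a Paternity Index."""
    return (paternity_index * prior) / (paternity_index * prior + (1 - prior))

PI = 100
for pct in range(0, 101, 10):
    prior = pct / 100
    print(f"prior {pct:3d}%  ->  probability of paternity {prob_paternity(prior, PI):7.2%}")
```

Note that with the conventional 0.5 prior, the posterior reduces to PI/(PI+1), so a Paternity Index of at least 99 is precisely what yields the 99% figure: the statutory threshold and the fixed prior are mathematically intertwined.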
This proposal, though well-meaning, is hopelessly misguided. On the most basic level, jurors are being asked to quantify the strength of their subjective opinions. The problem is that people generally cannot condense their thoughts, feelings, and intuitions into a single, solid number. As Professor J.H. Wigmore wrote, "No one has yet invented or discovered a mode of measurement for the intensity of human belief."
Even if opinions could be turned into numerical probabilities, justice would not be served. A jury will have great difficulty "balancing hard numbers against such fuzzy imponderables as the risk of frame-up or of misobservation, if indeed it is not induced to ignore those imponderables altogether." The danger is that the attempt to concretize what is inherently a subjective analysis "will tend to shift the focus away from such elements as volition, knowledge, and intent, and toward such elements as identity and occurrence -- for the same reason that the hard variables tend to swamp the soft."
One court has rejected having jurors determine their own estimates of prior probability because the resulting trial would be unduly complicated. Unfortunately, this same court also barred the admission of the factually-based probability of exclusion because jurors were apt to confuse it with the simplified, but inaccurate, likelihood of paternity. Thus, in the name of simplicity, the court permitted the use of the inaccurate 50/50 prior probability.
This may explain a large part of the reason the legal community has so embraced the unfortunate 50/50 prior probability. In addition to its seeming facial neutrality, its use makes the intimidating math of Bayes' theorem easier to grasp.
To insist upon using the ersatz 50/50 probability because it is more understandable, though, presents the absurdity of looking for the lost coin under the lamppost solely because the light is better. In the words of Oliver Wendell Holmes, the appeal of the precision of numbers is that they "flatter that longing for certainty and for repose which is in every human mind. But certainty generally is illusion, and repose is not the destiny of man." It has long been understood by real mathematicians that "[f]ar better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question."
There is no disagreement about the mathematical accuracy of Bayes' theorem, the statistical meaning of the Paternity Index, or the fact that the generalized use of the 50/50 prior probability is not necessitated by either science or mathematics. But determining whether using a fixed 50/50 prior probability to determine the likelihood of paternity provides an exact answer to the wrong question is a task for the legal system. It does not matter that, within the relevant community of blood testers, the paternity probability calculations were based upon scientific methods, accepted worldwide, which incorporate both Bayes' theorem and the 0.5 prior probability. The real issue is whether the use of an unsubstantiated, counter-factual prior probability is relevant for deciding whether a particular suspect is the father. Simply put, what is and is not relevant is not appropriately decided by scientists and statisticians. "It is for the trial court, not the scientific community, to determine the relevance of the technique." If the numbers are not accurate, they are irrelevant and should not be used.
But sometimes the numbers are accurate and still should not be used. Such a situation arises with another DNA issue, this time the use of race in describing genetic statistics.
III. Racialized Numbers
Imagine that at Dennis's trial for the snake-eyes murder, the prosecution attempted to present evidence that, according to the Bureau of Justice Statistics, homicides are committed by 1 in 28,574 whites and 1 in 3,773 blacks. Among the many objections to the admissibility of this evidence would be that there is no reason to believe that race is relevant to the question of Dennis's guilt and that bringing race unnecessarily before the jury would imply that Dennis's race was somehow relevant.
It is astonishing, therefore, that such race talk is a commonplace occurrence at criminal trials all over America. Even in cases where there is no evidence of the perpetrator's race, jurors are often presented with information describing a DNA match between genetic material found at the crime scene and that of the defendant in explicitly racial terms. For example, in a typical case, the prosecution's expert testified that "Defendant's genetic profile would be expected to occur in one of 96 billion Caucasians, one of 180 billion Hispanics, and one of 340 billion African-Americans." Indeed, when the race and ethnicity of the perpetrator is unknown, providing statistics from several racial groups is the standard way of assessing the significance of a match.
Those not well-versed in thinking about statistics can easily be overwhelmed when told that scientists use racial categories to create extraordinarily intimidating numbers such as "one of 340 billion African-Americans." The impressiveness of those numbers, however, cannot be permitted to prevent the legal system from making its own value judgment about the significance of the fact that there are no actual definitions delineating the racial categories. Nor should the numbers prevent judges from recognizing the harm that results from permitting courtroom discussions of race when race would otherwise be irrelevant.
Scientifically, the DNA analysis for a criminal match is similar, but not identical, to that for paternity matching. In the latter, we are trying to see if a suspect's DNA is consistent with that of whoever contributed half of the child's DNA. In criminal matching, we are trying to determine whether two DNA samples are identical.
When most people think about DNA, they focus on the many thousands of genes, which are linked segments of DNA. Genes have specific functions, determining all the physical traits that we inherit from our biological parents. Genes, however, make up only a tiny percentage of our DNA. The vast majority of human DNA, estimated at 97 percent, is known as non-coding material or "junk DNA," because it serves no known function.
About 99.9% of DNA is identical between any two individuals. When there are differences in either genes or junk DNA, they are identified as alleles. A position on a specific chromosome, called a locus, where almost all humans have the same DNA sequence is termed monomorphic. A locus with multiple possible alleles is termed polymorphic. The more variations there are among alleles, the easier it is to make distinctions between DNA samples. Since junk DNA tends to be highly polymorphic, that is, it contains far greater variation among individuals, it is used for forensic identification.
What that means is that when an expert testifies about the likelihood of DNA appearing in different racial groups, she is only referring to non-coding DNA. Thus, she is not reporting on the DNA that determines skin color or any other physical or biological characteristics associated with specific races; the common assumption to the contrary is completely, if not dangerously, misplaced.
While there is no single gene or collection of genetic material that is specifically associated with any one race, geneticists have determined that some non-coding material is found in greater frequency in some population groups than in others. It is critical to recognize that not everyone in a particular ethnic or racial group will have that same genetic material. Moreover, those in other population groups may very well share that particular DNA. Thus, when testimony is given about race and DNA, all that is really happening is that the expert is making highly probabilistic statements about suspects and the likely ethnic, racial, or cultural populations from which they can be identified--statistically.
Because of this statistical variation, the FBI's databases are divided along racial lines. The FBI has divided its national DNA database into five separate population groups: African-American, United States Caucasian, Hispanics, Far East Asians, and Native Americans. It is from these FBI databases that courtroom experts derive the racial genetic probabilities that they proclaim. But these probabilistic statements, especially when presented with the mathematical certainty of "one of 340 billion African-Americans," mask a series of problems that can escape those easily blinded by numbers.
First, there is the question of defining the racial categories. If we are to divide Americans into five population groups in a scientific fashion, we need to have a working definition of each group. Obviously, if we are to distinguish apples from oranges, we need to know the difference between apples and oranges.
The demarcation between racial categories is especially important for the national DNA database. The federal Combined DNA Index System (CODIS) is a three-tiered system. Local law enforcement agencies collect the DNA data of those they arrest and create a Local DNA Index System (LDIS). Each state then combines the local profiles into a State DNA Index System (SDIS). Each of these state compilations is then combined with the FBI's database into the National DNA Index System (NDIS), which contains more than six million offender profiles.
Were only one entity charged with compiling and categorizing individuals by race, a clear definition of the categories would be necessary. But when thousands of individual local law enforcement agencies are deciding who is African-American, who is Caucasian, and who is Hispanic, such clarity is all the more essential.
Incredibly, there are no definitions of the particular racial categories that are commonly used in courts throughout the nation. An FBI-funded report by the National Research Council not only recommended the use of racial categories but admitted that there could be no uniform way of delineating the categories:
There is no generally agreed-on vocabulary for treating human diversity. Major groups are sometimes designated as races, and at other times as ethnic groups. Ethnic group is also used to designate subgroups of major groups. [G]roups are mixed, all the classifications are fuzzy at the borders, and the criteria for membership are variable. For such reasons, some assert that the word race is meaningless. But the word is commonly used and generally understood, and we need a vocabulary.
In other words, despite the authoritative sound of the race-based genetic statistical evidence, the actual classification system is no more precise, consistent, and objective than Justice Stewart's notorious description of obscenity: "I know it when I see it." The way that race is generally understood is entirely subjective and non-scientific, based on outward appearance and the societal association of that appearance with a particular racial label: "Even though we may feel confident of our visual perceptions and racial or ethnic conclusions, we know that this kind of classification is dismally inaccurate."
One difficulty with acknowledging this inaccuracy is that race is in our nation's DNA. From the initial racialization of slavery, racial definitions have been part of our national discourse. But the actual placement of individuals in racial categories, even if generally understood, has always been imprecise. Ironically, at the same time the Supreme Court in Plessy v. Ferguson was upholding segregation as the inevitable result of "racial instincts," the Court was also acknowledging that there was no agreed-upon distinction between the races: "[T]he proportion of colored blood necessary to constitute a colored person, as distinguished from a white person, is one upon which there is a difference of opinion in the different States."
Today there is still no consensus on the meaning of race. When the Oxford English Dictionary attempted to define "race," it recognized that "the term is often used imprecisely; even among anthropologists there is no generally accepted classification or terminology."
Moreover, despite our common usage, racial categories are not discrete. There is no bright line separating the categories. Unlike with fish or fowl, racial categories are not mutually exclusive.
Still, if, as the National Research Council noted, race is "commonly used," how are we able to categorize individuals by race? Many have argued that race is socially constructed, meaning that race is not innate and unchanging. People look at external physical traits, especially skin color, and associate them with a particular race. These categories, however, are socially fluid: "For example, in the US, people with ancestry from India are sometimes labeled Asian and sometimes labeled white or Caucasian; they are not classified in the same way in the UK as in the US."
Not only will racial labels vary by geography, they can also vary over time. It is not at all clear that Homer Plessy, who was generally considered to look White, would be considered African-American today. Indeed, people may even change how they self-identify, altering the racial group to which they say they belong.
The incoherence of race as a category is not contradicted by the fact that there is some statistical correlation between the frequency of certain alleles and our ill-defined racial categories. What the DNA variations actually correlate with is, at best, partial ancestral geographic origin. In other words, the DNA statistics might signal where some of one's ancestors originated.
But even to the extent that geography is in our genes, the five racial categories of CODIS cannot capture the reality of America. Unlike ancient insular societies, Americans do not stay isolated in neatly-definable groupings: After hundreds of years of sexual mixings, there continues to be no socially sanctioned in-between classification of race in America. Accordingly, the correlation of DNA and geography confirms the incoherence of the CODIS statistical analysis, the reality being that the diversity of human biology has little in common with socially constructed racial categories.
Thus, the introduction in court of racially-based DNA numbers is fundamentally misleading. The geographic origins of a particular long-dead ancestor, which might be conveyed by those numbers, are simply not the same as the social classification a juror may associate with a particular outward physical appearance. When jurors hear that a particular combination of alleles occurs in "one of 96 billion Caucasians, one of 180 billion Hispanics, and one of 340 billion African-Americans," they will assume the number applies to the racial category in which they have placed the defendant based on a subjective interpretation of outward physical appearance, regardless of the defendant's actual [and unknown] ancestral origins. The influence of the numbers masks the unspoken assumptions.
But there is a greater problem with the legal profession's awe of numbers. By relinquishing authority to those who control the numbers, courts have abandoned their responsibility to consider the harm caused by unnecessary race speech in court. This cavalier attitude was expressed by the California Supreme Court when it endorsed the admission of expert testimony presenting a range of racially-characterized genetic profile frequencies: "Presenting the objective data in the manner in which such information is collected and analyzed within the scientific community does not inject inappropriate racial assumptions or issues into the litigation."
Courts must not be so intimidated by "objective data" that they fail to consider the harm created whenever race is introduced into a courtroom discussion. In the story of Snake Eyes, the race of Dennis was irrelevant; indeed, I suspect, it was outside most readers' thoughts until racial testimony brought the issue to their attention. Courts should not casually permit the insertion of race into a juror's analysis.
When the prosecutor puts forth racially-categorized statistics, it immediately raises the question of the relevance of race. Imagine if, at the trial of an African American defendant, a juror is told that the DNA found at the scene of the crime matched the defendant's and occurred in 1 out of 10,000 United States Caucasians and 1 out of 10 billion African Americans. The relevance of the DNA would depend on how likely the jury thought it was that the crime had been committed by an African American. After all, there would be thousands of whites who would match the DNA, but very few, if any, other African Americans. Thus, if it were considered the kind of crime that only an African American would commit, the overwhelming likelihood would be that the defendant was the culprit.
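The arithmetic behind that intuition is straightforward. The sketch below multiplies the hypothetical match frequencies from the text by round, assumed population figures (not census data) to estimate how many coincidental matches each group would contain:

```python
# Expected number of coincidental DNA matches in each group, using the
# hypothetical frequencies above and assumed round population figures.
white_pop = 200_000_000          # assumed U.S. Caucasian population
black_pop = 40_000_000           # assumed African American population

expected_white = white_pop * (1 / 10_000)           # 1-in-10,000 frequency
expected_black = black_pop * (1 / 10_000_000_000)   # 1-in-10-billion frequency

print(f"Expected matching Caucasians:        {expected_white:,.0f}")   # about 20,000
print(f"Expected matching African Americans: {expected_black:.3f}")    # about 0.004
```

On these assumed figures, roughly twenty thousand whites would match by chance, while the expected number of other matching African Americans is effectively zero, which is why the racial framing quietly invites the jury to reason from the defendant's perceived race.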
This sort of racial thinking is inherent in the use of racial categories, because, by highlighting, without compelling justification, the racial distinctions that have historically divided us, the government is expressing an improperly divisive conception of the public. Historically, and unfortunately today, [t]he word race suggests that human beings can be divided into subspecies, some of which are morally, intellectually, and biologically inferior to others.
The Supreme Court has recognized this danger, even when no particular group was being treated differently. In Anderson v. Martin, the Court struck down a Louisiana statute requiring that ballots designate the race of candidates for elective office. The Court stated that, although Louisiana was not restricting any voter's individual choice, "by directing the citizen's attention to the single consideration of race or color, the State indicates that a candidate's race or color is an important -- perhaps paramount -- consideration in the citizen's choice." The constitutional evil arose because, "by placing a racial label on a candidate . . . the State furnishes a vehicle by which racial prejudice may be so aroused as to operate against one group because of race and for another."
When not under the hypnotic influence of numbers, courts readily recognize the danger of governmental use of race-based categorizations: [R]acial classifications are simply too pernicious to permit any but the most exact connection between justification and classification. Accordingly, the Supreme Court has frequently declared that, all racial classifications, imposed by whatever federal, state, or local governmental actor, must be analyzed by a reviewing court under strict scrutiny. A racial classification will only pass strict scrutiny if it is necessary for furthering some compelling interest and is narrowly tailored to further that interest.
Conceding that the accurate determination of a criminal defendant's guilt or innocence is compelling still leaves the critical question of whether the racial categorization of DNA is necessary for, and narrowly tailored to, the making of that determination. Obviously, if there were any racially neutral way of presenting the DNA information in a meaningful fashion, the Constitution would require the FBI and the courts to forego the race-based approach.
Fortunately, there is in fact a mathematical solution to this problem, a relatively simple way to fulfill the desire for a "race-blind figure in a general-population case." All that is required is the use of what mathematicians call a corrective factor. Mathematicians often add such corrective factors to their equations so that their generalized theoretical predictions can more accurately reflect particular factual situations. For example, when calculating back pay in one case, the EEOC determined that using only the hourly wage would not represent the full amount of money lost, and proposed a corrective factor to be placed in the formula which would accurately reflect the effect of overtime hours. In a similar fashion, the National Committee on the Future of DNA Evidence has shown that, by placing the appropriate corrective factor in the equations for calculating genetic probabilities, the necessity for group classification could be avoided by using an overall U.S. database. Significantly, the Committee reported that it was able to convert to a race-neutral formula so easily because genetic differences "are mainly between individuals rather than between group averages."
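A sketch may help show how such a corrective factor works. In the forensic genetics literature, the standard adjustment inflates allele frequencies drawn from a single overall database by a population-structure parameter, conventionally written theta. The code below is an illustration only: the allele frequencies and the theta value of 0.03 are assumed, the formula shown is the common homozygote adjustment, and the heterozygote case is left unadjusted for simplicity; it is not presented as the Committee's exact equations.

```python
# Race-neutral profile frequency from a single overall U.S. database, using a
# population-structure corrective factor (theta). All numbers are assumed.

THETA = 0.03  # assumed, deliberately conservative value for the corrective factor

def locus_match_prob(p, q=None, theta=THETA):
    """Match probability at one locus from overall-population allele frequencies."""
    if q is None:
        # homozygote: two copies of the allele with frequency p, inflated by theta
        return p * p + p * (1 - p) * theta
    # heterozygote: one copy each of alleles with frequencies p and q
    return 2 * p * q

# Assumed allele frequencies at three independent loci; multiplying across
# independent loci gives the overall profile frequency.
loci = [(0.10, 0.20), (0.05, None), (0.15, 0.30)]
profile_freq = 1.0
for p, q in loci:
    profile_freq *= locus_match_prob(p, q)

print(f"Profile frequency: about 1 in {1 / profile_freq:,.0f}")
```

The design point is that the conservatism lives in a single parameter applied to one general database, so no witness ever needs to utter a racial category in front of the jury.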
Because a scientifically sound basis exists for using general database probabilities, the continued use in criminal trials of race-based statistics cannot be legally justified. Judges must tell genetic statisticians that the emphasis on race makes racial divisions too salient. The legal system must reassert itself and prohibit the placing of the power of the State behind a racial classification that induces racial prejudice.
IV. Bigoted Numbers
Sometimes, the problem with numbers is not that they induce prejudice in others, but that the very use of numbers is unknowingly bigoted. Consider the insurance company which, in 1962, charged African Americans over 28% more than Caucasians for the identical life insurance policy. The insurance company defended this discriminatory treatment as justified by statistics showing that, at the time of the policy, African Americans did not, on average, live as long as Caucasians.
Or consider the 2004 case of a three-month-old Native American girl who was killed when her father, in a rage, threw her to the ground, causing her to hit her head against the bathroom sink. After the father was convicted of voluntary manslaughter, the court called for an expert to calculate the lost future income of the infant, so that restitution could be awarded for the income she would have earned but for her death at the hands of the defendant. The expert concluded that the restitution should be reduced to account for the victim's race and sex; he thus recommended an award of $171,366, which was more than 44% lower than his estimate of $308,633 in lost wages for all Americans of that age.
One can make a numeric defense of such disparate treatment. After all, statistics do show that, on average, African Americans die five years earlier than white Americans. Statistics also show that African Americans earn, on average, less than whites, and women, on average, earn less than men. Moreover, numbers show that women have, on average, spent fewer years in the workforce than men, largely because many women have taken time off from work in order to raise children.
Backed by such hard numbers, most courts have been quite willing to make decisions which result in different financial outcomes depending on the race and gender of the parties. As Professor Martha Chamallas has noted: "[I]t is commonplace for expert witnesses to rely on gender and race-based tables to determine both the number of years that a plaintiff would likely have worked (work/life expectancy) and the likely annual income the plaintiff would have earned." Thus, in reading opinions awarding damages, it is not unusual to read statements such as "in 2003, an African American female, aged 65, born in and living in the United States, has an additional life expectancy of 18.5 years"; "Plaintiff presented evidence from an economics expert... as to the demonstrated earning capacity of someone of plaintiff's race, sex, age, and educational level"; and "future earnings [were calculated] based on the average earnings of a college-educated female of her age."
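The mechanism Professor Chamallas describes can be made concrete with a toy calculation. Everything in the sketch below is invented for illustration (the salaries, worklife figures, and discount rate are assumptions, not drawn from any actual table); the point is only that swapping in a demographic-specific table mechanically shrinks the award for an otherwise identical plaintiff.

```python
# Present value of lost future earnings; all inputs are invented for illustration.

def pv_lost_earnings(annual_earnings, worklife_years, discount_rate=0.03):
    """Discount a stream of equal annual earnings over a worklife expectancy."""
    return sum(annual_earnings / (1 + discount_rate) ** t
               for t in range(1, worklife_years + 1))

# Same plaintiff, same injury -- only the statistical table changes.
all_workers = pv_lost_earnings(annual_earnings=45_000, worklife_years=35)     # blended table
female_table = pv_lost_earnings(annual_earnings=33_000, worklife_years=28)    # gender-specific

print(f"All-worker table:      ${all_workers:,.0f}")
print(f"Gender-specific table: ${female_table:,.0f}")
print(f"Reduction:             {1 - female_table / all_workers:.0%}")
```

Because the table choice enters the formula twice, through both annual earnings and worklife expectancy, the discriminatory inputs compound rather than merely add.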
Even the federal government relies on sex-based tables. In calculating the minimum funding requirements for certain pension plans, the I.R.S. provides mortality tables that can be used for determining the current liability for individuals who are entitled to benefits on account of disability. The IRS explains that these mortality tables are gender-distinct because of "significant differences between expected male mortality and expected female mortality." Even more emphatically, the IRS requires that those who wish to use alternate mortality tables for their pension plans must use tables that treat men and women differently: "Separate mortality tables must be established for each gender under the plan."
One might well expect that even statistically-justified race and gender distinctions would be met with heightened scrutiny by the courts. Race and gender are, after all, called "suspect classes" because we suspect that racial and gender classifications are based on stereotyped views of groups, and we suspect that, as in the past, these distinctions have the purpose or effect of harming those in the disfavored category. Yet, surprisingly, the reported cases have almost completely neglected the question. It is as if, when faced with race- or gender-based statistics, we tend not to notice the discrimination and to accept it as natural and unproblematic.
The power of these numbers can be so great that even well-meaning judges shrink from confronting them. In holding that lost wages for a female plaintiff needed to be reduced to reflect the average woman's lower salary, one judge bemoaned: "I am constrained to agree with the defense that the present value of prospective earnings, female wages before taxes must be used. However sympathetic this Court may be to equality in employment, it must look to the reality of the situation and not be controlled by its own convictions."
The true reality of the situation, however, is that a reluctance to fully understand what numbers can and cannot tell us has caused the justice system to accept and enforce needless discrimination. The reliance on race- and sex-based statistics should be rejected as both bad mathematics and bad policy.
The first mistake made by those who rely on race- and sex-based statistics is that they ignore one of the cardinal principles of statistics: correlation does not prove causation. The fact that the month with the fewest days has the most snow days does not imply either that short months cause snow or that snow causes short months.
Nonetheless, statistical correlation may still be relevant for predicting the future. We can often use past experience to guess what is likely to happen in the future. Thus, over a span of several years, we can expect that the shortest month generally will continue to be the one that tends to experience the most snow days.
There is a critical assumption, though, which enables us to use statistics of what has already happened to predict what is still to come. For the past to predict the future, the future must resemble the past. To continue our short month-snow day analogy, assume that a new leader comes in and changes the calendar. Declaring that summer vacations are too long and wasteful, this despot decrees that July and August shall henceforth only have 21 days, and that the remaining ten months would each get an additional two days. Suddenly, our prediction that the shortest month will have the most snow days is obsolete, even though the statistical analysis on which it was based remains unchanged.
Similarly, a law student in 1963, wondering whether his future granddaughter would attend law school, would have been badly misled by statistical tables. He would have been told that only 4.2% of law students were women and that, looking backwards, the numbers had barely budged over the preceding decades. Fast forward to the present, and we see that almost half of all law students are women. This monumental change, due largely to the women's movement and anti-discrimination laws, would not have been incorporated into statistical tables.
Great social change continues into the 21st century (does anyone really need to say "President Obama"?). Even the mortality rates of today are different from those of a few years ago. According to the Centers for Disease Control, "[d]ifferences in mortality between men and women continued to narrow." Similarly, since 1989, "the age-adjusted death rates for the black and white populations have tended toward convergence."
For race- and gender-based statistical tables to accurately foretell the future, therefore, the circumstances which caused the statistical differences would have to continue. The only way to justify mathematically the use of race- and gender-based statistical tables for predicting the future is to assume either that existing discrimination (and its effects) will continue or that the race- and gender-based distinctions are innate and inevitable. Not only are these propositions offensive, pessimistic, and wrong; evaluating their likelihood is not within a statistician's skill set. Determining whether discrimination and the effects of past discrimination will be negated by both the legal system and social changes is, most emphatically, not the province of statisticians. The justice system cannot allow itself to be so intimidated by a statistical statement that it overlooks the need to make its own evaluation; indeed, any decision to use a group-based projection into the future involves normative judgments about the relevant frame of reference and the rate of future change.
For example, as previously discussed, America's fascination with racial analysis often masks, rather than reveals, the truth. While African-Americans, on average, have a shorter life expectancy than their white counterparts, a large proportion of that difference is due to socio-economic, not racial, differences. One demographic study found that "[w]hite residents of urban poor areas have mortality profiles comparable to those of black residents of poor rural areas and blacks nationwide." The life expectancy of these whites was found to be, in fact, lower than that of residents of relatively advantaged black urban areas. The socio-economic factor is disregarded in the life expectancy tables. As Judge Jack Weinstein noted, "Gross statistical tables do not answer the question: how does the life expectancy of well-off or middle-class African-Americans compare to that of poor African-Americans?" Thus, he concluded, courts should reject the use of racially based tables that "tend to enforce the negative impacts of lower socio-economic status while ignoring the diversity within populations."
Courts should also recognize that, to the extent the differences reflected in the race- and gender-based tables are caused by ongoing discrimination, using those tables reinforces the harm caused by wrongful discrimination. As one court ruled, it would be inappropriate to incorporate "current discrimination resulting in wage differences between the sexes or races or the potential for any future such discrimination" into a calculation for damages resulting from lost wages. This is especially true in tort cases, where victims have been deprived of their chance to excel in life beyond predicted statistical averages.
The use of these statistical averages causes harm in a great many ways. The most obvious way is that some injured tort victims receive far less of a remedy than other equally injured tort victims, based solely on their race or gender. Because of a long history of discriminatory treatment, the explicit use of race-based and sex-based economic data dramatically reduces some damage awards for women and for African-American and Hispanic men. In an infamous 1905 case, the court was faced with determining damages from wrongful death claims for eight claimants whom it described as "white" or "colored." They all could not be treated equally, the court decreed, due to the difference in "the vitality of the two races." Accordingly, the judge lowered the awards for the deaths of blacks ten percent more than the awards for the deaths of whites and slashed three of the awards for blacks by forty percent or more.
Racial and gender differences in income continue today. According to a 2007 report by the U.S. Department of Commerce, non-Hispanic white men had annual median earnings of $47,814, while African American men's annual median earnings were more than 25% lower, at $34,480. Similarly, the median earnings of women, $32,649, are 77.3% of men's $42,210. For some demographics, the differences are even starker; the average salary for a male Native American is just 58% of that for white males. Thus, the use of race- and gender-based statistics will have the undesirable effect of reinforcing the underlying social inequalities of our society rather than describing a significant biological difference.
There is an additional social cost, beyond the perpetuation of inappropriate stereotypes. If the deterrent effect of a legal action depends on its ability to raise the cost of the undesirable behavior to the defendant, then lowering damages for injuring members of minority groups will have the perverse result of encouraging torts against them. Thus, because it is cheaper to injure poor minority children, there is less incentive for defendants to take measures to clean up toxic hazards in the neighborhoods most affected by lead paint.
A further harm caused by the use of race- and sex-based statistics is analogous to the harm discussed with race-based DNA testimony: such use places unnecessary emphasis on factors that are both largely irrelevant and have a historical record of justifying irrational discrimination. As one commentator noted, organizing the statistics around race "propels race to the forefront of predictions about individual achievement." The Supreme Court has made a similar observation about the use of gender-based statistics. Since each individual's life expectancy "is based on a number of factors, of which sex is only one . . . [one] cannot say that an actuarial distinction based entirely on sex is based on any other factor than sex. Sex is exactly what it is based on."
There have been significant instances where the fundamental interest in equality has overwhelmed the power of the statistical average. Most notable, perhaps, was the distribution of money from the September 11th Victim Compensation Fund. This fund was established by federal law to provide compensation for those injured or killed as a result of the 9/11 attacks. A Special Master, Kenneth Feinberg, was appointed to distribute the funds. One of the thornier issues he had to resolve was how to calculate the lost earnings of the victims. In calculating the expected work life for the claimants, the Special Master chose not to discriminate against women and elected to utilize the same work-life table for both men and women. The appropriateness of choosing a gender-neutral approach was brought into sharp focus by the overarching purpose of the compensation fund, to serve as "a national expression of unity in the face of a tragedy unique in American history," as well as to help survivors. Thus, the transcendent values of equality and respect for individuals were found to outweigh the persuasive power of statistics in the extraordinary context of compensating for the horrors of 9/11. Those values, though, should also be sufficient in ordinary cases to rebut the need for race- and gender-based statistics.
The Supreme Court took a tentative step toward this goal when it ruled that Title VII of the Civil Rights Act of 1964 prohibits employers from utilizing gender-based statistics in their retirement plans. According to the Court, employers can neither require women to make larger contributions in order to obtain the same monthly pension benefits as men nor offer their employees the option of receiving retirement benefits only with companies that pay lower monthly benefits to a woman than a man who has made the same contributions. It is irrelevant, the Court explained, whether or not the sex-segregated actuarial tables actually "reflect an accurate prediction of the longevity of women as a class." Indeed, "[even] a true generalization about [a] class cannot justify class-based treatment."
Even though these rulings were based on a particular federal statute, the reasoning can be applied equally to a constitutional analysis of the use of race- and gender-based statistics. The Court recognized that the use of race and sex to predict longevity was flatly inconsistent with the principles of Title VII that require employers to treat their employees "as individuals, not as simply components of a racial, religious, sexual, or national class."
Such use of race- and sex-based statistics is equally inconsistent with constitutional norms, since the same principle applies: "At the heart of the Constitution's guarantee of equal protection lies the simple command that the Government must treat citizens as individuals, not as simply components of a racial, religious, sexual or national class."
It is for the courts, and not the statisticians, to ensure that this command is obeyed. Courts should require the use of blended tables which do not distinguish based on race or gender when calculating tort damages. Just as insurance companies elected to stop using race-based statistics due to the social unacceptability of such discrimination, so should they cease using gender-based statistics, by force of law if not voluntarily. If society is ready to transcend the history of race and gender discrimination, we must not permit bigoted numbers to slow our progress.
V. Reclaiming Judicial Responsibility for Allocating the Risk of Error
Mistakes happen. There is no truth machine that will tell us with unwavering accuracy the proper result of a medical test, economic prediction, or trial. Ideally, we want to reduce the frequency and degree of inaccurate results, but imperfection is an inescapable result of the human condition. Statisticians, who deal in the art of the probable, have devised a useful way to think about and deal with this inevitability of error.
Suppose, for example, that there was a medical test for determining whether patients had a particular disease, and, in general, a higher test score correlated to an increased likelihood of having the disease. Assume that patients have a range of scores on this test, and a cut-off point is needed so that we can say that above this particular score, the patient will be diagnosed with the condition.
There are two situations where the test could be wrong. First, with a false positive, healthy patients are diagnosed with the disease. Alternatively, with a false negative, diseased patients would be mistakenly termed healthy. In statistics, these would be termed Type I and Type II errors respectively. No matter which cut-off score you choose, you will make some errors; there is no perfect point for us to choose.
Accordingly, the cut-off is chosen based on a determination as to which kind of error is worse than the other. Raising the cut-off point will result in more false negatives (more afflicted patients declared healthy) but fewer false positives (fewer healthy patients deemed afflicted). Lowering the cut-off point has the opposite effect, causing fewer false negatives (fewer afflicted patients declared healthy), but more false positives (a greater number of healthy patients will be termed afflicted).
Since both kinds of errors always will occur, the cut-off point chosen for determining the presence of the disease will reflect a value judgment as to which error has more serious consequences. We might prefer to have fewer false positives, a smaller Type I error rate, for an employment drug test, so that we reduce the number of employees wrongfully accused. For diseases with grave consequences that could be averted only by immediate action (as when a change in diet could avoid retardation during fetal development), we might desire fewer false negatives, a smaller Type II error rate, to minimize the possibility that someone with the disease goes undiagnosed.
The choice of the legal standard of proof reflects a similar calculus. As with the inevitably imperfect diagnostic medical test, there is always the possibility that the verdict in a trial does not square with the true facts.
In the criminal context, if we convict someone who is innocent, we have made a Type I error. If we acquit a guilty person, we have made a Type II error. Similarly, in the civil context, finding for the plaintiff when, were truth fully known, the defendant should prevail, is a Type I error; permitting the culpable defendant to win the case would be a Type II error. And we know that errors will be made.
Adjusting the standard of proof affects the frequency of each type of error. Just as raising the cut-off point results in fewer healthy people being diagnosed as diseased, the higher we make the standard of proof, the fewer the innocent people who will be found guilty. The cost, of course, is that then more guilty people will be acquitted.
For criminal cases, the Constitution requires the highest standard, beyond a reasonable doubt, because of a fundamental value determination of our society that it is far worse to convict an innocent man than to let a guilty man go free. This principle predates the Constitution, as reflected in Blackstone's admonition that English law recognized that it was preferable for ten guilty persons to escape than for one innocent person to be convicted wrongfully. The Supreme Court has explained that this balance reflects the fact that the accused has a far greater stake in a criminal trial than even the Government: "Where one party has at stake an interest of transcending value -- as a criminal defendant his liberty -- [the] margin of error is reduced as to him by the process of placing on the other party the burden of persuading the fact finder . . . of his guilt beyond a reasonable doubt."
By contrast, in a civil suit between two parties, where the plaintiff alleges that the defendant is responsible for some monetary loss, the preponderance of the evidence standard is used, signifying that the cost of a mistaken verdict for plaintiff is neither greater nor less than the cost of a mistaken verdict for defendant. As former Chief Justice Rehnquist noted, because the preponderance of the evidence standard allocates the risk of error more or less evenly, it is used whenever an incorrect finding of fault would produce consequences as undesirable as the consequences that would be produced by an incorrect finding of no fault.
In some cases, the Court's evaluation of the undesirability resulting from incorrect findings has led to the utilization of a middle level of burden of proof -- clear and convincing evidence. The Court has required the Government to prove its case by clear and convincing evidence in civil cases in which governmental action threatened a significant deprivation of liberty, such as civil commitment, deportation, and denaturalization.
This increased burden was selected to ensure that more of the risk of an erroneous decision would be imposed on the Government: "The individual should not be asked to share equally with society the risk of error when the possible injury to the individual is significantly greater than any possible harm to the state."
The question of when an injury is "significantly greater" is a matter for judicial determination. For example, in Santosky v. Kramer, the Court struggled with the wrenching issue of adjudicating the loss of parental rights. In that case, three children were removed from their parents' custody after the local Department of Social Services found evidence of abuse, malnutrition and neglect. The issue for the Supreme Court was to determine the standard of proof the government needed to establish before permanently terminating parental rights.
A majority of the Justices, citing the commanding importance of a parent's interest in raising his or her child, required the use of the clear and convincing standard. The preponderance of the evidence standard would "reflect[] the judgment that society is nearly neutral between erroneous termination of parental rights and erroneous failure to terminate those rights." The Court stated that the preponderance of the evidence standard, which allocates the risk of error nearly equally between those two outcomes, does not properly reflect their relative severity.
In dissent, then-Associate Justice Rehnquist argued that the majority had seriously undervalued the harm that would result from erroneously maintaining parental rights: "If the Family Court makes an incorrect factual determination resulting in a failure to terminate a parent-child relationship which rightfully should be ended, the child involved must return either to an abusive home or to the often unstable world of foster care." Therefore, he stated, the two types of errors should be viewed as having equal seriousness, and a preponderance of the evidence standard should have been utilized to determine what was best for the children.
A second case, Cruzan v. Director, Missouri Department of Health, presented a similar need to allocate the risk of error. Cruzan involved parents who wanted to terminate the life support system of their comatose daughter. The Supreme Court permitted the State of Missouri to overrule the desires of the parents unless they could prove by clear and convincing evidence that their daughter would have wanted to avoid further medical treatment. The Court emphasized the State's great interest in the protection and preservation of human life, and concluded that: "An erroneous decision to withdraw life-sustaining treatment . . . is not susceptible of correction. . . . An erroneous decision not to terminate [could be corrected by] the possibility of subsequent developments such as advancements in medical science, [or] the discovery of new evidence regarding the patient's intent . . . ."
This time, Justice Brennan dissented, contending that the Court had undervalued the serious harm from incorrectly rejecting the parents' claim: "An erroneous decision not to terminate life support . . . robs a patient of the very qualities protected by the right to avoid unwanted medical treatment. His own degraded existence is perpetuated; his family's suffering is protracted; the memory he leaves behind becomes more and more distorted."
In all of these cases, the Supreme Court has struggled to make a careful, nuanced determination as to whether greater harm resulted from one type of error or the other, and then has selected the legal standard that incorporates that determination. What matters for this discussion is not whether one agrees with the Court's ultimate determination in any of these cases. Rather, the point is that all would agree it is the task of the court to make the normative determination of the harm that would be caused by both types of erroneous decisions and adjust the legal standard accordingly.
Unfortunately, as soon as numbers appear in a case, the courts appear to abdicate their policy-making responsibilities. For example, in Castaneda v. Partida, the Supreme Court was asked to determine whether the Texas system for selecting members of a grand jury discriminated against Mexican-Americans. The most important evidence for the criminal defendant challenging the system was a comparison of the percentage of Mexican-Americans in Hidalgo County with the percentage on grand juries. After a brief statistical analysis, the Court concluded that any claim that the jury drawing was random "would be suspect to a social scientist." That is, however, the wrong question. Whether the statistics would convince a social scientist is largely irrelevant; instead, the Court should be asking whether the statistics are legally relevant to resolving the issue at hand.
That decision must incorporate the consideration of how the justice system should balance the risk of erroneously finding discrimination in a fair system with the risk of erroneously finding neutrality in a discriminatory grand jury selection process. While mathematical analysis is necessary to explain to the Court the meaning of the numbers, the value-laden evaluation of the comparative seriousness of the harms involved should never be a mathematical decision.
To understand how the courts have forfeited their proper role and how they can reclaim it, we must understand the statistical process of hypothesis testing. For that exploration, I will present another story, called Heads You Win, to demonstrate the value judgments implicit in hypothesis testing, and show how and why judges must reassert their right [and obligation] to make those judgments.
Heads You Win
Sally is an elderly art collector who wants to give away the most valuable piece in her collection to one of her two children. She invites them to her house and tells them that she wants to play a game to decide who gets the painting. She opens a fresh roll of quarters and gives one to her son Charles and one to her daughter Lisa. "The game is simple," she says. "Each of you will flip your quarter 10 times. Whoever gets the most heads gets the picture. But, if I catch you cheating, not only will you not get the painting, I will leave you out of my will." Charles and Lisa take turns flipping their coins while Sally marks the number of heads each flips. When the final scores are tallied, Charles has won with 10 heads, while Lisa only has 5. Lisa turns to her mother and shouts, "It's not fair. He can't be that lucky. He must have cheated."
Based on the evidence, how is Sally to proceed? More particularly, what do the numbers tell her about the likelihood that Charles cheated?
The case of Heads You Win will turn on how we think about unlikely events. While it may seem counterintuitive, every possible outcome of the ten coin tosses is, on one level, unlikely. Since in a fair game there is an equal 50/50 likelihood of flipping heads or tails on a single toss, getting heads on half the tosses, 5 heads out of 10, would seem to be a likely event. Such thinking, though, ignores the fact that out of 10 tosses there are actually 11 possibilities [ranging from 0 heads to 10 heads]. While 5 is indeed the most likely of the 11 possibilities, one should actually expect to get 5 heads out of 10 tosses fewer than 25% of the time. While that is the least unlikely event, variation from that ideal would hardly be unexpected.
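The arithmetic behind these claims is easy to verify. The following is a minimal sketch (in Python, using the binomial probabilities for a fair coin) that enumerates all 11 possible head counts for ten tosses:

```python
from math import comb

# Each of the 11 possible head counts (0 through 10) for ten fair
# tosses has probability C(10, k) / 2^10.
probs = {k: comb(10, k) / 2**10 for k in range(11)}

# Five heads is the single most likely of the eleven outcomes...
most_likely = max(probs, key=probs.get)

# ...yet it occurs slightly less than 25% of the time.
print(most_likely, round(probs[5], 4))  # prints: 5 0.2461
```

Five heads is indeed the "least unlikely" result, but a bettor wagering on exactly five heads would lose more than three times out of four.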
Similarly, when examining a sample, whether a statistical survey or the scores some people obtained on a job test, one should expect that the sample will not be a perfect representation of the entire population. Similarly, the mere fact that the price of a stock rises the day after information is released does not prove that the stock increase was noteworthy, let alone linked to the information. There is an inevitable volatility in stock prices that might account for the increase.
Statisticians have a way of calculating the expected volatility in a sample. The mathematical phrase is "standard deviation," which can be thought of as a measure of spread, dispersion or variability of a group of numbers. The standard deviation is determined based on the amount each element of the sample differs from the average. While there are precise mathematical formulas for calculating standard deviation, there is great benefit to be derived from considering the plain English meaning of the phrase. The phrase "standard deviation" means that, for every statistical sample, some divergence from the center, from the normal, is to be expected. It is, indeed, standard for any specific result to be somewhat different from another.
But the phrase "standard deviation" implies something else: not all deviations are standard. A degree of variation is to be expected, but some variations are so extreme as to be non-standard and surprising. When those surprising variations occur, it makes statisticians consider the possibility that their original expectations might have been mistaken.
To return to the mathematical definition of standard deviation, the more standard deviations a result is from the expected result [oftentimes the mean], the less likely one is to see it. The most common mathematical calculation for standard deviation involves what is known as a normal distribution. The normal distribution can be thought of as the classic bell curve, a symmetric distribution with the highest total occurring in the middle and smaller and smaller totals as one goes to either extreme. With a normal distribution, a little more than 2/3 of all results will be within one standard deviation of the mean. Slightly less than 1/3 of all results are more than one standard deviation away. Thus, the mere fact that results occur one standard deviation from the mean is not particularly shocking. An occurrence with a 1/3 probability, say rolling a one or two with a single die, would not raise any eyebrows.
Once we get more than two standard deviations from the mean, however, suspicions often rise. Since the probability is approximately 95.5% that a randomly selected result will fall within two standard deviations of the mean, there is a less than 5% chance of seeing a result that is more than two standard deviations from the mean.
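These percentages are properties of the bell curve itself: for any normal distribution, the share of results falling within z standard deviations of the mean is given by the error function, erf(z/√2). A quick sketch confirms the figures in the text:

```python
from math import erf, sqrt

# Fraction of a normal distribution lying within z standard
# deviations of the mean: erf(z / sqrt(2)).
def within(z):
    return erf(z / sqrt(2))

print(round(within(1), 4))  # 0.6827 -- "a little more than 2/3"
print(round(within(2), 4))  # 0.9545 -- "approximately 95.5%"
```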
In Heads You Win, the task for Sally is to determine whether the specific result Charles obtained, 10 heads out of 10 tosses, is so surprising as to raise the specter that the tosses were not fair. To make that determination, statisticians might suggest that she use hypothesis testing to quantify this intuition.
Perhaps the most remarkable aspect of hypothesis testing is that usually it does not actually test the proposition in which you are most interested. Rather, hypothesis testing generally examines the likelihood that the opposite of what you care about is true. For example, the question Sally needs to decide is whether Charles cheated. No mathematical tool exists for determining this directly. Instead, an examination of the data through hypothesis testing is a journey in indirection. The focus of the hypothesis test would be an evaluation of the probability of seeing Charles's result [10 heads] if there had been no cheating. Specifically, all that a hypothesis test can show is whether an observed result is so unlikely to have occurred by chance alone that it is reasonable to attribute the result to something else. We are left with a double negative. The test cannot tell us if Charles probably cheated; at best, it can tell us something like "Charles probably did not not cheat."
The first step in hypothesis testing is to identify what we want to ascertain, namely the proposition that Charles's result is a product of cheating. This is sometimes described, ironically, as the "alternate hypothesis." If there were multiple possible causes for the result, e.g., maybe the coin was not fair to begin with or maybe Sally herself doctored the coin, there would be several ways to phrase the alternate hypothesis. For this example, assume that if the process was not fair, Charles cheated.
Sally would then label the opposite proposition, that Charles's result of 10 heads is a product of random chance, as the "null hypothesis." Generally, the null hypothesis is a statement that differences in the sample are due to the luck of the draw. In logical terms, we can say that if the null hypothesis is false, the alternate hypothesis is accurate. Thus, the goal of hypothesis testing is to attempt to disprove the null hypothesis.
To do that, Sally would calculate what is known as a P-value. For different types of data, there are different statistical tests that would be used to determine the P-value, but all are generally designed to answer one question: What is the probability of seeing a result as extreme as the result actually seen if the null hypothesis were true? Sally would discover that the P-value for Charles's result is less than 1%, only about 0.00098.
What does that small number tell us? It means that it would be extremely unlikely to see as many as 10 heads out of 10 tosses as a random result of a fair process. A statistician would find this result statistically significant, and therefore reject the null hypothesis that Charles's 10 heads were the result of pure chance. Because the probability of seeing ten heads in ten tosses is so small under the null hypothesis, Charles's result would be strong evidence that the coin toss was not fair.
To understand what it means to say that the probability was so small, we can change the plot of Heads You Win. Suppose that instead of 10 heads, Charles had tossed 8 heads in his 10 tosses, and his sister again accused him of cheating. The analysis is slightly different this time. We would not use a null hypothesis based on exactly 8 heads appearing as a product of random chance. If the allegation is that "He can't be that lucky," we need to consider the probability of a random person being precisely that lucky [8 heads of 10 flips] or even luckier [tossing 9 heads of 10 flips or tossing 10 heads of 10 flips]. In fact, to see how extreme 8 heads is, statisticians would want to calculate how likely it would be to get 8 or more of either heads or tails. Thus, the new null hypothesis is that tossing 8 or more heads or 8 or more tails out of 10 tosses is a product of random chance. Calculating the new P-value, Sally would learn that the probability of seeing a result as extreme as 8 heads due to random chance is about 10.94%.
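Both P-values in the story can be reproduced by simple counting; this sketch tallies the 1,024 equally likely sequences of ten fair tosses:

```python
from math import comb

N = 2**10  # 1,024 equally likely sequences of ten fair tosses

# P-value for the original story: exactly 10 heads out of 10.
p_ten_heads = comb(10, 10) / N

# P-value for the modified story: a result at least as extreme as
# 8 heads, counting 8 or more heads OR 8 or more tails (two-tailed).
at_least_8 = sum(comb(10, k) for k in (8, 9, 10))
p_eight_heads = 2 * at_least_8 / N

print(round(p_ten_heads, 5))    # 0.00098
print(round(p_eight_heads, 4))  # 0.1094
```

Ten heads is one sequence out of 1,024; "eight or more of either face" covers 112 of the 1,024 sequences, which is why the second P-value is more than a hundred times larger.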
The key question is whether this result, the probability of seeing eight or more heads for ten tosses if the null hypothesis were true, is so small that we should suspect that the coin toss was not fair. According to statisticians, the magic number for a P-value is .05 (which is approximately two standard deviations); if the P-value is greater than .05, the results are deemed to be not statistically significant, and hence not sufficient to disprove the null hypothesis. A test where the P-value must be less than .05 to be deemed statistically significant is said by statisticians to have a .05 significance level. Another concept, termed either the confidence level or confidence coefficient, is defined as being equal to one minus the significance level; thus, statisticians say that the confidence level for such a test is 95%.
With such a confidence level, because the P-value is greater than .05, Sally would not reject the hypothesis that Charles's result is from random chance. This would not mean we had proven that Charles's 8 heads were actually the result of random chance; it would merely announce that the data was not inconsistent with the supposition that the result was due to the luck of the draw. Thus, a statistician would conclude that, because we obtained a P-value of greater than .05, we cannot say that the null hypothesis is probably false, and thus cannot say that the alternate hypothesis of cheating is probably true.
The overwhelming majority of courts have accepted as dogma a rule that a result with a P-value greater than .05, or equivalently one less than two standard deviations from the expected value, is not sufficient to disprove a null hypothesis of random chance. The Supreme Court itself has indicated that it leaned towards such an approach. In Castaneda v. Partida, the Court was attempting to determine whether Mexican Americans had been underrepresented on Texas grand juries. In comparing the percentage of Mexican Americans eligible to serve with those who did serve, the Court stated, "As a general rule for such large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis that the jury drawing was random would be suspect to a social scientist." As some have noted, the fact that the actual difference in that case was approximately 29 standard deviations, combined with the casualness of the Court's language in the footnote, indicates that the Court did not intend to fix a mandatory level for statistical significance.
Nonetheless, the vast majority of courts considering this question have opted for the security of replicating the classical statistical model. Without evaluating whether the concept of statistical significance is equivalent to the concept of legal significance, courts have generally appropriated the traditional statistical world view: "Social scientists, and in turn the courts, have adopted two standard deviations as a threshold measure of statistical significance."
In one case, African American employees attempted to prove discrimination by showing that whites had been promoted at a much higher rate. Out of 22 total promotions, only 5 went to blacks. Because 42% of the eligible workforce was black, it would have been expected, had there been no discrimination, that 42% of 22, or 9.24, of those promoted would have been black. To determine if the difference between the expected 9.24 and the actual 5 was statistically significant, the court used a null hypothesis of no discrimination, and calculated how likely it would be to see that many more whites than blacks hired if the difference were due entirely to random chance. The court announced that when the difference is less than 2 standard deviations, it is not statistically significant. Because the difference in this case amounted to 1.84 standard deviations, it did not reach the 2 standard deviation threshold. Consequently, the court concluded, the plaintiffs failed to prove a prima facie case of discrimination.
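The court's figure can be approximated with the binomial model such opinions implicitly rely on (the opinion's exact method is not specified; a standard binomial calculation is assumed here). Under that assumption the gap works out to roughly 1.83 standard deviations, in line with the 1.84 the court reported:

```python
from math import sqrt

promotions = 22      # total promotions at issue
black_share = 0.42   # black share of the eligible workforce
observed_black = 5   # promotions that went to black employees

# Under the null hypothesis of no discrimination, the number of black
# promotees is binomial: mean n*p, standard deviation sqrt(n*p*(1-p)).
expected = promotions * black_share
sd = sqrt(promotions * black_share * (1 - black_share))
z = (expected - observed_black) / sd

print(round(expected, 2), round(z, 2))  # 9.24 and roughly 1.83
```

The decisive fact is not the third decimal place but that the figure falls just short of 2, and therefore, under the borrowed convention, short of legal proof.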
Obviously, the court's decision to use two standard deviations was the critical element in its ruling on the existence of differential treatment. Yet an accurate understanding of this benchmark reveals that, once again, mathematically ignorant judges have ceded their responsibility to make normative policy judgments.
As a starting point, it must be noted that the origin of the .05 significance level was intuition, rather than rigorous mathematics. Statisticians, researching various industrial and agricultural problems, were attempting to show how their mathematical tools could help point to solutions that, when repeated, would prove to be successful. The founder of modern statistics, R.A. Fisher, wrote, almost cavalierly, that "it is convenient to draw the line at about the level at which we can say: Either there is something in the treatment, or a coincidence has occurred such as does not occur more than once in twenty trials." Fisher acknowledged that this was a subjective choice, not mandated by either science or math. Nonetheless, Fisher's choice quickly became the gold standard for statisticians.
Far more troubling than the birth of the .05 standard is that its use leads to a skewed balancing of the risks of different types of errors. Returning to our modified version of Heads You Win, if Sally wrongfully concluded that Charles had cheated when his result was merely the product of random chance, she would be making the Type I error of incorrectly condemning the innocent. If, on the other hand, she were to decide mistakenly that Charles's 8 heads were the result of random chance, though in reality he had cheated, Sally would be making the Type II error of incorrectly exonerating the culpable.
As Fisher wanted, the use of the .05 significance level results in decision-making in which the probability is extremely small that one will erroneously reject the null hypothesis, that is, believe that the result was caused by an unfair system when it was merely the product of random chance. When people in Sally's position conclude that someone was culpable, they will be wrong only 5% of the time.
Put slightly differently, if we imagined a large series of ten-toss experiments, the hypothesis of fairness would sometimes be rejected and sometimes not. With a P-value of .05, if we rejected the hypothesis of fairness 100 times, we would be mistaken, having treated a fair coin as unfair, only 5 times.
Perhaps surprisingly, the .05 significance level does not tell us about the other sort of error, the Type II error of failing to reject the hypothesis of fairness when in fact the coin toss was not fair. In our large series of coin tosses, we do not know how often people in Sally's position fail to identify a truly unfair coin.
The exclusive focus on reducing Type I errors does more than mask the existence of Type II errors. Even more problematically, the more we strive to reduce Type I errors, the greater will be the risk of Type II errors. With hypothesis testing it is impossible to reduce the risk of both Type I and Type II errors simultaneously; thus, a decrease in one results in an increase in the other. We can see this intuitively with the stringent beyond a reasonable doubt standard; fewer innocent people are convicted [Type I error], but more guilty parties are acquitted [Type II error]. In general, Type I and Type II risks are inversely related, since by reducing one we tend to increase the other.
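The inverse relationship can be made concrete with the coin-toss example. The sketch below assumes, purely for illustration, that a cheat's coin lands heads 55% of the time; as Sally raises the number of heads she demands before accusing, her Type I risk falls while her Type II risk rises:

```python
from math import comb

def tail(n, p, k):
    """Probability of k or more heads in n tosses of a coin with heads-probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 10
for threshold in (7, 8, 9):
    type1 = tail(n, 0.50, threshold)       # fair coin wrongly condemned
    type2 = 1 - tail(n, 0.55, threshold)   # 55%-heads coin wrongly exonerated (hypothetical cheat)
    print(f"accuse at >= {threshold} heads: Type I {type1:.3f}, Type II {type2:.3f}")
```

Each step up in the accusation threshold buys a smaller Type I risk at the price of a larger Type II risk, and the two do not move by equal amounts.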
The precise mathematical relationship between Type I and Type II errors, however, is not simple to calculate. Although the two types of error are linked, the relationship is not linear; a decrease in one type of error leads to an increase in the other, but not necessarily of an equal amount.
There is a statistical measure of the ability of a test to prevent Type II errors. Statisticians use the word power to describe the probability of properly rejecting the null hypothesis when the alternative hypothesis is correct. A high power means fewer Type II errors. It turns out that tests using the .05 significance level, while very effective at preventing Type I errors, generally have low power; they are not particularly good at preventing Type II errors.
Professors David Kaye and David Freedman give a useful example to demonstrate this tradeoff. An employer plans to use an examination to select trainees. To see whether there is a disparate impact, the employer administers the exam to a sample of 50 men and 50 women drawn at random from the population of job applicants. If the null hypothesis is that men and women pass the test at equal rates, and the P-value is set at .05, courts will mistakenly find a disparate impact, that is, incorrectly reject the null hypothesis, 5 times out of 100. But what if, in reality, the test does have a disparate impact, such that 55% of the men would pass but only 45% of the women would? In such a case, courts will mistakenly find no disparate impact when men and women pass the test at different rates almost 17 times out of 100. In other words, the probability of an incorrect exoneration is more than three times the probability of an incorrect condemnation.
The disparity may actually be even worse than that. Assume that tests and other hiring practices in this particular industry have always favored men, such that we may consider the probability that the test favored women to be negligible. Thus, any disparate impact would find men doing better than women. The p-value of .05 would then result in courts mistakenly finding a disparate impact, that is, incorrectly rejecting the null hypothesis, only 2.5 times out of 100. In such a case, the probability of an incorrect exoneration is more than six times the probability of an incorrect condemnation.
This level of difference between Type I and Type II errors is typical of hypothesis testing in general. One statistical model showed that, for a hypothetical employment discrimination case, when the risk of incorrectly condemning the innocent employer [Type I error] was set at .05, the risk of
incorrectly exonerating the discriminatory employer [Type II error] was approximately 50%, a ten-times greater risk.
Such a disparity has severe practical consequences for our justice system. Innocent employers will lose only one time out of 20, while injured employees lose half of the time. There may well be legitimate policy reasons for so allocating the risk of errors in certain circumstances, but courts rarely engage in this analysis. Far too many judges have been unable to see the policy choices inherent in the numbers. Instead, they meekly accept the .05 significance level as beyond their capacity to alter.
The consequence of this mathematical illiteracy is what has been termed an arbitrary balancing of the disutilities, or regrets, of Type I and Type II errors. But such a balancing should reflect a comparison of the social harms associated with each type of error. While judges have not appreciated the need to do this comparison for the justice system, statisticians realized early on the need to make their own value judgments that reflected the cost of different types of errors in their very dissimilar field.
As one of his students would later remark, R.A. Fisher vehemently denied the importance of Type II errors for the work of statisticians. Fisher himself wrote:
The notion of an error of the so-called second kind, due to accepting the null hypothesis when it is false may then be given a meaning in reference to the quantity to be estimated. It has no meaning with respect to simple tests of significance, in which the only available expectations are those which flow from the null hypothesis being true.
One of the reasons for the statistics community's general indifference to Type II errors is that, in the world of scientific evaluations for which these tests were designed, such errors are not final. Failing to reject the null hypothesis of no effect is not the same as accepting the proposition that a given substance or technique actually had no effect. The statistical failure to reject the null hypothesis is nothing more than cause to reserve judgment on the proposition. As R.A. Fisher explained: "It should be noted that the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis."
In this light, the differing harms for a statistician or scientist of a Type I versus a Type II error become obvious. When a Type I error is made, and a null hypothesis of no effect is incorrectly rejected, a scientist has mistakenly declared that a predictive rule of nature rests on the test results. By contrast, when a Type II error is made and an erroneous null hypothesis of no effect is not rejected, no definitive statement has been made. Moreover, scientists are free to conduct countless further studies which might reveal the truth. Thus, the .05 level reflects the social scientist's conservatism with respect to Type I error.
Judges should have a markedly different view of the comparative costs of Type I and Type II errors. A judge, unlike a scientist, is not just deferring decision until more research becomes available. Rather, a judge is selecting a specific course of action that definitively resolves important social and legal rights. Most significantly, a judicial Type II error, failing to reject an erroneous null hypothesis of no effect or no discrimination, is actually a legal acceptance of that false premise. When such a Type II error occurs, a truly harmed plaintiff is denied relief.
Nonetheless, courts should be somewhat cautious about finding liability every time numbers vary from expected values. When one contemplates all of the events that individually are unlikely and considers them all together, it would be very unlikely for unlikely events not to occur. Accordingly, unless one would find a result truly surprising, we should not reject the possibility that it was the product of random chance.
Thus, in our modified version of Heads You Win, Sally might not be willing to accuse Charles of cheating just because he obtained 8 heads out of 10 tosses. With a greater than 10% chance that his result occurred due to random chance, she might well prefer avoiding the risk of making the Type I error of condemning the innocent and risk the Type II error of exonerating the culpable. One reason for such a calibration is that there was no evidence of Charles's malfeasance except for the numbers themselves. Some have termed the situation where the only evidence is statistical "naked statistical evidence."
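The greater-than-10% figure is easy to confirm: counting outcomes at least as lopsided as 8 heads out of 10, in either direction, gives a two-sided probability just under 11% for a fair coin. A minimal check:

```python
from math import comb

def at_least(k, n=10, p=0.5):
    """Chance of k or more heads in n tosses of a coin with heads-probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

one_sided = at_least(8)       # 8, 9, or 10 heads: about 5.5%
two_sided = 2 * one_sided     # a result equally extreme in either direction: about 10.9%

print(f"one-sided {one_sided:.4f}, two-sided {two_sided:.4f}")
```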
The best-known hypothetical involving naked statistical evidence is the Blue Bus case, in which a driver is struck by a bus, and the only evidence is that 80% of the buses that run on the road where the accident occurred are operated by the Blue Bus Company. It is generally conceded that the Blue Bus Company will avoid liability for the accident, even though, based on the statistics, the Company is more likely liable than not.
The Blue Bus case presents the issue of how to consider the different risks of error. The Type I error would be to accept the statistical argument and find the Blue Bus Company liable when it was really innocent. The Type II error would be to overlook the statistical evidence and find the Blue Bus Company not liable when it was really culpable. If we assume that bus accidents are proportional to the number of buses each bus company owns, we can calculate how often we would make each type of error, depending on whether we accept the naked statistical evidence or not. If we would always find the Bus Company liable based on the statistical evidence alone, then we would find it liable 100% of the time, even though it only accounted for 80% of the accidents. We would wrongly find liability, the Type I error, 20% of the time. If we never found the Bus Company liable based on the statistical evidence, then we would never find it liable, even though it accounted for 80% of the accidents. We would wrongly find no liability, the Type II error, 80% of the time. By ignoring the statistical evidence, we are saying that we are willing to make four times as many Type II errors as Type I errors.
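The error rates of the two blanket policies follow from simple arithmetic. A toy tally, assuming a hypothetical 1,000 accidents distributed in proportion to fleet share:

```python
accidents = 1000
blue_share = 0.80                             # Blue Bus share of buses (and, by assumption, of accidents)
blue_caused = int(accidents * blue_share)     # 800 accidents actually caused by Blue Bus
other_caused = accidents - blue_caused        # 200 caused by other carriers

# Policy 1: always find Blue Bus liable on the statistics alone.
type1_rate = other_caused / accidents         # innocent Blue Bus held liable

# Policy 2: never find liability on the statistics alone.
type2_rate = blue_caused / accidents          # culpable Blue Bus exonerated

print(type1_rate, type2_rate, type2_rate / type1_rate)
```

Refusing the statistical proof trades a 20% Type I risk for an 80% Type II risk, the four-to-one ratio noted above.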
One of the more common justifications for not finding liability in the Blue Bus scenario is that "the plaintiffs' failure to adduce some further evidence appears unjustified, because such evidence should be available to them at little cost." Thus, the lack of non-statistical evidence of guilt is itself evidence of innocence.
But that rationale does not cover what the hypothetical implies, the situation where there is no other evidence to be had. Assume that the accident was not severe enough to cause damage to the bus because the victim was a pedestrian, it was a rainy evening so that any blood would have washed off the bus, and that all that a review of other evidence revealed was a confirmation that 80% of the buses which could have caused the accident were owned by the Blue Bus Company. Or, we could consider a case where all of the alternate evidence is destroyed through no fault of either party.
There would still be great reluctance to let the numbers, by themselves, prove liability in such cases. Why? One way to think about this is to consider cases where the numbers do prove liability.
An employer can be found to have violated Title VII of the Civil Rights Act of 1964 based on statistics alone that show that a particular employer practice had a disparate impact based on race or gender. The reason naked statistical evidence suffices for a finding of liability is that Title VII prohibits not only overt discrimination but also practices that are "fair in form, but discriminatory in operation." Numbers are quite capable of communicating that a discriminatory result occurred. The numbers are not being offered into evidence to show a discriminatory intent on the part of the employer.
In extreme cases, however, numbers can indeed be a window into human motivations. For example, in Yick Wo v. Hopkins, the Court found a constitutional violation from the fact that more than 200 Chinese laundry owners had been denied permits to operate their businesses, while 80 non-Chinese owners were granted licenses. According to the Court, this numerical disparity was sufficient to require the conclusion that the licensing decisions had been made "with a mind so unequal and oppressive as to amount to a practical denial by the State of that equal protection of the laws."
Similarly, the Supreme Court in Gomillion v. Lightfoot concluded that an extreme numerical discrepancy could reveal a bigoted mind. In reviewing a redistricting plan for Tuskegee, Alabama, which removed from the city all but four or five of its 400 African-American voters without removing a single white voter, the Court declared that "the conclusion would be irresistible, tantamount for all practical purposes to a mathematical demonstration, that the legislation is solely concerned with segregating white and colored voters by fencing Negro citizens out of town so as to deprive them of their pre-existing municipal vote."
Illegal motivation can also be inferred numerically by what the Court has termed "the inexorable zero." Under this doctrine, the fact that an employer had hired no women or minorities when some were arguably available would lead to an inference of discriminatory motive. In a sex discrimination suit against a drug store company, where evidence showed that the company had hired hundreds of male manager trainees and not chosen a single woman during a seven and one-half year period, the Fifth Circuit Court of Appeals rejected the employer's contention that zero is just a number. The court explained that zero carries special significance in discerning firm policies and attitudes, because even the hiring of two or three women would indicate at least some willingness to consider women as equals in firm management. The total absence of such hiring, in contrast, indicates an unwillingness to view women as equals, and accordingly has led courts to be "particularly dubious of attempts by employers to explain away the inexorable zero when the hiring columns are totaled."
But absent extreme cases, courts are properly reluctant to find either bad motive or wrongful actions based on naked statistics as the simple luck of the draw might well supply the reason for a numerical disparity. In such a case, with no other evidence available, we can treat random chance as a plausible alibi witness. Thus, courts should be particularly concerned with avoiding the Type I error of incorrectly finding an improper motive or action in the absence of other evidence, even if that concern leads to an increase in Type II errors of exonerating the culpable.
Evidence of wrongdoing, however, should lead a court to recalibrate its view of the proper balance between Type I and Type II errors. Imagine that we knew that the Blue Bus Company's drivers had been drinking at a bar before they began their routes that evening. With some evidence of wrongdoing, the no-longer naked statistical evidence begins to look more convincing.
Returning to Heads You Win, suppose that before Charles had tossed his 8 heads, he was seen filing the edge of his quarter. Assume that his filing was not particularly extreme and resulted in a coin that landed heads 55% of the time. If we used a significance level of 95% to determine whether Charles's 8 heads resulted from random chance, we would be more than six times as likely to mistakenly clear him when his cheating caused his advantage as we are to wrongly condemn him when his advantage was due to random chance.
But why should we accept that imbalance of risks after we have determined that wrongdoing has occurred? Whether judges admit it or not, such an imbalance of risks represents a value judgment as to the relative evil that would result from the different sorts of error. In non-mathematical cases, such as termination of parental rights or permitting the withdrawal of life-sustaining treatment, courts have had little difficulty recognizing that the allocation of the risk of error between two outcomes in a particular case must properly reflect their relative severity. Courts accordingly have understood that it is their responsibility to make the value judgment as to whether an incorrect finding of fault would produce consequences as undesirable as the consequences that would be produced by an incorrect finding of no fault.
By passively accepting the 95% standard for significance, though, judges are acquiescing in a value judgment that the social disutility of wrongful inculpation is many times greater than the social disutility of wrongful acquittal. The calculation of social disutility, however, is properly for courts to make and should vary with the situation. The selection of the significance level should be made to reflect that judicial evaluation.
Judges must realize that they are free to choose significance levels other than 95%. As noted in the Reference Guide on Statistics, "[a]lthough 95% confidence intervals are used commonly, there is nothing special about 95%." Indeed, as one court noted, "[d]ifferent levels of significance may be appropriate for different types of studies depending on how much risk one is willing to accept that the conclusion reached is wrong."
How much risk we are willing to accept must be based on the values of the legal system. Unlike scientific inquiry, the law is oriented toward the just resolution of cases rather than truth-finding. While ascertaining the truth is a deep value of our legal system, it is often an unobtainable goal, since verdicts must be rendered even when information is incomplete. In the face of incomplete information, errors are inevitable. The legal system has long recognized that justice requires a different weighting of risks of error after a finding of wrongdoing by one of the parties.
For example, in numerous cases, courts have shifted the burden of proof of causation away from an innocent plaintiff and onto the negligent defendant, when requiring the plaintiff to prove causation would be both unfair and destructive of the deterrent purposes embodied in the concept of duty of care. One such case is Kingston v. Chicago & N. W. R. Co., in which the plaintiff's property had been damaged when two fires united. One fire was attributable to the negligence of the defendant railroad; the other was of unknown origin. The court held that the defendant should have the burden of proving that the fire set by him was not the proximate cause of the damage. The reason for this burden-shifting, according to the court, was that forcing the plaintiff to prove causation in such a situation "would certainly make a wrongdoer a favorite of the law at the expense of an innocent sufferer."
Similar reasoning was employed in Sindell v. Abbott Laboratories, a case in which children injured by their mothers' ingestion of DES sued several DES manufacturers and were unable to identify who manufactured the drugs taken by their mothers. The California Supreme Court held that, even though the plaintiffs could not show which defendant caused their injuries, each defendant would be presumptively liable for a portion of the judgment based on its share of the market. The court gave two reasons for removing the burden of proving causation from the plaintiffs. First, the court said that justice required balancing the risks of uncertainty between the parties: "[A]s between an innocent plaintiff and negligent defendants, the latter should bear the cost of the injury."
Second, the court noted that tort law is designed to deter wrongful conduct. In Sindell, the DES manufacturers had known that there was a grave danger that DES could cause cancer in the daughters of pregnant women who took the drug, yet continued to market the drug as a miscarriage preventative. The manufacturers also failed to test DES for safety and ignored tests performed by others that indicated that the drug was not safe.
In light of such wrongdoing by the defendants, the court ruled that, from a broader policy standpoint, they should be forced to carry the burden of proof of causation. The DES manufacturers were in "the best position to discover and guard against defects in its products and to warn of harmful effects; thus, holding it liable for defects and failure to warn of harmful effects will provide an incentive to product safety."
A similar shift in the burden of proving causation has occurred in employment discrimination cases. If an employer is shown to have engaged in a discriminatory hiring pattern and practice, individual employees need not prove that the employer's discrimination was the cause of the adverse treatment they personally received. Instead, there is a rebuttable presumption that an employee was the victim of the employer's discriminatory practices, and the burden shifts to the employer to overcome that presumption for each employee.
The Supreme Court has explained this burden-shifting in part on the normative rationale that the employer who has committed a pattern of discrimination can no longer be viewed as a presumptively benign actor. It is appropriate to shift the burden of proving individual causation to the employer because "the finding of a pattern or practice changed the position of the employer to that of a proved wrongdoer." On a practical and logical level, that change to the status of proved wrongdoer denied the employer the ability to claim that there is no reason to believe that its individual employment decisions were discriminatorily based; it has already been shown to have maintained a policy of discriminatory decisionmaking. In the case of a proved discriminatory pattern, an employee's detrimental treatment should be presumed to have been caused by discrimination, since there was a greater likelihood that any single decision was a component of the overall pattern.
An analogous burden-shifting should occur in cases involving statistical evidence. The allocation of the risk of error should not be the same when Charles is a proved wrongdoer, guilty of filing the coin, as when there is no evidence that he committed any wrongdoing. After his attempt to alter the coin, there is undoubtedly a greater likelihood that the coin toss was not a fair one.
We cannot directly calculate the effect this increased likelihood of an unfair toss has on the statistical probability previously obtained. In theory, Bayes' theorem should provide an equation that permits the combination of the probability that Charles altered the coin with the statistical probability in a neat mathematical formula. Unfortunately, Bayes' formula cannot provide us that information, since determining the increased probability that the coin tosses were unfair is necessarily subjective. Because the assessment of that increased probability is nothing more than one's imprecise, non-scientific belief as to the likelihood that Charles succeeded in his attempt to alter the coin, there is no number we can plug into the equation to get the correct probability. This is, then, one of those instances where Bayes' theorem is most useful as a heuristic device, reminding us that the persuasive power of statistical evidence depends, in part, on whether it is consistent with or at variance with the indications we can draw from the other relevant evidence we may have. Taking heed of that lesson, the task becomes determining the proper way to combine the hard numbers from the statistical test with the information that the defendant is a proven wrongdoer.
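Used purely as a heuristic, Bayes' theorem can still show the direction and rough size of the update. The sketch below plugs in an avowedly subjective prior (even odds that Charles's filing worked) and the hypothetical 55%-heads coin; neither number has any claim to precision:

```python
from math import comb

def at_least(k, p, n=10):
    """Probability of k or more heads in n tosses with heads-probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

prior_odds = 1.0      # subjective guess: even odds that the filing produced an unfair coin

# How much likelier is a run of 8+ heads from the filed coin than from a fair one?
likelihood_ratio = at_least(8, 0.55) / at_least(8, 0.50)

# Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * likelihood_ratio

print(f"likelihood ratio {likelihood_ratio:.2f}, posterior odds {posterior_odds:.2f}")
```

Whatever prior one chooses, the evidence of tampering pushes the odds of an unfair toss upward; the equation cannot supply the prior, which is exactly the subjectivity noted above.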
The question can be better approached by considering the issue of securities fraud. Under Rule 10b-5, it is illegal to manipulate stock prices by making false or misleading material statements. In order to prevail on a Rule 10b-5 claim, plaintiffs must show that they relied on those false or misleading statements to their detriment. In Basic Inc. v. Levinson, the Supreme Court ruled that a rebuttable presumption of reliance could be created through the economic theory known as fraud-on-the-market.
According to the Court, "the fraud on the market theory is based on the hypothesis that, in an open and developed securities market, the price of a company's stock is determined by the available material information regarding the company and its business. Misleading statements will therefore defraud purchasers of stock even if the purchasers do not directly rely on the misstatements." In other words, once a plaintiff proves the false or misleading statements, the next step is to show that those statements affected the market price of the stock to the plaintiff's detriment.
Generally, economists conduct what is known as an event study to compare the actual return on a stock directly after the misleading statement is given with the predicted return, a mathematical estimate of what the return would have been absent such statement. This estimate is based on a statistical technique called regression analysis, which factors in both the firm's average return during some control period as well as any contemporaneous market-wide influences, such as news affecting the entire relevant market which would likely have affected the firm in question as well. The difference between the actual return and the predicted return is called the abnormal return.
Some variation in returns can be expected, of course, due to the random chance of a volatile market. A large abnormal return, however, indicates that it is unlikely that the market was unaffected by the misleading statement. To determine if the variation in return was caused by random chance, economists use traditional hypothesis testing.
First, they establish a null hypothesis that the misleading statements had no effect on the market. Based on the number of standard deviations that the abnormal return is from the predicted return, they can then calculate the probability of seeing an abnormal return of such magnitude based purely on chance.
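A bare-bones version of such an event study might look like the following sketch. The return figures are invented for illustration, and the market-model regression here is the simplest possible form of the technique economists actually use:

```python
import math

# Control period: ten days of market returns and the firm's returns (invented numbers).
market = [0.010, -0.005, 0.007, 0.002, -0.012, 0.004, 0.009, -0.003, 0.006, -0.008]
firm   = [0.016, -0.009, 0.005, 0.006, -0.011, 0.001, 0.014, -0.006, 0.009, -0.007]

n = len(market)
m_bar = sum(market) / n
f_bar = sum(firm) / n

# Market-model regression: firm return = alpha + beta * market return.
beta = sum((m - m_bar) * (f - f_bar) for m, f in zip(market, firm)) / \
       sum((m - m_bar) ** 2 for m in market)
alpha = f_bar - beta * m_bar

# Residual spread measures ordinary day-to-day noise (n - 2 degrees of freedom).
residuals = [f - (alpha + beta * m) for m, f in zip(market, firm)]
sigma = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Event day: the firm's actual return after the statement versus the model's prediction.
event_market, event_firm = 0.004, 0.020
predicted = alpha + beta * event_market
abnormal = event_firm - predicted     # the "abnormal return"
z = abnormal / sigma                  # abnormal return in standard deviations

print(f"beta {beta:.2f}, abnormal return {abnormal:.4f}, {z:.1f} standard deviations")
```

The number of standard deviations then feeds the hypothesis test described above: the larger it is, the less plausible the null hypothesis that the statement had no effect.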
Assume that a corporation gives out deliberately misleading information implying high quarterly profits. Stock prices rise, but the company subsequently announces a negative earnings report for that quarter, and the price of the stock plummets. Shareholders who had purchased after the dissemination of the misleading information but before the earnings report was released sue the corporation, and a regression analysis shows that there is only a 10% or 15% probability of seeing an abnormal return as large as that experienced by the company based purely on chance.
The question of whether those statistics indicate that the market was indeed affected by the misleading statements will turn on our choice of a significance level. According to four of the leading scholars in this area, Jonathan Macey, Geoffrey Miller, Mark Mitchell, and Jeffry Netter, the classic P-value of .05 should be utilized: We suggest choosing a significance level such that the probability of a Type 1 error is less than 5%; this is a standard level used by researchers in finance and economics.
The Type I error with which they are concerned is that of finding an effect on the market when the abnormal return was in fact the result of random chance. Their proposal, however, completely disregards the probability of a Type II error, finding no effect on the market when the misleading statement actually caused harm. As we have seen, the .05 level leads to a decision-making regime in which the probability of an incorrect exoneration far exceeds the probability of an incorrect condemnation.
If we utilized the traditional confidence level of 95%, the statistician would conclude that the statistics were not significant and thus not proof that the misleading information affected the market. But what if a different confidence level had been selected, say 80%? Now, the statistics would be found to be significant and proof that the market was affected.
The usual response to a suggestion of an 80% significance level is that it would permit too high a rate of error. That concern, of course, only refers to the Type I error of finding an effect on the market when the abnormal return was really the result of random chance. But if our concern is avoiding too high a rate of Type II error, the 95% significance level is also suspect. In fact, there is an arbitrariness to choosing any magic number for a significance level, especially when we are concerned with both types of errors.
The ideal solution would be to use a significance test that had the effect of equalizing Type I and Type II errors. Such an approach would reflect the assumption implicit in the preponderance of the evidence standard for civil trials in general, that Type I and Type II errors impose essentially equal costs on society. It would also equalize the cost of wrong judgments so that the system as a whole would favor neither plaintiffs nor defendants.
One mathematical problem with implementing such a solution is that Type II errors often cannot be calculated. To understand why, consider this story of a challenge to a grand jury pool. In a community in which 38% of those eligible to serve were African American, eighteen persons were selected by local jury commissioners to serve on the grand jury; 3 of those selected were African American [17%] and the other 15 were white. The question is whether the disparity between the expected percentage and the actual percentage was the result of random chance or a discriminatory selection process.
With the usual null hypothesis of random chance, the P-value for this situation is .051, which would not be statistically significant at the traditional .05 level. That .05 level, remember, represents a 5% risk of the Type I error of incorrectly condemning the innocent.
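The .051 figure is consistent with the normal approximation to the binomial (with a continuity correction), sketched below. An exact binomial tail comes out slightly smaller, a reminder that near the threshold the choice of computational method can itself decide significance:

```python
import math

n, p, observed = 18, 0.38, 3         # 18 grand jurors, 38% eligible pool, 3 selected

mean = n * p                          # 6.84 expected African American jurors
sd = math.sqrt(n * p * (1 - p))       # binomial standard deviation, about 2.06

# Normal approximation with continuity correction: P(X <= 3) ~ P(Z <= (3.5 - mean)/sd).
z = (observed + 0.5 - mean) / sd
p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Exact binomial lower tail, for comparison.
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(observed + 1))

print(f"approx {p_value:.3f}, exact {exact:.3f}")
```

The approximate tail sits just above .05 while the exact tail sits just below it, underscoring how arbitrary a bright-line cutoff can be.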
Calculating the risk of the Type II error of incorrectly exonerating discriminatory jury commissioners is impossible, since we do not know how discriminatory they were. As one commentator noted, there are many possible degrees of inequality or disadvantage. The commissioners could be enthusiastic bigots, trying to keep every black off, or they could be subtler bigots, just tilting the playing field slightly. Intuitively, it is easier to mistakenly overlook the discriminator who only marginally affects the selection process than one whose results are far more blatant.
Statisticians capture that variability with what is known as the power function. The power function represents the differing probabilities of rejecting the null hypothesis of no effect for the full range of possible actual effects. In the grand jury case, for example, the power function would reveal that use of the 95% confidence level will very rarely identify a small discriminatory effect. The test has a somewhat improved chance of identifying larger discriminatory effects but still does not have a better than even chance of correctly detecting [discrimination] unless the list is so grossly biased that a black's chance of appearing on a grand jury is diluted by some sixty percent from what it would be absent any discrimination.
It is obvious in the grand jury example that use of the 95% significance level will create a high risk of overlooking discriminatory conduct. We cannot simply say equalize Type I and Type II errors, however, since there are multiple probabilities for different Type II errors.
One way to fulfill the spirit of equalizing Type I and Type II errors is to borrow from the concept known as baseball or final offer arbitration. In baseball arbitration, two parties each submit an amount meant to represent a fair resolution of a dispute and the arbitrator must choose between those two. Thus, in a salary dispute, the team owner and player each select a proposed salary and the arbitrator must select one or the other as the fairer figure.
For hypothesis testing, each side could propose its own hypothesis. Thus, the jury commissioners charged with discrimination would provide the null hypothesis; they would contend that their conduct had no effect on the actual numbers that appeared. Those challenging the commissioners would need to select some level of discrimination they contend occurred as the alternate hypothesis. From those two possibilities, it is easy to create a significance level that equalizes Type I and Type II errors. If the statistical analysis revealed a probability of seeing the observed disparity greater than that significance level, the statistics would not disprove the commissioners' hypothesis that their conduct had no effect. If, however, the analysis revealed a smaller probability of seeing that disparity, the challengers' alternate hypothesis, with its specified level of discrimination, would be accepted.
The baseball arbitration approach would be particularly valuable in cases such as the securities fraud example discussed earlier. A corporation that had deliberately released misleading information, that is, a proved wrongdoer, would contend for its null hypothesis that its statements had no effect on stock prices. The plaintiff shareholders would then select some level of price increase that they claimed was caused by the misleading statements as their alternate hypothesis. Again, from those two hypotheses, a significance level could be selected to balance the risk of mistakenly finding an effect from a harmless statement against the risk of erroneously excusing the company whose fraud affected the market. Note that the smaller the alleged effect, the lower the equalizing significance level will need to be, making it easier for the plaintiff to prove an effect. Of course, the smaller the alleged effect, the lower the plaintiff's damages will be. To prevent the manipulation of statistics with de minimis alleged effects, courts could require that any effect be of practical or substantive significance.
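The baseball-arbitration idea can be sketched in Python with hypothetical numbers (a venire of 100, a 25% null share proposed by the commissioners, and a 15% share proposed by the challengers): the code searches for the rejection cutoff that makes the two risks of error as nearly equal as possible.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def equalizing_cutoff(n, p_null, p_alt):
    """Choose the rejection cutoff c (reject H0 when X <= c) that makes the
    Type I error P(X <= c | p_null) as close as possible to the
    Type II error P(X > c | p_alt). Assumes p_alt < p_null."""
    best_c, best_gap = 0, float("inf")
    for c in range(n + 1):
        alpha = binom_cdf(c, n, p_null)        # condemning the innocent
        beta = 1 - binom_cdf(c, n, p_alt)      # exonerating the culpable
        if abs(alpha - beta) < best_gap:
            best_c, best_gap = c, abs(alpha - beta)
    return best_c

# Hypothetical dispute: commissioners' null of a 25% share, challengers'
# alternate of a 15% share, venire of 100.
c = equalizing_cutoff(100, 0.25, 0.15)
alpha = binom_cdf(c, 100, 0.25)
beta = 1 - binom_cdf(c, 100, 0.15)
```

At the equalizing cutoff the chance of condemning innocent commissioners and the chance of exonerating discriminating ones are roughly the same, rather than being fixed at the lopsided split the conventional 95% level produces.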
Equalizing Type I and Type II errors would do more than simply balance the risks of error. It would also reflect a policy judgment that the harm from the risk of error to an innocent investor is deemed as important as the harm of the risk of error to a proven wrongdoer.
Sometimes, however, the nature of the statistical analysis being used makes it difficult to create a power function, and thus impossible to equalize the risk of error. Additionally, there may be judges who understand the necessity of improving the balance between Type I and Type II errors but still feel wedded to the 95% confidence level. For both situations, there is another approach: when dealing with a proved wrongdoer, courts should utilize a one-tailed rather than the more traditional two-tailed analysis.
To understand the distinction, recall that hypothesis testing determines the probability of obtaining a result as extreme as the one actually seen if the null hypothesis of no effect were correct. In Heads You Win, the two-tailed test revealed how likely it was for Charles to toss 8 or more heads or 8 or more tails. A one-tailed test would focus on the narrower question of how likely it was for him to toss 8 or more heads.
On a strictly numerical basis, it is usually easier to find statistical significance with a one-tailed test. A one-tailed test at the 95% confidence level is generally equivalent to a two-tailed test at the 90% level. That would mean that Type I errors would be twice as likely, and Type II errors would be less likely [though by an uncertain amount]. Thus, Type I and Type II errors would be more nearly in balance; there is a greater chance of condemning the innocent and a lower risk of exonerating the culpable.
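That numerical relationship can be checked directly. A minimal Python sketch, assuming ten tosses in the Heads You Win example (the number of tosses is an assumption; the excerpt does not restate it):

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10  # assumed number of tosses
# One-tailed: probability of 8 or more heads.
one_tailed = sum(binom_pmf(k, n) for k in range(8, n + 1))
# Two-tailed: 8 or more heads OR 8 or more tails (i.e., 2 or fewer heads).
two_tailed = one_tailed + sum(binom_pmf(k, n) for k in range(0, 3))
```

With ten fair tosses, the two-tailed probability is exactly twice the one-tailed one (about 10.9% versus 5.5%), so a result significant at the 90% two-tailed threshold is significant at the 95% one-tailed threshold.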
In the case of a proved wrongdoer, this improved balance is important. Some courts, however, have rejected one-tailed tests precisely because they make it easier for plaintiffs to show statistical significance. One court explained its preference for a two-tailed test in an employment discrimination case on the grounds that a one-tailed test "only indicates whether blacks are treated worse than whites," while a two-tailed test "demonstrates whether blacks and whites were treated equally," taking into account both whether whites are treated as well as or better than blacks and vice versa.
That argument only makes sense, though, if there is reason to believe that the discrimination could have gone in either direction. If there is other evidence of anti-black bias by an employer, it would be naïve at best to pretend that whites are as likely to be harmed as African Americans. For example, the Supreme Court has stated that where a school board acted with a discriminatory motive in part of the school system, there is "a high probability that similar impermissible considerations have motivated their actions in other areas of the system." With evidence of discriminatory intent, a one-tailed test is a far more accurate tool for assessing how likely the extreme result was to occur: one-tailed tests are most appropriate when one population is consistently overselected over another.
Courts should be able to act on the reasonable presumption that a proven wrongdoer who was attempting to create an effect in one direction did not create an effect in the opposite direction. Thus, if Charles is filing the coin in order to increase the number of heads he tosses, we should not assume he increased the number of tails he tossed instead. Similarly, in the case of the securities fraud, if we determine that the deliberately misleading statements were made in an attempt to raise stock prices, it would be perfectly reasonable to assume they did not lower prices.
Certainly it is conceivable, though unlikely, that a truly incompetent wrongdoer was completely counterproductive. When courts use a two-tailed test, they are essentially giving an unjust benefit of the doubt to the party who was actively committing wrongful acts. To use the two-tailed test in such a situation would certainly make a wrongdoer "a favorite of the law at the expense of an innocent sufferer." In the name of justice, as well as probability, courts should be willing to analyze cases involving a proven wrongdoer by use of a one-tailed test.
VI. Conclusion
More than 40 years ago, the California Supreme Court admonished: "Mathematics, a veritable sorcerer in our computerized society, while assisting the trier of fact in the search for truth, must not cast a spell over him." It turns out that mathematics is not merely a sorcerer but a bully as well, seizing the power to make policy judgments that belong to the courts.
In the misguided name of mathematical rigor, courts have allowed race talk to enter our criminal trials, and prejudice to reduce tort awards. With judicial acquiescence, information universally acknowledged to be incorrect and baseless has been used to calculate the probabilities of paternity. Moreover, known wrongdoers have been permitted to escape liability because judges have abdicated their responsibility of balancing the risks of error that are an inevitable part of any trial. Outside the realm of mathematics, judges would never accept such policies.
It is not necessary for judges to become amateur mathematicians in order to reclaim their rightful role. However, they must be aware that the apparent objectivity of mathematics often masks subjective judgments, and they must not be fooled when hard numbers are really based on little more than intuition and guesswork. Numbers can communicate important information. Judges just need to make sure that they are able to comprehend what those numbers are trying to say.
See text accompanying notes ---, infra.
See text accompanying notes ---, infra.
See text accompanying notes ---, infra.
Peter A. Coclanis, History by the Numbers: Why Counting Matters, 7 Magazine of History, 5, 8 (1992).
See generally, Marilyn Burns. Math: Facing an American Phobia (1998).
Michael J. Saks, Legal Policy Analysis and Evaluation, 44 Am. Psychol. 1110, 1115-16 (1989).
Jeremy A. Blumenthal, Law and Social Science in the Twenty-First Century, 12 S. Cal. Interdis. L.J. 1, 1 (2002). See also Richard A. Posner, An Economic Approach to the Law of Evidence, 51 Stan. L. Rev. 1477 (1999), quoting Darrin R. Lehman, Richard O. Lempert & Richard E. Nisbett, The Effects of Graduate Training on Reasoning: Formal Discipline and Thinking About Everyday-Life Events, 43 Am. Psychologist 431, 440 (1988)(stating legal education itself (alas) produces no improvement in the ability to apply the statistical and methodological rules of the probabilistic sciences to either scientific studies or everyday-life events.).
Jonathan J. Koehler, The Probity/Policy Distinction in the Statistical Evidence Debate, 66 Tul. L. Rev. 141, 148-9 (1991), quoting Richard A. Posner, The Decline of Law as an Autonomous Discipline: 1962-1987, 100 Harv. L. Rev. 761, 778 (1987). As Justice Breyer has noted, judges are not scientists and do not have the scientific training that can facilitate the making of such decisions. GE v. Joiner, 522 U.S. 136, 148 (1997) (Breyer, J., concurring).
Jonathan J. Koehler, The Probity/Policy Distinction in the Statistical Evidence Debate, 66 Tul. L. Rev. 141, 141 (1991). See also GE v. Joiner, 522 U.S. 136, 149 (1997) (Breyer, J., concurring)(stating cases presenting significant science-related issues have increased in number. See also Judicial Conference of the United States, Report of the Federal Courts Study Committee 97 (Apr. 2, 1990) (stating Economic, statistical, technological, and natural and social scientific data are becoming increasingly important in both routine and complex litigation.).
Statistical assessments are prominent in many kinds of cases, ranging from antitrust to voting rights. Statistical reasoning can be crucial to the interpretation of psychological tests, toxicological and epidemiological studies, disparate treatment of employees, and DNA fingerprinting; this list could easily be extended. See David H. Kaye and David A. Freedman, Reference Guide on Statistics, in Federal Judicial Center, Reference Manual on Scientific Evidence, 83, 85 (2000)(hereinafter Kaye and Freedman, Reference Guide on Statistics). See e.g. Anderson v. Westinghouse Savannah River Co., 406 F.3d 248 (4th Cir. 2005), cert. den., 546 U.S. 1214 (2006)(racial discrimination in employment); Estate of Boncher v. Brown County, 272 F.3d 484 (7th Cir. 2001) (Due Process violated by lax prison security); Chavez v. Illinois State Police, 251 F.3d 612 (7th Cir. 2001)(ethnic profiling in traffic stops); Citizens Fin. Group, Inc. v. Citizens Nat'l Bank, 383 F.3d 110 (3d Cir. 2004), cert. den., 544 U.S. 1018 (2005) (trademark reverse confusion case); Currier v. United Techs. Corp., 393 F.3d 246 (1st Cir. 2004)(age discrimination claim); In Re Hanford Nuclear Reservation Lit., 292 F.3d 1124, 1133 (9th Cir. 2002)( Causation in toxic tort personal injury case); Everett v. Everett, 150 Cal.App.3d 1053, 201 Cal.Rptr. 351 (1984)(paternity suit); Lehocky v. Tidel Technologies Inc., 220 F.R.D. 491 (S.D. TX 2004)(securities fraud); Marvin Lumber & Cedar Co. v. PPG Indus., Inc., 401 F.3d 901 (8th Cir. 2005)(breach of warranty claim); U.S. v. Blaine County, 363 F.3d 897 (9th Cir. 2004), cert. den. 544 U.S. 992 (2005) (Voting Rights Act claiming that at-large election system diluted vote of Native Americans); United States v. Wright, 215 F.3d 1020 (9th Cir.), cert. den., 531 U.S. 969 (2000)(DNA match of bank robber).
Kumho Tire Co., Ltd. v. Carmichael, 526 U.S. 137, 141 (1999). See also Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 586 n.7 (1993)(referring to the judge as having a "gatekeeping responsibility"). Judges are to perform this gatekeeping function for testimony based on "scientific, technical, and other specialized knowledge." Kumho Tire Co., Ltd., 526 U.S. at 141.
509 U.S. 579, 586 n.7 (1993).
Id., 509 U.S. at 597. Rule 702 of the Federal Rules of Evidence states: If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise." Fed. R. Evid. 702.
Id., 509 U.S. at 593.
GE v. Joiner, 522 U.S. 136, 153 (1997)(Stevens, J., dissenting).
Frye v. United States, 54 App. D.C. 46, 47, 293 F. 1013, 1014 (1923). While 28 states (Alaska, Arkansas, Colorado, Connecticut, Delaware, Idaho, Indiana, Iowa, Kentucky, Louisiana, Maine, Michigan, Mississippi, Montana, Nebraska, New Mexico, North Carolina, Ohio, Oklahoma, Oregon, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Vermont, West Virginia, and Wyoming) have expressly adopted the Daubert standard, at least 15 states (Arizona, California, District of Columbia, Florida, Illinois, Kansas, Maryland, Michigan, Minnesota, Mississippi, Missouri, Nebraska, New York, North Dakota, Pennsylvania, and Washington) continue to utilize the Frye standard. See Alice B. Lustre, Post-Daubert Standards for Admissibility of Scientific and Other Expert Evidence in State Courts, 90 A.L.R.5th 453 (2008).
Daubert, 509 U.S. at 586.
GE v. Joiner, 522 U.S. at 142.
Daubert, 509 U.S. at 594.
Kumho Tire Co., Ltd., 526 U.S. at 149.
Id.
Id., 526 U.S. at 149-50, summarizing Daubert, 509 U.S. at 592-594.
Daubert v. Merrell Dow Pharm., Inc., 43 F.3d 1311, 1316 (9th Cir. 1995) [hereinafter Daubert II].
GE v. Joiner, 522 U.S. at 153 (Stevens, J., dissenting).
Kumho Tire Co., Ltd., 526 U.S. at 151. Necromancy is the practice of supposedly communicating with the spirits of the dead in order to predict the future. The American Heritage Dictionary of the English Language, (4th Ed. 2004).
272 F.3d 484, 487 (7th Cir. 2001).
Estate of Boncher, 272 F.3d at 486.
Id., 272 F.3d at 486-87.
Id., 272 F.3d at 486.
GE v. Joiner, 522 U.S. at 149-50 (Breyer, J., concurring).
David Kaye, Rounding Up the Usual Suspects: A Legal and Logical Analysis of DNA Trawling Cases, 87 N.C.L.Rev. 425, 431 (2009)(hereinafter Kaye, Rounding Up the Usual Suspects). See text accompanying notes ---, infra.
See text accompanying notes --- infra
Carnegie Commission on Science, Technology and Government, Science and Technology in Judicial Decision Making: Creating Opportunities and Meeting Challenges, 24 (1993).
Sophia I. Gatowski et al., Asking the Gatekeepers: A National Survey of Judges on Judging Expert Evidence in a Post-Daubert World, 25 LAW. & HUM. BEHAV. 433, 433 (2001). See also Paul S. Miller et al., Daubert and the Need for Judicial Scientific Literacy, 77 Judicature 254 (1994).
United States v. Cline, 188 F. Supp. 2d 1287, 1294 (D. Kan. 2002).
The probability of getting a six on one die is 1/6. Because each die's outcome is independent of the other, the probability of obtaining a six on two dice is calculated by multiplying the probabilities for obtaining a six on each: 1/6 x 1/6 = 1/36. See Henry E. Klugh, Statistics: The Essentials for Research 152 (1986).
Roger C. Park & Michael J. Saks, Evidence Scholarship Reconsidered: Results of the Interdisciplinary Turn, 46 B.C. L. Rev. 949, 989 (2006).
See Wojtek J. Krzanowski, Statistical Principles and Techniques in Scientific and Social Investigations, 18 (2007). See also William C. Thompson, DNA Evidence in the O.J. Simpson Trial, 67 U. Colo. L. Rev. 827, 857 (1996).
Roger C. Park & Michael J. Saks, Evidence Scholarship Reconsidered: Results of the Interdisciplinary Turn, 46 B.C. L. Rev. 949, 988 (2006).
Mathematically, if n people rolled snake eyes, the probability of a given snake eyes roller being the murderer is 1/n.
The probability that at least one other person rolled snake eyes can be expressed mathematically as
Σ (from J=1 to N) [P(J roll snake eyes) × (1/J)],
with N equaling the total number of other potential suspects and J equaling the differing number who could have rolled snake eyes.
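One way to read that sum (an interpretive assumption, since the notation is ambiguous about whether J counts the defendant) is as an expected share of guilt: if J other suspects also rolled snake eyes, the defendant is one of J + 1 equally likely culprits. A Python sketch of that reading:

```python
from math import comb

def guilt_probability(n_others, p=1/36):
    """P(defendant is the culprit), assuming the defendant rolled snake
    eyes, each of n_others independently rolls snake eyes with
    probability p, and every snake-eyes roller is equally likely to be
    the culprit."""
    total = 0.0
    for j in range(n_others + 1):                       # j = other rollers
        p_j = comb(n_others, j) * p**j * (1 - p)**(n_others - j)
        total += p_j / (j + 1)                          # defendant is 1 of j+1
    return total
```

With no other potential suspects the probability is 1; it falls steadily as the pool of potential snake-eyes rollers grows, which is the intuition behind the 1/n observation in the preceding footnote.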
See generally The Wire (HBO television series, 2002-2008).
Ronald J. Allen & Michael S. Pardo, The Problematic Value of Mathematical Models of Evidence, 36 J. Legal Stud. 107, 109 (2007).
Allen & Pardo, supra note ---, 36 J. Legal Stud. at 112.
See e.g. Id., 36 J. Legal Stud. at 112 (stating, our interests in the various inferences they generate pick out certain classes as more or less relevant).
DNA, deoxyribonucleic acid, is a molecule that contains the genetic information for all living things. David H. Kaye & George F. Sensabaugh Jr., Reference Guide on DNA Evidence, in Reference Manual on Scientific Evidence 487 (2d ed. 2000)(hereinafter Reference Guide on DNA Evidence). Most human DNA can be found in our chromosomes. Id. at 491. A fertilized human egg has 23 pairs of chromosomes, with each parent contributing half. National Research Council, The Evaluation of Forensic DNA Evidence 60 (1996) (hereinafter NRC II). Each chromosome contains many thousands of genes, which are segments of DNA that have specific functions, determining all the physical traits that we inherit from our biological parents. Kaye & Sensabaugh, Reference Guide on DNA Evidence, at 492. Usually, important traits are the product of the relationship between many different genes. Id.
130 N.J. 484; 617 A.2d 247 (1993).
A phenotype has been defined as [a] trait, such as eye color or blood group, resulting from a genotype. Kaye & Sensabaugh, Reference Guide on DNA Evidence, supra note ---, at 572.
Little v. Streater, 452 U.S. 1,7 (1981) (holding that to deny blood grouping tests in a paternity suit because of defendants lack of financial resources violated due process).
Robert W. Peterson, A Few Things You Should Know About Paternity Tests (But Were Afraid To Ask), 22 Santa Clara L.Rev. 667, 680 (1982).
Analogously, in Snake Eyes, the probability of exclusion, the probability of rolling anything other than two 1s, was 35/36, or approximately 97.2%.
Andrei Semikhodskii, Dealing with DNA Evidence: A Legal Guide 75-77 (2007). Sometimes, PI refers to a single DNA marker. By multiplying the PI for several different markers, we can calculate a Combined Paternity Index [CPI], which is the value usually entered into evidence. Since courts typically simply refer to the paternity index, see e.g. Griffith v. Texas 976 S.W.2D 241, 243 (Tex. App. 1998), I will use PI instead of CPI in the text.
The paternity index tends to be based on more genetic information than the probability of exclusion, since it also incorporates not just the existence of particular genetic markers, but the fact that men with some sets of markers are more likely to transmit the particular genes than men with other sets. D.H. Kaye, The Probability of an Ultimate Issue: The Strange Case of Paternity Testing, 75 Iowa L. Rev. 75, 91 (1989).
D.H. Kaye, The Probability of an Ultimate Issue: The Strange Case of Paternity Testing, 75 Iowa L. Rev. 75, 89-94 (1989).
If all 100,000 men were seen as having the same opportunity to be the father as the defendant, using Bayesian analysis, see infra, the prior odds would be 1 to 99,999, and the posterior odds for the defendant's paternity would be 19.4 to 99,999, which equals 0.019%. D.H. Kaye, The Probability of an Ultimate Issue, 75 Iowa L. Rev. at 94.
Semikhodskii, Dealing with DNA Evidence, supra note ---, at 116.
Id.
Richard D. Friedman, A Presumption of Innocence, Not of Even Odds, 52 Stan. L. Rev. 873, 875 (2000).
See e.g. Kaye, Rounding Up the Usual Suspects, supra note ---, 87 N.C.L.Rev. at 463.
Richard D. Friedman, A Presumption of Innocence, Not of Even Odds, 52 Stan. L. Rev. 873, 875 (2000)
Another way in which Bayes theorem is presented is:
P(A | B) = [P(A) x P(B | A)] / P(B)
P(A) means the probability that event A would occur; P(B) means the probability that event B would occur; P(A | B) means the probability of A given that B has occurred; and P(B | A) means the probability of B given that A has occurred. P(A | B) is therefore the transposition of P(B | A). See e.g., Jeff Gill, Bayesian Methods: A Social and Behavioral Sciences Approach 6 (2002)
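In odds form, Bayes' theorem says that posterior odds equal prior odds multiplied by a likelihood ratio. The following Python illustration applies that form to the 100,000-man hypothetical in an earlier footnote, treating the paternity index of 19.4 as the likelihood ratio:

```python
def posterior_probability(prior_odds, likelihood_ratio):
    """Multiply prior odds by the likelihood ratio (here, the paternity
    index) and convert the resulting posterior odds to a probability."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Conventional laboratory assumption: 50/50 prior (odds of 1), PI of 19.4.
conventional = posterior_probability(1.0, 19.4)
# Treating all 100,000 men as equally likely a priori (odds of 1 to 99,999).
population_based = posterior_probability(1 / 99_999, 19.4)
```

The same genetic evidence yields a probability of paternity of about 95% under the 50/50 convention but only about 0.02% under the population-wide prior, which is the point of the hypothetical.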
For a discussion of the impossibility of turning subjective belief into a concrete number, see text accompanying notes ---, infra.
Butcher v. Ky., 96 S.W.3d 3, 7 (Ky. 2002), cert. den. 540 U.S. 864 (2003).
Davis v. State, 476 N.E.2d 127, 138 (Ind. Ct. App. 1985).
Griffith v. Texas, 976 S.W.2d 241, 245 (Tex. App. 1998). See also Brown v. Smith, 137 N.C. App. 160, 164 (N.C. Ct. App. 2000) (stating that "[a] neutral assessment of the non-genetic evidence would result in a prior probability of 0.5. This would give equal weight to paternity and non-paternity from a non-genetic aspect."); Butcher v. Ky., 96 S.W.3d at 9 (stating that the .5 prior probability "is neutral, neither assuming nor denying that intercourse has taken place between the mother of the child and the alleged father.").
See March Madness NCAA Tournament Betting Odds Available, at http://www.capperspicks.com/forums/online-sportsbook-casino-horse-racing-poker-industry-news/1515-march-madness-ncaa-tournament-betting-odds-available.html (last visited January 10, 2009).
Triumph, tragedy at Derby, The Boston Globe, May 4, 2008, at C1.
Thus, the problem with the 50/50 probability is not that it operates upon the assumption "that the mother and putative father have engaged in sexual intercourse at least once during the period of possible conception." State v. Hartman, 426 N.W.2d 320, 326 (Wis. 1988), quoting In re Paternity of M.J.B., 144 Wis. 2d 638, 650 (1988). Rather, as the New Jersey Supreme Court noted, 50/50 odds are "wholly consistent with a fact pattern that one and only one man had access to and intercourse with the victim and that one of two, and only two, men, including defendant, could possibly have been that one man, neither one more likely than the other to be the father." New Jersey v. Spann, 130 N.J. 484, 617 A.2d 247 (N.J. 1993). See also Griffith v. Texas, 976 S.W.2d at 248 (stating, "Logically, the prior probability assumes intercourse could have occurred and thus the putative father could be the actual father, but the statistic does not necessarily assume intercourse did occur.").
See text accompanying note ---, supra.
Cole v. Cole, 74 N.C.App. 247; 328 S.E.2d 446, aff'd without opinion, 314 N.C. 660, 335 S.E.2d 897 (1985).
Id., 74 N.C.App. at 249; 328 S.E.2d at 448.
Id., 74 N.C.App. at 251-52; 328 S.E.2d at 449.
See Abbott, Sell, & Krause, Joint AMA-ABA Guidelines: Present Status of Serologic Testing in Problems of Disputed Parentage, 10 Fam. L.Q. 247, 262 (1976).
Unif. Parentage Act (2002) 505 (a)(1). The presumption also requires a combined paternity index of at least 100 to 1. Id., 505 (a)(2).
For example, Alabama, Code of Ala. 26-17-505 (2008); California, Cal Fam. Code 7555, (2008); Delaware, 13 Del. C. 8-505 (2008); Minnesota, Minn. Stat. 257.62 (2008); Missouri, 210.822 R.S.Mo. (2008); New York, NY CLS Family Ct Act 532 (2008); North Dakota, N.D. Cent. Code, 14-20-29 (2008); Oklahoma, 10 Okl. St. 7700-505 (2008); Texas, Tex. Fam. Code 160.505 (2008); Utah, Utah Code Ann. 78B-15-505 (2008); Washington, Rev. Code Wash. 26.26.420 (2008); and Wyoming, Wyo. Stat. 14-2-705 (2008).
See George Maha, Analysis of Genetic Test Results for Courtroom Use, in Disputed Paternity Proceedings 15.08 (Carl W. Gilmore, ed. 2008).
See e.g., M. Finkelstein & W. Fairley, A Bayesian Approach to Identification Evidence, 83 Harv. L. Rev. 489 (1970). This was the approach used by the court in State v. Spann, 130 N.J. 484, 520 (N.J. 1993).
Plemel v. Walter, 303 Ore. 262, 279 (Or. 1987).
See Ellman & D. Kaye, Probabilities and Proof: Can HLA and Blood Group Testing Prove Paternity? 54 N.Y.U. L. Rev. 1131, 1153 (1979)(stating that instructing the jury to follow the chart may be asking it to do something it cannot: to translate a subjective opinion about the non-test evidence into a single probability figure.).
9 J.H. Wigmore, Evidence 2497 at 325 (3rd Ed. 1940).
Laurence Tribe, Trial by Mathematics, 84 Harv. L.Rev. 1329, 1365 (1971).
Id., at 1366. See also C. Nesson, Reasonable Doubt and Permissive Inferences: The Value of Complexity, 92 Harv. L. Rev. 1187, 1225 (1979)(stating that Any conceptualization of reasonable doubt in probabilistic form is inconsistent with the functional role the concept is designed to play.).
Commonwealth v. Beausoleil, 397 Mass. 206, 221 (Mass. 1986). Among the proposals rejected were presenting a chart to jurors showing the effect the blood test results would have on a juror's own estimates of the prior odds of paternity, see Mark Ira Ellman & David Kaye, Probabilities and Proof: Can HLA and Blood Group Testing Prove Paternity?, 54 N.Y.U. L. Rev. 1131, 1152-1158 (1979), and providing jurors with a formula so that they could see how their own estimate of prior probability would change the probability of paternity calculation based on an assumed 50% prior probability, see Robert W. Peterson, A Few Things You Should Know About Paternity Tests (But Were Afraid to Ask), 22 Santa Clara L. Rev. 667, 686-689 (1982).
Commonwealth v. Beausoleil, 397 Mass. 206, 217 (1986). See also Kofford v. Flora, 744 P.2d 1343, 1351 (Utah 1987)(stating that because of the possibility of confusion, the probability of exclusion should only be admitted when it is in an extreme range).
Butcher v. Ky., 96 S.W.3d at 7.
William M. Sage, Judicial Opinions Involving Health Insurance Coverage: Trompe L'Oeil or Window on the World?, 31 Ind. L. Rev. 49, 50 (1998). For a related lamppost metaphor, see Ronald J. Allen and Michael S. Pardo, The Problematic Value of Mathematical Models of Evidence, 36 J. Legal Stud. 107, 119 (2007)(stating that a similar analysis was, reminiscent of relying on the lamppost more for support than illumination).
Oliver Wendell Holmes, Jr., The Path of the Law, 10 Harv. L. Rev. 457, 466 (1897)
John Tukey, Annals of Mathematical Statistics, 33 (1962).
Kammer v. Young, 73 Md. App. 565, 576-577 (Md. Ct. Spec. App. 1988), cert. denied, 488 U.S. 919 (1988). See also Brown v. Smith, 137 N.C. App. 160, 164 (N.C. Ct. App. 2000)(stating Most, if not all, laboratories in the United States use a prior probability of 0.5 in calculating the genetic probability of paternity.); M. v. Marvin S., 173 Misc. 2d 925, 929 (N.Y. Fam. Ct. 1997)(stating that the utilization by a laboratory of the 0.5 figure is a nationally accepted convention and that all the major laboratories use this figure for paternity test reporting purposes); Griffith v. Texas, 976 S.W.2d at 245 n2 (stating that nearly a million paternity tests in the U.S. were conducted using DNA or HLA methods, each using the .5 prior probability calculation).
United States v. Jenkins, 887 A.2d 1013, 1025 (D.C. 2005), accord People v. Nelson, 43 Cal. 4th 1242, 1265 (Cal. 2008).
People v. Nelson, 142 Cal. App. 4th 696, 721 (Cal. App. 3d Dist. 2006), aff'd, 43 Cal. 4th 1242, 1265 (Cal. 2008). See also U.S. v. Jenkins, 887 A.2d at 1024 (stating, "This debate does not address the underlying principles, math, or science behind the various formulas. It is a disagreement over relevance."). See generally Kaye, Rounding Up the Usual Suspects, supra note ---, 87 N.C.L.Rev. at 448 (referring to a question of logical relevance as opposed to one of general acceptance or scientific validity).
These numbers are derived from U.S. Department of Justice, Bureau of Justice Statistics, "Homicide Trends in the United States", http://www.ojp.usdoj.gov/bjs/homicide/tables/oracetab.htm (last visited January 11, 2009).
People v. Wilson, 38 Cal. 4th 1237; 136 P.3d 864, 867 (2006); N.J. v. Spann, 130 N.J. 484; 617 A.2d 247 (1993). See also, e.g., Spann, 617 A.2d at 250 (noting that "[t]he State's expert stated that the blood and tissue samples, combined with statistical data reflecting the number of men with the relevant genes, excluded 99% of the North American black male population as possible fathers.").
David H. Kaye, Logical Relevance: Problems with the Reference Population and DNA Mixtures in People v. Pizarro 3 Law, Probability & Risk 211, 214 (2004).
Usually, important traits are the product of the relationship between many different genes. Kaye & Sensabaugh, Reference Guide on DNA Evidence, supra note ---, at 491.
Nat'l Comm. on the Future of DNA Evidence, Nat'l Inst. of Justice, United States Dep't of Justice, The Future of Forensic DNA Testing, 12 (Nov. 2000)(hereinafter The Future of Forensic DNA Testing)
Kaye & Sensabaugh, Reference Guide on DNA Evidence, supra note ---, at 491.
Id. at 492.
See Id. at 485. See also DNA Analysis Backlog Elimination Act of 2000, H.R. Rep. No. 106-900(I), at *27 (stating that "for privacy reasons, the non-coded regions were purposely selected because they are not associated with any known physical or medical characteristics.").
Troy Duster, Selective Arrests, an Ever-Expanding DNA Forensic Database, and the Specter of an Early -Twenty-First-Century Equivalent of Phrenology, in The Technology of Justice: DNA and the Criminal Justice System 325 (D. Lazer, ed. 2004). See also N.J. v. Spann, 130 N.J. 484; 617 A.2d 247, 251 (1993)(stating, Since the incidence of different blood groups, as well as HLA types, varies with race and to a lesser extent geography, gene-frequency tables are derived from population studies of different racial groups.).
People v. Dalcollo, 282 Ill. App. 3d 944, 949 (Ill. App. Ct. 2d Dist. 1996).
John M. Butler, Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers 241 (2005). Cf. Bruce Budowle, et al., CODIS STR Loci Data from 41 Sample Populations, 46 J. Forensic Sci. 453, 453 (2001)(stating that in the United States, for purposes of DNA analysis, African American, U.S. Caucasian, Hispanics, Far East Asians, and Native Americans make up the five major population groups). See generally Jonathan Kahn, Race, Genes, and Justice: A Call to Reform the Presentation of Forensic DNA Evidence in Criminal Trials (2008), available at http://works.bepress.com/jonathan_kahn/1.
Aaron P. Stevens, Note, Arresting Crime: Expanding the Scope of DNA Databases in America, 79 Tex. L. Rev. 921, 928 (2001).
See CODIS Statistics Clickable Map - NDIS Statistics, FBI, available at http://www.fbi.gov/hq/lab/codis/clickmap.htm (last visited January 14, 2009). The NDIS consists primarily of those who are charged with, or have been convicted of, serious crimes, including felonies and other crimes of violence. 42 USCS 14132 & 42 USCS 14135a (d). See generally Murphy, supra note---, 95 Calif. L. Rev. at 740.
Paul C. Giannelli, Forensic Science: Under the Microscope, 34 Ohio N.U.L. Rev. 315, 327 (2008).
Kaye & Sensabaugh, Reference Guide on DNA Evidence, supra note ---, at 57-58(1996)(emphasis added). This report was a follow-up to an earlier FBI-funded report, National Research Council, National Academies, DNA Technology in Forensic Science (1992) (hereinafter NRC I).
Jacobellis v. Ohio, 378 U.S. 184, 197 (1964) (Stewart, J., concurring). The full quote referred to the difficulty in defining what he termed hard-core pornography: "I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that."
Mildred K. Cho, Racial and Ethnic Categories in Biomedical Research: There is no Baby in the Bathwater, 34 J Law Med Ethics. 497 (2006).
See e.g., Audrey Smedley & Brian D. Smedley, Race as Biology Is Fiction, Racism as a Social Problem is Real, 60:1 Am. Psychologist 16, 20 (2005)( stating that the ideology of race, arose as a rationalization and justification for human slavery at a time when Western European societies were embracing philosophies promoting individual and human rights, liberty, democracy, justice, brotherhood, and equality.).
Plessy v. Ferguson, 163 U.S. 537, 552 (U.S. 1896). Homer Plessy was described as seven-eighths Caucasian and one-eighth African blood. Id., 163 U.S. at 541. See generally Saint Francis College v. Al-Khazrajii, 481 U.S. 604, 610-11 (1987)(stating that "[i]n the middle years of the 19th century, dictionaries commonly referred to race as a 'continued series of descendants from a parent who is called the stock,' 'the lineage of a family,' or 'descendants of a common ancestor.' It was not until the 20th century that dictionaries began referring to the Caucasian, Mongolian and Negro races, or to race as involving divisions of mankind based upon different physical characteristics.").
Oxford English Dictionary 69 (2nd ed. 1989).
Chamallas, Questioning the Use, supra note ---, 63 Fordham L. Rev. at 113 (saying that people do not fall naturally into discrete racial groupings).
Troy Duster, supra note ---, at 325.
Erik Lillquist and Charles A. Sullivan, The Law and Genetics of Racial Profiling in Medicine, 39 Harv. C.R.-C.L. L. Rev. 391 (2004).
Mildred K. Cho & Pamela Sankar, Forensics Genetics and Ethical, Legal, and Social Implications Beyond the Clinic, 36 NATURE GENETICS SUPP. S-8, S-9 (2004).
Id. See also Murray, supra, at 1521 (stating that, The most important limitation of the data used for our analysis is that reported race in the census, used for population estimates, may be different from race in mortality statistics, where race may be reported by the family, the certifying physician, or the funeral director.).
Sharona Hoffman, Is There a Place for "Race" as a Legal Concept?, 36 Ariz. St. L.J. 1093, 1096 (2004).
See generally Duana Fullwiley, The Biologistical Construction of Race: Admixture Technology and the New Genetic Medicine, 38 Social Studies of Science, 695-735 (2008).
McMillan v. City of New York, 253 F.R.D. 247, 251 (E.D.N.Y. 2008), quoting Audrey Smedley & Brian D. Smedley, Race as Biology Is Fiction, Racism as a Social Problem is Real, 60:1 Am. Psychologist 16, 20 (2005).
Id. at 21-22.
People v Wilson, 136 P.3d 864, 871 (Cal. 2006.). See also Edward J. Imwinkelried & D.H. Kaye, DNA Typing: Emerging or Neglected Issues, 76 Wash. L. Rev. 413, 449 (2001)(stating, No group is singled out for special treatment, and no one is penalized because of hostility toward race.).
Elizabeth S. Anderson and Richard H. Pildes, Expressive Theories of Law: A General Restatement, 148 U. Pa. L. Rev. 1503, 1538 (2000).
Sharona Hoffman, Is There a Place for Race as a Legal Concept?, 36 Ariz. St. L.J. 1093, 1099 (2004).
375 U.S. 399 (1964). The law stated: Every application for or notification or declaration of candidacy, and every certificate of nomination and every nomination paper filed in any state or local primary, general or special election for any elective office in this state shall show for each candidate named therein, whether such candidate is of the Caucasian race, the Negro race or other specified race. La. Rev. Stat. (1960 Supp.) 18:1174.1(A). The law also required that The racial designation on the ballots shall be in print of the same size as the print in the names of the candidates on the ballots. Id. 18:1174.1(C).
Anderson, 375 U.S. at 402.
Id.
Gratz v. Bollinger, 539 U.S. 244, 270 (2003).
Adarand Constructors v. Pena, 515 U.S. 200, 227 (1995). Accord Johnson v. California, 543 U.S. 499, 505 (2005).
See Grutter v. Bollinger, 539 U.S. 306, 327 (2003)(stating that When race-based action is necessary to further a compelling governmental interest, such action does not violate the constitutional guarantee of equal protection so long as the narrow-tailoring requirement is also satisfied).
One instance in which such racial categorization might be necessary is when dealing with ethnic subpopulations. For example, assume that there is a small group whose members share a genetic anomaly that is not seen with any other group. Using a general population database might lead to a finding that the defendant was probably guilty (since very few Americans match the DNA found at the crime scene), when in fact anyone in his subgroup would have matched as well. In such a case, use of ethnic data bases would be appropriate. See Kaye & Sensabaugh, Reference Guide on DNA Evidence, supra note ---, at 526. See also R. C. Lewontin, Which Population?, 52 Am. J. Hum. Genet. 205 (1993).
David H. Kaye, DNA Probabilities in People v. Prince: When Are Racial and Ethnic Statistics Relevant?, in Probability and Statistics: Essays in Honor of David A. Freedman 289, 300 (T. Speed & D. Nolan eds., 2008).
See e.g. Ellery W. Davis, et al., The Calculus 170 (1914) (stating that, for a particular formula used to find the measure of the bending of a curve, the corrective factor gives a better measure of the bending).
Andre A. Moenssens, Handwriting Identification Evidence in the Post-Daubert World, 66 UMKC L. Rev. 251, 343 n.155 (1997).
Bowe v. Colgate-Palmolive Co., 443 F. Supp. 696, 711-12 (S.D. Ind. 1977).
Nat'l Comm. on the Future of DNA Evidence, Nat'l Inst. of Justice, United States Dep't of Justice, The Future of Forensic DNA Testing: Predictions of the Research and Development Working Group 17-20, Nov. 2000, at 5. The specific corrective factor they gave is θ = .03 for the formula used to determine match probabilities for various alleles (designated i) and the frequency (designated p_i) in which they occur: [2θ + (1 − θ)p_i][3θ + (1 − θ)p_i] / [(1 + θ)(1 + 2θ)]. Id. at 24.
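The working group's corrected match probability can be expressed as a short Python function. This is an illustrative sketch only: the function name and the allele frequency used below are hypothetical, not drawn from the report.

```python
def homozygote_match_prob(p_i, theta=0.03):
    """Corrected homozygote match probability for allele frequency p_i,
    using the working group's population-structure factor theta = .03:
    [2*theta + (1-theta)*p_i] * [3*theta + (1-theta)*p_i]
        / [(1+theta) * (1+2*theta)]
    """
    num = (2 * theta + (1 - theta) * p_i) * (3 * theta + (1 - theta) * p_i)
    return num / ((1 + theta) * (1 + 2 * theta))

p_i = 0.10  # hypothetical allele frequency
print(homozygote_match_prob(p_i))   # about 0.0269
```

For the hypothetical frequency above, the corrected figure (about .0269) exceeds the naive product-rule estimate of p_i squared (.01), so the correction is conservative, i.e., favorable to the defendant.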
Id. at 5.
Elizabeth S. Anderson and Richard H. Pildes, Expressive Theories of Law: A General Restatement, 148 U. Pa. L. Rev. 1503, 1538 (2000).
Anderson, 375 U.S. at 402.
See In re Monumental Life Ins. Co., 365 F.3d 408, 416-20 (5th Cir.), cert. den. 543 U.S. 870 (2004). The court noted that for a 20-Pay Life policy, in which the insured pays premiums for 20 years and is then fully insured for the rest of his or her life, a twenty-year-old black was charged a weekly premium of $0.41 for a $500 policy, while a twenty-year-old white was only charged $0.32. Id., 365 F.3d at 412 n.4.
Id., 365 F.3d at 412 n.2.
United States v. Bedonie, 317 F. Supp. 2d 1285, 1291 (D. Utah 2004).
Bedonie, 317 F. Supp. 2d at 1292. The restitution award was authorized by the Mandatory Victims Restitution Act, which provides that families of victims of certain crimes will be awarded, as restitution, a sum representing the total income lost due to the crime. 18 USCS 3663A (b)(2)(C).
Bedonie, 317 F. Supp. 2d at 1316. The Court ultimately rejected the experts recommendation and awarded the amount calculated without the discount for race and sex. Id., 317 F.Supp. 2d at 1322.
Based on the age-specific death rates prevailing for the actual population in 2004, the National Center for Health Statistics reported that the average white American lives for 78.3 years, while the average African-American lives for 73.1 years. See Elizabeth Arias, United States Life Tables, 2004, Nat'l Vital Statistics Rep., Dec. 28, 2007, at 1,3 available at http://www.cdc.gov/nchs/data/nvsr/nvsr56/nvsr56_09.pdf.
U.S. Bureau of Labor Statistics, U.S. Dep't of Labor, Report No. 960, Highlights of Women's Earnings in 2007 8 tbl.1 (2008), at http://www.bls.gov/cps/cpswom2007.pdf. See also Caron v. United States, 410 F. Supp. 378, 397-398 (D.R.I. 1975), aff'd, 548 F.2d 366 (1st Cir. 1976) (stating that One does not need expert testimony to conclude that there is inequality in the average earnings of the sexes.).
Childers v. Sec'y of HHS, 1999 U.S. Claims LEXIS 76 (U.S. Ct. of Claims 1999). Worklife tables provide an average for the group, reflecting the historical pattern of actual years worked, incorporating rates of unemployment, both voluntary and involuntary, as well as incorporating an expected retirement age. Sherri R. Lamb, Comment, Toward Gender-Neutral Data for Adjudicating Lost Future Earning Damages: An Evidentiary Perspective, 72 Chi.-Kent L. Rev. 299 (1996). There are also disparities in the average worklife for African-Americans as compared to Whites: If minority men have historically been incarcerated at a much higher rate than white men, race-based worklife estimates predict that they will continue to work fewer years than whites. Martha Chamallas, Civil Rights in Ordinary Tort Cases: Race, Gender, and the Calculation of Economic Loss, 38 Loy. L.A. L. Rev. 1435, 1439 (2005) (hereinafter Chamallas, Civil Rights in Ordinary Tort Cases).
Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev.. at 1438.
Black v. Columbus Pub. Sch., No. 2:96-CV-326, 2007 U.S. Dist. LEXIS 68672, 2007 WL 2713873 (S.D. Ohio Sept. 17, 2007).
Athridge v. Iglesias, 950 F. Supp. 1187, 1192 (D.D.C. 1996).
Forman v. Korean Airlines Co., Ltd., 84 F.3d 446, 449-50 (D.C. Cir. 1996). See also Gonzalez v. City of Franklin, 128 Wis. 2d 485, 383 N.W.2d 907 (Ct. App. 1986) (stating, Here, for purposes of determining how many replacement prostheses Gonzalez might need in the future, Gonzalez's counsel used at trial a mortality table which breaks down the populace by race (white and black) and sex (male and female). The figure for white males went to the jury.); Drayton v. Jiffee Chemical Corp., 591 F.2d 352 (6th Cir. 1978) (We have considered as well her sex, her race, the necessarily limited evidence concerning her mental capabilities, and her psychological makeup.); Feldman v. Allegheny Airlines, 382 F. Supp. 1271, 1286 (D. Conn. 1974) (stating that eight years is the "middle of the range of a professional woman's likely hiatus from her principal occupation in order to raise a family"); Frankel v. Heym, 466 F.2d 1226, 1229 (3d Cir. 1972) (stating that female plaintiff would probably marry and have children with consequent substantial interruptions of gainful employment).
Mortality Tables for Determining Present Value, 73 Fed Reg 44632 (I.R.S. July 31, 2008).
26 CFR 1.430(h)(3)-2 (c)(1)(i).
See e.g. Korematsu v. United States, 323 U.S. 214, 216 (1944)(stating that all legal restrictions which curtail the civil rights of a single racial group are immediately suspect.).
United States v. Bedonie, 317 F. Supp. 2d 1285, 1315 (D. Utah 2004), aff'd sub nom, United States v. Serawop, 505 F.3d 1112 (10th Cir. 2007). There are a handful of cases in which judges did recognize the dangers posed by race and gender based statistics. See e.g. McMillan, 253 F.R.D. at 256; Bedonie , 317 F. Supp. 2d at 1315; Wheeler Tarpeh-Doe v. United States, 771 F. Supp. 427, 455 (D.D.C. 1991); Reilly v. United States, 665 F. Supp. 976, 997 (D.R.I. 1987); Hartford Accident & Indemnity Co. v. Insurance Comm'r of Commonwealth, 482 A.2d 542, 582 (Pa. 1984).
Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev. at 1442.
Caron v. United States, 410 F. Supp. 378, 397-398 (D.R.I. 1975), aff'd, 548 F.2d 366 (1st Cir. 1976).
See e.g., Jackson, Sherri L., Research Methods and Statistics: A Critical Thinking Approach 15 (2003)(stating, Correlation does not imply causation.).
See Sherri R. Lamb, Comment, Toward Gender-Neutral Data for Adjudicating Lost Future Earning Damages: An Evidentiary Perspective, 72 Chi.-Kent L. Rev. 299, 329-30 (1996)(stating Statistical tables predict the future only to the extent that the future resembles the past; a predictor is efficient only if past correlations persist throughout the period in which the predicted event will occur.).
According to the ABA, for the academic year 1963-1964, there were 20,776 first-year students, 19,899 men and 877 women. American Bar Association, Legal Education and Bar Admission Statistics, 1963-2005, available at http://www.abanet.org/legaled/statistics/charts/stats%20-%201.pdf (last visited Mar. 1, 2009).
For the academic year 2007-08, women were 47.3% of the first-year law school class; out of 49,082 first-year students, 25,864 were men and 23,218 were women. Id.
Hsiang-Ching Kung et al., National Center for Health Statistics, Deaths: Final Data for 2005 (National Center for Health Statistics, Division of Vital Statistics, Hyattsville, MD) (2007) at 2. The age-adjusted death rate for men in 2005 was 40.4 percent greater than that for women, down from 40.7 percent greater in 2004. Id.
Id. at 4. According to the CDC, Death rates declined by 10.6 percent for the black population and by 7.0 percent for the white population between 1989 and 1997, and they have declined by 10.8 percent for the black population and by 8.2 percent for the white population since 1997. Id.
See e.g., Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev. at 1455 (stating that Relying on race and sex-based statistics reinforces the view that race and sex differences are inevitable and enduring, rather than a product of political and social arrangements that are subject to change.)
See e.g. Martha Chamallas, Questioning the Use of Race-Specific and Gender-Specific Data in Tort Litigation: A Constitutional Argument, 63 Fordham L. Rev. 73, 75 (1994)(hereinafter Chamallas, Questioning the Use) (stating that, The use of race-based and gender-based tables assumes that the current gender and racial pay gap will continue in the future, despite ongoing legal and institutional efforts to make the workplace more diverse and less discriminatory.).
Jennifer B. Wriggins, Damages in Tort Litigation: Thoughts on Race and Remedies, 1865-2007, 27 Rev. Litig. 37, 56 (2007).
See text accompanying notes --- supra.
Geronimus, Arline T., et al., Inequality in Life Expectancy, Functional Status, and Active Life Expectancy Across Selected Black and White Populations in the United States, 38 Demography 227, 234-35 (2001). See also Joseph J. Sudano & David W. Baker, Explaining US racial/ethnic disparities in health declines and mortality in late middle age: The roles of socioeconomic status, health behaviors, and health insurance, 62 Soc. Sci. & Med. 909, 918 (2006) (Our results are also consistent with previous studies that have found large direct (or residual) effects of [socio-economic status] on health that were not explained by differences in health behaviors.").
Geronimus, et al., supra note ---, 38 Demography at 234-35.
McMillan, 253 F.R.D. at 252.
Id., 253 F.R.D. at 253.
Tarpeh-Doe, 771 F. Supp. at 455. See also Bedonie, 317 F. Supp. 2d at 1319 (stating that As a matter of fairness, the court should exercise its discretion in favor of victims of violent crime and against the possible perpetuation of inappropriate stereotypes.).
Bedonie, 317 F. Supp. 2d at 1319.
Sherri R. Lamb, Comment, Toward Gender-Neutral Data for Adjudicating Lost Future Earning Damages: An Evidentiary Perspective, 72 Chi.-Kent L. Rev. 299 (1996).
Chamallas, Questioning the Use, supra note ---, 63 Fordham L. Rev. at 75.
The Saginaw & The Hamilton, 139 F. 906, 910 (D.N.Y. 1905). The claims resulted from the collision of two steamships, which caused the deaths of both passengers and crewmembers.
The Saginaw & The Hamilton, 139 F. at 914.
Jennifer B. Wriggins, Damages in Tort Litigation: Thoughts on Race and Remedies, 1865-2007, 27 Rev. Litig. 37, 56 (2007).
Bruce H. Webster, Jr. & Alemayehu Bishaw, Income, Earnings, and Poverty Data From the 2006 American Community Survey 15 (2007).
Id., at 13.
Bedonie, 317 F. Supp. 2d at 1313.
McMillan, 253 F.R.D. at 250. See also Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev. at 1439 (stating that reliance on race and gender statistics saddles nonconforming women and racial minorities with generalizations about their group, the very kind of stereotyping that anti-discrimination laws were meant to prohibit.); Sherri R. Lamb, Comment, Toward Gender-Neutral Data for Adjudicating Lost Future Earning Damages: An Evidentiary Perspective, 72 Chi.-Kent L. Rev. 299, 304 (1996) (stating that the practice of issuing gender-based awards magnifies the impact of employment discrimination and devalues the earning capacity of injured women, resulting in widely varying damage awards of equally situated men and women for the same injury.).
U.S. v. Serawop, 505 F.3d 1112, 1114 (10th Cir. 2007), quoting lower court decision, sub nom United States v. Bedonie, 317 F. Supp. 2d 1285, 1319 (D. Utah 2004).
Olga N. Sirodoeva-Paxson, Judicial Removal of Directors: Denial of Directors' License to Steal or Shareholders' Freedom to Vote?, 50 Hastings L.J. 97, 137 (1998).
Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev. at 1441.
See text accompanying notes ---, infra.
Laura Greenberg, Comment: Compensating the Lead Poisoned Child: Proposals for Mitigating Discriminatory Damage Awards, 28 B.C. Envtl. Aff. L. Rev. 429, 447 (2001).
Ariz. Governing Comm. for Tax Deferred Annuity & Deferred Comp. Plans v. Norris, 463 U.S. 1073, 1081 (1983), quoting Los Angeles Dept. of Water & Power v. Manhart, 435 U.S. 702, 712-13 (1978).
See September 11th Victim Compensation Fund, Final Rule, CIV 104F, AG Order No. 2584-2002, 67 Fed. Reg. 11233 (March 13, 2002). The statutory authorization for the fund was contained in Title IV of Public Law 107-42.
September 11th Victim Compensation Fund, Final Rule, 67 Fed. Reg. at 11238. Feinberg chose to use the worklife table for males for all claimants. Id.
Michael I. Meyerson, Losses of Equal Value, N.Y. Times, Mar. 24, 2002, § 4 (Week in Review). See also September 11th Victim Compensation Fund of 2001, Interim Final Rule, 66 Fed. Reg. 66,274 (Dec. 21, 2001) (stating that The September 11th Victim Compensation Fund of 2001 is an unprecedented expression of compassion on the part of the American people to the victims and their families devastated by the horror and tragedy of September 11.).
Title VII makes it an unlawful employment practice to discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual's race, color, religion, sex or national origin. 42 U. S. C. 2000e-2(a)(1). The Court also said, in dictum, that the use of race-based actuarial tables would be similarly illegal. Ariz. Governing Comm. for Tax Deferred Annuity & Deferred Comp. Plans v. Norris, 463 U.S. 1073, 1086 (U.S. 1983).
Los Angeles Dept. of Water & Power v. Manhart, 435 U.S. 702 (1978); Ariz. Governing Comm. for Tax Deferred Annuity & Deferred Comp. Plans v. Norris, 463 U.S. 1073, 1086 (1983).
Norris, 463 U.S. at 1084.
Norris, 463 U.S. at 1085, quoting Manhart, 435 U.S. at 708. Thus, the Court said, it does not matter that the greater costs of providing retirement benefits for female employees does not justify the use of a sex-based retirement plan. Norris, 463 U.S at 1086.
Norris, 463 U.S. at 1083, quoting Manhart, 435 U.S., at 708.
Miller v. Johnson, 515 U.S. 900, 911 (1995) (quoting Metro Broadcasting, Inc. v. FCC, 497 U.S. 547, 602 (1990) (O'Connor, J., dissenting)) (internal quotation marks omitted). See also Gratz v. Bollinger, 539 U.S. 244, 270 (2003) (stating that racial classifications are simply too pernicious to permit any but the most exact connection between justification and classification.); United States v. Virginia, 518 U.S. 515, 531 (1996) (stating that Parties who seek to defend gender-based government action must demonstrate an exceedingly persuasive justification for that action.).
Chamallas, Civil Rights in Ordinary Tort Cases, supra note ---, 38 Loy. L.A. L. Rev. at 1468.
Bertram Harnett & Irving I. Lesnick, The Law of Life and Health Insurance 13.03 (2008). See generally Fair Insurance Practices Act: Hearing Before the Committee on Commerce, Science, and Transportation, United States Senate, Ninety-seventh Congress, Second Session, (July 15, 1982) at 24.
See e.g. Hartford Accident & Indemnity Co. v. Insurance Comm'r of Commonwealth, 482 A.2d 542 (Pa. 1984) (terming differential treatment between men and women by an insurance company unfair discrimination).
See e.g. Seth F. Kreimer, Truth Machines and Consequences: The Dark Side of "Accuracy" in Criminal Justice, 60 N.Y.U. Ann. Surv. Am. L. 657 (2005).
GET CITE
The only time you would not have both types of error would be if 100% of those tested fail or 100% pass.
In re Winship, 397 U.S. 358, 372(1970) (Harlan, J., concurring). See also Apprendi v. New Jersey, 530 U.S. 466, 477 (2000), quoting United States v. Gaudin, 515 U.S. 506, 510 (1995)(stating that a criminal defendant is entitled to a jury determination that [he] is guilty of every element of the crime with which he is charged, beyond a reasonable doubt.).
Sir William Blackstone, IV Commentaries on the Laws of England 358 (Ch. 27) (1844 ed.) (1765) (The law holds that it is better that ten guilty persons escape, than that one innocent suffer.).
D.H. Kaye, Is Proof of Statistical Significance Relevant? 61 Wash. L. Rev. 1333, 1361 (1986)(hereinafter Kaye, Is Proof of Statistical Significance Relevant?). See also Richard A. Posner, An Economic Approach to the Law of Evidence, 51 Stan. L. Rev. 1477, 1504 (1999)(stating In the typical civil trial, there is no basis for supposing that Type I errors (false positives, such as convicting an innocent person) on average impose higher costs than Type II errors (false negatives, such as an erroneous acquittal.); Neil B. Cohen, The Gatekeeping Role in Civil Litigation and the Abdication of Legal Values in Favor of Scientific Values, 33 Seton Hall L. Rev. 943, 950 (2003)(hereinafter Cohen, The Gatekeeping Role)(stating that the preponderance of the evidence standard suggests that the civil litigation system ascribes essentially equal costs to inaccurately proclaiming a proposition to be demonstrated and to inaccurately declining to proclaim that the proposition has been demonstrated.).
Santosky v. Kramer, 455 U.S. 745, 790 (1982) (Rehnquist, J., dissenting). Rehnquist referred to such a situation as occurring when the social disutility of error in either direction is roughly equal. Id.
Addington v. Texas, 441 U.S. 418, 432 (1979). See also Santosky, 455 U.S. at 756 (terming clear and convincing an intermediate standard of proof).
See e.g. Santosky, 455 U.S. at 769; Addington, 441 U.S. at 424-26; Woodby v. INS, 385 U.S. 276, 285 (1966); Chaunt v. U.S., 364 U.S. 350, 353 (1960); Schneiderman v. U.S., 320 U.S. 118, 125 (1943).
Addington, 441 U.S. at 427, Santosky, 455 U.S. 745 (1982).
455 U.S. 745 (1982).
Id. The Court stated that it was plain beyond the need for multiple citation that a natural parent's desire for and right to the companionship, care, custody, and management of his or her children is an interest far more precious than any property right. Id., 455 U.S. at 758-759 (internal citations omitted).
Id., 455 U.S. at 765.
Id., 455 U.S. at 766.
Id., 455 U.S. at 790 (Rehnquist, J., dissenting).
497 U.S. 261 (1990).
Cruzan, 497 U.S. at 283.
Id., 497 U.S. at 320 (Brennan, J., dissenting).
430 U.S. 482 (1977).
Castaneda, 430 U.S. at 496 n.17.
Three months after Castaneda was decided, the Court in Hazelwood School Dist. v. United States, 433 U.S. 299, 309 n.14 (1977), cited Castaneda for the proposition that a finding of two or three standard deviations meant that the hypothesis that teachers in this case were hired without regard to race would be suspect, but the Hazelwood Court did not say to whom it would be suspect.
See note ---
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 83 (stating that, a sample is unlikely to be a perfect microcosm of the population.).
See Jonathan R. Macey et al., Lessons from Financial Economics: Materiality, Reliance, and Extending the Reach of Basic v. Levinson, 77 Va. L. Rev. 1017 (1991)(stating, To test for such statistically significant returns, it is necessary to account for the usual volatility of returns, which varies across firms and over time.).
D. Baldus & J. Cole, Statistical Proof of Discrimination 359 (1980).
See Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 173. Depending on the statistical test being utilized, there are numerous formulas for standard deviations. Arthur M. Glenberg, Learning from Data: An Introduction to Statistical Reasoning 66 (1996).
An estimate based on a sample is likely to be off the mark, at least by a little, due to random error. Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 131.
The normal distribution shows the probability of a continuous range of possible occurrences. It is a perfectly symmetric curve around the mean, with probabilities above the mean identical to corresponding probabilities below the mean. See Michael O. Finkelstein & Bruce Levin, Statistics for Lawyers 113 (2nd ed. 2001). Even though life is usually not neat enough to fall into a normal distribution, statisticians are often able to work with data in such a way that it approximates the normal distribution. Wayne C. Curtis, Statistical Concepts for Attorneys 71 (1983). In fact, Although stock returns are actually not distributed normally, researchers have shown that the normal distribution is a good approximation for event study estimations. Jonathan R. Macey et al., supra note ---, 77 Va. L. Rev. at 1039. Despite the seeming prevalence of statistical use of the normal distribution, other distributions, such as binomial distributions, are also utilized by statisticians. See e.g., David G. Rees, Essential Statistics 73-74 (4th ed. 2000).
More precisely, the probability that a randomly selected value will be within one standard deviation of the mean is about 68.3%.
The probability that a randomly selected value will be more than one standard deviation from the mean is about 31.7%.
Results more than three standard deviations from the mean are even rarer. The probability of being more than three standard deviations from the mean in a normal distribution is less than 1%, approximately .3%.
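The percentages in the preceding notes can be checked numerically. The following Python sketch, added purely for verification and not drawn from the cited sources, computes the probability that a normally distributed value falls within (or beyond) a given number of standard deviations of the mean:

```python
from math import erf, sqrt

def within(k):
    # probability that a standard normal value lies within
    # k standard deviations of the mean: erf(k / sqrt(2))
    return erf(k / sqrt(2))

print(round(within(1), 3))        # 0.683 -> about 68.3% within one SD
print(round(1 - within(1), 3))    # 0.317 -> about 31.7% beyond one SD
print(round(1 - within(3), 4))    # 0.0027 -> about .3% beyond three SDs
```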
Note that this is not the same as an evaluation of the probability of there being no cheating. See text accompanying notes --- infra.
Kaye, Is Proof of Statistical Significance Relevant?, supra note ---, at 1333.
Regrettably, multiple negatives are involved here. A statistical test is essentially an argument by contradiction. Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 117.
The alternate hypothesis is also termed H1.
See e.g. Kaye, Is Proof of Statistical Significance Relevant?, supra note ---, at 1355 (stating that there are always other alternatives besides the one the statistician identifies as H[1] in formulating the test.). For a discussion of the importance of correctly describing the alternate hypothesis, see Daniel L. Rubinfeld, Econometrics in the Courtroom, 85 Colum. L. Rev. 1048 , 1055 (1985)(stating that, the form of the alternative hypothesis can affect the conclusion that one reaches from the statistical analysis.).
The null hypothesis is also termed H0.
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 122. Another way of describing the null hypothesis is to state that The difference observed in the data is then just due to sampling error. Id. at 173.
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 117.
If you toss a coin, there are only two possible outcomes for each toss. If you toss it 10 times, the total number of possible outcomes is 2^10 = 1024. Only one of these outcomes consists of ten heads; thus, the probability of tossing exactly ten heads is 1/1024 = 0.00098. By contrast, there are 252 ways to get exactly five heads [the first five tosses could be the heads, the last five could be the heads, etc.], so the probability of tossing exactly five heads is 252/1024 = 0.24609.
The following chart shows the probability of tossing each quantity of heads:
0 heads: 1/1024 = 0.00098
1 head: 10/1024 = 0.00977
2 heads: 45/1024 = 0.04395
3 heads: 120/1024 = 0.11719
4 heads: 210/1024 = 0.20508
5 heads: 252/1024 = 0.24609
6 heads: 210/1024 = 0.20508
7 heads: 120/1024 = 0.11719
8 heads: 45/1024 = 0.04395
9 heads: 10/1024 = 0.00977
10 heads: 1/1024 = 0.00098
Kaye, Is Proof of Statistical Significance Relevant?, supra note ---, at 1350.
To calculate this, we add the probabilities of obtaining 8, 9, or 10 heads to the probabilities of obtaining 8, 9, or 10 tails. Thus, 0.04395 + 0.00977 + 0.00098 + 0.04395 + 0.00977 + 0.00098 = 0.1094. See note ---, supra. If we wanted to know how probable it would be to obtain a result that extreme in only one direction, we would simply add the probabilities of obtaining 8, 9, or 10 heads. Thus, 0.04395 + 0.00977 + 0.00098 = 0.0547. The difference between extreme meaning 8 or more heads, and extreme meaning 8 or more of the same result, is the difference between the one-tail test and the two-tail test. See text accompanying notes ---, infra.
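The coin-toss probabilities in the chart above, and the one-tail and two-tail sums, can be reproduced with a short Python sketch (added for illustration; it is not part of the cited sources):

```python
from math import comb

n = 10
# probability of exactly k heads in 10 tosses of a fair coin: C(10, k) / 2^10
probs = {k: comb(n, k) / 2**n for k in range(n + 1)}

print(round(probs[5], 5))    # 0.24609 (252/1024)
print(round(probs[10], 5))   # 0.00098 (1/1024)

# two-tail: 8 or more heads, or 8 or more tails
two_tail = 2 * sum(probs[k] for k in (8, 9, 10))
# one-tail: 8 or more heads only
one_tail = sum(probs[k] for k in (8, 9, 10))
print(round(two_tail, 4), round(one_tail, 4))   # 0.1094 0.0547
```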
According to the current paradigm, an observation is deemed "statistically significant" (test hypothesis rejected, null hypothesis given consideration) if the p-value is less than 0.05; an observation is deemed "not significant" (test hypothesis "not rejected" or accepted) if the p-value is greater than 0.05. David Egilman et al., Proving Causation: The Use and Abuse of Medical and Scientific Evidence Inside the Courtroom - An Epidemiologist's Critique of the Judicial Interpretation of the Daubert Ruling, 58 Food & Drug L.J. 223 (2003).
See e.g. Neil B. Cohen, Confidence in Probability: Burdens of Persuasion in a World of Imperfect Knowledge, 60 N.Y.U. L. Rev. 385 (1985)(hereinafter Cohen, Confidence in Probability).
Some courts have failed to understand the limited meaning of a failure to find a statistically significant result. One court stated, If a significant difference is found, the null hypothesis is rejected. If a significant difference is not found, the null hypothesis is accepted. Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 722 (Tex. 1997). That is incorrect; if a significant difference is not found, the null hypothesis is simply not rejected. We cannot say whether the null hypothesis is true.
See e.g. Marcel C. Garaud, Comment, Legal Standards and Statistical Proof in Title VII Litigation: In Search of a Coherent Disparate Impact Model, 139 U. Pa. L. Rev. 455, 466 (1990)(stating that In fact, courts have applied a 95% confidence coefficient corresponding to a 5% significance level cut-off in disparate impact cases.); Barnett, An Underestimated Threat to Multiple Regression Analyses Used in Job Discrimination Cases, 5 INDUS. REL. L.J. 156, 168 (1982) (The most common rule is that evidence is compelling if and only if the probability the pattern obtained would have arisen by chance alone does not exceed five percent.).
430 U.S. 482 (1977).
Castaneda, 430 U.S. at 497, n. 17 (1977). Later that year, the Supreme Court reaffirmed this statement. In Hazelwood School Dist. v. United States, 433 U.S. 299, 309 n.14 (1977), the Court, in deciding whether a school district had discriminated against African-Americans in its hiring of teachers, discussed the statistical comparison of the racial composition of the defendant school district's teaching staff with the public school teacher population in the relevant labor market. The Court reiterated its earlier comments: The Court in Castaneda noted that [a]s a general rule for such large samples, if the difference between the expected value and the observed number is greater than two or three standard deviations, then the hypothesis that teachers were hired without regard to race would be suspect. Id., quoting Castaneda, 430 U.S., at 497 n. 17.
Castaneda, 430 U.S. at 497, n. 17 (1977) (stating that, The 11-year data here reflect a difference between the expected and observed number of Mexican-Americans of approximately 29 standard deviations.).
Segar v. Smith, 738 F.2d 1249, 1283 (D.C. Cir. 1984).
See generally Kaye, Is Proof of Statistical Significance Relevant?, supra note ---.
Dobbs-Weinstein v. Vanderbilt Univ., 1 F. Supp. 2d 783, 803 (M.D. Tenn. 1998). See also Davis v. New York City Hous. Auth., 60 F. Supp. 2d 220, 239 (S.D.N.Y. 1999), aff'd in relevant part, 278 F.3d 64 (2d Cir. 2002), cert. den., 536 U.S. 904 (2002) (Courts have frequently adopted a standard of two to three standard deviations as constituting statistical significance.). See e.g. Jones v. GPU, Inc., 234 F.R.D. 82, 95 (E.D. Pa. 2005) (Standard deviation units measure statistical significance. 1.96 standard deviation units refers to the level of statistical disparity required to demonstrate legal statistical significance using a two-tailed test.); Smith v. Xerox Corp., 196 F.3d 358, 366 (2d Cir. 1999) (If an obtained result varies from the expected result by two standard deviations, there is only about a 5% probability that the variance is due to chance. Courts generally consider this level of significance sufficient to warrant an inference of discrimination.); Government v. Penn, 838 F. Supp. 1054, 1070 (D.V.I. 1993) (stating that statistically significant refers to at least a 95 percent probability.); United States v. Lansdowne Swim Club, 713 F. Supp. 785, 809 (E.D. Pa. 1989) (standard deviation of greater than two or three excludes chance); Frazier v. Consolidated Rail Corp., 851 F.2d 1447, 1452 (D.C. Cir. 1988) (The question -- the legal question -- is what degree of certainty the courts require for a prima facie case to be established. The 5% level is commonly accepted among statisticians as an acceptable degree of uncertainty); Palmer v. Schultz, 815 F.2d 84, 96 (D.C. Cir. 1987) (Statistical evidence must meet the 5% level . . . for it alone to establish a prima facie case under Title VII.); Whelan v. Merrell-Dow Pharmaceuticals, Inc., 117 F.R.D. 299, 304 (D.D.C. 1987) (Statistical evidence is admissible only if that evidence is statistically significant at the 95% confidence level.).
Anderson v. Douglas & Lomason Co., 26 F.3d 1277 (5th Cir. 1994).
Id., 26 F.3d at 1292.
The Court used the following formula to calculate the number of standard deviations:
Number of SDs = (O − NP) / √(NP(1 − P)), where
SD = standard deviations
O = Actual number of blacks who received a promotion
N = Number of workers who received a promotion
P = Probability of a black being promoted from the relevant population
Thus, (5 − (22 × .42)) / 2.30 = −1.84. Anderson, 26 F.3d at 1292 n.26.
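The court's arithmetic can be replicated in a few lines. This is only an illustrative sketch using the figures reported in the opinion; the variable names follow the formula above, not anything in the court's record:

```python
import math

# Figures from Anderson, 26 F.3d at 1292 n.26: 22 promotions, 5 of
# which went to black workers, drawn from a pool that was 42% black.
O = 5      # actual number of blacks promoted
N = 22     # total number of promotions
P = 0.42   # probability of a black being promoted from the relevant population

expected = N * P                    # expected black promotions: 9.24
sd = math.sqrt(N * P * (1 - P))     # binomial standard deviation, about 2.31
z = (O - expected) / sd             # about -1.83, matching the court's -1.84

print(round(z, 2))
```

A shortfall of roughly 1.8 standard deviations falls just short of the two-standard-deviation (roughly 5%) threshold described in the cases above, which is consistent with the court's treatment of the disparity.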
Anderson, 26 F.3d at 1292.
See Kenneth J. Rothman et al., Modern Epidemiology 211 (3d ed. 2008) (stating that "The preoccupation with significance testing derives from the research interests of statisticians who pioneered the development of statistical theory in the 20th century. These research problems were primarily industrial and agricultural, and typically involved experiments that formed the basis for a choice between two or more alternative courses of action. Such experiments were designed to produce results enabling a decision to be made, and the statistical methods employed were intended to facilitate decision making."). See generally David Egilman et al., Proving Causation: The Use and Abuse of Medical and Scientific Evidence Inside the Courtroom - An Epidemiologist's Critique of the Judicial Interpretation of the Daubert Ruling, 58 Food & Drug L.J. 223 (2003).
See C. Radhakrishna Rao, R. A. Fisher: The Founder of Modern Statistics, Statist. Sci., Vol. 7, No. 1 (1992). Indeed, Fisher is credited with coining the very word "statistic." Leonard J. Savage, On Rereading R. A. Fisher, 4 The Annals of Statistics 441, 452 (1976).
Ronald A. Fisher, The Arrangement of Field Experiments, in Breakthroughs in Statistics: Foundations and Basic Theory 83 (Samuel Kotz et al. eds., 1993).
After discussing other possible standards of significance, Fisher declared, "Personally, the writer prefers to set a low standard of significance at the 5 per cent point, and ignore entirely all results which fail to reach this level." Id.
A Type I error has also been characterized as a "false inculpation," while a Type II error is termed a "false exculpation." Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 410.
See Macey, supra note ---, 77 Va. L. Rev. at 1041 (stating that "The tradeoff, however, is that while the higher significance level reduces Type 1 errors, it also increases the probability of Type 2 errors.").
In some situations, a researcher can reduce both Type I and Type II errors simultaneously by increasing sample size. Ron N. Forthofer et al., Biostatistics: A Guide to Design, Analysis, and Discovery 240 (2007). Such an increase, however, may greatly increase the cost of an experiment. Geoffrey Keppel et al., Introduction to Design and Analysis 195 (1992). Moreover, it will frequently be impossible to alter the number of events available for evaluation. Id.
See Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 411 n.113 (stating that "It is easy to see, for any given quantity of data, that a rule of decision that decreases the likelihood of Type I (false inculpation) errors will increase the likelihood of Type II (false exculpation) errors, and vice versa."). See generally Richard A. Posner, An Economic Approach to Legal Procedure and Judicial Administration, 2 J. Legal Stud. 399, 408-15 (1973).
Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 411.
See, e.g., DeLuca v. Merrell Dow Pharm., Inc., 911 F.2d 941, 958 (3d Cir. 1990) (stating that, "Unfortunately, the relationship between type one error and type two error is not simple.").
See, e.g., Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 411 (stating that "Although the two risks are inversely related in that increasing one decreases the other, they are not simple complements -- that is, they do not add up to one.").
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 125 n.144. See also Richard Goldstein, Two Types of Statistical Error in Employment Discrimination Cases, 26 Jurimetrics J. 32 (1985) (stating that "Power is the probability of not making a Type 2 error. In other words, power is the probability of correctly rejecting the null hypothesis.").
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 121-26 & 156-59.
This is not an atypical situation. Statisticians, when they focus on Type II errors at all, are generally quite willing to risk many more Type II errors than Type I. As one text on biostatistics noted, "the value of 0.20 for [risk of Type II error] is used frequently in the literature." Ron N. Forthofer et al., Biostatistics: A Guide to Design, Analysis, and Discovery 218 (2007).
The concept that results are only likely to be extreme in one direction, i.e., hiring too few women but not too few men, is captured by the concept of a one-tailed test. See text accompanying notes ---, infra.
See John M. Dawson, Probabilities and Prejudice in Establishing Statistical Inferences, 13 Jurimetrics J. 191, 201-09 (1973). See also Richard Goldstein, Two Types of Statistical Error in Employment Discrimination Cases, 26 Jurimetrics J. 32 (1985) (finding in one example that the risk of a Type II error was almost 50% (0.4919) for a significance level of 95%). Traditionally, a twenty percent risk of a Type II error is deemed acceptable by statisticians. Michelle M. Mello, Using Statistical Evidence to Prove the Malpractice Standard of Care: Bridging Legal, Clinical, and Statistical Thinking, 37 Wake Forest L. Rev. 821, 841 n.54 (2002). See generally Cohen, supra note ---, 60 N.Y.U. L. Rev. at 410-12.
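The inverse relationship between the two error risks described in these notes can be made concrete with a standard normal-approximation power calculation. The numbers below are made up for illustration (a true selection rate of 40% tested against a null of 50% with 100 observations); they are not drawn from any cited case:

```python
import math
from statistics import NormalDist

def type_ii_risk(alpha: float, n: int, p0: float, p1: float) -> float:
    """Risk of a Type II error for a one-tailed test of H0: rate = p0
    against a true rate p1 < p0, using the normal approximation."""
    z_crit = NormalDist().inv_cdf(alpha)       # lower-tail critical value
    se0 = math.sqrt(p0 * (1 - p0) / n)         # standard error under the null
    se1 = math.sqrt(p1 * (1 - p1) / n)         # standard error under the truth
    cutoff = p0 + z_crit * se0                 # reject H0 only below this rate
    # Type II error: failing to reject H0 even though the true rate is p1
    return 1 - NormalDist(p1, se1).cdf(cutoff)

# Tightening the Type I risk from 5% to 1% sharply raises the Type II risk:
for alpha in (0.05, 0.01):
    print(alpha, round(type_ii_risk(alpha, n=100, p0=0.50, p1=0.40), 2))
```

With these numbers, cutting the significance level from .05 to .01 pushes the Type II risk from roughly 36% to roughly 63%, illustrating Goldstein's point that the Type II risk can dwarf the nominal 5% Type I risk.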
As the Texas Supreme Court declared: "We think it unwise to depart from the methodology that is at present generally accepted among epidemiologists. . . . Accordingly, we should not widen the boundaries at which courts will acknowledge a statistically significant association beyond the 95% level . . . ." Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 722 (Tex. 1997).
Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 412 (internal citations omitted).
Id.
Leonard J. Savage, On Rereading R. A. Fisher, 4 The Annals of Statistics 441 (1976).
Ronald A. Fisher, The Design of Experiments (8th ed. 1966).
Michelle M. Mello, Using Statistical Evidence to Prove the Malpractice Standard of Care: Bridging Legal, Clinical, and Statistical Thinking, 37 Wake Forest L. Rev. 821, 842 (2002).
R.A. Fisher, The Design of Experiments 18 (1935).
Margaret G. Farrell, Daubert v. Merrell Dow Pharmaceuticals, Inc.: Epistemiology and Legal Process, 15 Cardozo L. Rev. 2183, 2210 (1994). As R.A. Fisher described this world view: "A scientific fact should be regarded as experimentally established only if a properly designed experiment rarely fails to give this level of significance." Ronald A. Fisher, The Arrangement of Field Experiments, in Breakthroughs in Statistics: Foundations and Basic Theory 83 (Samuel Kotz et al. eds., 1993) (emphasis in original).
As one commentator noted, "the time frame of science is relatively open-ended."
Peter H. Schuck, Multi-Culturalism Redux: Science, Law, and Politics, 11 Yale L. & Pol'y Rev. 1, 17 (1993).
Richard Lempert, Statistics in the Courtroom: Building on Rubinfeld, 85 Colum. L. Rev. 1098, 1099 (1985). See also Lucinda Finley, Guarding the Gate to the Courthouse: How Trial Judges Are Using Their Evidentiary Screening Rules to Remake Tort Causation Rules, 49 DePaul L. Rev. 335, 364 (1999) (stating: "Indeed, epidemiology is so inherently conservative in its reluctance to abandon the null hypothesis that it is far more willing to tolerate false negatives - the rejection of a causal association when one may actually exist - than false positives - the attribution of an association when one does not exist."); Kaye, Is Proof of Statistical Significance Relevant?, supra note ---, at 1343 (stating that social scientists "adopted the methods and conventions of others who were concerned primarily with problems in biology.").
Lucinda Finley, Guarding the Gate to the Courthouse: How Trial Judges Are Using Their Evidentiary Screening Rules to Remake Tort Causation Rules, 49 DePaul L. Rev. 335, 366 (1999).
See, e.g., Michelle M. Mello, Using Statistical Evidence to Prove the Malpractice Standard of Care: Bridging Legal, Clinical, and Statistical Thinking, 37 Wake Forest L. Rev. 821 (2002) (stating "A legal adjudicator, in contrast, will treat that finding as effectively establishing the null hypothesis.").
John Allen Paulos, Innumeracy: Mathematical Illiteracy and Its Consequences 37 (1988). See also Leonard J. Savage, On Rereading R. A. Fisher, 4 The Annals of Statistics 441 (1976) (stating that "The logic of something unusual is very puzzling, because of course in almost any experiment, whatever happens will have astronomically small probability under any hypothesis. If, for example, we flipped a coin 100 times to investigate whether the coin is fair, all sequences have the extremely small probability of 2^-100 if the coin is fair, so something unusual is bound to happen.").
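Savage's arithmetic is easy to verify. As a quick sketch, any one particular sequence of 100 fair coin flips has probability 2^-100:

```python
# Probability of one specific sequence of 100 fair coin flips.
p = 0.5 ** 100
print(p)  # on the order of 1e-31: "astronomically small"

# Every possible 100-flip outcome shares this same tiny probability,
# so observing an event of probability 2^-100 is guaranteed, not unusual.
```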
See text accompanying notes ---, supra.
See David H. Kaye, Naked Statistical Evidence, 89 Yale L.J. 601 (1980). See also Richard Lempert, Analyzing the Process of Proof, 66 B.U.L. Rev. 439, 460 (1986).
Charles Nesson, The Evidence or the Event? On Judicial Proof and the Acceptability of Verdicts, 98 Harv. L. Rev. 1357, 1378 (1985). A related hypothetical is the Gatecrasher case, in which the owner of a rodeo discovers that 501 spectators did not pay to enter, while 499 did, and wants to sue all 1000 on the grounds that it is, statistically speaking, more likely than not any one of them did not pay. L. J. Cohen, The Provable and the Probable 74-76 (1977).
In the case from which the Blue Bus hypothetical was drawn, Smith v. Rapid Transit, Inc., 317 Mass. 469 (1945), a bus company that had the sole franchise for operating buses on the street where an accident occurred escaped liability when there was no evidence of wrongdoing, since the court found that private or chartered buses could also have used the street: "The most that can be said of the evidence in the instant case is that perhaps the mathematical chances somewhat favor the proposition that a bus of the defendant caused the accident. This was not enough." Id., 317 Mass. at 470.
D.H. Kaye, Apples and Oranges: Confidence Coefficients and the Burden of Persuasion, 73 Cornell L. Rev. 54, 56 (1987). Judge Richard Posner agrees that there should be a negative inference from the lack of supporting evidence, and also contends that the benefit of trying those few cases for which other evidence could not be readily obtained is marginal. Richard A. Posner, An Economic Approach to the Law of Evidence, 51 Stan. L. Rev. 1477, 1508-10 (1999).
"The only sensible way to understand the hypothetical is that it presents the question of what should be done when this is all the evidence there is." Ronald J. Allen, A Reconceptualization of Civil Trials, 66 B.U. L. Rev. 401, 412 (1986).
See Craig R. Callen, Adjudication and the Appearance of Statistical Evidence, 65 Tul. L. Rev. 457, 470 (1991) (stating that "Assuming that all the evidence except one piece has been destroyed without anyone's fault, a factfinder can still wonder whether that one piece of evidence, if weak, is sufficient to support a verdict.").
See, e.g., Callen, supra note ---, 65 Tul. L. Rev. at 471 (stating, "Courts often hold statistical evidence sufficient to support a verdict. For example, judges rely heavily on statistics in title VII cases, and no one seriously questions their sufficiency. The question is whether, and under what circumstances, statistics are a sufficient basis for a verdict or for a refusal to enter summary judgment.").
See, e.g., Wards Cove Packing Co. v. Atonio, 490 U.S. 642 (1989); Griggs v. Duke Power Co., 401 U.S. 424 (1971). Title VII of the Civil Rights Act of 1964, 78 Stat. 253, as amended, 42 U.S.C. § 2000e-2, states that it is an unfair employment practice for an employer "to discriminate against any individual with respect to his compensation, terms, conditions, or privileges of employment, because of such individual's race, color, religion, sex, or national origin" or "to limit, segregate, or classify his employees or applicants for employment in any way which would deprive or tend to deprive any individual of employment opportunities or otherwise adversely affect his status as an employee, because of such individual's race, color, religion, sex, or national origin."
Griggs v. Duke Power Co., 401 U.S. 424, 431 (1971).
118 U.S. 356 (1886).
Id., 118 U.S. at 373.
364 U.S. 339 (1960).
Id., 364 U.S. at 341 (1960)(emphasis added).
Int'l Bhd. of Teamsters v. United States, 431 U.S. 324, 342 n.23 (1977). Ironically, the facts of Int'l Bhd. of Teamsters did not actually involve a zero, just a large disparity. Of the 2,919 whites who held driving jobs, 1,802 (62%) were higher paid line drivers and 1,117 (38%) were city drivers; of 180 African-American and Spanish-surnamed Americans, only 13 (7%) were line drivers and 167 (93%) were city drivers. Id.
Note, The Inexorable Zero, 117 Harv. L. Rev. 1215, 1216 (2004).
Capaci v. Katz & Besthoff, Inc., 711 F.2d 647, 662 (5th Cir. 1983), cert. denied, 466 U.S. 927 (1984).
Id. In a slightly bizarre analysis, the court distinguished judges from "the noble theoretician predicting the collisions of weightless elephants on frictionless roller skates [for whom] zero may be just another integer." Id.
Id.
The gatecrasher story, see note --- supra, can be similarly amended: "Suppose in the gatecrasher hypothetical that the operator of the rodeo testified that a particular defendant did not buy a ticket. He knows this, he asserts, because the defendant looks unusual to the operator, the operator sold all the tickets himself, and he would have remembered such an unusual character." Ronald J. Allen, A Reconceptualization of Civil Trials, 66 B.U. L. Rev. 401, 415 (1986).
See text accompanying note --- supra.
Santosky v. Kramer, 455 U.S. 745, 766 (1982).
Santosky v. Kramer, 455 U.S. 745, 790 (1982) (Rehnquist, J., dissenting). Rehnquist referred to such a situation as occurring when "the social disutility of error in either direction is roughly equal." Id.
Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 413-14.
Judges must recognize the difference between the seeming neutrality of numerical analysis and the value judgments that may lie beneath the surface: "The mechanical quality of the hypothesis test itself may seem to ensure objectivity, but unless the selection of the significance level is also objective and sensible, this seeming objectivity is illusory." Kaye, Is Proof of Statistical Significance Relevant?, supra note ---, at 1354.
Kaye and Freedman, Reference Guide on Statistics, supra note ---, at 118.
Merrell Dow Pharms., Inc. v. Havner, 953 S.W.2d 706, 722 (Tex. 1997).
Alexander M. Capron, Daubert and the Quest for Value-Free Scientific Knowledge in the Courtroom, 30 U. Rich. L. Rev. 85, 86 (1996). See also Lucinda Finley, Guarding the Gate to the Courthouse: How Trial Judges Are Using Their Evidentiary Screening Rules to Remake Tort Causation Rules, 49 DePaul L. Rev. 335, 366 (1999) (stating that "The tort system should not slavishly follow the values of epidemiology because its purposes and social functions have always included a justice role that is broader in scope than whether scientists have arrived at a conclusion, or whatever happens to be the scientific truth consensus of the moment.").
Id.
Price Waterhouse v. Hopkins, 490 U.S. 228, 264 (1989) (O'Connor, J., concurring).
211 N.W. 913 (Wis. 1927).
Kingston, 211 N.W. at 915. One commentator has described the holding in Kingston as an attempt by the court to devise "an evidence rule that would balance the competing claims. The court protected individualistic values by retaining the defendant's right to be free from liability unless there was proof of causation. Status concerns were protected by switching to the negligent defendant the burden of establishing the cause of the other fire." Lawrence W. Kessler, Alternative Liability in Litigation Malpractice Actions: Eradicating the Last Resort of Scoundrels, 37 San Diego L. Rev. 401, 461 (2000).
607 P.2d 924 (Cal. 1980).
Sindell, 607 P.2d at 936.
Id., 607 P.2d at 925.
Id., 607 P.2d at 926.
Id., 607 P.2d at 936.
Id.
A shift in burdens of proof has also occurred in the criminal context. The Supreme Court has held that the prosecution need only prove the factors that go into sentencing decisions by a preponderance of the evidence. McMillan v. Pennsylvania, 477 U.S. 79, 91-92 (1986). Similarly, in United States v. Watts, 519 U.S. 148, 157 (1997), the Court held that an acquittal on the charge of using a firearm during a drug offense did not preclude the judge, during the sentencing phase, from determining, by a preponderance of the evidence, that the defendant did, in fact, use such a weapon during a drug offense. As has been discussed, this means that the risk of the defendant suffering an erroneously increased sentence is treated as no more serious than the risk of the defendant enjoying an erroneously decreased sentence. Normally, the law considers the harm of a wrongful conviction as far greater than the harm of an erroneous acquittal. That calculus shifts in the sentencing phase because "criminal sentencing takes place only after a defendant has been adjudged guilty beyond a reasonable doubt." McMillan, 477 U.S. at --- n.8.
Int'l Bhd. of Teamsters, 431 U.S. at 359.
Id. See also McKenzie v. Sawyer, 684 F.2d 62, 77 (D.C. Cir. 1982) (stating that "all doubts are to be resolved against the proven discriminator rather than the innocent employee.").
Int'l Bhd. of Teamsters v. United States, 431 U.S. at 362.
Id. See also Davis v. Coca-Cola Bottling Co. Consol., 516 F.3d 955, 964-966 (11th Cir. 2008) (stating that "Because the court's finding of a pattern or practice changed the position of the employer to that of a proved wrongdoer, each class member seeking redress may rely on that finding as circumstantial evidence that the employer made the challenged employment decision with intent to discriminate.").
See text accompanying notes --- supra.
David H. Kaye & David A. Freedman, Reference Guide on Statistics, at 127.
See Ronald J. Allen, A Reconceptualization of Civil Trials, 66 B.U.L. Rev. 401, 402 (1986) (stating that "It is becoming increasingly obvious, for example, that Bayesian approaches can best be used heuristically as guides to rational thought and not as specific blueprints for forensic decisionmaking."); Richard Lempert, Analyzing the Process of Proof, 66 B.U. L. Rev. 439, 446 (1986) ("Bayes' Theorem may be useful as a heuristic device.").
17 C.F.R. 240.10b-5 (2009) states:
"It shall be unlawful for any person, directly or indirectly, by the use of any means or instrumentality of interstate commerce, or of the mails or of any facility of any national securities exchange, . . .
(b) To make any untrue statement of a material fact or to omit to state a material fact necessary in order to make the statements made, in the light of the circumstances under which they were made, not misleading . . . ."
According to the Supreme Court, the burden is on the plaintiff to show the violation or the fact that the statement was false or misleading, and that he relied thereon to his damage. Ernst & Ernst v. Hochfelder, 425 U.S. 185, 206 (1976), quoting S. Rep. No. 792, 73d Cong., 2d Sess., 12-13 (1934).
485 U.S. 224, 242 (1988).
Id., 485 U.S. at 241-42, quoting Peil v. Speiser, 806 F.2d 1154, 1160-1161 (3d Cir. 1986).
This assumes that it can be shown that the market for the securities in question was efficient. Macey, supra note ---, 77 Va. L. Rev. at 1022-28.
Id., 77 Va. L. Rev. at 1029.
For an excellent discussion of regression analysis, see Daniel L. Rubinfeld, Econometrics in the Courtroom, 85 Colum. L. Rev. at 1065-68, and John E. Lopatka & William H. Page, Economic Authority and the Limits of Expertise in Antitrust Cases, 90 Cornell L. Rev. 617, 688-94 (2005).
Macey et al., supra note ---, 77 Va. L. Rev. at 1029.
Id., 77 Va. L. Rev. at 1037.
This fact pattern is derived from Shaw v. Digital Equip. Corp., 82 F.3d 1194 (1st Cir. 1996).
Macey et al., supra note ---, 77 Va. L. Rev. at 1041. The authors do add that "there is no correct significance level, and calibrating the tradeoff is ultimately a value judgment based on the costs of incorrectly rejecting the null hypothesis." Id.
See text accompanying notes --- supra.
See Farrell, supra note ---, 15 Cardozo L. Rev. at 2211 (proposing that "For normative reasons, where the purpose of a legal decision is to award compensation for personal loss, the law should adopt a lower degree of certainty, perhaps an 80% standard of statistical significance, and thus display greater tolerance for false positives.").
Cohen, The Gatekeeping Role, supra note ---, 33 Seton Hall L. Rev. at 950. See also Richard A. Posner, An Economic Approach to the Law of Evidence, 51 Stan. L. Rev. 1477, 1504 (1999) (stating "In the typical civil trial, there is no basis for supposing that Type I errors (false positives, such as convicting an innocent person) on average impose higher costs than Type II errors (false negatives, such as an erroneous acquittal).").
Cohen, Confidence in Probability, supra note ---, 60 N.Y.U. L. Rev. at 417. See also John M. Dawson, Probabilities and Prejudice in Establishing Statistical Inferences, 13 Jurimetrics J. 191, 209 (1973).
See Ron N. Forthofer et al., Biostatistics: A Guide to Design, Analysis, and Discovery 218 (2007).
This story is derived from David H. Kaye, Is Proof of Statistical Significance Relevant?, 61 Wash. L. Rev. at 1338-60. Prof. Kaye's example was modeled after Moultrie v. Martin, 690 F.2d 1078, 1082 (4th Cir. 1982).
Richard Goldstein, Two Types of Statistical Error in Employment Discrimination Cases, 26 Jurimetrics J. 32 (1985). As Goldstein notes, "the true rate at which one group is being hired may be six percent higher than another (a simple alternative hypothesis), or it may be four percent higher (another simple alternative), or five, or seven, or eight percent higher, and so on. The vaguer alternative hypothesis of some degree of difference or disadvantage is a composite of such simple, well-specified, hypotheses." Id.
David H. Kaye, Is Proof of Statistical Significance Relevant?, 61 Wash. L. Rev. at 1357.
Id., 61 Wash. L. Rev. at 1360.
Keith Sharfman, Valuation Averaging: A New Procedure for Resolving Valuation Disputes, 88 Minn. L. Rev. 357, 366 (2003).
Even if a judge were reluctant to alter the usual .05 significance level, such an equalization of risks would be appropriate if there were some other evidence of discriminatory actions by the jury commissioners. For example, as the court in the case from which this example was drawn stated, "Additional evidence that could have supported the petitioner's case would include statistics showing that the jury commissioners exempted a disproportionate number of blacks during their proceedings." Moultrie, 690 F.2d at 1085.
There might be strategic reasons for the company to select some small level of price increase, if the use of that small increase created a better chance of the null hypothesis not being disproved.
Richard Goldstein, Two Types of Statistical Error in Employment Discrimination Cases, 26 Jurimetrics J. 32 (1985).
See, e.g., id., 26 Jurimetrics J. at --- (stating that "For some forms of statistical analysis, the calculation of power is very complicated and/or involves a noncentrality parameter (measure of the extent of difference in the populations) that has no obvious or intuitive interpretation.").
See text accompanying notes ---, supra.
"The practical difference between one-tailed tests and two-tailed tests is that the P-value produced by a two-tailed test is usually twice as great as that produced by a one-tailed test." Stender v. Lucky Stores, 803 F. Supp. 259, 323 (N.D. Cal. 1992).
As Richard Goldstein wrote, "A one-tailed test is more powerful." Richard Goldstein, Two Types of Statistical Error in Employment Discrimination Cases, 26 Jurimetrics J. 32 (1985).
While we know that the probability of a Type II error will decrease with a one-tailed analysis, it is impossible to say by how much. See text accompanying notes ---, supra. Also, even after the decrease there would still likely be a larger probability of a Type II error than of a Type I error.
See Palmer v. Shultz, 815 F.2d 84, 96 (D.C. Cir. 1987) (stating that "a two-tailed test and a 5% probability of randomness require statistical evidence measuring 1.96 standard deviations. Consequently, if plaintiffs come into court relying only on evidence that the underselection of women for a particular job measured 1.75 standard deviations, it seems improper for a court to establish an inference of disparate treatment on the basis of this evidence alone.").
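The arithmetic behind Palmer's two numbers can be checked directly: 1.75 standard deviations would be significant under a one-tailed 5% test but not under the two-tailed test the court required. A sketch using the normal distribution (in this simple symmetric setting the two-tailed p-value is exactly double the one-tailed one):

```python
from statistics import NormalDist

def p_values(z: float) -> tuple[float, float]:
    """One-tailed and two-tailed p-values for a z-score (standard normal)."""
    one_tailed = 1 - NormalDist().cdf(z)
    return one_tailed, 2 * one_tailed  # two-tailed doubles the one-tailed p

for z in (1.75, 1.96):
    one, two = p_values(z)
    print(f"z={z}: one-tailed p={one:.3f}, two-tailed p={two:.3f}")
```

At 1.96 standard deviations the two-tailed p-value is just about .05; at 1.75 it is roughly .08 two-tailed but only about .04 one-tailed, which is why the choice of tails can decide a case like Palmer.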
Moore v. Summers, 113 F. Supp. 2d 5, 20 (D.D.C. 2000).
Keyes v. Sch. Dist., 413 U.S. 189, 209 (1973).
Stender v. Lucky Stores, 803 F. Supp. 259, 323 (N.D. Cal. 1992). See also United States v. Delaware, 2004 U.S. Dist. LEXIS 4560, n.27 (D. Del. Mar. 22, 2004) (stating that for a one-tailed test to be appropriate, "one must assume . . . that there will only be one type of relationship between the variables.").
Kingston, 211 N. W. at 915.
People v. Collins, 438 P.2d 33, 33 (Cal. 1968).
In Daubert, then-Chief Justice Rehnquist complained that a requirement that judges evaluate scientific validity imposes on them the obligation "to become amateur scientists." Daubert, 509 U.S. at 600-01 (Rehnquist, C.J., concurring in part and dissenting in part).
David H. Kaye, Is Proof of Statistical Significance Relevant?, 61 Wash. L. Rev. at 1347.
hK6aJhF6hK6aJhF6hKaJhO5hK hK6h@hKjhK0JUh0ThKH*aJh0ThKaJ
hKaJhKhthK6hthK+77777798U8[8)9D9\9c92:4:7:I:J:K:V:::;;;
<<<<<<<<<<<==== ==Q=M>N>O>P>Y>a>f>x>~>>???횓hA }hKh+hKh+hK6hdhK6hdhKhh{hK6hh{hK hK6hHhK6h^ihKhHhKh^ihK6jhK0JUhKhRhK5hRhK7???@,@C@@@@@@BBBBC"C'C)C*C+C>C@CACBCECVCXCYChCmCoCpCqCtCCCCCCCCCCCDD^DDDDDE
E&Eh7_hKhyZhKhyZhK6h~hKh~hK6hthKhhK hK6h5hKh5hKh5hK6jhK0JUhO5hK56hO5hK6hO5hKhK6XCoCCJEEEFFHHHIJKLaLNN>O[OQQQ
R%R=RCRV
gdKgdK&EFEIEJEKELEhEEEEEEEEEEEEEFFFFFFFFFFFG G$GXH[HlHHHHHHHHHHHHHHHHHHII"I#I-I@Ih%hKhhK6h\NhKhi
hK6hyphK6h+#hKhSS@hK6hSS@hKhzQhKhzQhK6h7_hK6jhK0JUh7_hK hK6hK:@IIIIIIIIIIIJJJJJ J%J}KKKKKKKKKKKKLLLLLLL>L_LaLbLcLxLLLLL㫤Ȕㅍ~v~hq.hhK6hq.hhKhPhK6hPhKhIhK6h.hK6h.hKhphKh@LhKhphKh+[hK6 hK6h+[hKh*hK6h*hKhKjhK0JUh%hK6h%hK.LMMMMMMNENNNNNNNNNNNN
OO=O>O?ONOYO[O\O]OeOqOyOOOOOOQQ3QQQQQQQQQQQQQQQ㮶㮧㧟㐉h8 jhKh8 jhK6hhKhOhK6hOhKhthK6hthKhPhKhPhK6hCfhK6 hK6jhK0JUhKhfhK6]hfhK6hfhK6QQQQ
RRRR%R&R'R*R=R>R?RBRCRDRERS S9SeS{STVVVVVVVVVVVVVVW%W6WWWWWWWWWWWWWWWWW
XnYoYpYqYtYYYYYY쿸hzhKh4z
hKh4z
hK6h0hK6h+hK6h+hKh0hKhhK hK6hKjhK0JUDVVWWoYYY[[\g]^^___`%aDaabbdefgg
gdK
gdKgdKYYYYYYYYYYZ[&[D[H[h[|[[[[[[f]g]h]]"^=^W^X^^^^^^^^^^^^^_____t_z___________hZFhKh/hKh5.hKH*h5.hK6h5.hKh?UhK6h?UhKh/6hKh^IhK6h^IhK hK6jhK0JUhKh.hK6h.hK:___` `=`?`Q````````````aaaa#a%a&a'a*a,a;a=a>aDaEahaaaaaaaaaaaaaaaaa|bbbbbbbbbbbbbbbb»hi
hKhi
hK6h;hKh$hKh$hK6jhK0JUhZFhK6hE<hK6aJhE<hKaJhZFhKhK hK6Abd d!d(d=d>dCdOdedhdndwdddeeeeeeeeff#f,f5f*@a>YEndnote ReferenceH*>@r>Y
Footnote TextPJLLYFootnote Text CharOJPJQJ^J@&@@YFootnote ReferenceH*D@DY List Paragraph
^m$6U@6Y Hyperlink>*B*phd/dYDefault1$7$8$H$-B*CJOJ QJ ^J _HaJmH phsH tH 4 @4YFooter
!>>YFooter CharOJPJQJ^J.)@.YPage Number44!YHeader
!>> YHeader CharOJPJQJ^JBB0T0CM3"dB*OJPJQJ^JphPK!K[Content_Types].xmlj0Eжr(]yl#!MB;BQޏaLSWyҟ^@
Lz]__CdR{`L=r85v&mQ뉑8ICX=H"Z=&JCjwA`.Â?U~YkG/̷x3%o3t\&@w!H'"v0PK!֧6_rels/.relsj0}Q%v/C/}(h"O
= C?hv=Ʌ%[xp{۵_Pѣ<1H0ORBdJE4b$q_6LR7`0̞O,En7Lib/SeеPK!kytheme/theme/themeManager.xmlM
@}w7c(EbˮCAǠҟ7՛K
Y,
e.|,H,lxɴIsQ}#Ր ֵ+!,^$j=GW)E+&
8PK!\theme/theme/theme1.xmlYOoE#F{o'NDuر i-q;N3'
G$$DAč*iEP~wq4;{o?g^;N:$BR64Mvsi-@R4mUb V*XX!cyg$w.Q"@oWL8*Bycjđ0蠦r,[LC9VbX*x_yuoBL͐u_.DKfN1엓:+ۥ~`jn[Zp֖zg,tV@bW/Oټl6Ws[R?S֒7 _כ[֪7 _w]ŌShN'^Bxk_[dC]zOլ\K=.:@MgdCf/o\ycB95B24SCEL|gO'sקo>W=n#p̰ZN|ӪV:8z1fk;ڇcp7#z8]Y/\{t\}}spķ=ʠoRVL3N(B<|ݥuK>P.EMLhɦM .co;əmr"*0#̡=6Kր0i1;$P0!YݩjbiXJB5IgAФa6{P g֢)҉-Ìq8RmcWyXg/u]6Q_Ê5H
Z2PU]Ǽ"GGFbCSOD%,p
6ޚwq̲R_gJSbj9)ed(w:/ak;6jAq11_xzG~F<:ɮ>O&kNa4dht\?J&l O٠NRpwhpse)tp)af]
27n}mk]\S,+a2g^Az
)˙>E
G鿰L7)'PK!
ѐ'theme/theme/_rels/themeManager.xml.relsM
0wooӺ&݈Э5
6?$Q
,.aic21h:qm@RN;d`o7gK(M&$R(.1r'JЊT8V"AȻHu}|$b{P8g/]QAsم(#L[PK-!K[Content_Types].xmlPK-!֧61_rels/.relsPK-!kytheme/theme/themeManager.xmlPK-!\theme/theme/theme1.xmlPK-!
ѐ'
theme/theme/_rels/themeManager.xml.relsPK]
W?E2
&zJ
uUNN * t!w"#Y$234669\;@ABBCE2EFI8M}MNPQUU VWY\.cdHmbrtt=yy}GKۃBq2~NIm#şР?PxbiгŷxߺK}EzYnIgty.i/4tt x&
9s3w9:U{Og0AMV
u_
>LZSE")*-./01_24{5>67C9A<=|ABCMNOP?QRyUVW9Y\]_^>_``Za{cceikmnpeqrZstuuivxzyy||}~ƀąb#@dbimRj9`٢Uc`(ΰ׳Qd^E,`bnOE&(D[#bMMAn0I
38M7k
!&')*+47_`aij}~=>?@ABCEFGHUVWXYZ[\]^_`abcdefghijklmno*T~-8]t(DmsWCh_^ &!!!"%&&'((n*,,-..3...W/Q11#2_2u4t5599-:[::);;=L>?N?@@AEB%E'FIFGHHHKKNNpPQQ~RRR TVfVXxY[p[\B^^l`cNccdd3effglh|hi%jjm%m+mZmmnqrZss!t vvvvdxxxyzY|(~ID|@1Ro&0TA" .Re :ڦΨ/(߸3mҺ6Mp
=Q%!
8z{nrNxNkqc,%}z.H
r
6;-E\7{ !N!##+$H$&&&&'*'0'++,,\...0T23334456166779:;;<>2?S??ABC7DaD`EF:GIEIvIJJJL`LcL7kn &(4
yBPKP,'#K&F*H*I*q*r*+++i./J1K1#23j44A5B569=@ECDL@CE FGGGG(GJKNORKTW>>>?????c@@>ADDEEFFjG(IJNKLLLTMMNINJNfNNNPPPQ/SoS.TsTPXyXXYFYYpZx\]]^U___`cdefg>g>jjlm*oopp8qqqrt uw2xy*z={|}&فȂO<Q&6W߈Hߋ^;ےÔϔWt:m|Y|6qҩ;֪gHV)δiƷܺXڻÿia<;W`JpAMzg9'?W*ebDWb>a34Vz5(,@2pQ%
+
akr7Z4: $!!$%&),))+-/01366667E78599|::[<<<m==t??@BBBCdEEEEEEE]JJAK~KM?MMFOmO7PQv
qPvjSb(l#(,$27>r@CGLP.WZb>di:otty}U7_ղ=EIE@5
OXu;!a&*,u37?&E@ILQY_bgmtww
"#$%&')*+,./01345,%j<eDӸ$IbH0<&j?0Ob{8A)a $XCVgww!(-26!!XXX|}6}ߎ!cLXXX8@0(
B
S ?w{
MTQZOT")XXtXyX[[ddeellRmWmqqvqyyӈڈ)EJ?Dpu#_d=JAGNT27 L2T2@@pp̔ؔSZ?Min7<sxCK F!M![!b!&#)#####y$$$$@%F%))++,,!,&,--<.C.....H/M/#0(01133!4'4.454444455T5[5t5{55566.757778&8s9{9994:;:;;<<<<?@m@r@AAD$DMEWEEEFFrG~GKKLLMM(N-NXXXXXXY&YYY3[9[[[\\&]-]b]f]]]^^^_e_______D`H`aaabcc!d(dMeSeTeZeeeff gg@gFgkk;lDl+p5pAqKqssttuuyy3z=z>{H{}}}}(.bh
ς+/io>Eąʅׅ|~ˏՏ'
(ZajlMTY_?D/7}ٟpy~IM7:xyIP}_f;As|!gp7?дٴ۴ktBGr{|5>)0Ϳݿktuz|<C{X]hoT[U\;Fiq;Drz>Diuw|IP[alr=C>CIP P
W
9Bcl $ -!3!;!F!G!L!w!|!!!!!&&))))6*:*d0w00001112222339979=9H9P9<(<<<????@&@EEMMNNNNOOQQzR~RRRS SSSSSTTTTMURUWWWZ`Z]]^^lbybccf
g2h9hiidjkjk5k8k)
uz
MTz#
#O$Y$11BBMZNOO\\^^4e6e#f%fllholorr2y:yzz{{wǇÈш˖(0#OYŢȬ]dp7>)7uH /mtmxKW{/8R<%%44R6d6X;f;;<@< F'F^LaLNN`{}{}}6IlxP%a~8Cn{09äܤ57V_-HWVe*MZ)>BE
1;hl
70 E H X%_%G)L)**++4,9,B-J-d.i.q////1-1r1111444444/5q555666!7$7G7J7b7e7z7(8j8k88::k:|:::::;<F<>>?!?R@b@r@z@@@@@BBCCDNDcDjDDDFFJJ#K*KKKVMMMMJNLNNNNN{O}OPPPQQSQtR~RoSsS0TrTUUWLWRXxX{XXXXXX(Y*YYYYZ^ZoZ[:[B[[[[[[ \d\e\w\]^j^S_W______a=aHa[acddelese+ffff@g\gghjjjjk,k5l:lFlHlmmmmmmqqr"rttuuuvwwayyyy{}}~"~([iwہ46Qۃv#H(5ه"J^ċދMN/;R]ǐ^БґՑŔΔYsĕŕٕ 9<bcʘ=FalV-!~X[{)38^_hv| ԩ[7@hŷȷ߷"
7ۺIWܻ+y&'¿@F(*UV:=VYG_3p-5;hHM&)>AV| )B %+:q{ S^!ak1;{R\.U[chq=B
?{
x}&+7=lmgir
$lu
6@<=M`mqAMmmL\Y&
')+)++112233-444\4e4666666777D7R7T7777748@889R9W9::a<r<<<<<a=i=O?Y???BCCCDDfEEEEEEEEFFMJTJjJlJNKPKKKAMM,P6PQR'R-R>RWRNSjSSSTT&U,UrVW*XgX [%[]]]^t``bccccd'd7d-f9fgggghhh/h2hWhXhrhiifiiiiijkjqjjjk5k8k::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::89Nh.^`OJPJQJ^J)
^`hH.
pL^p`LhH.
@^@`hH.
^`hH.
L^`LhH.
^`hH.
^`hH.
PL^P`LhH.89j8 Kkk5k8k@Xij>?7k@r@H@D@Unknown
GTimes New Roman5Symbol3Arial? Courier New7CourierKMTimesNewRomanPSMT7CambriaSMBemboTimes New Roman7Calibri;PHelvetica qhFF&UOkʿk
Michael Meyerson