Curving University Grades
My advisor wrote a document arguing that the only grade that should exist is an A. It will give you something to think about, even if you disagree.
Along these lines, I have been considering what I see as one of the major shortcomings of my experience at MIT: the curving of grades (at least in the EECS department). The basic grading algorithm assumes a Gaussian distribution and measures the mean of the grades [tex]\mu[/tex] and their standard deviation [tex]\sigma[/tex]. This gives the graders a simple, parameterized method for dividing the students into performance categories, based on how much mass is presumed to lie under each part of the Gaussian:
A: [tex]\mu + \sigma < x[/tex]
B: [tex]\mu < x \le \mu + \sigma[/tex]
C: [tex]\mu - \sigma < x \le \mu[/tex]
F: [tex] x \le \mu - \sigma[/tex]
Presumably, it is hard to get into MIT. We might imagine that there is some metric that MIT uses for admission and that the general population has a Gaussian distribution for the metric. Now, for admission, MIT selects a small tail of this distribution that we will parameterize as [tex]x > \alpha\sigma[/tex]. To simplify things further, we will assume a unit variance and zero mean (a standard normal), so the cutoff is simply [tex]x > \alpha[/tex].
With the grading scheme described above, the basic assumption is that this selected group is itself distributed as a Gaussian. This is clearly false: if the admission metric correlates significantly with the metric upon which grading is based (a very reasonable assumption in my opinion), then we only have the tail of a Gaussian. The question that we want to answer here, then, is this: given a Gaussian-based attempt to distribute students into these grading categories, how well does it work when the students were first sampled from a Gaussian tail? Obviously it will be broken, but how badly? Will more students get A’s than should, or more students get F’s? Let’s figure it out.
There are lots of reasons why using a curved grading scheme of any kind is a bad idea, but I’m not going to talk about that here. Rather, let’s consider the Gaussian assumption itself.
First, we want to calculate the mean and standard deviation within the tail, since these are the parameters used for distributing the students. Start with the mean:
[tex]\mu(\alpha) = \frac{1}{\Phi(-\alpha)}\int_{\alpha}^{\infty}x\Pr_{gaussian}(x)dx[/tex]
Let’s use a zero mean for the initial distribution in order to avoid ugly things like error functions – it won’t change our analysis at all. This gives us a mean for the tail starting at [tex]\alpha[/tex]:
[tex]\mu(\alpha) = \frac{1}{\sqrt{2\pi}\Phi(-\alpha)} e^{-\frac{\alpha^2}{2}}[/tex]
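As a quick sanity check on this closed form, here is a small sketch that compares it with a direct numerical integration over the tail. The function names, the integration cutoff `upper` and the step count are my own choices for illustration:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def tail_mean_closed_form(alpha):
    """Mean of a standard normal restricted to x > alpha: pdf(alpha) / Phi(-alpha)."""
    return STD_NORMAL.pdf(alpha) / STD_NORMAL.cdf(-alpha)


def tail_mean_numeric(alpha, upper=10.0, steps=20_000):
    """Midpoint-rule estimate of the same mean, integrating from alpha to `upper`.

    Assumption: the mass beyond `upper` is negligible for the alphas of interest.
    """
    h = (upper - alpha) / steps
    total = 0.0
    for i in range(steps):
        x = alpha + (i + 0.5) * h
        total += x * STD_NORMAL.pdf(x)
    return total * h / STD_NORMAL.cdf(-alpha)


if __name__ == "__main__":
    print(tail_mean_closed_form(1.0), tail_mean_numeric(1.0))
```

The two values should agree to several decimal places, which is a cheap way to catch a dropped normalization factor.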
Calculating the variance takes one more step. The tail does not have zero mean, so the variance is the mean of the square minus the square of the mean. First, the mean of the square:
[tex]E[x^2 \mid x > \alpha] = \frac{1}{\Phi(-\alpha)}\int_{\alpha}^{\infty}x^2\Pr_{gaussian}(x)dx[/tex]
Not pretty:
[tex]E[x^2 \mid x > \alpha] = \frac{1}{\sqrt{2\pi}\Phi(-\alpha)}\left(\sqrt{\frac{\pi}{2}} + \alpha e^{-\frac{\alpha^2}{2}} - \sqrt{\frac{\pi}{2}}\text{erf}\left(\frac{\alpha}{\sqrt{2}}\right)\right)[/tex]
Since [tex]\sqrt{\frac{\pi}{2}}\left(1 - \text{erf}\left(\frac{\alpha}{\sqrt{2}}\right)\right) = \sqrt{2\pi}\Phi(-\alpha)[/tex], this simplifies to [tex]1 + \alpha\mu(\alpha)[/tex], and the variance of the tail is:
[tex]\sigma^2(\alpha) = 1 + \alpha\mu(\alpha) - \mu(\alpha)^2[/tex]
To make things more concrete, let’s plug in [tex]\alpha = 1[/tex], which is to say that the tail only includes people more than one standard deviation above the mean (the top 16% of the population). In this case:
[tex]\mu(1) \approx 1.525[/tex]
[tex]\sigma(1) \approx 0.446[/tex]
So if we now imagine a Gaussian with these parameters and use it to determine the boundaries between grades using the algorithm above, we find:
A: [tex]1.971 < x[/tex]
B: [tex]1.525 < x \le 1.971[/tex]
C: [tex]1.079 < x \le 1.525[/tex]
F: [tex] x \le 1.079[/tex]
The total mass under the tail is 0.1587, so we can use this to normalize the fraction in each of these regions.
A: 0.15
B: 0.25
C: 0.48
F: 0.12
Compare these to the proportions for a true Gaussian with this same grading algorithm:
A: 0.16
B: 0.34
C: 0.34
F: 0.16
The good news is that the top and bottom bands come out close to what the curve intends: about 15% A’s and 12% F’s against a nominal 16% each. The bad news is in the middle. The density of the tail falls off as [tex]e^{-\frac{x^2}{2}}[/tex] from the cutoff, so most of its mass sits just above [tex]\alpha[/tex] and below the tail’s own mean. Nearly half of the class lands in the C range and only a quarter in the B range, so about 60% of the students end up with a C or worse rather than the intended 50%.
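For anyone who wants to try other cutoffs, here is a minimal sketch of the whole calculation. It assumes the spread of the tail is measured as a central standard deviation (mean of the square minus the squared mean) and uses the standard identity that the mean of the square over the tail equals 1 + alpha * mu. The function name and the returned dictionary layout are hypothetical:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def tail_grade_fractions(alpha):
    """Fraction of an admitted tail (x > alpha) landing in each curved grade band."""
    tail_mass = STD_NORMAL.cdf(-alpha)            # P(x > alpha)
    mu = STD_NORMAL.pdf(alpha) / tail_mass        # mean within the tail
    second_moment = 1.0 + alpha * mu              # mean of x^2 within the tail
    sigma = (second_moment - mu * mu) ** 0.5      # central standard deviation

    def frac_above(x):
        # Share of the tail above x; thresholds below alpha capture everyone.
        cut = max(x, alpha)
        return STD_NORMAL.cdf(-cut) / tail_mass

    above_a = frac_above(mu + sigma)
    above_b = frac_above(mu)
    above_c = frac_above(mu - sigma)
    return {
        "mu": mu,
        "sigma": sigma,
        "A": above_a,
        "B": above_b - above_a,
        "C": above_c - above_b,
        "F": 1.0 - above_c,
    }


if __name__ == "__main__":
    for key, value in tail_grade_fractions(1.0).items():
        print(f"{key}: {value:.4f}")
```

Calling it with a larger `alpha` models a more selective admission cutoff under the same assumptions.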
Obviously, this is not an especially accurate picture of life as an EECS student at MIT: real classes hand out A’s and F’s in proportions that this toy model does not predict. What I hope it illustrates, however, is the consequence of using a Gaussian to curve a population that has already been selected using a correlated metric.