
College major peer effects and attrition from the sciences


This paper examines how peer quality within distinct college majors affects graduation rates and major persistence. To mitigate the selection problem, we control for school-specific fixed effects, as well as very flexible application-admissions pattern fixed effects. Non-science peer quality appears to have a positive effect on both the likelihood that a student chooses a science major and on his or her cumulative GPA. Conversely, students who attend campuses with stronger peers in the sciences are less likely to graduate with a science degree. Weaker, non-minority students typically react to stronger peers in the sciences by shifting majors. Under-represented minorities tend to persist in the sciences regardless of peer quality, but in more competitive programs they suffer – often substantially – in terms of college grades and the likelihood of graduating.

1 Introduction

Science, Technology, Engineering, and Mathematics (STEM) training is an important driver of economic growth and innovation in the United States. Not surprisingly, there is great public interest in promoting policies that improve the breadth and depth of STEM skills in the American workforce. Undergraduate education is a critical part of the STEM pipeline. Some studies suggest that college students may benefit by attending academic institutions with stronger peers. However, the peer dynamics found in university-level STEM courses may differ sharply from those in the academic settings commonly evaluated in the peer effects literature. In STEM classes, rigid grading curves are prevalent and peer competition is often fierce. This paper is one of the first studies to empirically test how the competitive environment found in university-level STEM courses mediates the influence of peer quality on the college outcomes of intending science majors.

The effect of peer quality on STEM achievement is also of great interest for the affirmative action debate (Elliott et al. 1996). African-Americans are especially underrepresented in STEM fields, even though they are even more likely than whites to express interest in a science career as high school seniors (Astin and Astin, 1993). Young whites are nearly three times as likely as blacks to achieve a bachelor's degree in a STEM field and nearly seven times as likely to achieve a doctorate in the sciences (author’s calculation from US Census Bureau 2003). A central argument in favor of affirmative action policies is that such policies potentially benefit minority students by exposing them to stronger peers. However, a variety of studies have suggested that large preferences may have counter-productive effects (Sowell 1972; Loury and Garman 1995). In particular, they may cause science-interested minorities to shift out of undergraduate STEM fields because of the difficulties of competing with students with stronger academic preparation. If STEM training benefits from positive peer externalities, then one would expect affirmative action to promote diversity within the scientific workforce. Conversely, affirmative action policies would potentially have the perverse effect of undermining diversity in the sciences if net peer effects are negative in university-level STEM courses. To the extent that minorities that benefit from affirmative action have skill deficits relative to their university peers (i.e., they are mismatched), such relative deficits may exacerbate any negative peer effects in the sciences or negate any positive average peer effects. Therefore, our paper additionally aims to evaluate any mismatch effects on STEM attrition and to better understand how peer quality influences the major choice and final college outcomes of minorities pursuing STEM majors.

In our analysis, we take advantage of a very large dataset covering all freshmen enrollees at the eight undergraduate campuses of the University of California over a nine-year period.1 These campuses have many functional similarities – they are all public, they are similar in size, they have common entrance requirements and a common administration – but they embrace a wide range of student competitiveness. At UC Berkeley, the median student has SAT scores that place her at the 91st percentile of all American students taking the SAT; at the least elite UC campuses, the median student has SAT scores that place her at the 62nd percentile. While all UC students are academically stronger than the average American college student, this range of entering credentials is much broader than that observed in most other studies of peer effects.

Because of the extremely rich data we have on application and admission patterns across all eight campuses, we are able to effectively address selection problems by building upon and improving on the strategy used by Dale and Krueger (2002) in their well-known study of the effects of college eliteness on earnings. Specifically, we control for very flexible application-admissions pattern fixed effects to account for student unobserved characteristics, as well as school-specific fixed effects to account for typically unobserved institutional characteristics that are plausibly correlated with peer quality and student outcomes. We find strong support for the role of peer effects, and significant support for the mismatch effect as it is usually defined. We also suggest issues to pursue in future research.

2 Related literature

Because of concerns about potential selection bias resulting from non-random college enrollment patterns, research on peer effects in higher education has tended to examine particular academic settings where certain peer groups are randomly or quasi-randomly assigned. Using data from a mid-sized public university in southern Italy, Brunello et al. (2010) find that students with academically stronger roommates achieved significantly higher grades in the hard sciences; they also found no measurable peer effect on grades achieved in the humanities and social sciences. Similarly, Carrell et al. (2009) find positive academic peer effects resulting from random assignment to squadrons at the United States Air Force Academy, with particularly notable results in math and science courses. Other studies that have looked at roommate and residential peer groups, such as Sacerdote (2001) and Foster (2006), find no or limited effects at other academic institutions.

A necessary limitation of studies that focus on roommates or smaller peer groups is that they may not provide us with relevant information that would allow us to project how an individual’s outcomes would differ if his or her entire cohort of peers were to change (as would result from a change in college attended). Specifically, the peer dynamics that occur between roommates and within smaller college peer groups are potentially different from those that occur at the more macro peer group level. In our analysis we focus on major groupings as the relevant peer groups that determine persistence in the sciences.

The general literature that looks at attrition from STEM majors also suggests that peer dynamics may function differently in university-level science courses than in other academic settings. Tobias and Lin (1991) note that introductory STEM courses typically have competitive, rather than cooperative, learning environments. Lipson and Tobias (1991) argue that the prevalence of curve-grading systems in STEM classes leads to peer competition that is often aggressive and cutthroat. As a result, students with high academic ability who might otherwise thrive in the sciences are pushed out. Others, who reject the culture of competition found in the sciences, opt out. Ours is one of the first studies to empirically test whether the competitive environment found in university-level STEM courses fundamentally impacts how peer quality affects student outcomes.

In considering policies that might promote the production of science graduates, one should be particularly interested in the likely outcomes of those students who are on the margin of successfully obtaining a STEM degree or failing to do so. Ost (2010) examines longitudinal data from a single large elite research university and finds that the students who are less likely to persist in the sciences particularly benefit from taking courses with more persistent science peers. Carrell et al. (2009) also find that students with lower academic ability particularly benefited from being assigned to a squadron with stronger peers. However, in a subsequent follow-up study, Carrell et al. (2013) deliberately assigned students to control and treatment groups, with the treatment groups aimed at maximizing the academic success of students arriving with the weakest academic preparation by pairing them with stronger peers. Contrary to expectations, the low-preparation students assigned to the treatment group “avoided the peers with whom we intended them to interact and instead formed more homogenous sub-groups,” and had worse academic outcomes than similar students assigned to the control group. Several studies that have focused on the mismatch between a student’s own ability and that of her peers also find that weaker STEM students have worse academic outcomes when paired with stronger peers.2 In our analysis, we also consider whether the peer ability of different major groups has heterogeneous effects by student ability.

Studies that attempt to identify average peer effects or heterogeneous peer effects for lower ability (potentially mismatched) students face the strong challenge of dealing with endogeneity resulting from the college admission and enrollment process. Specifically, since students enrolled at the same campus faced the same admission process, individual observed and unobserved ability is likely to be highly correlated with peer ability. The key contribution of our work is that we are able to take advantage of a rich dataset that allows us to take a number of steps to ensure that our findings are not contaminated by selection bias. Furthermore, this data allows us to examine major group peer effects in a broader range of academic institutions.

3 Data

The University of California Office of the President (UCOP) maintains extensive and largely consistent databases on every student that has enrolled at the university since 1992. In 2007, a group of economists approached UCOP about studying the effects of Proposition 209 (which banned race preferences at the university for cohorts admitted in 1998 and later) upon academic outcomes (Arcidiacono et al. 2012). In 2008 and 2010, UCOP released a public-use version of its database. Most relevant for our purposes, the dataset contains nearly forty variables on every freshman applicant to a UC campus from 1995 through 2003. Among the variables are the identity of each campus to which a student applied, which campuses admitted the student, and whether (and where) the student enrolled. The data also includes information on each student’s planned field of study and a range of information about high school and standardized test performance. For all enrollees, the dataset includes information on college grades, final field of study, and time to graduation (if the student graduated).

For privacy reasons, UCOP collapsed student observations in a variety of ways. SAT scores are reported in forty-point ranges, for example, and college GPAs are reported in one-tenth point increments. However, this data does include an exact academic index score, which UCOP constructed as a linear combination of each student’s high school GPA, SAT I verbal, and SAT I math scores based on pre-assigned weights. A potential issue with this index score is that the way that it weights these three measures of high school credentials is not necessarily appropriate for an investigation of the determinants of success in the sciences.3 Therefore, we use these two sets of information on high school credentials to impute precise high school GPA, SAT verbal, and SAT math scores and use these imputed variables in the analysis that follows.4 Details on the methodology used to impute high school grades and SAT scores are reported in the Appendix.

Another data issue, also arising from UCOP’s privacy concerns, is the grouping of student observations into three-year cohorts (the cohorts of interest here are freshmen entering in 1995-1997, 1998-2000, and 2001-2003). The measures of peer characteristics that we use in our analysis thus pertain to these three-year cohorts instead of more refined entering-class cohorts. If the relevant peer group is the entering-class cohort instead of the three-year cohort, this aggregation will produce measurement error in our peer group measures. However, because our peer measures are based on cohort averages, OLS estimates will remain consistent when using the three-year cohort aggregates.5 The standard errors of our estimated peer effects, of course, will be larger than in the case where entering-class cohort measures were available.
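The consistency claim can be checked in a small simulation: outcomes respond to yearly peer means, but the regressor is the three-year aggregate. As long as the yearly deviations from the aggregate are mean-zero and orthogonal to it, the OLS slope recovers the true coefficient. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.5                      # true effect of the *yearly* peer mean
n_cells, n_per = 300, 200       # campus-cohort cells; students per year

# three-year aggregate peer quality, with yearly deviations around it
q_agg = rng.normal(0, 1, n_cells)
q_year = q_agg[:, None] + rng.normal(0, 0.3, (n_cells, 3))

y_parts, x_parts = [], []
for g in range(n_cells):
    for t in range(3):
        # outcomes respond to the yearly mean, but we observe the aggregate
        y_parts.append(beta * q_year[g, t] + rng.normal(0, 1, n_per))
        x_parts.append(np.full(n_per, q_agg[g]))
y = np.concatenate(y_parts)
x = np.concatenate(x_parts)

# bivariate OLS slope: yearly deviations are orthogonal to the aggregate,
# so aggregation inflates noise but does not bias the estimate
beta_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
```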

Finally, the data does not contain information on gender and does not distinguish between black and Hispanic students, categorizing both as under-represented minorities. We would definitely expect student ability and preferences towards the sciences to vary by gender and possibly by minority subgroups. Because we do not observe this information in the data, it is again important that we use an identification strategy that accounts for unmeasured student qualifications and/or preferences that might influence both the selection of a student’s enrollment campus and his or her subsequent outcomes.

We have restricted our sample to students that are not missing information on personal characteristics and college outcomes. Summary statistics for this sample are presented in Table 1. On average, UC students that intend to major in the sciences have stronger high school credentials (in terms of SAT I scores and high school grades) than students intending to pursue a non-STEM major. However, these same students achieve a slightly lower graduation rate and lower cumulative college GPA than students intending to pursue a non-STEM major.

Table 1 Sample means (Standard Deviations in Brackets)

Only 49 percent of intending science majors actually graduate with a science degree. The outflow of students from the sciences is not counterbalanced by a comparable inflow of students from other fields. Only 10 percent of students that do not initially intend to pursue a STEM major end up doing so.

4 Empirical strategy

The following framework guides our thinking about how major peer groups affect college outcomes.

4.1 Conceptual framework

Students take courses in different major fields, and their course-level outcomes are potentially determined, at least in part, by the characteristics of the peers that share these courses. One particularly important outcome is average grades earned in each major j \( \left({\overline{g}}_i^j\right) \), as students need to satisfy minimum GPA requirements within their chosen major in order to obtain a degree in that field. The average grades earned in major j courses by student i:

$$ {\overline{g}}_i^j={g}^j\left({a}_i^j,{q}^j,{\eta}_i^j\right) $$

are likely to be determined by her own aptitude for major j (a), the quality of peers in major j courses (q), and a random shock η.

Students must consider the likelihood that they will successfully attain or exceed minimum grade and credit hour requirements both overall as well as within their chosen major field. In deciding on their college major, students are faced with a degree progress production function:

$$ {d}_i^j={d}^j\left({\overline{g}}_i^j,{\overline{g}}_i^{-j},{h}_i,{\sigma}_i^j\right), $$

where \( {\overline{g}}^{-j} \) is the GPA earned in non-major field courses, h is the total number of credit hours earned by the student, and σ is the share of all credit hours taken in major field courses. The student successfully obtains a degree in major j if her degree progress exceeds the minimum standard (d) for that major set by the college:

$$ {D}_i^j=1\ \mathrm{if}\ {d}_i^j\ \ge\ {\underset{\bar{\mkern6mu}}{d}}^j $$

A student receives current and future utility (V):

$$ {V}_i^j=V\left(Y\left({D}_i^j,{e}_i\right),\gamma \left({D}_i^j,{u}_i\right)\right), $$

where this utility is a function of the pecuniary (Y) and non-pecuniary (γ) benefits that student i receives from studying major j. Specifically, we can think of Y as a vector of the lifetime earnings stream that student i receives from having pursued studies in major j. Similarly, we can think of γ as a vector of the lifetime non-pecuniary benefit stream that i experiences as a result of studying major j. Each of these benefit streams potentially depend on whether the student successfully obtains a degree in major j and a series of random shocks e and u. A student chooses a major j* that maximizes expected current and future utility:

$$ {j}_i^{*}= \arg \underset{j}{ \max }E\left[{V}_i^j\right]. $$

To further simplify things, let us assume that students choose between two majors: STEM (S) and non-STEM (N). Whether a student obtains a STEM degree is dependent on both her utility-maximizing decision and how successful the student is in meeting the minimum requirements of her desired major. It follows that the reduced form model of this outcome is:

$$ {D}_i^S=f\left({a}_i^S,{a}_i^N,{q}^S,{q}^N,{\underset{\bar{\mkern6mu}}{d}}^S,{\underset{\bar{\mkern6mu}}{d}}^N,{v}_i^S,{v}_i^N,{\varepsilon}_i\right), $$

where \( {v}_i^S \) and \( {v}_i^N \) represent preference parameters of student i for each major, and ε i represents a composite random shock.
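A deliberately stylized simulation illustrates this reduced form: if curved grading lets stronger STEM peers depress an individual's STEM grades, the share of students who can and want to finish a STEM degree falls as STEM peer quality rises. The linear "curve" penalty, the GPA floor, and the choice rule below are all illustrative assumptions, not the paper's structural model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
a_S = rng.normal(3.0, 0.5, n)    # aptitude in STEM coursework
a_N = rng.normal(3.2, 0.5, n)    # aptitude in non-STEM coursework
MIN_GPA = 2.0                    # GPA floor for a STEM degree (illustrative)

def stem_share(q_S, curve=0.4):
    """Share finishing in STEM when curved grading lets stronger STEM
    peers (quality q_S) depress an individual's STEM grades."""
    g_S = a_S - curve * q_S + rng.normal(0, 0.3, n)   # STEM grades
    g_N = a_N + rng.normal(0, 0.3, n)                 # non-STEM grades
    # stay in STEM iff the degree is feasible and grades do not favor
    # non-STEM by more than a fixed preference margin (0.5, illustrative)
    return float(np.mean((g_S >= MIN_GPA) & (g_S >= g_N - 0.5)))

share_weak_peers = stem_share(q_S=0.0)
share_strong_peers = stem_share(q_S=1.0)
```

Raising STEM peer quality in this toy model lowers STEM persistence through both channels at once: the GPA floor binds for more students, and the grade comparison tilts toward the non-STEM field.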

4.2 Empirical specification

The basic specification that we use in our empirical analysis is:

$$ {y}_{ikasc}={\beta}_1{q}_{ikc}^S+{\beta}_2{q}_{ikc}^N+{\varGamma}^{\hbox{'}}{X}_{ic}+{\mu}_k+{\delta}_{sc}+{A}_{ac}+{\varepsilon}_{ikasc}, $$

where we look at a number of outcomes of interest, y ikasc . The main independent variable of interest, \( {q}_{ikc}^S \), is a peer quality index for the other intending STEM majors at campus k in three-year cohort c. Similarly, \( {q}_{ikc}^N \) is a peer quality index for non-STEM majors. We define major peer groups according to the intended major each student states on her college application to her enrollment campus and use principal component analysis to construct the major-specific peer quality indices. Specifically, we use the principal component of average peer high school GPA, math SAT I score, and verbal SAT I score for each major group.6
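The index construction can be sketched with synthetic cell means: standardize the three peer credentials, take the first principal component, and orient it so that higher values mean stronger peers. The data below are simulated stand-ins for the UCOP cell averages.

```python
import numpy as np

rng = np.random.default_rng(2)
# 48 synthetic major-group x campus x cohort cells; columns are mean peer
# HS GPA, mean SAT math, mean SAT verbal (stand-ins for the UCOP averages)
latent = rng.normal(0, 1, 48)                 # underlying peer strength
X = np.column_stack([
    3.5 + 0.20 * latent + rng.normal(0, 0.05, 48),
    620 + 40 * latent + rng.normal(0, 10, 48),
    600 + 35 * latent + rng.normal(0, 10, 48),
])

Z = (X - X.mean(0)) / X.std(0)                # standardize each credential
eigval, eigvec = np.linalg.eigh(np.cov(Z.T))  # eigh sorts eigenvalues ascending
pc1 = Z @ eigvec[:, -1]                       # first principal component
pc1 *= np.sign(eigvec[:, -1].sum())           # higher index = stronger peers
quality_index = (pc1 - pc1.mean()) / pc1.std()
```

Because the three credentials are highly correlated across cells, the first component captures nearly all of their shared variation, which is what makes a single peer quality index per major group reasonable.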

The main coefficients of interest are β 1 and β 2. Findings from the general peer effects literature would appear to suggest that these two coefficients should be positive. However, as noted previously, the academic environment found in the sciences is potentially very competitive, with stronger students pushing their peers down the grading curve in university-level STEM courses. If this type of competition dominates any positive peer interactions in the sciences, then one might expect β 1 to be negative.

Given that the within-campus variance in our peer measures is small, we cluster our standard errors by campus. We additionally take steps to address concerns that cluster-robust standard errors may be downwards biased, given the small number of clusters in our analysis. All tables report statistical significance based on critical values from a t-distribution with five degrees of freedom. We also calculate p-values for our regressors of interest using the wild cluster bootstrap-t procedure advocated by Cameron et al. (2008).
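A generic implementation of the wild cluster bootstrap-t of Cameron et al. (2008) can be sketched as follows, shown on synthetic data with eight clusters to mirror the eight campuses. This is an illustrative sketch, not the paper's estimation code: the null is imposed on the restricted residuals and each cluster receives a single Rademacher weight per replication.

```python
import numpy as np

def cluster_se(X, u, cl):
    """Cluster-robust (CR0) standard errors for OLS coefficients."""
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cl):
        s = X[cl == g].T @ u[cl == g]
        meat += np.outer(s, s)
    return np.sqrt(np.diag(bread @ meat @ bread))

def wild_cluster_boot_p(y, X, cl, j=1, reps=999, seed=0):
    """Wild cluster bootstrap-t p-value for H0: beta_j = 0
    (Cameron, Gelbach, and Miller 2008), Rademacher weights by cluster."""
    rng = np.random.default_rng(seed)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    t_obs = b[j] / cluster_se(X, y - X @ b, cl)[j]
    Xr = np.delete(X, j, axis=1)          # restricted fit imposing the null
    br = np.linalg.lstsq(Xr, y, rcond=None)[0]
    fit_r, u_r = Xr @ br, y - Xr @ br
    groups = np.unique(cl)
    t_star = np.empty(reps)
    for r in range(reps):
        w = rng.choice([-1.0, 1.0], size=groups.size)   # one draw per cluster
        y_star = fit_r + u_r * w[np.searchsorted(groups, cl)]
        bs = np.linalg.lstsq(X, y_star, rcond=None)[0]
        t_star[r] = bs[j] / cluster_se(X, y_star - X @ bs, cl)[j]
    return float(np.mean(np.abs(t_star) >= abs(t_obs)))

# synthetic check: 8 "campuses", a cluster-level regressor, no true effect
rng = np.random.default_rng(3)
cl = np.repeat(np.arange(8), 60)
x = np.repeat(rng.normal(0, 1, 8), 60)
y = 1.0 + np.repeat(rng.normal(0, 1, 8), 60) + rng.normal(0, 1, cl.size)
X = np.column_stack([np.ones(cl.size), x])
p_val = wild_cluster_boot_p(y, X, cl)
```

Resampling whole-cluster residual signs preserves the within-cluster dependence structure, which is why the procedure remains valid with as few clusters as here.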

Our main concern in trying to identify the effects of major peer groups on different college outcomes is that students selectively enroll in colleges with different peer characteristics. Peer characteristics are likely to be correlated with other institutional characteristics (e.g., campus resources, faculty quality, academic standards, etc.) that affect student outcomes. In order to account for these institutional differences, our specification includes enrollment campus fixed effects (μ k ).

An additional concern in analyzing our particular data is that the University of California made substantive changes to its admissions policies over this period. Prior to the adoption of Proposition 209 and continuing through the 1997 cohort, all underrepresented minorities received admissions preferences, which varied by campus. Beginning in the fall of 2001, the UC system guaranteed UC eligibility to the top 4 percent of students in the graduating class of every California high school, provided they had completed 11 specific college prep courses by the end of their junior year. This policy, known as “Eligibility in the Local Context” or ELC, was implemented to encourage students who had excelled academically in disadvantaged high schools to attend UC campuses.7 Subsequent changes in enrollment patterns (and, therefore, in peer competition) tended to vary by the selectivity of the different campuses.

In order to account for these types of trends, our specification also includes campus selectivity tier by cohort fixed effects (δ sc ). We group campuses into selectivity tiers based on their 1995 U.S. News & World Report rankings. Berkeley and UCLA, ranked 26th and 28th overall, are grouped in the top tier. Davis (40th), San Diego (43rd), and Irvine (48th) are grouped in the middle tier, and the remaining campuses are grouped in the bottom tier.8 These fixed effects additionally account for any general time trends in the data.

Finally, the preferences and/or abilities of a student that drive the selection of the enrollment campus are also likely to directly affect his or her subsequent outcomes. Our empirical model includes a vector of observable student characteristics (X ic ) that control for SAT I math and verbal scores, high school GPA, parental education, and family income.9 We also include in our specification very flexible application-admissions pattern fixed effects (A ac ) to account for student unobservables. The UCOP data provides us with information on which UC campuses each student submitted applications to, whether the stated major on each application was a STEM major, and whether the application was accepted. For each of the 8 UC campuses, each student has seven possible application-admission outcomes: (1) did not apply, (2) applied with an intended non-STEM major and rejected, (3) applied with an intended STEM major and rejected, (4) applied with an intended non-STEM major and accepted, (5) applied with an intended STEM major and accepted, (6) applied with an intended non-STEM major and admissions outcome missing, and (7) applied with an intended STEM major and admissions outcome missing. While there are 7 to the 8th power application-admission patterns that are theoretically possible, we only observe 11,281 unique combinations for intending STEM majors and 30,803 for all students in our sample. Additionally, we allow these fixed effects to vary by whether the student was eligible for an admissions preference and by cohort, since the admissions regime employed by the UC system changed over time.
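The construction of these fixed-effect cells can be sketched as follows; the campus names, outcome labels, and example records are placeholders. Each student's eight campus outcomes collapse into a pattern string, which is then interacted with cohort (and, in the paper, with preference eligibility) to define the fixed-effect group.

```python
# seven possible application-admission outcomes per campus, as in the text
CODES = {"no_app": 0, "nonstem_rej": 1, "stem_rej": 2,
         "nonstem_adm": 3, "stem_adm": 4,
         "nonstem_miss": 5, "stem_miss": 6}

CAMPUSES = [f"uc{i}" for i in range(1, 9)]   # placeholder campus names

def fe_group(outcomes, cohort):
    """Collapse one student's 8 campus outcomes into a pattern key and
    interact it with the (three-year) cohort to form the fixed-effect cell."""
    pattern = "".join(str(CODES[outcomes[c]]) for c in CAMPUSES)
    return f"{pattern}|{cohort}"

# three illustrative students; the first two share the exact same pattern
students = [
    (dict({"uc1": "stem_adm", "uc2": "stem_rej"},
          **{c: "no_app" for c in CAMPUSES[2:]}), "1995-97"),
    (dict({"uc1": "stem_adm", "uc2": "stem_rej"},
          **{c: "no_app" for c in CAMPUSES[2:]}), "1995-97"),
    (dict({"uc1": "nonstem_adm"},
          **{c: "no_app" for c in CAMPUSES[1:]}), "1998-00"),
]
groups = [fe_group(outcomes, cohort) for outcomes, cohort in students]
```

Only students who applied to, and were admitted by, the exact same sets of campuses (with the same intended-major type, in the same cohort) end up in the same cell, which is what sharpens the Dale-and-Krueger-style comparison.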

Our analysis focuses on four key outcomes that we observe in the data: cumulative GPA, whether a student intending to major in the sciences persists in the sciences, whether a student graduates, and whether students achieve both science-persistence and graduation. Major-specific grades would appear to be the most appropriate outcome for testing the effects of major-specific peer quality. Unfortunately, we do not observe major-specific grades and instead only observe cumulative GPA. It is important to note, however, that cumulative GPA is a weighted average of grades obtained in STEM and non-STEM courses. Specifically,

$$ \overline{g}={\sigma}^S{\overline{g}}^S+\left(1-{\sigma}^S\right){\overline{g}}^N, $$

where σ S is the share of all credit hours taken in the STEM field. We can potentially characterize the likely effects of peer quality within each major field on major-specific grades based on our analyses of cumulative GPA and major choice.

Let us assume that the peer quality of intending majors in field j has little to no impact on grades outside of that major (i.e., \( \frac{\partial {\overline{g}}^{-j}}{\partial {q}^j}=0 \)).10 It then follows that the net effect of peer quality on cumulative GPA aggregates the effects of peer quality on major-specific grades and on the choice of STEM course load such that:

$$ \frac{\partial \overline{g}}{\partial {q}^j}={\sigma}^j\frac{\partial {\overline{g}}^j}{\partial {q}^j}+\left({\overline{g}}^S-{\overline{g}}^N\right)\frac{\partial {\sigma}^S}{\partial {q}^j}, $$

where σ N = 1 − σ S.

The literature and our data suggest that average grades in non-STEM courses are higher than average grades in STEM courses (i.e., \( {\overline{g}}^S-{\overline{g}}^N<0 \)). Our analysis of the effect of peer quality on major choice will also allow us to infer the sign of \( \frac{\partial {\sigma}^S}{\partial {q}^j} \). Therefore, certain sets of results from our analysis of major choice and cumulative GPA will allow us to bound the effect of peer quality on major-specific grades. Specifically,

  1. If one of \( \frac{\partial \overline{g}}{\partial {q}^j} \) and \( \frac{\partial {\sigma}^S}{\partial {q}^j} \) is strictly negative and the other is non-positive, then \( \frac{\partial {\overline{g}}^j}{\partial {q}^j}<\frac{\partial \overline{g}}{\partial {q}^j} \); and,

  2. If one of \( \frac{\partial \overline{g}}{\partial {q}^j} \) and \( \frac{\partial {\sigma}^S}{\partial {q}^j} \) is strictly positive and the other is non-negative, then \( \frac{\partial {\overline{g}}^j}{\partial {q}^j}>\frac{\partial \overline{g}}{\partial {q}^j} \).
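The decomposition and the first bounding case can be verified numerically with stylized smooth functions; the slopes below are assumptions chosen only to satisfy the signs in the first case, not estimates.

```python
import numpy as np

# stylized smooth functions of STEM peer quality q (slopes are assumptions)
g_S = lambda q: 3.0 - 0.30 * q      # STEM grades fall with STEM peer quality
g_N = lambda q: 3.4                 # no cross-major effect; higher grades
s_S = lambda q: 0.50 - 0.10 * q     # STEM credit-hour share also falls
g_bar = lambda q: s_S(q) * g_S(q) + (1 - s_S(q)) * g_N(q)  # cumulative GPA

q0, h = 0.2, 1e-6
d = lambda f: (f(q0 + h) - f(q0 - h)) / (2 * h)   # central finite difference

lhs = d(g_bar)                                     # effect on cumulative GPA
rhs = s_S(q0) * d(g_S) + (g_S(q0) - g_N(q0)) * d(s_S)
# case 1: d(g_bar) < 0 and d(s_S) < 0 while g_S < g_N, so the within-STEM
# grade effect d(g_S) must be even more negative than d(g_bar)
```

Intuitively, shifting credit hours toward higher-graded non-STEM courses cushions cumulative GPA, so the raw within-STEM grade effect is more negative than the cumulative-GPA effect reveals.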

Graduation outcomes depend on whether students attain or exceed minimum grade and credit hour requirements both overall and within their chosen major field. Therefore, the effect of peer quality on whether the student earns a bachelor’s degree (b) is driven, in turn, by the effects of peer quality on grades and credit hours. If we again assume no cross-major peer effects and that major choice, independent of grades, does not affect the likelihood of graduation, it follows that:

$$ \frac{\partial b}{\partial {q}^j}=\frac{\partial b}{\partial h}\frac{\partial h}{\partial {q}^j}+\frac{\partial b}{\partial {\overline{g}}^j}\frac{\partial {\overline{g}}^j}{\partial {q}^j}. $$

We do not observe the credit hours taken by students or any comparable measures of length of enrollment.11 If we find similar effects of peer quality on grades and the likelihood of graduation, it could be the case that graduation effects are driven purely by grade effects (or that there are also countervailing or reinforcing effects on enrollment length).

5 Results

We begin by examining the effect of major group peer ability on the likelihood that students graduate with a science degree. These results are presented in Table 2. A student’s own high school credentials are highly statistically significant predictors of whether he or she graduates with a science degree. As expected, students with higher math SAT scores are more likely to obtain science degrees, while students with higher verbal SAT score are less likely.

Table 2 Determinants of graduating with a science degree

When a student attends a campus with stronger peers in the sciences, he or she is less likely to graduate with a science degree. Specifically, we find that increasing the ability of intending science major peers by a standard deviation decreases the likelihood of graduating with a science degree by ten percentage points. Conversely, attending a college with stronger peers in the non-sciences increases the likelihood that students pursue and obtain a STEM degree. Increasing the ability of intending non-science major peers by a standard deviation increases the likelihood of graduating with a STEM degree by roughly nine percentage points. The corresponding wild cluster bootstrap-t p-value for each of these peer effects is less than 0.02.

The results in Table 2 and subsequent tables improve upon the identification strategy used in Dale and Krueger (2002). Dale and Krueger used information on student applications, and on which colleges accepted them, to compare students who were accepted by similar schools but in fact attended schools with differing levels of eliteness. We are able to go a step further. Because we have such a large number of observations in the UC dataset, and successive cohorts at each of the institutions we study, we can compare students who applied to, and were accepted by, the exact same sets of schools. We also use college fixed effects, which account for potential differences that influence both the enrollment decision of students and their subsequent outcomes. It is worth noting that our results are robust to alternative strategies for accounting for student unobservables.

In Tables 3 and 4, we use the model with the most stringent set of controls in Table 2 (Model 5) and do two things: first, we break students into those who, as high school seniors, intended to major in STEM fields, and those who did not; second, we examine several college outcomes: GPA, time to degree, graduating, graduating with a STEM degree, and ending one’s UC career (with or without a degree) as a STEM major. These results tell us several valuable things. First, we can see that the peer-effect results in Table 2 – although they hold for all students – are largely driven by students who initially intend, upon starting college, to major in STEM fields.12 That is to say, the coefficients for the peer ability effects in Table 3 are generally larger and more precisely estimated than those reported in Table 4. Tables 3 and 4 also allow us to explore the multitude of ways in which a student may fail to obtain a science degree. One possibility is that an intending science major changes fields but still goes on to graduate from college. Another is that the student fails to graduate in any field.13 Our findings suggest that a student’s major choice and subsequent college outcomes depend on the quality of major peer groups and that the nature of these peer effects varies by major field.

Table 3 Determinants of college outcomes of intended STEM majors
Table 4 Determinants of college outcomes of intended non-STEM majors

Non-STEM peer quality appears to have a positive effect on both the likelihood of choosing a STEM major and on cumulative GPA, both for intended STEM majors and for those intending a non-STEM major. Increasing the ability of non-STEM major peers by a standard deviation increases the likelihood that an intending STEM major stays in the sciences by just over fourteen percentage points. While the sign and magnitudes of these effects are similar to what we find when looking at the likelihood of graduating with a science degree, the statistical precision of these particular estimates is much lower (especially when evaluated with the wild cluster bootstrap-t procedure). As we argued in the previous section, this pattern of results suggests that the positive effect of non-STEM peer quality on non-STEM GPA is even higher than that estimated for cumulative GPA. Therefore, the type of peer effects that we see in the non-sciences is similar to those found in the broader peer effects literature.

Conversely, we find that STEM peer quality has a negative effect on both the likelihood of choosing a STEM major and on cumulative GPA. This holds for both students with and without an intended STEM major. Specifically, increasing the ability of other intending STEM major peers by a standard deviation increases the likelihood that a student switches to a non-STEM field by around ten percentage points. We also find a negative effect of the ability of STEM peers on the college grades of intending science majors, which is statistically significant according to the wild cluster bootstrap-t procedure. This suggests that the negative effect of STEM peer quality on STEM grades is even more severe than that estimated for cumulative GPA. Therefore, our results suggest that peer effect dynamics in university-level science courses are fundamentally different than those of the other academic environments typically evaluated by the peer effects literature.

We also find evidence that the academic strength of STEM peers affects the chances that an intending STEM major will graduate from college with any degree (including a non-STEM degree). Increasing the ability of other intending science major peers by a standard deviation decreases the likelihood that an intending STEM major graduates from college (in any field) by around 15 percentage points.14 The magnitude of this effect is somewhat larger than our estimated effect on major-switching (i.e., leaving a STEM field). For those who graduate, we find that having stronger STEM peers increases the time to degree. One way to think about these results is that students who face difficult competition in a STEM field must make a choice: they can switch to a non-STEM field, which will probably entail staying in college longer; they can persist in the major, perhaps with low grades; or they can drop out.

Since non-STEM peer quality appears to increase the cumulative GPA of intending STEM majors, this mechanism may lessen the likelihood that these students fail to meet graduation grade requirements. Non-STEM peer ability has a positive effect on the likelihood that intending science majors graduate from college, an effect that is highly statistically significant according to the wild cluster bootstrap-t procedure. These results also suggest that higher quality peers in non-STEM courses may lessen the risk of taking more STEM courses (where expected grades are lower than in non-STEM courses), leading to greater STEM persistence.

5.1 Heterogeneous effects by student ability (evidence on mismatch effects)

To this point, we have found that the college outcomes of intending science students are influenced by peer effects on average. However, we have not considered the possibility that these peer effects might be moderated by a student’s own ability. The mismatch literature has specifically suggested that placing students in strong academic environments might have large, negative effects on less academically prepared students. To test for this possibility, we include interaction terms between “own ability” and peer ability in our specifications. Our prior estimates suggest that higher student math ability increases the likelihood of declaring a STEM final major, while higher student verbal ability increases the probability of choosing a non-STEM final major. Therefore, we interact STEM peer ability with own math SAT score and non-STEM peer ability with own verbal SAT score.

In accord with the mismatch hypothesis, we find that a student’s persistence in science is particularly hurt by stronger peers when the student’s own math ability is relatively low. As shown in Table 5, the interaction term between a student’s own math SAT and the ability of STEM major peers is substantial and positive. This suggests that if the ability of intending STEM major peers increases by one standard deviation, the student’s likelihood of having a STEM “final major” drops by 13 percentage points if the student’s own math SAT is 550, but by only 11 points if the student’s own math SAT is 650.
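To make the interaction arithmetic concrete, the implied marginal effect of peer quality can be written as a linear function of the student’s own math SAT. The coefficients in the sketch below are stylized values backed out solely to reproduce the two magnitudes quoted above (13 and 11 percentage points); they are not the estimated coefficients from Table 5:

```python
def peer_effect_on_stem_persistence(own_math_sat,
                                    beta_peer=-0.24,       # hypothetical main effect
                                    beta_interact=0.0002):  # hypothetical interaction
    """Implied change in P(STEM final major) from a one-standard-deviation
    increase in STEM peer ability, as a linear function of own math SAT.

    beta_peer and beta_interact are illustrative values chosen to match the
    two magnitudes quoted in the text, not the paper's actual estimates.
    """
    return beta_peer + beta_interact * own_math_sat

# A stronger student is hurt less by stronger STEM peers:
print(round(peer_effect_on_stem_persistence(550), 4))  # -0.13
print(round(peer_effect_on_stem_persistence(650), 4))  # -0.11
```

The positive interaction coefficient is what shrinks the negative peer effect as own math SAT rises, which is the pattern the mismatch hypothesis predicts.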

Table 5 Determinants of college outcomes of intended science majors w/own ability-peer ability interactions

While own math ability does appear to moderate the effect that STEM major peer ability has on science persistence, it does not appear to influence peer effects on the likelihood of graduating or on cumulative college grades. All else equal, we would expect that shifting out of the sciences would increase cumulative GPA. We find that weaker students are more likely to exit the STEM track in response to higher STEM peer quality. Thus, the fact that we find no differential effect of peer quality on cumulative GPA based on own student ability suggests that higher STEM peer quality has a more severe negative impact on the STEM grades of less prepared students. This finding is also consistent with the mismatch hypothesis.

Here, we have evaluated whether peer effects are heterogeneous by own ability by including a simple linear interaction term. We find very similar results when we express own ability as a series of dummy variables and interact these dummy variables with average peer ability.

5.2 Heterogeneous effects by race

Up until this point, our findings have suggested that weaker students are more likely to switch majors when faced with greater competition in the sciences. Black and Hispanic (“URM”) students might seem to be particularly vulnerable; they tend to have lower credentials than their peers, and our earlier analyses showed grade and graduation gaps that may reflect unobserved racial differentials in college preparation. However, our analyses suggest that URM students are less likely than other students to respond to stronger STEM peers by switching from STEM to non-STEM fields; they are more likely to persist than their non-URM peers, but they pay a price in terms of lower graduation rates and college grades.

Because of racial preferences in college admissions, it is also important to evaluate how minority groups are affected by peer quality in terms of their college outcomes. Results, broken down by minority status, are presented in Tables 6 and 7. Non-minority intending science majors appear to put more weight on peer quality within the STEM and non-STEM fields in making their final major choice. In particular, non-minority students are much more likely to exit the sciences when faced with stronger STEM peers compared to under-represented minorities.

Table 6 Determinants of college outcomes of URM intended science majors
Table 7 Determinants of college outcomes of non-URM intended science majors

Being more persistent when faced with stronger peers within the sciences comes with a cost for minority student groups. The results in Table 6 imply that attending a college with stronger science peers leads blacks and Hispanics who are interested in science to have lower cumulative college GPAs and lower odds of graduating from college. Increasing the ability of other intending science majors by one standard deviation decreases the likelihood that URM intended STEM majors graduate from college by around 43 percentage points. The same change in science peer ability decreases the cumulative college GPA of black and Hispanic intended science majors by approximately 0.32 points.

Much of the peer effects literature measures outcomes in terms of first-year grades, while the principal grade outcome we examine is final GPA. Carrell et al. (2009) is an exception; these authors found that a 100-point increase in the Verbal SAT scores of freshmen squadron members, the equivalent of a standard deviation increase in test score, raised the cumulative grades of U.S. Air Force Academy cadets by approximately 0.25 grade points. Our findings imply that a standard deviation increase in science peer quality lowers the final GPA of non-minority intending science majors by 0.125 points. As noted earlier, our findings are not necessarily in conflict with those of Carrell et al. because a squadron is a relatively small and cohesive group that exists in an academic setting where collaboration and mutual support of squadron members is expected. This type of environment stands in stark contrast to the competitive and often cutthroat culture prevalent in the sciences.

5.3 Robustness to alternative definitions of peer group and peer ability

Our analysis to this point has defined relevant peer groups based on intended major and has used the principal component of average peer high school GPA, math SAT I score, and verbal SAT I score as a measure of peer quality within each major group. Table 8 presents results using a number of alternative measures of peer group ability. Specifically, we consider specifications that instead use average high school grades, math SAT score, verbal SAT score, or combined SAT score as a measure of STEM and non-STEM peer quality. All peer measures are standardized so that coefficient estimates of peer effects are roughly comparable across specifications. The general magnitude, sign, and statistical significance of our estimated effects do not vary much with these alternative measures.
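As a rough illustration of how such a principal-component index can be built, the sketch below standardizes the three inputs and extracts the leading principal component by power iteration. The function name and toy data are ours; the paper’s actual factor loadings are those reported in its Table 9:

```python
import math

def peer_ability_index(rows):
    """First principal component of standardized (HS GPA, math SAT, verbal SAT).

    rows: list of (hs_gpa, sat_math, sat_verbal) tuples for one peer group.
    Returns one standardized index value per student. A sketch only, not the
    paper's implementation.
    """
    n, k = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) for c, m in zip(cols, means)]
    # standardize each measure so scales (GPA points vs. SAT points) are comparable
    Z = [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in rows]

    # correlation matrix of the standardized measures
    C = [[sum(Z[i][a] * Z[i][b] for i in range(n)) / n for b in range(k)]
         for a in range(k)]

    # leading eigenvector (the first principal component) via power iteration
    v = [1.0] * k
    for _ in range(500):
        w = [sum(C[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    scores = [sum(z[j] * v[j] for j in range(k)) for z in Z]
    mu = sum(scores) / n
    sd = math.sqrt(sum((s - mu) ** 2 for s in scores) / n)
    return [(s - mu) / sd for s in scores]  # re-standardize the index
```

Standardizing the final score is what makes a “one standard deviation increase in peer ability” comparable across the alternative measures considered in Table 8.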

Table 8 Determinants of college outcomes of intended science majors w/different peer group ability measures

A significant finding of this paper is that peer effects stand out more clearly when we use narrow, rather than broad, measures of peer groups. Past research on peer effects has often examined outcomes for entire college classes. In our analysis, such broad definitions would mask the most important peer effects, which are strong and significant when we split peers by intended major. Scholars examining these issues should take care to test for disaggregated effects like these.

6 Conclusion

Using a rich dataset on the universe of University of California students who enrolled between 1995 and 2003, we examine how the quality of potential peers within distinct college majors at a student’s campus affects his or her major choices and other college outcomes. We find that students initially interested in pursuing a science major respond to peer quality within both the broad science and non-science major tracks. Higher quality non-STEM peers appear to boost the non-STEM grades of intending STEM majors, potentially lessening the risk of taking a greater share of STEM courses (where expected grades are lower). Conversely, attending a campus with a stronger group of intending science majors lowers the likelihood that students graduate with a science degree. While some of those students who leave the sciences simply shift their course of study, others fail to graduate at all. We take strong measures to ensure that these findings are not driven by peer group endogeneity.

Consistent with mismatch theories, we find that weaker students are particularly adversely affected by attending colleges where the sciences are more competitive. Perhaps most striking, we find that underrepresented minorities are much less likely than non-minorities to respond to higher STEM peer quality by switching majors. Instead, minority students interested in science are much more likely to drop out when they are placed among stronger STEM peers, and, if they do graduate, they take a very substantial hit on their GPAs. What accounts for this race effect? Perhaps minority students have a stronger commitment to pursuing science and are willing to bear the risk of not graduating in order to pursue their dream of becoming a scientist. Julian (2012) provides Synthetic Work-Life Earnings (SWE) estimates that suggest that those with a STEM degree on average can expect to earn approximately $600,000 more than holders of non-STEM degrees over their work-life. Even if one were to adjust these estimates to reflect the risk of not successfully completing a degree, it would appear that minorities still face higher expected lifetime earnings by pursuing STEM over non-STEM degrees.

Or perhaps, minority students are less likely to know how to maneuver the college landscape. Perhaps they underestimate the potential risks of failing to meet graduation requirements when pursuing different types of degrees. Our data does not allow us to evaluate these alternate hypotheses. Future research in this area should evaluate whether these particular patterns of minority attrition hold in other instances. The general problem of science attrition, as shown by this and other research, is sufficiently serious that universities should attempt careful measurement of evolving student attitudes and outcomes as they enter and advance through college.

7 Endnotes

1Arcidiacono et al. (2013) examine how well different UC campuses produce STEM graduates. These authors focus on net differences across institutions. What distinguishes our work is that we focus on identifying how a particular institutional characteristic, major peer group quality, influences STEM major choice.

2See, for example, Smyth and McArdle (2004) and Arcidiacono et al. (2013).

3In particular, the index weights verbal and math SAT scores equally. However, we argue that math skills are more predictive of success in the sciences and that verbal skills are more predictive of success in other disciplines (and, therefore, possibly of selection out of the sciences).

4SAT verbal and math scores for tests taken prior to 1995 have been re-centered to be comparable to later scores.

5See Hyslop and Imbens (2001).

6Following Arcidiacono (2004), we also considered specifications with multiple measures of peer quality. These results are qualitatively similar to those found using single measures of peer quality. As Black and Smith (2006) note, the high degree of correlation between different measures of peer ability makes it difficult to interpret individual coefficient estimates for specific measures when multiple measures are included in the empirical model. For this reason, our main specifications rely on peer quality indices constructed through a principal component analysis of multiple potential measures of peer quality. The principal component scoring is reported in Table 9, and summary statistics for the constructed peer ability indices, broken down by campus and cohort, are reported in Table 10 of the Appendix.

7We are able to observe the ELC status of students after this policy is adopted.

8In his analysis of the effects of Proposition 209 on college enrollment patterns at the University of California, Hinrichs (2012) groups campuses into two tiers: the top and bottom four. His results based on these aggregated groupings are reported to be similar to those that allow for separate estimates by campus. Our results are also similar if we use alternative definitions of selectivity tiers.

9Family income is reported as a categorical variable and in nominal terms. Therefore, we include in our specification a set of family income category by cohort fixed effects.

10This would not appear to be a particularly strong assumption, as intending STEM majors would likely represent a small portion of students in non-STEM courses and vice versa.

11We do observe the number of trimesters attended, but only for those students that graduate (i.e., time to degree). We also consider the effects of peer quality on time to degree. However, this analysis is somewhat problematic because it involves a selected sample of students where selection occurs based on an outcome variable. Therefore, those results are not necessarily informative as to how peer quality affects students on the margin of graduating.

12In these analyses, we have classified a student’s intended major based on the choice she made on the college application to the campus in which she eventually enrolled. Some students express different intentions on different applications – they may list a STEM field in their application to UC Davis, but humanities in their application to Berkeley. If one classifies as an “intending STEM” major anyone who indicates a STEM preference on any UC application, this picks up more students – but it does not have much effect on the results in Table 3. Conversely, if one classifies as an intended STEM major only those students who indicate a STEM preference on all UC applications, this produces a smaller sample of intending STEM majors, but again our findings hold steady.

13The UCOP data does not tell us the timing or history of changes to a student’s intended field of study. We only observe the student’s entering preferences and her final major, which is the last official major registered by the student before she exited the UC system (either by graduating or dropping out).

14The corresponding wild cluster bootstrap-t p-value is 0.035.


  • Arcidiacono P (2004) Ability Sorting and the Returns to College Major. J Econ 121(1–2):343–375


  • Arcidiacono P, Aucejo E, Coate P, Hotz VJ (2012) Affirmative Action Bans and University Fit: Evidence from Proposition 209. Working Paper No. 18523, National Bureau of Economic Research, Cambridge, MA

  • Arcidiacono P, Aucejo E, Hotz VJ (2013) University Differences in the Graduation of Minorities in STEM Fields: Evidence from California. Working paper (revise and resubmit, American Economic Review), Cambridge, MA

  • Astin A, Astin H (1993) Undergraduate Science Education: The Impact of Different College Environments on the Educational Pipeline in the Sciences. Higher Education Research Institute, UCLA, Los Angeles, CA


  • Brunello G, De Paola M, Scoppa V (2010) Peer Effects in Higher Education: Does the Field of Study Matter? Econ Inq 48(3):621–634

  • Carrell SE, Fullerton RL, West JE (2009) Does Your Cohort Matter? Measuring Peer Effects in College Achievement. J Labor Econ 27(3):439–464


  • Carrell SE, Sacerdote BI, West JE (2013) From Natural Variation to Optimal Policy? The Importance of Endogenous Peer Group Formation. Econometrica 81(3):855–882


  • US Census Bureau (2003). National Survey of College Graduates

  • Cameron AC, Gelbach JB, Miller DL (2008) Bootstrap-based Improvements for Inference with Clustered Errors. Rev Econ Stat 90(3):414–427


  • Dale SB, Krueger AB (2002) Estimating the Payoff to Attending a More Selective College: An Application of Selection on Observables and Unobservables. Q J Econ 117(4):1491–1527


  • Elliott R, Christopher Strenta A, Adair R, Matier M, Scott J (1996) The Role of Ethnicity in Choosing and Leaving Science in Highly Selective Institutions. Res High Educ 37(6):681–709


  • Foster G (2006) It’s Not Your Peers, and It’s Not Your Friends: Some Progress Toward Understanding the Educational Peer Effect Mechanism. J Public Econ 90:1455–1475


  • Hinrichs P (2012) The Effects of Affirmative Action Bans on College Enrollment, Educational Attainment, and the Demographic Composition of Universities. Rev Econ Stat 94(3):712–722


  • Hyslop DR, Imbens GW (2001) Bias From Classical and Other Forms of Measurement Error. J Bus Econ Stat 19(4):475–481


  • Julian T (2012) Work-Life Earnings by Field of Degree and Occupation for People With a Bachelor’s Degree: 2011. American Community Survey Briefs, U.S. Census Bureau, Washington, DC


  • Lipson A, Tobias S (1991) Why do some of our best students leave science? J Coll Sci Teach 21(2):92

  • Loury L, Garman D (1995) College Selectivity and Earnings. J Labor Econ 13:289–308


  • Ost B (2010) The Role of Peers and Grades in Determining Major Persistence in the Sciences. Econ Educ Rev 29:923–934


  • Sacerdote B (2001) Peer Effects with Random Assignment: Results from Dartmouth Roommates. Q J Econ 116(2):681–704


  • Smyth FL, McArdle JJ (2004) Ethnic and Gender Differences in Science Graduation at Selective Colleges with Implications for Admission Policy and College Choice. Res High Educ 45(4):353–381


  • Sowell T (1972) Black Education: Myths and Tragedies. David McKay Company, Inc., New York


  • Tobias S, Lin H (1991) They’re Not Dumb, They’re Different: Stalking the Second Tier. Am J Phys 59(12):1155–1157




The authors would like to thank the anonymous referee.

Responsible editor: Peter Arcidiacono

Author information



Corresponding author

Correspondence to Marc Luppino.

Additional information

Competing interests

The IZA Journal of Labor Economics is committed to the IZA Guiding Principles of Research Integrity. The authors declare that they have observed these principles.



1.1 Imputation of SAT I Scores and High School GPA

Instead of reporting students’ exact high school GPA and SAT I scores, the UCOP data contains an academic index, which is a weighted linear combination of SAT I math (m) and verbal (v) scores and high school GPA (g):

$$ Index_i = c + w_m x_{m,i}^{*} + w_v x_{v,i}^{*} + w_g x_{g,i}^{*}. $$

The weights (w) and the constant (c) in this equation are known, while the x* terms represent the unobserved true values of a student’s SAT scores and high school GPA.

The UCOP data also reports categorical ranges for each student’s scores and GPA, with each range having an upper \( \left(\overline{x}\right) \) and lower \( \left(\underline{x}\right) \) bound:

$$ \overline{x}_{j,i}^{0} \ge x_{j,i}^{*} \ge \underline{x}_{j,i}^{0} \quad \text{for each } j \in [m, v, g], $$

where the zero superscript represents that these bounds are those that are initially reported in the data.

We can rearrange the terms in the index equation to express the unknown math score as a function of known and unknown inputs:

$$ x_{m,i}^{*} = \left[\frac{1}{w_m}\right] Index_i - \left[\frac{c}{w_m}\right] - \left[\frac{w_v}{w_m}\right] x_{v,i}^{*} - \left[\frac{w_g}{w_m}\right] x_{g,i}^{*}. $$

Similarly, we can write this expression for the verbal score and for high school GPA. Given the initial upper and lower bounds reported in the data, we then attempt to tighten the bounds around the true value of each unobserved variable with an iterative algorithm: in each round R, the rearranged expression and the current bounds on the other two variables imply new candidate bounds for each variable, which we intersect with that variable’s existing bounds. We repeat this until all bounds converge (i.e., \( \overline{x}_{j,i}^R = \overline{x}_{j,i}^{R-1} \) and \( \underline{x}_{j,i}^R = \underline{x}_{j,i}^{R-1} \) for each j ∈ [m, v, g]). In running this algorithm, we additionally take advantage of the discrete nature of SAT scores and high school GPA to further tighten the revised upper and lower bounds implied by the data for each of our unobserved measures.
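A minimal sketch of this bound-tightening pass in Python (the function and argument names are ours, and the extra step exploiting the discreteness of SAT scores and GPA is omitted):

```python
def tighten_bounds(index, c, w, lo, hi, tol=1e-9):
    """Iteratively tighten bounds on the unobserved measures using the
    index identity: index = c + sum_j w[j] * x_j, with all w[j] > 0.

    lo, hi: dicts mapping 'm', 'v', 'g' to the current lower/upper bounds.
    A sketch of the algorithm described above, not the paper's code.
    """
    keys = ('m', 'v', 'g')
    changed = True
    while changed:
        changed = False
        for j in keys:
            others = [k for k in keys if k != j]
            # bounds for x_j implied by the index and the others' current bounds
            implied_hi = (index - c - sum(w[k] * lo[k] for k in others)) / w[j]
            implied_lo = (index - c - sum(w[k] * hi[k] for k in others)) / w[j]
            new_hi = min(hi[j], implied_hi)   # intersect with existing bounds
            new_lo = max(lo[j], implied_lo)
            if new_hi < hi[j] - tol or new_lo > lo[j] + tol:
                changed = True
            hi[j], lo[j] = new_hi, new_lo
    return lo, hi
```

For instance, with equal unit weights, an index of 11, and initial bounds of [2, 4] on every measure, each lower bound tightens to 3, since the other two measures can contribute at most 4 + 4 = 8 to the index.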

One can envision constructing imputed measures as weighted averages of the revised upper and lower bounds:

$$ \widehat{x}_{j,i}^{*} = a_j \overline{x}_{j,i}^{R} + \left(1 - a_j\right) \underline{x}_{j,i}^{R}. $$

The question then becomes: what is the appropriate weight (a)? If we plug this weighted average into the index equation for each unobserved variable, we get:

$$ Index_i = c + u_m \overline{x}_{m,i}^{R} + l_m \underline{x}_{m,i}^{R} + u_v \overline{x}_{v,i}^{R} + l_v \underline{x}_{v,i}^{R} + u_g \overline{x}_{g,i}^{R} + l_g \underline{x}_{g,i}^{R}, $$

where \( u_j = w_j a_j \) and \( l_j = w_j\left(1 - a_j\right) \) for each j ∈ [m, v, g]. It also follows that:

$$ {a}_j=\frac{u_j}{u_j+{l}_j}. $$

We can estimate this equation using regression analysis and construct the implied weights from the corresponding coefficient estimates. Specifically, we estimate the following regression model for each vigintile (q) of the academic index distribution:

$$ Index_i^q = \beta_0^q + \beta_{\overline{m}}^{q} \overline{x}_{m,i}^{R} + \beta_{\underline{m}}^{q} \underline{x}_{m,i}^{R} + \beta_{\overline{v}}^{q} \overline{x}_{v,i}^{R} + \beta_{\underline{v}}^{q} \underline{x}_{v,i}^{R} + \beta_{\overline{g}}^{q} \overline{x}_{g,i}^{R} + \beta_{\underline{g}}^{q} \underline{x}_{g,i}^{R} + \varepsilon_i. $$

The correspondence between this regression model and the definitions of \( u_j \) and \( l_j \) implies that the appropriate weights should be constructed as:

$$ a_j^q = \frac{\widehat{\beta}_{\overline{j}}^{q}}{\widehat{\beta}_{\overline{j}}^{q} + \widehat{\beta}_{\underline{j}}^{q}}, \quad \text{for each } j \in [m, v, g]. $$

Using these weights and the revised upper and lower bounds for each unobserved variable, we then construct imputed values for each student’s unobserved high school GPA, math SAT score, and verbal SAT score.
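Putting the last two pieces together, the imputed value for one measure is a convex combination of its revised bounds, with the weight backed out from the pair of regression coefficients. A minimal sketch (the coefficient values in the usage line are made up for illustration):

```python
def impute(lo_bound, hi_bound, beta_hi, beta_lo):
    """Impute a value as a weighted average of the revised bounds, with
    weight a = beta_hi / (beta_hi + beta_lo) from the vigintile regression.
    A sketch of the formula in the text, not the paper's code.
    """
    a = beta_hi / (beta_hi + beta_lo)
    return a * hi_bound + (1.0 - a) * lo_bound

# e.g. a math SAT bounded in [600, 700] with hypothetical coefficients 0.3, 0.1
# gives a = 0.75, so the imputed score is pulled toward the upper bound:
x_hat = impute(600.0, 700.0, beta_hi=0.3, beta_lo=0.1)  # 675.0
```

When the bounds have already collapsed to a point, the weight is irrelevant and the imputed value equals the known true value.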

Table 9 Principal component factor loadings for peer ability indices
Table 10 Summary statistics for peer ability indices and residual peer ability indices by campus and cohort

1.2 Appendix Tables

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Luppino, M., Sander, R. College major peer effects and attrition from the sciences. IZA J Labor Econ 4, 4 (2015).



JEL codes

  • I21
  • J24


  • College major choice
  • Mismatch
  • Peer Effects