A Toy Model of College Admissions

Allen Sirolly / November 27, 2020

The chart below from a 2019 Pew Research article caught my attention the other day. It shows a concerning trend of increasing selectivity at colleges which has been written about in a number of other places as well.1 The article reports that the average number of applications per enrollee increased from about four in 2002 to 6.8 in 2017, with the increase likely even more severe for competitive colleges. While this doesn’t necessarily make it “harder” to get into these schools (it could just be that more students who would have been non-competitive in previous years are applying), the rise in the average number of applications per applicant can certainly bring with it a concomitant increase in anxiety for successive waves of prospects.

My goal is to build a model for a multi-period college admissions market which, through some mechanism, gives rise to a vicious cycle of increasing application volume and lower admission rates. This is a big task which I don’t expect to fully accomplish here; instead, I’ll use this post to sketch out a simple “toy model” which might serve as a starting point. The main idea is that applicants shrink their beliefs about their chances of admission toward a public signal, such as last year’s admission rates, in such a way that the resulting admission rates will be slightly lower this year. I’ll start by considering a one-period model where applicants solve a utility maximization problem.

The Model


Suppose there is an ordered set of colleges \(\mathcal{K}\) indexed by \(k = 1, \dots, K\). Each has capacity \(A_k\) for an entering freshman class. Colleges receive reports of \(\widetilde{W}_i\) (a noisy signal of applicant \(i\)’s ability/“type” \(W_i\)) from all applicants and must decide whom to admit based on capacity and expected yield. Colleges prefer applicant \(i\) to applicant \(j\) if \(\widetilde{W}_i > \widetilde{W}_j\), and applicants prefer college \(k\) to \(k'\) if \(k < k'\).

To compute the results of an admissions round, I pretend that decisions are made sequentially and publicly, such that college \(k\) observes the admittances of all higher-ranked colleges (\(k' < k\)). The admissions process follows a simple algorithm2:

  • The top-ranked college sorts its applicants by \(\widetilde{W}_i\) (descending) and admits the top students up to capacity \(A_1\) (all of whom match with it, for a 100% yield);

  • The second-ranked college sorts its applicants by \(\widetilde{W}_i\) (descending) and admits all who were admitted by the first-ranked college, plus the top students who were not admitted by the first-ranked college, up to capacity \(A_2\);

  • And so on for \(k = 3, \dots, K\).

The embedded assumptions are that (a) colleges do not have minimum requirements for \(\widetilde{W}_i\) and (b) colleges will not reject applicants for being “overqualified”3 (in the sense of having a \(\widetilde{W}_i\) higher than another admit).


The choice problem

Each applicant \(i\) has utility function \(u_i(k)\) over \(\mathcal{K}\) and faces a constant application cost \(c\). Each has an ability (“type”) \(W_i\) and beliefs—shaped by priors and public information—about the probability of acceptance to college \(k\) given \(W_i\). The problem the applicant faces is to choose a set of colleges \(C_i\) to which to apply; from the subset of admittances, the applicant will choose the college which maximizes utility. Assume that the utility of not being admitted to any college is normalized to 0, with \(u_i(k) > 0\) for all \(k\).

Let \(\mathscr{P}(S)\) denote the power set of \(S\), so that \(C_i \in \mathscr{P}(\mathcal{K})\) are the possible application/choice sets4 and \(\mathcal{S} \in \mathscr{P}(C_i)\) are the possible sets of colleges to which the applicant is admitted. Formally, an applicant will solve

\[ \max_{C_i \in \mathscr{P}(\mathcal{K})} \Bigg\{ \sum_{\mathcal{S} \in \mathscr{P}(C_i)} P_i(\mathcal{S}) \cdot \max_{k \in \mathcal{S}} u_i(k) - c |C_i| \Bigg\} \]

where \(P_i(\mathcal{S})\) is the probability that \(i\) is admitted to colleges in \(\mathcal{S}\) and rejected from colleges in \(C_i \setminus \mathcal{S}\). Applicants believe that admissions decisions by colleges are independent, so that

\[ P_i(\mathcal{S}) = \prod_{k \in \mathcal{S}} P(\text{admit at }k \mid W_i) \prod_{k \in C_i\setminus\mathcal{S}} (1 - P(\text{admit at }k \mid W_i)). \]

\(\mathscr{P}(\mathcal{K})\) is potentially a very large search space; I simplify the problem by using a greedy algorithm which at each step (starting from \(C_i = \varnothing\)) adds the college \(k'\) to \(C_i\) which yields the highest increase in expected utility, provided there is one. (update 12/12/2020: here’s a proof that this algorithm finds the optimal \(C^*(W)\).) Note that by adding \(k'\) to \(C_i\), an applicant’s expected utility will change by

\[ P(\text{admit at }k' \mid W_i) \left[ \sum_{\mathcal{S} \in \mathscr{P}(C_i)} P_i(\mathcal{S}) \cdot \max\{0,\; u_i(k') - \max_{k \in \mathcal{S}} u_i(k)\} \right] - c \]

This expression can be simplified by considering each “best option” in \(C_i\) to which \(i\) may be admitted:

\[ P(\text{admit at }k' \mid W_i) \left[ P_i(\varnothing) u_i(k') + \sum_{k \in C_i} P_i(k \text{ is best option}) \cdot \max\{0,\; u_i(k') - u_i(k) \} \right] - c \]

The corresponding probabilities are easy to compute by the independent admissions assumption.


Suppose that types are distributed \(W \sim G(w)\). Each applicant knows her own type as well as the distribution \(G\), and sends colleges a noisy signal \(\widetilde{W}_i = W_i + \epsilon_i\), \(\epsilon_i \sim F(\epsilon)\), where \(F\) is known but the realization of \(\epsilon_i\) is unknown. I assume that prior beliefs are consistent with a view of the admissions process as a perfect sorting mechanism, such that the \(\sum_{j\leq k} A_j\) applicants with the highest \(\widetilde{W}_i\)’s will be admitted to the \(k\) highest-ranked colleges. (This actually agrees with the outcome of the colleges’ decision process if every prospect sends an application to every college, which is what happens when \(c = 0\).5)

Let \(\widetilde{G}\) be the marginal distribution of \(\widetilde{W}_i\) (which applicants can calculate using knowledge of \(F\) and \(G\)). After observing public information \(I_k\), e.g., last year’s admission rate at college \(k\), the posterior beliefs are

\[ P_\alpha(\text{admit at }k \mid W_i) = (1 - \alpha) P\left( 1 - \widetilde{G}(\widetilde{W}_i) < \frac{\sum_{j\leq k}A_j}{\sum_j A_j} \right) + \alpha I_k \]

where \(\alpha \in [0, 1]\) represents the degree of shrinkage. There are certainly other possibilities: maybe it’s more realistic that applicants only update their beliefs to make themselves overconfident about their chances at more selective colleges.

One challenging problem (which I won’t attempt in this post) is to figure out the beliefs and utility functions which result in an admissions outcome that is consistent with \(\{I_k\}\), in an equilibrium sense, for the \(c > 0\) case.


The features of this model can be studied via a simulation exercise, in the spirit of agent-based modeling. I’ve made the code available on my GitHub.

For the simulation I construct a set of \(K = 50\) colleges, each with capacity \(A_k = 100\). There are \(\sum_k A_k = 5000\) prospective applicants with

\[ W_i \sim \mathcal{N}(0, 1^2) \\ \widetilde{W}_i \mid W_i \sim \mathcal{N}(W_i, 0.1^2) \]

so that \(\widetilde{W}_i \sim \mathcal{N}(0, 1.1^2)\). The public admission rates are

\[ I_k = \left( 1 - \frac{k-1}{K-1} \right) \underline{r} + \frac{k-1}{K-1}\;\overline{r} \]

with \((\underline{r}, \overline{r}) = (.05, 1)\), which corresponds to \(I_k = \sum_{j\leq k} A_j / \sum_j A_j\) in this example. The utility function (common to all applicants) has the form

\[ u_i(k) = I_k^{-\beta} + \gamma (K - k) \]

for \(\beta > 0\) and \(\gamma > 0\). The latter parameter guarantees that there is a minimum increment in the utility function between consecutively-ranked colleges. I normalize the utilities so that \(\sum_k u_i(k)\cdot I_k = 1000\).

I computed the “results” of this artificial admissions process for a grid of parameter values, which led to some interesting charts:

Set of utility functions considered.
Applicants’ beliefs about probability of acceptance at each college, for different values of \((W_i, \alpha)\). Dashed line is the public information \(I_k\).
Application volume by college rank for \(\gamma = 3\) and different values of \((\alpha, \beta, c)\).
Acceptance rates for \(\gamma = 3\) and different values of \((\alpha, \beta, c)\). Dotted line is \(I_k\).
Yields for \(\gamma = 3\) and different values of \((\alpha, \beta, c)\). Dotted line is \(I_k\).
Average number of applications per applicant by ability/type percentile, for \(\gamma = 3\) and different values of \((\alpha, \beta, c)\).
Total number of applications to each college, by applicant ability/type percentile, for \(\gamma = 3\), \(\beta = 1.5\), and different values of \((\alpha, c)\).
Number of admittances to each college, by applicant ability/type percentile, for \(\gamma = 3\), \(\beta = 1.5\), and different values of \((\alpha, c)\).

That’s it for now! More to come in a future post…

  1. I tweeted about it, too.

  2. This is similar to what’s known as a “serial dictatorship mechanism”, the difference being that there is still an application process that forces students to make decisions about which colleges to apply to.

  3. A richer model might incorporate incentives for colleges to keep their admission rates low.

  4. In practice, applicants will face an application limit \(L\) so that the choice set is \(C_i \in \{ \mathcal{C} \in \mathscr{P}(\mathcal{K}) : |\mathcal{C}| < L \}\). The Common App presently limits students to 20 applications.

  5. The acceptance rate for college \(k\) will be \(\sum_{j\leq k} A_j/\sum_j A_j\), and the yield \(A_k / \sum_{j\leq k} A_j\).