A rigorous quantile-regression analysis of 229 UK higher education providers, linking occupational outcomes from HESA Table 22 to salary distributions from Table 26 — to quantify the graduate skills premium and benchmark provider performance.
UK policy-makers and prospective students lack a rigorous, provider-level evidence base linking occupational outcomes to graduate earnings. Raw salary tables exist, but the underlying signal is buried in noise.
Does attending a provider whose graduates enter high-skill professional roles (SOC major groups 1–3) translate into a measurable salary premium — and does this premium vary across the earnings distribution?
HESA publishes Table 22 (occupational outcomes) and Table 26 (salary bands) separately. No published analysis joins them at provider level to quantify the skills premium: the salary uplift associated with professional occupation entry rates.
Jisc and HESA need analyst-grade evidence to inform the Teaching Excellence Framework (TEF), Office for Students (OfS) Graduate Outcomes metrics, and widening participation benchmarking across 229 providers.
Salary data is banded (not individual), counts are suppressed for small providers, and the relationship between occupational mix and earnings is heterogeneous across the distribution — requiring quantile regression, not OLS.
National Salary Range Across 229 Providers
£26,938 spread between the lowest- and highest-earning providers, equivalent to 186% of the lowest provider-level figure
Two HESA Graduate Outcomes Survey tables — joined at provider level for the first time — covering UK-domiciled first degree graduates in full-time paid employment.
Table 22 (occupational outcomes):

| Field | Description | Type |
|---|---|---|
| provider | Higher education provider name (UKPRN-linked) | string |
| year | Academic year (2017/18 to 2022/23) | string |
| mode | Mode of former study: Full-Time / Part-Time | factor |
| soc_group | SOC 2020 major group (1–9 + Unknown) | factor |
| is_graduate_role | Derived: TRUE if SOC group 1, 2, or 3 | boolean |
| count | Number of graduates (suppressed if <5) | integer |
| pct_graduate_role | % of graduates in SOC 1–3 (derived) | numeric |
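
The two derived fields above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the source describes an R workflow), the function names and sample counts are hypothetical, and two choices are assumptions rather than anything the source states: a provider with any suppressed count gets no percentage, and the "Unknown" group stays in the denominator.

```python
from typing import Dict, Optional

GRADUATE_SOC_GROUPS = {1, 2, 3}  # SOC 2020 major groups counted as graduate roles


def is_graduate_role(soc_group: str) -> bool:
    """True for SOC major groups 1-3; False for groups 4-9 and 'Unknown'."""
    return soc_group.isdigit() and int(soc_group) in GRADUATE_SOC_GROUPS


def pct_graduate_role(counts: Dict[str, Optional[int]]) -> Optional[float]:
    """Percentage of graduates in SOC 1-3 for one provider.

    A count of None stands for a suppressed cell. Assumption: if any cell is
    suppressed, or the total is zero, no percentage is reported.
    """
    if any(c is None for c in counts.values()):
        return None
    total = sum(counts.values())
    if total == 0:
        return None
    graduate = sum(c for group, c in counts.items() if is_graduate_role(group))
    return 100.0 * graduate / total


# Hypothetical counts for a single provider-year.
example_counts = {"1": 10, "2": 50, "3": 20, "4": 10, "9": 5, "Unknown": 5}
example_pct = pct_graduate_role(example_counts)

# The same provider with one suppressed cell yields no percentage.
suppressed_pct = pct_graduate_role({**example_counts, "9": None})
```

Whether to drop "Unknown" from the denominator is a judgment call that changes the percentage, so it is worth fixing explicitly before comparing providers.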

Table 26 (salary bands):

| Field | Description | Type |
|---|---|---|
| provider | Higher education provider name (UKPRN-linked) | string |
| year | Academic year (2017/18 to 2022/23) | string |
| skill_group | High Skilled / Medium Skilled / Low Skilled | factor |
| salary_band | 14 salary bands from <£15k to £51k+ | string |
| salary_midpoint | Band midpoint used as numeric proxy (£) | numeric |
| count | Number of graduates in band (0 = genuine zero) | integer |
| weighted_mean_salary | Σ(midpoint × count) ÷ Σ(count) per provider | numeric |
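
The weighted mean formula in the last row, Σ(midpoint × count) ÷ Σ(count), can be checked by hand with a short sketch. It is in Python for illustration only (the source workflow is in R); the band midpoints and counts are hypothetical, and the function name is not taken from the source.

```python
from typing import List, Optional, Tuple


def weighted_mean_salary(bands: List[Tuple[float, int]]) -> Optional[float]:
    """Weighted mean of band midpoints for one provider.

    bands is a list of (salary_midpoint, count) pairs. Returns None when the
    provider has no graduates in any band, so that a missing mean is never
    confused with a mean of zero.
    """
    total = sum(count for _, count in bands)
    if total == 0:
        return None
    return sum(midpoint * count for midpoint, count in bands) / total


# Hypothetical provider: four bands with illustrative midpoints (in pounds).
example_bands = [(16_500, 10), (19_500, 30), (22_500, 40), (25_500, 20)]
example_mean = weighted_mean_salary(example_bands)
# (165,000 + 585,000 + 900,000 + 510,000) / 100 = 21,600
```

The open-ended top band has no natural midpoint, so whatever value is assigned to it is an assumption that affects high-earning providers most.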

Population scope, both tables aligned to: UK-domiciled first degree graduates in full-time paid employment.
Google Colab Pro was chosen over the free tier for its high-RAM runtime (51 GB), needed for loading the full HESA tables, running bootstrap quantile regression (R=500), and serving a live Shiny dashboard via an ngrok tunnel.
Raw HESA CSVs loaded with skip-row handling, column harmonisation via janitor, and strict suppression-aware filtering (NA ≠ 0).
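
The NA ≠ 0 rule is the part most easily lost in loading, so a minimal sketch may help. It uses Python's standard library for illustration only (the source uses R with janitor); the column names, the set of suppression markers, and the sample file are all assumptions, not the actual HESA layout.

```python
import csv
from typing import Dict, List, Optional

# Assumed markers for a suppressed cell; the real files may use others.
SUPPRESSED_MARKERS = {"", "NA", ".."}


def parse_count(raw: str) -> Optional[int]:
    """Return None for a suppressed cell and an int otherwise (0 stays 0)."""
    raw = raw.strip()
    if raw in SUPPRESSED_MARKERS:
        return None
    return int(raw.replace(",", ""))


def load_counts(text: str, skip_rows: int = 0) -> List[Dict[str, object]]:
    """Skip leading title rows, then parse provider/count pairs."""
    lines = text.splitlines()[skip_rows:]
    reader = csv.DictReader(lines)
    return [
        {"provider": row["provider"].strip(), "count": parse_count(row["count"])}
        for row in reader
    ]


# Hypothetical file: one title row, then a header and three providers.
SAMPLE = """Title line to skip
provider,count
Provider A,120
Provider B,
Provider C,0
"""

rows = load_counts(SAMPLE, skip_rows=1)
known_total = sum(r["count"] for r in rows if r["count"] is not None)
```

Keeping suppressed cells as `None` rather than 0 means downstream sums and means must filter them out explicitly, which is the behaviour the filtering step describes.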
Tables 22 and 26 joined on UKPRN + year + mode to create a unified module_b_fixed analytical dataset of 1,205 provider-year-mode records.
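
The join on UKPRN + year + mode can be sketched as an inner join on a composite key. This is an illustrative Python version, not the R code behind module_b_fixed; the UKPRN values, field values, and the assumption that each key appears at most once in the right-hand table are hypothetical.

```python
from typing import Dict, List, Sequence

JOIN_KEYS = ("ukprn", "year", "mode")


def inner_join(left: List[Dict], right: List[Dict], keys: Sequence[str]) -> List[Dict]:
    """Inner join two lists of records on a composite key.

    Assumes each key combination is unique in `right`; rows in `left`
    without a match are dropped.
    """
    index = {tuple(row[k] for k in keys): row for row in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**match, **row})
    return joined


# Hypothetical provider-year-mode records from each table.
table22 = [
    {"ukprn": 10000001, "year": "2021/22", "mode": "Full-Time", "pct_graduate_role": 78.0},
    {"ukprn": 10000001, "year": "2021/22", "mode": "Part-Time", "pct_graduate_role": 70.0},
    {"ukprn": 10000002, "year": "2021/22", "mode": "Full-Time", "pct_graduate_role": 65.0},
]
table26 = [
    {"ukprn": 10000001, "year": "2021/22", "mode": "Full-Time", "weighted_mean_salary": 25400.0},
    {"ukprn": 10000002, "year": "2021/22", "mode": "Full-Time", "weighted_mean_salary": 23100.0},
]

joined = inner_join(table22, table26, JOIN_KEYS)
```

An inner join silently drops unmatched records, so comparing the row count before and after the merge is a cheap check on how many provider-year-mode combinations were lost.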
Three quantile regressions (Q25, Q50, Q75) via quantreg::rq() with bootstrap SE (R=500, seed=42) to capture heterogeneous salary effects across the distribution.
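
Quantile regression estimates are defined as minimisers of the check (pinball) loss, which weights positive and negative residuals asymmetrically. The sketch below shows only the simplest case, a constant with no predictors, where the minimiser is a sample quantile. It is a Python illustration of the loss, not a substitute for quantreg::rq() and not the bootstrap procedure; the salary values are hypothetical.

```python
from typing import List


def pinball_loss(residual: float, tau: float) -> float:
    """Check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def total_loss(values: List[float], candidate: float, tau: float) -> float:
    """Sum of check losses when every value is predicted by `candidate`."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values: List[float], tau: float) -> float:
    """Search the observed values for the one that minimises the total loss.

    For an intercept-only model a minimiser always lies at an observed value,
    so searching the data points is sufficient.
    """
    return min(sorted(set(values)), key=lambda c: total_loss(values, c, tau))


# Hypothetical provider-level salaries (in pounds).
salaries = [18_000, 20_000, 22_000, 24_000, 30_000]

q25 = best_constant(salaries, 0.25)
q50 = best_constant(salaries, 0.50)
q75 = best_constant(salaries, 0.75)
```

With a predictor such as the share of graduates in SOC 1–3, the same loss is minimised over an intercept and slope instead of a single constant, which is what allows the slope to differ between Q25, Q50, and Q75.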
Full 4-tab Shiny app with bslib, plotly, DT, and shinyWidgets — live at the ngrok URL, serving real-time filtered charts and league tables.
Each provider's weighted mean salary benchmarked against the national median (£22,172) to produce a signed skills premium in £ — enabling a ranked league table of 229 providers.
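
The signed premium is a simple difference from the national median, and the league table is a sort on that difference. A minimal Python sketch follows (the source workflow is in R); the £22,172 median is the figure quoted above, while the provider names, salaries, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

NATIONAL_MEDIAN = 22_172  # pounds, as quoted in the text


def skills_premium(weighted_mean_salary: float,
                   national_median: float = NATIONAL_MEDIAN) -> float:
    """Signed premium: positive above the national median, negative below."""
    return weighted_mean_salary - national_median


def league_table(provider_salaries: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank providers from highest to lowest premium."""
    rows = [(name, skills_premium(salary)) for name, salary in provider_salaries.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical weighted mean salaries for three providers.
example_salaries = {"Provider A": 25_400, "Provider B": 21_000, "Provider C": 22_172}
ranking = league_table(example_salaries)
```

Because the premium is a descriptive gap from a benchmark, a provider's rank reflects its graduate mix as well as anything the provider itself contributes.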
Shiny is served on port 3838 from a processx background process; the ngrok binary tunnels traffic to a public HTTPS URL without firewall configuration.
Pipeline stages: (1) Tables 22 & 26 from HESA; (2) suppress NA, filter scope; (3) provider-level merge; (4) weighted means, % SOC 1–3; (5) Q25 / Q50 / Q75 quantile regressions; (6) provider vs national median; (7) interactive exploration.
Six headline findings from the analysis of 229 providers, 1,205 provider-year-mode records, and three quantile regression models.
| Quantile | Estimate (£ per 1 pp of SOC 1–3 share) | 95% CI | p-value |
|---|---|---|---|
| Q25 — Lower quartile | £80 | £58 – £102 | p < 0.001 |
| Q50 — Median | £92 | £60 – £124 | p < 0.001 |
| Q75 — Upper quartile | £172 | £142 – £203 | p < 0.001 |
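
To read the table, each estimate can be scaled by a difference in the share of graduates in SOC 1–3. The Python sketch below does that arithmetic using only the figures in the table; the 10-percentage-point comparison is an illustrative choice, the function names are hypothetical, and the interval-overlap check is an informal reading aid rather than a formal test of equal slopes.

```python
from typing import Tuple

# Estimates and 95% confidence intervals copied from the table (pounds per 1 pp).
ESTIMATES = {
    "Q25": {"estimate": 80, "ci": (58, 102)},
    "Q50": {"estimate": 92, "ci": (60, 124)},
    "Q75": {"estimate": 172, "ci": (142, 203)},
}


def implied_difference(quantile: str, pp_change: float) -> float:
    """Salary difference associated with a pp_change gap in SOC 1-3 share."""
    return ESTIMATES[quantile]["estimate"] * pp_change


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Associated salary difference for providers 10 pp apart in SOC 1-3 share.
diff_q25 = implied_difference("Q25", 10)
diff_q50 = implied_difference("Q50", 10)
diff_q75 = implied_difference("Q75", 10)

# Informal comparison of slopes across quantiles.
q25_vs_q50 = intervals_overlap(ESTIMATES["Q25"]["ci"], ESTIMATES["Q50"]["ci"])
q50_vs_q75 = intervals_overlap(ESTIMATES["Q50"]["ci"], ESTIMATES["Q75"]["ci"])
```

On these figures the Q25 and Q50 intervals overlap while the Q75 interval sits above both, which is consistent with a steeper association at the top of the distribution. These are associations between provider-level quantities, not effects on any individual graduate.
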
Interactive Plotly charts generated directly from the R analysis — embedded below for reference alongside the live Shiny dashboard.
This analysis follows HESA official statistics publication standards — including suppression handling, population alignment, and transparent uncertainty quantification.
Salary midpoints approximate true earnings. The open-ended upper band introduces uncertainty for high-earning providers. Table 22 SOC classification reflects graduate destination 15 months after graduation — not lifetime outcomes. Response rates vary by provider (see HESA Table 5). Skills premium is descriptive, not causal — unobserved provider and student characteristics confound the relationship. HESA data used under Creative Commons Attribution 4.0 International licence.