DOL Form 5500 — ERISA pension & welfare plan filings
Verified May 16, 2026 · tested with live www.askebsa.dol.gov fetch (F_5500_2022, 30 MB) + apex-redirect gotcha confirmed
Form 5500 (US DOL / EBSA) is the free annual ERISA filing for every US pension and welfare plan with ≥100 participants. Schedule H gives plan-level asset breakdowns — including mutual-fund holdings — making it the go-to free source for retirement, household-finance, DC/401(k), and pension-as-shareholder research. It is what the ZeroPaper pipeline uses for retirement and labor-finance work.
- Cost: free, no auth.
- Coverage: annual from 1999; stable schedule structure from 2009 (post EFAST2). Plans <100 participants file 5500-SF (separate).
- Home: https://www.dol.gov/agencies/ebsa/about-ebsa/our-activities/public-disclosure/foia/form-5500-datasets
Access
Per-year, per-schedule ZIPs (no auth):

```
https://www.askebsa.dol.gov/FOIA%20Files/{year}/Latest/F_{NAME}_{year}_Latest.zip
NAME ∈ {5500, SCH_A, SCH_C, SCH_D, SCH_G, SCH_H, SCH_I, SCH_R, SCH_MB, SCH_SB}
```

```python
import io, zipfile, requests, pandas as pd

def get_5500(year, name="5500"):
    # Use the www. host and the percent-encoded path (see gotchas).
    url = (f"https://www.askebsa.dol.gov/FOIA%20Files/{year}"
           f"/Latest/F_{name}_{year}_Latest.zip")
    resp = requests.get(url, stream=True, timeout=120)
    resp.raise_for_status()  # fail on HTTP errors instead of unzipping an error page
    z = zipfile.ZipFile(io.BytesIO(resp.content))
    # Read the first member of the archive as CSV.
    return pd.read_csv(z.open(z.namelist()[0]), low_memory=False)

main22 = get_5500(2022)                     # ~243K plan filings
sch_h = get_5500(2022, "SCH_H")             # plan financials / asset breakdown
joined = main22.merge(sch_h, on="ACK_ID")   # filing-level key (see gotchas)
```

Gotchas (the ones that bite pipelines)
The reason to read this page rather than the DOL site. Verified live on the date above (the 2022 main file is a ~30 MB ZIP; the apex-redirect below was reproduced).
- Hit `www.askebsa.dol.gov`, not the apex. The apex `askebsa.dol.gov` issues a 301 whose `Location` contains a literal space (`FOIA Files`, not `FOIA%20Files`); confirmed live. That malformed header hangs `urllib`. Always request the `www.` host directly.
- Join schedules on `ACK_ID`, not `EIN`. `ACK_ID` is the DOL filing ID and the correct key to attach a schedule to one filing. `(EIN, PN)` identifies a plan across years: use that for a plan-year panel, not for joining schedules.
- “Latest” is a moving target. DOL revises files as late filings arrive. Row counts change month to month, so snapshot the cache and report the access date, or your results aren’t reproducible.
- Slow source. ~50 KB/s on `urllib`; use `requests` streaming (~2.5 MB/s). Expect a multi-minute first download per (year, schedule). Cache to `data/form_5500/`.
- Big in memory. Schedule H is ~52 MB CSV/year; multi-year panels reach hundreds of MB, so use a column subset (`usecols=`).
- Schedule structure stabilized in 2009. Pre-2009 column names differ; don’t assume a 2022 schema for a 2005 file.
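The caching, snapshot, and column-subset advice above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the pipeline's actual code: the helper names `snapshot_path`, `cached_zip`, and `read_subset`, the dated file-name layout, and the `fetch` callable are all hypothetical. Only the `data/form_5500/` directory and the `usecols=` idea come from the text.

```python
import zipfile
from datetime import date
from pathlib import Path

import pandas as pd

CACHE_DIR = Path("data/form_5500")  # cache directory named in the text


def snapshot_path(year, name, accessed, root=CACHE_DIR):
    """Hypothetical layout: one ZIP per (year, schedule, access date)."""
    return Path(root) / f"F_{name}_{year}_{accessed.isoformat()}.zip"


def cached_zip(year, name, fetch, accessed=None, root=CACHE_DIR):
    """Return the path of a dated snapshot, downloading only on a cache miss.

    `fetch(year, name)` is any callable that returns the ZIP as bytes,
    for example a thin wrapper around requests.get(...).content.
    """
    accessed = accessed or date.today()
    path = snapshot_path(year, name, accessed, root)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(year, name))
    return path


def read_subset(zip_path, usecols):
    """Load only the requested columns from the first CSV in the ZIP."""
    with zipfile.ZipFile(zip_path) as z:
        with z.open(z.namelist()[0]) as fh:
            return pd.read_csv(fh, usecols=usecols, low_memory=False)
```

Putting the access date in the file name means a later re-download lands in a new file instead of silently overwriting the snapshot a result was computed from, and `read_subset` keeps memory bounded by parsing only the columns an analysis needs.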
Key columns
| Column | Meaning |
|---|---|
| `ACK_ID` | DOL filing ID; the join key for schedules |
| `EIN`, `PN` | Sponsor EIN + plan number; `(EIN, PN)` = plan across years |
| `TOT_ASSETS_EOY_AMT` | Total plan assets, end of year (Sch H) |
| `INT_REG_INVST_CO_EOY_AMT` | Mutual-fund holdings (registered inv. cos.) |
| `INT_COMMON_TR_EOY_AMT` | Common collective trusts (DC substitute) |
| `EMPLR_CONTRIB_*_AMT` / `PARTCP_CONTRIB_*_AMT` | Employer / participant contributions |
| `PARTCP_LOANS_*_AMT` | Participant loans outstanding |
Pair BOY/EOY columns to construct flows.
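One way to pair the columns is to match each `*_EOY_AMT` column with a `*_BOY_AMT` column of the same stem and take the difference. The sketch below runs on synthetic rows; the helper name `paired_changes`, the `_CHG_AMT` suffix, and the `TOT_ASSETS_BOY_AMT` column name (inferred from the EOY naming pattern, not checked against a data dictionary) are assumptions.

```python
import pandas as pd


def paired_changes(df):
    """Return EOY minus BOY for every column stem that has both variants."""
    out = {}
    for col in df.columns:
        if col.endswith("_EOY_AMT"):
            stem = col[: -len("_EOY_AMT")]
            boy = f"{stem}_BOY_AMT"
            if boy in df.columns:  # skip EOY columns without a BOY partner
                out[f"{stem}_CHG_AMT"] = df[col] - df[boy]
    return pd.DataFrame(out, index=df.index)


# Synthetic rows; the *_BOY_AMT name is assumed from the naming pattern.
demo = pd.DataFrame({
    "ACK_ID": ["A1", "A2"],
    "TOT_ASSETS_BOY_AMT": [1000.0, 500.0],
    "TOT_ASSETS_EOY_AMT": [1100.0, 450.0],
    "INT_REG_INVST_CO_EOY_AMT": [600.0, 200.0],  # no BOY partner in this demo
})
changes = paired_changes(demo)
```

The result is a raw within-year change in each balance, which still mixes investment returns with contributions and withdrawals.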
Standard operations
- Plan-year panel: stack Schedule H across years, key on `(EIN, PN, year)`.
- Mutual-fund exposure share: `INT_REG_INVST_CO_EOY_AMT / TOT_ASSETS_EOY_AMT`.
- Implied flows: `EOY − BOY·(1+r_t)` with a benchmarked plan return `r_t`.
- Sponsor link: match sponsor EIN to Compustat for firm characteristics.
- Always state schedule, year, and access date (DOL revises “Latest”).
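The panel, share, and implied-flow operations listed above might look like this on synthetic data. It is a sketch under assumptions rather than a reference implementation: the function names `build_panel` and `add_measures` are hypothetical, each yearly frame is assumed to already carry `EIN` and `PN` columns, `TOT_ASSETS_BOY_AMT` is assumed from the EOY naming pattern, and the returns are placeholder numbers, not real benchmark data.

```python
import pandas as pd


def build_panel(frames_by_year):
    """Stack per-year Schedule H frames and tag each row with its year."""
    parts = [df.assign(year=yr) for yr, df in sorted(frames_by_year.items())]
    panel = pd.concat(parts, ignore_index=True)
    # Defensive check: (EIN, PN, year) is meant to identify one row.
    if panel.duplicated(["EIN", "PN", "year"]).any():
        raise ValueError("duplicate (EIN, PN, year) rows; deduplicate first")
    return panel


def add_measures(panel, returns_by_year):
    """Add the mutual-fund share and the implied flow EOY - BOY*(1+r_t)."""
    out = panel.copy()
    total = out["TOT_ASSETS_EOY_AMT"]
    # Leave the share undefined (NaN) where total assets are zero.
    out["mf_share"] = out["INT_REG_INVST_CO_EOY_AMT"] / total.where(total != 0)
    r = out["year"].map(returns_by_year)
    out["implied_flow"] = total - out["TOT_ASSETS_BOY_AMT"] * (1 + r)
    return out


# Synthetic single-plan example; all values are made up for illustration.
frames = {
    2021: pd.DataFrame({"EIN": ["111"], "PN": ["001"],
                        "TOT_ASSETS_BOY_AMT": [900.0],
                        "TOT_ASSETS_EOY_AMT": [1000.0],
                        "INT_REG_INVST_CO_EOY_AMT": [400.0]}),
    2022: pd.DataFrame({"EIN": ["111"], "PN": ["001"],
                        "TOT_ASSETS_BOY_AMT": [1000.0],
                        "TOT_ASSETS_EOY_AMT": [1200.0],
                        "INT_REG_INVST_CO_EOY_AMT": [600.0]}),
}
returns = {2021: 0.10, 2022: 0.05}  # placeholder benchmark returns
result = add_measures(build_panel(frames), returns)
```

Raising on duplicate keys is a deliberate choice here: a silent duplicate would double-count a plan in any aggregate built from the panel.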
Citation
U.S. Department of Labor, Employee Benefits Security Administration, Form 5500 [Schedule, plan year], public-use research files; https://www.dol.gov/agencies/ebsa/…/form-5500-datasets, accessed YYYY-MM-DD.