About & methodology

How Capitol Releases works

Capitol Releases archives official press output from all 100 U.S. senators and 435 U.S. representatives. The goal is a searchable public record with enough provenance that a reporter can cite it and a developer can audit it.

What we collect

We collect original content from official .gov member websites: press releases, statements, op-eds, blog posts, floor statements, letters and photo releases.

The collection window starts Jan. 1, 2025. For seat changes, the archive follows the current officeholder only from the day that person took office.
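The window rule can be sketched in Python. This is a minimal illustration rather than the project's collector code; the function name in_collection_window and its parameters are hypothetical.

```python
from datetime import date

# Start of the collection window stated above.
WINDOW_START = date(2025, 1, 1)


def in_collection_window(published: date, took_office: date) -> bool:
    """Return True if an item falls inside the archive's window.

    Hypothetical helper: an item counts only if it was published on or
    after both the global window start and the day the current
    officeholder took office.
    """
    return published >= max(WINDOW_START, took_office)
```

An item published before the current officeholder's first day is out of scope even when it falls after Jan. 1, 2025, which matches the no-backfill rule for seat changes.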

Type | Definition
press_release (Press release) | The default class for original announcements from a member's news, media or press section.
statement (Statement) | A public statement posted by the office, usually without a separate legislative action attached.
op_ed (Op-ed) | Signed commentary or opinion writing republished on the official site.
blog (Blog post) | Original posts from member blog, diary, newsletter or similar site sections.
floor_statement (Floor statement) | Floor remarks when a member's office publishes them on its own press page.
letter (Letter) | Published letters to agencies, officials, colleagues or constituents.
photo_release (Photo release) | Photo-only or media-advisory items. Stored, but excluded from default public feeds.
presidential_action (Presidential action) | White House actions stored in the same schema for federal executive coverage.
other (Other) | Original official content that does not fit a more specific class. Reviewed during cleanup.
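For developers, the type list maps naturally onto an enumeration. The sketch below is an assumption about how a client might model it; ContentType, EXCLUDED_FROM_DEFAULT_FEED and in_default_feed are hypothetical names, and only the photo_release exclusion comes from the definitions above.

```python
from enum import Enum


class ContentType(str, Enum):
    """Content classes, using the type identifiers listed above."""
    PRESS_RELEASE = "press_release"
    STATEMENT = "statement"
    OP_ED = "op_ed"
    BLOG = "blog"
    FLOOR_STATEMENT = "floor_statement"
    LETTER = "letter"
    PHOTO_RELEASE = "photo_release"
    PRESIDENTIAL_ACTION = "presidential_action"
    OTHER = "other"


# Only photo_release is documented as excluded from default public feeds.
EXCLUDED_FROM_DEFAULT_FEED = {ContentType.PHOTO_RELEASE}


def in_default_feed(content_type: str) -> bool:
    """Hypothetical filter; raises ValueError for an unknown type string."""
    return ContentType(content_type) not in EXCLUDED_FROM_DEFAULT_FEED
```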

What we don't

We do not collect third-party clippings, "In the News" mentions, campaign content, campaign websites, interviews or outside media hits.

We do not backfill predecessor coverage when a seat changes hands. We also do not collect voting records, bill tracking or campaign finance records. Those records already exist elsewhere, including Congress.gov and the FEC.

How dates work

Every record can carry two date fields beyond the timestamp itself: date_source and date_confidence. They record where the date came from and how much the parser trusts it.

Most dates come from metadata, listing text or page-level date elements. About 1% of records have null dates, mostly from ColdFusion sites where the date is embedded in body text rather than exposed as metadata.
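A consumer of the archive has to handle null dates explicitly. The sketch below assumes a record shape for illustration: published_at is a hypothetical name for the timestamp, and the string and float types chosen for date_source and date_confidence are guesses, since this page does not specify their values.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class DatedRecord:
    published_at: Optional[datetime]   # None when no date could be parsed
    date_source: Optional[str]         # assumed free-text label, e.g. "metadata"
    date_confidence: Optional[float]   # assumed 0.0 to 1.0 scale


def null_date_share(records: List[DatedRecord]) -> float:
    """Fraction of records whose date is null (0.0 for an empty list)."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.published_at is None)
    return missing / len(records)
```

Run over a full export, a helper like this is one way to check the roughly 1% null-date figure independently.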

Provenance

Every record stores source_url, scrape_run and scraped_at. The source URL is the office's page. The scrape run ties the row back to a collector pass. The scrape timestamp says when Capitol Releases saw it.

Records are never hard-deleted. If a source URL stops resolving on repeated checks, the row stays in the archive and gets a deleted_at tombstone.
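The tombstone behavior can be sketched as a soft delete. This is an illustration under assumptions: the failed_checks counter, the threshold of three and the record_check function are hypothetical, because the page says only that the URL must fail on "repeated checks".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed threshold; the source does not state how many checks must fail.
FAILED_CHECKS_BEFORE_TOMBSTONE = 3


@dataclass
class Row:
    source_url: str
    scrape_run: str
    scraped_at: datetime
    deleted_at: Optional[datetime] = None
    failed_checks: int = 0  # hypothetical counter, not a documented field


def record_check(row: Row, resolved: bool, now: datetime) -> Row:
    """Update a row after one URL check; the row itself is never removed."""
    if resolved:
        row.failed_checks = 0
        return row
    row.failed_checks += 1
    if row.failed_checks >= FAILED_CHECKS_BEFORE_TOMBSTONE and row.deleted_at is None:
        row.deleted_at = now  # tombstone: mark as gone, keep the data
    return row
```

Whether a tombstone is cleared when a URL starts resolving again is not stated here, so the sketch leaves deleted_at untouched in that case.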

Update cadence

GitHub Actions runs collection four times a day: 13:00, 17:00, 21:00 and 01:00 UTC. The same schedule refreshes WordPress JSON silos used for op-eds, newsletters, blogs and related official sections.
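To work out when the next pass is due, the schedule can be encoded directly. The sketch below is illustrative; next_run is a hypothetical helper, and the cron expression in the comment is an assumed equivalent of the stated times, not a line copied from the workflow file.

```python
from datetime import datetime, timedelta, timezone

# Collection hours in UTC, as stated above.
# An equivalent cron expression (assumed): "0 1,13,17,21 * * *"
RUN_HOURS_UTC = (1, 13, 17, 21)


def next_run(now: datetime) -> datetime:
    """Return the first scheduled run strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    for day_offset in (0, 1):
        day = (now + timedelta(days=day_offset)).date()
        for hour in RUN_HOURS_UTC:
            candidate = datetime(day.year, day.month, day.day, hour, tzinfo=timezone.utc)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: tomorrow always has a later run")
```

Scheduled GitHub Actions workflows can start later than the nominal time, so a scraped_at value a few minutes past the hour is normal.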

A health check runs before every collection pass. It verifies that configured source pages respond, selectors still find items and dates remain parseable.
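The three checks can be sketched as one function per source. This is a simplified illustration, not the project's health check: HealthResult, check_source and the injected fetch, select_items and parse_date callables are all hypothetical, and treating "dates remain parseable" as "every listed item yields a date" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class HealthResult:
    responds: bool         # the source page returned content
    items_found: int       # how many items the selectors matched
    dates_parseable: bool  # every matched item yielded a date (assumed rule)

    @property
    def healthy(self) -> bool:
        return self.responds and self.items_found > 0 and self.dates_parseable


def check_source(
    url: str,
    fetch: Callable[[str], Optional[str]],       # returns HTML, or None on failure
    select_items: Callable[[str], List[str]],    # applies the configured selectors
    parse_date: Callable[[str], Optional[str]],  # returns a date string, or None
) -> HealthResult:
    html = fetch(url)
    if html is None:
        return HealthResult(False, 0, False)
    items = select_items(html)
    dates_ok = bool(items) and all(parse_date(item) is not None for item in items)
    return HealthResult(True, len(items), dates_ok)
```

Passing the fetcher and parsers in as arguments keeps the check testable without network access. An office that legitimately publishes nothing would fail the items_found test in this sketch and would need separate handling.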

Coverage status

The live coverage diagnostic is expected at docs/coverage-diagnostic-2026-05-03.json. Until that lands, this page points to the current House trouble list.

House coverage trouble sites, May 3, 2026

Metric | Status | Note
U.S. senators (clean) | 311 / 100 | 3 documented gaps; rest publishing on schedule.
U.S. House (configured) | 437 / 435 | Every member has a source row. Two non-voting delegate seats (DC, PR, etc.) fall outside the 435 voting seats but are counted in the 437 active rows.
House: reaches Jan. 2025 | 418 / 437 | 95.7% have ≥10 records reaching back to early 2025.
House: bulletproof accounted for | 435 / 437 | 99.5%. Clean rows plus 17 members tagged as documented gaps (Phase 2 scrapers, scraper bugs, low-volume offices).
House: open trouble list | 2 | Members with shallow archives where the cause is still under investigation.
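The House figures can be cross-checked with a few lines of arithmetic. The sketch assumes that the "clean rows" in the accounted-for figure are the same 418 rows that reach January 2025, which is an inference from the numbers rather than something the table states.

```python
HOUSE_ROWS = 437         # configured House source rows
REACHES_JAN_2025 = 418   # rows with >=10 records reaching early 2025
DOCUMENTED_GAPS = 17     # members tagged as documented gaps


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


accounted_for = REACHES_JAN_2025 + DOCUMENTED_GAPS   # 418 + 17
open_trouble = HOUSE_ROWS - accounted_for            # rows still unexplained

reach_pct = pct(REACHES_JAN_2025, HOUSE_ROWS)
accounted_pct = pct(accounted_for, HOUSE_ROWS)
```

Under that assumption the numbers are self-consistent: 418 + 17 = 435 accounted for, 437 - 435 = 2 open, and the two percentages round to 95.7 and 99.5.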

Known low-volume offices

Some offices publish rarely or not at all. Those rows will be marked in the seed files once the expected_low_volume and expected_zero fields land.

Name | Chamber | District/state | Status | Reason | Last verified
Adelita S. Grijalva | House | AZ-7 | Expected low volume | Sworn in 2025-09 after winning special election to replace her father Raúl Grijalva (died March 2025). Limited archive expected: 64 records since first day. | 2026-05-03
Alan Armstrong | Senate | OK | Expected zero | Sworn in 2026-03-24 to fill ND seat vacated by Hoeven retirement; office is still in setup phase, no press releases published yet (verified 2026-05-03). | 2026-05-03
Ashley Moody | Senate | FL | Expected low volume | Appointed March 5, 2025 to fill FL Senate seat vacated when Marco Rubio became Secretary of State. Listing reaches first publishing date; pre-appointment coverage doesn't apply. | 2026-05-03
Christian D. Menefee | House | TX-18 | Expected low volume | Sworn in February 2, 2026 after winning special election for Sylvester Turner's former Houston seat (TX-18). Limited archive expected: recently sworn in. | 2026-05-03
Clay Fuller | House | GA-14 | Expected low volume | Sworn in April 14, 2026 after winning special election to replace Marjorie Taylor Greene (GA-14). Limited archive expected: recently sworn in. | 2026-05-03
Guy Reschenthaler | House | PA-14 | Expected low volume | Chief Deputy Whip in 119th Congress. Whip-team work focuses on internal vote counting and member outreach rather than public press operations, which explains the relatively light press output (40 records starting March 2025). Verified by visual scan of his media listing 2026-05-03; this is his real output, not a scraper gap. | 2026-05-03
Jim Jordan | House | OH-4 | Expected zero | Found 13 items on listing page. | 2026-05-03
Jon Husted | Senate | OH | Expected low volume | Appointed February 18, 2025 to fill OH Senate seat vacated when JD Vance became Vice President. Listing reaches first publishing date; pre-appointment coverage doesn't apply. | 2026-05-03
Sheila Cherfilus-McCormick | House | FL-20 | Expected low volume | Resigned April 21, 2026 while facing House Ethics action and federal charges. Coverage stops at resignation date. | 2026-05-03

Schema history

The schema was renamed in May 2026 as the project moved from a Senate-only archive to Congress-wide coverage. The old senators table became officials, and press_releases became official_site_items. Compatibility views remain during the transition.
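A compatibility view lets old queries keep working against the renamed tables. The sketch below demonstrates the idea with an in-memory SQLite database. The database engine, the column names and the one-to-one view definitions are all assumptions; only the four table names come from this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- New table names after the May 2026 rename (columns are hypothetical).
CREATE TABLE officials (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE official_site_items (
    id INTEGER PRIMARY KEY,
    official_id INTEGER REFERENCES officials(id),
    title TEXT
);
-- Compatibility views expose the new tables under the old names.
CREATE VIEW senators AS SELECT * FROM officials;
CREATE VIEW press_releases AS SELECT * FROM official_site_items;
""")

conn.execute("INSERT INTO officials (id, name) VALUES (1, 'Example Member')")
conn.execute(
    "INSERT INTO official_site_items (official_id, title) VALUES (1, 'Example release')"
)

# A legacy query written against the old table name still returns rows.
legacy_rows = conn.execute("SELECT title FROM press_releases").fetchall()
```

A production senators view would more likely filter officials by chamber, but that detail is not documented here.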