source · business · ~30M entities reachable

US Business

All currently-registered US business entities — SEC EDGAR public companies (10K w/ ticker, 1M+ historical CIKs), IRS exempt orgs (1.95M nonprofits), state SOS filings (NY 20.5M free), GLEIF LEI (349K US, CC0), SAM.gov vendors, PPP recipients (5M small biz).

Entities (merged): 59,988 (live)
NY corporations: 50,000 (live)
FL Sunbiz: 60 (stub)
PPP loans: 10,000 (live)
GLEIF LEI · US: 500 (live)
SAM.gov vendors: 60 (stub)

Entities: 59,988 (live)

Live mix at 60K total: 9K SEC public companies (with ticker + exchange) + 9K NY DOS recent formations (Socrata live) + 9K SBA PPP recipients + 3K GLEIF LEI + 30K nonprofits across all 14 downloaded IRS EO states (CA/TX/NY/FL/IL/PA/OH/GA/MI/NC/NJ/MA/VA/WA). Detail page at `/business/entities/[id]` cross-links via clusters when available.

SEC EDGAR + NY DOS + IRS EO + PPP + GLEIF

Officers & directors: 100,365 (live)

Named officers, directors and key employees from public US filings — SEC EDGAR Form 4 (insider transaction filings, public companies) + IRS Form 990 Part VII Section A (compensation table for nonprofits, top-paid first). Names + titles + companies + compensation. No emails — pair with company website / Apollo for outreach.

SEC EDGAR Form 4 + IRS 990 e-file XML

Cross-source clusters: 1,000 (live)

Linked-entity clusters where the same legal entity appears across 2+ sources — joined via shared CIK / EIN / LEI / UEI or (normalized name + state) fallback. 1,000 clusters retained (member_count ≥ 2) from union-find over 6 datasets. Click a cluster to see all source records side-by-side.

ingest/entity_link union-find
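
The linking step described for cross-source clusters can be sketched with a small union-find. This is a minimal illustration, not the contents of `ingest/entity_link`: the record field names (`cik`, `ein`, `lei`, `uei`, `name`, `state`), the `normalize_name` helper, and the choice to emit the name + state key for every record (the most permissive reading of "fallback") are all assumptions.

```python
from collections import defaultdict

# Hard identifiers named in the description; the dict keys are assumptions.
ID_KEYS = ("cik", "ein", "lei", "uei")


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


class UnionFind:
    def __init__(self, n: int) -> None:
        self.parent = list(range(n))

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def link_keys(record: dict) -> list[tuple[str, str]]:
    """All keys under which this record may be joined to another record."""
    keys = [(k, str(record[k])) for k in ID_KEYS if record.get(k)]
    if record.get("name") and record.get("state"):
        name_state = f"{normalize_name(record['name'])}|{record['state'].upper()}"
        keys.append(("name_state", name_state))
    return keys


def build_clusters(records: list[dict]) -> list[list[int]]:
    """Return clusters (lists of record indices) with at least two members."""
    uf = UnionFind(len(records))
    first_seen = {}  # link key -> index of the first record carrying it
    for i, rec in enumerate(records):
        for key in link_keys(rec):
            if key in first_seen:
                uf.union(first_seen[key], i)
            else:
                first_seen[key] = i
    groups = defaultdict(list)
    for i in range(len(records)):
        groups[uf.find(i)].append(i)
    return [members for members in groups.values() if len(members) >= 2]
```

Because two records only need to share one key, a chain such as SEC (CIK) to IRS (EIN) can still close through a third record that carries both identifiers; a name-only join is looser and would likely need review in a real pipeline.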

NY corporations: 50,000 (live)

Active NY State corporations / LLCs / LPs / nonprofits via Socrata `n9v6-gdp6` (data.ny.gov, free, anonymous, near-realtime). 50K rows of the canonical one-row-per-entity registry across 5 entity_kinds. Full corpus 20.5M filings via the per-event 63wc-4exh dataset.

NY Open Data
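
A 50K-row pull from a Socrata dataset is usually paged. The sketch below only builds the page URLs for `n9v6-gdp6`; it assumes the standard SODA `resource/<id>.json` endpoint with `$limit`, `$offset` and `$order=:id` for stable paging, and the `page_urls` helper and its page size are hypothetical, not taken from the ingest code.

```python
from urllib.parse import urlencode

# Assumed SODA endpoint pattern on data.ny.gov.
BASE = "https://data.ny.gov/resource/{dataset}.json"


def page_urls(dataset: str, total_rows: int, page_size: int = 10_000) -> list[str]:
    """Build one URL per page so that the pages cover total_rows rows."""
    urls = []
    for offset in range(0, total_rows, page_size):
        limit = min(page_size, total_rows - offset)
        # safe="$:" keeps the SoQL parameter names readable in the URL.
        query = urlencode(
            {"$limit": limit, "$offset": offset, "$order": ":id"}, safe="$:"
        )
        urls.append(BASE.format(dataset=dataset) + "?" + query)
    return urls


if __name__ == "__main__":
    for url in page_urls("n9v6-gdp6", 50_000):
        print(url)
```

Each URL can then be fetched with any HTTP client; anonymous access works, though an app token may raise rate limits.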

FL Sunbiz: 60 (stub)

Florida corp/LLC/LP filings via free SFTP (Public/PubAccess1845!). 60-row synthetic fixture spanning Miami/Orlando/Tampa/Jacksonville/Tallahassee. Full ingest path stubbed in `ingest/fl_sunbiz.py` with the 1440-byte fixed-width field-offset map encoded.

FL DOS Sunbiz SFTP
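
A fixed-width field-offset map of the kind mentioned for `ingest/fl_sunbiz.py` might look like the sketch below. Only the 1440-character record width comes from the description; the field names, offsets and lengths are HYPOTHETICAL placeholders, not the real Sunbiz layout, and the sketch treats each record as an already-decoded string.

```python
from typing import NamedTuple

RECORD_LEN = 1440  # record width stated in the description


class Field(NamedTuple):
    name: str
    start: int   # 0-based offset into the record
    length: int  # field width in characters


# HYPOTHETICAL layout for illustration only; not the real Sunbiz offsets.
FIELDS = [
    Field("doc_number", 0, 12),
    Field("entity_name", 12, 192),
    Field("status", 204, 1),
    Field("filing_type", 205, 15),
]


def parse_record(line: str) -> dict[str, str]:
    """Slice one fixed-width record into named, whitespace-stripped fields."""
    if len(line) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} chars, got {len(line)}")
    return {f.name: line[f.start:f.start + f.length].strip() for f in FIELDS}
```

Keeping the layout in a data table rather than in slicing code makes it easy to swap in the real offsets once they are confirmed against the published file definition.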

PPP loan recipients: 10,000 (live)

10K unique borrowers sampled from SBA `public_150k_plus` FOIA (968K loans / 863K unique). Top 4K by initial amount + 6K evenly-spaced mid-tier. 54 states/territories. Full deduped corpus (863,501 rows) on NAS at `staging/ppp_unique.parquet` (24 MB zstd).

SBA PPP FOIA
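
The "top 4K plus 6K evenly spaced" sampling can be sketched as follows. The `amount` key and the `sample_borrowers` helper are hypothetical, and since the description does not say where the mid-tier begins or ends, this version simply spaces the second sample across everything below the top slice.

```python
def sample_borrowers(rows: list[dict], top_n: int = 4000, spaced_n: int = 6000) -> list[dict]:
    """Take the top_n largest loans, then spaced_n evenly spaced rows from the rest."""
    ranked = sorted(rows, key=lambda r: r["amount"], reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    if spaced_n <= 0 or not rest:
        return top
    if len(rest) <= spaced_n:
        return top + rest
    # step > 1 here, so consecutive indices are always distinct.
    step = len(rest) / spaced_n
    picked = [rest[int(i * step)] for i in range(spaced_n)]
    return top + picked
```

Rows are assumed to be deduplicated to one per borrower before sampling, matching the 863K-unique figure above.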

GLEIF LEI: 500 (live)

500 US LEI records sampled across 174 distinct EntityLegalForm codes and all 50 states + DC, from the GLEIF Golden Copy (3.3M global / 349K US, CC0). Streamed from the lei2 zip; relationship records (parent LEI) live in the separate rr_latest.zip and are not yet joined.

GLEIF golden copy
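
Streaming US rows out of the zipped golden copy without unpacking it to disk could look roughly like this. The column name `Entity.LegalAddress.Country` is an assumption about the golden-copy CSV header, and `iter_us_records` is a hypothetical helper rather than the project's ingest function.

```python
import csv
import io
import zipfile
from typing import Iterator

COUNTRY_COL = "Entity.LegalAddress.Country"  # assumed golden-copy column name


def iter_us_records(zip_source, country: str = "US") -> Iterator[dict]:
    """Yield CSV rows whose legal-address country matches, reading lazily from the zip.

    zip_source may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        # Use the first CSV member found in the archive.
        csv_name = next(n for n in zf.namelist() if n.lower().endswith(".csv"))
        with zf.open(csv_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                if row.get(COUNTRY_COL) == country:
                    yield row
```

Because rows are yielded one at a time, memory use stays flat even on the 3.3M-row global file; sampling across legal forms and states can then be layered on top of the iterator.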

SAM.gov vendors: 60 (stub)

60-row synthetic fixture of federal vendors with realistic UEI / CAGE / NAICS across DC/VA/MD/CA/TX/FL/GA. Full ingest path stubbed in `ingest/sam_gov.py` for the SAM_PUBLIC_MONTHLY_V2 monthly extract — gated on a free api.data.gov key.

SAM.gov / api.data.gov

USAspending recipients: 1,351 (live)

1,351 top federal-contract recipients aggregated from USAspending.gov spending_by_award API (5K award rows → 1.35K unique UEI). Lockheed Martin $322B, Electric Boat $141B at the top. DoD-dominated; 47 states represented. Free, no auth.

api.usaspending.gov
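
Collapsing award rows to one row per UEI is a plain group-and-sum. In this sketch the keys `recipient_uei`, `recipient_name` and `award_amount` are hypothetical stand-ins for whatever fields the award rows actually carry.

```python
from collections import defaultdict


def aggregate_by_uei(awards: list[dict]) -> list[dict]:
    """Sum award amounts per UEI and return recipients ranked by total, largest first."""
    totals = defaultdict(float)
    names = {}
    for award in awards:
        uei = award.get("recipient_uei")
        if not uei:
            continue  # rows without a UEI cannot be attributed to a recipient
        totals[uei] += float(award.get("award_amount") or 0)
        names.setdefault(uei, award.get("recipient_name", ""))
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [{"uei": uei, "name": names[uei], "total": total} for uei, total in ranked]
```

The first name seen for a UEI is kept as the display name; a real pipeline might prefer the most frequent or most recent spelling.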

Financials (SEC XBRL): 220 (live)

220 latest-FY financial snapshots (revenue / net income / assets / employees) for popular tickers from `data.sec.gov/api/xbrl/companyfacts/`. Captured revenue figures include Apple $416B, Amazon $717B, MSFT $282B. Full nightly companyfacts.zip (~10GB) wired in `ingest/sec_companyfacts.py`.

SEC XBRL companyfacts API
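
Picking a latest-FY value out of one companyfacts document can be sketched as below. The JSON shape used here (`facts` → taxonomy → concept → `units` → list of entries with `start`, `end`, `val`, `fp`, `form`, `filed`) reflects my understanding of the API response; the helper names and the 350-day full-year threshold are assumptions, not taken from `ingest/sec_companyfacts.py`.

```python
from datetime import date


def companyfacts_url(cik: int) -> str:
    """companyfacts URL for a CIK, zero-padded to 10 digits."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik:010d}.json"


def _is_full_year(entry: dict) -> bool:
    """Instant facts (no 'start') pass; duration facts must span roughly a year."""
    if "start" not in entry:
        return True
    days = (date.fromisoformat(entry["end"]) - date.fromisoformat(entry["start"])).days
    return days >= 350  # assumed threshold for a fiscal year


def latest_fy_value(facts: dict, concept: str, unit: str = "USD", taxonomy: str = "us-gaap"):
    """Return (period_end, value) for the most recent full-year 10-K fact, or None."""
    entries = (
        facts.get("facts", {})
        .get(taxonomy, {})
        .get(concept, {})
        .get("units", {})
        .get(unit, [])
    )
    annual = [
        e for e in entries
        if e.get("form") == "10-K" and e.get("fp") == "FY" and _is_full_year(e)
    ]
    if not annual:
        return None
    # Latest period end wins; ties go to the most recently filed entry.
    best = max(annual, key=lambda e: (e["end"], e.get("filed", "")))
    return best["end"], best["val"]
```

Companies tag revenue under different concepts (for example `Revenues` versus a contract-revenue concept), so a real snapshot job would likely try several concept names in order.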

IP assets (USPTO TM + Patent): 200 (stub)

200-row synthetic preview of trademark + patent holders with realistic class distribution. IBM 1180 TMs / 110K patents at the top. Full bulk path stubbed in `ingest/uspto_tm.py` + `ingest/uspto_patent.py` against USPTO Open Data Portal CSV archives.

USPTO Open Data Portal
Identifier coverage: CIK · SEC, EIN · IRS, LEI · GLEIF, UEI · SAM.gov, DOS ID · NY, Sunbiz # · FL

Verified bulk-download paths

  • SEC EDGAR (live): company_tickers_exchange.json (10K active public companies with ticker + exchange) and cik-lookup-data.txt (1M+ historical CIKs). Free, daily, no auth.
  • IRS EO BMF (live): per-state CSV at irs.gov/pub/irs-soi/eo_<state>.csv (1.95M nonprofits total, monthly).
  • NY State Open Data (live): Socrata API data.ny.gov/api/views/63wc-4exh/rows.csv, 20.5M corp filings, free, near-realtime. Best free state SOS bulk in the country.
  • FL Sunbiz (stub): public SFTP sftp.floridados.gov (Public/PubAccess1845!), 10M+ entities, daily delta + quarterly full.
  • SBA PPP (live): 13 CSVs at data.sba.gov/dataset/ppp-foia, ~5GB, 11.4M loans / ~5M unique businesses.
  • GLEIF LEI (live): goldencopy.gleif.org/api/v2/golden-copies/publishes/lei2/latest.csv, 3.3M global / 349K US, CC0, daily.
  • SAM.gov (stub): monthly SAM_PUBLIC_MONTHLY_V2_*.ZIP, ~700K registered vendors with UEI/CAGE/NAICS. Needs free api.data.gov key.
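
As one worked instance of the paths above, the per-state IRS EO BMF pattern expands to a fixed URL list for the 14 downloaded states. The `https://www.` prefix and the lowercase state code are assumptions about how the `eo_<state>.csv` placeholder is filled in, and `eo_bmf_url` is a hypothetical helper.

```python
# The 14 states named in the live-mix description.
EO_STATES = ["CA", "TX", "NY", "FL", "IL", "PA", "OH",
             "GA", "MI", "NC", "NJ", "MA", "VA", "WA"]


def eo_bmf_url(state: str) -> str:
    """Expand the eo_<state>.csv pattern for one two-letter state code."""
    if len(state) != 2 or not state.isalpha():
        raise ValueError(f"expected a two-letter state code, got {state!r}")
    return f"https://www.irs.gov/pub/irs-soi/eo_{state.lower()}.csv"


if __name__ == "__main__":
    for code in EO_STATES:
        print(eo_bmf_url(code))
```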

Skipped (paid / locked): OpenCorporates (£12K+/yr), CA SOS bulk ($100/snapshot), TX SOS ($1,350+), DE Division of Corporations (no bulk).