Overview
The Dilution dataset captures U.S. equity dilution risk at the moment it enters the market.
It is a point-in-time record of S-1 registration statements, enriched with market context and lifecycle tracking, designed for traders and researchers who need to identify and manage dilution-driven risk with precision.
Each record represents a filing as it was known on the filing date, labeled for its dilutive impact and supplemented with forward-resolving outcomes such as whether the offering later became effective or was withdrawn.
What This Dataset Represents
Initial S-1 filings, captured at filing time
A binary dilutive classification, indicating whether the filing represents meaningful equity dilution
Point-in-time firm characteristics, including market capitalization measured one trading day prior to filing to avoid look-ahead bias
Full lifecycle resolution, tracking if and when a filing becomes effective or is withdrawn
This allows users to study both:
the immediate signal introduced by a filing, and
the eventual outcome of that filing over time.
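As a rough illustration of how the components above might fit into one record, here is a minimal sketch. Every field name is hypothetical and chosen for readability; the dataset's actual schema may differ. The helper shows how the filing-time signal stays separate from the later lifecycle outcome.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DilutionRecord:
    # Hypothetical field names; not the dataset's documented schema.
    ticker: str
    filing_date: date
    is_dilutive: bool                      # binary dilutive classification
    market_cap_prior_day: float            # measured one trading day before filing
    effective_date: Optional[date] = None  # set only once the event occurs
    withdrawn_date: Optional[date] = None  # set only once the event occurs

    def status_as_of(self, as_of: date) -> str:
        """Return the lifecycle status that was knowable on `as_of`."""
        if as_of < self.filing_date:
            return "not_filed"
        if self.withdrawn_date is not None and self.withdrawn_date <= as_of:
            return "withdrawn"
        if self.effective_date is not None and self.effective_date <= as_of:
            return "effective"
        return "pending"


rec = DilutionRecord(
    ticker="XYZ",
    filing_date=date(2024, 1, 10),
    is_dilutive=True,
    market_cap_prior_day=50_000_000.0,
    effective_date=date(2024, 2, 1),
)
```

Calling rec.status_as_of(...) with a date between filing and effectiveness gives "pending", which is the view a trader would have had at that time.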
Why Dilution Matters
Equity dilution is one of the most persistent structural headwinds in public markets, particularly in small- and mid-cap equities.
S-1 filings often precede:
secondary offerings
resale pressure
changes in float and liquidity
prolonged price underperformance
By labeling and normalizing these filings at the moment they occur, the dataset enables systematic approaches to:
risk filtering
short-biased strategies
event-driven research
post-filing lifecycle analysis
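Risk filtering, for example, could be as simple as excluding any ticker with a recent dilutive filing. The sketch below assumes filings arrive as dictionaries with the hypothetical keys "ticker", "filing_date", and "is_dilutive"; the 90-day lookback is an arbitrary illustration, not a recommendation from the dataset.

```python
from datetime import date, timedelta


def tickers_to_exclude(filings, as_of, lookback_days=90):
    """Return tickers with a dilutive filing in the window ending at as_of.

    filings: iterable of dicts with hypothetical keys "ticker",
    "filing_date" (datetime.date) and "is_dilutive" (bool).
    """
    window_start = as_of - timedelta(days=lookback_days)
    return {
        f["ticker"]
        for f in filings
        # Filings dated after as_of are ignored so the filter never looks ahead.
        if f["is_dilutive"] and window_start <= f["filing_date"] <= as_of
    }


sample_filings = [
    {"ticker": "AAA", "filing_date": date(2024, 3, 1), "is_dilutive": True},
    {"ticker": "BBB", "filing_date": date(2024, 3, 5), "is_dilutive": False},
    {"ticker": "CCC", "filing_date": date(2023, 6, 1), "is_dilutive": True},
    {"ticker": "DDD", "filing_date": date(2024, 4, 1), "is_dilutive": True},
]
excluded = tickers_to_exclude(sample_filings, as_of=date(2024, 3, 15))
```

Here only AAA lands in the exclusion set: BBB is not labeled dilutive, CCC falls outside the lookback window, and DDD had not yet been filed as of the evaluation date.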
Point-in-Time Integrity
All features are constructed to preserve point-in-time correctness:
Market capitalization is measured using data available before the filing date
Dilution labels reflect filing content, not future price action
Lifecycle fields (e.g. effectiveness or withdrawal) are updated only as those events occur
Historical records are never rewritten; only the lifecycle status fields change, and only as the underlying events occur.
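To make the market capitalization rule concrete, the following sketch looks up the latest value observed strictly before the filing date. It assumes a hypothetical price history with one (date, market_cap) entry per trading day; it illustrates the idea and is not the dataset's actual construction code.

```python
from bisect import bisect_left
from datetime import date


def market_cap_before(filing_date, cap_history):
    """Return the latest (date, market_cap) observed strictly before filing_date.

    cap_history: list of (date, market_cap) tuples sorted by date ascending,
    assumed to hold one entry per trading day. Returns None when no earlier
    observation exists.
    """
    dates = [d for d, _ in cap_history]
    idx = bisect_left(dates, filing_date)  # first index with date >= filing_date
    if idx == 0:
        return None
    return cap_history[idx - 1]


history = [
    (date(2024, 1, 8), 100.0),
    (date(2024, 1, 9), 110.0),
    (date(2024, 1, 10), 90.0),  # filing-day value, must not be used
]
prior = market_cap_before(date(2024, 1, 10), history)
```

The strict inequality is the point of the example: the filing-day value is skipped, so any price reaction to the filing itself cannot leak into the feature.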
Update Schedule
New filings are ingested nightly at 11:00 PM ET
Lifecycle statuses (effective / withdrawn) are checked and updated every morning at 7:00 AM ET
Together, the nightly ingest and the morning lifecycle check keep the dataset both timely and historically reliable.
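A consumer that polls for fresh data may want to know when the next scheduled run is due. The sketch below computes the next 11:00 PM ET ingest or 7:00 AM ET lifecycle check from a timezone-aware timestamp. The function and constant names are illustrative, and it assumes both jobs run every calendar day, which the schedule above implies but does not spell out.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")
INGEST_TIME = time(23, 0)    # nightly ingest of new filings
LIFECYCLE_TIME = time(7, 0)  # morning lifecycle status check


def next_run(now, run_time):
    """Return the next occurrence of run_time in ET strictly after `now`.

    `now` must be timezone-aware; it is converted to ET before comparison.
    """
    local = now.astimezone(ET)
    candidate = datetime.combine(local.date(), run_time, tzinfo=ET)
    if candidate <= local:
        # Today's run has already passed, so roll forward one calendar day.
        candidate = datetime.combine(
            local.date() + timedelta(days=1), run_time, tzinfo=ET
        )
    return candidate


midday = datetime(2024, 3, 5, 12, 0, tzinfo=ET)
next_ingest = next_run(midday, INGEST_TIME)
next_lifecycle = next_run(midday, LIFECYCLE_TIME)
```

From a midday timestamp, the next ingest falls later the same evening, while the next lifecycle check falls on the following morning.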
Designed for Systematic Use
The Dilution dataset is built for:
programmatic ingestion
backtesting without leakage
production-grade trading and monitoring pipelines
If you can query it, you can trade it.