Overview

The Dilution dataset captures U.S. equity dilution risk at the moment it enters the market.

It is a point-in-time record of S-1 registration statements, enriched with market context and lifecycle tracking, designed for traders and researchers who need to identify and manage dilution-driven risk with precision.

Each record represents a filing as it was known on the filing date, labeled for its dilutive impact and supplemented with forward-resolving outcomes such as whether the offering later became effective or was withdrawn.


What This Dataset Represents

  • Initial S-1 filings, captured at filing time

  • A binary dilutive classification, indicating whether the filing represents meaningful equity dilution

  • Point-in-time firm characteristics, including market capitalization measured one trading day prior to filing to avoid look-ahead bias

  • Full lifecycle resolution, tracking if and when a filing becomes effective or is withdrawn

This allows users to study both:

  • the immediate signal introduced by a filing, and

  • the eventual outcome of that filing over time.
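
As a rough sketch of how such a record might be modeled, the dataclass below separates the fields fixed at filing time from the lifecycle fields that resolve later. Every field name here (`ticker`, `filing_date`, `is_dilutive`, `market_cap_prior_day`, `effective_date`, `withdrawn_date`) is hypothetical and chosen for illustration; it is not the dataset's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DilutionRecord:
    # Fixed at filing time (hypothetical field names, not the real schema)
    ticker: str
    filing_date: date
    is_dilutive: bool             # binary dilutive classification
    market_cap_prior_day: float   # measured one trading day before filing
    # Resolved later, as lifecycle events occur
    effective_date: Optional[date] = None
    withdrawn_date: Optional[date] = None

    def lifecycle_status(self) -> str:
        """Return 'withdrawn', 'effective', or 'pending'.

        Assumption: if both dates were ever set, withdrawal takes priority.
        """
        if self.withdrawn_date is not None:
            return "withdrawn"
        if self.effective_date is not None:
            return "effective"
        return "pending"


# Illustrative values only
record = DilutionRecord(
    ticker="XYZ",
    filing_date=date(2024, 3, 1),
    is_dilutive=True,
    market_cap_prior_day=85_000_000.0,
)
print(record.lifecycle_status())  # pending
```

The immediate signal lives in the first four fields; the eventual outcome is read from the two optional dates once they are populated.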


Why Dilution Matters

Equity dilution is one of the most persistent structural headwinds in public markets, particularly in small- and mid-cap equities.

S-1 filings often precede:

  • secondary offerings

  • resale pressure

  • changes in float and liquidity

  • prolonged price underperformance

By labeling and normalizing these filings at the moment they occur, the dataset enables systematic approaches to:

  • risk filtering

  • short-biased strategies

  • event-driven research

  • post-filing lifecycle analysis


Point-in-Time Integrity

All features are constructed to preserve point-in-time correctness:

  • Market capitalization is measured using data available before the filing date

  • Dilution labels reflect filing content, not future price action

  • Lifecycle fields (e.g. effectiveness or withdrawal) are updated only as those events occur

Historical records are never rewritten; only lifecycle status fields evolve.
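
To illustrate the market capitalization rule, here is a minimal sketch of a lookup that only ever uses data from before the filing date. The function name `market_cap_before` and the `(date, market_cap)` history format are assumptions made for this example, not part of the dataset.

```python
from bisect import bisect_left
from datetime import date
from typing import Optional, Sequence, Tuple


def market_cap_before(
    history: Sequence[Tuple[date, float]], filing_date: date
) -> Optional[float]:
    """Return the market cap from the last trading day strictly before filing_date.

    `history` must be sorted by date ascending, with one entry per trading day.
    """
    dates = [d for d, _ in history]
    # bisect_left returns the index of the first date >= filing_date,
    # so the entry just before it is the last date strictly earlier.
    idx = bisect_left(dates, filing_date)
    if idx == 0:
        return None  # no data available before the filing
    return history[idx - 1][1]


# Illustrative values only
history = [
    (date(2024, 2, 28), 90_000_000.0),
    (date(2024, 2, 29), 85_000_000.0),
    (date(2024, 3, 1), 60_000_000.0),  # filing day itself: must not be used
]
print(market_cap_before(history, date(2024, 3, 1)))  # 85000000.0
```

Using a strict "before" comparison is the point of the design: the filing-day value may already reflect the market's reaction to the filing, so including it would leak information into the feature.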


Update Schedule

  • New filings are ingested nightly at 11:00 PM ET

  • Lifecycle statuses (effective / withdrawn) are checked and updated every morning at 7:00 AM ET

This ensures the dataset remains both timely and historically reliable.
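
A consumer may want to know how fresh the data is at a given moment. The sketch below computes the most recent scheduled run at or before a timestamp. It assumes both jobs run every calendar day and that naive datetimes are already expressed in Eastern Time; the names `INGEST_TIME`, `LIFECYCLE_TIME`, and `last_run` are hypothetical.

```python
from datetime import datetime, time, timedelta

# Times taken from the schedule above; naive datetimes are assumed to be in ET.
INGEST_TIME = time(23, 0)    # new filings, nightly
LIFECYCLE_TIME = time(7, 0)  # effective / withdrawn checks, every morning


def last_run(now: datetime, run_time: time) -> datetime:
    """Return the most recent scheduled run at or before `now`."""
    candidate = datetime.combine(now.date(), run_time)
    if candidate > now:
        # Today's run has not happened yet, so fall back to yesterday's.
        candidate -= timedelta(days=1)
    return candidate


now = datetime(2024, 3, 4, 9, 30)  # 9:30 AM ET, illustrative
print(last_run(now, INGEST_TIME))     # 2024-03-03 23:00:00
print(last_run(now, LIFECYCLE_TIME))  # 2024-03-04 07:00:00
```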


Designed for Systematic Use

The Dilution dataset is built for:

  • programmatic ingestion

  • backtesting without leakage

  • production-grade trading and monitoring pipelines

If you can query it, you can trade it.
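
For backtesting without leakage, the lifecycle fields need one extra step of care, because they are filled in after the filing date. A minimal sketch, assuming hypothetical `effective_date` and `withdrawn_date` values per filing, is to mask any lifecycle event dated after the simulation date:

```python
from datetime import date
from typing import Optional


def status_as_of(
    as_of: date,
    filing_date: date,
    effective_date: Optional[date] = None,
    withdrawn_date: Optional[date] = None,
) -> Optional[str]:
    """Return the lifecycle status a backtest could have known on `as_of`."""
    if as_of < filing_date:
        return None  # the filing did not exist yet
    # Only count lifecycle events dated on or before the simulation date.
    if withdrawn_date is not None and withdrawn_date <= as_of:
        return "withdrawn"
    if effective_date is not None and effective_date <= as_of:
        return "effective"
    return "pending"


# Illustrative values only: filed 2024-03-01, became effective 2024-04-15
filed, effective = date(2024, 3, 1), date(2024, 4, 15)
print(status_as_of(date(2024, 3, 15), filed, effective_date=effective))  # pending
print(status_as_of(date(2024, 5, 1), filed, effective_date=effective))   # effective
```

A stricter variant would compare against the time the status was actually recorded rather than the event date, since a status change only becomes visible after the next lifecycle check.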
