SEC filings are filled with risk disclosures that describe what could go wrong, but they almost never tell you how likely those events are to occur. This leaves analysts to rely on professional intuition and qualitative factors to estimate the probability of a risk materializing.

At Lightning Rod Labs, we’ve built a system capable of predicting the specific probability that a given risk will materialize. We achieved this by combining a fully automated data generation pipeline with a specialized training methodology called Foresight Learning.

Building The Data Foundation

While many AI projects rely on human labelers, corporate risk is too complex and firm-specific to label at scale. Even if you could hire enough analysts, the ambiguity of risk language means different experts will label the same sentence differently, creating noisy data that prevents accurate probability modeling.

To build our training dataset, we used Foresight Data, our proprietary data generation platform. Foresight Data allows us to instantly generate high-fidelity training data from SEC risk disclosures without human intervention.

  • Firm-Specific Questions: Our pipeline generates concrete prediction questions from public 10-K and 10-Q filings (e.g., "Will DraftKings Inc. disclose a cessation of operations in any U.S. state by June 30, 2025?").

  • Outcome Identification (Labels): We scan subsequent SEC filings to find documented evidence of whether the risk actually materialized. Instead of relying on human labelers, we turn the firm’s actual history into a verifiable training set.
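To make the shape of this data concrete, here is a minimal sketch of what a single generated training example might look like. The schema, field names, dates, and label below are illustrative assumptions, not our production format:

```python
from dataclasses import dataclass

@dataclass
class RiskQuery:
    """One generated training example (illustrative schema only)."""
    firm: str             # e.g., "DraftKings Inc."
    question: str         # a concrete, falsifiable prediction question
    cutoff_date: str      # the model may only see filings dated on or before this
    resolution_date: str  # the date by which the outcome must be observable
    outcome: int          # 1 if later filings document the risk materializing, else 0

example = RiskQuery(
    firm="DraftKings Inc.",
    question=("Will DraftKings Inc. disclose a cessation of operations "
              "in any U.S. state by June 30, 2025?"),
    cutoff_date="2024-06-30",      # hypothetical date, for illustration
    resolution_date="2025-06-30",
    outcome=0,                     # hypothetical label, not the real outcome
)
```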

Using Foresight Data, we processed 6,109 risk queries across 1,953 distinct firms, a task that would likely have taken years to complete through manual annotation alone. This dataset was then used to train our model via Foresight Learning.

Training a Prediction Expert

We use a proprietary training method called Foresight Learning. Foresight Learning trains the model by "blindfolding" it to the future: we show the model a past disclosure and require it to make a prediction using only the information available at that time. We then use the actual historical outcome to score the model's accuracy.

This closed-loop training grounds the model in reality and creates more accurate predictions:

  • Incentive for Honesty: During training, we heavily penalize the model for being "confident and wrong." An incorrect prediction carries a much heavier penalty if the model assigned it a 99% probability than if it assigned a 60% probability. This forces the model to factor in its own uncertainty and to offer high probabilities only when the evidence truly supports them (see the sketch after this list).

  • Separating Signal from Boilerplate: By rewarding the model for accuracy against real outcomes, it learns to distinguish between boilerplate legal "cover" and the specific narrative markers that historically precede a material event.
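To make the penalty asymmetry concrete, here is a minimal sketch assuming a Brier-style squared-error penalty; this post doesn't spell out our exact training loss, so treat it as illustrative rather than our production objective:

```python
def squared_error_penalty(predicted_prob: float, outcome: int) -> float:
    """Brier-style penalty: grows quadratically with how confident and wrong a prediction is."""
    return (predicted_prob - outcome) ** 2

# Suppose the risk does NOT materialize (outcome = 0):
print(squared_error_penalty(0.99, 0))  # 0.9801 -> confident and wrong: heavy penalty
print(squared_error_penalty(0.60, 0))  # 0.36   -> hedged and wrong: ~2.7x lighter
print(squared_error_penalty(0.10, 0))  # 0.01   -> appropriate doubt: tiny penalty
```

Because squared error is a proper scoring rule, the model's expected penalty is minimized by reporting its true belief, so exaggerated confidence is never the winning strategy.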

Expertise Over Scale: Outperforming GPT-5

We then benchmarked our model against GPT-5 to see which could more accurately predict future risk materialization.

While GPT-5 is a powerful generalist, our research shows that specialized supervision allows a significantly smaller model to outperform one of the leading frontier models in both accuracy and calibration.

Higher Accuracy (Brier Score)

We measured performance using Brier Score, where a lower score indicates that predicted probabilities align more closely with actual outcomes.
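In code, the Brier score is simply the mean squared error between predicted probabilities and binary outcomes. A quick sketch with toy numbers (not our evaluation data):

```python
import numpy as np

def brier_score(predicted_probs: np.ndarray, outcomes: np.ndarray) -> float:
    """Mean squared error between predicted probabilities and binary (0/1) outcomes."""
    return float(np.mean((predicted_probs - outcomes) ** 2))

probs = np.array([0.9, 0.2, 0.7, 0.4])   # model's predicted probabilities
actual = np.array([1, 0, 1, 0])          # what actually happened
print(brier_score(probs, actual))        # 0.075
```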

We evaluated both models on a held-out test set using the same input context and prompt template. Despite being orders of magnitude smaller than GPT-5, our model achieved a superior aggregate Brier Score (0.1979 vs. 0.1986).

Superior Reliability (Calibration)

Calibration measures the deviation between predicted probabilities and actual frequencies: if the model predicts a 70% chance of an event, that event should materialize roughly 70% of the time in the real world.

Generalist models tend to be overconfident. They may assign a high-certainty probability (e.g., 95%) to an event that only occurs 60% of the time. This poses a real risk for decision-makers: you can't tell whether a high stated probability reflects a genuinely likely event or just an overconfident guess.
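This post doesn't specify which calibration metric we report; a common choice is expected calibration error (ECE), sketched below for illustration:

```python
import numpy as np

def expected_calibration_error(probs: np.ndarray,
                               outcomes: np.ndarray,
                               n_bins: int = 10) -> float:
    """Bucket predictions by confidence, then average the gap between mean
    predicted probability and observed frequency, weighted by bin size."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        if i == n_bins - 1:
            mask = (probs >= lo) & (probs <= hi)  # include 1.0 in the last bin
        else:
            mask = (probs >= lo) & (probs < hi)
        if mask.any():
            gap = abs(probs[mask].mean() - outcomes[mask].mean())
            ece += (mask.sum() / len(probs)) * gap
    return float(ece)

# An overconfident model that says 95% for events occurring 60% of the
# time contributes a gap of roughly 0.35 from that bin alone.
```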

Trained with Foresight Learning, our model achieved a 64.7% lower calibration error than GPT-5.

[Figure: Performance Benchmark]

Infrastructure Sovereignty

The requirement to send sensitive data to external model providers remains a significant concern for many leaders. Sharing this confidential information creates a risk of unintentional knowledge transfer, where your unique logic could inadvertently help train the very models your competitors use.

Our model is designed to be deployed directly within your own environment. Hosting it inside your own secure infrastructure ensures:

  • Custody of Intellectual Property: Your proprietary data remains within your firewalls. This ensures your expertise is never used to refine the underlying models of an external provider.

  • Regulatory & Residency Compliance: A completely closed-loop system simplifies oversight by meeting the most stringent data-residency requirements and internal security protocols.

  • Ability to Process Sensitive Data: You can finally use AI on restricted datasets containing PII or PHI without the regulatory risk of data leaving your environment.

  • Full Auditability: Your IT and compliance teams maintain total control over a private utility, rather than relying on a third-party black box.

The Strategic Takeaway

For years, the hunt for alpha has driven firms toward expensive alternative data like satellite imagery and scraped credit card transactions. However, our research suggests that a potent source of signal is already within a firm's own public disclosures.

By applying Foresight Learning to raw SEC filings, our specialized model outperformed GPT-5 in both predictive accuracy and calibration. This shows that the information edge isn't about having more data, but about having a superior method for extracting the predictive signal that’s already in your data.

This methodology isn't limited to public filings. By applying the same training logic to internal knowledge, you can create domain experts that transform institutional memory into a proprietary prediction engine. Instead of relying on the same frontier models as your competitors, you can turn your internal data into a private intelligence layer that remains entirely under your control.

Learn More

Lightning Rod Labs is at the frontier of financial AI. We help institutional investors and risk practitioners transform unstructured data into well-calibrated predictive signals.

Interested in applying Foresight Learning to your firm's internal data?

  • Contact Our Team: Reach out to us directly at [email protected] to learn more and schedule a demo.

  • Read the Research: You can download our full technical paper here.

  • Visit Us: Learn more about our platform and ongoing research at www.lightningrodlabs.ai.