Foresight Data helps AI teams create model-ready training data at scale. Our pipeline transforms unstructured sources into questions and labels, no human labeling required. You can bring your own data or start with our integrated sources, such as Google News.

In this walkthrough, we’ll show you how to use the Foresight Data pipeline to generate a labeled forecasting dataset directly from the news in under 10 minutes.

How the Foresight Data pipeline works

The pipeline separates question generation from outcome resolution to prevent data leakage and ensure questions remain objective:

  • Question Generation: The system creates forward-looking questions (e.g., “Will Candidate X win the 2024 Arizona Senate election?”) based on source news articles.

  • Outcome Resolution: A separate resolver model uses web search to find the actual result and produce a label.

This pipeline is the data foundation of our Foresight Training methodology, which uses outcome-based supervision to train LLMs to make better predictions. You can learn more here.
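
To make the separation concrete, here is a minimal Python sketch of how a single example might flow through the two stages. The function names and fields (such as close_date and confidence) are placeholders for illustration, not the actual Foresight Data schema.

    # Minimal sketch of the two-stage pipeline. Field names such as
    # "close_date" and "confidence" are illustrative placeholders.

    def generate_questions(article: dict) -> list[dict]:
        # Stage 1: produce forward-looking questions from a source article.
        # In the real pipeline an LLM does this; a fixed example is returned
        # here just to show the shape of the output.
        return [{
            "source_article": article["url"],
            "question": "Will Candidate X win the 2024 Arizona Senate election?",
            "close_date": "2024-11-05",
        }]

    def resolve_outcome(question: dict) -> dict:
        # Stage 2: a separate resolver uses web search to find the actual
        # result and attach a label, without ever touching Stage 1.
        question["label"] = "yes"        # resolved outcome
        question["confidence"] = 0.97    # resolver certainty
        return question

    row = resolve_outcome(generate_questions({"url": "https://example.com/article"})[0])

Keeping the two stages separate means the question generator never sees outcome information, which is what prevents leakage.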

What you’ll build

In about 10 minutes, you’ll generate:

  • Forecasting questions about a topic you choose (for example, elections)

  • Labeled outcomes resolved from real-world sources

  • A downloadable dataset you can use for training or evaluation

Step 1: Create a new dataset

Log into the Lightning Rod dashboard and select Dataset Generation from the left navigation menu. This opens the Dataset Generation screen, where you can configure the full pipeline.

Dataset Generation Screen

Step 2: Select topics for question generation

The first step is to decide what topics you want to generate questions about.

For this walkthrough, we’ll use Input: News Search.

News Search takes a search query and retrieves matching news articles. Questions will later be generated from these articles.

In the search queries field, enter: election.

Leave source filters, article count, and intervals at their default values.

Tip: It can be helpful to test your query in Google News first. If the results match the kinds of events you want to generate questions from, you’re on the right track. If not, adjust the terms to better reflect your topic.
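
Before moving on, here is the Step 2 configuration summarized as a sketch. This is for orientation only; the names are placeholders, not the dashboard's actual field names.

    # Hypothetical summary of the Step 2 settings (names are illustrative).
    news_search = {
        "query": "election",      # the search query entered above
        "source_filters": None,   # left at the default (no filtering)
        "article_count": None,    # left at the default
        "interval": None,         # left at the default
    }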

Step 3: Define your time frame

Next, define the start and end dates. These dates control when question outcomes are evaluated, not which articles are included.

For this example, select a start date of January 1, 2025 and an end date of December 31, 2025. This will generate questions whose outcomes resolve in this time window.
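
In other words, a question is kept only if its outcome resolves inside this window. A small illustrative check, assuming a hypothetical close_date field on each generated question:

    from datetime import date

    # Illustration only: the window filters on when outcomes resolve,
    # not on when the source articles were published.
    start, end = date(2025, 1, 1), date(2025, 12, 31)

    def in_window(question: dict) -> bool:
        close = date.fromisoformat(question["close_date"])  # hypothetical field
        return start <= close <= end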

Step 4: Define question criteria

Now we define what kinds of questions the system should generate. For this walkthrough, we’ll use:

  • Question Type: Forward Looking Question

  • Answer Type: Binary (Yes/No)

  • Questions Per Document: 3

Forward-looking questions are future-dated questions with a close date and resolution criteria (e.g., “Will Microsoft announce an AI agent that can autonomously complete multi-step coding tasks by March 1, 2026?”).

Tip: Increasing the Questions Per Document value tells the system to identify multiple distinct prediction questions within a single source article, resulting in more diverse training data.
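
Expressed as a configuration sketch, the Step 4 settings above look roughly like this (parameter names are placeholders, not the dashboard's actual fields):

    # Step 4 criteria, expressed as an illustrative config.
    question_criteria = {
        "question_type": "forward_looking",
        "answer_type": "binary",        # Yes/No
        "questions_per_document": 3,    # higher values -> more diverse questions per article
    }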

Step 5: Update your instructions

Now we’ll define the instructions the model will follow when generating questions. In the instructions field, enter:

Generate binary forecasting questions about election results and election outcomes.

Anything that voters, journalists, analysts, or prediction market participants might want to forecast.

For example: winners of races (president, governor, senate, house, mayor), primary outcomes, party control of legislatures, vote share and margin thresholds, ballot measure outcomes, runoff triggers, recounts, and certification outcomes.

Your question section should now look like this:

Step 6: Refine your query with example questions

Example questions strongly influence what the system generates.

In practice, models often follow examples more closely than written instructions. Even small changes here can significantly affect dataset quality.

To edit the example questions:

  • Expand the Example questions section

  • Replace the defaults with examples tailored to elections

Use examples like these:

  • Will Candidate X win the 2024 Texas Republican Senate primary?
    Why it works: clearly defined race, objective, and unambiguous.

  • Will the Democratic candidate win the 2024 Arizona gubernatorial election?
    Why it works: specific, time-bound, and objectively resolvable.

  • Will the 2025 election be surprising?
    Why it fails: too vague; 'surprising' is subjective and not measurable.

  • Will misinformation affect the election?
    Why it fails: not time-bound, no specific metric, and entirely subjective.

Update the example questions
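
Conceptually, these examples act like few-shot demonstrations passed to the question generator alongside your instructions. A sketch, with a placeholder variable name:

    # Good examples are kept; vague ones are deliberately excluded so the
    # generator does not imitate them.
    example_questions = [
        "Will Candidate X win the 2024 Texas Republican Senate primary?",
        "Will the Democratic candidate win the 2024 Arizona gubernatorial election?",
    ]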

Step 7: Add context at prediction time (optional)

When context is enabled, the system reconstructs what was known at the time each question was asked by summarizing relevant articles published before that prediction time into a compact context block.

For this walkthrough, we’ll leave Context off.

Because context generation is the most compute-heavy and expensive part of the pipeline, we recommend refining your questions and labels first before adding the context layer.
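
For reference, if you enable Context later, each row would additionally carry a compact summary of what was publicly known before the question's prediction time. A hypothetical shape, with illustrative field names:

    # Hypothetical shape of a context-enabled row (illustrative only).
    row_with_context = {
        "question": "Will the Democratic candidate win the 2024 Arizona gubernatorial election?",
        "prediction_time": "2024-09-15",
        "context": "Compact summary of relevant articles published before 2024-09-15.",
    }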

Step 8: Resolve outcomes and labels

The next step is to enable Labels. Select the Labels toggle in the dataset configuration to turn outcome labeling on.

When Labels are enabled, a separate resolver model uses web search to find up-to-date information and determine the outcome for each question. It then produces a source-grounded label for every question.

The confidence threshold controls how certain the system must be before including a resolved outcome. Higher thresholds filter out ambiguous cases where post-event reporting is unclear.

For this walkthrough, leave the confidence threshold at its default value.

Enable labels
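
To see what the threshold does, imagine each resolved row carrying a resolver confidence score; rows below the threshold are dropped. A sketch with placeholder field names and an arbitrary threshold value:

    # Sketch: filtering resolved rows by resolver confidence.
    CONFIDENCE_THRESHOLD = 0.9   # illustrative value, not the dashboard default

    def keep(row: dict) -> bool:
        # Drop ambiguous cases where post-event reporting was unclear.
        return row.get("confidence", 0.0) >= CONFIDENCE_THRESHOLD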

Step 9: Generate your dataset

You can run the pipeline as either:

  • Test Run: small preview, lower cost

  • Full Run: generates the complete dataset

The Max questions setting caps the total number of questions across all retrieved articles.

As you change settings like max questions or labels, the cost estimate updates automatically.

For this walkthrough, start with a Test Run.

Once you start the run, a progress indicator appears and tracks the job until it completes.

Progress indicator shown while the dataset is being generated

Step 10: View and download your dataset

Once the run completes, navigate to Datasets from the left menu.

Navigate to Datasets page

From the dataset preview, you can:

  • Inspect a sample of generated rows, including the source article, question, and label

  • Download the dataset using Download JSON

  • Open the full dataset view by selecting View Full Dataset

The dataset preview also shows basic metadata, including the number of rows generated and the total cost.

Example output (what you’ll see)
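
The exact export schema may differ from what is shown here, but loading the downloaded JSON and inspecting a row looks roughly like this (the filename and field names are illustrative):

    import json

    # Load the file downloaded via "Download JSON" (hypothetical filename).
    with open("forecasting_dataset.json") as f:
        rows = json.load(f)

    print(rows[0])
    # Roughly:
    # {
    #   "source_article": "https://news.example.com/arizona-senate-race",
    #   "question": "Will Candidate X win the 2024 Arizona Senate election?",
    #   "label": "yes",
    #   "confidence": 0.96
    # }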

Congrats, you’ve generated your first forecasting dataset!

Now that you’ve generated the dataset, you can plug it into your workflow to train supervised forecasting models or use it as an evaluation set to benchmark LLM prediction performance.
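
As a starting point for the evaluation use case, here is a minimal benchmarking sketch that compares a model's yes/no predictions against the resolved labels. It assumes the illustrative schema above; predict() is a stand-in for whatever model you want to test.

    import json

    def predict(question: str) -> str:
        # Replace with a call to your model; must return "yes" or "no".
        return "yes"

    with open("forecasting_dataset.json") as f:   # hypothetical filename
        rows = json.load(f)

    correct = sum(predict(r["question"]) == r["label"] for r in rows)
    print(f"accuracy: {correct / len(rows):.2%}")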

Or, you can refine your Project Settings to explore new use cases:

  • Enable Context: Automatically retrieve and attach relevant news signals that were available at the time the question was asked, giving the model situational awareness.

  • Adjust Confidence Thresholds: Generate multiple datasets with different confidence thresholds to see how the setting affects which labeled outcomes are included.

  • Explore a new topic: Change your query from elections to finance, geopolitics, or technology.