Forecasts That Read Contracts: LLMs as Game Changers in Forecasting

For years, forecasting was synonymous with "good time-series models": ARIMA, Prophet, exponential smoothing, and, in more advanced cases, boosting or deep learning. Today, we're increasingly finding that the edge doesn't come from the algorithm itself but from the forecasting system: the ability to rapidly incorporate new signals, maintain stable production performance, ensure data quality control, and run predictable forecast update processes.
This is where LLM/GenAI comes in. Not as a "magic forecasting model," but as a technology that can transform previously unusable resources (contracts, documents, promotion descriptions, emails, news) into features and external signals that genuinely improve forecasts. In other words: LLMs enable forecasts to read a world that was previously locked away in PDFs and text.
Forecasting System ≠ Forecasting Model
The model is one piece of the puzzle. A forecasting system must answer business questions:
- How quickly do we detect trend changes or demand shocks?
- Are forecasts coherent across hierarchies (country → region → store, category → SKU)?
- How do we handle long tail, new products, and short history?
- How do forecasts drive decisions (orders, allocations, production) and do we measure this?
- What happens when data "goes down" or a quality incident occurs?
In practice, most forecasting "failures" don't stem from choosing the wrong algorithm. They stem from missing: features, data quality, MLOps, and process. In our projects, we start with the system and "plugging forecasts into decisions" - only then do we optimize models.
Where LLMs Add Value in Forecasting: Exogenous Signals at Scale
In forecasting, the biggest quality jumps typically come from better exogenous variables, not from more sophisticated time-series models.
What can LLMs add?
LLMs can transform unstructured data into signals like:
A) Contracts and commercial terms
- price indexation, discount thresholds, penalties, delivery windows
- effective dates of changes that impact demand or availability
- supplier risk: clauses, SLA exceptions, red flags
B) Promotion and campaign descriptions
- campaign type (brand vs. performance), intensity, mechanics (bundle, cashback, -X%)
- ad content semantics (e.g., "limited series," "last units")
- marketing activity categorization without manual tagging
C) Operational documents
- logistics reports, delays, stockout reasons
- quality reports, complaints, voice of customer
D) Internet and news (where relevant and policy-compliant)
- signals about supplier issues, outages, embargoes, strikes
- consumer trends and sentiment around brands/categories
- descriptive signals about raw material and transport price changes
In practice, LLMs do something crucial here: they turn chaos into features, which can then feed into classical forecasting models. Of course, we build this layer in a controlled way: with versioning, determinism, and monitoring - so that "AI features" are production-grade, not one-offs.
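As an illustrative sketch of this "chaos into features" layer (the prompt template, field names, and fallback values are assumptions, not a specific implementation), the extraction step can be a strict-JSON prompt plus a defensive parser, so that a malformed LLM response never corrupts the feature table:

```python
import json
from datetime import date

# Hypothetical prompt: ask the LLM to return contract terms as strict JSON.
EXTRACTION_PROMPT = """Extract from the contract amendment below:
price_indexation_pct, discount_threshold, effective_date (YYYY-MM-DD).
Return ONLY a JSON object with those keys (use null if absent).
---
{document_text}"""

def parse_contract_features(llm_output: str, as_of: date = date(2025, 1, 1)) -> dict:
    """Validate the LLM's JSON and map it to model-ready features.

    `as_of` is an assumed reference date for the illustration; in a real
    pipeline it would be the forecast origin. Bad output falls back to
    neutral values instead of raising."""
    try:
        raw = json.loads(llm_output)
    except json.JSONDecodeError:
        raw = {}
    effective = raw.get("effective_date")
    return {
        "price_indexation_pct": float(raw.get("price_indexation_pct") or 0.0),
        "discount_threshold": float(raw.get("discount_threshold") or 0.0),
        "days_until_change": (
            (date.fromisoformat(effective) - as_of).days if effective else None
        ),
    }

# Example LLM response for an amendment raising prices from February 1:
features = parse_contract_features(
    '{"price_indexation_pct": 4.5, "discount_threshold": null,'
    ' "effective_date": "2025-02-01"}'
)
```

The defensive parsing is deliberate: the LLM call itself is the least reliable part of the chain, so validation belongs on the pipeline side.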
Which Forecasting Models Benefit Most?
LLMs don't replace classical forecasting. They enhance it - by providing features.
Models that effectively absorb new features:
- Gradient Boosting (LightGBM/XGBoost/CatBoost): excellent for tabular + many external features, stable and fast
- Global models for multiple series (SKU×store): one model learns from the entire population and generalizes to long tail
- Deep learning for time series (TFT/Transformer/TCN): when scale is large and you want to capture non-linearities
- SARIMAX / dynamic regression: classic, still very good when features are sensible and controlled
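To make the "features feed classical models" point concrete, here is a minimal pure-Python sketch of turning a sales series plus one exogenous promo signal into tabular rows that a gradient-boosting model such as LightGBM could consume (the lag count and values are illustrative):

```python
def make_training_rows(sales, promo_flags, n_lags=3):
    """Build tabular rows: lag features + rolling mean + exogenous promo flag.

    Each row at time t uses only information available before t,
    which is the same time-cutoff discipline a backtest needs."""
    rows, targets = [], []
    for t in range(n_lags, len(sales)):
        lags = sales[t - n_lags:t]
        rolling_mean = sum(lags) / n_lags
        rows.append(list(lags) + [rolling_mean, promo_flags[t]])
        targets.append(sales[t])
    return rows, targets

sales = [10, 12, 11, 13, 30, 14]   # spike at t=4 during a promotion
promo = [0, 0, 0, 0, 1, 0]         # exogenous flag an LLM could have derived
X, y = make_training_rows(sales, promo)
```

Without the promo column, the spike at t=4 is unexplainable noise; with it, the model can attribute the spike to the campaign.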
Where LLMs deliver the biggest impact:
- where demand is text-driven (promotions, contracts, communications)
- where variability stems from descriptive events (incidents, condition changes)
- where you have long tail and cold start - product embeddings help transfer knowledge between similar SKUs
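The cold-start idea above can be sketched with toy embeddings (3-dimensional vectors stand in for real LLM embeddings of product descriptions; `nearest_sku` is a hypothetical helper, not a library function):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest_sku(new_emb, catalog):
    """Return the existing SKU most similar to a new product's
    description embedding; its sales history can seed the
    cold-start forecast."""
    return max(catalog, key=lambda sku: cosine(new_emb, catalog[sku]))

catalog = {
    "SKU-A": [0.9, 0.1, 0.0],   # e.g. "red running shoes"
    "SKU-B": [0.1, 0.8, 0.3],   # e.g. "winter jacket"
}
donor = nearest_sku([0.85, 0.15, 0.05], catalog)  # new shoe-like product
```

In production the lookup would run against a vector store rather than a dict, but the transfer logic is the same.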
Feature Store in Forecasting Systems: Classical Features + Embeddings
A modern forecasting system needs a single source of truth for features - in practice, that means two layers:
A) Feature Store (tabular):
- features from sales history (lags, rolling windows, seasonality)
- prices, promotions, availability, inventory levels
- lead time, MOQ, calendar, holidays
B) Embedding/Vector Store (text and documents):
- embeddings of contracts / amendments / promo specs
- embeddings of product descriptions (for cold start)
- embeddings of operational reports and notes
The key is simple: features must be versioned and repeatable. That enforces discipline: record the LLM model version, prompt, generation parameters, context scope, and sources.
This is a critical element often forgotten - without versioning and repeatability, forecasting with "AI features" becomes difficult to audit, test, and maintain long-term.
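One possible shape for that discipline (the field names and fingerprint scheme are assumptions, shown only to make the idea tangible) is a frozen record that captures everything needed to reproduce an LLM-generated feature, plus a deterministic fingerprint so any change to the prompt or model version is immediately visible:

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class LLMFeatureVersion:
    """Everything needed to reproduce an LLM-generated feature."""
    feature_name: str
    llm_model: str          # pinned version, never "latest"
    prompt_sha256: str      # hash of the exact prompt template
    temperature: float      # 0.0 for deterministic extraction
    source_doc_ids: tuple   # which documents fed the context

def fingerprint(version: LLMFeatureVersion) -> str:
    """Stable ID: identical inputs always yield the same fingerprint."""
    payload = json.dumps(asdict(version), sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

v1 = LLMFeatureVersion("discount_delta", "model-x-2025-01", "abc123", 0.0, ("doc-17",))
v2 = LLMFeatureVersion("discount_delta", "model-x-2025-01", "abc123", 0.0, ("doc-17",))
```

Storing the fingerprint alongside each feature value makes "AI features" auditable the same way classical features are.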
RAG in Forecasting: Not Just "QA," But Source of Features and Explanations
RAG is most commonly associated with chatbots, but in forecasting systems, it serves a much more practical role.
RAG can do two things:
1) Build features from documents "on demand" - For a given SKU/supplier, the system retrieves recent amendments, extracts dates and parameters, creates "delta features."
2) Generate forecast justifications - The business wants to know "why did the forecast change?" If the model received a signal like "price indexation change in the contract from February 1" or "campaign starts January 15," RAG can explain this in language planners and managers understand, with references to the specific sources. More detailed analyses can be produced by AI agents that combine RAG with the ability to query tabular databases: given a natural-language question, such an agent can query the feature table, compare it with prediction results, enrich the answer with RAG retrievals, and produce a complete analysis of the situation.
This isn't window dressing. It's an element of trust and forecast adoption in the organization.
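A simplified stand-in for such a justification step (the document name and feature key are illustrative; a real system would have the LLM phrase the explanation, with this structure as the grounding): turn feature deltas plus their source references into a planner-readable message.

```python
def explain_forecast_change(feature_deltas, sources):
    """Build a human-readable justification from feature deltas.

    feature_deltas: {feature_name: (old_value, new_value)}
    sources: document identifiers backing the change."""
    lines = []
    for name, (old, new) in feature_deltas.items():
        direction = "increased" if new > old else "decreased"
        lines.append(f"- {name} {direction} from {old} to {new}")
    return (
        "Forecast changed because:\n"
        + "\n".join(lines)
        + "\n(sources: " + ", ".join(sources) + ")"
    )

msg = explain_forecast_change(
    {"contract_price_indexation_pct": (0.0, 4.5)},
    ["amendment_2025-02.pdf"],  # hypothetical source document
)
```

The source references are the point: an explanation a planner can click through to the amendment builds far more trust than a bare number.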
MLOps for Forecasting: What Must Work to Be Production-Ready
Forecasting is highly sensitive to data and process. Good systems have:
1) Data quality monitoring
- gaps (missing data vs. zero sales)
- unit of measure and price list changes
- inventory level errors
- data latency
- validation of LLM-generated features, e.g., with LLM-as-a-judge checks
2) Backtesting as a continuous process
- rolling backtest (e.g., weekly)
- metrics per segment (top SKU vs. long tail, categories, regions)
- business-weighted metrics (cost of under-forecasting vs. over-forecasting)
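A business-weighted metric can be sketched in a few lines; the cost ratios below are illustrative and would be calibrated from margin and holding-cost data, not chosen by hand:

```python
def asymmetric_cost(actuals, forecasts, under_cost=3.0, over_cost=1.0):
    """Backtest metric where under-forecasting (stockout risk) is
    penalized more than over-forecasting (holding cost)."""
    total = 0.0
    for a, f in zip(actuals, forecasts):
        err = a - f
        total += under_cost * err if err > 0 else over_cost * (-err)
    return total / len(actuals)

# Same absolute error, very different business cost:
under = asymmetric_cost([100, 100], [90, 90])    # under-forecast by 10
over  = asymmetric_cost([100, 100], [110, 110])  # over-forecast by 10
```

A symmetric metric like MAPE would score both cases identically, which is exactly why it can mislead replenishment decisions.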
3) Drift and event detection - When a new campaign launches, contract changes, or supply problems occur - that's contextual drift. Embeddings generated by LLMs enable detection of semantic changes in documents and operational signals. Based on these, the system can automatically trigger a response - retraining, feature correction, or operational alert.
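The semantic-drift check can be sketched with toy 2-dimensional embeddings (real LLM embeddings have hundreds of dimensions, and the threshold is an assumed value to be tuned on backtests):

```python
import math

def centroid(embs):
    """Mean vector of a list of equal-length embeddings."""
    n = len(embs)
    return [sum(e[i] for e in embs) / n for i in range(len(embs[0]))]

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def context_drifted(history_embs, new_embs, threshold=0.2):
    """Flag drift when the new batch of document embeddings moves
    away from the historical centroid."""
    return cosine_distance(centroid(history_embs), centroid(new_embs)) > threshold

history = [[1.0, 0.0], [0.9, 0.1]]   # usual "on-time delivery" notes
stable  = [[0.95, 0.05]]             # more of the same language
shifted = [[0.1, 1.0]]               # sudden "strike/delay" language
```

A drift flag like this can fire days before the effect shows up in the sales numbers themselves.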
4) Fallback path - In mature forecasting systems, there's always:
- baseline (e.g., seasonal naive / ETS)
- simple emergency model
- safety rules
…so the system keeps working even when upstream pipelines have problems or LLM inference must be temporarily limited.
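A minimal sketch of such a fallback path (the sanity rule and the 10x guard are illustrative placeholders for real business rules):

```python
def forecast_with_fallback(primary_forecast, history, season_length=7):
    """Use the primary model's output when it passes sanity checks;
    otherwise fall back to a seasonal-naive baseline: the value
    observed one full season ago."""
    baseline = history[-season_length]  # seasonal naive
    if primary_forecast is None or primary_forecast < 0:
        return baseline, "fallback:seasonal_naive"
    # guard: reject wildly implausible outputs (e.g. >10x the baseline)
    if baseline > 0 and primary_forecast > 10 * baseline:
        return baseline, "fallback:sanity_rule"
    return primary_forecast, "primary"

history = [12, 15, 11, 14, 13, 16, 12, 13, 14, 12, 15, 13, 17, 11]
value, source = forecast_with_fallback(None, history)  # primary model is down
```

Logging the `source` tag per forecast makes it easy to monitor how often the system is running on its emergency path.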
This is an area where many companies "lose" ML value - and one where we at Elitmind put strong engineering focus.
Forecasting + Replenishment: AI Improves Forecasts, But Decisions Happen in Optimization
The biggest value isn't in "better MAPE." It's in:
- reduced stockouts
- decreased frozen capital in inventory
- increased service levels
- fewer expedited deliveries
- improved turnover
That's why the forecasting system should "feed" the decision layer:
- safety stock, (s,S) policies, base-stock
- order optimization (LP/MILP) under constraints (MOQ, capacity, delivery windows)
- risk simulations (Monte Carlo)
LLMs excel here as generators of risk and event signals (contracts, supplier issues, operational documents) that influence optimization parameters.
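As one concrete hook between the two layers, the classic safety-stock formula z·σ·√LT can absorb an LLM-derived risk signal; note the `risk_multiplier` is an illustrative extension of ours, not a standard term of the formula:

```python
import math

def safety_stock(daily_demand_std, lead_time_days, z=1.65, risk_multiplier=1.0):
    """Safety stock = z * sigma_d * sqrt(LT), optionally scaled up
    when an LLM-derived supplier-risk signal (e.g. systematic delays
    detected in logistics reports) raises lead-time uncertainty.
    z=1.65 corresponds to roughly a 95% service level."""
    return z * daily_demand_std * math.sqrt(lead_time_days) * risk_multiplier

base  = safety_stock(daily_demand_std=20, lead_time_days=9)
risky = safety_stock(daily_demand_std=20, lead_time_days=9, risk_multiplier=1.3)
```

The point is the direction of flow: the LLM doesn't set the order quantity, it adjusts a parameter that the optimization layer then acts on.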
Scenarios: "Forecasts That Read Contracts"
Scenario 1: Pricing term changes from a specific date
- The LLM extracts from the amendment: "from February 1, indexation / discount / MOQ change"
- The system creates features: contract_price_shift_effective_date, discount_delta, moq_delta
- The model accounts for the demand spike (e.g., "stock-up buying" before the increase) and the drop after the change
Scenario 2: Promotion described in text, without proper system tag
- Marketing uploads a brief as a PDF; the LLM classifies the campaign and creates an intensity feature
- The forecast "sees" the campaign before someone manually enters it in the ERP
Scenario 3: Logistics, supplier risk, and lead time
- The LLM analyzes complaints, logistics reports, and correspondence, and detects systematic delays
- A risk feature is created and the lead time is corrected
- Optimization increases safety stock for critical SKUs
Common Pitfalls and How to Avoid Them
- Leakage: documents may contain "the answer" after the fact → hard time cutoffs + feature lineage
- LLM cost and latency → batch, cache, refresh schedules; classification is often better than generation
- Lack of versioning → version the prompt, model, sources, and feature outputs
- Too many features → selection, stability tests, limit to features with proven impact
These are the things that determine whether AI is an advantage - or just a curiosity.
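The leakage pitfall in particular deserves a mechanical guard, not a convention. A sketch of a hard time cutoff (field names are illustrative): only documents ingested before the forecast origin may generate features for that training window.

```python
from datetime import date

def filter_documents_for_training(docs, forecast_origin):
    """Hard time cutoff against 'the answer after the fact' leakage:
    a document written after the forecast origin must never feed
    features into a backtest window that ends at that origin."""
    return [d for d in docs if d["ingested_at"] < forecast_origin]

docs = [
    {"id": "amendment-1", "ingested_at": date(2025, 1, 10)},
    {"id": "postmortem-1", "ingested_at": date(2025, 3, 2)},  # written after the fact
]
usable = filter_documents_for_training(docs, forecast_origin=date(2025, 2, 1))
```

Filtering on ingestion time rather than the document's own stated date is the safer choice: the stated date can itself be an after-the-fact artifact.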
Summary
An LLM is not "another forecasting algorithm." It's a technology that:
- transforms chaos (contracts, documents, promotion descriptions, internet) into features
- allows forecasting models to leverage context they couldn't see before
- industrializes forecasting through MLOps, monitoring, versioning
- ultimately supports better decisions: replenishment, turnover, service levels, and cost
At Elitmind, we build these systems by combining ML and GenAI into production forecasting. We start with a quick diagnostic covering data, process, metrics, and risks, then build an MVP forecasting system with Feature/Embedding Store, and only at the end tune models. This approach delivers fast results - and is scalable.
Interested in exploring how LLM-enhanced forecasting could work in your context? Reach out - we're happy to discuss your specific use case.
