Models

What Is Linear Regression?

Linear regression is a supervised-learning model that predicts a continuous numeric output as a weighted linear combination of input features. For each feature x_i, the model learns a coefficient w_i; the prediction is y_hat = w_0 + w_1*x_1 + ... + w_n*x_n. Weights are fit by minimising squared error (ordinary least squares), either in closed form via the normal equations on small data or by stochastic gradient descent at scale. It is the simplest model in classical ML, the canonical baseline, and the most interpretable: each weight reads as “how much the output changes when this feature changes by one unit, holding the others constant”.
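
The fit described above can be sketched in a few lines of NumPy (an illustrative toy, not production code): prepend a bias column so w_0 is the intercept, then solve the normal equations X^T X w = X^T y.

```python
import numpy as np

# Toy data generated from a known linear rule: y = 2 + 3*x1 - 1*x2 + noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so w[0] plays the role of the intercept w_0.
Xb = np.hstack([np.ones((200, 1)), X])

# Ordinary least squares via the normal equations: (X^T X) w = X^T y.
# (np.linalg.lstsq is the numerically safer equivalent on larger problems.)
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

print(np.round(w, 2))  # close to [2., 3., -1.]
```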

Why It Matters in Production LLM and Agent Systems

Linear regression is not an LLM, but in 2026 production stacks it sits next to every LLM that matters. A re-ranking head on top of a vector retriever uses a linear model on retriever-score features. A cost-prediction model that decides whether to route a request to GPT-4 or a cheaper local model is a linear regression on prompt-length and tool-count features. A fraud-risk score feeding an agent decision is often a logistic or linear scorer with hundreds of engineered features. Each of these is the first layer in a pipeline whose final output is an LLM response — meaning a quality regression in the linear layer cascades into worse RAG results or worse agent behaviour.
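
The cost-router pattern can be sketched as follows; the weights, threshold, feature names, and model names are all hypothetical, standing in for values a real team would learn from its own traffic:

```python
# Hypothetical cost-router: a linear regression over request features
# predicts expected cost; a threshold decides which model serves the request.
W = {"bias": 0.002, "prompt_tokens": 0.00001, "tool_count": 0.003}
ROUTE_THRESHOLD = 0.01  # predicted dollars above which the cheap model takes over

def predicted_cost(prompt_tokens: int, tool_count: int) -> float:
    # y_hat = w_0 + w_1*prompt_tokens + w_2*tool_count
    return (W["bias"]
            + W["prompt_tokens"] * prompt_tokens
            + W["tool_count"] * tool_count)

def route(prompt_tokens: int, tool_count: int) -> str:
    cost = predicted_cost(prompt_tokens, tool_count)
    return "local-model" if cost > ROUTE_THRESHOLD else "gpt-4"

print(route(200, 0))   # small request -> frontier model
print(route(2000, 4))  # large agentic request -> cheaper local model
```

The point is not the arithmetic but the blast radius: every request passes through these three weights before any LLM is invoked.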

The pain shows up unevenly. A platform engineer notices retrieval quality dropped after a re-ranking model retrain; the linear model’s coefficients shifted because an upstream feature changed scale. A product team sees agent costs balloon because a cost-router’s linear weights drifted out of calibration. A compliance lead is asked to explain why a model declined a user — and a linear regression’s per-feature coefficients are the cleanest answer available.
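
The feature-scale failure mode is easy to reproduce: when an upstream pipeline starts emitting a feature in different units, refitting silently rescales its coefficient, and anything that interprets or thresholds on the old weight breaks. A minimal NumPy illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 3.0 * x + rng.normal(scale=0.1, size=500)

def ols_slope(feature, target):
    # One-feature OLS without intercept: w = (x . y) / (x . x)
    return feature @ target / (feature @ feature)

w_old = ols_slope(x, y)          # feature in its original units
w_new = ols_slope(x / 100, y)    # upstream now emits the same feature / 100
print(round(w_old, 1), round(w_new, 1))  # ~3.0 vs ~300.0
```

The refit model predicts identically, but any dashboard, alert, or human explanation built on the old coefficient is now off by a factor of 100.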

In hybrid LLM-plus-classical-ML systems, monitoring needs to span both layers. A pure-LLM eval pipeline that ignores the linear scorer feeding it will mis-attribute every regression.

How FutureAGI Handles Linear Regression in Agent Pipelines

FutureAGI does not fit linear regression models — we evaluate their outputs when they participate in an agent or LLM-driven pipeline. The pattern: register the linear model’s predictions as span attributes via traceAI, attach the trace to a versioned Dataset, and run regression evals after any retrain or feature pipeline change.

Concretely: a recommendation team uses a linear scoring head over retriever-score, recency, and engagement features to rank candidate items before the LLM writes a personalised pitch. Every agent trace records model.scorer.value as a span attribute via traceAI-langchain. A weekly regression eval pulls 5,000 traces, runs NumericSimilarity and a CustomEvaluation rubric (“did the recommended item match the user’s eventual click”), plus AnswerRelevancy on the LLM response. When the eval-fail-rate spikes, the team can split contribution: did the linear scorer drift, or did the LLM start hallucinating? Without that joint surface, “the recommender got worse” is unactionable.

For pure linear-regression workflows outside an LLM pipeline, FutureAGI’s link is honest but weaker: the regression-eval discipline applies, but FutureAGI’s centre of gravity is LLM and agent observability, not classical-ML feature monitoring.

How to Measure or Detect It

Wrap linear-regression output with the same eval discipline as any other model:

  • Equals / NumericSimilarity: numeric comparison against ground truth.
  • CustomEvaluation with a business-metric rubric: e.g. “did this score correctly rank the chosen item top-3?”.
  • Span attribute model.scorer.value: makes the linear-model output visible in the agent trace.
  • Coefficient drift dashboard: compare current vs prior model weights — large shifts flag training-data drift.
  • Population stability index on input features: detects upstream feature distribution shift.
For example, attaching the numeric eval to a versioned Dataset:

```python
from fi.datasets import Dataset
from fi.evals import NumericSimilarity

# Version 7 of the re-ranker evaluation dataset.
ds = Dataset(name="reranker-eval", version=7)
ds.add_evaluation(evaluator="NumericSimilarity")
# Compare scorer output against ground-truth ranks across model retrains.
```
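
The last two checks in the list above are plain array arithmetic. A sketch, with illustrative bin counts and alert thresholds (tune both to your own traffic):

```python
import numpy as np

def coefficient_drift(w_prev, w_curr, rel_tol=0.25):
    """Flag weights whose relative change between retrains exceeds rel_tol."""
    w_prev = np.asarray(w_prev, dtype=float)
    w_curr = np.asarray(w_curr, dtype=float)
    rel = np.abs(w_curr - w_prev) / np.maximum(np.abs(w_prev), 1e-12)
    return rel > rel_tol

def psi(expected, actual, bins=10):
    """Population stability index between two samples of one feature."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e, _ = np.histogram(expected, bins=edges)
    a, _ = np.histogram(actual, bins=edges)
    e = np.clip(e / e.sum(), 1e-6, None)  # proportions, floored to avoid log(0)
    a = np.clip(a / a.sum(), 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(2)
base = rng.normal(0, 1, 10_000)
same = psi(base, rng.normal(0, 1, 10_000))     # near 0: stable
shifted = psi(base, rng.normal(1, 1, 10_000))  # large: distribution shifted
print(round(same, 3), round(shifted, 3))
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift worth blocking a deploy on.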

Common Mistakes

  • Skipping regression eval after a feature pipeline change. Coefficients tuned on the old feature scale produce silently wrong predictions on the new scale.
  • Confusing correlation with causation in coefficients. A negative weight does not mean the feature reduces the outcome; it can mean another correlated feature is doing the work.
  • Not regularising on wide feature sets. Plain OLS on hundreds of features overfits; ridge or lasso regression usually wins.
  • Assuming linearity holds. If the residuals show structure, a non-linear model (gradient boosting, neural net) is the better choice.
  • Treating linear regression as too simple to evaluate. It feeds production decisions; it deserves the same eval gate as the LLM downstream.
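
The regularisation point above can be made concrete: ridge regression only changes the normal equations by adding lam*I, which shrinks the weights and stabilises the fit when features are many or collinear. A NumPy sketch with an illustrative wide, collinear design:

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge: w = (X^T X + lam*I)^-1 X^T y.
    (Simplified sketch: no intercept handling or feature scaling.)"""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Wide, collinear design: 40 samples, 100 near-duplicate features.
rng = np.random.default_rng(3)
base = rng.normal(size=(40, 1))
X = base + 0.01 * rng.normal(size=(40, 100))   # 100 noisy copies of one signal
y = 2.0 * base[:, 0] + rng.normal(scale=0.1, size=40)

# Plain OLS here means inverting a nearly singular 100x100 matrix; the
# lam*I term keeps it well-conditioned and spreads the single true
# signal's weight across the correlated columns.
w_ridge = fit_ridge(X, y, lam=1.0)
print(round(float(np.mean((X @ w_ridge - y) ** 2)), 4))  # small in-sample MSE
```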

Frequently Asked Questions

What is linear regression?

Linear regression is a supervised-learning model that predicts a continuous numeric output as a weighted sum of input features plus an intercept, fit by minimising squared error between predictions and labels.

When should you use linear regression instead of a more complex model?

Use it as a baseline whenever the signal is roughly linear, when you need explainable per-feature coefficients, or when you have small data. If a tuned linear regression is within 1-2 points of a gradient-boosted model, the simpler model usually wins on production cost and interpretability.

How does FutureAGI relate to linear regression?

FutureAGI does not train linear regression models. We evaluate the predictions of any model — including linear regression scorers and re-rankers — when they participate in an LLM or agent pipeline, via Dataset.add_evaluation and CustomEvaluation.