What Is a Support Vector Machine (SVM)?
A classical supervised learning algorithm that classifies points by finding the maximum-margin hyperplane separating classes, optionally with kernel-based feature lifting.
A support vector machine, introduced by Cortes and Vapnik (1995), is a supervised classification (or regression) algorithm that finds the hyperplane separating classes with the largest margin between the closest points (the support vectors). When the data is not linearly separable, the kernel trick implicitly maps inputs into a higher-dimensional space — RBF, polynomial, sigmoid kernels are common — where a linear margin exists. SVMs are deterministic, have well-understood generalisation bounds, and remain a strong baseline on tabular and small-to-medium datasets. They are not a replacement for LLMs, but they live alongside them as cheap classifiers over embeddings.
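The kernel trick is easiest to see on data a straight line cannot split. A minimal sketch using scikit-learn (assumed here as the SVM implementation; the dataset is synthetic concentric circles, not anything from a real pipeline):

```python
# A linear kernel fails on concentric-circle data, while an RBF kernel
# separates it by implicitly lifting the points into a higher-dimensional
# space where a linear margin exists.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "accuracy:", round(clf.score(X_test, y_test), 2))
```

On this data the linear kernel hovers near chance while the RBF kernel separates the classes almost perfectly, which is the whole point of kernel-based feature lifting.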
Why It Matters in Production LLM and Agent Systems
LLM applications often need a tiny classifier somewhere in the pipeline. Should this query route to the cheap model or the expensive one? Is this user input safe enough to pass to the agent? Does this response trip a domain-specific filter? Calling an LLM judge for every classification is expensive and slow; calling a 50-microsecond SVM over a pre-computed embedding is essentially free.
The pain shows up across roles. A platform engineer notices their cost-optimized routing policy uses a 7B model to decide whether to call a 70B model; the routing decision costs more than the savings. A safety lead wants a real-time toxicity gate but cannot afford an LLM-judge call on every request. A product manager needs intent classification within a 100ms p99 budget, and the LLM-classifier latency keeps blowing through it.
SVMs over embeddings solve all three. A binary SVM trained on 5K labelled embeddings classifies in microseconds, deploys as a 100KB blob, and updates with a quick retrain. They will not replace LLMs as the workhorse, but they are the right tool for the bottom of the funnel where every microsecond and dollar counts. In 2026, the pattern compounds — every agent has 5–10 tiny gating decisions per request, and SVMs (alongside small transformer classifiers and logistic regression) are how those decisions stay cheap.
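The "small blob, microsecond inference" claim is easy to check. A minimal sketch, assuming a linear SVM over fixed-size embedding vectors; the random vectors below stand in for real LLM embeddings, and the exact size and timing numbers will vary by machine and model:

```python
# Train a linear SVM on 5K synthetic "embeddings", measure the size of the
# serialized model and the wall-clock cost of a single-vector prediction.
import pickle
import time
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 256))  # stand-in for 5K labelled embeddings
y = (X[:, 0] + 0.1 * rng.normal(size=5000) > 0).astype(int)

clf = LinearSVC().fit(X, y)
blob = pickle.dumps(clf)  # tiny compared to any LLM
print(f"model size: {len(blob) / 1024:.1f} KB")

start = time.perf_counter()
clf.predict(X[:1])
print(f"single inference: {(time.perf_counter() - start) * 1e6:.0f} us")
```

A linear SVM's serialized form is essentially one weight vector per class, which is why it deploys as a blob measured in kilobytes rather than gigabytes.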
How FutureAGI Handles SVM-Routed Pipelines
FutureAGI does not train SVMs; we evaluate the LLM outputs and judge the embeddings that flow through SVM-routed pipelines. The platform integrates at three points: embedding evaluation with EmbeddingSimilarity to verify that the embeddings the SVM consumes are stable, classifier output evaluation with GroundTruthMatch against a held-out labelled set, and end-to-end LLM eval to measure whether the SVM’s routing decision actually improved the downstream metric.
Concretely: a team uses an SVM to classify customer-support queries into “billing”, “technical”, or “general” intents and route each intent to a different prompt template. The SVM is trained on text-embedding-3-large embeddings of 8K labelled queries. FutureAGI’s Dataset stores the held-out test split. GroundTruthMatch runs against the SVM’s predicted intent, surfacing per-class precision/recall on the dashboard. Downstream, the agent’s AnswerRelevancy and TaskCompletion are sliced by routed intent — when “billing” intent’s TaskCompletion regresses by 6 points, the team can attribute it cleanly to either an SVM regression or a prompt regression by comparing against a baseline run.
For drift, monitoring-embeddings (a related glossary topic) detects when the embedding distribution shifts under the SVM, signalling that retraining is due before classifier accuracy collapses.
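The intent-routing pattern above can be sketched in a few lines. Everything here is a hypothetical stand-in: the prompt templates, the `embed` function (two crude text features instead of text-embedding-3-large), and the three hand-labelled training points are illustrative only, not FutureAGI or OpenAI APIs:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical prompt templates, one per SVM-predicted intent.
PROMPTS = {
    "billing": "You are a billing specialist. Query: {q}",
    "technical": "You are a support engineer. Query: {q}",
    "general": "You are a helpful assistant. Query: {q}",
}

# Toy stand-in for a real embedding model: two crude text features.
def embed(query: str) -> np.ndarray:
    return np.array([len(query), query.count("?")], dtype=float)

# Toy SVM fitted on three hand-labelled "embeddings".
clf = SVC(kernel="linear").fit(
    [[30, 0], [60, 1], [15, 0]], ["billing", "technical", "general"]
)

def route(query: str) -> str:
    """Predict the intent and fill the matching prompt template."""
    intent = clf.predict([embed(query)])[0]
    return PROMPTS[intent].format(q=query)

print(route("My card was declined"))
```

In production the `embed` call is the expensive step; if the embedding is already computed for retrieval or caching, the SVM prediction rides along essentially for free.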
How to Measure or Detect It
- GroundTruthMatch: returns a binary or scored match against the SVM's labelled gold class.
- Per-class precision and recall: the standard SVM output metric; surfaces under-served classes.
- EmbeddingSimilarity: checks whether the embedding inputs are drifting; an SVM trained on stale embeddings degrades silently.
- End-to-end downstream eval: AnswerRelevancy and TaskCompletion sliced by SVM-predicted route; surfaces whether routing is helping.
- Inference latency p99: a single SVM should deliver p99 under 1ms; if higher, the embedding-lookup path is the bottleneck.
from fi.evals import GroundTruthMatch, EmbeddingSimilarity

# Check the SVM's predicted class against the gold label.
match = GroundTruthMatch()
# Check that inputs which should route together embed similarly.
sim = EmbeddingSimilarity()

result_a = match.evaluate(output="billing", expected_response="billing")
result_b = sim.evaluate(text_a="My card was declined", text_b="My credit card failed")
print(result_a.score, result_b.score)
Common Mistakes
- Choosing kernel by gut. RBF is the default; check linear and poly empirically — SVM kernel choice is one of the few hyperparameters that should be tuned.
- Skipping class balancing. Class imbalance kills SVM margins; use class_weight or stratified resampling.
- Training on stale embeddings. Embedding model versions change; pin the embedding model and retrain the SVM on every embedding-model swap.
- Reporting only accuracy. A 90% accurate SVM on a 90/10 split is the majority-class baseline; report per-class metrics.
- Using SVM where a tiny transformer would do. Above ~100K samples, distilled-transformer classifiers usually beat SVM with similar latency.
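Two of the fixes above, class weighting and per-class reporting, can be sketched together. The 90/10 split below is synthetic, built only to make the imbalance visible; scikit-learn is assumed as the implementation:

```python
# class_weight="balanced" counters a 90/10 class split, and
# classification_report surfaces per-class precision/recall instead of a
# single accuracy number that the majority-class baseline would match.
import numpy as np
from sklearn.metrics import classification_report
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=0.0, size=(900, 8)),  # majority class
    rng.normal(loc=1.5, size=(100, 8)),  # minority class
])
y = np.array([0] * 900 + [1] * 100)

clf = SVC(class_weight="balanced").fit(X, y)
print(classification_report(y, clf.predict(X), digits=2))
```

The report's minority-class recall row is the number a bare accuracy score hides: a classifier that always predicts the majority class scores 90% accuracy here while catching none of the minority class.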
Frequently Asked Questions
What is a support vector machine?
An SVM is a classical supervised learning algorithm that finds the maximum-margin hyperplane separating two classes in feature space, optionally lifted to higher dimensions through a kernel function.
How is an SVM different from a neural network?
Neural networks learn hierarchical representations end-to-end. SVMs work with hand-crafted features (or fixed embeddings) and find a single optimal margin. SVMs are faster to train on small data and more interpretable; neural networks scale better with data and compute.
Where do SVMs fit in modern LLM applications?
SVMs run as cheap lightweight classifiers over LLM-produced embeddings — intent routers, content gates, eval pre-filters — where a deep model would be overkill. FutureAGI evaluates the LLM outputs that flow through these SVM-routed pipelines.