What is a Large Language Model (LLM)?

Short definition

A Large Language Model (LLM) is an AI model trained on vast amounts of text to understand and generate human language. It predicts likely continuations of text and can be applied to tasks such as writing, summarising, answering questions, and reasoning. LLMs power most modern generative-AI products and are typically accessed as general-purpose models via an API.

A Large Language Model — an LLM — is an artificial-intelligence model trained on enormous quantities of text to understand and generate human language. At their core, LLMs predict likely continuations of a sequence of text, and from this seemingly simple capability emerges a remarkable range of behaviour: writing, summarising, translating, answering questions, extracting information, and a degree of reasoning. LLMs are the technology behind most of today’s generative-AI products and have become a standard building block for software.

How LLMs work at a high level

An LLM is trained by exposing it to vast amounts of text and having it learn to predict what comes next. Through this process it builds an internal statistical representation of language: grammar, facts, styles, and patterns of reasoning. When given a prompt, it generates a response one piece at a time, repeatedly choosing a likely next piece of text and appending it to what it has written so far. It does not look up answers in a database; it produces them from the patterns it learned, which is both its strength and the source of its limitations.
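
The predict-and-append loop can be illustrated with a toy model. This is only a sketch and not how a real LLM is built: real models use neural networks over subword tokens and often sample from the predicted probabilities rather than always taking the top choice. The example assumes a tiny table of word-pair counts, and the names train_bigrams and generate are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_words=5):
    """Greedy decoding: repeatedly append the most frequent next word."""
    output = [start]
    for _ in range(max_words):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # nothing was ever seen after this word, so stop
        next_word, _count = candidates.most_common(1)[0]
        output.append(next_word)
    return " ".join(output)

# Toy training text, invented for this illustration.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigrams(corpus)
print(generate(model, "the", max_words=4))  # prints: the cat sat on the
```

The toy model can only reproduce word sequences it has counted, which mirrors the point above: the output comes from learned patterns, not from a lookup of verified facts.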

Tokens

LLMs do not process text as words or characters but as tokens — chunks of text that may be whole words, parts of words, or punctuation. Both the input and the output are measured in tokens, and this matters in practice: pricing for commercial models is typically per token, and there are limits on how many tokens a model can handle at once. Understanding tokens helps in estimating cost and designing prompts that fit within limits.
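
A rough cost estimate can be sketched as follows. The figure of about four characters per token is a common rule of thumb for English text and is an assumption here, not a property of any particular model; the prices are hypothetical placeholders. For real budgeting, the provider's own tokenizer and price list should be used instead.

```python
import math

# Assumption: roughly 4 characters per token for English text. Real
# tokenizers vary by model and language, so this is an estimate only.
CHARS_PER_TOKEN = 4

def estimate_tokens(text):
    """Rough token estimate from character count (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def estimate_cost(prompt, expected_output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one request; prices are per 1,000 tokens."""
    input_cost = estimate_tokens(prompt) / 1000 * input_price_per_1k
    output_cost = expected_output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost

prompt = "Summarise the following support ticket in two sentences."
# Hypothetical prices, chosen only to make the arithmetic visible.
cost = estimate_cost(prompt, expected_output_tokens=100,
                     input_price_per_1k=0.50, output_price_per_1k=1.50)
print(estimate_tokens(prompt), round(cost, 4))
```

Input and output are priced separately in this sketch because both count towards usage, as the paragraph above notes.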

The context window

Every LLM has a context window — the maximum amount of text, measured in tokens, it can consider at one time, spanning both the prompt and the response. Anything outside the window is invisible to the model. The size of the context window determines how much information — a long document, a conversation history, retrieved data — can be supplied at once, and working within it is a central constraint in designing LLM-powered features.
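
One common way of working within the window is to keep only the most recent conversation turns that fit, while reserving room for the reply. The sketch below assumes a caller-supplied count_tokens function; the word-count stand-in and the function name fit_history are hypothetical, and other strategies (such as summarising older turns) are equally valid.

```python
def fit_history(messages, max_tokens, reserved_for_reply, count_tokens):
    """Keep the newest messages that fit, leaving room for the response."""
    budget = max_tokens - reserved_for_reply
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break  # this and all older messages fall outside the window
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

def word_count(text):
    """Stand-in token counter for the example; real counts come from a tokenizer."""
    return len(text.split())

history = ["a b c", "d e", "f g h i"]
print(fit_history(history, max_tokens=10, reserved_for_reply=4,
                  count_tokens=word_count))  # prints: ['d e', 'f g h i']
```

The oldest message is dropped because the budget of six tokens is already used up by the two newer ones; once dropped, it is invisible to the model.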

Capabilities

LLMs are strikingly versatile. A single model can draft text, summarise long documents, answer questions, translate, classify, extract structured data from prose, write and explain code, and hold a conversation. This generality is why they are described as general-purpose: rather than being built for one task, they can be applied to many through prompting alone. For software, this means a single integration can power a wide range of features.

Limitations and hallucination

LLMs have important limits. Because they generate plausible text rather than retrieve verified facts, they can produce confident but incorrect statements — known as hallucinations. Their knowledge is bounded by their training data and may be out of date. They can be inconsistent and sensitive to how prompts are phrased. Designing reliable products on LLMs means accounting for these limits with verification, grounding, and human oversight rather than assuming the output is always correct.
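
One narrow form of verification is to check the shape of a model's output before the product uses it. The sketch below assumes the model was asked to reply with a JSON object; validate_extraction and the field names are hypothetical. It checks structure only and cannot tell whether the values are factually correct, which still calls for grounding or human review.

```python
import json

def validate_extraction(raw_output, required_fields):
    """Return (data, None) if the output is usable, else (None, reason)."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    if not isinstance(data, dict):
        return None, "not a JSON object"
    missing = [field for field in required_fields if field not in data]
    if missing:
        return None, "missing fields: " + ", ".join(missing)
    return data, None

# Hypothetical model outputs for an invoice-extraction feature.
good = '{"invoice_number": "A-17", "total": 120.5}'
bad = "The invoice total is probably around 120."

for raw in (good, bad):
    data, problem = validate_extraction(raw, ["invoice_number", "total"])
    if problem:
        print("route to human review:", problem)
    else:
        print("accepted:", data["invoice_number"])
```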

Grounding with retrieval

A key technique for improving reliability is to ground the model in trustworthy, relevant information at the time of the query — supplying it with the facts it needs rather than relying on its trained knowledge. Retrieval-augmented generation does exactly this, fetching relevant documents and giving them to the model as context. Grounding reduces hallucination and lets a model answer using current, organisation-specific information it was never trained on.
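
The retrieve-then-prompt idea can be sketched in a few lines. This example assumes a simple word-overlap score in place of the embedding-based search that production systems typically use, and the documents, function names, and prompt wording are all invented for illustration.

```python
import re

def words(text):
    """Lower-case word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, top_k=2):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(documents,
                    key=lambda doc: len(question_words & words(doc)),
                    reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(question, documents, top_k=2):
    """Assemble a prompt that supplies the retrieved facts as context."""
    lines = ["Answer using only the context below. "
             "If the answer is not in the context, say so.",
             "",
             "Context:"]
    lines += ["- " + doc for doc in retrieve(question, documents, top_k)]
    lines += ["", "Question: " + question]
    return "\n".join(lines)

# Hypothetical organisation-specific knowledge base.
docs = [
    "Refunds are processed within 14 days of the return.",
    "Our office is closed on public holidays.",
    "Shipping to Austria takes three working days.",
]
print(build_grounded_prompt("How long do refunds take?", docs, top_k=1))
```

The resulting prompt would then be sent to the model, which answers from the supplied context rather than from its trained knowledge alone.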

Integrating LLMs into software

Most products use LLMs by calling a provider’s model through an API rather than training their own. The application sends a prompt — often assembled from a template, user input, and retrieved context — and receives generated text to use in the product. This makes powerful language capabilities accessible to small teams, but it also introduces dependencies on the provider and considerations around cost, latency, and the data being sent.
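
The integration pattern can be sketched without committing to any particular provider. In the example below, complete stands for whatever function actually calls the provider's API; it is passed in as a parameter so that the provider can be swapped or stubbed in tests. The template text, the answer function, and fake_provider are hypothetical and do not reflect any real SDK.

```python
import time
from string import Template

# Hypothetical prompt template combining fixed instructions,
# retrieved context, and user input.
PROMPT_TEMPLATE = Template(
    "You are a support assistant.\n\n"
    "Context:\n$context\n\n"
    "Customer message:\n$user_input\n"
)

def answer(user_input, context, complete):
    """Build the prompt, call the provider, and record basic metrics."""
    prompt = PROMPT_TEMPLATE.substitute(context=context, user_input=user_input)
    started = time.perf_counter()
    text = complete(prompt)  # the only provider-specific step
    latency = time.perf_counter() - started
    return {"text": text, "latency_seconds": latency, "prompt_chars": len(prompt)}

def fake_provider(prompt):
    """Stand-in for a real API call, so the sketch runs offline."""
    return "(stub reply to a prompt of %d characters)" % len(prompt)

result = answer("Where is my order?", "Orders ship within 3 days.", fake_provider)
print(result["text"])
```

Keeping the provider call behind a single function limits the dependency on one vendor, and recording latency and prompt size gives a starting point for tracking cost and response times.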

LLMs and the EU AI Act

Large language models intersect directly with the EU AI Act. LLMs generally qualify as general-purpose AI (GPAI) models, and the Act’s GPAI obligations fall on the model provider, with additional duties for the most capable models that are classified as posing systemic risk. When an organisation builds an LLM into its own product, that product is typically an AI system in its own right, and the organisation must consider its own obligations, including making it transparent to users that they are interacting with AI. Understanding this layered responsibility is essential when shipping LLM features.

Data protection considerations

Sending data to an LLM — especially a third-party API — raises data-protection questions, since prompts may contain personal data. Organisations must consider where the data is processed, whether it is used to train the provider’s models, and how to keep personal data minimised and protected. For DACH and EU products, choosing providers and configurations that respect data residency and the GDPR is a core part of integrating LLMs responsibly. Innopulse builds LLM features into its products with these considerations designed in.
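
Data minimisation can start with redacting obvious identifiers before a prompt leaves the organisation's systems. The sketch below assumes two deliberately crude regular expressions for e-mail addresses and phone numbers; it will miss names, addresses, and many other kinds of personal data, and it may over-redact other digit sequences such as dates, so it is a starting point rather than a complete safeguard. The function name redact and the placeholder labels are hypothetical.

```python
import re

# Crude illustrative patterns; real personal-data detection needs more than this.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s/-]{6,}\d")

def redact(text):
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

message = "Contact anna@example.com or +49 30 1234567."
print(redact(message))  # prints: Contact [EMAIL] or [PHONE].
```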

Conclusion

A Large Language Model is an AI model trained on vast amounts of text to understand and generate language by predicting likely continuations, giving rise to versatile capabilities across writing, analysis, and reasoning. Working with LLMs means understanding tokens, context windows, and their tendency to hallucinate, and grounding them in trustworthy data for reliability. Integrated into software via an API, they bring powerful capabilities within reach, but they demand attention to the AI Act and data protection when handling real users’ data.
