AI in Mid-Sized Companies: What Works and What Doesn't

Two years ago, someone at a conference in your industry claimed that AI would change everything. Today, almost every mid-sized company has a pilot project behind it. Some are still running, many petered out, and a few will never be officially buried.

The Bitkom 2026 study confirms this impression with numbers. 41 percent of German companies actively use AI today, another 48 percent are planning to. The previous year, only 17 percent were active users. That is the leap from pilot to daily business, and with it come lessons that the hype of 2023 and 2024 had not yet revealed.

This article takes stock after three years of generative AI in mid-sized companies: which use cases visibly work, which fail reliably and why, and what is realistic for the next twelve months. No list of the thirty best AI tools, no sales pitch, no anti-hype polemic. An honest assessment, written from the perspective of someone who supports mid-sized companies through these transitions.

What Visibly Works in Mid-Sized Companies

The successful AI projects look unspectacular. No virtual employee, no efficiency miracle, no disruption. Instead, narrowly scoped tasks where the model amplifies the employee rather than replacing them.

Internal Document Search and Knowledge Management

Employees ask the internal system instead of a colleague. Typical examples: technical documentation, contracts, customer files, project wikis. The setup is a retrieval-augmented generation (RAG) system on a clearly bounded, frequently used data set, where the benefit is felt immediately. When and how a retrieval system makes sense and when long context windows are the better choice is a topic in itself.

Realistic effect: search time per request drops from ten to two or three minutes. No miracle, but a measurable improvement employees feel in their daily work.

Classification and Triage

Categorize incoming emails, prioritize tickets, route requests to the right department. Tight task scope, clear success criteria, the human makes the final decision. For these structured tasks, models with three to fourteen billion parameters are often enough, including those that run on your own hardware.
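What "the human makes the final decision" can look like in practice is sketched below. This is a minimal illustration under assumptions, not a reference implementation: the routing table, the names ROUTES, Suggestion, and suggest_route, and the 0.6 threshold are all hypothetical, and the keyword scoring merely stands in for a small language model.

```python
from dataclasses import dataclass

# Hypothetical routing table (department -> keywords). In a real setup a small
# language model would replace the keyword scoring used here.
ROUTES = {
    "billing": ("invoice", "payment", "refund"),
    "support": ("error", "broken", "not working"),
    "sales": ("quote", "offer", "pricing"),
}


@dataclass
class Suggestion:
    department: str
    confidence: float
    needs_review: bool  # True: do not pre-fill the route, triage manually


def suggest_route(text: str, threshold: float = 0.6) -> Suggestion:
    """Suggest a department for an incoming message.

    The result is only a suggestion. Low-confidence cases are flagged so
    that an employee triages them from scratch.
    """
    lowered = text.lower()
    hits = {dept: sum(kw in lowered for kw in kws) for dept, kws in ROUTES.items()}
    total = sum(hits.values())
    if total == 0:
        return Suggestion("unassigned", 0.0, True)
    best = max(hits, key=hits.get)
    confidence = hits[best] / total
    return Suggestion(best, confidence, confidence < threshold)
```

A message that matches keywords from two departments equally ends up with a confidence of 0.5 and is flagged for manual triage. That is the point of the design: the system narrows the decision, it does not take it.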

Code Assistance for Developers

GitHub Copilot, Claude Code, and similar tools in IT teams. Realistic effect: ten to thirty percent faster routine work, hardly any effect on complex architectural work. Important: the developer is responsible for the quality of the output, not the tool. "Who reviews what the AI just wrote?" is the follow-up question that keeps appearing in audits.

Translation and Language Standardization

Marketing copy, support communication, international correspondence. Output is easy to verify, quality control is clear, risk is low. One of the oldest AI use cases, and still one of the most reliable.

Data Extraction from Structured Documents

Invoice processing, delivery note handling, contract metadata. Works well with standardized formats, less well with highly variable layouts. Going from 80 to 95 percent accuracy is usually achievable. Going to 99 percent requires significant additional effort that only pays off at high volume.
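Whether the step to 99 percent pays off can be estimated with simple arithmetic. The sketch below assumes that every extraction error causes a fixed manual correction cost and that the extra accuracy has a fixed yearly price. Both are simplifications, and the function name and all figures in the usage note are hypothetical.

```python
def break_even_volume(
    extra_cost_per_year: float,
    cost_per_error: float,
    accuracy_before: float,
    accuracy_after: float,
) -> float:
    """Documents per year above which the higher accuracy pays for itself.

    Assumes each extraction error costs a fixed amount to correct by hand.
    """
    gain = accuracy_after - accuracy_before
    if gain <= 0 or cost_per_error <= 0:
        raise ValueError("accuracy gain and cost per error must be positive")
    # Each document saves (gain * cost_per_error) on average.
    return extra_cost_per_year / (gain * cost_per_error)
```

With assumed figures of 20,000 euros of additional yearly effort and 5 euros per manual correction, break_even_volume(20_000, 5.0, 0.95, 0.99) lands at roughly 100,000 documents per year. Below that volume, the last four percentage points cost more than they save.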

The Common Denominator

All successful use cases share three properties: tight scope, easy verifiability, and amplification rather than replacement of employees. If a pilot meets these three criteria, it is likely to still be running productively eighteen months later.

What Fails Reliably in Mid-Sized Companies

Failed projects also follow a pattern. Knowing it spares you the first twelve months of expensive lessons.

The All-Knowing Company Chatbot

The idea: a chatbot that can answer any question about products, service, HR, and accounting. The reality: data is too heterogeneous, answers are half right, hallucinations are inevitable. Employees lose trust after the first wrong answers and stop using the tool.

Typical trajectory: after three months usage has dropped to ten percent of the expected level, after six months the project is shut down. The official record reads "the technology was not yet mature", when in truth the use case was scoped too broadly.

Fully Automated Customer Communication

The idea: AI answers customer inquiries without human supervision. The reality: hallucinations create legal and reputational risks that exceed any efficiency gains. A single false statement about a warranty, delivery date, or contract terms can cost more than two years of efficiency wins.

The realistic version of the same idea: AI suggests answers, an employee reviews and sends. That is augmentation, not automation, and it works.

The AI Consultant as Virtual Employee

The idea: an AI that works like a senior employee, with minimal supervision. The reality: AI has no context, no judgment about edge cases, no accountability. It works in standard cases, but employees do not trust it because they cannot tell where the limits are.

Marketing slogans like "your AI employee who never takes a vacation" sell well but only hold up in demos. In production, it turns out that the "employee" needs supervision after all, and the cost of reviewing AI output often exceeds the cost of the original task.

AI as a Solution for Structural Problems

The idea: "we have a data problem, let the AI solve it". The reality: AI does not turn bad data into good data. It only makes the data problem more visible. PDF archives, fragmented Excel sheets, scanned documents, and wikis with outdated content are not an AI data source. They are preliminary work.

The consequence: anyone who skips the data problem rather than addressing it is building on sand. Data work is boring, hard to market, and unavoidable.

Show Projects Without Success Criteria

The idea: "we are doing an AI project because our competitors are too". The reality: without a measurable goal, the project can neither succeed nor fail. It just fades. After eighteen months, the board asks for results, no one can name them, the budget moves to other initiatives.

The Five Most Common Pilot Project Mistakes

Avoiding the anti-use-cases above addresses the largest risks. What remains are execution mistakes that occur regardless of use case.

Scope too large. Trying to transform the entire company at once. The sober recommendation: one use case, one department, one user group, three months. If it works, expand. If it does not, the damage is contained.

Demo infatuation. A five-minute demo does not show how the system behaves under real load with real data. Pilots should run on real data with real users, not on curated showcase material. What looks fluid in a demo can fail in production on data quality, latency, or simply user behavior.

No success metrics. "We want to see if it helps" is not a success criterion. Define three to five KPIs before the start: processing time per task, share of correct answers, employee acceptance. Skip this, and twelve months later there is no argument for the next investment.

Wrong ownership. AI projects are often placed in marketing or in an innovation unit. Both lack data competence, IT background, and mandate. AI projects belong in IT or with technical leadership. Business units provide requirements and domain knowledge, not architecture. Who in mid-sized companies takes technical responsibility for these decisions is a question best answered before the first pilot, not after.

Tool choice before use case. "We will buy Microsoft Copilot, then we will find a use case" is the expensive variant. First the use case and requirements, then the tool. Otherwise you pay for licenses no one uses and for features that do not match the task.

The Underestimated Data Problem

The Bitkom 2026 study names data protection as the main brake: 48 percent of companies see it as a barrier, 39 percent specifically fear data misuse. But that is only half the story. The bigger, less visible brake is data quality.

What mid-sized companies actually have: PDF archives, fragmented Excel sheets, scanned documents, older databases with inconsistencies, wikis with stale entries, mail inboxes acting as de facto knowledge bases. What AI needs: structured, current, cleanly referenced data.

The typical gap between "we have data" and "the AI can work with it" runs from weeks to months of preparation. Typical tasks: convert document scans into searchable text, decompose PDF contracts into structured fields, refresh wiki content and remove duplicates, clarify permissions so the AI does not access what it should not.
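Two of these preparation tasks, removing duplicates and clarifying permissions, can be sketched in a few lines. This is a minimal illustration under assumptions: the Document structure, the group-based permission model, and all function names are hypothetical, and the fingerprint only catches duplicates that differ in whitespace or capitalization, not near-duplicates with edited wording.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def fingerprint(text: str) -> str:
    """Hash of the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(docs):
    """Keep the first document for each fingerprint, drop later copies."""
    seen = set()
    unique = []
    for doc in docs:
        fp = fingerprint(doc.text)
        if fp not in seen:
            seen.add(fp)
            unique.append(doc)
    return unique


def visible_to(docs, user_groups):
    """Return only documents the user's groups are allowed to read."""
    groups = set(user_groups)
    return [doc for doc in docs if doc.allowed_groups & groups]
```

The design choice worth noting is that the permission filter runs before anything is handed to the AI system, so that a document the user may not read never reaches the model in the first place.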

This data work is boring, hard to sell, and unavoidable. It is also work that should already be happening regardless of AI, because poor data quality is a latent problem in most business processes. AI just makes the problem visible and unpostponable.

Augmentation Instead of Automation

The fundamental architectural choice: should AI take over a task fully, or should it support an employee?

Why Augmentation Wins in Mid-Sized Companies

Lower risk. Errors are corrected by the employee before they reach the outside world. Hallucinations get filtered before they become legal or reputational issues.

Higher acceptance. Employees feel amplified, not replaced. This is not a PR question, it decides usage and data hygiene. Anyone who experiences the AI as a threat does not use it and does not improve it.

Faster rollout. Fewer compliance hurdles, less training, fewer internal escalations. In practice, an augmentation solution is in production in three months, full automation rarely in under nine.

Realistic expectations. Ten to thirty percent efficiency gain instead of the promised doubling. The doubling never comes, the efficiency gain is robust.

When Automation Is Still the Right Path

Automation is the right path when three conditions hold: the task is clearly rule-based and highly repetitive, a wrong decision carries little risk, and there are clear escalation paths for cases where the AI is uncertain.

A typical comparison from customer support: augmentation means the AI suggests customer answers and the support agent reviews and sends. Automation means the AI answers standard inquiries automatically and only escalates edge cases. Both are valid, with different risk profiles. Anyone promising "fully automated with AI" should be asked about escalation paths before the contract is signed.
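The difference between the two modes can be made concrete as a routing rule. The following is a minimal sketch under assumptions: the names Action, HIGH_RISK_TERMS, and decide are hypothetical, the confidence score is assumed to come from the answering system, and the risk terms are taken from the examples named earlier (warranty, delivery date, contract terms).

```python
from enum import Enum


class Action(Enum):
    SEND_AUTOMATICALLY = "send_automatically"
    AGENT_REVIEW = "agent_review"


# Topics where a single wrong statement is expensive (illustrative list).
HIGH_RISK_TERMS = ("warranty", "delivery date", "contract")


def decide(
    inquiry: str,
    draft: str,
    confidence: float,
    mode: str = "augmentation",
    min_confidence: float = 0.9,
) -> Action:
    """Decide whether a drafted answer may go out without human review."""
    if mode not in ("augmentation", "automation"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "augmentation":
        # Augmentation: every draft is reviewed by a support agent.
        return Action.AGENT_REVIEW
    text = f"{inquiry} {draft}".lower()
    # Automation: escalate risky topics and uncertain answers.
    if any(term in text for term in HIGH_RISK_TERMS):
        return Action.AGENT_REVIEW
    if confidence < min_confidence:
        return Action.AGENT_REVIEW
    return Action.SEND_AUTOMATICALLY
```

In this sketch the escalation path is explicit code, not a promise. A vendor who cannot point to the equivalent of these two checks in their own system has not answered the question about escalation paths.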

The Employee Acceptance Everyone Forgets

Tools are introduced, praised by management, and abandoned three months later. This is the underestimated success variable.

Common causes of low acceptance: tools that are slower than the previous workflow, answers employees no longer trust after the first errors, workflows that are not integrated into daily work, and a lack of transparency about what the tool can and cannot do. Then there is the incentive problem: when the AI delivers efficiency gains, the company benefits, but when the AI makes errors, the employee bears the effort of correcting them. This asymmetry needs addressing or the AI will not help.

What raises acceptance: early involvement of future users in pilot and tool selection, realistic communication of capabilities and limits, clear feedback paths for errors, visible improvement over time. Early indicators of acceptance are usage frequency per employee, share of active users, and qualitative feedback at four, eight, and twelve weeks.
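The two quantitative indicators can be computed from a plain usage log. The sketch below assumes the tool can export one user ID per interaction for the period under review. That export format and the function name are assumptions, not a feature of any specific product.

```python
from collections import Counter


def acceptance_indicators(usage_events, licensed_users):
    """Compute share of active users and usage frequency per active user.

    usage_events: iterable of user IDs, one entry per tool interaction.
    licensed_users: iterable of user IDs that have access to the tool.
    """
    licensed = set(licensed_users)
    # Ignore events from IDs without a license (test accounts, admins).
    counts = Counter(user for user in usage_events if user in licensed)
    active = len(counts)
    return {
        "active_share": active / len(licensed) if licensed else 0.0,
        "uses_per_active_user": sum(counts.values()) / active if active else 0.0,
    }
```

Run at four, eight, and twelve weeks, the two numbers show whether usage is spreading or concentrating on a few enthusiasts. The qualitative feedback still has to be collected in conversations.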

Measure ROI, Even When It Is Hard

AI ROI is harder to measure than classic IT ROI. Efficiency gains are distributed: ten seconds here, one minute there. Quality improvements are hard to quantify. Time savings do not automatically convert into productive work.

Still measurable: processing time per task before and after, share of inquiries resolved without escalation, employee surveys on effort and frustration, direct cost savings on external tools that get replaced.

Pragmatic recommendation: define three to five KPIs before the start, measure monthly, do an honest review at six months. Anyone who does not measure ROI will not have a budget twelve months later, because no one can show whether it paid off. Even if it actually did.
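A monthly review does not need a dashboard project. As a minimal sketch, assuming processing times are sampled in minutes before and after the rollout and escalations are counted, two of the KPIs named above fit into one function. The function name and the choice of the median over the mean are assumptions made for illustration.

```python
from statistics import median


def kpi_report(before_minutes, after_minutes, resolved_without_escalation, total_inquiries):
    """Summarize processing time and escalation rate for one review period."""
    if total_inquiries <= 0:
        raise ValueError("total_inquiries must be positive")
    med_before = median(before_minutes)
    med_after = median(after_minutes)
    return {
        "median_before_min": med_before,
        "median_after_min": med_after,
        # Relative reduction of the median processing time in percent.
        "time_saved_pct": round(100 * (med_before - med_after) / med_before, 1),
        "no_escalation_share": resolved_without_escalation / total_inquiries,
    }
```

The median is used here because a few unusually long cases would otherwise distort a small sample. The important part is measuring the baseline before the pilot starts, since it cannot be reconstructed afterwards.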

What Will Change in the Next Twelve Months

Four developments that matter for AI strategies in mid-sized companies through mid-2027:

Local models keep maturing. Hardware gets cheaper, open-source models keep closing the gap, data protection requirements drive adoption. For many standard tasks, the local option is becoming the realistic default in 2027, especially in regulated industries.

Tool use and agentic patterns become standard. AI systems access internal systems directly instead of copying data into the prompt. This changes architecture significantly, the business case less so.

The EU AI Act becomes concrete. The obligations for general-purpose AI model providers have been in force since 2 August 2025. From 2 August 2026, the Commission can issue fines, up to three percent of global annual turnover or 15 million euros, whichever is higher. Relevant for mid-sized companies: anyone deploying an AI system is a deployer under the AI Act and has documentation and transparency obligations of their own, depending on the system's risk class. The effort is currently underestimated.
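The "whichever is higher" rule is easy to misread, so here is the arithmetic as a sketch. It only reproduces the ceiling as stated in the paragraph above, for the provider obligations mentioned there. The function name is hypothetical, and actual fines depend on the individual case.

```python
def max_fine_eur(global_annual_turnover_eur: float) -> float:
    """Upper bound of the fine: 3 percent of turnover or 15 million euros,
    whichever is higher (figures as stated in the text above)."""
    if global_annual_turnover_eur < 0:
        raise ValueError("turnover must not be negative")
    return max(global_annual_turnover_eur * 3 / 100, 15_000_000)
```

For a company with 200 million euros in turnover, three percent would be 6 million euros, so the 15 million euro figure is the relevant ceiling. The percentage only becomes the larger number above 500 million euros in turnover.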

Consolidation among providers. Many AI tools from 2023 and 2024 disappear or get acquired. Anyone betting on shaky vendors faces double migration work. When choosing a tool, the question "what does this vendor look like in two years?" belongs on the list before the feature matrix.

Seven Questions Before the Next Pilot

These seven questions decide success and failure. Not the choice between GPT, Claude, or Gemini.

Which concrete use case is being solved, in which department, with which measurable success criteria?
In what data format do the relevant inputs exist today, and what preparation is needed?
Should the system automate a task or support an employee?
Who in the organization owns the project, with IT background and authority?
Which three to five KPIs are defined before the start and measured monthly?
How will data protection, GDPR, and any sector regulation be satisfied?
What happens when the system gives a wrong or harmful answer, and who is accountable?

Conclusion

AI in mid-sized companies works. Just rarely as spectacularly as it is sold. The successful projects are unspectacular, narrowly scoped, and amplify employees. The failed projects follow a recurring pattern that can be avoided.

Data work, clear use cases, augmentation over automation, and consistent measurement are the four levers with the largest effect. They are neither new nor glamorous. They are the foundation the next twelve months need to be built on.

The Bitkom numbers show that AI has arrived in mid-sized companies. The next wave will not be decided between adoption and rejection, but between disciplined and undisciplined adoption. Anyone who has technical leadership accountable for these decisions has an advantage. Anyone who does not should secure it before the next pilot starts. A deeper look at when a mid-sized company needs a CTO is the natural follow-up question.

Do you have an AI pilot project planned, or one that is not delivering what was promised? Contact me for an AI strategy workshop that brings clarity on use case, architecture, and success metrics in two days.