May 21, 2026

Tabular Foundation Models Are Having Their Moment. Should You Pay Attention?

In the span of five months, tabular foundation models have gone from a niche research topic to a category attracting billions in investment, major acquisitions, and growing vendor attention. If you work with structured enterprise data, and most organizations do, this is a shift worth watching.

Let’s review what happened, in chronological order.

January 2026: Forbes names structured data “AI’s next $600 billion frontier”

In a Forbes deep dive, venture capitalist Rocio Wu Dianoux argued that while large language models (LLMs) have transformed how we work with text, the far larger opportunity lies in structured data: the tables, spreadsheets, and relational databases that power enterprise operations. The thesis: industries built on structured data (finance, insurance, manufacturing) still rely on thousands of task-specific machine learning (ML) models, each requiring its own pipeline, feature engineering, and monitoring. A general-purpose foundation model for tables could collapse that complexity. The article pointed to a new generation of companies building in this space, including Prior Labs, Fundamental, Neuralk AI, and Wood Wide AI. Each explores a different architectural approach to representing tabular and relational data, learning cross-column dependencies, and generalizing across tasks.

February 2026: Fundamental raises $255 million for its large tabular model

Weeks later, Fundamental emerged from stealth with $255 million in funding ($30 million seed plus $225 million Series A) at a $1.4 billion valuation. The round was led by Oak HC/FT, with participation from Salesforce Ventures, Battery Ventures, and Valor Equity Partners. Angel investors included the CEOs of Perplexity, Datadog, and Brex (TechCrunch).

Fundamental’s flagship model, NEXUS, is a deterministic Large Tabular Model (LTM) purpose-built to predict outcomes from enterprise data: demand forecasting, fraud detection, customer churn, and price prediction. Founded by DeepMind alumni, the company had already secured multiple seven-figure contracts with Fortune 100 companies and a strategic deployment partnership with AWS.

The scale of funding signals something important. Investors are treating tabular artificial intelligence (AI) as a platform category.

April 2026: H2O.ai launches TabH2O

Established AI platform vendor H2O.ai entered the tabular foundation model race with TabH2O, a foundation model for tabular data delivered as a simple API. The value proposition: send a file, get predictions back. The service requires no model training, no infrastructure management, and stores no customer data.

TabH2O is pre-trained on millions of synthetic tabular datasets and eliminates the traditional ML pipeline of feature engineering, model selection, and hyperparameter tuning. H2O.ai positions it as a tool for both standalone use and as a capability layer for AI agents working with structured data.

This matters because H2O.ai is not a start-up chasing a thesis. It is an established automated ML vendor validating that tabular foundation models represent the next evolution of the market it helped create.

May 2026: SAP acquires Prior Labs and bets €1 billion

The largest deal came on May 4, when SAP announced its acquisition of Prior Labs, the Freiburg-based pioneer of tabular foundation models and creator of TabPFN. SAP committed to investing more than €1 billion over four years to scale Prior Labs into a globally leading frontier AI lab for structured enterprise data.

The strategic logic: LLMs struggle with the structured, numerical data that runs enterprise operations. Tabular foundation models such as TabPFN are purpose-built for exactly this data, enabling use cases such as predictive maintenance, demand forecasting, customer churn analysis, and cash flow prediction, directly on SAP data.

At the same time, Prior Labs released TabPFN-3, pushing the model toward million-row regimes and introducing “Thinking” (test-time compute scaling) in its API offering. This is a strong signal that tabular foundation models are now scaling in both performance and commercialization.

For a detailed analysis of what this means for SAP’s platform strategy, read our BARC Perspective on SAP’s dual acquisition of Prior Labs and Dremio.

Why this matters

Four data points in five months tell a compelling story:

Date	Event	Signal
January 2026	Forbes names structured data AI’s $600 billion frontier; profiles emerging ecosystem	Market narrative forming
February 2026	Fundamental raises $255 million for Large Tabular Model	Venture capital conviction
April 2026	H2O.ai launches TabH2O foundation model	Established vendor validation
May 2026	SAP acquires Prior Labs and Dremio; commits €1 billion+	Enterprise platform bet

From media narrative to start-up funding to established vendor adoption to major enterprise acquisition: all within five months.

Beyond commercial signals: the open-source side

The momentum is not limited to venture capital and enterprise acquisitions. An active open-source community is driving innovation at a pace that rivals, and sometimes outpaces, commercial efforts.

The most striking example is TabICLv2, developed by researchers at INRIA (France’s national institute for digital science). On the TALENT benchmark, which evaluates models across 300 diverse tabular datasets, TabICLv2 is the current top performer, surpassing even RealTabPFN-2.5 (the hyperparameter-tuned, ensembled, and fine-tuned version of Prior Labs’ model) without any tuning at all (arXiv). On the TabArena benchmark, it outperforms heavily tuned XGBoost, CatBoost, and LightGBM on approximately 80% of datasets. The model is fully open source, pip-installable, scikit-learn compatible, and available under a permissive license (GitHub).

TabICLv2 also addresses one of the key practical limitations of earlier tabular foundation models: scale. While TabPFN v2 was limited to datasets with up to 10,000 samples, TabICLv2 generalizes effectively to million-scale datasets under 50 GB of GPU memory, and is up to 10 times faster at inference than its predecessor. For a comprehensive overview of the current landscape, Christoph Molnar’s The state of Tabular Foundation Models (2026) provides an excellent independent assessment.

The existence of a fully open, state-of-the-art tabular foundation model matters for several reasons. It lowers the barrier to experimentation. It enables independent validation and reproducibility. And it ensures that the technology category is not solely defined by proprietary offerings. For organizations evaluating tabular foundation models, TabICLv2 is a natural starting point. It is free to use, easy to integrate, and competitive with the best commercial alternatives.

The question for enterprises

Tabular foundation models promise to do for structured data what LLMs did for text: make AI accessible without deep expertise, collapse the complexity of maintaining hundreds of task-specific models, and enable predictive capabilities on the data that actually runs business operations.

The real question is how quickly this technology will reshape the analytics and AI landscape, and whether your organization is ready.

Where tabular foundation models help, and where they fall short

Tabular foundation models make it easy to generate solid predictions with minimal effort. Upload a dataset, get a result. No feature engineering, no hyperparameter tuning, no model selection. For many standard use cases, the quality of those predictions is already on par with carefully tuned machine learning pipelines.

But there is a catch: the models are black boxes. The relationships in the input data that lead to a particular prediction remain hidden. Users get an answer, but not an explanation.

This is not entirely new. Many AutoML solutions share the same limitation. The difference is that some AutoML approaches, particularly those based on gradient-boosted trees, can provide meaningful explainability through built-in feature importance, SHAP values, or similar techniques. That transparency has real value: it builds trust, supports regulatory requirements, and enables domain experts to derive actionable insights from model outputs.

At the same time, the argument that building dedicated ML pipelines is the only alternative is weakening. Large language models have dramatically reduced the effort required to develop, test, and deploy custom pipelines. What once took weeks of engineering can now be prototyped in days.

So where do tabular foundation models fit today? We see two immediate, high-value use cases:

Rapid benchmarking. Because tabular foundation models deliver strong baselines almost instantly, they are ideal reference points for evaluating custom ML pipelines. If your purpose-built model cannot outperform a zero-shot foundation model, that is a signal worth investigating.
Fallback under extreme uncertainty. When historical patterns break down, foundation models pretrained on diverse synthetic data may prove more resilient than traditional ML models. The COVID-19 pandemic is the obvious example: many traditional models failed because the data distribution shifted fundamentally. Early research into drift-resilient tabular foundation models supports this hypothesis, though empirical evidence for real-world extreme scenarios remains limited.
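
The rapid-benchmarking idea can be sketched in a few lines. The following is a minimal illustration, not a definitive harness: it scores any two models that follow the scikit-learn fit/predict convention on identical cross-validation folds. The classes MajorityClass and NearestNeighbor, the toy data, and all function names are hypothetical stand-ins chosen so the sketch runs with the Python standard library alone. In practice, one would pass the existing pipeline and a foundation-model classifier in their place.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices once and split them into k near-equal folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean accuracy over k folds for any model with fit(X, y) and predict(X)."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = make_model()  # fresh model per fold, so nothing leaks between folds
        model.fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in held_out])
        hits = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(hits / len(held_out))
    return mean(scores)


class MajorityClass:
    """Hypothetical stand-in for an existing pipeline: predicts the most common label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestNeighbor:
    """Hypothetical stand-in for a zero-shot reference model (1-nearest neighbor)."""

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def predict(self, X):
        def closest(row):
            dists = [sum((a - b) ** 2 for a, b in zip(row, ref)) for ref in self.X_]
            return self.y_[dists.index(min(dists))]

        return [closest(row) for row in X]


def benchmark(candidates, X, y, k=5, seed=0):
    """Score every candidate on the same folds; returns {name: mean accuracy}."""
    return {name: cross_val_accuracy(factory, X, y, k, seed)
            for name, factory in candidates.items()}


# Toy data (an assumption for illustration): two Gaussian clusters, one per class.
rng = random.Random(42)
X = [[rng.gauss(c * 3.0, 1.0), rng.gauss(c * 3.0, 1.0)] for c in (0, 1) for _ in range(50)]
y = [c for c in (0, 1) for _ in range(50)]

results = benchmark(
    {"existing_pipeline": MajorityClass, "reference_model": NearestNeighbor}, X, y
)
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Using the same seed for both candidates keeps the folds identical, so a gap in the scores reflects the models rather than the split. If the purpose-built pipeline trails the reference model under these conditions, that is the signal worth investigating.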

The explainability gap is real, but it is narrowing. Current interpretability methods for tabular foundation models are exclusively post-hoc (such as SHAP or permutation feature importance) and come with disproportionately high computational costs, a consequence of the inverted cost structure where training is cheap but inference is expensive. However, researchers are actively working on integrated solutions. Approaches like KernelICL, which replaces the opaque prediction layer with transparent kernel functions, point toward a future where explainability is built into the model architecture rather than bolted on afterward.
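
To make the cost argument concrete, here is a minimal sketch of permutation feature importance, one of the post-hoc methods mentioned above. It is written against the standard library only. CountingModel is a hypothetical toy model that counts how often predict() is called, and the data is synthetic. The point of the sketch is the call count: the method needs one prediction pass for the baseline plus one per feature and repeat, which is exactly where a model with expensive inference becomes costly to explain.

```python
import random


class CountingModel:
    """Hypothetical model: predicts 1 when feature 0 exceeds 0.5, counts predict() calls."""

    def __init__(self):
        self.predict_calls = 0

    def predict(self, X):
        self.predict_calls += 1
        return [1 if row[0] > 0.5 else 0 for row in X]


def accuracy(model, X, y):
    """Share of rows where the model's prediction matches the label."""
    preds = model.predict(X)
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (an assumption for illustration): only feature 0 determines the label.
rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

model = CountingModel()
importances = permutation_importance(model, X, y, n_repeats=5)

print("importances:", [round(v, 3) for v in importances])
print("predict() calls:", model.predict_calls)  # 1 baseline + 3 features * 5 repeats
```

For a gradient-boosted tree, sixteen prediction passes are negligible. For a model whose every prediction is a full forward pass over the training context, the same sixteen passes, multiplied by hundreds of columns, add up quickly.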

If inference times continue to fall and integrated explainability matures, tabular foundation models could represent the next evolutionary step for machine learning on structured data. Until then, they are a useful addition to the toolkit, best used alongside interpretable approaches.

Run TabICLv2 against one of your existing ML pipelines this quarter. That benchmark will tell you more than any market signal.

An article by:

Alexander Seeliger

Analyst Data & Analytics, Data Scientist

Alexander Seeliger is an analyst for data and analytics and a data scientist at the Business Application Research Center (BARC).

He advises companies on identifying use cases for data analysis and on selecting tools for advanced analytics.

He conducts proofs of concept in advanced analytics and provides coaching in data science and data literacy.

Alexander Seeliger is the author of BARC market studies and research articles. He speaks at conferences and leads BARC and in-house seminars. He plays a key role in the data management, data preparation, and data enrichment behind the BARC product and service overviews.
