In the span of five months, tabular foundation models have gone from a niche research topic to a category attracting billions in investment, major acquisitions, and growing vendor attention. If you work with structured enterprise data, and most organizations do, this is a shift worth watching.
Let’s review what happened, in chronological order.
January 2026: Forbes names structured data “AI’s next $600 billion frontier”
In a Forbes deep dive, venture capitalist Rocio Wu Dianoux argued that while large language models (LLMs) have transformed how we work with text, the far larger opportunity lies in structured data: the tables, spreadsheets, and relational databases that power enterprise operations. The thesis: industries built on structured data (finance, insurance, manufacturing) still rely on thousands of task-specific machine learning (ML) models, each requiring its own pipeline, feature engineering, and monitoring. A general-purpose foundation model for tables could collapse that complexity. The article pointed to a new generation of companies building in this space, including Prior Labs, Fundamental, Neuralk AI, and Wood Wide AI. Each explores a different architectural approach to representing tabular and relational data, learning cross-column dependencies, and generalizing across tasks.
February 2026: Fundamental raises $255 million for its large tabular model
Weeks later, Fundamental emerged from stealth with $255 million in funding ($30 million seed plus $225 million Series A) at a $1.4 billion valuation. The round was led by Oak HC/FT, with participation from Salesforce Ventures, Battery Ventures, and Valor Equity Partners. Angel investors included the CEOs of Perplexity, Datadog, and Brex (TechCrunch).
Fundamental’s flagship model, NEXUS, is a deterministic Large Tabular Model (LTM) purpose-built to predict outcomes from enterprise data: demand forecasting, fraud detection, customer churn, and price prediction. Founded by DeepMind alumni, the company had already secured multiple seven-figure contracts with Fortune 100 companies and a strategic deployment partnership with AWS.
The scale of funding signals something important. Investors are treating tabular artificial intelligence (AI) as a platform category.
April 2026: H2O.ai launches TabH2O
Established AI platform vendor H2O.ai entered the tabular foundation model race with TabH2O, a foundation model for tabular data delivered as a simple API. The value proposition: send a file, get predictions back. The service requires no model training, no infrastructure management, and stores no customer data.
TabH2O is pre-trained on millions of synthetic tabular datasets and eliminates the traditional ML pipeline of feature engineering, model selection, and hyperparameter tuning. H2O.ai positions it as a tool for both standalone use and as a capability layer for AI agents working with structured data.
This matters because H2O.ai is not a start-up chasing a thesis. It is an established automated ML vendor validating that tabular foundation models represent the next evolution of the market it helped create.
May 2026: SAP acquires Prior Labs and bets €1 billion
The largest deal came on May 4, when SAP announced its acquisition of Prior Labs, the Freiburg-based pioneer of tabular foundation models and creator of TabPFN. SAP committed to investing more than €1 billion over four years to scale Prior Labs into a globally leading frontier AI lab for structured enterprise data.
The strategic logic: LLMs struggle with the structured, numerical data that runs enterprise operations. Tabular foundation models such as TabPFN are purpose-built for exactly this data, enabling use cases such as predictive maintenance, demand forecasting, customer churn analysis, and cash flow prediction, directly on SAP data.
At the same time, Prior Labs released TabPFN-3, pushing the model toward million-row regimes and introducing “Thinking” (test-time compute scaling) in its API offering. This is a strong signal that tabular foundation models are now scaling in both performance and commercialization.
For a detailed analysis of what this means for SAP’s platform strategy, read our BARC Perspective on SAP’s dual acquisition of Prior Labs and Dremio.
Why this matters
Four data points in five months tell a compelling story:
| Date | Event | Signal |
|---|---|---|
| January 2026 | Forbes names structured data AI’s $600 billion frontier; profiles emerging ecosystem | Market narrative forming |
| February 2026 | Fundamental raises $255 million for Large Tabular Model | Venture capital conviction |
| April 2026 | H2O.ai launches TabH2O foundation model | Established vendor validation |
| May 2026 | SAP acquires Prior Labs and Dremio; commits €1 billion+ | Enterprise platform bet |
From media narrative to start-up funding to established vendor adoption to major enterprise acquisition: all within five months.
Beyond commercial signals: the open-source side
The momentum is not limited to venture capital and enterprise acquisitions. An active open-source community is driving innovation at a pace that rivals, and sometimes outpaces, commercial efforts.
The most striking example is TabICLv2, developed by researchers at INRIA (France’s national institute for digital science). On the TALENT benchmark, which evaluates models across 300 diverse tabular datasets, TabICLv2 is the current top performer, surpassing even RealTabPFN-2.5 (the hyperparameter-tuned, ensembled, and fine-tuned version of Prior Labs’ model) without any tuning at all (arXiv). On the TabArena benchmark, it outperforms heavily tuned XGBoost, CatBoost, and LightGBM on approximately 80% of datasets. The model is fully open source, pip-installable, scikit-learn compatible, and available under a permissive license (GitHub).
TabICLv2 also addresses one of the key practical limitations of earlier tabular foundation models: scale. While TabPFN v2 was limited to datasets with up to 10,000 samples, TabICLv2 generalizes effectively to million-scale datasets under 50 GB of GPU memory, and is up to 10 times faster at inference than its predecessor. For a comprehensive overview of the current landscape, Christoph Molnar’s The state of Tabular Foundation Models (2026) provides an excellent independent assessment.
The existence of a fully open, state-of-the-art tabular foundation model matters for several reasons. It lowers the barrier to experimentation. It enables independent validation and reproducibility. And it ensures that the technology category is not solely defined by proprietary offerings. For organizations evaluating tabular foundation models, TabICLv2 is a natural starting point. It is free to use, easy to integrate, and competitive with the best commercial alternatives.
The question for enterprises
Tabular foundation models promise to do for structured data what LLMs did for text: make AI accessible without deep expertise, collapse the complexity of maintaining hundreds of task-specific models, and enable predictive capabilities on the data that actually runs business operations.
The real question is how quickly this technology will reshape the analytics and AI landscape, and whether your organization is ready.
Where tabular foundation models help, and where they fall short
Tabular foundation models make it easy to generate solid predictions with minimal effort. Upload a dataset, get a result. No feature engineering, no hyperparameter tuning, no model selection. For many standard use cases, the quality of those predictions is already on par with carefully tuned machine learning pipelines.
But there is a catch: the models are black boxes. The relationships in the input data that lead to a particular prediction remain hidden. Users get an answer, but not an explanation.
This is not entirely new. Many AutoML solutions share the same limitation. The difference is that some AutoML approaches, particularly those based on gradient-boosted trees, can provide meaningful explainability through built-in feature importance, SHAP values, or similar techniques. That transparency has real value: it builds trust, supports regulatory requirements, and enables domain experts to derive actionable insights from model outputs.
At the same time, the argument that building dedicated ML pipelines is the only alternative is weakening. Large language models have dramatically reduced the effort required to develop, test, and deploy custom pipelines. What once took weeks of engineering can now be prototyped in days.
So where do tabular foundation models fit today? We see two immediate, high-value use cases:
- Rapid benchmarking. Because tabular foundation models deliver strong baselines almost instantly, they are ideal reference points for evaluating custom ML pipelines. If your purpose-built model cannot outperform a zero-shot foundation model, that is a signal worth investigating.
- Fallback under extreme uncertainty. Historical patterns can break down, as they did during the COVID-19 pandemic, when many traditional ML models failed because the data distribution shifted fundamentally. In such situations, foundation models pre-trained on diverse synthetic data may prove more resilient. Early research into drift-resilient tabular foundation models supports this hypothesis, though empirical evidence for real-world extreme scenarios remains limited.
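The rapid-benchmarking use case can be sketched in a few lines. The example below is a minimal illustration, not a recommended setup: it scores two candidates on identical cross-validation folds and flags the custom pipeline if it fails to beat the reference. Everything in it is an assumption made for illustration. `NearestCentroid` and `MajorityClass` are hypothetical toy stand-ins (for the zero-shot reference model and the purpose-built pipeline, respectively), and the data is synthetic. In practice, any estimator exposing a scikit-learn-style `fit`/`predict` interface could be passed in their place.

```python
import random
from statistics import mean


class NearestCentroid:
    """Hypothetical stand-in for the zero-shot reference model."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        # One centroid per class: the column-wise mean of its rows.
        self.centroids_ = {
            label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist(row, self.centroids_[c]))
            for row in X
        ]


class MajorityClass:
    """Hypothetical stand-in for a (weak) purpose-built pipeline."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """Mean accuracy over k folds; the same seed gives every model the same splits."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in fold])
        scores.append(sum(p == y[i] for p, i in zip(preds, fold)) / len(fold))
    return mean(scores)


def make_toy_data(n=200, seed=1):
    """Two Gaussian clusters: class 0 around (0, 0), class 1 around (2, 2)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
reference_acc = cross_val_accuracy(NearestCentroid, X, y)
custom_acc = cross_val_accuracy(MajorityClass, X, y)

# The signal described above: the custom pipeline does not beat the reference.
needs_review = custom_acc <= reference_acc
print(f"reference={reference_acc:.3f} custom={custom_acc:.3f} review={needs_review}")
```

Using the same folds for both candidates is the important design choice here: it keeps the comparison from being distorted by differences in how the data happened to be split.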
The explainability gap is real, but it is narrowing. Current interpretability methods for tabular foundation models are exclusively post-hoc (such as SHAP or permutation feature importance) and come with disproportionately high computational costs, a consequence of the inverted cost structure where training is cheap but inference is expensive. However, researchers are actively working on integrated solutions. Approaches like KernelICL, which replaces the opaque prediction layer with transparent kernel functions, point toward a future where explainability is built into the model architecture rather than bolted on afterward.
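To make the cost argument concrete, here is a minimal sketch of permutation feature importance, one of the post-hoc methods mentioned above. It treats the model as a black box and measures how much accuracy drops when a single column is shuffled. The `CountingModel` class, the data, and the choice of accuracy as the metric are hypothetical and exist only for illustration. The call counter shows why the method gets expensive when each prediction pass is costly: it needs one full inference pass per feature per repeat, plus one for the baseline.

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average accuracy drop per column when that column is shuffled.

    `predict` is any callable mapping a list of rows to a list of labels.
    The model's internals are never inspected, which makes this post-hoc.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        preds = predict(rows)
        return sum(p == t for p, t in zip(preds, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild every row with only this one column permuted.
            permuted = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(X, shuffled)
            ]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances


class CountingModel:
    """Hypothetical black box: uses column 0 only and counts inference passes."""

    def __init__(self):
        self.calls = 0

    def predict(self, rows):
        self.calls += 1
        return [int(row[0] > 0.5) for row in rows]


rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(500)]
y = [int(row[0] > 0.5) for row in X]

model = CountingModel()
scores = permutation_importance(model.predict, X, y)

print("importance per column:", scores)
print("inference passes:", model.calls)  # 1 baseline + 2 columns * 5 repeats
```

With two columns and five repeats, the sketch already runs 11 prediction passes over the full dataset. For a model whose inference is the expensive step, that count grows linearly with the number of features and repeats.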
If inference times continue to fall and integrated explainability matures, tabular foundation models could represent the next evolutionary step for machine learning on structured data. Until then, they are a useful addition to the toolkit, best used alongside interpretable approaches.
Run TabICLv2 against one of your existing ML pipelines this quarter. That benchmark will tell you more than any market signal.