Transcript
Mike Matchett: Hi, Mike Matchett with Small World Big Data, and I'm excited to be here today with VAST Data. We're going to catch up with them on some of the things they've just announced at their VAST Forward conference. They've got a lot going on in the AI space, but a lot of cool stuff overall. And the world of storage has changed, from file systems and block systems and heavy arrays, through all sorts of iterations, to the point where we're now seeing basically AI systems coming out of our great vendors. We're going to talk a little bit about how VAST got there. So just hang on a second. Hey, Phil, welcome to our show.

Phil Manez, Vice President - GTM Execution: Thanks so much for having me.

Mike Matchett: All right. We're going to talk about everything VAST today, which is a big topic because you guys named it VAST. Obviously you've got a lot to talk about, probably too much for a single episode. But just to set the stage, how did you get involved with VAST? What attracted you to what they were doing, and what drew you in?

Phil Manez, Vice President - GTM Execution: What really drew me in was the vision they had for where the world was going to go with AI. I joined in 2020, before ChatGPT and everything else, but we really saw what the demands on data were going to be and what AI was going to do to those demands. And I wanted to be in early.

Mike Matchett: And when we look at where VAST has gone: when it started, I hate to say it because I was a storage analyst, but it was yet another storage-array kind of company working the big end of the market, HPC-era stuff. How would you position VAST today to someone who just wants the bigger picture?

Phil Manez, Vice President - GTM Execution: So we've definitely grown well beyond storage.
We call our product today the AI operating system, because it's really a full-stack cloud services layer that gives you everything you need between models and the hardware they run on. So storage and database functionality, but also a compute and serverless function environment where I can build and deploy models and create AI pipelines.

Mike Matchett: What would be a green flag for someone considering VAST? When they're looking around their environment at the projects and initiatives they have, what should prompt them to look at VAST Data in terms of their use cases, workflows, or needs?

Phil Manez, Vice President - GTM Execution: Yeah. Generally it's if you're seeing scaling challenges, whether that's from a storage perspective or a database perspective. Or we see a lot of customers that are struggling to bring AI projects from pilot into production, and they see a lot of things breaking that they didn't see break in the pilot. That's a very good time to come talk to us.

Mike Matchett: All right, so you've got a fit in there. Let's jump ahead a little bit, skipping over the architectural discussion, which is great, by the way; it's a great story, and I encourage everyone to look into what's going on there. But let's talk about AI. What are some of the big problems people are having with AI on their traditional infrastructure, their traditional approaches? I mean, nothing's traditional anymore, but they've got infrastructure on prem, maybe a hybrid architecture, maybe they're using a cloud service, but they're having issues, as you said, scaling it. What goes on there?

Phil Manez, Vice President - GTM Execution: So we see things breaking in a lot of different areas.
One interesting one is that as we start to bring projects into production, a lot of vector databases tip over. When they get to a certain level of scale, or when you need a certain level of interactivity, trying to do things in real time, having AI respond to data as it's coming in, we see vector databases breaking. We see a lot of customers struggling around data controls. These environments are incredibly fragmented, so data is all over the place, and trying to put consistent controls in place across all these different types of products is very challenging. And then there's just the complexity of putting the full stack together, all the components that I need to evaluate and stitch together to make something work reliably at production scale. Those are the main buckets where we see challenges.

Mike Matchett: So if I've got this right, VAST is no longer really centered on selling appliances, or selling what you might consider the storage array. You really have this whole functional layer, an AI operating system, I think you call it, right? And that does suppose the customer has the right infrastructure, whatever that is. Maybe you could tell us a little about how you marry that up, how you integrate it, and how a customer brings what they have to the table.

Phil Manez, Vice President - GTM Execution: Yeah, absolutely. We want to give customers maximum flexibility. That means I can run VAST on a variety of different hardware platforms, whether that's something from an ODM contract manufacturer or a traditional OEM like Supermicro, Cisco, HPE, or Lenovo. But it also means I can run VAST in neoclouds, and I can run VAST in public cloud environments. We have something called the Data Space, which means I can access data from any of those platforms.
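To make the vector-database scaling problem concrete: a naive vector store scans every stored vector on every query, so query cost grows linearly with the corpus while real-time writes keep arriving. This is a minimal, purely illustrative Python sketch, not VAST's implementation:

```python
import math

class BruteForceVectorStore:
    """Toy vector store: works in a pilot, tips over at production scale."""

    def __init__(self):
        self.vectors = []  # (doc_id, vector) pairs, appended as data streams in

    def upsert(self, doc_id, vec):
        self.vectors.append((doc_id, vec))

    def query(self, vec, k=1):
        # O(n) scan per query: every stored vector is scored against the probe,
        # so latency grows with corpus size even as fresh writes keep landing.
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        ranked = sorted(self.vectors, key=lambda p: cosine(p[1], vec), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

store = BruteForceVectorStore()
store.upsert("a", [1.0, 0.0])
store.upsert("b", [0.0, 1.0])
store.upsert("c", [0.9, 0.1])
print(store.query([1.0, 0.05], k=2))  # → ['a', 'c']
```

Production systems avoid this full scan with approximate indexes and accelerated search, which is the gap that shows up when a pilot-scale corpus grows.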
But very recently we've seen customers struggling under this supply chain crunch, not even being able to get SSDs. We've launched a new program called Amplify, where we actually go into customers' environments and help them discover assets we can bring into the VAST platform, and help them store more data than they ever could the way it's deployed today. So we're really focused on flexibility and on helping customers meet these growth goals and not have to slow down their AI projects because of a supply chain crunch.

Mike Matchett: Yeah. I know this is an imperfect analogy, but in the old days, if I had a petabyte or petabytes of data, I'd be looking for an HPC kind of solution, going back to GPFS or something like that. I'd be asking, how do I set up this delicate cluster of supercomputing storage to serve those needs? Well, with AI, everyone needs that kind of storage performance and capacity, but they're not going to do that, right? So VAST seems like a great solution here. I like the emphasis on, what do you call it, Amplify, where you're helping them build that infrastructure up from things they may already have, repurposing stuff, because it's getting harder and harder to find. Tell us a little about the GPU part of that story, though, because we just had GTC a couple of weeks ago. It's great to talk about GPUs, but you can't get your hands on them. What do you do?

Phil Manez, Vice President - GTM Execution: Yeah. I think that's why flexibility is so important, because customers have to be less opinionated about where these workloads are going to run. In certain circumstances I need to be able to move data to where I can get access to GPUs; in other circumstances I want to be able to bring that compute to the data I already have.
So I think it's about a short- and long-term plan that says, where do these projects need to live? Maybe today I can get access to GPUs in the public cloud, but I don't want that to be my long-term plan. Maybe today I can get them from a VAST neocloud partner, or I can only get a limited amount on prem. The idea is being able to move data and compute around to wherever I can get access to the resources I need. I think that is paramount, and it will stay paramount for customers. We know the world is hybrid and it's going to stay that way, so I need that type of flexibility.

Mike Matchett: Yeah, and you're talking about even more than that, entire infrastructure resources, whether it's memory or a GPU or even compute nodes. And I hate to do this to you, but I almost have to start thinking of VAST as a hybridizing solution, one that says, I don't care what your underlying architecture is; you need to solve this AI use case, and we'll bring it all together for you. I know you have a global namespace. Do you have other pieces of that puzzle?

Phil Manez, Vice President - GTM Execution: So our platform has become very broad, from where we came from with Universal Storage to now the AI operating system. We have our global namespace technology. We have something called the sync engine, which allows me to discover data on third-party storage systems or even SaaS applications like Confluence and Google Drive and pull that data into the platform. But now we're also allowing customers to bring and deploy GPUs natively on the platform, so I can actually run my models and run functions on the VAST AI operating system as well.
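The sync-engine idea, discovering documents across third-party sources and pulling them into one namespace, can be sketched roughly as a connector registry. Every name below (`SyncEngine`, the connector callables, the path prefixes) is hypothetical, for illustration only; it is not VAST's actual API:

```python
class SyncEngine:
    """Hypothetical sketch: pull data from many sources into one namespace."""

    def __init__(self):
        self.connectors = {}  # source name -> callable yielding (path, bytes)
        self.namespace = {}   # unified namespace: "source/path" -> content

    def register(self, source_name, list_fn):
        # A connector is just a callable that enumerates (path, content) pairs;
        # a real one would call the source's API and handle auth and paging.
        self.connectors[source_name] = list_fn

    def sync_all(self):
        for source, list_fn in self.connectors.items():
            for path, content in list_fn():
                # Prefix with the source name so paths never collide
                self.namespace[f"{source}/{path}"] = content
        return len(self.namespace)

engine = SyncEngine()
engine.register("confluence", lambda: [("design.md", b"spec")])
engine.register("gdrive", lambda: [("deck.pdf", b"slides")])
print(engine.sync_all())         # → 2
print(sorted(engine.namespace))  # → ['confluence/design.md', 'gdrive/deck.pdf']
```

The point of the pattern is that downstream consumers (RAG pipelines, access controls) see one namespace regardless of where the data originated.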
So we really want to give customers that complete AI stack, one that lets them focus more on what they want to do with AI than on assembling these very complicated, fragile stacks that just don't feel ready for prime time.

Mike Matchett: You take me back 20 years, to the conversations we had with IBM back when they had these huge consolidating storage arrays, asking, can we run some transcoding or some functionality right next to the data? You're building these storage arrays that are basically operating systems; we should be able to run stuff there. And they said no. They'd tried it; it interfered. There were just too many real-world issues with bringing your functionality to where the data was, even though it was functionally capable of it. And now we're seeing that constraint unwound. I'm seeing it with what you guys are doing, where I can bring my whole AI workload to where the data is, into that VAST layer, and I don't have to think of it as a storage array, even though that's basically the storage layer, right? I'm bringing my compute to the data and converging them. I'm sorry to use all these old-fashioned words with you, Phil, because I know you're looking forward, but there's a whole series of things going on there. Tell us more specifically about what you're announcing with AI. You've mentioned bringing in data from SaaS applications; you've got some power to do RAG things, and you've got something about context. Tell us about that.

Phil Manez, Vice President - GTM Execution: Yeah. There are a few different things, and it actually goes to your point about what we're doing with the architecture and the flexibility. So think about some of the things we're doing.
One, we announced a new capability to run GPUs directly on the platform, called NodeX, and we're partnering with Nvidia. So it's not just being able to run the GPUs, but bringing the libraries to accelerate vector search and to accelerate SQL queries. You have to have the architecture underneath, and you have to have the accelerated compute and the libraries to take advantage of those things. We're now integrating different platforms: we had a really cool announcement with a company called Twelve Labs that does AI for video analytics, bringing their functionality, their models, to run natively on the platform. So I've got the GPUs, and now we're bringing models to the platform, which creates a very turnkey solution. But the flexibility of our architecture also means we can do something really cool. We announced a partnership with Nvidia around their CMF solution for context, where we're actually running our code on the BlueField DPU that's sitting right next to the GPUs, inside the DGX. We're running our code there, and we're allowing customers doing large-scale inference to extend their context memory onto the VAST platform. So I have a local VAST server running on a BlueField inside a DGX, and it has direct access to a giant shared pool of storage-class memory and flash where my context can now live. We're really trying to solve these scaling challenges all along the way, whether that's the complexity, how I access vectors and SQL data, or how I extend that context memory and save money and improve the user experience by not having to recalculate context over and over again, which is really costly and also very annoying for users.

Mike Matchett: Yeah, let's dive down a little more practically.
When I'm talking about context in AI, I'm talking about those longer sessions where I'm going back and forth, and if I have to recreate them every time, it becomes a burden. If I'm dealing with large amounts of data, I don't want to reprocess all that data; I want to maintain that session concept over the longer term. And I don't mean a day; I mean maybe a year or more as I keep adding to it. So context is becoming really key to making effective use of AI, especially over large amounts of data, and caching or storing it effectively is going to be a real key to unlocking the value of AI, right?

Phil Manez, Vice President - GTM Execution: Absolutely. You used to have three options. One, recalculate, which means the user is going to wait while I go back and recalculate that context. Two, store it in memory, which can be very, very expensive, and obviously now we have memory supply constraints as well. And the third was sticking it on maybe an object platform, where you're going to have serious performance challenges. So VAST adds a new option, where I can extend that context memory onto our more modern architecture, with the flexibility of running our code directly next to that GPU. I'm really getting the best of both worlds: something that feels like memory, but something that's shared and much more cost-effective.

Mike Matchett: We tend to draw these lines on our data: this is data that's going to storage, and this is data that's going to stay in a buffer, or whatever. But really it's all important, it's all data, and you need a more flexible, fluid set of concepts. We have to come up with some better names for how we deal with our data here, Phil, and for what's current and what's not.
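The tradeoff Phil describes, recompute versus expensive memory versus slow object storage, is essentially cache tiering. Here is a minimal sketch of that idea in Python; it is hypothetical and not VAST's design: a small hot tier spills evicted sessions to a larger, cheaper shared tier, so they can be promoted back later instead of recomputed from scratch:

```python
from collections import OrderedDict

class TieredContextCache:
    """Illustrative two-tier context cache (not VAST's implementation)."""

    def __init__(self, memory_slots):
        self.memory_slots = memory_slots
        self.memory = OrderedDict()  # hot tier: fast, capacity-limited
        self.shared = {}             # warm tier: larger, cheaper, still shared
        self.recomputes = 0          # how often we paid the full recompute cost

    def put(self, session_id, context):
        self.memory[session_id] = context
        self.memory.move_to_end(session_id)
        while len(self.memory) > self.memory_slots:
            # Evict the least-recently-used session, but spill it to the
            # warm tier instead of discarding it outright.
            evicted_id, evicted_ctx = self.memory.popitem(last=False)
            self.shared[evicted_id] = evicted_ctx

    def get(self, session_id, recompute_fn):
        if session_id in self.memory:
            self.memory.move_to_end(session_id)
            return self.memory[session_id]
        if session_id in self.shared:
            ctx = self.shared.pop(session_id)  # warm hit: promote, no recompute
            self.put(session_id, ctx)
            return ctx
        self.recomputes += 1                   # cold miss: full recompute
        ctx = recompute_fn()
        self.put(session_id, ctx)
        return ctx

cache = TieredContextCache(memory_slots=1)
cache.get("s1", lambda: "ctx1")  # cold miss, recompute
cache.get("s2", lambda: "ctx2")  # cold miss; s1 spills to the shared tier
cache.get("s1", lambda: "ctx1")  # warm hit: served from shared, no recompute
print(cache.recomputes)          # → 2
```

Without the warm tier, the third `get` would be a third recompute; with it, the old session comes back at a fraction of the cost, which is the economics behind extending context memory onto shared flash.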
Well, is there anything else that came out of VAST Forward you want to mention before we wrap up here?

Phil Manez, Vice President - GTM Execution: I do think, for the partners listening, we announced a new partner program called the Cosmos Partner Program. Our partner ecosystem is very broad: we're working with GSIs, helping our customers build new applications, and we have a broad OEM and channel ecosystem. But more and more we're working with companies like Twelve Labs. We also announced a partnership with CrowdStrike, bringing detection and response natively into the AI pipeline. So as you think about moving us toward that AI operating system, you're going to see us integrating with the ecosystem very differently, now that we can bring models, and everything needed to support those models, directly to the platform.

Mike Matchett: Extending RAG out to all my SaaS providers, you know, the average company has 170-plus SaaS providers with data all over the place, and if you just natively do things like sync it in with your sync engine, there's a lot of power there that people don't even think about yet. There is so much to talk about, Phil, but I think we've got to wrap up. If someone wants more information or wants to follow up on anything we've teased today, whether it's the Amplify program for the supply chain, the Cosmos Partner Program, or what you're doing with NodeX, is that right, and the GPUs, where would you send them to look?

Phil Manez, Vice President - GTM Execution: Very easy: vastdata.com has all of those things. But we also have a great blog called Shared Everything, and the name comes from our architecture.
So that's a cue to go look at what we're doing from an architectural perspective, because it really is what enables everything. But Shared Everything is a great blog with a lot of thought leadership about what's happening with AI and context and some of these other challenges we discussed.

Mike Matchett: All right, thank you so much. You've been very clear, and a great guest. Thank you for being here, Phil. And if you're watching this and you've ever been jealous of having your own HPC kind of solution, but just couldn't get your organization to stand up something really supercomputing-ish, hey, VAST Data has a way to get you there, especially now in this practical day of AI. If you've got AI initiatives, here's a way to get it all put together in one place. Take a look, I encourage you. Thanks, Phil.