Truth in IT

KIOXIA: “Flash” Forward with AI!

Truth in IT
03/18/2025

Transcript


Hi, Mike Matchett with Small World Big Data, and we are here today talking about my favorite topic, of course: AI. How do you scale it? How do you implement it? How do you get it working for you? Lots of people are running into challenges with their infrastructure, getting it large enough to run these huge LLMs. They're looking at RAG; they're looking at ways to augment their LLMs with extra data. But even that starts to present some scalability issues. We've got KIOXIA here today talking about a vector database solution they've made for us, open source, to help us all with that. Hold on a second and we'll get right into it.

Hey, welcome to our show, Rory.

Hi. Great to be here.

So tell us a little bit about what you're doing with AI. How did you even start to do AI things at what most people think of as a flash hardware company? How did you cross that barrier?

Well, that's a great question. A lot of people wonder what an SSD manufacturer, a flash manufacturer, is doing in AI, and why they should even listen to us. It turns out we've been involved in AI for many years now, starting back in 2017. We installed machine vision systems in our factories to inspect the wafers as they come down the lines, and we developed a lot of the machine vision software ourselves. Our research institute has published many papers on AI, and in general we see it as a huge opportunity because, as should be apparent by now, AI runs, lives, and breathes data. So anything that increases the need for data and high-speed storage is a good thing for us.

All right. So let's talk a little bit about RAG, the retrieval-augmented generation that people are doing with their LLMs. First of all, why? What does RAG do for people that they just can't get from, say, ChatGPT directly?

Yeah, that's a great question.
RAG is actually one of the primary areas of focus for me right now within KIOXIA. These large language models that are in the news today are really expensive to make, and they take a long time and a lot of processing power. It can take months for a large language model to be distilled down into the form you can use in your enterprise. They're trained on publicly available data, and since they take months to train, by the time they're published they're actually a bit out of date. The wonderful thing about retrieval-augmented generation, or RAG, is that it allows you to supplement these large language models, created at great expense, with your own private data as well as up-to-the-minute data, so you can ground your results and get more timely and accurate answers from your AI systems.

Yeah. So RAG is this idea that you take your own data, chunk it up in some way, and feed the chunks relevant to the prompt into the larger, static LLM, and that augments the result. It gets smarter because it gets specific, current, or private data. But that's still a bit of a mystery to people; it's a black box. RAG involves something I've heard called vectorization. Maybe you could explain what that is, and why it becomes a problem for people.

Right. To really take advantage of RAG, you have to preprocess your own data: your private data or your up-to-the-minute data. That involves creating small pieces called embeddings, or what you referred to as chunks. These embeddings are tiny snippets of your data, and you classify each one along several dimensions within a vector.
Those dimensions could be, if you're talking about visual data, color or shape or size, or anything else that describes the data. So you break your data up into little pieces, quantize it along the dimensions that are of interest to you, and form vectors. Then you insert those vectors into a vector database and create an index for it. When someone performs a query, the query is submitted to the vector database, which looks up the best-matching vectors and brings back the relevant context to feed into the large language model and generate these augmented results.

So this vector database becomes a critical piece of infrastructure to host. What are some of the challenges with doing that?

Well, as you want more relevant and higher-accuracy results, you end up classifying your data along more axes, generating more dimensions and, in fact, more vectors. The scale of these vector databases is growing with no end in sight. There are now deployments with over a trillion vectors in the database, and each of those vectors can have several hundred dimensions describing that particular node in the vector space.

So we're trying to get ahead of the LLM and offload, or preload, that work. But now we've got a trillion-row database to manage in front of it, and that doesn't sound feasible.

Yeah. I should point out that a trillion nodes is a very, very large database. But it's not uncommon for vectorization, particularly with multimodal data, to explode the size of the data set by a factor of anywhere between 2 and 10x.
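As an editorial aside, the chunk → embed → index → query flow Rory describes can be sketched in a few lines of Python. This is a toy illustration, not KIOXIA's code: the `embed()` function is a stand-in bag-of-words hash, where a real RAG pipeline would call an embedding model, and the "vector database" is just a list scored by dot product.

```python
from collections import Counter
import math

DIMS = 64  # toy embedding width; production models use hundreds of dimensions

def embed(text: str) -> list[float]:
    """Stand-in embedding: hash each word into a fixed-width vector."""
    vec = [0.0] * DIMS
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % DIMS] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalize so dot product = cosine

# "Chunk" some private documents and store (vector, chunk) pairs:
# a minimal in-memory vector database.
chunks = ["flash storage holds the vector index",
          "DRAM is a scarce and costly resource",
          "RAG grounds an LLM with private data"]
index = [(embed(c), c) for c in chunks]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k chunks whose embeddings best match the query."""
    qv = embed(query)
    ranked = sorted(index, key=lambda p: -sum(a * b for a, b in zip(p[0], qv)))
    return [text for _, text in ranked[:k]]

# The retrieved context would be prepended to the LLM prompt.
context = retrieve("RAG private data")
```

The scaling problem discussed next follows directly from this design: every chunk becomes a vector, so the index grows with the corpus, and a linear scan like `retrieve()` stops being viable, which is what approximate nearest neighbor indexes address.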
So the demands on your storage, and in particular on the memory subsystems that hold the indices for these vector databases, are growing without end right now. Truly a challenge.

And as I understand it, when you create these vector databases, the point is fast lookups, so you're holding them in memory, and memory becomes this critical resource. And you guys are a flash outfit. So tell me the story of what you're developing to get this stuff out of memory and into flash.

Well, first I'm going to do a call-out and say we're standing on the shoulders of giants here. A while back, Microsoft did some research into a technique called DiskANN, or disk-based approximate nearest neighbor search. That technology was developed specifically to address the memory-pressure problem: move the vectors out of memory onto SSDs, then shrink the indices by quantizing the vectors and reducing their size, at the expense of some accuracy, so they fit more easily into DRAM. But when they quantized the vectors and reduced the accuracy, they didn't stop there. They developed some very innovative techniques to post-process and regain accuracy, generating very accurate results even though the indices don't use the full-precision vectors. So Microsoft was first; they created DiskANN. Then we at KIOXIA embraced this technology and wanted to extend it. Whereas Microsoft took the first step of reducing the memory footprint, we flattened the memory footprint, and as far as we can tell there are no limits to the scalability. This serves multiple purposes. The first is that it makes it practical to have very, very large databases. But AI is proliferating. It is everywhere.
And it's going into edge devices and handheld devices, which have very limited memory. So it's very applicable at the small end of the scale too.

All right. And I understand your solution is pronounced like "Isaac," but how do you spell it?

A-i-S-A-Q: all-in-storage approximate nearest neighbor quantized search. And there will be a test later for you folks watching, so get that down.

AiSAQ. And this is open source that you've built and are offering to the community, right? So if I take my vector database and get it out of DRAM and onto, essentially, SSD or flash, I think I can gain a lot of benefits. I'd assume I get a little bit of lag, that it's going to be a little slower, but correct me if I'm wrong. What are some of the things I gain if I'm trying to build an AI cluster, or trying to rent a whole bunch of resources in the cloud? What do I really benefit from doing this?

Well, the first thing is there's a bit of magic behind what Microsoft originally did. Their disk-based solutions can actually outperform in-memory vector databases at high levels of query accuracy. The way they did that, once again, is by quantizing the vectors: lower-precision vectors are easier to perform arithmetic on, so the initial part of the search is simply less work before they have to go to the SSD, retrieve the full-precision vectors, and do the final pass. So at higher levels of recall accuracy, the disk-based approach actually crosses over and outperforms the memory-based solutions. That's an interesting aside. And then we built upon that even further.

Okay. So you've got some bigger advantages.
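The quantize-then-rerank scheme behind that crossover can be sketched as a toy two-phase search: small compressed codes stay in memory for a cheap first pass, and only a handful of candidates are reranked against the full-precision vectors, which a DiskANN-style system keeps on the SSD. The scalar quantizer below is a simplified stand-in for the product quantization the real systems use, and the brute-force candidate scan stands in for a graph traversal.

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is deterministic

def quantize(v, lo=-1.0, hi=1.0) -> bytes:
    """Compress each float32 component to one byte (4x smaller)."""
    return bytes(int((x - lo) / (hi - lo) * 255) for x in v)

def dequantize(q, lo=-1.0, hi=1.0) -> list[float]:
    return [lo + b / 255 * (hi - lo) for b in q]

def dist(a, b) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Full-precision vectors live "on the SSD"; only the small codes stay in RAM.
full = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(1000)]
codes = [quantize(v) for v in full]

query = [random.uniform(-1, 1) for _ in range(16)]

# Phase 1: cheap approximate search over the compact in-memory codes.
cand = sorted(range(1000), key=lambda i: dist(dequantize(codes[i]), query))[:10]
# Phase 2: rerank only those candidates with full-precision vectors from "disk".
best = min(cand, key=lambda i: dist(full[i], query))
```

The design point this illustrates: the first pass touches every item but only 16 bytes each, and the expensive full-precision reads are confined to ten candidates, which is why a disk-resident index can stay competitive with an all-in-DRAM one.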
And I understand that, as a former capacity planner, the more we can get out of memory, the less of that critical and limited resource I need. But there are some clustering benefits here too, right? Because if it's on disk, it looks like I could share this database rather than keep multiple copies of it, one in each node.

Right. The benefits of reducing that memory footprint and flattening it out are manifold. In a multi-tenant environment, you can run more instances of vector databases on the same machine because each one takes far less memory. That's one benefit. Another is that if you're doing any sort of on-demand processing in a vector database, you don't have to preload the index before you can run a query, because the index is out on the SSD; so the time to first response is greatly shortened in the AiSAQ solution. And then, as you alluded to, if you have a networked or shared storage environment where the vector databases and their indices all live on the storage, then when you're scaling out your service you can provision a new node, attach it logically to that shared storage space, and be up and running without any initialization of the database, so you can respond quickly to bursts of demand.

And you had mentioned AI moving toward the edge, which we think is a really big trend; that's where people are going to want to apply AI, where the business is happening. So there's some advantage here also, because you don't have to have terabytes of DRAM in your edge devices.

That's right. I mentioned these large vector databases earlier: using the traditional approaches, when you get up to about 100 billion vectors, the indices require hundreds of terabytes of DRAM.
Microsoft flattened that down to only requiring tens of terabytes. But that's still a lot. With AiSAQ, you can get down to a flat footprint of about 200 GB of DRAM, regardless of the size of the vector database.

Yeah. And not to put you guys down, but flash is relatively cheap these days, right, and getting cheaper. Putting ten terabytes of DRAM on an edge node would still be pretty expensive.

Yeah, I don't think your next cell phone is going to have terabytes of DRAM in it.

No. Oh man, you should see how many windows I have open; I need that. But anyway, we're looking at a pretty significant advantage: an on-disk vector database for RAG that doesn't lag in performance, and can actually deliver better performance. It seems like a good technology, and you're doing it in open source, which is unusual. If someone wants to find out more, or get their hands on the open source and kick the tires, where would you point them?

The KIOXIA America GitHub repository is where you'll find all the open source projects, and DiskANN and AiSAQ are listed prominently there. We don't have a web landing page for this project yet, so I would suggest going to the GitHub repository and watching it for updates. Eventually I'm sure we'll have a landing page.

Sure. And you can go to Kioxia.com for information about the flash solutions; you really do make money selling flash, to start with. That's great. And I know there's more coming; offline, you were hinting at things for me. So we're looking forward to you coming back to tell us what's next and what more you're releasing.
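The DRAM figures quoted in the conversation can be sanity-checked with back-of-envelope arithmetic. The dimension count and compressed-code size below are illustrative assumptions on my part, not numbers from the interview; they simply show how "hundreds of terabytes" and "tens of terabytes" fall out of the math at 100 billion vectors.

```python
# Rough arithmetic behind the DRAM figures in the conversation.
# Assumptions (illustrative): 768-dimensional float32 vectors (~3 KB each)
# for a conventional in-memory index, and ~100-byte product-quantized
# codes per vector for a DiskANN-style compressed in-memory index.
vectors = 100_000_000_000                  # 100 billion vectors

in_memory_tb = vectors * 768 * 4 / 1e12    # full vectors resident in DRAM
diskann_tb   = vectors * 100 / 1e12        # only compressed codes in DRAM

print(f"conventional index: ~{in_memory_tb:.0f} TB of DRAM")   # hundreds of TB
print(f"DiskANN-style:      ~{diskann_tb:.0f} TB of DRAM")     # tens of TB
# AiSAQ pushes the compressed codes out to flash as well, which is why its
# DRAM use stays roughly flat (~200 GB per the interview) at any scale.
```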
I'm just super excited to hear about this kind of advance coming from a sector of the market I didn't exactly expect it from.

That's right. You're not selling GPUs; you're selling the flash. It's great to hear everyone contributing here. So thank you for being here today, Rory.

My pleasure.

All right, check it out. If you're in the AI space, or you're trying to build an AI solution, here is an on-disk, out-of-memory, scalable vector database for your RAG solutions, which you all need, right? That's how you keep your LLMs relevant and current. So take care, folks.
Mike Matchett from Small World Big Data sat down with Rory Bolt, Sr. Fellow & Principal Architect at KIOXIA, to explore how AI is reshaping the need for advanced data storage solutions. KIOXIA, traditionally known for flash memory, has been integrating AI into its operations since 2017, starting with machine vision systems to inspect manufacturing processes.

Rory explains the concept of Retrieval Augmented Generation (RAG), which enhances pre-trained large language models by integrating real-time data, making AI outputs more relevant and timely. This process requires transforming data into embeddings, stored in vector databases that quickly become massive and complex to manage.
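To put "massive" in numbers: at the trillion-vector scale mentioned in the interview, raw vector storage alone reaches petabytes before any index structures are added. The dimension count and precision below are illustrative assumptions, not figures from the interview.

```python
# Back-of-envelope storage math for a trillion-vector database.
# Assumptions (illustrative): 384 dimensions per vector, stored as
# 4-byte float32 values; "several hundred dimensions" is typical.
vectors = 1_000_000_000_000   # one trillion vectors
dims = 384
bytes_per_value = 4           # float32

raw_bytes = vectors * dims * bytes_per_value
petabytes = raw_bytes / 1e15
print(f"~{petabytes:.2f} PB of full-precision vectors")
```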

To tackle the scalability issues of vector databases, KIOXIA developed AiSAQ, a technology that moves these databases from memory-reliant systems to more scalable flash storage, significantly reducing the required memory. This approach not only makes AI tools faster and more efficient but also extends their use to devices with limited memory, like mobile devices and edge computers.