Transcript
at the intersection of AI and data security. Hosted by Cohesity CEO and President Sanjay Poonen. Hello, everyone. Welcome to our discussion here with one of my good friends, CJ. Good to see you. Good to see you. Great to see you as well. CJ is a man I've known for many years, from my years at Symantec, and I've followed his career. He's done incredibly well through a variety of companies, but most prominently ServiceNow and MongoDB. Congratulations. And you've been at MongoDB now how long, CJ? Six months. Six months. Fantastic. Yes. Tell us a little bit about that journey, why you took the role, and what's your vision about where you're taking the company? I joined MongoDB six months ago. After ServiceNow, I was at Cloudflare, and after Cloudflare, I joined MongoDB to be a first-time CEO. Like you. We're doing it together. We are doing it together. And MongoDB, the company, was created in 2007, and the fundamentals were very simple. Relational databases are very rigid. The Oracles of the world and SQL Servers of the world are a very rigid way to represent data. You should represent data in a natural way, as in a document model, and cloud was just emerging, as in the AWS public cloud. Can you leverage data and scale data in a scale-out fashion, rather than the scale-up architectures of Oracle and all the others, so that you can run on commodity hardware and be cloud-agnostic? I mean, literally, that was the thing. Very similar to our story. Yes. Very similar to your story. And unstructured data, as you know, Sanjay, you guys do this every day, continues to grow at a much higher rate than structured, and AI continues to have a lot of unstructured data. I want to create an image. I want to create a video, whatever it is. For all of those things, we are the default data platform for many companies, and we want to be the default data platform in the agentic world, just as we have been in the application world.
And CJ, you've got tens of thousands of customers. Which are the prominent segments these customers play in, in the enterprise? And now you've got a variety of newer AI-native companies, too, in your base. Maybe you can segment where you're getting used the most. Yeah. We go back and forth a lot internally, and I'm sure you do, too. How do we segment customers in this AI transformation? So we have 65,200 customers. We have a product-led growth motion, which helps us with acquisition of those customers in a very effective way. Your frontier model companies, there are very few. You can count them on your fingers. Then you have AI-native companies. These are the companies truly, Sanjay, built after the 2022 ChatGPT era, where on top of an LLM, whichever LLMs you use, you build something. So those are AI-native companies. Then you have the digital natives right here in the Bay, many companies that got created in the early 2000s, the post-mobile era, and so on, and the enterprises. But enterprises are definitely where MongoDB is used for mission-critical workloads, whether it's payments, commercial banking, healthcare, insurance policies, public sector. So within the enterprises, I would say banking or financial services, healthcare, tech, and then the fourth one, I would put it as public sector. Any company with lots of unstructured data, and clearly the healthcare, financial services, tech companies, we share in common. But maybe you can give a use case of how some of these frontier models, the data they collect in chat, where is the data coming from for the ones that are in the more frontier model-type companies of today? Yeah. I mean, the way I see it is, and when I understand why they are using us and for what specific use case, there are two ways to do that. Because even if you are training a frontier model, there's so much unstructured data that you feed into the LLM so that the LLM can provide you better answers.
So some frontier model companies use us for research because they put in lots and lots of training data, and then the new model comes out, versus there is a frontier model company that uses us for inference. So you ask a question, create an image of Sanjay and CJ standing together, it creates an image, and then you go back and forth, and it keeps that long-term memory. So we have research as well as inference, depending on who it is. That's great. Well, you remind me every time I see you of another tie between our companies, because as you know, Cohesity acquired Veritas, and when you were in one of your previous companies, you were running this division. Yes. And you know many of the people who are now here. Yes. When you think about the state of the data that's in your database for unstructured data, many of our large customers are also your largest customers. Banks, the big names in the US and other places, came to us and said, just like we typically protect five forms of data, virtual machines and containers, databases, we put you in that bucket, NAS data files, identity, and then SaaS applications. And they came to us and said, very similar to Oracle and SQL Server, we're now getting a lot of state on MongoDB. Can you go and work with MongoDB to build a data protection security solution so that you can back up their data into our immutable backup platform? So we went to work, and your product people have been extremely helpful. Maybe you can talk about why partnerships like this are important in your ecosystem, where you do something really well, but then we can come behind you and protect the state of the data that's in MongoDB. So first, I do want to say thanks to your teams. Even before the partnerships with our joint customers, what really matters is that the Helios control plane is built on top of MongoDB. And that is the brain behind all the different backup silos, as you outlined.
And that really, really means a lot to us, that if we are part of the control plane on Helios, that just tells us that we are the most mission-critical, even for you, in how you serve. We are a happy customer. You're great, and we embed that everywhere we go. Yeah, so thank you. Number two, on the customers, Sanjay, it's very simple. Initially, we started out in 2007, 2008, when the company was created. A lot of the focus was, we were boxed as a NoSQL company, an unstructured data company, and hey, if it's unstructured data, this is a great database for it, because of our document model, scale-out, et cetera. But over time, we have become mission-critical at some of the firms, our joint customers, where they are running, like I said, payment processing workloads, or insurance policy claims. All of those mission-critical workloads run on MongoDB. Then the request was, hey, this is our horizontal backup strategy. They are always trying to improve RTO, and with ransomware and all that, which you guys have been focused on, in all these regulated industries, both from an op-res or operational resilience perspective, but also just cyber protection, the ask is always simple: how do you work with leaders such as Cohesity, and what can you do? That partnership, from my standpoint, just gives them peace of mind, and if it gives our customers peace of mind, it gives me peace of mind. I mean, ransomware stats, you know they're insane right now, all the way from core data to identity. Immutable is such an important point, and in these regulated industries, where they have to show regulators all the RTO objectives and so on, that is where the partnership is really, really meaningful.
For me, it's just a simple statement to that CIO or CTO at a bank. We can just articulate the joint partnership together so that they know, hey, if I made MongoDB my standard for retail banking, and oh, by the way, our horizontal standard for backup, restore, cyber-res, is Cohesity, how do you guys work together? I should be able to answer that very simply. That's in essence what happened. For the last few years, we were able to, CJ talked about RTOs, so one of the things that makes Cohesity differentiated is our cyber-recovery time. The founders of our company, like Mohit, built this to be supersonically fast, so the speed at which we can recover that data is the fastest of anybody in our space. And now you take data that's coming from MongoDB, we can ingest that data into our immutable platform and ensure, just like all the other stores of data there, it's safe and secure from ransomware, with guaranteed supersonically fast recovery when they need to recover it, and you can then apply that now to all your customer base. I know, and Sanjay, here is the interesting piece that I'm seeing. If you look at frontier model companies, even last week's announcement by somebody like Anthropic, they are now moving their workloads on-prem, and this is very interesting. Sanjay and I have worked together for many other companies in the past, and we thought, like 10 years ago, everybody's moving to public cloud, and now when I'm speaking to customers, the joint customers we have, there is this fascination: hey, we may not actually move this application to public cloud, and we are actually expanding more. One of the large banks on the East Coast, they're expanding their on-prem data center footprint even more now, and I know that's the one big advantage NetBackup had, Cohesity had, and that's the advantage MongoDB has, so I go and tell these customers, that's fine.
If you don't want to move to Atlas or a public consumption offering, you can use MongoDB on-prem, and I know it's the same code base that works for Cohesity across multi-cloud. One customer told me that, CJ, the way you described that is hybrid multi-cloud. The same words I use. Yeah, hybrid multi-cloud. Yeah, and I think you have to be in an environment. We have very similar ways and philosophies to build it. On-premise, for many of our customers, it runs in an appliance that's optimized for speed and performance and size, and then the same code base runs on AWS, Azure, Google. You've had tremendous success with that model with Atlas. The other thing that I'm noticing, and it'll be interesting if you're hearing that, that same remark you heard from that bank on the East Coast, I'm hearing that a lot internationally. Yeah. Data sovereignty. Exactly. Data sovereignty. I think there's going to be some yin and yang where there's going to be data, especially in the Middle East and European countries that might have sensitivity to U.S. public clouds, needing a sovereign cloud solution that works for them. Even Indian banks. Even Indian banks. Yeah. I was talking to one of our sales teams recently, and they said that the whole data sovereignty issue in parts of Asia is becoming even more important, and so it is not only just Europe. Of course, in France you see it a lot, UK public sector in a very, very meaningful way, but then you also go to Scandinavia, parts of Scandinavia, and like you said, the Middle East almost has to work in a sovereign environment. One of the things we pride ourselves in doing when we work on any integration with a strong technology player like MongoDB is we ensure that the speed at which we do it, the integration, is the best. We were the first of any of our competitors to go and optimize for MongoDB, and from that came APIs you could optimize for the industry. Today, we're doing this now in just hundreds of your customers and our customers together.
Our hope is that our product teams can continue to optimize this so that you get, in our world, the ingest and the recovery of data needs to be supersonically fast. The amount of data, when I was coming up to speed both on Cohesity and the Veritas data protection architecture that has evolved on top of the file system over time, your file system, you have a lot more data than we always will have. How do we make it easy to write that data, APIs and integration, and also from an RTO perspective, how fast you can restore a MongoDB cluster at a bank or a public sector organization or whatever, that is critical because we are becoming mission-critical. In the AI world, Sanjay, because of the examples I shared, now there are AI-native companies which are completely built on top of MongoDB or run on MongoDB, but even in banks, they are experimenting with agents in production where we could be the context layer or the memory layer, and that may actually be more data than the persistent data, and how Cohesity and us can partner, that would excite me for the next. Let's talk about that. You recently did an acquisition. You got a brilliant researcher from Stanford that's on your team. We met them. We're in the early stage of discussion, because we'll talk a little bit about what we're doing in AI. What's your vision of where you're taking MongoDB in this AI world? What's this acquisition you did, and how does it all fit into the portfolio? So 100% and 50% credit, 150% credit to David Tachari and the team. They thought of this idea that when you want to do agents in production, it is never an LLM issue. It is always an issue of how effectively you do your RAG, and that should be cheap enough because there is so much data everywhere. You need it to have a real-time feed to give a real-time answer, because if I'm making a decision on your mortgage via agents, of course, I'm going to do RAG across the entire data estate I have. How do we make that effective? So we call it the embedding model.
So that was the acquisition of Voyage AI in February 2025, and where it's going right now is that we will help you find the most effective way to retrieve the right data, and that's what you then feed to your LLM, so even your cost of tokens goes down. So phenomenal acquisition, and our vision, which we have already achieved now because it's been 15 months since that acquisition, is a core operational data layer with search, vector search, and embeddings all in one so that you can build agent-like applications at scale. That's it, literally. And for everyone's benefit, RAG is retrieval-augmented generation. In this stack now, you'd have a vector database, the whole stack that you need to build a semantic layer for these. That's correct, and you feed in text, and now you can ask a question rather than just doing text-based search and all that, and it's all in the op layer, as in MongoDB, the operational layer of real-time data. And we are really excited about this team. Is that available now to your customers? Yeah. It's available. That's great. What's the early feedback you're hearing from customers? So far, really, really good. So AI-native companies, many at scale, are already using it, and they are saying the quality of retrieval is extremely high and fast, and the models that we have created, even the latest retrieval something benchmark, I can't even tell you. But based on that benchmark, we are ahead of Gemini and OpenAI models for embedding. So for your key piece of innovation, Gaia ... We should talk about that. Yeah. We are excited if we can partner together. It's our partnership. Well, it's early days. I had a good discussion with your Stanford team, who are very smart, a few weeks ago, so it's early days. What Gaia does is, in essence, on top of all of these hundreds of exabytes of secondary data, backup data, we've been able to build a semantic layer.
NVIDIA put money into our company about a year and a half ago, and we built the first form of this vector database that allows you to search on top of backup data directly. So you may have a lot of invoices, you may have a lot of PDF data that are contracts. You can use a ChatGPT- or Claude-type interface directly on the backup. That has not been solved before. In fact, we patented the idea of being able to do RAG directly on top of backup. RAG itself is a concept that a lot of people are doing, but doing that directly on backup was a new concept. So there are ways by which that semantic layer stack, Gaia, could use your tech. We're very open to it. Yeah. And we're in early discussions there. But the problem statement's very similar, which is how to query and search unstructured data. And guess what, CJ? In the data that we're backing up, the number one area that our customers are asking us to search with Gaia is unstructured data. Right. Because your PDF files, I had a bank tell me just recently that, hey, we have all this stuff in PDF documents. Bar charts and others are within that PDF document. Can I do that search effectively without having to pull everything out of that PDF document? So that's exactly right. When you think about unstructured data, the problem, I think Jensen, who I know you've spoken to, also talks about PDFs and others, which are the content repository of an organization. How do you do that search? And you have backups of pretty much every relevant Fortune 100 and 500 that matters. And that has always been the holy grail: I have all this data in the Cohesity backup. How can I leverage it? The good news is, I think about our company in sort of three phases. You protect the data first, you apply deep security to it, and then the third act for us is AI. We wisely started off in our relationship with MongoDB on those first two acts, protection and security. Because all this wonderful stuff about AI and data makes no sense if your data is being stolen.
So let's get it safe from ransomware first. But now that that's protected and safe, let's use it now in the context of whatever you might have in your RAG pipeline. Here is my take right now, and I may be wrong, but here is my intuition just based on looking at our joint customers piloting AI in production at scale: the amount of memory or the size of the memory you would require to have the context where agents can make real decisions. So how you work with us on the persistent storage layer of MongoDB. As we evolve our architecture, how you can still be the best in class in terms of providing protection for that data as well. So I can go back and look at why the agent made this decision, or if something goes wrong or a tampering happened, how can we look that up? I think those are the opportunities that exist for us. Well, so let's switch topics as we kind of wrap up and talk a little bit as a CEO. How are you bringing the agentic world to your employees? You have now how many employees in the company? We are 5,500. Okay. So we're around that same size. But if you look at your employee base, are you opening up Claude or other tools to them? How are you transforming your workforce in this AI world? So honest answer, which you always get from me, we are in early stages, okay? And I'll tell you my simple framework, and you'll say, CJ, okay, that's not very insightful, but I'm just going to- I'm sure it's insightful. I'm sure you're evolving it, but I'd love to hear. So my simple framework at a company level, because your question is at a company level. At the company level, I want four buckets, okay, on how I think about AI internally. Bucket number one, which is the lifeblood of a tech company, innovate faster. Engineering. Yeah. So innovate faster. Hey, because of AI, are you innovating faster or are you creating new products faster? Whatever the case might be. Second, can you sell more? Okay. Okay.
So whether it's your SDRs, how you run campaigns, how you help the seller, can you sell more? Third, once you've sold something, can you serve better? This is like customer support, customer success. How are you helping them serve better? And fourth is run efficiently. So across HR and ITG. Very simple. It's four good frameworks. Four frameworks. You have KPIs against each one of them. So we recently rolled out a coding assistant and the teams have embraced it. They are giving us, you know, sometimes investors ask me, which I'm sure they ask you, what percent, CJ, of your code is written by AI? I'm not ready to give those kinds of stats yet. And I don't know if that really matters as long as it's a good- But are you seeing things moving faster? Things moving faster. Absolutely. There are things that used to take us weeks. We can do that in days. Fantastic. Right? Even the thing that you guys are all about, which is around cyber resiliency. I will look at, hey, can we fix vulnerabilities internally for our MongoDB faster by doing hackathons and otherwise using these coding assistants? 100%. So, innovate faster, which is the killer use case for AI, we are looking at multiple KPIs right now. My intuition, having talked to other people who are like a year ahead of us or two years ahead of us, is 20 to 30% higher productivity. And then as a CEO, my question is very simple. Does that mean I get 20 to 30% more features that we can sell to our customers, or that we don't need to hire 20 to 30% more people and we can just have the current team being more productive? I mean, literally it's as simple as that. And I have to keep it simple. Otherwise you get too complicated. I think it's a good framework because the first one is your engineering team, the second one is your sales and marketing team, the third one is your support team, and fourth is everybody. Everybody. Right? Maybe so. Yeah. The co-productivity, knowledge worker, this and that. Great. And do you care? It's the end.
But when you look at coding assistants in engineering, are you neutral to any one of them? Are you seeing something emerge as the best among all of these? Everyone's talking about Anthropic and Claude Code. Are you flexible so that if that changes tomorrow, you can switch out? Are you finding any one that's better today, or is it still early days to figure out what's the best coding assistant? So I was the first one among the peer group back in 2023, on the ask from Microsoft, and you work with them closely, to roll out GitHub Copilot. Good. Okay. And that was 20. Today, not many companies even mention them, and they mention these other names. So one of the things that I told our CIO Deepa to do is that when we sign with these coding assistant companies or any AI company, I just sign, Sanjay, one year at a time. Because if things change tomorrow, who knows, Codex is going to be amazing and my team may benefit on innovate faster. They may say Codex. That's a good one. Yeah. Codex. I think if you go a year at a time, things are very dynamically changing. Correct. Especially in this space. Let's talk a little bit about the negative. Everyone's asking about Mythos. How do you view that? Is this something, in a world that's very worried about ransomware and security threats, that we as CEOs should be worried about? How do we as joint CEOs protect our code base from the bad guys? And at the same time, we realize that AI could actually be a force for good. Yeah. So I do have a perspective just based on today's information, because that continues to change. We were with a bank last week in the UK, in London, okay? And we were talking to the CIO there and he was very worried about Mythos, right? Because he said, I do not know what I do not know. Okay. So from our customer environment standpoint, they are really worried.
And as you know, they have started taking this approach: we will start with the operating system layer, Windows, Unix, Linux, the things that we used to do and where it has stays. Windows, Unix, Linux. We look at all the browser and the front-end interface. And of course, all the infrastructure software, whether it's MongoDB, other databases, software from you guys, we look at all of them. I can tell you without hesitation, you are going to find a lot. Then which one is a priority? Which one has to be dealt with? And Sanjay, you understand this more: the time to exploit has always continued to reduce, but now I would say it's moving even faster on the time to exploit. So I think this is a real threat. And this bank individual told me that he thinks he may change the priority from innovation to cyber protection because he doesn't know what Mythos is going to find. So I don't know. I know Mythos has not been rolled out to every bank. I know some joint customers are in, yeah, stages of doing it. I think this is wise advice. And I think, you know, it is going to move the security discussion from not just detection and prevention, but also to recovery and resilience, which is what we are focused on. We want you to be successful storing a lot of the world's data, unstructured data, any form of data. Whether it's in traditional applications or the more AI-native applications, we're going to come behind you, CJ, and protect all that. That's a very simple story. You store the data, put it in the MongoDB database on-prem or Atlas proliferate. And we want to come behind you and build the best security solution. That's our mission. Some part of that will also be an AI mission together. So folks, that's the simple story. I'm delighted to be spending some time with CJ. He's a great innovator, a great leader. I hope all of you out there watching this get a chance to meet him.
MongoDB is a great company, and he's leading it with an enormous amount of passion and dynamism, as you can see. Thank you very much, CJ. Thank you, Sanjay. And wishing you great luck.