Transcript
Hi, Mike Matchett with Small World Big Data. We're here today talking about, of course, big data. A lot of people with big data today think they have to move it around: get it to where the compute is, get it into the cloud, collect it in from the edge in order to do something productive with it. But moving data has a whole raft of problems, not the least of which are time and expense. We've got some alternatives for you today. We're going to be talking to Vcinity. Hold on just a second and we'll get right into it.

All right. Welcome, Harry. Welcome to our show.

Thank you, Mike. Glad to be here.

So you're the CEO of Vcinity. And this is a company that isn't really brand new. This is a company that's focused on moving data. We'll get into all the technology and gee-whiz stuff in a second, but what were the origins of Vcinity, and how did you get involved in that?

Sure, thanks. So Vcinity is about five years old, and it was really formed to buy a predecessor company called Bay Microsystems. Not Bay Networks; Bay Microsystems. It was an interesting technology. Originally it was an ASIC company focused on how to move data really, really quickly and deterministically for the US government. We've taken that technology and evolved it into a number of interesting uses, still focused on the fed side, but more and more on the commercial side as well.

So you're really bringing a technology that's fairly mature into the more commercial markets, right? And trying to make this available to everyone. And it's probably the right time to be doing this, because data's just getting bigger. People have bigger and bigger data. Thus our name, Small World Big Data.

I understand.

Before we get into it, I just also want to say, Harry, we were talking earlier: my first analyst paper was on InfiniBand and RDMA. And as we were talking, we found out there's a big crossover there.
So this goes back to those kinds of solid roots, doesn't it?

It does. In fact, as I was mentioning earlier, if you sort of oversimplify what we do, it's a proprietary encapsulation of RDMA with a whole bunch of network engineering around it. And to make sure that it can traverse any network, we put an IP header on the front of it.

All right. So that's it in a nutshell. Obviously that's kind of a gauntlet thrown to some of the technologists in our audience, who are going to want a deeper dive. We're not going to get there today, I promise. But maybe we'll get a chance to do an event on this at some point. So you've brought this technology that's been in fairly heavy use by some heavy hitters over the years for their biggest data sets, and there's a lot of government stuff in there. And you're bringing it down to clouds. You're in the Amazon Marketplace. You've got some Azure stuff going on. Dell's a partner in there. So you've got a lot of things moving. Tell me a little bit about the market for data movement. What does the panoply of solutions look like today if I need to move a lot of data?

There's still a bunch of other companies out there that have been doing it for a long time. They tend to use some form of UDP as the underlying technology, whether it's a proprietary version of it or just flat-out UDP. We do something very, very different. When we move data, it's lossless and deterministic, and truthfully, it's faster than anything else on the planet as far as we know. But one of the things we challenge ourselves with is how to be even faster. And the only way we figured out how to do that is to use our technology to actually not have to move the data and still use it.

All right. So that's kind of the gauntlet thrown, because we did tease that earlier.
And if people know about RDMA, you're usually talking about building a kind of data lake or warehouse, or a cluster configuration where I've got a way to get access across the cluster to some data and bring it directly into the memory of the compute node that needs it, rather than going through a whole bunch of different layers. That's remote direct memory access. You probably get those acronyms better than I can. But now we're doing this really remote, right? This is remote remote.

Exactly. I mean, before Vcinity, the only real application of RDMA was within the four walls of a high-performance computing data center. Mellanox was the market leader in this space, where they essentially used their technology, InfiniBand and RDMA, to allow massive amounts of parallel flows of data that a normal CPU going up and down the OSI stack simply couldn't keep up with. And the difference is essentially that what Mellanox fundamentally did within the four walls of a data center, we do across the globe. That's really the difference between the companies.

All right. And I think this is the major point, so if you're listening to this, this is the major point. We could move the data using Vcinity faster than anything else. And we're talking about large data, terabytes of data, big data sets, brought across the world faster than anything else. That's a gauntlet thrown. But the real point here is that you don't have to move the data when you have this capability, right? The data can now stay where it is, the compute can be somewhere else, and it can still work. Am I getting that right?

That's exactly right. And the only problem with that is no one believes it when you first talk to them about it. But if you think about it, we keep talking about a hybrid cloud world.
And I actually believe that. To me and to Vcinity, hybrid cloud means that you have infrastructure, you have resources, in lots of different places. Some of them may be in a public cloud, some of them could be in a data center, in a colo, or in your own corporation. It shouldn't matter, right? You need a way to be able to connect those together and use them without having to move applications here and data there, and that's what we enable. Mike Weir, my CTO, likes to say we're the glue, if you will, that allows that to happen by using the wide area network as that connecting layer.

So the wide area network is not just for messaging or some metadata or some coordination. You're actually saying we can use that wide area network as if it's a local area network, if we have the right technologies in place, such as Vcinity provides. So if I'm building a hybrid cloud today, I tend to think of it as still discrete components, like a chessboard with black pieces and white pieces. A particular application is only sitting on one square; even though I can move it around that hybrid cloud environment, it's somewhere. But if I think of this new architecture I can build with Vcinity, it becomes more continuous. I can break this down. My applications can be running in the parts of the hybrid cloud where that seems best, and the data can be resident in the parts of the hybrid cloud where that seems best for the data. And it's still going to work as if it was all in one data center.

That's absolutely right. But I want to make one additional point, though. We don't have a preconceived notion of where the customer should have an application and where they should have the data. It's going to vary company to company. It's going to vary industry to industry. It's going to vary country to country. We don't care, right?
There's been a lot of almost religious discussion about public cloud, private cloud, on-prem, whatever. And the truth is everyone calls what they do today cloud. And the reality is, if it's really cloud, then it should all be able to work together. You shouldn't have to rewrite your apps. You shouldn't have to move your data. You shouldn't have to put them both in a container and ship them off somewhere. You should just be able to use it. To us, that's true hybrid cloud, and that's what we enable as a company.

There are some questions there about how you might deal with geofencing and compliance and the rest of it, about staying compliant on where the data has to reside, if the data doesn't have to be moved but can be accessed from anywhere as if it had been moved.

Absolutely. Whether it's GDPR or privacy laws or, for example, in the oil and gas space, a lot of countries believe that the data about what natural resources they have can't leave the country. So what are you going to do if you want to be able to access and analyze some of that data? We give you that ability, because we don't need to make a copy of it outside the country in order for you to be able to analyze it.

All right. And we're talking about not small amounts of data here. It's not like a database transaction that's been compressed. This is large amounts of data, terabytes of data, that can be remotely accessed, because we're looking at use cases that are big data use cases. Really, warehouses of data.

Sure. I mean, that's certainly an absolutely strong use case. If you're down where your data size is in the kilobits, in packets, yeah, we're not the technology for that. When you get up to megabits and gigabits, and obviously that results in terabytes of data, that's where this technology really adds significant value.

Okay.

And the more distributed it is, the more...

I'm sorry.
The more distributed it is, the more latency is involved, the more value we add.

And especially leaving the data where it's at and just accessing it where it sits.

Exactly.

There are all sorts of other interesting things to talk about here. There's capacity utilization, there's federation, there's security. There's the fact that the data doesn't have to be moved. There's a lot of benefit one could get from this. I did want to ask you to give us one of your real-world examples, of where there might be large data out in the field that someone might want to look at half a world away.

Sure. So one of our first field trials was with a company in the energy space. They would go out and do ocean exploration and gather seismic and geological information, and then they would come back to port in South America. The problem was that all the compute capability was up in Houston. So we actually convinced them to run a test of our technology, just to show both the data movement and the not having to move it. They moved the data over the existing infrastructure. They had four gig of bandwidth and some WAN optimization technology, and it took them 15.5 hours. And they used our technology in parallel to move a terabyte of data on a one gig connection, and it took just a little over two hours. But that really wasn't the disruptive part. Yes, we're faster. It just proves we're faster. The disruptive part was when we said to them, okay, now run the actual application against the data sitting 5,000 miles away and compare it to running it locally. And they did that. They actually ran it with one tenth of the bandwidth they had on the LAN, over 5,000 miles. And the difference in time to run the application was less than one second, local versus remote, with our technology. There are other examples I could give you of that.

I mean, you've opened up this one pipe, as it were, to look like local performance.
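The field-trial figures can be sanity-checked with some quick arithmetic. What follows is a minimal sketch, not anything from Vcinity: it assumes a decimal terabyte (10^12 bytes), that "gig" means gigabits per second, and that the same terabyte was sent over both links. The function names are made up for illustration.

```python
def line_rate_hours(data_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time in hours if the link is kept completely full."""
    return data_bytes * 8 / link_bits_per_second / 3600


def effective_gbps(data_bytes: float, hours: float) -> float:
    """Average throughput in Gbit/s actually achieved over a transfer."""
    return data_bytes * 8 / (hours * 3600) / 1e9


TERABYTE = 1e12   # assumption: decimal terabyte
ONE_GIG = 1e9     # assumption: "one gig" = 1 Gbit/s
FOUR_GIG = 4e9    # assumption: "four gig" = 4 Gbit/s

# Best case for 1 TB on the 1 Gbit/s link.
ideal_hours = line_rate_hours(TERABYTE, ONE_GIG)
print(f"{ideal_hours:.2f} hours")           # 2.22 hours

# Average rate of the 15.5 hour transfer, and the share of the 4 Gbit/s link it used.
legacy_gbps = effective_gbps(TERABYTE, 15.5)
legacy_utilization = legacy_gbps * 1e9 / FOUR_GIG
print(f"{legacy_gbps:.3f} Gbit/s")          # 0.143 Gbit/s
print(f"{legacy_utilization:.1%} of link")  # 3.6% of link
```

Under those assumptions, "a little over two hours" is about what a fully used 1 Gbit/s link allows, while the 15.5 hour transfer averaged under 4% of the available 4 Gbit/s.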
You're accessing large data to the point where you don't need to move the data. This is just a little bit mind-blowing. Because if we don't have to move the data to use it, again, there's just a whole lot of benefits that could accrue because of that. I mean, CIOs everywhere should be saying, hey, wait a minute, why are we moving data? It could get hacked when you move it, right? We've got this recent MOVEit hack out there, with everyone trying to move their large data sets around the world, and the data becomes vulnerable. They should have left the data where it was sitting.

Any time you move it, you become vulnerable. The more copies you have, the larger your attack vector is. When we do move it, or even when we access it, we can apply AES-256 encryption. And by the way, that's without affecting performance; it's virtually line-rate performance as well, Mike.

All right. And just to be clear, this is not because you're deploying large bricks of stuff around the world wherever someone needs this access point, right? This is something that has been brought into the modern software realm.

Absolutely. Yeah. We don't sell technology by the pound, as one of my sales guys likes to say; we sell it for the value. It's software. At worst you might need a commodity PCI card. We don't make them. We don't sell them. We'll tell you which ones to buy, and we load our firmware on them. If you're looking at north of 40 and 100 gig over the WAN, we need a little bit of that hardware assist. But otherwise it's software. And by the way, the software alone is getting more scalable and higher throughput day by day, so keep watching that space as well.

Right. And I know the next great frontier for companies like NVIDIA, once they've got everyone buying GPUs, is to sell them DPUs to help the data center servers offload some of their communication and data processing within the data center.
This sounds like you're really talking about the remote data center thing that has to come next, right? The next evolution of this already.

So think of us as sort of the wide area DPU, except we're not a piece of hardware, right? That's really what we are. And as I think I mentioned earlier, Jensen, in one of his keynotes, I'm not sure if it was last year or this year, talked about a data center being the, quote, new unit of compute. Well, if you believe that, then I think you need wide area DPUs, don't you?

You've got to connect your data centers. That's the thing. All right. So there's so much to dive into here, and we're kind of running out of time. I would just say to the audience, if this hasn't tickled anything you've got on your to-do list, I don't know what will. But Harry, if someone wants to look a little bit more into Vcinity and get a little more information on it, do you have a website or any resources you want to point people at?

Sure. www.vcinity.io. Lots of information. Reach out. We'd love to talk to you. We'd love to let you try our software and experiment with it. You will believe it after you use it.

All right, that's it, folks. You've got data. You've got to get it accessed. Don't move it anymore. That's the message here. Loving it. Thank you so much for being here today, Harry.

Thank you, Mike. Have a great day.

All right. Take care, folks.