Transcript
now. Packages that don't just run code, but give an AI agent the ability to execute commands on your behalf. It can access your file system, read your environment variables, maybe even touch your production infrastructure. And Snyk just finished scanning nearly 4,000 of them. What we found should change how you think about installing anything into your AI agent stack. Today I'm going to walk you through exactly what agent skills are, why they're a fundamentally different security problem than traditional packages, what the data shows about the current state of the ecosystem, and most importantly, what you can do right now to protect yourself. By the end of this video, you'll know exactly how to vet any agent skill before it touches your machine using free tools that are available today. On January 20th, 2026, Vercel shipped a project called Skills. It's a CLI tool and open registry at skills.sh for installing capability packages into AI agents. Think of it like NPM, but instead of importing a JavaScript function or module, you're giving your AI agent a new thing it can do. The top skill hit 20,000 installs within six hours of launch. Stripe shipped their own skill that same day. The ecosystem now covers Cloud Code, Cursor, Windsurf, GitHub Copilot, basically every major AI coding agent in active use. The install command is dead simple. You use mpx, skills, add, and then the package name or skills name. And that simplicity is part of the problem. Here's what makes agent skills different from any package you've installed before. A traditional NPM package is code you import. You call specific functions, you control when and how it runs. An agent skill, on the other hand, is code plus natural language instructions. When you install a skill, you're giving an AI agent a new capability and, critically, instructions in plain English about when and how to use it. The agent decides when to invoke that skill based on those instructions. You're one step removed from the execution decision altogether. That's not just a philosophical difference. It has direct security implications. Let me give you three specific ways the threat model here differs from what you're used to. The first way is the blast radius is much larger now. A malicious NPM package can run code, but it typically does so within a constrained environment. Maybe it reads files, maybe it exfiltrates environment variables, but a malicious agent skill inherits all the permissions of the agent itself. Modern agents often have access to your shell, your cloud credentials, your email, your deployment pipeline. One compromised skill means a compromised agent. The second way is the attack surface is novel. You cannot detect prompt injection with a traditional static analysis tool. Agent skills introduce natural language as an attack vector now. A skill can include instructions that look completely benign in isolation, but are designed to manipulate the agent into doing something the user never authorized. Sneak calls these toxic flows. Scenarios where a legitimate looking prompt triggers a malicious action chain. No regex pattern catches that. No traditional SAS scanner catches that. You need a system that understands language and code. The third way is the trust model is inverted. With a regular package, you make a conscious call, require module A or module B. You know you're using it. With agent skills, the agent makes the call based on natural language context. The human is no longer the decision point. The agent is, and that makes the agent the new attack target. Now that you understand why this threat class is genuinely new. Here's what sneak found when we actually went and looked at the ecosystem. Sneak's research team, powered by our acquisition of invariant labs, completed the first comprehensive security audit of the agent skill ecosystem. They scanned 3,984 skills across major marketplaces, including skills.sh. The full technical report for that is linked in the description, but here are the numbers that matter. The ecosystem is growing at an average of 147 new skills per day. At that rate, manual review is not a viable strategy. Community reporting is going to be too slow. The only approach that works at scale is automated, continuous scanning that runs at submission time before anything reaches a developer's machine. That's exactly what sneak built with Vercel. Here's where it gets interesting from a detection standpoint. Our critical level detectors, the ones flagging genuinely malicious skills, achieved 90 to 100% recall on confirmed bad skills while maintaining a 0% false positive rate on the top 100 legitimate skills from skills.sh. That's the bar you need to deploy something as an automated gate. If your scanner cries wolf on legitimate packages, developers are going to begin to ignore it. Getting that precision right is what makes this approach viable. The sneak and Vercel integration works at the infrastructure level. So every time a new skill is installed using the MPX skills installer, Vercel systems call out to sneak scanning API automatically. No action required from the skill author, no action required from the developer installing it. The scanning engine is built on a tool called Agent Scan, which sneak open sourced on GitHub. It combines LLM based analysis with deterministic rules. The LLM component handles the language layer, detecting prompt injection, toxic flows, and natural language manipulation embedded in skill instructions. The deterministic rules handle the code layer, suspicious downloads, insecure credential handling, malicious patterns in the executable components. Scan results surface directly on each skills page on skills.sh or within your CLI during the setup. This is what Vercel is calling security audits or risk assessments. Immediate transparent visibility into the security posture of any skill before you install it. Now remember those three threat vectors we covered? Blast radius, novel attack service, inverted trust model. The scanning architecture is designed to address all three at the point of distribution. With all this, you don't have to wait for someone else to protect your setup. Here are two concrete things you can do today. One, run sneak agent scan on your existing agent configuration. This is the CLI version. It auto discovers your MCP configurations, installed skills and agent tools, then scans them for prompt injections, malicious code, and suspicious behavior. You can run this right now in your terminal. UVX, sneak agent scan at latest with the skills option. Number two, check the security audits before installing anything. Skills.sh now surfaces sneak scan results on every skill page. Make it a habit. The same way you check download counts and maintain a reputation on npm, check the security audits before you let the skill install. Before we wrap up and we're almost there, here are the three things you can take away from this. One, agent skills are not npm packages. They inherit full agent permissions and introduce natural language as an attack vector. The threat model is genuinely new. Two, the ecosystem is growing at 147 skills per day. A sneak scan of nearly 4,000 skills confirms real threats are already present. The window between ecosystem launches and attackers show up is measured in weeks, sometimes days and hours, not months or years anymore. Number three, you have a free working tool to protect yourself today. Sneak agent scan on the command line. Run it to check if what your AI agents are using is secure. If you've got agent skills installed right now that you haven't scanned, go use that uvx command in your terminal and drop your findings in the comments below. I'm curious what the real world numbers look like across developer environments. Again, links to the sneak inverse cell announcement, the full research report and the agent scan tool are all going to be in the description below. That does it for this video. If you got value out of it, be sure to like it down below and share with somebody who could put it to use. And if you made it this far, subscribe to the channel so you don't miss out on upcoming videos. Thanks for watching and happy, safe coding everyone.