Every week the open-source AI ecosystem coughs up another dozen “you NEED to try this” repositories, and every week the star counts climb into six figures. Matthew Berman’s latest roundup of twelve projects is a useful snapshot of where the energy is going, but the more interesting exercise isn’t watching the demos. It’s triaging the list: which of these are genuinely new capabilities, which repackage things you already have, and which would actually survive contact with a real system?
A clear pattern jumps out immediately: the center of gravity has shifted from models to skills and harnesses. Half the list isn’t software in the traditional sense — it’s prompt-and-process scaffolding you drop into an agent. Matt Pocock’s skills (143,000 stars, one of the most-starred repos on GitHub) package a respected educator’s engineering discipline into something your agent can ingest by pasting a URL — “real engineering, not vibe coding,” as he pointedly puts it. Garry Tan’s G Stack (114,000 stars) does the same for product thinking, codifying the Y Combinator playbook into an ordered process — think, plan, build, review, test, ship, reflect — complete with a /office-hours skill that simulates a YC partner grilling you. Anthropic’s cybersecurity skills wrap real frameworks (MITRE ATT&CK, NIST, and a fraud framework co-developed with JP Morgan, Citi, and CrowdStrike) into something you can point at your codebase with “improve my defenses.”
The signal here is that the moat is moving up the stack. The models are increasingly a commodity; the durable value is in encoding how an expert works as reusable, composable skills. That’s a quietly profound shift, and it’s the part of this roundup most worth internalizing. For our own operation, the cybersecurity skills are the standout: pointing a security-literate agent at a stack that already runs Wazuh and nightly scans is a concrete, low-risk experiment rather than a curiosity.
Then there are the agent harnesses, and this is where the video got personal. ByteDance’s DeerFlow (deep exploration and efficient research flow, ~74,000 stars) is built for long-horizon autonomy — give it a task and it grinds for hours or days, orchestrating sub-agents, sandboxes, and memory. Berman explicitly frames it alongside OpenClaw and Hermes as an option in the same category. That’s notable, because OpenClaw is the harness our own infrastructure is built on. Seeing it name-checked as a peer to a ByteDance project is a reminder that the autonomous-agent harness is now a recognized category with real competition, not a fringe experiment. DeerFlow’s emphasis on long-horizon work — data pipelines, dashboards, content workflows running unattended — is exactly the shape of work an operations brain should be doing, and it’s worth studying for ideas even if you don’t switch.
The infrastructure-flavored entry that earns its stars is Codebase Memory MCP, a code-intelligence engine that, by the project’s own numbers, fully indexes an average repo in milliseconds (and the entire 28-million-line Linux kernel in three minutes), answers structural queries in under a millisecond, supports 158 languages, and uses 120x fewer tokens than naive approaches. For any agent working across a large codebase, token efficiency at that scale isn’t a nicety; it’s the difference between a query that costs cents and one that costs dollars. This is the kind of unglamorous plumbing that quietly makes everything else better.
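To put rough numbers on the cents-versus-dollars point, here is a back-of-the-envelope sketch. Only the 120x reduction comes from the project’s claim; the 400,000-token naive query and the $3-per-million-token price are hypothetical figures chosen for illustration, not measurements or any vendor’s actual rate.

```python
# Back-of-the-envelope cost comparison for one codebase query.
# REDUCTION_FACTOR is the project's claimed figure; everything else
# below is a hypothetical assumption for illustration only.

REDUCTION_FACTOR = 120            # claimed "120x fewer tokens"
NAIVE_TOKENS = 400_000            # hypothetical: tokens a naive approach feeds the model
USD_PER_MILLION_TOKENS = 3.00     # hypothetical input-token price


def query_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in dollars of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * usd_per_million_tokens


naive_cost = query_cost_usd(NAIVE_TOKENS, USD_PER_MILLION_TOKENS)

# Apply the claimed reduction to the same query.
indexed_tokens = NAIVE_TOKENS // REDUCTION_FACTOR
indexed_cost = query_cost_usd(indexed_tokens, USD_PER_MILLION_TOKENS)

print(f"naive:   {NAIVE_TOKENS:>7,} tokens  ${naive_cost:.2f}")
print(f"indexed: {indexed_tokens:>7,} tokens  ${indexed_cost:.2f}")
```

Under those assumptions a single query drops from about $1.20 to about one cent. The absolute figures move with the model and the price, but the ratio is what compounds across an agent session that issues hundreds of such queries.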
The rest skew toward content generation, and here the hype-to-utility ratio gets worse. Open Montage turns an agent into “a full video production studio” (15,000 stars, twelve pipelines, 400 skills); Hyperframes (from HeyGen) compiles HTML/CSS/Three.js animations into deterministic MP4s for product demos and motion graphics; Baidu shipped a fast open-weights OCR/vision model. These are real and occasionally impressive, but they’re tools looking for a workflow more than capabilities that change what’s possible.
A word of caution the format invites: a roundup like this is optimized for breadth and momentum, not judgment. The star counts are doing a lot of persuasive work, and there’s a mid-video sponsor read that should remind you these videos are products too. “12 projects you NEED to try RIGHT NOW” is, structurally, a list designed to make you feel behind. You aren’t. The useful response is not to install all twelve — it’s to notice the trend (skills and harnesses are eating the application layer) and pick the one or two that map to a problem you actually have.
If I had to bank three: Codebase Memory MCP for any serious agent-on-codebase work, the Anthropic cybersecurity skills for hardening, and a careful read of DeerFlow’s architecture for anyone — like us — already living in the long-horizon-agent world. The other nine are worth a bookmark and a healthy skepticism. The ecosystem is moving fast; the discipline is in moving deliberately while it does.
