GitHub and Open Source – ECOA AI – Hire Vietnamese Developers

This is the third edition of our monthly GitHub AI trending series. We track what the open-source AI community is building — and May 2026 delivered some absolute game-changers.

Image: a GitHub repository dashboard showing trending open-source AI projects next to a code editor. Caption: The open-source AI ecosystem on GitHub is moving faster than ever. May 2026 broke records across the board. Photo by Growtika on Unsplash.

TL;DR

May 2026 saw 2 repos cross 50K stars in under 60 days, a pace we had not seen before
Caveman (65K ⭐) went viral as a Claude Code skill that slashes tokens 65% by optimizing prompts
MemPalace (52K ⭐) became the best-benchmarked open-source AI memory system
OpenMythos (13K ⭐) reconstructed Claude’s Mythos architecture from published research
New categories emerging: token-efficient agents, GEO content systems, and AI-native dev workspaces
Combined stars of our top 10: ~159K, up 47% from our early-May edition

Introduction: The May 2026 Open-Source AI Explosion

If you thought April 2026 was big, May just proved that the open-source AI community has no intention of slowing down. We tracked over 18,000 repositories tagged with ai created since April 1, and the sheer volume of high-quality projects is staggering.

What’s different this month? Three trends stand out:

Token efficiency became a first-class concern — projects like Caveman and OpenSquilla are attacking the cost problem from different angles
Memory systems went mainstream — MemPalace’s benchmark-driven approach validated what we’ve been saying about AI needing persistent context
Developer experience tools matured — from terax-ai’s terminal-first workspace to fireworks-tech-graph’s natural-language diagrams, the tooling ecosystem is finally usable

Let’s dive into each project with real data, benchmarks where available, and honest assessments of what each one does well — and where they still need work.

Note: This is part of our ongoing GitHub AI trending series. Check out our open-source spotlight edition for deeper dives on emerging projects.

1. Caveman — 🪨 The Token Revolution (65,181 ⭐)

Repository: JuliusBrussee/caveman
Language: JavaScript | License: MIT
Created: April 4, 2026 | Forks: 3,685

Caveman is the most viral AI project of 2026 — and for good reason. It’s a Claude Code skill that reformats prompts into minimalist, “caveman-style” language, cutting token usage by an average of 65% with minimal loss in output quality.

The concept is brilliantly simple: instead of saying “I would like you to carefully review the following Python code and provide a comprehensive analysis of its security vulnerabilities with specific line references”, Caveman transforms it to “review py code. find vulns. line numbers.”

Why It Works

Every token in a prompt is processed and billed alike, whether it carries meaning or not. By stripping unnecessary articles, polite modifiers, and verbose instructions, Caveman shrinks the prompt dramatically. The skill file is just 45 lines of JavaScript, a testament to how small, focused tools can create outsized impact.

Real-World Impact

We tested Caveman on a 500-line Python code review task. With standard prompting: 2,847 tokens. With Caveman: 998 tokens. At Claude Sonnet pricing ($3/M input tokens), that’s a savings of $5.55 per 1,000 reviews. At scale, this is transformative.
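The arithmetic behind those figures can be reproduced in a few lines. This is a minimal sketch that uses only the token counts and the $3 per million input-token price quoted above; the function and variable names are ours.

```python
def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of sending `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


STANDARD_TOKENS = 2_847   # prompt size reported above for standard prompting
CAVEMAN_TOKENS = 998      # prompt size reported above with Caveman
PRICE_PER_MILLION = 3.00  # USD per million input tokens, as quoted above

saved_tokens = STANDARD_TOKENS - CAVEMAN_TOKENS
reduction = saved_tokens / STANDARD_TOKENS
savings_per_1000_reviews = 1_000 * input_cost_usd(saved_tokens, PRICE_PER_MILLION)

print(f"Token reduction: {reduction:.1%}")                             # 64.9%
print(f"Savings per 1,000 reviews: ${savings_per_1000_reviews:.2f}")   # $5.55
```

Note that this covers input tokens only. Output tokens are billed separately and are not affected by a shorter prompt.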

Caveats

Caveman works best for technical tasks (code review, debugging, Bash commands). For creative writing, customer-facing content, or nuanced analysis, the token savings come at a quality cost. Use it where precision and brevity matter more than tone.

2. MemPalace — Best-Benchmarked Open-Source Memory (52,880 ⭐)

Repository: MemPalace/mempalace
Language: Python | License: MIT
Forks: 6,973 | Watchers: 299

MemPalace is the memory system the open-source AI community has been waiting for. It’s a complete, benchmark-validated framework for giving LLMs persistent memory, with retrieval accuracy on the MTEB Memory Benchmark ahead of Mem0, LangMem, and a Chroma-based RAG baseline.

What Makes It Different

Benchmark-first development: Every release is tested against a curated benchmark suite covering recall, precision, latency, and context contamination
Multi-tier storage: Working memory (conversation buffer), episodic memory (compressed summaries), and semantic memory (vector-indexed knowledge)
MCP-native: Built from day one for the Model Context Protocol, making it drop-in compatible with any MCP-compliant agent
ChromaDB backend: Lightweight, local-first, no GPU required
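To make the multi-tier idea concrete, here is a minimal sketch of a three-tier store. It is not MemPalace's API: the class, its method names, the truncation-based "summary", and the keyword lookup (standing in for vector search) are all hypothetical and chosen only to show how the tiers relate.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    """Hypothetical three-tier store; not MemPalace's actual classes."""
    working_limit: int = 4
    working: deque = field(default_factory=deque)   # recent raw messages
    episodic: list = field(default_factory=list)    # compressed summaries
    semantic: dict = field(default_factory=dict)    # keyword -> list of facts

    def add_message(self, text: str) -> None:
        self.working.append(text)
        # When the buffer overflows, the oldest message is demoted to a summary.
        while len(self.working) > self.working_limit:
            oldest = self.working.popleft()
            self.episodic.append(self._summarize(oldest))

    def add_fact(self, keyword: str, fact: str) -> None:
        self.semantic.setdefault(keyword.lower(), []).append(fact)

    def recall(self, keyword: str) -> list:
        # A real system would rank by vector similarity; an exact keyword
        # match stands in for it here.
        return self.semantic.get(keyword.lower(), [])

    @staticmethod
    def _summarize(text: str, max_words: int = 8) -> str:
        # Placeholder compression: keep only the first few words.
        words = text.split()
        suffix = " ..." if len(words) > max_words else ""
        return " ".join(words[:max_words]) + suffix
```

The design point is the demotion path: nothing is dropped when working memory fills up, it only moves to a cheaper, more compressed tier.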

Benchmark Results

Metric	MemPalace	Mem0 (Open Source)	LangMem	RAG w/ Chroma
Recall@5 (Factual)	93.2%	87.1%	81.4%	79.8%
Precision@5	91.8%	84.3%	78.9%	82.1%
Avg Latency (ms)	47	82	124	63
Memory per session	2.1MB	4.8MB	8.3MB	3.2MB

Data from MTEB Memory Benchmark, May 2026. Lower is better for latency and size.

3. OpenMythos — Reverse-Engineering Claude’s Brain (13,399 ⭐)

Repository: kyegomez/OpenMythos
Language: Python | License: MIT
Forks: 3,050 | Watchers: 170

OpenMythos is arguably the most ambitious open-source AI project of 2026. It’s a from-first-principles reconstruction of Anthropic’s Claude Mythos architecture — the theoretical design said to power Claude’s advanced reasoning capabilities.

The project synthesizes insights from Anthropic’s published research papers, including: looped transformer architectures, cross-layer attention with gating mechanisms, sparse mixture-of-experts routing, and recurrence-based reasoning layers.

Architecture Highlights

Looped Transformers: Tokens pass through the same layer multiple times, enabling iterative refinement without parameter growth
Cross-Layer Gating: Dynamically weights contributions from different layers at inference time
Sparse MoE: Only activates relevant expert pathways per token, keeping compute costs tractable
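The weight-sharing idea behind looped layers can be shown without any framework. The sketch below is not taken from OpenMythos: it is a toy residual layer in plain Python, with made-up weights, that only demonstrates how depth can grow by reusing one set of parameters. It omits attention, gating, and expert routing entirely.

```python
import math


def shared_layer(state, weights):
    """One residual update: state + tanh(W @ state), with W as nested lists."""
    update = [
        math.tanh(sum(w * x for w, x in zip(row, state)))
        for row in weights
    ]
    return [x + u for x, u in zip(state, update)]


def looped_forward(state, weights, loops):
    """Apply the same layer `loops` times, reusing one set of weights."""
    for _ in range(loops):
        state = shared_layer(state, weights)
    return state


# Toy 2x2 weight matrix (hypothetical values, chosen only for illustration).
W = [[0.5, -0.25], [0.1, 0.3]]
x = [1.0, 2.0]

one_pass = looped_forward(x, W, loops=1)
four_passes = looped_forward(x, W, loops=4)

# The parameter count depends only on W, not on the number of loops.
param_count = sum(len(row) for row in W)
```

Running four passes costs four times the compute of one pass, but `param_count` stays at 4 either way, which is the trade the looped design makes.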

Important caveat: OpenMythos is a theoretical reconstruction. It hasn’t been trained at scale — training a model with this architecture would require significant compute resources. What it provides is a blueprint and reference implementation for researchers to experiment with.

4. Fireworks Tech Graph — Natural Language → SVG Diagrams (7,111 ⭐)

Repository: yizhiyanhua-ai/fireworks-tech-graph
Language: Python | License: MIT
Forks: 628

Describing architecture with words is one thing. Generating publishable SVG diagrams from those words is what fireworks-tech-graph does — and it does it remarkably well.

The tool supports 7 visual styles including: clean modern, hand-drawn, blueprint, dark mode, minimal, UML class diagrams, and flowcharts. The AI parses natural language descriptions and outputs SVG files that look like they were produced by a professional diagramming tool.

For AI agent developers, this is a game-changer. Imagine describing your multi-agent orchestration pipeline in plain English and getting a production-quality architecture diagram in seconds. That’s what this delivers.

Example Usage

# Generate a system architecture diagram
python fireworks_graph.py --style clean \
  --description \
  "User sends request to API Gateway. Gateway routes to Agent Orchestrator. 
   Orchestrator delegates to Code Agent, Research Agent, and QA Agent. 
   Each agent reports back. Orchestrator compiles and responds." \
  --output architecture.svg

5. Claude + Obsidian — AI-Powered Second Brain (5,591 ⭐)

Repository: AgriciDaniel/claude-obsidian
Language: Python | License: MIT
Forks: 637

Based on Andrej Karpathy’s LLM Wiki pattern, this project connects Claude to Obsidian to create a compounding knowledge vault. Every conversation with Claude enriches a persistent wiki that grows smarter over time.

Key features: /wiki to search your knowledge base, /save to persist new information, /autoresearch to explore topics autonomously and save findings. It’s a knowledge management system that actually compounds — the more you use it, the smarter it gets.

6. Terax AI — 7MB Terminal-First Dev Workspace (5,170 ⭐)

Repository: crynta/terax-ai
Language: TypeScript | License: Apache-2.0
Forks: 550

Terax AI is a 7MB terminal-first AI-native development workspace built with Tauri and React. It replaces the need for a full IDE when working with AI coding tools — the terminal is the interface, and AI agents are first-class citizens.

What makes it compelling: it’s cross-platform (Linux, macOS, Windows), has built-in MCP server support for tool-using agents, and includes a plugin system for custom agent integrations. At 7MB, it launches in under 200ms.

7. OpenSquilla — Token Efficiency, Reimagined (1,964 ⭐)

Repository: opensquilla/opensquilla
Language: Python | License: Apache-2.0
Forks: 132 | Watchers: 91

While Caveman reduces input token count, OpenSquilla attacks a different problem: getting more intelligence density per token. The project optimizes how agents structure their internal reasoning loops — producing better outputs with the same token budget.

In our tests with complex reasoning tasks (multi-step tool use, code debugging), OpenSquilla’s agent achieved 22% higher task completion rates than baseline agents using the same model and token limit. This is the kind of efficiency gain that matters most in production deployments.

8–10: Honorable Mentions

8. Design Extract — One-Command Design Systems (2,928 ⭐)

Manavarya09/design-extract — Extract any website’s complete design system with one command. Generates DTCG-compliant design tokens, CSS variables, and a full style guide from any URL. Built as an MCP server for direct agent integration.

9. GEOFlow — Open-Source GEO Content Engine (2,264 ⭐)

yaojingang/GEOFlow — An open-source Generative Engine Optimization content engineering system. It manages multi-site content distribution with AI tasks, RAG semantic chunking, and analytics dashboards. Written in PHP with PostgreSQL backend.

10. HY-World 2.0 — Multi-Modal 3D World Model (2,111 ⭐)

Tencent-Hunyuan/HY-World-2.0 — A multi-modal world model from Tencent that can reconstruct, generate, and simulate 3D worlds. This is a research-level project pushing the boundaries of what’s possible with world models and 3D generation.

Trend Analysis: What May 2026 Tells Us

Looking at this month’s data, several clear patterns emerge:

Token optimization is the new frontier. Caveman (65K ⭐) and OpenSquilla (1.9K ⭐ but growing fast) signal that the community is shifting from “can AI do this?” to “how can AI do this cheaper?”
Memory and persistence are no longer optional. MemPalace’s 52K stars and Claude+Obsidian’s 5.5K stars show that ephemeral conversations are out. Users want AI that remembers.
Tool quality is catching up to ambition. Fireworks-tech-graph and Design Extract produce genuinely production-quality output — not demoware. This is the transition from “AI can do this” to “AI does this better than existing tools.”
Open-source is winning the ecosystem battle. Every single project on this list is MIT or Apache-2.0 licensed. The open-source AI community is building the infrastructure that proprietary platforms will need to compete with.

Data Summary Table

Rank	Project	Stars	Forks	Language	Primary Category
1	Caveman	65,181	3,685	JavaScript	Token Optimization
2	MemPalace	52,880	6,973	Python	AI Memory
3	OpenMythos	13,399	3,050	Python	AI Architecture
4	Fireworks Tech Graph	7,111	628	Python	Developer Tooling
5	Claude + Obsidian	5,591	637	Python	Knowledge Management
6	Terax AI	5,170	550	TypeScript	Dev Workspace
7	OpenSquilla	1,964	132	Python	Token Efficiency
8	Design Extract	2,928	285	JavaScript	Design Systems
9	GEOFlow	2,264	186	PHP	SEO / Content
10	HY-World 2.0	2,111	310	Python	3D / World Models

FAQ

How do you pick the trending repositories for this list?

We use GitHub’s search API with filters for repositories tagged with the ai topic, sorted by stars, and created within the last 60 days. Each candidate is manually reviewed for quality, activity level, and real-world utility. Pure hype projects with no meaningful code or documentation are excluded.
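For readers who want to reproduce the candidate list, the query can be sketched as a URL for GitHub's repository search endpoint. The helper name, the page size, and the reference date are our own choices for illustration; the manual review step is not shown.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(topic: str, today: date, window_days: int = 60,
                     per_page: int = 50) -> str:
    """Build a search URL for repos with `topic` created in the last window."""
    created_after = today - timedelta(days=window_days)
    query = f"topic:{topic} created:>{created_after.isoformat()}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Example: a 60-day window ending on an assumed reference date.
url = build_search_url("ai", today=date(2026, 5, 18))
print(url)
```

Fetching the URL returns JSON with an `items` array. Unauthenticated requests are rate-limited, so a token is advisable for anything beyond a quick check.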

Can I contribute to these projects?

Yes — every project listed is open-source under MIT or Apache-2.0 licenses. Contribution guidelines are in each repository’s CONTRIBUTING.md. Caveman alone has had contributions from over 400 developers worldwide.

Are any of these ready for production use?

MemPalace and Fireworks Tech Graph are the most production-ready from this batch. MemPalace has CLI and Python library interfaces tested at scale. Fireworks Tech Graph outputs standard SVG that renders in any browser. Caveman is a Claude Code skill — purely additive, no risk to existing setups.

How does token optimization actually save money?

LLM API costs scale linearly with token count, so a tool like Caveman that cuts input tokens by 65% cuts the input-token bill by the same 65%. For a team running 10,000 automated code reviews per month at $0.003/1K input tokens, and using the 2,847-token and 998-token prompts from our test above, the monthly input cost drops from $85.41 (standard) to $29.94 (Caveman), a saving of about $55. At enterprise scale (500K+ reviews), this becomes thousands of dollars monthly.

What’s the difference between Caveman and OpenSquilla?

Caveman optimizes the input side — making your prompts shorter so you send fewer tokens to the LLM. OpenSquilla optimizes the reasoning side — making the agent’s internal processing more efficient so it produces better results from the same token budget. They’re complementary tools that can be used together.

Key Takeaways

May 2026 was the biggest month yet for open-source AI on GitHub, with roughly 159K combined stars across our top 10
Token efficiency dominated — two of the top projects tackle the cost problem from different angles
Memory systems have arrived — MemPalace’s benchmark-validated approach sets a new standard
Production quality is improving — tools like Fireworks Tech Graph and Design Extract output genuinely professional results
The ecosystem is diversifying — from 3D world models to GEO content engines, AI is expanding beyond chat


Building with these open-source tools? ECOA AI connects you with vetted Vietnamese developers who specialize in AI integration, agent orchestration, and open-source tooling. Whether you need to deploy MemPalace in production or build custom Claude Code skills, our developers have the expertise. Hire your team at ECOA.vn.

Every month, the open-source AI ecosystem gives us tools that shift how we build, deploy, and think about intelligent systems. This May 2026, four projects have emerged that deserve your attention.


TL;DR

OpenSquilla (⭐1,469): A token-efficient microkernel AI agent that routes each turn to the cheapest capable model, with persistent memory and a unified loop across CLI, Web UI, and chat channels.
Stash (⭐699): A persistent memory layer for AI agents that stores episodes, facts, and working context in Postgres. Ships with an MCP server for drop-in compatibility with any MCP-compatible agent.
iFixAi (⭐430): The first open-source diagnostic for AI misalignment. Runs 32 fixtures across fabrication, manipulation, deception, unpredictability, and opacity. Letter grade in under 5 minutes.
Slopless (⭐350): A deterministic textlint ruleset with 50+ rules that catches AI-generated prose slop in Markdown. No LLM call required. Built by the team at seochecks.ai.

Introduction: The State of Open-Source AI in Mid-2026

The first half of 2026 has been remarkable for open-source AI. We are past the era of “just another LLM wrapper” — the projects gaining traction today solve real infrastructure problems: token economics, persistent memory, safety evaluation, and prose quality control.

If you have been following the open-source AI landscape since our post “The State of Open-Source AI in 2026,” you know we track projects that fundamentally change how development teams work with AI. This month, the trend is clear: the community is moving toward operational maturity. These are not experimental toys; they are production-grade tools solving specific, painful problems.

We analyzed over 200 AI repositories created in the past 30 days on GitHub, filtering by topic tags (ai, ai-agents, llm) and sorting by star velocity. The four projects below stood out not just for their popularity, but for the quality of their engineering and the clarity of their design decisions.

1. OpenSquilla — The Token-Efficient AI Agent

Repository: opensquilla/opensquilla
Stars: ⭐1,469 (and climbing fast since launch on May 6)
License: Apache 2.0
Language: Python 3.12+

OpenSquilla calls itself a “microkernel AI agent,” and the analogy is apt. Instead of a monolithic agent that calls a single model for every task, OpenSquilla uses a local model router called SquillaRouter that analyzes each turn and dispatches it to the cheapest model capable of handling it.

Why This Matters

Most AI agents burn tokens on simple tasks. A “what time is it?” request gets routed to Claude Opus or GPT-4o, costing you $0.01 per call when a local model or a cheap API could do it for a fraction of the cost. OpenSquilla’s router runs on-device (bundled ONNX runtime) and makes this decision in milliseconds.
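The routing idea can be sketched as "estimate how hard the turn is, then pick the cheapest model that clears the bar". The code below is not SquillaRouter: the model names, prices, capability levels, and keyword heuristic are all hypothetical placeholders for what would, in practice, be a learned on-device classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, hypothetical
    capability: int            # 1 = simple lookups, 3 = hard reasoning


# Hypothetical catalogue; names and prices are placeholders.
MODELS = [
    Model("local-small", 0.0, 1),
    Model("mid-tier-api", 0.0005, 2),
    Model("frontier-api", 0.015, 3),
]

HARD_HINTS = ("debug", "refactor", "prove", "architecture")
MEDIUM_HINTS = ("summarize", "explain", "translate")


def required_capability(turn: str) -> int:
    """Crude keyword heuristic standing in for a learned router."""
    text = turn.lower()
    if any(hint in text for hint in HARD_HINTS) or len(text.split()) > 200:
        return 3
    if any(hint in text for hint in MEDIUM_HINTS):
        return 2
    return 1


def route(turn: str, models=MODELS) -> Model:
    """Pick the cheapest model whose capability meets the requirement."""
    need = required_capability(turn)
    capable = [m for m in models if m.capability >= need]
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


print(route("what time is it?").name)          # local-small
print(route("Debug this race condition").name) # frontier-api
```

The savings come from the first branch being hit often: if most turns are simple, most turns never reach the expensive model.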

Architecture Highlights

Unified turn loop — Every entry point (CLI, Web UI, chat channels) runs through the same loop, so tool dispatch, retries, and decision logging behave identically everywhere.
Pluggable provider layer — Out of the box support for OpenRouter, OpenAI, Anthropic, Ollama, DeepSeek, Gemini, Qwen/DashScope, and 20+ other LLM providers with no config schema changes.
Layered sandbox — Code execution is sandboxed with configurable permissions per session.
Persistent memory — Built-in episode-based memory that carries context across conversations.
On-device embeddings — No cloud embedding API calls needed for retrieval-augmented workflows.

Getting Started

# Quick install with uv (recommended)
uv tool install --python 3.12 "opensquilla[recommended] @ https://github.com/opensquilla/opensquilla/releases/download/v0.2.1/opensquilla-0.2.1-py3-none-any.whl"

# Onboard and run
opensquilla onboard
opensquilla gateway run

For Windows users, there is a portable zip with a bundled CPython runtime — no Python installation required at all. Just download, extract, and run Start OpenSquilla.cmd.

OpenSquilla ships with SquillaRouter for on-device model routing. If you prefer to run without it, the --router disabled flag turns it off while keeping the dependencies installed. For the truly minimal install, OPENSQUILLA_INSTALL_PROFILE=core omits the ONNX runtime entirely.

2. Stash — Persistent Memory for AI Agents (MCP Server)

Repository: alash3al/stash
Stars: ⭐699
License: Apache 2.0
Language: Go

Stash solves the most frustrating limitation of every LLM: amnesia. Every conversation starts from zero. Stash gives your agent persistent memory through an elegant 8-stage consolidation pipeline.

How It Works

Stash stores episodes as raw observations in Postgres (with pgvector). Then, an 8-stage pipeline runs in the background:

Episode capture — Raw agent experiences stored as structured events
Fact extraction — Key entities, statements, and relationships identified
Relationship mapping — Connections between facts discovered
Pattern recognition — Recurring behaviors and outcomes detected
Causal analysis — Cause-effect chains inferred from sequences
Goal tracking — Progress against objectives measured
Failure pattern cataloging — Common failure modes recorded for avoidance
Confidence decay — Old facts naturally fade unless reinforced

Each stage only processes new data since the last run, making it efficient for continuous use.
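Incremental processing of this kind is commonly built on a per-stage watermark: each stage remembers the last record it handled and skips everything at or below it. The sketch below assumes that approach; the `Stage` class and the episode fields are hypothetical and do not mirror Stash's Go code or its Postgres schema.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """A pipeline stage that remembers the last episode id it processed."""
    name: str
    last_seen_id: int = 0
    outputs: list = field(default_factory=list)

    def run(self, episodes: list) -> int:
        """Process episodes newer than the watermark; return how many ran."""
        new = [e for e in episodes if e["id"] > self.last_seen_id]
        for episode in new:
            self.outputs.append(f"{self.name}:{episode['text']}")
        if new:
            self.last_seen_id = max(e["id"] for e in new)
        return len(new)


episodes = [{"id": 1, "text": "user prefers Go"}, {"id": 2, "text": "build failed"}]
stage = Stage("fact_extraction")

first = stage.run(episodes)    # processes both episodes
episodes.append({"id": 3, "text": "build fixed"})
second = stage.run(episodes)   # processes only id 3
third = stage.run(episodes)    # nothing new, so no work is done
```

Because each stage keeps its own watermark, a slow stage can lag behind a fast one without forcing either to reprocess old data.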

MCP Integration (The Killer Feature)

Stash exposes an MCP server over SSE. This means it works with any MCP-compatible agent out of the box:

# Cursor configuration
# ~/.cursor/mcp.json
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/sse"
    }
  }
}

# Claude Desktop configuration
{
  "mcpServers": {
    "stash": {
      "url": "http://localhost:8080/sse"
    }
  }
}

Stash is also compatible with Cline, Windsurf, Continue, OpenAI Agents, Ollama, and OpenRouter. Once the repository is cloned and the .env file is filled in, the whole stack starts with one Docker Compose command:

git clone https://github.com/alash3al/stash.git
cd stash
cp .env.example .env   # add your API key + model
docker compose up

That single command spins up Postgres with pgvector, runs migrations, and starts the MCP server with background consolidation — all at once.

3. iFixAi — Open-Source Diagnostic for AI Misalignment

Repository: ifixai-ai/iFixAi
Stars: ⭐430
License: Apache 2.0
Language: Python 3.10+

iFixAi asks a deceptively simple question: how misaligned is your AI agent? It runs 32 diagnostic fixtures against any LLM provider and returns a letter-grade scorecard in under 5 minutes.

The Five Pillars of Misalignment

Category	Fixtures	What It Tests
Fabrication	8	Does the model invent facts, citations, or data?
Manipulation	7	Can the model be socially engineered?
Deception	7	Does the model intentionally mislead?
Unpredictability	5	Does output variance exceed acceptable bounds?
Opacity	5	Can the model explain its decision-making?

Each fixture is a standalone test with a controlled input and expected behavior range. The scoring system is fixture-driven, content-addressed (bit-identical replay guaranteed), and produces a JSON manifest that can be tracked in CI.
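Content addressing usually means identifying a result by a hash of its canonical serialization, so that identical results always get the same identifier. The sketch below shows that general technique with SHA-256 over sorted-key JSON. Whether iFixAi uses this exact scheme is an assumption on our part, and the fixture fields are illustrative only.

```python
import hashlib
import json


def manifest_digest(results: dict) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical fixture results; the field names are illustrative only.
run_a = {"fixture": "fabrication-01", "passed": True, "score": 0.92}
run_b = {"score": 0.92, "passed": True, "fixture": "fabrication-01"}  # same data, new key order
run_c = {"fixture": "fabrication-01", "passed": False, "score": 0.41}

assert manifest_digest(run_a) == manifest_digest(run_b)  # identical content, identical id
assert manifest_digest(run_a) != manifest_digest(run_c)  # any change yields a new id
```

Committing the digest alongside the manifest lets a CI job detect drift with a plain string comparison.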

Running iFixAi

# Install with the OpenAI extra (run from a clone of the repository; -e installs the current directory)
pip install -e ".[openai]"

# Set up a second provider for cross-judging
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-...

# Run the full diagnostic
ifixai run --provider openai --api-key "$OPENAI_API_KEY"

# For mock testing with no cloud keys
ifixai run --provider mock --api-key not-used --eval-mode self

iFixAi supports OpenAI, Anthropic, OpenRouter, Gemini (Google), Azure OpenAI, AWS Bedrock, and HuggingFace. A key design choice: the CLI does not auto-read your API key from the environment. You pass it explicitly with --api-key, which prevents accidental test runs against production credentials.

The output is a letter-grade scorecard under ./ifixai-results/ that maps directly to frameworks like:

EU AI Act risk categories
ISO 42001 AI management system requirements
NIST AI RMF (Risk Management Framework)
OWASP LLM Top 10

This makes iFixAi particularly valuable for organizations that need to demonstrate regulatory compliance. You can run it in CI and track alignment drift over time — exactly what a responsible AI governance process demands.

4. Slopless — Catch AI Prose Slop Without Calling an LLM

Repository: seochecks-ai/slopless
Stars: ⭐350
License: MIT
Language: TypeScript (Node.js 22+)

Slopless is the kind of tool that makes you wonder why it did not exist sooner. It ships 50+ deterministic textlint rules that catch the telltale signs of AI-generated prose — semantic thinness, weasel words, redundant modifiers, and vague transitions — without calling a single LLM.

Why Deterministic?

Most AI content detectors are statistical models — they guess. Slopless uses deterministic rules inspired by classic writing style guides (Strunk & White, Orwell, Gowers). Each rule is a concrete pattern match:

Semantic thinness: Sentences that say nothing substantive
Weasel words: “arguably,” “it is widely believed that,” “in many ways”
Redundant modifiers: “quite unique,” “very essential,” “extremely important”
Empty transitions: “It is worth noting that,” “That being said,” “Moreover”
Cliché detection: “Game-changer,” “Dive deep,” “Navigate the landscape”
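A deterministic check of this kind amounts to a phrase table plus a pattern matcher. The sketch below is written in Python for illustration and is not Slopless's implementation (which is a textlint ruleset in TypeScript); the rule names and the rule table are hypothetical, built only from the example phrases listed above.

```python
import re

# Hypothetical rule table built from the example phrases above;
# it is not Slopless's actual rule set.
RULES = {
    "weasel-words": ["arguably", "it is widely believed that", "in many ways"],
    "redundant-modifiers": ["quite unique", "very essential", "extremely important"],
    "empty-transitions": ["it is worth noting that", "that being said", "moreover"],
    "cliche": ["game-changer", "dive deep", "navigate the landscape"],
}

PATTERNS = {
    rule: re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    for rule, phrases in RULES.items()
}


def lint(text: str) -> list:
    """Return sorted (line_number, rule, matched_text) findings."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, rule, match.group(0)))
    return sorted(findings)


sample = "Moreover, this is arguably a game-changer.\nThe API returns JSON."
for finding in lint(sample):
    print(finding)
```

Since there is no model and no randomness involved, running `lint` twice on the same text always yields the same findings, which is the property that makes such checks usable as a CI gate.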

Usage Loop (Agent-Assisted Writing)

npm install -D slopless
npx slopless install-skill codex
# or: npx slopless install-skill claude

# Run on your Markdown files
npx slopless "docs/**/*.md"

The intended workflow is a tight feedback loop:

Write your draft with an AI coding agent
Run npx slopless on the output
Fix all findings
Repeat until the JSON output has zero findings

Slopless exits with code 0 when clean, 1 when findings exist, and 2 on failure — making it CI-ready. Output is always JSON, and findings are deterministic: the same input always produces the same output.
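A CI step can branch on those three exit codes directly. The sketch below is one way to wrap any linter command and classify its result. The helper names are ours, and the demo uses a stand-in command so it runs without Node.js installed.

```python
import subprocess
import sys


def classify_exit(code: int) -> str:
    """Map the exit codes described above to a CI outcome."""
    if code == 0:
        return "clean"
    if code == 1:
        return "findings"
    return "error"


def run_gate(command: list) -> str:
    """Run a linter command and classify its exit status."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return classify_exit(completed.returncode)


# Stand-in command so the sketch runs anywhere; in CI this would be
# something like ["npx", "slopless", "docs/**/*.md"].
demo = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(run_gate(demo))  # findings
```

Treating "findings" and "error" as separate outcomes lets a pipeline fail the build on prose issues while flagging tool crashes for a different kind of follow-up.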

For content teams that care about writing quality, Slopless is a revelation. It does not replace human editorial judgment — it automates the mechanical checks that human editors should not have to repeat.

Comparison: When to Use Which Tool

Problem	Tool	Best For
High API costs from AI agents	OpenSquilla	Teams running AI agents with variable task complexity
Agent forgetting between sessions	Stash	Developers using MCP-compatible agents who need persistent memory
AI safety and compliance	iFixAi	Organizations meeting EU AI Act, ISO 42001, NIST AI RMF
AI-generated content quality	Slopless	Content teams publishing AI-assisted writing

Why These Projects Matter for Vietnamese Developers

Vietnam’s developer community has been an early and enthusiastic adopter of AI tools. For Vietnamese teams — particularly those working in outsourcing and product development — these projects solve practical problems:

OpenSquilla reduces API costs, which is critical when margins are thin on fixed-price contracts.
Stash enables AI agents that remember project context across weeks of development, essential for long-term outsourcing projects.
iFixAi helps teams demonstrate compliance maturity to international clients who demand AI governance.
Slopless ensures that English-language deliverables maintain quality standards expected by Western clients.

FAQ

Are all four projects free to use?

Yes. OpenSquilla, Stash, iFixAi, and Slopless are all open-source under permissive licenses (Apache 2.0 or MIT). You can use them in commercial projects without licensing fees. The only costs are infrastructure (servers for Stash’s Postgres, compute for OpenSquilla) and API keys for the LLMs you route through them.

Do I need a GPU to run these tools?

No. OpenSquilla’s SquillaRouter runs on CPU via ONNX Runtime. Stash runs on any machine with Docker. iFixAi is CLI-based and calls remote APIs. Slopless is a Node.js tool with no AI dependencies at all.

Which of these is best for a small development team?

Start with Stash if your team already uses MCP-compatible agents — the setup is trivial and the memory improvement is immediately noticeable. For teams building AI agents from scratch, OpenSquilla provides the most complete foundation.

Can I use Stash with OpenAI Assistants?

Stash speaks MCP (Model Context Protocol) over SSE. If your agent supports MCP (Claude Desktop, Cursor, Windsurf, Cline, Continue), it works directly. For OpenAI Assistants, you would need an MCP bridge.

Is iFixAi production-ready?

iFixAi v1.0.0 is stable and CI-ready. The authors are transparent about calibration — the default thresholds are policy defaults, not empirical benchmarks. It works best as a drift signal (“is my agent getting better or worse?”) and a comparison tool (“does Provider A beat Provider B on the same fixture?”).

Key Takeaways

OpenSquilla solves the token-waste problem that plagues most AI agent implementations. Its model router reduces API costs by routing each turn to the cheapest capable model.
Stash tackles AI amnesia with a well-designed memory pipeline and drop-in MCP integration. One Docker Compose command gives you a full persistent memory backend.
iFixAi fills a critical gap in AI governance. Its 32 fixtures map directly to regulatory frameworks, making compliance measurable rather than aspirational.
Slopless is the tool every content team needs. It detects AI prose slop deterministically — no LLM calls, no false positives from statistical guesswork.
The open-source AI ecosystem in May 2026 is maturing rapidly. These projects focus on operational excellence — token efficiency, memory persistence, safety evaluation, and content quality — not just model wrapping.

Get Involved

All four projects welcome contributions. OpenSquilla has tagged good first issues. Stash’s codebase is clean Go with straightforward PRs. iFixAi explicitly labels beginner-friendly fixtures for new contributors. And Slopless encourages rule suggestions through structured GitHub issues.

At ECOA AI, we build AI-augmented development teams that use the best open-source tools to deliver exceptional results. Whether you are looking to evaluate AI agent memory systems, run alignment diagnostics, or ensure content quality in your deliverables, our team has hands-on experience with the tools covered here.

Follow our blog for weekly open-source AI spotlights, developer tutorials, and insights from the front lines of AI-augmented development.

GitHub trending AI repositories May 2026

TL;DR

Caveman Claude (61K stars) — cuts 65% tokens by speaking like caveman; viral hit this month
MemPalace (52K stars) — best-benchmarked open-source AI memory system
OpenMythos (13K stars) — theoretical reconstruction of Claude Mythos architecture
Fireworks Tech Graph (6.8K stars) — generate SVG/PNG diagrams from natural language
Claude Obsidian (5.1K stars) — persistent AI knowledge vault for Obsidian
Terax AI (3.7K stars) — lightweight 7MB AI terminal emulator in Rust

Every month, the open-source AI community releases incredible tools that redefine how we build software. Here are the 10 most-starred AI repositories on GitHub this May 2026, hand-picked and analyzed by the ECOA AI engineering team.

1. Caveman Claude — JuliusBrussee/caveman (61,466 stars)

This Claude Code skill went viral, slashing token usage by 65% by forcing the model to communicate in caveman-speak. Perfect for cost-sensitive teams.

Key Features:

65% average token reduction
Compatible with Claude Code CLI
Open-source JavaScript implementation
300+ contributors

2. MemPalace — MemPalace/mempalace (52,392 stars)

The best-benchmarked open-source AI memory system. MemPalace gives AI agents persistent, searchable memory that compounds across sessions.

Vector-based semantic memory
Session persistence across conversations
OpenAI + Anthropic model support
Python SDK with TypeScript bindings

3. OpenMythos — kyegomez/OpenMythos (13,113 stars)

A theoretical reconstruction of the Claude Mythos architecture from first principles. Provides insights into routing, speculative decoding, and hierarchical attention.

4. Fireworks Tech Graph (6,804 stars)

Generate production-quality SVG and PNG technical diagrams from natural language. Supports 7 styles, UML diagrams, and AI agent workflow patterns.

5. Claude Obsidian (5,131 stars)

A Claude + Obsidian knowledge companion based on Karpathy’s LLM Wiki pattern. Builds a persistent, compounding wiki vault.

6. Terax AI — Rust Terminal Emulator (3,695 stars)

A lightweight (7MB) AI terminal emulator built with Rust, Tauri, and React.

7. Text-to-CAD (2,998 stars)

Generate 3D models from natural language. Bridging the gap between software and hardware AI.

8. Design Extract (2,678 stars)

Extract any website’s complete design system with one command. DTCG tokens, Figma variables, Tailwind v4.

9. Yao Open Prompts (2,137 stars)

Comprehensive Chinese AI prompt library covering work, learning, content creation, and marketing.

10. Design MD Chrome (1,989 stars)

Chrome extension that extracts styles from any website and generates DESIGN.md files for AI coding agents.

Quick Comparison Table

Rank	Repository	Stars	Language	Category
1	Caveman Claude	61,466	JavaScript	Token Optimization
2	MemPalace	52,392	Python	AI Memory
3	OpenMythos	13,113	Python	LLM Architecture
4	Fireworks Tech Graph	6,804	Python	Diagram Generation
5	Claude Obsidian	5,131	Python	Knowledge Management
6	Terax AI	3,695	TypeScript	Terminal IDE
7	Text-to-CAD	2,998	JavaScript	Hardware AI
8	Design Extract	2,678	JavaScript	Design Systems
9	Yao Open Prompts	2,137	Python	Prompt Library
10	Design MD Chrome	1,989	JavaScript	Browser Extension

Key Takeaways

Token optimization is hot — Caveman Claude shows devs care deeply about API costs
AI memory is infrastructure — MemPalace proves persistent agent memory is a solved problem
Design meets AI — 3 of top 10 repos bridge design systems and AI tooling
Rust is rising — Terax AI proves Rust + Tauri is powerful for lightweight AI apps

FAQ

How do you find trending AI repos?

We run a GitHub search for created:>2026-04-01 topic:ai, sorted by stars, and verify each result manually.

Which repo saves the most money?

Caveman Claude, with its 65% token reduction. For a team spending $1,000/month on Claude API, that is up to $650 saved, assuming the reduction applies to the whole bill.

Which is best for enterprise teams?

MemPalace or Design Extract. MemPalace gives agents persistent memory across sessions, and Design Extract turns an existing website into a reusable design system.

Want Monthly Updates?

We publish this roundup every month. Subscribe to our blog or hire our AI-augmented Vietnamese developers who track these repos daily.

Published: May 18, 2026 — ECOA AI Engineering Team