The world's largest open-source library of AI prompts for ChatGPT, Claude, Gemini, and other AI models. Browse thousands of curated prompts, contribute your own, and self-host for your organization with complete privacy.
A structured set of coding guidelines for Claude AI that addresses common LLM pitfalls like overcomplication, unnecessary changes, and poor assumption management. Based on Andrej Karpathy's observations about how to get better code output from AI assistants.
A C/C++ library for running large language models locally with minimal setup and optimized performance across different hardware architectures. Enables LLM inference on CPUs and GPUs with various quantization options to reduce memory usage.
An open-source OCR toolkit that converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with support for 100+ languages. Features advanced document parsing capabilities and is trusted by major AI projects like Dify and RAGFlow.
An open-source tool that lets you run and manage large language models locally on your machine. It supports popular models like Llama, Gemma, DeepSeek, and Qwen, and provides both a command-line interface and a REST API for easy integration.
Open-source voice cloning desktop application that lets you clone any voice from just 3 seconds of audio and generate natural speech locally. Features multi-voice project composition, audio effects, and support for multiple TTS engines with complete privacy.
A local-first AI workbench for job hunting that scrapes roles from multiple sources, ranks them by fit using transparent criteria, and generates tailored application materials like resumes and cover letters. Designed for people tired of noisy job boards and black-box AI apply tools.
Real-time face swap and deepfake tool that uses AI to replace faces in live video streams or recordings using just a single source image. Includes built-in safety checks and supports multiple execution providers for different hardware configurations.
A mobile app that runs powerful open-source AI models like Gemma 4 entirely on your device for completely private, offline AI experiences. Features include AI chat with thinking mode, image analysis, voice transcription, and agent skills that can interact with external tools.
OpenAI's robust speech recognition model trained on 680k hours of multilingual audio, with strong performance on accents and technical language.
A state-of-the-art memory and context engine for AI that automatically learns from conversations, extracts facts, builds user profiles, and delivers persistent memory across AI interactions. It combines RAG, connectors, and file processing into a single system that makes AI assistants remember you.
A lightweight 64M-parameter GPT model that can be trained from scratch in just 2 hours on a single RTX 3090. Provides a complete training pipeline including pretraining, SFT, LoRA, RLHF, and tool-use capabilities.
A Claude Code skill and plugin that makes AI agents communicate like cavemen, reducing token usage by ~75% while maintaining full technical accuracy. Transforms verbose AI responses into concise, direct answers that save money and increase response speed.
A minimal, fast implementation for training and fine-tuning medium-sized GPT models from scratch. Features clean, readable code, with about 300 lines each for the training loop and the model definition, making it easy to customize and experiment with.
Open-source voice AI framework that includes advanced speech recognition (ASR) for 60-minute audio transcription with speaker diarization, text-to-speech (TTS) for 90-minute multi-speaker synthesis, and real-time streaming TTS. Operates at an ultra-low 7.5 Hz frame rate for efficient long-form audio processing.
A design skills package for AI coding assistants that provides 17 commands and curated anti-patterns to help developers create better frontend designs. Works with popular AI harnesses like Cursor, Claude Code, and Gemini CLI to bring design vocabulary and expertise to AI-generated code.
Meta's open-weight LLM family — 8B, 70B, and 405B parameter models for local and cloud inference.
Open-source alternative to Claude Design that generates web prototypes, mobile apps, presentations, and design systems using local coding agents. Connects with 15+ AI coding CLIs (Claude Code, Cursor, Copilot, etc.) and includes 72 design systems with a local-first workflow.
RuView transforms commodity WiFi signals into real-time human pose estimation, vital sign monitoring, and presence detection without cameras or wearables. It uses Channel State Information (CSI) from WiFi to detect breathing, heart rate, and body position through walls using edge AI on inexpensive ESP32 hardware.
A vectorless, reasoning-based RAG system that builds hierarchical tree indexes from documents and uses LLM reasoning for context-aware retrieval. Eliminates the need for vector databases and chunking while achieving superior accuracy on professional documents.
A benchmark tool that tests whether AI models can detect and challenge nonsensical prompts instead of confidently answering invalid questions. It evaluates models across multiple domains using 100 carefully crafted nonsense questions.
Microsoft Research's structured 3D latent diffusion model for scalable 3D asset generation from text or images. CVPR 2025 Spotlight.
NVIDIA's Python API for defining LLMs and running them on NVIDIA GPUs with state-of-the-art inference optimizations.
Microsoft's official inference framework for 1-bit LLMs — runs large language models with extreme memory and energy efficiency.
Google's cross-platform ML framework for streaming media — face/hand/pose tracking, audio processing, and customizable inference pipelines on mobile and desktop.
Andrej Karpathy's Llama 2 inference in one file of pure C — minimalist reference for understanding LLM inference end-to-end.
Microsoft's prompt and KV-cache compression for LLMs — up to 20x compression with minimal accuracy loss for cheaper, faster inference.
A collection of AI prompting skills that teach AI tools to generate premium, modern frontend code instead of boring, generic interfaces. Improves AI output quality for web design with proper animations, spacing, and visual appeal.
Andrej Karpathy's LLM training implementation in pure C/CUDA — small, readable, performant reference for understanding GPT training from the metal up.
Microsoft's screen-parsing model that turns UI screenshots into structured element data — the perception layer for pure-vision GUI agents.