VIBE
Developer Tools · Open Source · Tool · 1mo ago · 7.4k

About

A high-performance CUDA library for FP8/FP4 tensor operations in large language models, featuring optimized GEMM kernels, MoE fusion, and runtime JIT compilation. Designed for NVIDIA GPUs, with clean, accessible code suited to learning GPU optimization techniques.

Tags

cuda, gpu, machine-learning, tensor, optimization, llm, nvidia, performance
