swllm.cpp is a high-performance C/C++ LLM inference engine by shenwenAI, optimized for local deployment and edge computing. It supports multiple model formats and hardware acceleration backends.
swllm.cpp provides a high-performance, lightweight LLM inference solution.
Built in C/C++ for maximum hardware utilization, with support for AVX2, NEON, and other SIMD instruction sets.
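As a rough illustration of how SIMD support like this is typically wired up (this is a generic sketch, not swllm.cpp's actual code; the function name `simd_backend` is hypothetical), an engine can select a kernel path at compile time based on the instruction sets the compiler targets:

```cpp
#include <string>

// Hypothetical sketch: report which SIMD code path this translation unit
// was compiled for. Real engines dispatch optimized kernels the same way,
// falling back to a portable scalar path when no SIMD target is available.
std::string simd_backend() {
#if defined(__AVX2__)
    return "AVX2";      // x86-64 with 256-bit integer/float vectors
#elif defined(__ARM_NEON)
    return "NEON";      // ARM/Apple Silicon 128-bit vectors
#else
    return "scalar";    // portable fallback
#endif
}
```

Compile-time dispatch keeps each kernel free of runtime branching; some engines additionally probe CPUID at startup to pick among several compiled variants.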
No network connection required: LLMs run entirely on the local device, so data never leaves it.
Supports GGUF, GGML, and other model formats, and is compatible with mainstream open-source LLMs.
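A loader that handles several formats usually sniffs the file header first. GGUF files, for example, begin with the 4-byte magic "GGUF"; the sketch below shows that check (a generic illustration with a hypothetical function name, not swllm.cpp's actual loader):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: detect a GGUF file by its leading magic bytes.
// GGUF files start with the ASCII characters 'G','G','U','F'; a
// multi-format loader can branch on this before parsing further.
bool looks_like_gguf(const uint8_t* data, size_t n) {
    return n >= 4 && std::memcmp(data, "GGUF", 4) == 0;
}
```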
Supports 4-bit, 8-bit, and other quantization schemes, significantly reducing memory usage while largely preserving accuracy.
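To make the memory savings concrete, here is a minimal sketch of symmetric 8-bit block quantization, the general idea behind such schemes (an assumption-laden illustration with hypothetical names like `Q8Block`, not swllm.cpp's actual format): each block of float32 weights is stored as int8 values plus one float scale, roughly a 4x reduction.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical sketch of symmetric 8-bit quantization: one scale per block,
// values mapped to the int8 range [-127, 127].
struct Q8Block {
    float scale;            // dequantization scale for this block
    std::vector<int8_t> q;  // quantized weights (1 byte each vs 4 for float)
};

Q8Block quantize_q8(const std::vector<float>& x) {
    // Scale so the largest magnitude maps to 127.
    float amax = 0.0f;
    for (float v : x) amax = std::max(amax, std::fabs(v));
    float scale = amax / 127.0f;

    Q8Block b{scale, {}};
    b.q.reserve(x.size());
    for (float v : x)
        b.q.push_back(static_cast<int8_t>(
            std::lround(v / (scale != 0.0f ? scale : 1.0f))));
    return b;
}

std::vector<float> dequantize_q8(const Q8Block& b) {
    std::vector<float> out;
    out.reserve(b.q.size());
    for (int8_t v : b.q) out.push_back(v * b.scale);
    return out;
}
```

4-bit schemes follow the same pattern with a [-7, 7] (or asymmetric) range and two values packed per byte, trading a little more reconstruction error for another 2x saving.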
Runs on Linux, macOS, Windows, and other major operating systems, across multiple hardware platforms.
Licensed under the MIT License: free to use, modify, and distribute.
swllm.cpp is an open-source project; developers are welcome to contribute code, report issues, and share ideas.
Visit the GitHub repository to get started.