Open Source Project

swllm.cpp

swllm.cpp is a high-performance C/C++ LLM inference engine from shenwenAI,
optimized for local deployment and edge computing. It supports multiple model formats
and hardware acceleration, and now runs inference for our latest shenwen-coderV2-GGUF model.
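
A minimal usage sketch, assuming a typical C-style inference API: this page does not document swllm.cpp's actual interface, so every identifier below (swllm_context, swllm_load, swllm_generate, swllm_free) is a hypothetical placeholder, and the stub bodies exist only so the sketch compiles and runs.

```cpp
// Hypothetical usage sketch; these names are placeholders, not swllm.cpp's real API.
#include <cstddef>
#include <cstdio>

struct swllm_context { int dummy; };                 // placeholder engine handle

swllm_context* swllm_load(const char* gguf_path) {   // assumed: load a GGUF model
    std::printf("loading %s (stub)\n", gguf_path);
    static swllm_context ctx;
    return &ctx;
}

int swllm_generate(swllm_context*, const char* prompt,
                   char* out, std::size_t out_len) { // assumed: run inference
    std::snprintf(out, out_len, "(stub completion for: %s)", prompt);
    return 0;
}

void swllm_free(swllm_context*) {}                   // assumed: release resources

int main() {
    swllm_context* ctx = swllm_load("shenwen-coderV2.gguf");
    char out[256];
    if (swllm_generate(ctx, "Write hello world in C.", out, sizeof out) == 0)
        std::printf("%s\n", out);
    swllm_free(ctx);
}
```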

Open Source License

GPL-3.0 License

View Full License

Core Features

swllm.cpp provides a high-performance, lightweight LLM inference solution

High-Performance Inference

Built in C/C++ for maximum hardware utilization, with support for AVX2, NEON, and other SIMD instruction sets
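
As an illustration of how SIMD kernels are typically selected at runtime, here is a small self-contained check using the real GCC/Clang builtin __builtin_cpu_supports; swllm.cpp's actual dispatch mechanism may differ.

```cpp
// Runtime SIMD feature check, a sketch of kernel dispatch in an engine like
// swllm.cpp. __builtin_cpu_supports is a GCC/Clang builtin on x86 targets;
// elsewhere the preprocessor guard routes to the scalar fallback.
#include <cstdio>

int main() {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        std::puts("AVX2 available: dispatching vectorized kernels");
        return 0;
    }
#endif
    std::puts("falling back to scalar kernels");
    return 0;
}
```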

Local Deployment

Runs LLMs entirely on local devices with no network connection required, protecting your data privacy

Multi-Format Support

Supports GGUF, GGML, and other model formats, and is compatible with mainstream open-source LLMs
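
For context, GGUF files begin with the 4-byte magic "GGUF", so a loader can cheaply sanity-check a file before parsing the rest of the header. A minimal sketch (not swllm.cpp's actual loader):

```cpp
// Sketch: verifying the GGUF magic bytes before attempting a full load.
// The 4-byte magic "GGUF" comes from the published GGUF spec; everything
// past the magic is omitted here for brevity.
#include <cstdio>
#include <cstring>

bool looks_like_gguf(const char* path) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    char magic[4] = {};
    size_t n = std::fread(magic, 1, 4, f);
    std::fclose(f);
    return n == 4 && std::memcmp(magic, "GGUF", 4) == 0;
}

int main(int argc, char** argv) {
    if (argc > 1)
        std::printf("%s: %s\n", argv[1],
                    looks_like_gguf(argv[1]) ? "GGUF model" : "not a GGUF file");
}
```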

Quantization Optimization

Supports 4-bit, 8-bit and other quantization schemes, significantly reducing memory usage while maintaining accuracy
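
To illustrate the general idea behind such schemes (not swllm.cpp's exact on-disk format), here is a sketch of per-block symmetric 8-bit quantization: each block of 32 floats is stored as one float scale plus 32 int8 values, cutting memory from 128 bytes to 36 bytes per block (about 3.6x) while the per-block scale preserves accuracy.

```cpp
// Illustrative block quantization; real formats differ in layout and bit width.
#include <cmath>
#include <cstdint>
#include <cstdio>

constexpr int BLOCK = 32;

struct BlockQ8 {
    float  scale;     // per-block scale factor
    int8_t q[BLOCK];  // quantized values in [-127, 127]
};

BlockQ8 quantize_block(const float* x) {
    float amax = 0.0f;  // largest magnitude in the block sets the scale
    for (int i = 0; i < BLOCK; ++i) amax = std::fmax(amax, std::fabs(x[i]));
    BlockQ8 b;
    b.scale = amax / 127.0f;
    float inv = b.scale != 0.0f ? 1.0f / b.scale : 0.0f;
    for (int i = 0; i < BLOCK; ++i)
        b.q[i] = static_cast<int8_t>(std::lroundf(x[i] * inv));
    return b;
}

float dequantize(const BlockQ8& b, int i) { return b.scale * b.q[i]; }

int main() {
    float x[BLOCK];
    for (int i = 0; i < BLOCK; ++i) x[i] = 0.01f * i - 0.15f;
    BlockQ8 b = quantize_block(x);
    std::printf("x[5]=%f  roundtrip=%f\n", x[5], dequantize(b, 5));
}
```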

Cross-Platform

Supports Linux, macOS, Windows and other major operating systems across multiple hardware platforms

Open Source & Free

Licensed under GPL-3.0, free to use, modify and distribute. Community contributions are welcome

Contribute on GitHub

swllm.cpp is an open-source project. Developers are welcome to contribute code, report issues, and share ideas

Visit GitHub Repository
C/C++ Language
GPL-3.0 License
Cross-Platform Support

Ready to try swllm.cpp?

Start running large language models locally today