MiniCPM5-1B (LiteRT-LM)

This repository hosts the LiteRT-LM build of MiniCPM5-1B, optimized for fully on-device inference on mobile and edge hardware. (LiteRT was formerly known as TensorFlow Lite.)

Available Models

minicpm_dynamic_wi8_afp32_gpu_opt.litertlm: Uses dynamic weight-only INT8 quantization (wi8) with FP32 activations (afp32), and is optimized for GPU execution.

What is MiniCPM?

MiniCPM5-1B is the first model in the MiniCPM5 series from OpenBMB. It is a dense 1B-parameter Transformer built specifically for on-device, local, and resource-constrained deployment, and it reaches open-source SOTA in the 1B size class.

Highlights

🏆 1B-class open-source SOTA — strongest in tool use, code generation, and difficult reasoning among comparable open models.
🧠 Hybrid Reasoning — a single checkpoint serves as both a fast assistant and a deliberate reasoner via a built-in <think> template (enable_thinking).
📏 Long context — native 131,072-token context length.
📱 Built for the edge — compact footprint designed for local assistants, coding agents, and tool-use workflows.
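The hybrid reasoning mode means a response may carry a reasoning segment ahead of the final answer. The following is a minimal sketch of separating the two on the client side. It assumes the reasoning is wrapped in a `<think>...</think>` pair; the closing tag and the helper name `split_reasoning` are assumptions for illustration and are not confirmed by this model card.

```python
import re

# Assumption: reasoning is delimited by <think> ... </think> in the raw output.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model response.

    If no <think> block is present (for example with thinking disabled),
    reasoning is an empty string and the answer is the stripped input.
    """
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Hypothetical response text, written by hand for this example.
sample = "<think>France's capital is Paris.</think>\nThe capital of France is Paris."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

Keeping the reasoning separate lets an application log it for debugging while showing only the answer to the user.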

Model Information

Item	Value
Type	Causal Language Model
Architecture	Standard `LlamaForCausalLM`
Parameters	1,080,632,832 (~1B)
Non-Embedding Parameters	679,552,512
Layers	24
Attention Heads (GQA)	16 (Q) / 2 (KV)
Context Length	131,072
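A few quantities follow directly from the table and can be checked with simple arithmetic. The sketch below derives the embedding parameter count, a rough INT8 weight footprint, and the KV-cache saving from GQA. The footprint assumes exactly one byte per weight and ignores quantization scales and file metadata. The head dimension is not listed above, so `kv_cache_bytes` takes it as a parameter rather than assuming a value.

```python
# Values copied from the Model Information table.
TOTAL_PARAMS = 1_080_632_832
NON_EMBEDDING_PARAMS = 679_552_512
LAYERS = 24
Q_HEADS = 16
KV_HEADS = 2
CONTEXT_LENGTH = 131_072

# Parameters held in the embedding tables.
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS  # 401,080,320
embedding_share = embedding_params / TOTAL_PARAMS

# Rough weight footprint at 1 byte per weight (assumption: every weight is
# stored as INT8; scales and container metadata are ignored).
int8_weight_gib = TOTAL_PARAMS / 2**30

# With GQA, only the KV heads are cached, so the cache shrinks by this
# factor compared with caching one K/V pair per query head.
gqa_reduction = Q_HEADS // KV_HEADS


def kv_cache_bytes(seq_len: int, head_dim: int, bytes_per_value: int) -> int:
    """Estimate KV-cache size: keys and values for every layer and KV head.

    head_dim and bytes_per_value are caller-supplied because the table
    does not state them.
    """
    return 2 * LAYERS * KV_HEADS * head_dim * seq_len * bytes_per_value


print(f"embedding parameters: {embedding_params:,}")
print(f"embedding share of total: {embedding_share:.1%}")
print(f"approx. INT8 weight size: {int8_weight_gib:.2f} GiB")
print(f"KV-cache reduction from GQA: {gqa_reduction}x")
```

Calling `kv_cache_bytes(CONTEXT_LENGTH, head_dim, bytes_per_value)` with the real head dimension and cache precision gives an upper bound on cache memory at the full context length.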

Use the model

Android

Edge Gallery App

Install the app from Google Play, or download or build it from GitHub.
Follow the instructions in the app.

To build the demo app from source, follow the instructions in the GitHub repository.

Try It (Desktop/CLI)

Install uv and run the model directly from the LiteRT-LM command line:

uv tool install litert-lm
uvx litert-lm run --from-huggingface-repo=litert-community/MiniCPM5-1B minicpm_dynamic_wi8_afp32_gpu_opt.litertlm --prompt="What is the capital of France?"
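To script several prompts, the command above can be assembled programmatically. This is a minimal sketch that uses only the arguments shown in the command above. The helper name `build_command` is hypothetical, and the snippet only prints the command rather than running it.

```python
import shlex

# Repository and file name as given in the command above.
REPO = "litert-community/MiniCPM5-1B"
MODEL_FILE = "minicpm_dynamic_wi8_afp32_gpu_opt.litertlm"


def build_command(prompt: str) -> list[str]:
    """Build the argument list for one `litert-lm run` invocation."""
    return [
        "uvx", "litert-lm", "run",
        f"--from-huggingface-repo={REPO}",
        MODEL_FILE,
        f"--prompt={prompt}",
    ]


cmd = build_command("What is the capital of France?")
# shlex.join quotes arguments so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
# To execute instead of print: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when a prompt contains quotes or other special characters.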

License

Released under the Apache-2.0 License, consistent with the upstream openbmb/MiniCPM5-1B.

Citation

@article{minicpm4,
  title={MiniCPM4: Ultra-efficient LLMs on end devices},
  author={MiniCPM, Team},
  journal={arXiv preprint arXiv:2506.07900},
  year={2025}
}
