MNN - Alibaba Open Source Lightweight Deep Learning Framework
An efficient and lightweight deep learning framework optimized for mobile and embedded devices, supporting model inference and training.
Project Overview
MNN is a highly efficient and lightweight deep learning framework that supports both inference and training of deep learning models, boasting industry-leading performance in on-device inference and training. Currently, MNN has been integrated into over 30 Alibaba applications, such as Taobao, Tmall, Youku, DingTalk, and Xianyu, covering more than 70 use cases including live streaming, short video shooting, search recommendation, image search, interactive marketing, benefit distribution, and security risk control.
GitHub Address: https://github.com/alibaba/MNN
Core Features
1. Extreme Lightweight Design
- iOS Platform: Static library size for armv7+arm64 platforms is approximately 12MB, with an executable file increment of about 2MB after linking.
- Android Platform: Core SO library size is approximately 800KB (armv7a - c++_shared).
- Building with MNN_BUILD_MINI reduces the package size by about 25% (at the cost of supporting only fixed-shape inputs).
- Supports FP16/Int8 quantization, which can reduce model size by 50%-70%.
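The size reductions above follow largely from element width: FP32 weights take 4 bytes each, FP16 takes 2, Int8 takes 1. A back-of-envelope sketch in plain Python (the parameter count is an illustrative assumption; real savings land in the quoted 50%-70% range because of per-layer scale/zero-point overhead and layers left unquantized, which this ignores):

```python
# Rough model-size estimate by weight precision. Weights dominate model
# size, so bytes-per-weight approximately sets the total.

def quantized_size_mb(num_params: int, bytes_per_param: float) -> float:
    """Approximate weight storage in MiB for a given element width."""
    return num_params * bytes_per_param / (1024 * 1024)

params = 4_200_000  # e.g. a MobileNet-class model (illustrative number)

fp32 = quantized_size_mb(params, 4)  # baseline: 32-bit floats
fp16 = quantized_size_mb(params, 2)  # FP16: 50% of the FP32 size
int8 = quantized_size_mb(params, 1)  # Int8: 75% smaller than FP32 (ideal case)

print(f"FP32: {fp32:.1f} MiB, FP16: {fp16:.1f} MiB, Int8: {int8:.1f} MiB")
print(f"FP16 saves {1 - fp16 / fp32:.0%}, Int8 saves {1 - int8 / fp32:.0%}")
```

The ideal-case Int8 saving (75%) exceeds the quoted upper bound, which is consistent with real deployments carrying quantization metadata and mixed-precision layers.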
2. Broad Model Support
- Framework Support: TensorFlow, Caffe, ONNX, TorchScript
- Network Types: CNN, RNN, GAN, Transformer, and other common neural networks
- Operator Support:
- 178 TensorFlow operators
- 52 Caffe operators
- 163 TorchScript operators
- 158 ONNX operators
3. Cross-Platform Compatibility
- Mobile Platforms: iOS 8.0+, Android 4.3+
- Embedded Devices: Devices supporting POSIX interfaces
- Multi-Device Hybrid Computing: CPU and GPU collaboration
- IoT Devices: also deployable in IoT scenarios
4. High-Performance Optimization
- Extensive hand-optimized assembly kernels make full use of ARM / x86-64 CPUs.
- Mobile GPU inference via Metal / OpenCL / Vulkan.
- NVIDIA GPU support via CUDA, including Tensor Cores.
- The Winograd convolution algorithm is used extensively for square-kernel convolutions (3x3, 4x4, 5x5, 6x6, and 7x7), cutting the number of multiplications.
- FP16 half-precision computation on ARMv8.2 CPUs yields roughly a 2x speedup.
- Int8 inference accelerated by the ARMv8.2 sdot instruction and x86 VNNI yields roughly a 2.5x speedup.
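The Winograd savings can be estimated with simple arithmetic: for a 2D transform F(m×m, r×r), direct convolution costs m²r² multiplications per m×m output tile, while Winograd costs (m+r-1)², at the price of extra transform overhead. A pure-Python sketch (the tile size m = 2 is an illustrative choice, not necessarily what MNN selects per kernel):

```python
# Arithmetic-reduction ratio of 2D Winograd versus direct convolution,
# counting multiplications only and ignoring transform overhead.

def winograd_speedup(m: int, r: int) -> float:
    """Multiply-count ratio of direct conv to Winograd F(m x m, r x r)."""
    direct = (m * r) ** 2       # m^2 outputs, r^2 multiplies each
    winograd = (m + r - 1) ** 2  # one multiply per transformed element
    return direct / winograd

for r in (3, 4, 5, 6, 7):  # the square kernel sizes listed above
    print(f"{r}x{r} kernel: ~{winograd_speedup(2, r):.2f}x fewer multiplies")
```

For the classic F(2x2, 3x3) case this gives 36/16 = 2.25x, which is why 3x3 convolutions benefit so reliably; larger kernels gain more in theory but pay more in transform cost and numerical error.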
Architecture Support Matrix
| Architecture / Precision | Normal | FP16 | BF16 | Int8 |
|---|---|---|---|---|
| CPU | | | | |
| Native | B | C | B | B |
| x86/x64-SSE4.1 | A | B | B | A |
| x86/x64-AVX2 | S | B | B | A |
| x86/x64-AVX512 | S | B | B | S |
| ARMv7a | S | S (ARMv8.2) | S | S |
| ARMv8 | S | S (ARMv8.2) | S (ARMv8.6) | S |
| GPU | | | | |
| OpenCL | A | S | C | S |
| Vulkan | A | A | C | A |
| Metal | A | S | C | S |
| CUDA | A | S | C | A |
| NPU | | | | |
| CoreML | A | C | C | C |
| HIAI | A | C | C | C |
| NNAPI | B | B | C | B |
Legend: S - Strongly Recommended | A - Well Supported | B - Supported but with Issues | C - Not Supported
Core Components
1. MNN-Converter
Model conversion tool that converts models from other frameworks into the MNN format:
- Supports TensorFlow (and TFLite), Caffe, ONNX, and TorchScript
- Performs graph optimizations to reduce redundant computation
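As a toy illustration of the kind of graph optimization a converter performs, here is a minimal constant-folding pass: subgraphs whose inputs are all constants are evaluated once at conversion time, so the runtime graph carries fewer ops. The graph encoding below is invented for this sketch and is not MNN's actual IR:

```python
# Minimal constant folding over a toy dataflow graph.
# A node is ("const", value), ("input",), or (op_name, in1, in2).
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def constant_fold(graph: dict) -> dict:
    """Fold binary ops whose inputs are both constants.

    Assumes `graph` maps node name -> node tuple in topological order.
    """
    folded = {}
    for name, node in graph.items():
        if node[0] in OPS:
            op, a, b = node
            if folded[a][0] == "const" and folded[b][0] == "const":
                # Both operands known at conversion time: evaluate now.
                node = ("const", OPS[op](folded[a][1], folded[b][1]))
        folded[name] = node
    return folded

g = {
    "w":    ("const", 2.0),
    "bias": ("const", 3.0),
    "wb":   ("mul", "w", "bias"),  # both inputs constant -> foldable
    "x":    ("input",),            # runtime input, cannot fold
    "y":    ("add", "x", "wb"),    # depends on x -> stays in the graph
}

print(constant_fold(g))
```

After folding, `wb` becomes a constant 6.0 and only the `add` that depends on the runtime input survives; real converters combine many such passes (fusion, dead-node elimination, layout transforms).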
2. MNN-Compress
Model compression tool, reduces model size and improves performance
3. MNN-Express
Supports model execution with control flow, using MNN operators for general-purpose computation
4. MNN-CV
Lightweight image processing library, similar to OpenCV but implemented based on MNN
5. MNN-Train
Supports MNN model training
Featured Applications
MNN-LLM
A large language model runtime solution built on the MNN engine, aiming to run LLMs locally on everyone's devices (mobile phone / PC / IoT). It supports:
- Mainstream large language models such as Qwen, Baichuan, Zhipu, and LLaMA
- A full-modality LLM Android application
- Text generation, image understanding, speech-to-text, and text-to-image generation
MNN-Diffusion
A Stable Diffusion runtime solution based on the MNN engine, supporting local deployment of Stable Diffusion models on various platforms.
Academic Achievements
MNN-related research has been published in top systems conferences OSDI'22 and MLSys 2020, demonstrating its influence in academia and industry.
Development Tools
MNN Workbench
Available for download from the MNN official website, providing:
- Pre-trained models
- Visualized training tools
- One-click model deployment to devices
Python API
Provides easy-to-use Python interfaces for machine learning engineers, allowing inference, training, and image processing without writing C++ code.
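A hedged sketch of what inference through the Python API looks like, following the session-based names in MNN's documented Python examples (`Interpreter`, `Session`, `Tensor`); the model path, input data, and exact call sequence here are placeholder assumptions, and the code is wrapped in a function so nothing runs unless MNN is actually installed:

```python
# Sketch of MNN's session-based Python inference flow (assumes
# `pip install MNN` and a converted "model.mnn" file; not run at import).

def run_inference(model_path: str, input_data):
    import MNN  # deferred so this file imports even without MNN installed

    interpreter = MNN.Interpreter(model_path)
    session = interpreter.createSession()

    # Copy host data into the session's input tensor.
    input_tensor = interpreter.getSessionInput(session)
    host_input = MNN.Tensor(input_tensor.getShape(), MNN.Halide_Type_Float,
                            input_data, MNN.Tensor_DimensionType_Caffe)
    input_tensor.copyFrom(host_input)

    interpreter.runSession(session)

    # Read results back from the output tensor.
    output_tensor = interpreter.getSessionOutput(session)
    return output_tensor.getData()
```

The same package also exposes `MNN.expr` and `MNN.numpy` for a more eager, NumPy-like style; the session API above is the classic inference path.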
Summary
As an open-source deep learning framework from Alibaba, MNN has become an excellent choice for mobile and embedded AI deployment due to its lightweight design, high performance, and cross-platform capabilities. Whether it's traditional CNN model inference or the latest large language model deployment, MNN provides complete solutions and is an invaluable toolkit for AI developers.