MNN - Alibaba Open Source Lightweight Deep Learning Framework
An efficient and lightweight deep learning framework optimized for mobile and embedded devices, supporting model inference and training.
Project Overview
MNN is a highly efficient and lightweight deep learning framework that supports both inference and training of deep learning models, boasting industry-leading performance in on-device inference and training. Currently, MNN has been integrated into over 30 Alibaba applications, such as Taobao, Tmall, Youku, DingTalk, and Xianyu, covering more than 70 use cases including live streaming, short video shooting, search recommendation, image search, interactive marketing, benefit distribution, and security risk control.
GitHub Address: https://github.com/alibaba/MNN
Core Features
1. Extreme Lightweight Design
- iOS Platform: Static library size for armv7+arm64 platforms is approximately 12MB, with an executable file increment of about 2MB after linking.
- Android Platform: Core SO library size is approximately 800KB (armv7a - c++_shared).
- Building with MNN_BUILD_MINI reduces the package size by about 25% (at the cost of supporting only fixed-shape inputs).
- Supports FP16/Int8 quantization, which can reduce model size by 50%-70%.
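The size reductions above follow largely from element width: FP32 weights take 4 bytes each, FP16 takes 2, Int8 takes 1. A back-of-envelope sketch in plain Python (the parameter count is an illustrative assumption; real savings land in the quoted 50%-70% range because of per-layer scale/zero-point overhead and layers left unquantized, which this ignores):

```python
# Rough model-size estimate by weight precision. Weights dominate model
# size, so bytes-per-weight approximately sets the total.

def quantized_size_mb(num_params: int, bytes_per_param: float) -> float:
    """Approximate weight storage in MiB for a given element width."""
    return num_params * bytes_per_param / (1024 * 1024)

params = 4_200_000  # e.g. a MobileNet-class model (illustrative number)

fp32 = quantized_size_mb(params, 4)  # baseline: 32-bit floats
fp16 = quantized_size_mb(params, 2)  # FP16: 50% of the FP32 size
int8 = quantized_size_mb(params, 1)  # Int8: 75% smaller than FP32 (ideal case)

print(f"FP32: {fp32:.1f} MiB, FP16: {fp16:.1f} MiB, Int8: {int8:.1f} MiB")
print(f"FP16 saves {1 - fp16 / fp32:.0%}, Int8 saves {1 - int8 / fp32:.0%}")
```

The ideal-case Int8 saving (75%) exceeds the quoted upper bound, which is consistent with real deployments carrying quantization metadata and mixed-precision layers.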
2. Broad Model Support
- Framework Support: TensorFlow, Caffe, ONNX, TorchScript
- Network Types: CNN, RNN, GAN, Transformer, and other common neural networks
- Operator Support:
- 178 TensorFlow operators
- 52 Caffe operators
- 163 TorchScript operators
- 158 ONNX operators
3. Cross-Platform Compatibility
- Mobile Platforms: iOS 8.0+, Android 4.3+
- Embedded Devices: Devices supporting POSIX interfaces
- Multi-Device Hybrid Computing: CPU and GPU collaboration
- IoT Devices: also deployable in IoT scenarios
4. High-Performance Optimization
- Extensive hand-optimized assembly kernels make full use of ARM / x86-64 CPUs.
- Mobile GPU inference via Metal / OpenCL / Vulkan.
- NVIDIA GPU support via CUDA, including Tensor Cores.
- The Winograd convolution algorithm is used extensively for square-kernel convolutions (3x3, 4x4, 5x5, 6x6, and 7x7), cutting the number of multiplications.
- FP16 half-precision computation on ARMv8.2 CPUs yields roughly a 2x speedup.
- Int8 inference accelerated by the ARMv8.2 sdot instruction and x86 VNNI yields roughly a 2.5x speedup.
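The Winograd savings can be estimated with simple arithmetic: for a 2D transform F(m×m, r×r), direct convolution costs m²r² multiplications per m×m output tile, while Winograd costs (m+r-1)², at the price of extra transform overhead. A pure-Python sketch (the tile size m = 2 is an illustrative choice, not necessarily what MNN selects per kernel):

```python
# Arithmetic-reduction ratio of 2D Winograd versus direct convolution,
# counting multiplications only and ignoring transform overhead.

def winograd_speedup(m: int, r: int) -> float:
    """Multiply-count ratio of direct conv to Winograd F(m x m, r x r)."""
    direct = (m * r) ** 2       # m^2 outputs, r^2 multiplies each
    winograd = (m + r - 1) ** 2  # one multiply per transformed element
    return direct / winograd

for r in (3, 4, 5, 6, 7):  # the square kernel sizes listed above
    print(f"{r}x{r} kernel: ~{winograd_speedup(2, r):.2f}x fewer multiplies")
```

For the classic F(2x2, 3x3) case this gives 36/16 = 2.25x, which is why 3x3 convolutions benefit so reliably; larger kernels gain more in theory but pay more in transform cost and numerical error.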
Architecture Support Matrix
| Architecture / Precision | Normal | FP16 | BF16 | Int8 |
|---|---|---|---|---|
| CPU | | | | |
| Native | B | C | B | B |
| x86/x64-SSE4.1 | A | B | B | A |
| x86/x64-AVX2 | S | B | B | A |
| x86/x64-AVX512 | S | B | B | S |
| ARMv7a | S | S (ARMv8.2) | S | S |
| ARMv8 | S | S (ARMv8.2) | S (ARMv8.6) | S |
| GPU | | | | |
| OpenCL | A | S | C | S |
| Vulkan | A | A | C | A |
| Metal | A | S | C | S |
| CUDA | A | S | C | A |
| NPU | | | | |
| CoreML | A | C | C | C |
| HIAI | A | C | C | C |
| NNAPI | B | B | C | B |
Legend: S - Strongly Recommended | A - Well Supported | B - Supported but with Issues | C - Not Supported
Core Components
1. MNN-Converter
Model conversion tool that converts models from other frameworks into the MNN format:
- Supports TensorFlow (and TFLite), Caffe, ONNX, and TorchScript
- Performs graph optimizations to reduce redundant computation
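As a toy illustration of the kind of graph optimization a converter performs, here is a minimal constant-folding pass: subgraphs whose inputs are all constants are evaluated once at conversion time, so the runtime graph carries fewer ops. The graph encoding below is invented for this sketch and is not MNN's actual IR:

```python
# Minimal constant folding over a toy dataflow graph.
# A node is ("const", value), ("input",), or (op_name, in1, in2).
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def constant_fold(graph: dict) -> dict:
    """Fold binary ops whose inputs are both constants.

    Assumes `graph` maps node name -> node tuple in topological order.
    """
    folded = {}
    for name, node in graph.items():
        if node[0] in OPS:
            op, a, b = node
            if folded[a][0] == "const" and folded[b][0] == "const":
                # Both operands known at conversion time: evaluate now.
                node = ("const", OPS[op](folded[a][1], folded[b][1]))
        folded[name] = node
    return folded

g = {
    "w":    ("const", 2.0),
    "bias": ("const", 3.0),
    "wb":   ("mul", "w", "bias"),  # both inputs constant -> foldable
    "x":    ("input",),            # runtime input, cannot fold
    "y":    ("add", "x", "wb"),    # depends on x -> stays in the graph
}

print(constant_fold(g))
```

After folding, `wb` becomes a constant 6.0 and only the `add` that depends on the runtime input survives; real converters combine many such passes (fusion, dead-node elimination, layout transforms).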
2. MNN-Compress
Model compression tool, reduces model size and improves performance
3. MNN-Express
Supports model execution with control flow, using MNN operators for general-purpose computation
4. MNN-CV
Lightweight image processing library, similar to OpenCV but implemented based on MNN
5. MNN-Train
Supports MNN model training
Featured Applications
MNN-LLM
A large language model runtime solution built on the MNN engine, aiming to run LLMs locally on everyone's devices (mobile phone / PC / IoT). It supports:
- Mainstream large language models such as Qwen, Baichuan, Zhipu, and LLaMA
- A full-modality LLM Android application
- Text generation, image understanding, speech-to-text, and text-to-image generation
MNN-Diffusion
A Stable Diffusion runtime solution based on the MNN engine, supporting local deployment of Stable Diffusion models on various platforms.
Academic Achievements
MNN-related research has been published in top systems conferences OSDI'22 and MLSys 2020, demonstrating its influence in academia and industry.
Development Tools
MNN Workbench
Available for download from the MNN official website, providing:
- Pre-trained models
- Visualized training tools
- One-click model deployment to devices
Python API
Provides easy-to-use Python interfaces for machine learning engineers, allowing inference, training, and image processing without writing C++ code.
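A hedged sketch of what inference through the Python API looks like, following the session-based names in MNN's documented Python examples (`Interpreter`, `Session`, `Tensor`); the model path, input data, and exact call sequence here are placeholder assumptions, and the code is wrapped in a function so nothing runs unless MNN is actually installed:

```python
# Sketch of MNN's session-based Python inference flow (assumes
# `pip install MNN` and a converted "model.mnn" file; not run at import).

def run_inference(model_path: str, input_data):
    import MNN  # deferred so this file imports even without MNN installed

    interpreter = MNN.Interpreter(model_path)
    session = interpreter.createSession()

    # Copy host data into the session's input tensor.
    input_tensor = interpreter.getSessionInput(session)
    host_input = MNN.Tensor(input_tensor.getShape(), MNN.Halide_Type_Float,
                            input_data, MNN.Tensor_DimensionType_Caffe)
    input_tensor.copyFrom(host_input)

    interpreter.runSession(session)

    # Read results back from the output tensor.
    output_tensor = interpreter.getSessionOutput(session)
    return output_tensor.getData()
```

The same package also exposes `MNN.expr` and `MNN.numpy` for a more eager, NumPy-like style; the session API above is the classic inference path.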
Summary
As an open-source deep learning framework from Alibaba, MNN has become an excellent choice for mobile and embedded AI deployment due to its lightweight design, high performance, and cross-platform capabilities. Whether it's traditional CNN model inference or the latest large language model deployment, MNN provides complete solutions and is an invaluable toolkit for AI developers.