Skip to main content

Basic Environment Setup

This section describes how to set up an x86 computer and a LicheePi development board to prepare for running models on the NPU or CPU of the board.

Why is an x86 machine required?

To run models on the development board, it is first necessary to use the hhb tool on an x86 machine to convert general models such as ONNX into computation graphs and glue code executable by the board's CPU/NPU. The glue code and related application code must then be cross-compiled into binaries that can run on the board.

Since x86 machines generally offer higher performance and the hhb tool only supports the x86 architecture, an additional x86 computer is required for model conversion.

Development Board Setup

Lichee Pi 4A is a development board launched by Sipeed based on the TH1520 chip. For basic out-of-the-box configuration, refer to the official hardware documentation. The recommended operating system is RevyOS version 20250526. For instructions on flashing RevyOS, see the installation guide.

Installing Basic Tools

RevyOS does not include pip and other basic tools by default. Install them as follows:

debian@revyos-lpi4a:~$ sudo apt install python3-pip python3-venv wget curl git # Install required packages
debian@revyos-lpi4a:~$ mkdir npu # Create a Python virtual environment
debian@revyos-lpi4a:~$ cd npu
debian@revyos-lpi4a:~/npu$ python3 -m venv .venv
debian@revyos-lpi4a:~/npu$ . .venv/bin/activate # Activate the virtual environment
(.venv) debian@revyos-lpi4a:~/npu$ # Now inside the virtual environment

If the installation and activation are successful, the command prompt should be prefixed with (.venv), indicating the active virtual environment.

Why use a virtual environment?

Unlike the original Yuque documentation, this guide adds a section on creating a virtual environment. In the original, packages such as shl-python are installed system-wide, which may conflict with system Python packages managed by apt (e.g., python3-*). Using a virtual environment isolates packages installed via pip and prevents conflicts with apt. For details, see PEP 668.

Installing the SHL Library

About SHL

SHL (Structure of Heterogeneous Library) is a high-performance heterogeneous computing library provided by T-Head. Its main function interfaces use the CSI-NN2 API for the C-SKY CPU platform and provide a series of optimized binary libraries. User Manual

Install the SHL library using pip within the virtual environment:

Version Compatibility

The shl-python package is updated frequently, and the latest version may not be compatible with all boards or example code.
It is important to install a version of shl-python that matches your board and the examples you are following.
For example, for the LPI4A board, version 2.6.17 is known to be compatible with most provided examples.
Check your board documentation or example requirements for the recommended version.

To install a specific compatible version (e.g., 2.6.17):

:::

To install a specific compatible version (e.g., 2.6.17):

Setting a PyPI Mirror

The default PyPI server is located overseas, which may cause network issues in mainland China. Use the -i option to specify a temporary PyPI mirror. For more information, refer to Tsinghua University PyPI Mirror Usage Guide

After installation, use the SHL module's --whereis command to check the installation path.

(.venv) debian@revyos-lpi4a:~/npu$ python3 -m shl --whereis th1520
/home/debian/npu/.venv/lib/python3.11/site-packages/shl/install_nn2/th1520

Based on the output path (as highlighted above), set the LD_LIBRARY_PATH environment variable to specify the dynamic library search path. For example:

$ export SHL_PATH=/home/debian/npu/.venv/lib/python3.11/site-packages/shl/install_nn2/th1520/lib # Use the path from above

Add it to the environment variables:

$ export LD_LIBRARY_PATH=$SHL_PATH:$LD_LIBRARY_PATH

To make this setting persistent, add the above export command to your ~/.bashrc or ~/.profile.

Installing HHB-onnxruntime

HHB-onnxruntime integrates the SHL backend (execution providers), enabling onnxruntime to utilize SHL's high-performance code optimized for C-SKY CPUs.

$ wget https://github.com/zhangwm-pt/prebuilt_whl/raw/refs/heads/python3.11/numpy-1.25.0-cp311-cp311-linux_riscv64.whl
$ wget https://github.com/zhangwm-pt/onnxruntime/releases/download/riscv_whl_v2.6.0/hhb_onnxruntime_th1520-2.6.0-cp311-cp311-linux_riscv64.whl
$ pip install numpy-1.25.0-cp311-cp311-linux_riscv64.whl hhb_onnxruntime_th1520-2.6.0-cp311-cp311-linux_riscv64.whl
About Github Network Proxy

If you experience network issues accessing GitHub from mainland China, consider using a network proxy tool to accelerate access.

NPU Driver Configuration

Compared to CPU execution, NPU inference requires the NPU driver module vha to be loaded. Use lsmod to list loaded drivers:

$ lsmod | grep vha
vha 970752 0
img_mem 827392 1 vha

If the vha module (highlighted above) is not listed, load it manually:

$ sudo modprobe vha vha_info img_mem
User-space NPU Driver Configuration

In addition to the kernel driver, user-space drivers are also required for NPU execution. Check the /usr/lib directory for the following libraries:

  • libimgdnn.so
  • libnnasession.so
  • libimgdnn_execute.so

These libraries are pre-installed in RevyOS and generally do not require manual installation.

NPU Device Permission Configuration

After loading the NPU driver, the device /dev/vha0 may require permission adjustment for user access. For convenience, set the device permission to 0666 (read/write for all users):

$ sudo chmod 0666 /dev/vha0 # Set device permission to 0666

For security, it is recommended to configure udev rules for device management. Consult AI or documentation for udev configuration.

x86 Machine Setup

Unlike the original Yuque documentation, this guide uses a pre-built Docker image for installation. Please ensure Docker is installed on your computer.

Obtaining and Running the HHB Image

$ docker pull hhb4tools/hhb:2.6.17 # Pull the HHB image
$ docker run --rm -it hhb4tools/hhb:2.6.17 bash # Start a temporary HHB container
HHB Image Size

The HHB image is large (~7GB). Downloading may take some time. Please ensure sufficient disk space.

After running the above commands, the container will start and you can operate within it.

Container Filesystem

The container filesystem is temporary. All files created inside the container will be deleted upon exit. It is recommended to copy generated files to the host or use a persistent container.

For more information, refer to the Docker Official Documentation or consult AI.

Obtaining Example Code

The example code for this tutorial is available on Github. Clone it locally using:

$ git clone https://github.com/zhangwm-pt/lpi4a-example.git

Obtaining OpenCV

This tutorial uses OpenCV 4.5, optimized for the C920 RISC-V vector spec 0.7.1. Precompiled binaries are available, and the source code can be downloaded from the OCC download page.

The precompiled C++ binaries are hosted on Github. To update the submodule in the example program, run:

$ # Assuming you are in the repository root
$ git submodule update --init --