BERT
This tutorial guides you through running the BERT model on the RevyOS system. BERT is a widely used natural language processing (NLP) model, commonly applied to tasks such as question answering, classification, and translation.
Before proceeding, please ensure you have completed the environment setup section.
Obtaining Example Code
The example code for this tutorial is available on GitHub. Clone it locally with the following command:
$ git clone https://github.com/zhangwm-pt/lpi4a-example.git
The CPU example for this tutorial is located in the reading_comprehension/bert directory.
Obtaining the Model
The model used in this tutorial is from the Google BERT repository, converted to ONNX format. Download it with the following command:
$ wget https://github.com/zhangwm-pt/bert/releases/download/onnx/bert_small_int32_input.onnx
If you encounter network issues accessing GitHub from mainland China, consider using a network proxy tool to accelerate access.
Model Conversion and Compilation
On an x86 machine, use the HHB tool to convert the ONNX model into a computation graph and glue code suitable for RevyOS. Before proceeding, ensure you have started the HHB container and cloned the example repository as described in the environment setup section.
Model Compilation with HHB
After completing the environment setup, use HHB to compile the model into an executable for the c920 CPU. Enter the reading_comprehension/bert directory and execute the following command:
$ hhb --model-file bert_small_int32_input.onnx \
--input-name "input_ids;input_mask;segment_ids" \
--input-shape '1 384;1 384;1 384' \
--output-name "output_start_logits;output_end_logits" \
--board c920 \
--quantization-scheme "float16" \
--postprocess save_and_top5 \
-D \
--without-preprocess
-D
: Stops the HHB flow after the executable-generation stage
--model-file
: The downloaded BERT model in the current directory
--board
: Target platform; here, c920
--input-name
: Model input names
--output-name
: Model output names
--input-shape
: Model input shapes
--without-preprocess
: Skip preprocessing
--quantization-scheme
: Quantization method; here, float16
You can run hhb --help to view all available parameters and options.
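The three input tensors named in the command all share the fixed shape 1×384. As a rough illustration of how a tokenized question/context pair is packed into tensors of that shape (the real WordPiece tokenization lives in the example's bert scripts; the token ids below are illustrative placeholders), a minimal sketch:

```python
# Sketch: pack a tokenized question/context pair into the three
# fixed-shape (1 x 384) BERT inputs named in the hhb command.
# Token ids here are illustrative placeholders, not real WordPiece ids.
MAX_SEQ_LEN = 384
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # standard BERT special-token ids

def make_inputs(question_ids, context_ids, max_len=MAX_SEQ_LEN):
    # Layout: [CLS] question [SEP] context [SEP], zero-padded to max_len.
    ids = [CLS_ID] + question_ids + [SEP_ID] + context_ids + [SEP_ID]
    ids = ids[:max_len]
    n_question = min(len(question_ids) + 2, max_len)  # [CLS] + question + [SEP]
    segment_ids = [0] * n_question + [1] * (len(ids) - n_question)
    input_mask = [1] * len(ids)  # 1 = real token, 0 = padding
    pad = [PAD_ID] * (max_len - len(ids))
    return ids + pad, input_mask + [0] * len(pad), segment_ids + [0] * len(pad)

input_ids, input_mask, segment_ids = make_inputs([2054, 2136], [7573, 14169])
```

Each returned list has length 384, matching one row of the declared 1 384 input shapes.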
After execution, an hhb_out subdirectory will be generated in the current directory, containing the following files:
hhb.bm
: HHB model file, including quantized weights and related information
hhb_runtime
: Executable for the c920 CPU platform, compiled from the C files in the directory
main.c
: Reference entry point for the example program
model.c
: Model structure file
model.params
: Weights file
io.c
: File I/O helper functions
io.h
: Declarations for the helper functions
process.c
: Preprocessing functions
process.h
: Declarations for the preprocessing functions
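The generated hhb_runtime executable is typically driven from a script rather than invoked by hand. A hedged sketch of how a Python driver might call it, passing the model file and one preprocessed .bin file per model input (the exact argument order is an assumption; inference.py in the example repository is the authoritative reference):

```python
# Hedged sketch: invoke the HHB-generated executable with the model
# file and one preprocessed .bin file per model input. The argument
# order shown is an assumption, not taken from the HHB manual.
import subprocess

def build_command(runtime="hhb_out/hhb_runtime",
                  model="hhb_out/hhb.bm",
                  inputs=("sample_0_input_ids.bin",
                          "sample_0_input_mask.bin",
                          "sample_0_segment_ids.bin")):
    return [runtime, model, *inputs]

def run_model():
    # capture_output keeps the tensor-info / top5 log for later parsing
    return subprocess.run(build_command(), capture_output=True, text=True)
```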
For more details on HHB options, refer to the HHB User Manual.
Uploading and Running the Application
Upload to the Development Board
Package all files in this directory and upload them to the development board. For example, use the scp command to upload to the /home/debian/bert directory (replace <board-ip> with your board's address):
$ scp -r bert debian@<board-ip>:/home/debian/bert/
Alternatively, you may use other methods such as USB storage devices or network sharing.
Running the Program
On the development board, navigate to the /home/debian/bert directory and execute:
$ python3 inference.py
During execution, the terminal will display the progress of each stage:
- Preprocessing
- Model execution
- Post-processing
The files in this directory are:
inference.py
: Entry script that drives preprocessing, model execution, and post-processing
test1.json
: Test data, excerpted from the first entry of SQuAD 1.1
sample_0_input_ids.bin, sample_0_input_mask.bin, sample_0_segment_ids.bin
: Intermediate results generated during preprocessing
hhb_out/hhb_runtime
: Executable generated by HHB on the x86 host
hhb_out/hhb.bm
: Model file generated by HHB on the x86 host
sample_0_input_ids.bin_output0_1_384.txt, sample_0_input_ids.bin_output1_1_384.txt
: Output files from model execution
vocab.txt
: Vocabulary file for the BERT model
bert
: Python scripts for pre- and post-processing, adapted from the original BERT project
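The two output files hold one start logit and one end logit per token position; post-processing picks the answer span bounded by the two argmax positions. A simplified sketch (the real script also rejects invalid spans and maps token indices back to words via vocab.txt):

```python
# Simplified post-processing sketch: the answer span runs from the
# argmax of the start logits to the argmax of the end logits.
def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def extract_span(start_logits, end_logits):
    start, end = argmax(start_logits), argmax(end_logits)
    # Fall back to a single-token span if the logits disagree.
    return (start, end) if end >= start else (start, start)

# Toy logits with peaks at token positions 46 and 47, mirroring the
# top-1 indices in the sample run.
start = [-10.0] * 384
end = [-10.0] * 384
start[46], end[47] = 3.79, 3.55
print(extract_span(start, end))  # (46, 47)
```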
The reference input in this example is taken from the SQuAD dataset: the passage describes a football game, and the question asks which team played in it.
[Context]: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50.
[Question]: Which NFL team represented the AFC at Super Bowl 50?
Sample output:
$ python3 inference.py
********** preprocess test **********
[Context]: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50.
[Question]: Which NFL team represented the AFC at Super Bowl 50?
******* run bert *******
data pointer: 0x33a2ff70
=== tensor info ===
shape: 1 384
data pointer: 0x33a31590
=== tensor info ===
shape: 1 384
data pointer: 0x33a32bb0
=== tensor info ===
shape: 1 384
data pointer: 0x33b54820
The max_value of output: 3.794922
The min_value of output: -9.976562
The mean_value of output: -8.417037
The std_value of output: 5.098144
============ top5: ===========
46: 3.794922
57: 3.113281
39: 1.210938
38: 1.121094
27: 0.603027
=== tensor info ===
shape: 1 384
data pointer: 0x33b54510
The max_value of output: 3.550781
The min_value of output: -9.632812
The mean_value of output: -7.799953
The std_value of output: 4.787047
============ top5: ===========
47: 3.550781
58: 3.437500
32: 2.523438
29: 1.539062
41: 1.395508
********** postprocess **********
[Answer]: Denver Broncoss
Expected output:
[Answer]: Denver Broncoss