BERT
This tutorial guides you through running the BERT model on the RevyOS system. BERT is a widely used natural language processing (NLP) model, commonly applied to tasks such as question answering, classification, and translation.
Before proceeding, please ensure you have completed the environment setup section.
Obtaining Example Code
The example code for this tutorial is available on GitHub. Clone it locally with the following command:
$ git clone https://github.com/zhangwm-pt/lpi4a-example.git
The CPU example for this tutorial is located in the reading_comprehension/bert directory.
Obtaining the Model
The model used in this tutorial is from the Google BERT repository, converted to ONNX format. Download it with the following command:
$ wget https://github.com/zhangwm-pt/bert/releases/download/onnx/bert_small_int32_input.onnx
If you encounter network issues accessing GitHub from mainland China, consider using a network proxy tool to accelerate access.
Model Conversion and Compilation
On an x86 machine, use the HHB tool to convert the ONNX model into a computation graph and glue code suitable for RevyOS. Before proceeding, ensure you have started the HHB container and cloned the example repository as described in the environment setup section.
Model Compilation with HHB
After completing the environment setup, use HHB to compile the model into an executable for the c920 CPU. Enter the reading_comprehension/bert directory and execute the following command:
$ hhb --model-file bert_small_int32_input.onnx \
--input-name "input_ids;input_mask;segment_ids" \
--input-shape '1 384;1 384;1 384' \
--output-name "output_start_logits;output_end_logits" \
--board c920 \
--quantization-scheme "float16" \
--postprocess save_and_top5 \
-D \
--without-preprocess
-D
: Stops the HHB flow after the executable-generation stage
--model-file
: The downloaded BERT model in the current directory
--board
: Target platform; here, c920
--input-name
: Model input names
--output-name
: Model output names
--input-shape
: Model input shapes
--without-preprocess
: Skip preprocessing
--quantization-scheme
: Quantization method; here, float16
You can run hhb --help to view all available parameters and options.
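The three input tensors named in the command all share the fixed shape 1×384. As a rough illustration of how a tokenized question/context pair is packed into tensors of that shape (the real WordPiece tokenization lives in the example's bert scripts; the token ids below are illustrative placeholders), a minimal sketch:

```python
# Sketch: pack a tokenized question/context pair into the three
# fixed-shape (1 x 384) BERT inputs named in the hhb command.
# Token ids here are illustrative placeholders, not real WordPiece ids.
MAX_SEQ_LEN = 384
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # standard BERT special-token ids

def make_inputs(question_ids, context_ids, max_len=MAX_SEQ_LEN):
    # Layout: [CLS] question [SEP] context [SEP], zero-padded to max_len.
    ids = [CLS_ID] + question_ids + [SEP_ID] + context_ids + [SEP_ID]
    ids = ids[:max_len]
    n_question = min(len(question_ids) + 2, max_len)  # [CLS] + question + [SEP]
    segment_ids = [0] * n_question + [1] * (len(ids) - n_question)
    input_mask = [1] * len(ids)  # 1 = real token, 0 = padding
    pad = [PAD_ID] * (max_len - len(ids))
    return ids + pad, input_mask + [0] * len(pad), segment_ids + [0] * len(pad)

input_ids, input_mask, segment_ids = make_inputs([2054, 2136], [7573, 14169])
```

Each returned list has length 384, matching one row of the declared 1 384 input shapes.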
After execution, an hhb_out subdirectory will be generated in the current directory, containing the following files:
hhb.bm
: HHB model file, including quantized weights and related information
hhb_runtime
: Executable for the c920 CPU platform, compiled from the C files in the directory
main.c
: Reference entry point for the example program
model.c
: Model structure file
model.params
: Weights file
io.c
: File I/O helper functions
io.h
: Declarations for the helper functions
process.c
: Preprocessing functions
process.h
: Declarations for the preprocessing functions
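The generated hhb_runtime executable is typically driven from a script rather than invoked by hand. A hedged sketch of how a Python driver might call it, passing the model file and one preprocessed .bin file per model input (the exact argument order is an assumption; inference.py in the example repository is the authoritative reference):

```python
# Hedged sketch: invoke the HHB-generated executable with the model
# file and one preprocessed .bin file per model input. The argument
# order shown is an assumption, not taken from the HHB manual.
import subprocess

def build_command(runtime="hhb_out/hhb_runtime",
                  model="hhb_out/hhb.bm",
                  inputs=("sample_0_input_ids.bin",
                          "sample_0_input_mask.bin",
                          "sample_0_segment_ids.bin")):
    return [runtime, model, *inputs]

def run_model():
    # capture_output keeps the tensor-info / top5 log for later parsing
    return subprocess.run(build_command(), capture_output=True, text=True)
```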
For more details on HHB options, refer to the HHB User Manual.
Uploading and Running the Application
Upload to the Development Board
Package all files in this directory and upload them to the development board. For example, use the scp command to upload to the /home/debian/bert directory (replace <board-ip> with your board's address):
$ scp -r bert debian@<board-ip>:/home/debian/bert/
Alternatively, you may use other methods such as USB storage devices or network sharing.
Running the Program
On the development board, navigate to the /home/debian/bert directory and execute:
$ python3 inference.py
During execution, the terminal will display the progress of each stage:
- Preprocessing
- Model execution
- Post-processing
The files in this directory are:
inference.py
: Entry script that drives preprocessing, model execution, and post-processing
test1.json
: Test data, excerpted from the first entry of SQuAD 1.1
sample_0_input_ids.bin, sample_0_input_mask.bin, sample_0_segment_ids.bin
: Intermediate results generated during preprocessing
hhb_out/hhb_runtime
: Executable generated by HHB on the x86 host
hhb_out/hhb.bm
: Model file generated by HHB on the x86 host
sample_0_input_ids.bin_output0_1_384.txt, sample_0_input_ids.bin_output1_1_384.txt
: Output files from model execution
vocab.txt
: Vocabulary file for the BERT model
bert
: Python scripts for pre- and post-processing, adapted from the original BERT project
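The two output files hold one start logit and one end logit per token position; post-processing picks the answer span bounded by the two argmax positions. A simplified sketch (the real script also rejects invalid spans and maps token indices back to words via vocab.txt):

```python
# Simplified post-processing sketch: the answer span runs from the
# argmax of the start logits to the argmax of the end logits.
def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def extract_span(start_logits, end_logits):
    start, end = argmax(start_logits), argmax(end_logits)
    # Fall back to a single-token span if the logits disagree.
    return (start, end) if end >= start else (start, start)

# Toy logits with peaks at token positions 46 and 47, mirroring the
# top-1 indices in the sample run.
start = [-10.0] * 384
end = [-10.0] * 384
start[46], end[47] = 3.79, 3.55
print(extract_span(start, end))  # (46, 47)
```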
The reference input in this example is taken from the SQuAD dataset: the passage describes a football game, and the question asks which team played in it.
[Context]: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50.
[Question]: Which NFL team represented the AFC at Super Bowl 50?
Sample output:
$ python3 inference.py
********** preprocess test **********
[Context]: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California. As this was the 50th Super Bowl, the league emphasized the "golden anniversary" with various gold-themed initiatives, as well as temporarily suspending the tradition of naming each Super Bowl game with Roman numerals (under which the game would have been known as "Super Bowl L"), so that the logo could prominently feature the Arabic numerals 50.
[Question]: Which NFL team represented the AFC at Super Bowl 50?
******* run bert *******
data pointer: 0x33a2ff70
=== tensor info ===
shape: 1 384
data pointer: 0x33a31590
=== tensor info ===
shape: 1 384
data pointer: 0x33a32bb0
=== tensor info ===
shape: 1 384
data pointer: 0x33b54820
The max_value of output: 3.794922
The min_value of output: -9.976562
The mean_value of output: -8.417037
The std_value of output: 5.098144
============ top5: ===========
46: 3.794922
57: 3.113281
39: 1.210938
38: 1.121094
27: 0.603027
=== tensor info ===
shape: 1 384
data pointer: 0x33b54510
The max_value of output: 3.550781
The min_value of output: -9.632812
The mean_value of output: -7.799953
The std_value of output: 4.787047
============ top5: ===========
47: 3.550781
58: 3.437500
32: 2.523438
29: 1.539062
41: 1.395508
********** postprocess **********
[Answer]: Denver Broncoss
Expected output:
[Answer]: Denver Broncoss