Convert ONNX Model to HEF with Docker
HEF (Hailo Executable Format) is a file format used by Hailo devices to run deep learning models efficiently on edge hardware. Hailo provides specialized AI acceleration, enabling optimized inference for neural networks.
In this blog post, we will walk through the process of converting an ONNX model into a HEF model using Docker. This method allows for cross-platform compatibility, meaning you can follow this tutorial on any operating system. In this case, we will demonstrate the process on Windows.
Prerequisites
Before proceeding, ensure you have the following:
ONNX Model: The model architecture must be one of those supported by the Hailo Model Zoo. In this example, we use a YOLOv8n model from Ultralytics trained to detect vehicles and license plates.
Docker: Install Docker by following the official installation guide.
Calibration Images: These images help the compiler optimize the model for Hailo hardware, and can be the same images used for training the model, as long as they represent the deployment environment.
According to Hailo's guidelines, the calibration set should typically consist of 32 to 96 images. However, discussions on the Hailo community forums often recommend a larger set of 500 to 1,000 images for improved quantization accuracy.
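If your training images are on the same machine, a simple way to assemble a calibration set is to copy a random sample of them into a dedicated folder. Below is a minimal sketch assuming a Bash-like shell (WSL or Git Bash on Windows); the directory names and sample size are illustrative:

# pick 500 images at random from the training set and copy them
# into a folder that will later be mounted into the container
mkdir -p calibration_images
ls training_images/*.jpg | shuf -n 500 | xargs -I{} cp {} calibration_images/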
1. Download the Hailo AI Software Suite Docker Image
Go to the Software Downloads (Hailo) page and download the Hailo AI Suite - Docker version for Linux:
In this case, we downloaded the 2024-10 version of the archive, but any version should work. The download is a .zip file containing two files: hailo_ai_sw_suite_2024-10.tar.gz and hailo_ai_sw_suite_docker_run.sh.
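If you are working from a Bash-like shell (e.g. Git Bash or WSL), you can extract the archive and check its contents from the command line (the .zip filename below is illustrative and may differ for your version):

unzip hailo_ai_sw_suite_docker_2024-10.zip
ls
# hailo_ai_sw_suite_2024-10.tar.gz  hailo_ai_sw_suite_docker_run.sh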
2. Load the Docker Image
To load the Docker image, open the command line, go to the unzipped folder, and run the following command:
docker load -i hailo_ai_sw_suite_2024-10.tar.gz
Then, verify that the image has been loaded by running:
docker images
You should see the hailo_ai_sw_suite_<version> image listed:
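The listing should look roughly like this (image ID, creation date, and size are placeholders):

REPOSITORY                  TAG   IMAGE ID     CREATED       SIZE
hailo_ai_sw_suite_2024-10   1     <image id>   <date>        <size>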
3. Create a Docker Container
Before creating the container, store the ONNX model weights you want to convert and some sample images in a directory. In this example, we created C:\Users\tinte\Downloads\vehicle_detection containing two subdirectories, models and images:
.\vehicle_detection
|-- models
|   |-- model.onnx
|-- images
|   |-- image1.jpg
|   |-- image2.jpg
|   |-- ...
Once everything is ready, execute the following command:
docker run -it --name hailo_container \
--privileged \
--network host \
-v /dev:/dev \
-v /lib/firmware:/lib/firmware \
-v /lib/udev/rules.d:/lib/udev/rules.d \
-v /lib/modules:/lib/modules \
-v //c/Users/tinte/Downloads/vehicle_detection:/workspace \
hailo_ai_sw_suite_2024-10:1
The option -v //c/Users/tinte/Downloads/vehicle_detection:/workspace tells Docker to mount the local directory C:\Users\tinte\Downloads\vehicle_detection into the container under the /workspace folder. This means that everything inside the vehicle_detection folder (including the models and images subdirectories) will be accessible from within the container at /workspace.
Modify the first part of the path (//c/Users/tinte/Downloads/vehicle_detection) to match the location of your own files. You can also change the target directory (/workspace), but it is recommended to keep it consistent with this tutorial.
If the command does not work on Windows (as was the case for me), try running it as a single line:
docker run -it --name hailo_container --privileged --network host -v /dev:/dev -v /lib/firmware:/lib/firmware -v /lib/udev/rules.d:/lib/udev/rules.d -v /lib/modules:/lib/modules -v //c/Users/tinte/Downloads/vehicle_detection:/workspace hailo_ai_sw_suite_2024-10:1
If successful, you will now be inside the container. Navigate to the /workspace folder to find your images and models.
If you don't see your files, check the path (//c/Users/tinte/Downloads/vehicle_detection) specified in the Docker command; how the Windows path must be written depends on the shell you run Docker from, as shown below.
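For reference, these are two common ways of writing the same host path with Docker Desktop on Windows (both shown for the directory used in this tutorial; adapt them to your own location):

# Git Bash / MSYS-style path (used in this tutorial)
-v //c/Users/tinte/Downloads/vehicle_detection:/workspace

# PowerShell / CMD-style path
-v C:\Users\tinte\Downloads\vehicle_detection:/workspace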
4. Convert ONNX Model to HEF
Once inside the container, navigate to the models directory (/workspace/models) and execute the following commands:
A. Parse the ONNX Model
hailomz parse --hw-arch hailo8l --ckpt model.onnx yolov8n
Replace model.onnx with the name of your ONNX model file, and replace yolov8n with the Hailo Model Zoo name of your model's architecture. This creates a .har file, which is the Hailo model representation.
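If the command succeeds, a new .har file should appear next to your ONNX model (for this tutorial the network name yolov8n produces yolov8n.har; verify the name in your own run):

ls
# model.onnx  yolov8n.har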
B. Optimize the Model
hailomz optimize --hw-arch hailo8l --calib-path ../images --resize 640 640 --classes 2 --har yolov8n.har yolov8n --performance
This step fine-tunes the model for the Hailo hardware by applying optimizations based on the calibration dataset:
- Replace ../images with the path to your calibration images directory. This isn't necessary if you followed the same structure as in this tutorial.
- Ensure --resize 640 640 matches the resolution used during training. In our case, the YOLOv8 model was trained on 640x640 images.
- --classes 2 specifies the number of object classes detected by the model; modify it according to your model.
- --har yolov8n.har should be replaced with the name of your generated .har file.
- Replace yolov8n with the model architecture used in your specific case.
- The --performance flag prioritizes speed while maintaining accuracy.
When the command completes successfully, you should see an output confirming that the model optimization process is finished and that the updated .har file has been saved.
C. Compile the Model into a HEF File
hailomz compile yolov8n --hw-arch hailo8l --calib-path ../images --resize 640 640 --classes 2 --har yolov8n.har --performance
This final step converts the optimized model into a .hef file, which is deployable on Hailo hardware.
- --hw-arch hailo8l specifies the target hardware architecture. In this case, the command compiles the model specifically for the Hailo-8L accelerator. If you are using a different Hailo device, ensure this flag matches your hardware.
- Replace ../images with the path to your calibration images.
- --har yolov8n.har specifies the previously generated .har file. Change the name to match your file.
- Replace yolov8n with the model architecture used in your specific case.
- Modify --resize 640 640 and --classes 2 according to your model's specifications.
- The --performance flag ensures the model is compiled for optimal inference speed (optional, read below).
Enabling performance optimization (using the --performance flag) initiates an exhaustive search for optimal throughput, which can significantly extend compilation time. It is recommended to first compile without performance mode to ensure successful compilation before enabling it.
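Following that advice, a first pass can simply omit the flag and keep everything else identical; once it completes without errors, re-run the command with --performance added:

hailomz compile yolov8n --hw-arch hailo8l --calib-path ../images --resize 640 640 --classes 2 --har yolov8n.har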
By following these steps, you will successfully convert an ONNX model into a HEF file.
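Because /workspace is mounted from your host directory, the compiled file is also available outside the container. The exact output filename depends on the network name you used; for this tutorial it is typically yolov8n.hef, but verify it in your own run:

# inside the container
ls /workspace/models/*.hef

On the Windows host, the same file appears under C:\Users\tinte\Downloads\vehicle_detection\models, ready to be deployed to your Hailo device.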