C API

This document lists the application programming interface (API) exported by the generated C source code. With these APIs, applications can:

  • write input into the neural network,

  • start inference of the given neural network on microcontrollers, and

  • read output from the neural network.

Including the inference.h header makes the APIs described in this document visible.

#include "inference.h"

Data Structure

Neural Network Model

An onnc_model_t object represents a neural network model generated by Tiny ONNC. You can obtain a model with the onnc_open_<name>_model() function.

interface
struct onnc_model_t;

Raw Tensor

An onnc_raw_tensor_t object represents a tensor inside a model descriptor, such as feature maps, weights, and biases.

interface
struct onnc_raw_tensor_t { int8_t* data; size_t size; };
data

the memory space of the tensor

size

the size of the tensor
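
The two fields are typically used together to walk over the quantized values. A minimal sketch, assuming inference.h is included and pTensor points to a valid raw tensor; count_nonzero is a hypothetical helper, not part of the generated API:

size_t count_nonzero(const struct onnc_raw_tensor_t* pTensor)
{
  size_t count = 0;
  for (size_t i = 0; i < pTensor->size; ++i) {
    if (pTensor->data[i] != 0) // each element is a quantized int8_t value
      ++count;
  }
  return count;
}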


Standard Procedures

This section describes all the API functions exposed to applications.

Getting the Neural Network Model

onnc_open_<name>_model returns a pointer to the neural network model.

interface
struct onnc_model_t* onnc_open_<name>_model();
return

The generated neural network model

To support running multiple models in one application, Tiny ONNC distinguishes models by opening each one through its own function call: the model name is embedded in the function name. For example, if you set the model name to my_net, Tiny ONNC produces a function named onnc_open_my_net_model, which is called as shown below.
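
For example, assuming the model was named my_net (a hypothetical name), the application opens it with:

struct onnc_model_t* model = onnc_open_my_net_model();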

Developers can set the model name with either the command-line tool or the Python API. In the command-line tool, the -o option sets the model name.

onnc.cortexm -o <model_name>

In the Python API, the name is set in onnc.bench.launch.

onnc.bench.launch(name=<name>, device="m487")

Writing Inputs

onnc_write_input copies the given input array into the model and quantizes its values.

interface
/**
 * Write the input data into the NN model
 * @retval -1 failure
 * @return the number of elements written into the NN's input array. A given @ref pInput array may be larger than the NN model can accept; the function returns the number of elements of @ref pInput moved into the NN model.
 * @param[in,out] pModel the NN model
 * @param[in]     pInput the array of inputs
 * @param[in]     pSize  the size of the input array
 */
int onnc_write_input(onnc_model_t* pModel, float pInput[], unsigned int pSize);
pModel: onnc_model_t*

The neural network model, obtained from onnc_open_<name>_model.

pInput: float*

The input array. The function copies the values of this array into the model's internal raw tensors and quantizes them.

pSize: unsigned int

The size of the input array.

return

The number of elements written into the raw input tensor. Returns -1 when an error occurs.
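
A minimal usage sketch, assuming model was obtained from onnc_open_<name>_model and an input size of 64 (a hypothetical number; use your model's real input size):

float input_array[64];
// ... fill input_array with the application's input data ...
int written = onnc_write_input(model, input_array, 64);
if (-1 == written) {
  // handle the error
}
// written may be smaller than 64 when the model's input tensor holds fewer elements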

Model Inference

onnc_inference runs inference on the neural network and returns the raw output tensor.

If you do not need to de-quantize the output values, you can operate on the raw output tensor directly. For example, if the last layer is monotonic and all you need is to find the largest element, you can search the raw tensor directly, as shown in the sketch at the end of this section.

interface
/**
 * Do inference on the @ref pModel
 * @retval NULL failure
 * @return The raw output tensor, for users who do not need to de-quantize the output values.
 * @param[in]  pModel The NN model.
 */
onnc_raw_tensor_t* onnc_inference(onnc_model_t* pModel);
pModel: onnc_model_t*

The neural network model, obtained from onnc_open_<name>_model.

return

The raw output tensor. Returns a null pointer when an error occurs.
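
As mentioned above, a monotonic last layer lets you work on the raw tensor directly. A minimal sketch that finds the index of the largest quantized output element, assuming model was obtained from onnc_open_<name>_model:

onnc_raw_tensor_t* raw_output = onnc_inference(model);
if (NULL == raw_output) {
  // handle the error (e.g., return)
}
size_t best = 0;
for (size_t i = 1; i < raw_output->size; ++i) {
  if (raw_output->data[i] > raw_output->data[best])
    best = i; // best tracks the index of the largest raw (quantized) value
}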

Reading Outputs

onnc_read_output de-quantizes the output data and copies it into the given output array.

interface
/**
 * Read the output values from the NN model @ref pModel into the output array @ref pOutput
 * @retval -1 failure
 * @return The number of elements moved into the @ref pOutput array. A given @ref pOutput array may be larger than the NN model's output; the function returns the number of elements actually moved into @ref pOutput.
 * @param[in]  pModel  the NN model.
 * @param[out] pOutput the output array.
 * @param[in]  pSize   the size of the @ref pOutput array.
 */
int onnc_read_output(onnc_model_t* pModel, float pOutput[], unsigned int pSize);
pModel: onnc_model_t*

The neural network model, obtained from onnc_open_<name>_model.

pOutput: float*

The output array. This function will de-quantize the raw output tensor and copy it into the given output array.

pSize: unsigned int

The size of the output array.

return

The number of elements written into the output array. Returns -1 when an error occurs.
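
A minimal usage sketch, assuming model was obtained from onnc_open_<name>_model and an output capacity of 10 (a hypothetical number; use your model's real output size):

float output_array[10];
int copied = onnc_read_output(model, output_array, 10);
if (-1 == copied) {
  // handle the error
}
// output_array[0] .. output_array[copied - 1] now hold de-quantized output values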

Releasing the Model

onnc_close releases the memory space inside a model.

interface
/**
 * Release the memory of the NN model @ref pModel
 * @retval -1 failure
 * @retval 0  success
 * @param[in,out] pModel The NN model.
 */
int onnc_close(onnc_model_t* pModel);
pModel: onnc_model_t*

The neural network model, obtained from onnc_open_<name>_model.

return

Returns 0 when the model is closed successfully, or -1 when an error occurs.

Example

Here is an example of using the C API.
#include "inference.h"
#include "XYZ_model.h"
#include <stdlib.h>

int main(int pArgc, char* pArgv[])
{
  struct onnc_model_t* model = onnc_open_XYZ_model();

  unsigned int num_inputs = 10; // magic number
  float* input_array = (float*)malloc(sizeof(float)*num_inputs);
  // fill input_array with the application's input data before writing it into the model

  if (-1 == onnc_write_input(model, input_array, num_inputs)) {
    return EXIT_FAILURE;
  }

  onnc_raw_tensor_t* output = onnc_inference(model); // raw data of the tensor
  if (NULL == output) {
    return EXIT_FAILURE;
  }

  unsigned int num_outputs = 5; // magic number
  float* output_array = (float*)malloc(sizeof(float)*num_outputs);
  if (-1 == onnc_read_output(model, output_array, num_outputs)) {
    return EXIT_FAILURE;
  }

  if (-1 == onnc_close(model)) {
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
}