Neural Network Inference with Leo
April 12, 2023


The complete source code for this article can be found on GitHub.


Artificial intelligence (AI) can solve many tasks that previously required human intelligence, enhancing the capabilities of software systems. Most AI systems nowadays are built upon neural networks that have proven to be capable of performing complex tasks and have paved the way for many of the recent AI breakthroughs. 

The typical AI workflow consists of two phases: training and inference. Throughout this workflow, the data-intensive nature of modern AI systems raises privacy concerns. During training, it is often the training data that needs protection. Afterward, the model parameters, as well as the model input and output, may need protection. The following table compares both phases and the data to protect.

AI engineering stage | Model training | Model inference
Goal | Create an ML model that learns from the training data | Use an ML model to make a decision based on features
Input | Many data points with features and a label (training data) | One data point with only features (no label)
Output | ML model | Prediction
Computational intensity | |
Potential data to protect | Training data, model | Input features, model, prediction

In this article, we focus on running the inference of multilayer perceptron neural networks in zk-SNARKs. That is, given input features, we compute the output of a neural network inside a zk-SNARK. As the table highlights, there is a wide range of data we may want to protect in this computation, such as the input features, the model, or even the output prediction. For this, we use the Leo programming language.

Inference computation of Neural Networks

A neural network is a mathematical function that deterministically transforms input values into output values. A neural network comprises multiple connected neurons, which themselves are mathematical functions transforming input values to output values.

To understand how a neural network computes output values, we first look at how to compute the output of individual neurons. Neurons consist of an activation function and two parameters - a weight parameter and a bias parameter.

A popular activation function a(x) for NNs is the ReLU (Rectified Linear Unit) activation function, which is defined by:

a(x) = max(0,x)

Before the activation function is applied, the input is multiplied by the weight and the bias is added. With a weight of w=1.5 and a bias of b=0.5, a neuron computes the following function:

a(1.5x + 0.5) = max(0, 1.5x + 0.5)

An input of x=1 then creates an output of 2.
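
As a quick check of the single-neuron example, here is a minimal Python sketch. The function names `relu` and `neuron` are illustrative and do not come from the article's source code.

```python
def relu(x):
    """ReLU activation: a(x) = max(0, x)."""
    return max(0, x)


def neuron(x, weight, bias):
    """Weighted input plus bias, passed through the ReLU activation."""
    return relu(weight * x + bias)


# Values from the text: w = 1.5, b = 0.5, x = 1
print(neuron(1, weight=1.5, bias=0.5))  # 2.0
```

For a negative pre-activation, such as x = -1 with the same parameters, the ReLU clips the output to 0.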

We can now connect different neurons to create a neural network. This example demonstrates a neural network with two inputs, two neurons in the middle layer (referred to as hidden layer), and one output. We illustrate the NN architecture in the figure below.

While the hidden layer and output layer have a ReLU activation function, the input layer typically has a linear activation function a(x)=x. This architecture translates to the mathematical function:


We obtain this function by looking at the last output neuron first, and then substituting with the layers before. For input values of x0 = 0.5 and x1 = 1, the correct output is 1.25+2.25+0.25=3.75.
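
The layer-by-layer substitution can be sketched in Python as follows. This is only an illustration: the weights and biases below are hypothetical values, not the ones from the article's figure, chosen so that the three terms of the output neuron come out as 1.25, 2.25, and 0.25. The sketch also assumes that the input layer passes its values through unchanged, and the helper name `dense_layer` is an assumption.

```python
def relu(x):
    """ReLU activation: a(x) = max(0, x)."""
    return max(0.0, x)


def dense_layer(inputs, weights, biases):
    """Compute one layer with ReLU; weights[j][i] connects input i to neuron j."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters (NOT taken from the article's figure).
hidden_weights = [[4.0, 0.0], [0.0, 1.0]]  # each input feeds one hidden neuron
hidden_biases = [0.5, 0.5]
output_weights = [[0.5, 1.5]]
output_biases = [0.25]

inputs = [0.5, 1.0]  # x0 and x1 from the text
hidden = dense_layer(inputs, hidden_weights, hidden_biases)
output = dense_layer(hidden, output_weights, output_biases)
print(output)  # [3.75]
```

With these assumed parameters, the hidden neurons output 2.5 and 1.5, so the output neuron computes 0.5·2.5 + 1.5·1.5 + 0.25 = 1.25 + 2.25 + 0.25 = 3.75.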

Generally, NNs can approximate a wide array of functions, a result known as the universal approximation theorem.

Fixed point numbers for neural networks

As in the example above, we often need to represent and compute with non-integer values; otherwise, results may be wrong. This is especially important for deeper neural networks, where errors may compound over multiple layers. zk-SNARK-based programming languages such as Leo do not support non-integer numbers by default, but we can work around that. One convenient way to do this is to use fixed-point numbers, which we discussed in a previous article. With fixed-point numbers, we can represent and compute with fractional parts of numbers.
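
To make the idea concrete, here is a minimal Python sketch of fixed-point arithmetic with a scaling factor of 100, the factor used later in this article. The helper names `to_fixed`, `from_fixed`, and `fixed_mul` are illustrative assumptions. Note also that Python's `//` rounds toward negative infinity, which may differ from Leo's integer division for negative values.

```python
SCALE = 100  # scaling factor: keeps two decimal places


def to_fixed(value):
    """Encode a real number as a scaled integer."""
    return round(value * SCALE)


def from_fixed(fixed):
    """Decode a scaled integer back to a real number."""
    return fixed / SCALE


def fixed_mul(a, b):
    """Multiply two fixed-point numbers.

    The raw product carries the scaling factor twice, so we divide once.
    """
    return (a * b) // SCALE


w = to_fixed(1.5)  # 150
x = to_fixed(1.0)  # 100
b = to_fixed(0.5)  # 50

y = fixed_mul(w, x) + b  # additions need no correction
print(y, from_fixed(y))  # 200 2.0
```

Only multiplication needs the extra division. Adding two scaled integers already yields a correctly scaled result.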

Implementation of Neural Networks in Leo

To implement a neural network in Leo, we pass the neural network weights, biases, and the function input x as program input parameters. The neural network architecture is hardcoded, and the program computes and returns the network output. The neurons in the hidden and output layers use the ReLU activation function.

The following code for a Leo circuit computes the output of the neural network. We compute the output from left to right through the network: first the outputs of the two neurons in the first layer, then the hidden layer, and finally the output layer. The computation is based on fixed-point numbers. After each multiplication of fixed-point numbers, we need to rescale the result, as described in the fixed-point article.


We input the numbers described in the graphic above, in the fixed-point representation (number multiplied by 100).

The output of the circuit then is as follows:


Dividing by 100 to account for the fixed-point number format, we obtain the result 3.75, as above.
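
The same integer-only computation can be mimicked outside of Leo. The Python sketch below is not the article's Leo circuit: the weights and biases are hypothetical values (already multiplied by 100) chosen so that the result matches 3.75 for the inputs 0.5 and 1, and the helper name `fixed_neuron` is an assumption.

```python
SCALE = 100  # fixed-point scaling factor


def relu(x):
    """ReLU on integers: max(0, x)."""
    return max(0, x)


def fixed_neuron(inputs, weights, bias):
    """Neuron on scaled integers; rescale after every multiplication."""
    total = bias
    for w, x in zip(weights, inputs):
        total += (w * x) // SCALE
    return relu(total)


# Inputs from the text in fixed-point form: 0.5 -> 50, 1.0 -> 100
x = [50, 100]

# Hypothetical parameters (NOT the article's values), scaled by 100.
h0 = fixed_neuron(x, [400, 0], 50)           # 250, i.e. 2.5
h1 = fixed_neuron(x, [0, 100], 50)           # 150, i.e. 1.5
out = fixed_neuron([h0, h1], [50, 150], 25)  # 125 + 225 + 25

print(out, out / SCALE)  # 375 3.75
```

Every intermediate value stays an integer, which is what makes this style of computation expressible in a language without native fractional numbers.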

In terms of complexity, the circuit is relatively large at 176,189 constraints; multiplications and divisions in particular add many constraints.

Python script to automatically generate neural network Leo code

Now, let us analyze deeper neural networks with more layers. These deep neural networks are behind the recent astonishing advances in AI. In the above example, we can adjust the weights and biases. What we cannot adjust, however, is the number of weights and biases, that is, the neural network architecture. We also cannot pass an arbitrary number of inputs, since Leo is not Turing-complete and does not allow, for example, loops with a dynamic number of iterations. However, we can write a program that generates Leo code for arbitrary neural network architectures, so that we don’t need to code the Leo NN by hand. Python is a good choice for such a program.

All we need as input for such an NN generator is the number of layers and neurons per layer, a scaling factor for the fixed-point numbers, and the integer type used. The file presents a program capable of doing this. You can modify the number of neurons per layer, the scaling factor for the fixed-point computation, as well as the integer type.
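
The core idea of such a generator is to unroll the loops over layers and neurons at generation time, so that the generated program contains no dynamic loops. The Python sketch below illustrates only this unrolling step. It is not the article's generator: it emits generic expression strings rather than valid Leo code, it ignores the integer type, and the function name and the naming scheme for weights (`w_`), biases (`b_`), and neuron outputs (`n_`) are assumptions.

```python
def generate_neuron_expressions(layer_sizes, scale=100):
    """Unroll a fully connected network into one expression string per neuron.

    The output is generic pseudo-syntax for illustration, not valid Leo code.
    n_0_i denotes the i-th network input.
    """
    lines = []
    for layer in range(1, len(layer_sizes)):
        for j in range(layer_sizes[layer]):
            # One rescaled product per neuron of the previous layer
            terms = [
                f"w_{layer}_{i}_{j} * n_{layer - 1}_{i} / {scale}"
                for i in range(layer_sizes[layer - 1])
            ]
            expr = " + ".join(terms + [f"b_{layer}_{j}"])
            lines.append(f"n_{layer}_{j} = relu({expr})")
    return lines


# Two inputs, two hidden neurons, one output neuron
lines = generate_neuron_expressions([2, 2, 1])
for line in lines:
    print(line)
```

Adding a hidden layer then only means changing the list of layer sizes, for example to `[2, 2, 2, 1]`.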


Output of the program

Let’s run this Python file on the latest Python 3 version and check the output. It generates two files, a Leo code file main.leo and an inputs file. The neural network is of the same size as the one above and very similar, with two minor changes. First, the automatically generated neural network returns an array, in this specific case of size 1, since we have one output neuron. Second, the automatically generated neural network is more general and allows for all possible connections of input-layer neurons to middle-layer neurons.


We can now insert these two files into a Leo project, for example, in Aleo Studio, and use the “leo run” command. The program compiles successfully with 176,091 constraints, very similar to the example above. Using a Python program, we have automatically recreated the neural network that we first implemented by hand.


Now, we can generate deeper neural networks. The layers between the input layer and the output layer are referred to as hidden layers. In the above example, we have a three-layer neural network with one hidden layer. Let us add another hidden layer with two neurons, meaning we change the input code to the following line:

We now run the Python program and then run the generated Leo code with “leo run”. The circuit now has 236,997 constraints.

Let’s try three hidden layers:

We now obtain a circuit with 351,863 constraints.

Four hidden layers give us 439,749 constraints, five hidden layers 527,635 constraints, six hidden layers 615,521 constraints, and so on. Let’s plot this on a graph.

The number of circuit constraints grows roughly linearly with the number of hidden layers: from three hidden layers onward, each additional hidden layer adds 87,886 constraints. This is a promising outlook for building deep neural network-based applications on Leo.
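
The trend can be checked directly from the numbers. The short Python sketch below uses the constraint counts exactly as reported in this article and computes the increase per additional hidden layer. The variable names are illustrative.

```python
# Constraint counts as reported above, keyed by the number of hidden layers.
constraints = {
    1: 176091,
    2: 236997,
    3: 351863,
    4: 439749,
    5: 527635,
    6: 615521,
}

layers = sorted(constraints)

# Increase in constraints for each additional hidden layer
increments = [constraints[b] - constraints[a] for a, b in zip(layers, layers[1:])]

# Average increase per layer over the whole range
average = (constraints[layers[-1]] - constraints[layers[0]]) / (len(layers) - 1)

print(increments)  # [60906, 114866, 87886, 87886, 87886]
print(average)     # 87886.0
```

In the counts as reported, the first two increments deviate from the later constant step, while the average step over the whole range is the same 87,886 constraints per hidden layer.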