zkml transpiler
Aleo transpiler for building zero-knowledge machine learning applications.
This project provides a Python library named zkml that transpiles scikit-learn machine learning models into a Leo project for inference. The resulting Leo project can then be run and executed from Python. The project is at an early stage.
Supported machine learning models currently include:
- Decision tree classifiers
- Multilayer perceptron neural network regressors
Usage
The zkml Python library is available on PyPI for installation.
Prerequisites
Python
Ensure you have Python 3.9.6 or newer installed.
- Verify with:
python3 --version
- If not installed, follow the instructions here.
Leo
Ensure you have Leo version 1.9.3 or newer installed.
- Verify with:
leo --version
- If necessary, update:
leo update
- Installation guide: Leo Installation
Installation
You can install the zkml Python library from PyPI using the following command:
pip3 install zkml
Note: On some systems, you may need to use pip instead of pip3.
Alternatively, you can install from the .whl file or in editable mode from the GitHub repository.
Usage
Below is a brief description of the classes and functions provided by the library. Detailed documentation is in progress and will be available soon. We encourage you to also check out the examples on GitHub.
- In a first step, create an object of the class zkml.LeoTranspiler(model, validation_data).
  - For the model parameter, pass the trained scikit-learn model.
  - For the validation_data parameter, pass the training or validation dataset. While this parameter is not strictly required, we recommend using it. The dataset is used to compute a fixed-point scaling factor and the required Leo integer types, which helps to ensure numerical stability in the inference computation. The larger the dataset, the better.
  - When the model is a scikit-learn multilayer perceptron regressor, there are two additional optional parameters. The first is the string data_representation_type, which can take the values int or field. It determines whether the data is represented as integers or fields inside the circuit. The default is int; the field mode is experimental. The second is the boolean layer_wise_fixed_point_scaling_factor. The default is True, which lets the fixed-point scaling factor grow from layer to layer. This can reduce the constraint usage, because no division is needed after multiplying two fixed-point numbers. For deep MLP networks (more than 3 layers) or large fixed-point scaling factors, however, it may lead to large integers inside the circuit, and setting it to False may be worth trying.
- In a second step, start the transpilation process through leo_transpiler.to_leo(path, project_name, model_as_input, fixed_point_scaling_factor).
  - For the path parameter, pass the path where the new Leo project should be stored.
  - For the project_name parameter, pass the desired name of the Leo project.
  - The boolean model_as_input parameter is optional and defaults to False. If set to False, the model parameters (i.e., thresholds, weights, biases) are hardcoded in the Leo code. If set to True, these model parameters are treated as additional circuit inputs.
  - The integer fixed_point_scaling_factor parameter is optional and defaults to None, in which case the transpiler automatically computes a recommended fixed-point scaling factor. In certain cases, this computed factor can be too large by a wide margin, which can result in constraint-rich circuits or numerical instability in inference. With this parameter, you can set the fixed-point scaling factor manually. We recommend powers of 2, starting with small values such as 8, 16, or 32. Generally, a higher fixed-point scaling factor leads to more accurate computations at the cost of a higher circuit constraint count.
- After transpiling, you can run the computation of the Leo project through leo_transpiler.run(input), which returns an object of class LeoComputation.
  - For the input parameter, pass the inference data sample or dataset.
- Similarly, after transpiling, you can execute the Leo project through leo_transpiler.execute(input), which returns an object of class ZeroKnowledgeProof.
  - For the input parameter, pass the inference data sample or dataset.
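The role of the fixed-point scaling factor and of the Leo integer types can be illustrated with a small, self-contained sketch. It is not the transpiler's internal algorithm. The helper names to_fixed_point, fixed_point_dot, and smallest_signed_type are hypothetical and exist only for this illustration. The sketch shows how real values become integers, why a multiplication squares the scale, and how skipping the rescaling division (the idea behind a scaling factor that grows per layer) trades a division for larger integers.

```python
def to_fixed_point(values, scaling_factor):
    """Encode real numbers as integers: multiply by the scaling factor, then round."""
    return [round(v * scaling_factor) for v in values]


def fixed_point_dot(xs, ws, scaling_factor, rescale=True):
    """Dot product of two fixed-point vectors (illustrative helper, not part of zkml).

    Each product x * w carries a scale of scaling_factor ** 2.
    With rescale=True, one integer division brings the result back to
    scaling_factor. With rescale=False, the division is skipped and the
    result keeps the squared scale, so the integers grow.
    """
    acc = sum(x * w for x, w in zip(xs, ws))
    # Python's // rounds toward negative infinity; a circuit may round differently.
    return acc // scaling_factor if rescale else acc


def smallest_signed_type(max_abs_value):
    """Return the smallest Leo signed integer type name that can hold the value."""
    for bits in (8, 16, 32, 64, 128):
        if max_abs_value <= 2 ** (bits - 1) - 1:
            return f"i{bits}"
    raise OverflowError("value does not fit into a 128-bit signed integer")


scale = 16
features = to_fixed_point([0.5, -1.25], scale)   # [8, -20]
weights = to_fixed_point([2.0, 0.75], scale)     # [32, 12]

rescaled = fixed_point_dot(features, weights, scale)                  # scale 16
unscaled = fixed_point_dot(features, weights, scale, rescale=False)   # scale 256

# Both decode to the same real number, 0.5 * 2.0 + (-1.25) * 0.75 = 0.0625.
print(rescaled / scale, unscaled / scale ** 2)
print(smallest_signed_type(max(abs(v) for v in features + weights)))
```

A larger scaling factor keeps more fractional precision but pushes intermediate values toward wider integer types, which is why a smaller manual value can be worth trying when the automatically computed factor turns out too large.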
Building Python Apps
Please check out the examples on GitHub.
Further documentation and tutorials on how to use the zkml Python library will follow soon.
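As a starting point, the workflow described in the Usage section can be sketched end to end. This is a minimal sketch, not an official example: it assumes that LeoTranspiler accepts the documented parameter names as keyword arguments, and the function names, project name, and output directory are hypothetical choices. The zkml and scikit-learn imports are placed inside the functions so the module can be loaded even where those packages or Leo are not installed.

```python
import os


def transpile_and_run(model, validation_data, sample, project_dir, project_name):
    """Transpile a trained scikit-learn model to Leo, then run and execute it.

    Assumes the documented parameter names can be passed as keywords.
    """
    # Imported lazily: requires `pip3 install zkml` and a working Leo installation.
    from zkml import LeoTranspiler

    # Step 1: create the transpiler; the dataset is used to derive the
    # fixed-point scaling factor and the Leo integer types.
    leo_transpiler = LeoTranspiler(model=model, validation_data=validation_data)

    # Step 2: generate the Leo project. With model_as_input=False (the
    # default), the model parameters are hardcoded in the Leo code.
    leo_transpiler.to_leo(
        path=project_dir,
        project_name=project_name,
        model_as_input=False,
    )

    # Step 3: run the computation (returns a LeoComputation object).
    computation = leo_transpiler.run(input=sample)

    # Step 4: execute the project (returns a ZeroKnowledgeProof object).
    proof = leo_transpiler.execute(input=sample)
    return computation, proof


def demo():
    """Train a small decision tree on the iris dataset and transpile it."""
    # Imported lazily: requires scikit-learn.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

    # Hypothetical output location and project name.
    project_dir = os.path.join(os.getcwd(), "tmp", "iris_tree")
    return transpile_and_run(model, X, X[0], project_dir, "iris_tree")
```

Calling demo() performs the full pipeline. The attributes of the returned LeoComputation and ZeroKnowledgeProof objects are not covered here, so the examples on GitHub remain the reference for inspecting the results.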