zkML IRL: Practical use cases for more secure models
Machine learning models rely on large amounts of data to churn out accurate outputs that appear almost magical in their ability to understand us. But, as the amount of data ingested into machine learning projects grows, privacy flaws will become more and more apparent — and there may be a tipping point where users are no longer willing to trade their privacy for the output.
Before the industry develops further, models need to have a layer of accountability built in. Users need to trust that their data won’t be abused, that the model hasn’t been hacked or altered, and that models meet industry data privacy standards. Right now, the standard isn’t being met — but we have the opportunity to change that.
Zero-knowledge proofs are one way for developers to create and run machine learning models that prove a computation was done correctly while being free to choose which properties to make public. The result is the best of both worlds: interesting, personalized outputs based on secure, trustworthy models.
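The workflow described above can be sketched as a toy Python interface. Everything here is illustrative and assumed for the example: `commit` is a plain hash standing in for a hiding, binding cryptographic commitment, and the "proof" is a placeholder a real zk-SNARK prover would generate — this sketch is neither zero-knowledge nor sound, but it shows which values stay private and which are made public.

```python
import hashlib
import json

def commit(obj):
    # Toy commitment: hash of canonical JSON. A real system would use a
    # binding, *hiding* cryptographic commitment (e.g. Pedersen).
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def linear_model(weights, features):
    # Stand-in for any ML model; a dot product keeps the example small.
    return sum(w * x for w, x in zip(weights, features))

def prove_inference(weights, features):
    # Prover side: run the model privately, then publish only the output
    # plus a "proof". Here the proof is just a hash binding the committed
    # model and committed input to the output -- a placeholder for a
    # real zero-knowledge proof of correct computation.
    output = linear_model(weights, features)
    proof = commit({
        "model": commit(weights),
        "input": commit(features),
        "output": output,
    })
    return output, proof

def verify_inference(model_commitment, input_commitment, output, proof):
    # Verifier side: sees only public commitments, the claimed output,
    # and the proof -- never the raw weights or features.
    expected = commit({
        "model": model_commitment,
        "input": input_commitment,
        "output": output,
    })
    return proof == expected
```

In a real zkML deployment the verifier's check would be a succinct proof verification rather than a hash comparison, but the shape of the interface — private witness in, public statement and proof out — is the same.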
We see a growing number of use cases for zero-knowledge proofs that could provide safeguards as the technology reaches its full potential.
Real-life applications for zkML
Machine learning will touch every part of our lives in the not-so-distant future, whether we want it to or not. We’re currently in the unique position of being able to add a layer of security in several key use cases, both on-chain (public transactions carried out on a blockchain) and off-chain (transactions confirmed outside of the main blockchain network).
On-chain use cases
We see several use cases for on-chain machine learning algorithms, most notably for credit scores, know-your-customer processes, and stablecoins.
Creating privacy-preserving credit scores
When crypto traders borrow, the lender needs to be sure the borrower is trustworthy. Borrowers may be operating internationally and even pseudonymously. A machine learning model can assess the creditworthiness of borrowers in a privacy-preserving way, so borrowers can be matched with the lenders that best meet their needs.
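The privacy-preserving part of this use case is that the borrower reveals only a predicate about their score, not the score or the underlying financial data. A minimal sketch, using a made-up toy scoring rule (the weights and fields below are hypothetical, and the zero-knowledge proof that would accompany the result is omitted):

```python
def credit_score(income, debt, history_years):
    # Hypothetical toy scoring rule, not a real credit model.
    return 0.5 * income - 0.8 * debt + 10 * history_years

def prove_creditworthy(private_record, threshold):
    # The borrower computes their score locally on private data and
    # publishes only the boolean "score clears the lender's bar".
    # In a zkML system, a zero-knowledge proof would accompany this
    # boolean so the lender can trust it without seeing the inputs.
    score = credit_score(**private_record)
    return score >= threshold
```

The lender learns a single public bit per threshold it cares about; the borrower's income, debts, and history never leave their machine.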
Building private know-your-customer (KYC) processes
As part of the KYC process, new users are often asked to upload a photo of their driver's license and then perform a liveness test by looking into their webcam and turning their head. To protect the user's privacy, a machine learning model could be built with zero-knowledge proofs that performs the liveness test, checks the result against the user's driver's license photo, and returns a score for how closely they match, along with a proof that the model was run correctly.
Generating accurate stablecoin exchange rates
Oracles serve an essential purpose by bringing off-chain data on-chain, which matters because smart contracts can't access information outside the blockchain network. They are often used in stablecoins, where the security assumptions rely on regular, timely reporting of an exchange rate between an asset (for example, ETH) and a stablecoin that is meant to remain pegged to a real-world currency. Machine learning can make this reporting more accurate and robust, while zero-knowledge proofs guarantee that the oracle performed the computation correctly.
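One simple way an oracle can make rate reporting more robust is to aggregate several price feeds with a median, which tolerates a minority of faulty or manipulated sources. This is an illustrative sketch of that aggregation step only; in a zk-oracle, a zero-knowledge proof would attest that this computation was actually performed over the reported inputs:

```python
import statistics

def robust_rate(reports):
    # Aggregate exchange-rate reports from several feeds with a median.
    # A single wildly wrong (or malicious) feed cannot move the result,
    # unlike a mean. A zero-knowledge proof could attest on-chain that
    # this exact aggregation ran over these exact reports.
    if not reports:
        raise ValueError("no rate reports to aggregate")
    return statistics.median(reports)
```

For example, one feed reporting an absurd rate of 5.0 among four honest feeds near 1.0 leaves the median at an honest value.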
Off-chain use cases
Our personal information is fed into machine learning models regularly in the course of daily life, from applying for a mortgage to running a school assignment through a plagiarism detector. Zero-knowledge proofs can help keep more of our data in our own hands, allowing folks to generate an inference locally and submit a proof instead of relying on a third party to harvest their private information to achieve the same result.
Safeguarding the use of machine learning in high-assurance industries
When lives depend on it, it's critical to know that machine learning models produce trustworthy results and haven't been hacked or altered by malicious actors. Any machine learning model used in the military, AI-piloted cars, or medical imaging and diagnostics, for example, should ship with verification software that checks a zero-knowledge proof that sensor input was analyzed correctly.
Protecting proprietary machine learning models
Many companies have already built private machine-learning models that they don’t want to share publicly. Since these models may have been trained on proprietary data or used in highly regulated industries, it may become important to prove that a result came from that specific model.
As machine learning and AI enter our daily lives, it’s more important than ever for us to trust the models and data used to generate outputs that impact the safety of our world.
Aleo makes developing zero-knowledge algorithms simple thanks to our private-by-default, programmable platform. Machine learning developers can focus on doing what they do best: creating the algorithms and models that provide new insights into our world.
Start creating your own zkML algorithms and plugins using Aleo's open-source zkML transpiler.
Our blog features the stories of developers and privacy advocates building a better internet with zero knowledge.
About Kathie Jurek
Kathie Jurek is the Content Lead at Aleo, tasked with setting the direction for the conversations we have and where we have them. For 9 years, she’s led and created creative work in some of the most technical industries, from developer tools to robotics.