Privacy-Preserving Machine Learning Trends
The research identifies three trends in privacy-preserving machine learning (PPML) in the finance industry. They either address current technical pain points or solve existing problems faced by financial institutions, such as fraud detection. The adoption of PPML at financial institutions is expected to be driven by big data analytics, personal data protection regulations, and risk management.
Trend 1: Federated Learning System That Enables Interoperability
- Federated learning, a core PPML technique, requires specialized software on each node to share data or model information. The problem is that software products from different vendors, such as those used by China's WeBank, are often incompatible. Current research develops a proof of concept for a federated learning system that allows interoperability between different vendor products.
- It essentially creates an open community that builds a common, scalable core trusted by everyone in the federated learning system. The need for such a system is largely driven by ubiquitous regulation protecting individual privacy, which limits the sharing of critical data assets between entities or even internally. Such constraints can directly affect both the effectiveness of machine learning models and their scope of use.
- M.M. Hassan Mahmoud, an AI and machine-learning technologist at the Digital Catapult, a UK government-funded organization that supports startups, is working on the federated learning system.
- Because federated learning systems combine machine learning models with different privacy-preserving approaches, they often face challenges such as impractical system assumptions, limited scalability, and inefficiency. Recent research draws lessons from cloud computing and databases and focuses on two rarely considered yet important properties: heterogeneity and autonomy.
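The core idea behind the federated learning systems described above can be sketched in a few lines: each node trains on its own private data and shares only model parameters, which a coordinator averages. This is a minimal illustrative sketch, not any vendor's actual protocol; the two "banks", the 1-D linear model, and all function names are hypothetical.

```python
import random

def local_update(weights, data, lr=0.1):
    """One round of local training on a node's private data.

    Illustrative only: fits a 1-D linear model y = w * x by
    gradient descent, so the raw data never leaves the node.
    """
    w = weights
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def federated_average(node_weights):
    """Coordinator step: aggregate parameters, never raw data."""
    return sum(node_weights) / len(node_weights)

# Two hypothetical institutions hold private samples drawn from y = 3x.
random.seed(0)
bank_a = [(x, 3 * x) for x in (random.random() for _ in range(20))]
bank_b = [(x, 3 * x) for x in (random.random() for _ in range(20))]

w_global = 0.0
for _ in range(50):  # communication rounds
    w_a = local_update(w_global, bank_a)
    w_b = local_update(w_global, bank_b)
    w_global = federated_average([w_a, w_b])

print(round(w_global, 2))  # converges toward the true slope 3
```

In practice, the interoperability problem discussed above arises precisely at the boundary sketched here: different vendors serialize and exchange the model parameters in incompatible formats.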
Trend 2: Governance of Machine Learning Models Against Cyberattack
- Beyond protecting individual data and privacy, the machine learning models used for privacy preservation should themselves be protected. This falls within the domain of machine learning governance.
- Vulnerabilities in machine learning models can be exploited by cyberattacks that use reverse-engineering techniques. Such attacks can leak sensitive data used to train the models, including personally identifiable information and intellectual property.
- The governance of machine learning models is part of an organization's risk or governance functions; it focuses on the protection and understanding of machine learning models.
- New research proposes a data science platform that allows users to send computation to personal data in a privacy-preserving way and to analyze data they would otherwise be unable to collect. Such personal data can be used by financial institutions, such as PayPal and Bank of America, to detect fraud in financial transactions.
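The "send computation to the data" idea above can be sketched as follows. The source does not specify the platform's mechanism, so this is a hypothetical illustration using the Laplace mechanism from differential privacy: the data holder runs a submitted aggregate query in place and returns only a noised result. All names, the records, and the sensitivity bound are assumptions for illustration.

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) via the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

class DataHolder:
    """Keeps personal records in place; callers submit an aggregate
    computation and receive only a noised result, never the records."""

    def __init__(self, records):
        self._records = records  # raw data never leaves this object

    def run(self, aggregate, sensitivity, epsilon=1.0):
        true_value = aggregate(self._records)
        # Noise scaled to sensitivity/epsilon masks any one record.
        return true_value + laplace_noise(sensitivity / epsilon)

random.seed(42)
# Hypothetical per-user transaction counts held by the platform.
holder = DataHolder([3, 1, 4, 1, 5, 9, 2, 6])

# A fraud analyst asks for a total without ever seeing individual
# records; sensitivity is the most any one user can contribute
# (assumed bounded at 10 here).
noisy_total = holder.run(sum, sensitivity=10, epsilon=1.0)
print(noisy_total)
```

A smaller epsilon gives stronger privacy at the cost of a noisier answer, which is the central trade-off such platforms must expose to analysts.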
Trend 3: Collaboration For Fraud Detection
- PPML and associated techniques could enable private machine learning services that allow competitors in the same industry, such as banks, to collaborate without fear of losing competitive advantage. Financial institutions could use privacy-preserving computation to address regulatory issues regarding the processing of and access to encrypted, sensitive data.
- This new mode of collaboration tackles critical problems, such as money laundering and financial fraud, by creating jointly trained models.
- ING Belgium has deployed "Inpher's XOR secret computing engine to build analytical models" based on data from other countries with strict privacy regulations, such as Switzerland and Luxembourg. The XOR engine compiled and secretly computed on the data without revealing sensitive or personally identifiable information.
- Banks also deploy "the homomorphic encryption technique to detect cross-border anti-money laundering and anti-fraud use cases," which requires neither that one bank request client information from another bank nor that it give out its own clients' data.
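The articles do not detail the cryptographic protocols behind these deployments. As an illustrative stand-in, the sketch below uses additive secret sharing over a prime field, a standard building block of secret computing, to show how several banks could learn a joint total (for example, suspicious-transaction counts) without any bank revealing its own number. The modulus, party count, and inputs are all assumptions.

```python
import random

P = 2**61 - 1  # large prime field modulus (illustrative choice)

def share(secret, n_parties):
    """Split a secret into n additive shares that sum to it mod P.

    Any subset of fewer than n shares looks uniformly random, so a
    single share reveals nothing about the secret.
    """
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def secure_sum(all_shares):
    """Each party sums the shares it received; combining the partial
    sums reveals only the aggregate, not any individual input."""
    n = len(all_shares)
    partials = [sum(all_shares[i][j] for i in range(n)) % P
                for j in range(n)]
    return sum(partials) % P

random.seed(1)
# Hypothetical suspicious-transaction counts at three banks.
inputs = [17, 42, 8]
all_shares = [share(v, 3) for v in inputs]  # bank i sends share j to party j
print(secure_sum(all_shares))  # reconstructs 67 without exposing 17, 42, 8
```

Real systems such as the ones cited above layer far more machinery on top (malicious-security checks, multiplication protocols, model training), but the privacy argument rests on this same share-and-aggregate structure.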
The research reviewed a series of recent academic papers and industry news in the space of privacy-preserving machine learning. We identified trends based on articles published within the past two years from credible sources, such as Forbes, Intel, Accenture, and MIT. Although the trends reported by different sources are not identical, we extracted the core technical components and the main problems they aim to resolve.