Imagine two investment bankers. They have a competitive bent, and they want to know who received the bigger bonus last year. Naturally they want to keep the amount of their bonus secret, afraid of revealing how badly one of them had been beaten, or perhaps of arousing public outrage. It seems on the surface to be an intractable problem. Short of some trusted third party, how could this puzzle possibly be solved?
It turns out there’s a solution, and it’s called Secure Multiparty Computation (SMC). In a nutshell, SMC is the process of computing a final value, shared with all the participants, without revealing the starting inputs. Conceptually it’s like having a trusted third party, only in this case it’s a “virtual” party. There’s no real third party. Instead, the trust comes from the participants themselves encrypting their data in a clever way.
SMC is a hard problem, both intellectually and computationally. Luckily for us, research in the field has been active and the speeds are getting faster. When Andrew Yao published the first work in 1986 the algorithm was so slow it was barely practical at all. By 2009, then-modern hardware running the new algorithms of the day could compare pairs of integers about 500 times per second. Even that speed is at least a million times slower than an ordinary, insecure comparison. By 2013 there were even better algorithms reaching speeds on the order of 2500 comparisons per second, or a 5X speedup in 4 years. Progress continues on more efficient algorithms and at the same time new hardware has provided another factor of eight in speedup.
Several implementations of SMC are freely available from cryptography researchers, which greatly simplifies our work. Unfortunately, these packages tend to show the hallmarks of research software – difficult installation, limited documentation, and small user bases to turn to with questions. We’re solving these problems by integrating SMC into ImPACT. Our goal is to make setting up a framework for SMC as automatic as possible: define your collaborators, click a few buttons, and start sharing results. Without sharing the underlying data.
Our first use case for automatic SMC construction is “cohort building” – letting researchers determine if, for instance, there is a large enough group of people having a particular set of characteristics to make a statistically useful experimental cohort. The challenge is that individual researchers guard their data closely. These researchers don’t even want to reveal how many people fit a given set of criteria, let alone share their raw data with anyone. This isn’t necessarily done for any competitive advantage, but instead because the information they hold is sometimes very, very private and there are substantial professional and legal consequences for leaking it.
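To make the cohort idea concrete, here is a minimal sketch of one of the simplest SMC building blocks, additive secret sharing, used to total up per-site counts. This is not ImPACT's implementation and it is not Yao's protocol; it is a swapped-in, simpler technique chosen only for illustration. The function names, the modulus, and the example counts are all hypothetical, everything runs in a single process rather than over a network, and the sketch assumes the sites follow the protocol honestly and do not collude.

```python
import secrets

# Public modulus (an assumption for this sketch): must exceed any possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random-looking shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares together add up to the value.
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_counts, modulus=MODULUS):
    """Simulate a secure sum: each site only ever publishes a sum of shares."""
    n = len(private_counts)
    # Row i holds the shares that site i hands out, one share per site.
    distributed = [make_shares(count, n, modulus) for count in private_counts]
    # Site j adds up the shares it received (column j) and publishes that partial sum.
    partials = [sum(row[j] for row in distributed) % modulus for j in range(n)]
    # Adding the published partial sums yields the total of all private counts.
    return sum(partials) % modulus


# Hypothetical per-site counts of people matching the cohort criteria.
counts = [12, 40, 7]
print(secure_sum(counts))  # 59
```

Any single share, or any single partial sum, looks like a uniformly random number, so no site learns another site's count. Note that this sketch does reveal the combined total; answering only "is the cohort big enough?" without revealing the total would need a secure comparison on top of it.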
ImPACT’s overall goal is to make it easy for researchers to collaborate, even between different institutions, and to do so in a safe way. SMC is one tool for doing this, but ImPACT provides a toolbox of methods. We’ve already explored some of these methods (Virtual Private Networks and Federated Identity Management, for instance) but next time we’ll look at The Biggie: a system that looks at data privacy from first principles and “writes” proofs of privacy or lack thereof.