Safe Data Sharing: How Harvard-developed research tools helped Suso Baleato win the 2019 poster prize of the FAS Postdoc Research Symposium

September 12, 2019
by Patrick McVay
 

Ask Dr. Suso Baleato about how his past two years working as a postdoctoral fellow at IQSS furthered his research, and he will talk less about the research itself than about the exciting technologies that have enabled it to be analyzed and made public. “Differential privacy,” he said, “is a computational way to safely share statistical analysis of sensitive data. Now you can apply differential privacy to real cases. This is IQSS!”

To understand the importance of differential privacy and two other Harvard-developed tools, DataTags and Dataverse, one need look no farther than Baleato’s work studying the digitalization of society. Personal computers, mobile phones, social media, the internet itself – these have all seen exponential growth in the past two decades, allowing average citizens to answer their doorbells and control the settings on their toasters from halfway across the globe. More importantly, the rise of digitalization has made information more accessible, in some cases beyond what some governments may wish their citizens to have.

“This presents opportunities and challenges,” said Baleato. “In theory, autocracies can be challenged when there is more information for the public to see.” But even as we witness rogue governments get taken down by social-media-fueled popular uprisings and enjoy space-age benefits of digitalization in our everyday lives, negative effects are easily visible in the current political landscape: increased hate speech and the polarization of societies. Baleato notes that the latter of these is fueled by the internet and social media, where people tend to gravitate toward others who have similar beliefs. “It’s an echo-chamber effect,” said Baleato. “In a healthy democracy, you need people with different points of view. The internet tends to connect you more to people like yourself.” This ability to share similar points of view with people the world over without ever having to leave one’s personal bunker can foster extremism.

Baleato measures digitalization across the world beginning in 2004, using data from organizations that track this kind of public but sensitive information. Sharing these data with other scholars is important, “But,” said Baleato, “there are privacy and ethical implications.” If individuals can be reidentified in the data, they can be singled out for retribution, with damage ranging from the economic to the fatal. And there are legal issues as well. “In the European Union, they consider [the collected data] to be personal information, so you are subject to privacy law. But by aggregating the IP addresses and applying differential privacy you make this data useful for other researchers, minimizing the risk of identity disclosure.” Baleato notes that this is exactly the problem differential privacy solves. “The differential privacy guarantee ensures that you cannot reidentify a single individual in a given dataset.”

 

Without this promise, it would be impossible to release the data on which his research findings are built, and therefore much more difficult to publish in high-impact journals.

The success of Baleato’s two years at IQSS is illustrated by his winning the 2019 poster prize of the FAS Postdoc Research Symposium. The poster explains how to use differential privacy to enable the verification and reuse of statistical analysis made on sensitive datasets, such as Baleato’s Internet Connectivity Statistics, with negligible loss of utility. In other words, the accuracy of the analysis is not meaningfully compromised even when privacy-preserving noise is added to the statistical computation.
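
For readers curious about what “privacy-preserving noise” can look like in practice, here is a minimal Python sketch of the Laplace mechanism, a standard textbook way to achieve differential privacy for a simple count. It is not taken from Baleato’s poster or from any Harvard tool: the data are synthetic, and the names `laplace_noise`, `private_count`, and the privacy parameter `epsilon=0.5` are assumptions chosen only for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws, each with
    mean `scale`, follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, epsilon, rng):
    """Return a noisy count of True records (hypothetical helper).

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if r)
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic stand-in for a sensitive dataset: 100,000 made-up records,
# roughly 30% of which are flagged True.
records = [rng.random() < 0.3 for _ in range(100_000)]

true_count = sum(records)
noisy_count = private_count(records, epsilon=0.5, rng=rng)
relative_error = abs(noisy_count - true_count) / true_count

print(f"true count:     {true_count}")
print(f"noisy count:    {noisy_count:.1f}")
print(f"relative error: {relative_error:.6f}")
```

Under these assumptions the noise is typically a handful of units on a count in the tens of thousands, which is one way to read “negligible loss of utility”: the published figure barely moves, while the presence or absence of any single record is masked.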

Baleato said his two years at IQSS provided a measure of liberty, which he describes as “the ability to choose to do whatever you want to do.” He explained that Europeans are often confused by how much Americans talk about ‘freedom’. Even to some Americans, the repeated use of the word can feel cliché, associated with American flag lapel pins and pre-game military flyovers. Baleato provides a fresh perspective on this idea of freedom. He explains that when he arrived in Germany for graduate school, he was told what the following three years would look like, and they proceeded that way almost exactly. When he came to IQSS as a postdoc in September of 2017, however, he was introduced to a few people by his supervisor, IQSS’s Chief Data Science and Technology Officer, Mercè Crosas, and then was able to do as he wished academically. “For the first time in my life, I felt authentic freedom. It was like playing in kindergarten with all these new toys and friends.” He adds that he was able to take risks while here because “this is a demanding society: you need to give your best, but your effort is acknowledged; even if you fail, you are encouraged to try again.”

This upbeat manner was even observable in the most commonplace interactions with colleagues at the coffee urn, an oft-used gathering spot at IQSS, where he would employ his own unique turns of phrase, a favorite of which was the word “nice.” As in, “How are you today, Suso?” “Nice!”

This is not to say that everything about his experience in Cambridge was perfect. “The healthcare system here was an unexpected cultural shock,” he said, referring to his adventures trying to get new spectacles, which took several months and cost untold sums due to various mishaps and health insurance-related mumbo jumbo. But he is quick to point out that he received a lot of support from people working in healthcare-related positions, as well as the many colleagues he connected with at Harvard. “People are the best of Harvard’s assets.”

And he is not shy about lauding the contributions that IQSS Director Gary King has made to science with his articulation of how to implement the replication standard, which is essential for understanding and building on previous scientific work. Baleato calls this “among the most valuable contributions that Gary has done for the social sciences and science in general.” King’s 1995 paper expressing ways to facilitate replication, as well as IQSS’s Dataverse software, “were the primary reason I had IQSS on my radar two years ago,” Baleato said.

While he’s thankful to be leaving Cambridge-based optical shops behind, Baleato is staying connected to IQSS for another year to continue to collaborate with Crosas and others on work related to Dataverse, DataTags, and differential privacy. He will be relocating, however, to the Universidad de Santiago de Compostela in Galicia, Spain, where he was raised. And he will not leave his Maltron L90 non-orthogonal keyboard behind. At once vintage-looking, like something out of an Apollo control room, and yet futuristic – with the split between the left and right halves of the QWERTY keyboard separated by so much space that you could fit a plate of Polbo á feira (Galician octopus) between them – the keyboard helped him through repetitive strain issues years ago when he was in Germany, and is now an important tool in his research arsenal.

With a quirky keyboard, a winning poster, and the use of differential privacy, DataTags, and Dataverse to release his research data, one word about Baleato’s two years at IQSS comes to mind: nice!