Data Science Services at IQSS: Harnessing Data Interpretation for Academic Advancement

November 20, 2023

by Danielle Benaroche Gottesman
 

Research cycles of academic papers tend to follow a systematic process involving a number of stages: topic and hypothesis development, research design and methodology, and data analysis. Provided that each stage runs smoothly, the results of statistical or qualitative analyses are interpreted and shaped into a body of work aimed at publication.

But the research cycle doesn’t always run smoothly. It can be iterative, and researchers often revisit different stages multiple times to refine their study and ensure its quality before publication. Relying on conventional methodologies to gather and analyze data can also constrain researchers through small sample sizes, manual processes, and limited human resources.

In a setting where research thrives on evidence and systematic inquiry, having a service round out the range of skills required to produce quality research is crucial. Data Science Services (DSS)—one of the many research services offered by IQSS at Harvard—does precisely that, providing researchers with tools and methodologies to unlock hidden insights, enable evidence-based decision making and innovation, and ultimately, advance knowledge in the pursuit of scholarly excellence. 

Steve Worthington

A resource that helps scholars elevate their work beyond the typical research cycle, DSS has emerged over its two-plus decades as an asset to faculty and graduate students alike. It is reshaping the research landscape across disciplines, departments, and even universities, and it is available to anyone affiliated with Harvard University or MIT.

“One key thing we aspire to,” says Steve Worthington, director of DSS and lead data scientist within IQSS, “is to tailor our advice and consultations to the particular level of expertise the person we are consulting with has. We work with senior faculty, senior undergrads, and everyone in between. They [each] have different levels of expertise, and we adapt to give [custom] advice. We pride ourselves on making our responses useful, which requires us, from a human perspective, to understand [this] as we begin a consultation.”

How does it work? 

DSS offers three services to help clients overcome the obstacles typically associated with research whose goal is publication (either in a peer-reviewed journal or elsewhere): consulting, collaboration, and analytical tool building.

Consulting is offered free of charge and can last up to three hours. Short-term consulting accounts for a significant share of DSS's work with researchers.

Collaboration can be longer term and is fee-based. Following initial contact and a project assessment, a data science specialist helps design and implement a data analysis pipeline. The collaboration component allows the data scientist assigned to the research project to assist by implementing the analyses, becoming an integral part of the research team itself. “This is specifically useful for faculty and students who don't have a skillset and need to outsource the technology piece,” says Worthington. 

The third service, analytical tool building, centers on building packages for R (a programming language for statistics). It helps bring data analysis methods to a worldwide audience by taking code written by researchers, cleaning it up, and publishing it as a package that people around the world can download and use for their own projects. “We have a four- to six-month backlog, it's so popular,” says Worthington.

Effective collaboration

The internal triage process allows the team to gather information from the person making contact, including details about the project they’re working on, which is then mapped to the skillsets of the people on the team. Each pairing takes into account both the needs of the researcher(s) and the skillset of the team member helping them. And though everyone on the DSS team has core programming skills, more than one person will sometimes be assigned to a project.

“The people on the team are researchers too,” says Worthington. “We all have an applied research background—we are economists, we are psychologists, we are evolutionary biologists…Our team’s goal is to help researchers overcome hurdles at every stage—statistics, programming, or any other tech issue encountered during the research process. [A researcher] might have a dataset and not understand what kind of statistical analysis to employ, or [they’re] having trouble interpreting it, or [they’re] scraping a website and having trouble writing code for it. We can help figure out what type of statistical analysis to implement and get past the tech issues.”

Effective pairings can be critical to the outcome of a project, and the triage process takes a multitude of compatibility factors into account. Joanne Leong, an MIT Media Lab researcher who worked with DSS on a recent project, was pleased to have been paired with “an excellent mentor, [who] worked with me through the details of my study to determine an appropriate strategy to apply statistics to analyze my data. He [was] kind, patient, and conscientious. I learned a lot and greatly appreciate the advice he [gave] me.”

“One key thing,” per Worthington, “is that you can develop personal relationships with the consultants who know your subject matter and the program you’re trying to run. Researchers are usually part of some concerted program, so having someone who has an understanding of your hypotheses, your research, and design helps [them] to not have to get up to speed again. Having someone who understands the context and background of what you’re trying to do is enormously helpful.”

Once paired, a team can work together to:

  • Uncover hidden data patterns: DSS can advise researchers on advanced statistical and machine learning techniques that reveal patterns and correlations within data, potentially leading to groundbreaking insights and research discoveries.
  • Advance knowledge and innovation: By integrating DSS’s help into their research workflow, scholars can leverage theory-driven or data-driven approaches to test hypotheses and theories and to validate findings. This emphasis on evidence-based research enhances the quality and reliability of scholarly output, driving knowledge advancement and fostering innovation.
  • Foster interdisciplinary collaboration: DSS promotes collaboration that breaks down silos between different research domains. By sharing data and methodologies, researchers from diverse fields can collectively tackle complex problems and gain a comprehensive understanding of multidimensional issues.
  • Improve research efficiency: DSS can help automate data processing, cleaning, and analysis tasks, enabling researchers to focus on interpreting results and drawing meaningful conclusions. This enhanced efficiency can accelerate the pace of research and lead to a higher output of scholarly work.

Central hub

“[DSS is] the only team [of data science consultants] that covers the whole of Harvard in terms of its scope, and the whole of MIT (every department, every school) across both universities,” says Worthington. IQSS, the institute housing Data Science Services, “is a centralized hub of resources that can be accessed from anywhere. We are always hearing from researchers with different backgrounds and with different problems they need to solve.” If a researcher is uncertain where to begin, DSS is a great place to start and will aim to make the necessary connections. Because IQSS is tied to the different schools across Harvard, the team can screen, sort, and categorize requests to link prospective clients with internal or external resources as needed. Working with at least 30 to 40 researchers at a time, DSS helps as many as 500 people in a given year.

Dr. José R. Zubizarreta, a Harvard professor and researcher who worked closely with DSS on an R package, summarized his experience: “Harvard's IQSS has been an invaluable partner to enhance and disseminate some of our methods by transforming ‘research code’ (i.e., early code that we wrote in the context of a methods research paper) into more elegant, robust, and transparent R packages for a broad spectrum of investigators and practitioners.”

Data Science Services is reshaping how academic research is conducted and perceived, and how knowledge advances. Through collaboration, it enables researchers to uncover hidden insights, drive evidence-based decisions, and foster innovation across diverse research domains. Its work applies to scientific discovery, the social sciences, the humanities, political science, and interdisciplinary research projects, and it offers researchers the opportunity to enhance research outcomes and contribute to the collective pursuit of knowledge.

“We have people who say, ‘You saved our publication (or dissertation),’ or, ‘This wouldn’t have been published without you,’” says Worthington. “That we are able to alter the course of someone's research project and have a dramatic impact outweighs the formal number of consults. A lot of the time, it is impactful to the point of influencing someone's career (via publication in a high-tier journal, help with tenure, or getting someone a job), and the impact can be dramatic.”