DataFest 2020 Gathers Researchers to Collaborate and Hone Their Data Skills

February 3, 2020
DataFest speaker addressing workshop attendees on laptops
by Steve Worthington
Photography by Dwayne Liburd

During the 2020 J-term, IQSS hosted the DataFest 2020 conference—a two-day series of workshops focusing on teaching best practices for analyzing, visualizing, and managing data throughout its lifecycle. The conference was a collaboration among many Harvard institutes and centers, including the Center for Geographic Analysis, HUIT, and the Harvard Libraries.

Two DataFest speakers in front of colorful projected slide that says "How to Access the Data"

This year, DataFest comprised 27 hands-on and lecture-based sessions and drew more than 150 attendees looking to brush up on their data science skills. Highlights included keynote talks, held simultaneously at DataFest and IACS ComputeFest, by Xiao-Li Meng (Editor-in-Chief, Harvard Data Science Review), who discussed what data science is not, and Fernando Pérez (Co-founder, Project Jupyter), who described how the Jupyter notebook ecosystem can be used to enhance collaboration.

The hands-on workshops mixed introductory and more advanced sessions, including Interactive Visualization using Shiny (in R) and glue (in Python), Natural Language Processing with Python, and Reproducible Research using the Open Science Framework.

For more information about the Dataverse Project, visit the Dataverse website.

DataFest attendees using laptops in the Belfer Case Study Room, listening to a presenter in front with a slideshow