Data Science Services

Joshua Cetron

Data Science Specialist

Josh is a data scientist, social scientist, and methodologist with a background in psychology, neuroscience, and multivariate statistics. As a Data Science Specialist at IQSS, Josh provides methods and analysis consulting services and collaborative support to Harvard-affiliate researchers across the sciences, with a focus on experimental methods, exploratory analysis, and regression modeling of empirical data in R and Python...

2021 Dec 10

Data Science Services Office Hours (virtual)

11:00am to 12:00pm

Location: 

Virtual via Zoom

IQSS Data Science Services is pleased to help researchers in the Harvard community who are working on research for publication overcome obstacles at every stage of the research process. Drop-in office hours are available for short questions that should take no longer than 15 minutes to resolve.

For more information, visit the DSS services page.

2021 Oct 08

Data Science Services Office Hours (in person)

Repeats every 2 weeks every Friday until Sat Dec 18 2021.
11:00am to 12:00pm

Location: 

CGIS Knafel room K017 (lower concourse)

IQSS Data Science Services is pleased to help researchers in the Harvard community who are working on research for publication overcome obstacles at every stage of the research process. Drop-in office hours are available for short questions that should take no longer than 15 minutes to resolve.

For more information, visit the DSS services page.

2020 Oct 07

Matching Methods in R

12:00pm to 3:00pm

Location: 

Online: Zoom

This is an applied part of a series of workshops on principles of causal inference and matching methods (attending the previous parts is recommended but not required). During this workshop we will practice implementing matching methods in R using various packages, including MatchIt, cem, RItools, and cobalt. Please contact research@hbs.edu with any questions.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Sep 03

Research Data Management

1:30pm to 3:30pm

Location: 

Online: Zoom

Want to be more efficient and save time doing your research and collaborating with others? Looking for new ways to promote your work and make a worldwide impact? Then come to this workshop to learn techniques and services to help you manage your research data. You will learn practices that ensure that your research is documented, reproducible, and accessible long-term. This includes acquiring specialized data for your research, using resources and tools that support your use of data throughout the research lifecycle, complying with internal and external data policies and regulations, and making data from Harvard researchers available to others where feasible.

This class, a combination of seminar and discussion, will highlight robust data management and documentation practices to help you, your future self and fellow researchers be successful in these areas.

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Sep 11

Structured Data, Databases, and SQL

1:00pm to 4:00pm

Location: 

Online: Zoom

Collecting, analyzing, and managing data is the bread and butter of any research project, and standard tools like Microsoft Excel are the go-to apps because they are omnipresent and easy to use. But these tools start to show their limitations when you need to handle tens of thousands of rows or merge data from multiple sources. A relational database, such as SQLite, can fill this gap and is the logical next step for bigger data projects.

This class will discuss the fundamentals of structured data, introduce you to using SQLite (a lightweight database available on almost all computing platforms), and teach you the basics of querying and summarizing data with SQL. Meeting these objectives could open up new opportunities for research and help you with your research data management goals.
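As a rough illustration of the kind of task described above (merging data from two sources, then summarizing it with SQL), here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the workshop materials; the table names, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical example data: two small "sources" that need to be merged.
PARTICIPANTS = [(1, "control"), (2, "treatment"), (3, "treatment")]
SCORES = [(1, 10.0), (2, 14.0), (3, 18.0)]


def mean_score_by_group(participants, scores):
    """Load both sources into an in-memory SQLite database, join them,
    and return (group, row count, mean score) for each group."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing on disk
    try:
        conn.execute("CREATE TABLE participants (id INTEGER PRIMARY KEY, grp TEXT)")
        conn.execute("CREATE TABLE scores (participant_id INTEGER, score REAL)")
        conn.executemany("INSERT INTO participants VALUES (?, ?)", participants)
        conn.executemany("INSERT INTO scores VALUES (?, ?)", scores)
        query = """
            SELECT p.grp, COUNT(*) AS n, AVG(s.score) AS mean_score
            FROM participants AS p
            JOIN scores AS s ON s.participant_id = p.id
            GROUP BY p.grp
            ORDER BY p.grp
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for grp, n, mean_score in mean_score_by_group(PARTICIPANTS, SCORES):
        print(grp, n, mean_score)
```

The JOIN combines the two tables on the participant identifier, and GROUP BY with AVG produces one summary row per group, which is the sort of merge-and-summarize step that becomes awkward in a spreadsheet as the data grows.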

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Nov 20

Version Control with Git / GitKraken

1:00pm to 4:00pm

Location: 

Online: Zoom

Version control software allows you to save “versions” of files (scripts, text files, web pages, data, etc.) that show the changes made to the files over time, and allows you to backtrack and undo those changes if necessary. The ability to compare two versions or reverse changes is, on its own, invaluable when working on larger projects, and even more so when collaborating in research groups.

This hands-on workshop will take you through the steps of using Git / GitKraken and GitHub to track changes, revert to older versions, and share your files with other people. The ultimate goal is to keep you organized, reduce clutter, and maintain an intelligible history of the files in your projects.

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Nov 13

Do Less Work by Using the Unix Shell

1:00pm to 4:00pm

Location: 

Online: Zoom

This hands-on workshop will introduce you to the Unix shell, a power tool that allows people to do complex things with just a few keystrokes, combine existing programs in new ways, and automate repetitive tasks.

The Unix shell (command line) has been around longer than most of its users have been alive. It has survived so long because it’s a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so they aren’t typing the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including “high-performance computing” supercomputers). These lessons will start you on a path towards using these resources effectively.

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Nov 06

Python Natural Language Processing

1:00pm to 4:00pm

Location: 

Online: Zoom

This hands-on workshop will introduce foundational concepts in natural language processing (NLP) as well as techniques for analyzing text (natural language) data using Python's Natural Language ToolKit (NLTK) library. We will work through an entire basic NLP workflow covering acquiring text corpora from the web, text pre-processing, summary statistics and visualization, and building generative models. 

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Sep 18

Stata Introduction

1:00pm to 4:00pm

Location: 

Online: Zoom

This hands-on workshop provides an introduction to Stata, including how to import and manipulate data, as well as calculate descriptive statistics. This workshop is appropriate for those with little or no prior experience with Stata.

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Setup instructions and materials: http://bit.ly/dss_statainstall
Class website: http://bit.ly/dss_stataintro

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).

2020 Oct 30

Python Web-Scraping (flipped classroom)

1:00pm to 2:30pm

Location: 

Online: Zoom

This hands-on workshop will introduce basic techniques for web-scraping using popular Python libraries. This is an intermediate-level, and somewhat challenging, workshop appropriate for those who have been using Python for at least a few months. You should be familiar with all of the material in the Python Introduction workshop and have used these skills in your own projects to the point where you are comfortable with them.

PLEASE NOTE: This workshop is being delivered in a FLIPPED CLASSROOM format. This means that participants will be responsible for working through the online materials at their own pace IN ADVANCE of the scheduled meeting time. During the scheduled meeting time, the instructor will demonstrate how to complete the example exercises and will be available to answer questions related to the workshop materials. The instructor WILL NOT walk through all the online materials during the scheduled meeting.

This workshop will be delivered by the HBS Research Computing Services group in partnership with the Data Science Services group at IQSS.

Setup instructions and materials: http://bit.ly/dss_pythoninstall
Class website: http://bit.ly/dss_pythonwebscrape

Visit the Harvard Training Portal for details and registration information (requires HarvardKey).
