23 October 2008

Firefox plugin for webscraping

Students here are often interested in how to collect information from the web efficiently. Here's a basic tool: iMacros is a plugin for the Firefox browser that lets you create macros to automate tasks or collect information. It exploits the fact that elements in HTML pages can be identified and hence targeted. For example, a form field will have an ID that iMacros can find and fill with a value of your choice, or it can click a specified button for you. Two nice features are that you can record your own macros without scripting, and that you can use the plugin to collect text information off the web. The capabilities are not what you would get from a customized Python script, but it's easy to use and edit, and it gets the basics done.

(The basic plugin is free, but the developers also sell other editions with more capabilities.)
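
For comparison, here is a rough sketch of the same idea (finding elements by their ID and pulling out their text) in a small Python script. This is not how iMacros works internally and it does not use iMacros at all; it only uses Python's standard html.parser module. The sample page, the IDs in it, and the names IdScraper and scrape are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page, used only for illustration.
SAMPLE_HTML = """
<html><body>
  <form id="search-form">
    <input type="text" id="query" name="q">
    <input type="submit" id="go" value="Search">
  </form>
  <p id="result">Found <b>42</b> matching records.</p>
</body></html>
"""

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IdScraper(HTMLParser):
    """Record every element that carries an id, and collect the text
    found inside the one element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.ids = {}      # maps id -> tag name
        self.depth = 0     # nesting depth inside the target (0 means outside)
        self.chunks = []   # pieces of text seen inside the target

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "id" in attr_map:
            self.ids[attr_map["id"]] = tag
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif attr_map.get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self.chunks).split())


def scrape(html, target_id):
    """Return (ids found on the page, text inside the target element)."""
    parser = IdScraper(target_id)
    parser.feed(html)
    parser.close()
    return parser.ids, parser.text()


if __name__ == "__main__":
    ids, text = scrape(SAMPLE_HTML, "result")
    print(sorted(ids))
    print(text)
```

A real script would first download the page (for example with urllib.request) and would need to cope with messier markup than this sample; filling in forms and clicking buttons, which iMacros does inside the browser, is beyond what this sketch covers.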

Posted by Sebastian Bauhoff at October 23, 2008 3:22 PM