
theportus / wikileaks-wardiaries-miner


A Python script to extract data from a local mirror of the WikiLeaks War Diaries data

License: MIT License

Python 100.00%


wikileaks-wardiaries-miner


In the summer of 2010, an unknown source within the Defense Department provided WikiLeaks with a highly classified database of over 492,000 files, somewhat misleadingly called the ‘War Diaries.’ Each record pertains to a single ‘kinetic event,’ jargon for any situation involving potential lethality or physical harm. Altogether, the database appears to contain every such event in both Iraq and Afghanistan, as known to U.S. Central Command, from 2004–2009.

Each event carries full metadata: date, time, location, and more. This lets us see the reports coming into U.S. command as they happened. Of course, however detailed the records are, we should be as careful with them as with any source. The fog of war and concern for the narrative of events (better known as CYA) both affect how records are produced and submitted. Subsequent investigation has in fact shown that some recorded events appear to contradict the facts as journalists have uncovered them. Above all, what the database allows us to reconstruct is the picture as it appeared to U.S. decision makers.

To get a copy of the site, I used the wget tool to download a local mirror. Check out the Programming Historian’s wget tutorial for a great guide to using the tool beyond this demonstration.

To install wget…

Windows: Download and run the wget installer for Windows, then launch your Command Prompt and follow the directions below.

OSX/Linux: Install wget via Homebrew on the command line, installing Homebrew first if you haven’t already. Open the Terminal program in your Utilities folder and enter the following commands to install Homebrew and then wget.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install wget
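
Before moving on, it can help to confirm that the install succeeded. The following is a minimal sketch using only the Python standard library; the helper name wget_available is hypothetical and not part of this repository.

```python
import shutil


def wget_available(executable="wget"):
    """Return True if the named executable can be found on the PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    # Report whether the wget binary is reachable from this shell.
    print("wget found" if wget_available() else "wget not found on PATH")
```

If this reports that wget is missing, reopen your terminal so the updated PATH is picked up, then try again.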

All Systems: Once wget is installed, run the following two commands to download both sites to local mirrors. Note: to be server-friendly, these commands wait two seconds between requests (-w 2) and cap the download rate at 150 KB/s (--limit-rate=150k).

wget https://WikiLeaks.org/irq/ -r -w 2 --limit-rate=150k
wget https://WikiLeaks.org/afg/ -r -w 2 --limit-rate=150k
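
If you would rather drive the download from Python, the two commands above can be sketched as follows. This is an illustrative sketch, not part of the original script: the helper names build_wget_command and mirror_sites are hypothetical, and it assumes wget is already on your PATH. The flags are the same ones used above.

```python
import subprocess

# The two War Diaries sites named in the commands above.
SITES = ["https://WikiLeaks.org/irq/", "https://WikiLeaks.org/afg/"]


def build_wget_command(url, wait_seconds=2, rate_limit="150k"):
    """Return the wget argument list for a polite recursive download."""
    return [
        "wget",
        url,
        "-r",                          # follow links recursively
        "-w", str(wait_seconds),       # pause between requests
        f"--limit-rate={rate_limit}",  # cap download bandwidth
    ]


def mirror_sites(urls=SITES):
    """Run wget once per site and return each site's exit code."""
    return {url: subprocess.run(build_wget_command(url)).returncode
            for url in urls}


if __name__ == "__main__":
    # Only print the commands here; call mirror_sites() to actually download.
    for url in SITES:
        print(" ".join(build_wget_command(url)))
```

Passing the arguments as a list avoids shell quoting problems. A nonzero exit code from wget does not always mean the whole mirror failed, since a single missing page can trigger one, so the sketch returns the codes and does not raise on them.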

Now you can run the script on the local mirrors!
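
The script itself is not shown in this README, so the following is only a rough sketch of what a first pass over a local mirror might look like, not the author's implementation. It assumes the mirror is a directory tree of saved pages with an .html extension and that each page's title is worth collecting. The names extract_title and iter_mirror_titles and the mirror_root argument are hypothetical. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def extract_title(html_text):
    """Return the stripped <title> text of an HTML string ('' if none)."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title


def iter_mirror_titles(mirror_root):
    """Yield (relative path, title) for every .html file under mirror_root."""
    root = Path(mirror_root)
    for path in sorted(root.rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        yield path.relative_to(root), extract_title(text)
```

If wget saved the pages without an .html extension, adjust the glob pattern to match the file names in your mirror.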
