wesabe / ssu

Server-Side Uploader, the data aggregation engine.

License: Apache License 2.0

C 0.06% Java 0.14% Ruby 2.50% Shell 0.70% CoffeeScript 41.62% JavaScript 54.98%

ssu's Introduction

SSU

SSU is a scripted web site navigator & scraper. It was originally designed and conceived as part of Wesabe's infrastructure and has since been open-sourced. Its original design goal was to extract OFX data given bank usernames and passwords for use on wesabe.com.

The system it uses to get this data is XulRunner, a project from Mozilla that provides a customizable (and scriptable) browser. SSU has a script for each financial institution it supports that describes how to log in and download data from that institution's web site.

It was originally written in JavaScript but is currently mostly CoffeeScript. You can write bank scripts in either language, though CoffeeScript will be the preferred one going forward.

Why would I use this?

If you're trying to aggregate transaction data from multiple financial institutions, possibly for a large number of people, then this project might be useful to you.

How do I try this out?

First, clone the SSU repo:

$ git clone https://github.com/wesabe/ssu

The easiest way to try this is on your laptop/desktop computer running Linux or Mac OS X. Windows isn't supported. SSU comes with a bunch of scripts for financial institutions that it already supports. Your initial experience trying out SSU is going to be much easier if you have an account at one of these institutions. To check, go to the fi-scripts folder and start looking for your bank. Let's say your bank is Chase, whose site is chase.com. We store the scripts for financial institutions in a reverse DNS folder structure, so you need to look in the com directory for the chase.coffee script.
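
The reverse DNS layout can be sketched in a few lines of Ruby. This is only an illustration: script_path_for is a hypothetical helper, not part of SSU, the base directory is the one printed by the player generator further down in this README, and the sketch assumes every dot-separated segment of the id maps to one directory level.

```ruby
# Hypothetical helper (not part of SSU): map a financial institution id
# such as "com.chase" to the path where its script is expected to live.
# Assumption: each dot-separated segment becomes one directory level.
FI_SCRIPTS_DIR = "application/chrome/content/wesabe/fi-scripts"

def script_path_for(fid, ext = "coffee")
  segments = fid.split(".")
  if segments.length < 2 || segments.any?(&:empty?)
    raise ArgumentError, "expected an id like com.chase, got #{fid.inspect}"
  end
  File.join(FI_SCRIPTS_DIR, *segments) + ".#{ext}"
end

puts script_path_for("com.chase")
```

Pass "js" as the second argument to look for a JavaScript script instead of a CoffeeScript one.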

If your financial institution is supported, then great! Next you'll need to install XulRunner. If you're on Linux, you'll want to use your package manager (e.g. apt-get). If you're on OS X you can use the bundled setup script:

ssu$ ./bootstrap

That'll install XulRunner if it's missing, or tell you which version you have if it's already installed. Now go ahead and start the app itself in a terminal window:

ssu$ bin/server

You'll see some logging output along with some startup messages and a blank browser window titled "Wesabe DesktopUploader". As long as you don't see any errors you should be good to go. Next you can generate a credentials file to test with. Again, let's assume you have an account at Chase. In another terminal window, run this:

ssu$ bundle
ssu$ bundle exec script/generate credential com.chase chase

That'll create a file at credentials/chase that looks like this:

{"creds": {"username": "FIUSERNAME", "password": "FIPASSWORD"}, "fid": "com.chase"}

Just change FIUSERNAME and FIPASSWORD to your username and password for Chase and save the file.
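
Before starting a job it can help to check that the file is valid JSON and that the placeholders are gone. The following is a minimal sketch, assuming the three fields shown above are all your institution needs; parse_credential is a hypothetical helper, not something SSU ships.

```ruby
require "json"

# Hypothetical helper (not part of SSU): parse the text of a credentials
# file and check that it has the fields shown in the generated example.
def parse_credential(json_text)
  data = JSON.parse(json_text)
  creds = data["creds"]
  unless data["fid"].is_a?(String) && creds.is_a?(Hash) &&
         creds["username"].is_a?(String) && creds["password"].is_a?(String)
    raise ArgumentError, "credential needs fid, creds.username and creds.password"
  end
  if creds["username"] == "FIUSERNAME" || creds["password"] == "FIPASSWORD"
    raise ArgumentError, "placeholder values have not been replaced"
  end
  data
end

# Inline sample so the sketch runs without touching the file system.
sample = '{"creds": {"username": "joesmith", "password": "iamgod"}, "fid": "com.chase"}'
puts parse_credential(sample)["fid"]
```

To check a real file, pass File.read("credentials/chase") instead of the inline sample.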

Now fire up the test client and start a job:

ssu$ script/console
>> job = Job.create chase

Your first terminal window and the blank browser should now be doing something -- ideally logging into your financial institution site and getting your recent transaction data. If it succeeds it'll store the downloaded statements in the app's profile directory. You can get a list like so from the console:

>> list = Statement.all
=> [#<Statement:0x10f014018 @id="1D87E4D6-DCCD-0001-FFFF-1FFF1FFF1FFF">]
>> list.first.read
=> "OFXHEADER:100\r\n..."

Congrats, you've successfully gotten data out of your financial institution's website!

So how do I use this for real?

The original application that used SSU is the one SSU was written for at Wesabe: pfc. You can see how to manage SSU in this file that controls the SSU process and this file that talks to it.

Basically, SSU sets up a tiny HTTP server (at port 5000 by default) for commands. Here's a request to list all the statements that have been downloaded:

POST /_legacy HTTP/1.0

{"action": "statement.list"}

Here's one that starts a job with credentials:

POST /_legacy HTTP/1.0
Content-Type: application/json
Content-Length: 109

{"action": "job.start", "body": {"fid":"com.ingdirect", "creds":{"username":"joesmith","password":"iamgod"}}}
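
The same requests can be built from Ruby with the standard library. This is a sketch under stated assumptions, not SSU's own client (see api.rb for that): legacy_request and send_legacy are hypothetical names, the host is assumed to be the local machine, and Net::HTTP speaks HTTP/1.1 rather than the HTTP/1.0 shown above, which the server is assumed to accept.

```ruby
require "json"
require "net/http"

# Hypothetical helper (not part of SSU): build a POST to /_legacy for the
# given action, with an optional "body" object.
def legacy_request(action, body = nil)
  payload = { "action" => action }
  payload["body"] = body unless body.nil?
  request = Net::HTTP::Post.new("/_legacy", "Content-Type" => "application/json")
  request.body = JSON.generate(payload)
  request
end

# Hypothetical helper: send the request to a running SSU instance.
# Net::HTTP fills in Content-Length from the body when it sends.
def send_legacy(request, host: "127.0.0.1", port: 5000)
  Net::HTTP.start(host, port) { |http| http.request(request) }
end

request = legacy_request("job.start", {
  "fid" => "com.ingdirect",
  "creds" => { "username" => "joesmith", "password" => "iamgod" }
})
puts request.body

# With bin/server running:
#   response = send_legacy(request)
#   puts response.body
```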

Responses will be in JSON regardless of what Accept header you pass:

# a successful response to the statement.list request
HTTP/1.0 200 OK
Content-Type: application/json

{"status": "ok", "statements": ["1D8787AA-6D2D-0001-DFF3-9EB052301CD4"]}

# an example error response
{"status": "error", "error": "ReferenceError: foo is not defined"}
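
A client can branch on the status field. Here is a minimal sketch that uses only the two response shapes shown above; SsuError and parse_legacy_response are hypothetical names, and real responses may carry fields not covered here.

```ruby
require "json"

# Hypothetical error class and helper (not part of SSU).
class SsuError < StandardError; end

# Return the parsed response when status is "ok", raise otherwise.
def parse_legacy_response(json_text)
  data = JSON.parse(json_text)
  case data["status"]
  when "ok"
    data
  when "error"
    raise SsuError, data["error"].to_s
  else
    raise SsuError, "unexpected status: #{data['status'].inspect}"
  end
end

ok = parse_legacy_response('{"status": "ok", "statements": ["1D8787AA-6D2D-0001-DFF3-9EB052301CD4"]}')
puts ok["statements"].first

begin
  parse_legacy_response('{"status": "error", "error": "ReferenceError: foo is not defined"}')
rescue SsuError => e
  puts "SSU reported: #{e.message}"
end
```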

NOTE: The /_legacy route is a temporary compatibility measure with the old pure-socket way of communicating with SSU. Eventually it'll be replaced by REST-based routes (e.g. GET /statements, POST /jobs, etc).

You can use any programming language you like that supports spawning processes and HTTP to manage an SSU instance. This project ships with development tools that also serve as basic examples in the server (spawning) and console (communication & managing via api.rb) scripts.
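
As a rough illustration of the spawning side, a manager can start bin/server and poll the command port until it accepts connections. port_open? and wait_for_port are hypothetical helpers, not part of SSU; the sketch assumes the default port 5000 and that an open TCP port means SSU is ready for commands.

```ruby
require "socket"

# Hypothetical helper (not part of SSU): true if something accepts TCP
# connections on host:port within the timeout.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Hypothetical helper: poll until the port opens or attempts run out.
def wait_for_port(host, port, attempts: 30, delay: 1)
  attempts.times do
    return true if port_open?(host, port)
    sleep delay
  end
  false
end

# Usage from the SSU checkout:
#   pid = Process.spawn("bin/server")
#   abort "SSU did not come up" unless wait_for_port("127.0.0.1", 5000)
#   # ... send commands over HTTP ...
```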

My bank isn't supported. Can I add it?

Yep, there's a generator that builds a skeleton script for your financial institution:

ssu$ bundle
ssu$ bundle exec script/generate player com.ally "Ally Bank" https://www.ally.com/
Generating with player generator:
     [ADDED]  application/chrome/content/wesabe/fi-scripts/com/ally.coffee
     [ADDED]  application/chrome/content/wesabe/fi-scripts/com/ally/login.coffee
     [ADDED]  application/chrome/content/wesabe/fi-scripts/com/ally/accounts.coffee

You can probably leave the base script (ally.coffee in this example) alone and start filling in login.coffee with the info required to navigate the site. Once you've added something and created a matching credential file, go ahead and try it out:

ssu$ script/console
>> job.start ally

There are lots of examples in the fi-scripts directory for you to reference as you build your own script. Once you're satisfied, just commit your files and send a pull request so we can add your financial institution for others to use.

Why use a browser?

Using XulRunner means that SSU can access any bank site that Firefox can, so you don't have to use mechanize or some other tool that doesn't fully emulate the browser environment. This matters because, by its nature, navigating any website in a scripted way is brittle and anything we can do to reduce the breakage is good. Websites are intended to be viewed in web browsers and their authors worked hard to make that function properly -- that is work you don't have to do when you use a browser as your scraper.

ssu's People

Contributors

boblail, eventualbuddha, indirect, sirianni

ssu's Issues

PayPal does not work OOTB

Out of the box, SSU does not work with PayPal and potentially other sources.

  1. Install SSU on either Linux or Mac using provided install instructions
  2. Generate valid credentials.
  3. Connect to server via local console and start job.
  4. It successfully logs in, navigates to the download and begins the download.
  5. It hangs until the whole job times out.

Could not find templater (>= 0.3.2) Error when generating credentials

I downloaded Wesabe SSU yesterday and have been stuck on the following error. I am running Mac OS X 10.6.8 with Ruby 1.8.7.

Do I have to downgrade to version 1.3.6?

If not, please advise on the steps.

Thank you in advance for your help.

NCS

script/generate credential com.chase chase
/Library/Ruby/Site/1.8/rubygems/dependency.rb:247:in `to_specs': Could not find templater (>= 0.3.2) amongst [RedCloth-4.1.1, actionmailer-2.3.5, actionmailer-1.3.6, actionpack-2.3.5, actionpack-1.13.6, actionwebservice-1.2.6, activerecord-2.3.5, activerecord-1.15.6, activeresource-2.3.5, activesupport-2.3.5, activesupport-1.4.4, acts_as_ferret-0.4.3, capistrano-2.5.2, cgi_multipart_eof_fix-2.5.0, daemons-1.0.10, dnssd-0.6.0, fastthread-1.0.1, fcgi-0.8.7, ferret-0.11.6, gem_plugin-0.2.3, highline-1.5.0, hpricot-0.6.164, libxml-ruby-1.1.2, mongrel-1.1.5, needle-1.3.0, net-scp-1.0.1, net-sftp-2.0.1, net-sftp-1.1.1, net-ssh-2.0.4, net-ssh-1.1.4, net-ssh-gateway-1.0.0, rack-1.0.1, rails-2.3.5, rails-1.2.6, rake-0.9.2, rake-0.9.2, rake-0.8.3, rdoc-3.9.4, rdoc-3.9.4, ruby-openid-2.1.2, ruby-yadis-0.3.4, rubygems-update-1.8.10, rubynode-0.1.5, sqlite3-ruby-1.2.4, termios-0.9.4, xmpp4r-0.4] (Gem::LoadError)
from /Library/Ruby/Site/1.8/rubygems/dependency.rb:256:in `to_spec'
from /Library/Ruby/Site/1.8/rubygems.rb:1210:in `gem'
from script/generate:4

No xulrunner found running on port=5000!

I'm trying to run this app on Ubuntu 10.04. The bin/server command ran without any issues, but when I try to open the console (script/console) I get the error below:

No xulrunner found running on port=5000!

I have checked all the services running on the system, and I could not find a xulrunner service listening on port 5000.

Can anyone please let me know how to fix this issue?

Thank you

Bootstrap

I am a beginner when it comes to Linux. When I run ./bootstrap I get the message "You don't have Ruby installed"; however, when I run ruby -v I get ruby 1.8.7, so what am I supposed to install?

Problems starting SSU

I followed the directions in the README. It looks like the server starts, albeit with a few errors:

http://screencast.com/t/zMa0VJQfpoUM

However, when I try to start the console, it just hangs:

http://screencast.com/t/5wgv6yWavm

It seems to hang on this line in the "run" method in script/console: require File.expand_path('../console.rb', __FILE__)

I thought maybe it was because I was running an old version of Ruby, but I still see the issue with Ruby 1.9.3.

Any thoughts? I have tried everything I can think of.

Thanks!

Readme is out of date with head

When I send the following call per the readme:

POST /_legacy HTTP/1.0
Content-Type: application/json
Content-Length: 76

{"action": "job.start", "body": {"fid":"com.ingdirect", "creds":{"username":"joesmith","password":"iamgod"}}}

I'm told there's no action job_start -- it seems that this method was removed from Controller.coffee in a recent revision. How should this be updated?

Are the legacy actions no longer supported?

"Component is not available" error in bootstrap.js

Hi!

I've been working to get SSU up and running on an Ubuntu Linux Amazon EC2 AMI, and have run into trouble right at the point where I try to actually scrape something. It's become Greek to me, so if anyone knows of a workaround it would be greatly appreciated.

ERROR: while printing object ([object XPCNativeWrapper [object Window]]) for log: [Exception... "Component is not available"  nsresult: "0x80040111 (NS_ERROR_NOT_AVAILABLE)"  location: "JS frame :: chrome://desktopuploader/content/bootstrap.js :: anonymous :: line 880"  data: no]

how to use with pfc

Not an issue, but I'm a little confused how to use ssu with pfc. I got as far as setting ssu_support in financial_insts, but it just doesn't seem to work.

What am I missing?

Thanks so much

Record "Content-Type" header value for downloads

To assist with debugging and processing of downloads, we should record the value of the "Content-Type" header with the download metadata. This is helpful when, for example, downloading the given resource fails on the server side but the server still returns a 200 with bogus (usually HTML) data.

Ownership

I guess it doesn't really matter, as this code is under the Apache 2.0 license, but the bottom of the license file says:

"Copyright [yyyy] [name of copyright owner]".

Who owns it? If Wesabe no longer exists as a company, did it sell the copyrights of the software to anyone?

Running SSU on Ubuntu Server w/o a Monitor

I'm trying to get the SSU app working on an AWS EC2 instance of Ubuntu Server Edition. When I run:

:~/wesabe-ssu-ec8a3be$ bin/server
Error: no display specified

I've installed xorg to give the server an X Window environment, but I still get this error. Is it because there is no monitor hooked up to the machine?

Can anyone recommend a solution?

ryan
