
csu / export-saved-reddit


Export saved Reddit posts into an HTML file for import into Google Chrome.

License: Other

Python 96.82% HTML 2.81% Makefile 0.37%
python html reddit backup script

export-saved-reddit's Introduction

Export Saved Reddit Posts


Exports saved and/or upvoted Reddit posts into an HTML file that is ready to be imported into Google Chrome. Sorts items into folders by subreddit.

Requirements

Installation

First, make sure you have Python 3.x, pip, and git installed on your machine.

Run the following in your command prompt to install:

git clone https://github.com/csu/export-saved-reddit.git
cd export-saved-reddit
pip install -r requirements.txt

To install without git, download the source code from GitHub, extract the archive, and follow the steps above beginning from the second line.

Usage

  1. Make a new Reddit app on Reddit's app preferences page (https://www.reddit.com/prefs/apps) to get a client id and a client secret.

    • Scroll to the bottom of the page and click "create app"
    • You can name the app anything (e.g. "export-saved"). Select the "script" option. Put anything for the redirect URI (e.g. https://christopher.su).
    • After creating the app, the client id will appear under the app name while the client secret will be labeled "secret".

  2. In the export-saved-reddit folder, rename the AccountDetails.py.example file to AccountDetails.py.

  3. Open AccountDetails.py in a text editor and enter your Reddit username, password, client id, and client secret within the corresponding quotation marks. Save and close the file.

  4. Back in your shell, run python export_saved.py in the export-saved-reddit folder. This will run the export, which will create chrome-bookmarks.html and export-saved.csv files containing your data in the same folder.
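For orientation, a filled-in AccountDetails.py might look roughly like the sketch below. The variable names here are assumptions; keep whatever names AccountDetails.py.example actually defines. The missing_fields helper is hypothetical and only illustrates checking that nothing was left blank before running the export.

```python
# Sketch of AccountDetails.py. The variable names are assumptions;
# keep whatever names AccountDetails.py.example actually defines.
REDDIT_USERNAME = "your_username"
REDDIT_PASSWORD = "your_password"
CLIENT_ID = "id shown under the app name"
CLIENT_SECRET = "value labeled secret"


def missing_fields(details):
    """Return the names of settings that are empty or whitespace-only.

    `details` is a plain dict of setting name -> value. This helper is
    hypothetical and is not part of export_saved.py.
    """
    return [name for name, value in details.items() if not str(value).strip()]
```

Calling missing_fields with a dict of the four values returns an empty list when every field has been filled in.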

Additional Options

usage: export_saved.py [-h] [-u USERNAME] [-p PASSWORD] [-id CLIENT_ID]
                       [-s CLIENT_SECRET] [-v] [-up] [-all] [-V]

Exports saved Reddit posts into a HTML file that is ready to be imported into
Google Chrome or Firefox

optional arguments:
  -h, --help            show this help message and exit
  -u USERNAME, --username USERNAME
                        pass in username as argument
  -p PASSWORD, --password PASSWORD
                        pass in password as argument
  -id CLIENT_ID, --client-id CLIENT_ID
                        pass in client id as argument
  -s CLIENT_SECRET, --client-secret CLIENT_SECRET
                        pass in client secret as argument
  -v, --verbose         increase output verbosity (deprecated; doesn't do
                        anything now)
  -up, --upvoted        get upvoted posts instead of saved posts
  -all, --all           get upvoted, saved, comments and submissions
  -V, --version         get program version.
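The help text above is the kind of output Python's argparse module produces. The sketch below builds a parser that accepts the same flags. It is a reconstruction from the help text, not the script's actual source: build_parser is a hypothetical name, and -V is modeled as a plain boolean flag for simplicity.

```python
import argparse


def build_parser():
    """Build a parser accepting the flags listed in the help text."""
    parser = argparse.ArgumentParser(prog="export_saved.py")
    parser.add_argument("-u", "--username", help="pass in username as argument")
    parser.add_argument("-p", "--password", help="pass in password as argument")
    parser.add_argument("-id", "--client-id", help="pass in client id as argument")
    parser.add_argument("-s", "--client-secret", help="pass in client secret as argument")
    parser.add_argument("-v", "--verbose", action="store_true")
    parser.add_argument("-up", "--upvoted", action="store_true")
    parser.add_argument("-all", "--all", action="store_true")
    # Simplification: the real script may print a version and exit instead.
    parser.add_argument("-V", "--version", action="store_true")
    return parser


# Equivalent to running: python export_saved.py -u alice -up
args = build_parser().parse_args(["-u", "alice", "-up"])
```

Credentials passed this way would presumably take the place of the values in AccountDetails.py, although the help text does not say which source wins when both are present.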

Updating

To update the script to the latest version, enter the export-saved-reddit folder in your shell/command prompt and enter the following:

git pull

Help

If you have any questions or comments, please open an issue on GitHub.

If you would like to contribute, check out the project's open issues. Pull requests are welcome.

export-saved-reddit's People

Contributors

0xmh, csu, favrik, kevinwaddle, losuler, lukepayne, rachmadaniharyono


export-saved-reddit's Issues

UnicodeEncodeError: 'ascii' codec can't encode characters in position 119-120: ordinal not in range(128)

New errors =)

Traceback (most recent call last):
  File "export_saved.py", line 336, in <module>
    main()
  File "export_saved.py", line 330, in main
    save_saved(reddit)
  File "export_saved.py", line 293, in save_saved
    process(reddit, seq, "export-saved", "Reddit - Saved")
  File "export_saved.py", line 275, in process
    write_csv(csv_rows, file_name + ".csv")
  File "export_saved.py", line 261, in write_csv
    if isinstance(r, str) else r for r in row])

received 401 HTTP response

As far as I can tell, my app secret, id, password, and username are all correct. I tried both with the command line options and with the values loaded from the file.

maybe related to this https://www.reddit.com/r/redditdev/comments/6myq1h/praw_401_http_response_for_all_requests/

python export_saved.py -u xxxxx -p xxxxx -id xxxxx -s xxxxx -v
INFO:root:Login succesful
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.reddit.com
DEBUG:urllib3.connectionpool:https://www.reddit.com:443 "POST /api/v1/access_token HTTP/1.1" 401 41
Traceback (most recent call last):
  File "export_saved.py", line 336, in <module>
    main()
  File "export_saved.py", line 330, in main
    save_saved(reddit)
  File "export_saved.py", line 292, in save_saved
    seq = reddit.user.me().saved(limit=None)
  File "/Library/Python/2.7/site-packages/praw/models/user.py", line 60, in me
    user_data = self._reddit.get(API_PATH['me'])
  File "/Library/Python/2.7/site-packages/praw/reddit.py", line 320, in get
    data = self.request('GET', path, params=params)
  File "/Library/Python/2.7/site-packages/praw/reddit.py", line 404, in request
    params=params)
  File "/Library/Python/2.7/site-packages/prawcore/sessions.py", line 135, in request
    self._authorizer.refresh()
  File "/Library/Python/2.7/site-packages/prawcore/auth.py", line 328, in refresh
    password=self._password)
  File "/Library/Python/2.7/site-packages/prawcore/auth.py", line 138, in _request_token
    response = self._authenticator._post(url, **data)
  File "/Library/Python/2.7/site-packages/prawcore/auth.py", line 31, in _post
    raise ResponseException(response)
prawcore.exceptions.ResponseException: received 401 HTTP response

Comments not saved, only links

The current version only saves link URLs and their titles, but saved comments are lost. However, I was able to dive into the code and fix it for my use. Here are the changes I made in order to also grab author, submission body, and a more readable time. I'm not sure if this is the right place to put this as I'm pretty new to github and coding in general, but the git diff is below:

diff --git a/export_saved.py b/export_saved.py
index 7e4f5db..0e88d5f 100755
--- a/export_saved.py
+++ b/export_saved.py
@@ -11,6 +11,7 @@ import argparse
 import csv
 import logging
 import sys
+import datetime

 import praw

@@ -212,19 +213,33 @@ def get_csv_rows(reddit, seq):
             created = int(i.created)
         except ValueError:
             created = 0
+
+        createdreadable = datetime.datetime.fromtimestamp(int(created)).strftime('%Y-%m-%d %H:%M:%S')

         try:
             folder = str(i.subreddit).encode('utf-8').decode('utf-8')
         except AttributeError:
             folder = "None"
+
+        try:
+            body = "N/A"
+            body = str(i.body).encode('utf-8').decode('utf-8')
+        except AttributeError:
+            body = "N/A"

+        try:
+            author = "N/A"
+            author = str(i.author).encode('utf-8').decode('utf-8')
+        except AttributeError:
+            author = "N/A"
+
         if callable(i.permalink):
             permalink = i.permalink()
         else:
             permalink = i.permalink
         permalink = permalink.encode('utf-8').decode('utf-8')

-        csv_rows.append([reddit_url + permalink, title, created, None, folder])
+        csv_rows.append([reddit_url + permalink, title, created, createdreadable, body, author, None, folder])

     return csv_rows

@@ -239,7 +254,7 @@ def write_csv(csv_rows, file_name=None):
     file_name = file_name if file_name is not None else 'export-saved.csv'

     # csv setting
-    csv_fields = ['URL', 'Title', 'Created', 'Selection', 'Folder']
+    csv_fields = ['URL', 'Submission Title', 'Created-UNIX', 'Created-Standard', 'Body', 'Username', 'Selection', 'Folder']
     delimiter = ','

     # write csv using csv module
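The repeated try/except blocks in the diff could also be written with getattr and a default value, which avoids assigning "N/A" twice. The sketch below assumes, as the diff does, that saved items expose created, body, and author as attributes. extra_columns is a hypothetical helper, and SimpleNamespace objects stand in for PRAW items. It formats the timestamp in UTC so the output does not depend on the machine's timezone, which is a deliberate difference from the diff's local-time conversion.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def extra_columns(item):
    """Return (created_readable, body, author) for a saved item."""
    try:
        created = int(getattr(item, "created", 0))
    except (TypeError, ValueError):
        created = 0
    created_readable = datetime.fromtimestamp(created, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )
    # Links have no body; getattr falls back to "N/A" instead of raising.
    body = str(getattr(item, "body", "N/A"))
    author = str(getattr(item, "author", "N/A"))
    return created_readable, body, author


# Stand-ins for a saved comment and a saved link.
comment = SimpleNamespace(created=0, body="nice post", author="alice")
link = SimpleNamespace(created=86400)
```

The returned tuple can be spliced into the row in the same position the diff uses for createdreadable, body, and author.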

"Import praw" does not work

Ran the script, received this message:

import praw
ImportError: No module named praw

Searched for "praw" and found none. Please advise next step.
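One way to confirm this kind of ImportError is to ask the interpreter whether it can locate the module at all. The sketch below uses only the standard library. missing_modules is a hypothetical helper, and the assumption is that praw is meant to be installed by pip install -r requirements.txt using the same Python interpreter that runs the script.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["praw"])
if missing:
    # Install with the same interpreter that will run export_saved.py.
    print("Missing:", ", ".join(missing))
    print("Try:", sys.executable, "-m pip install -r requirements.txt")
```

Running pip through sys.executable sidesteps the common case where pip belongs to a different Python installation than the one on the command line.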

Usage instructions

I've only found installation instructions in the repository and on the Github page. I do not know how to use this script.
Please improve the readme.

Improve performance

Tried running the script today and it was quite slow (running with verbose output on, it looked like it was taking more than 1 second to process each item). We need to identify the bottleneck and fix it. If we're bottlenecked on network or some other kind of I/O, we should rewrite the code to be async/parallelized.
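Before rewriting anything, it may help to measure where the time goes. The sketch below times a per-item function run serially and then on a thread pool, which tends to help only when the cost is waiting on the network or other I/O. fetch_item is a stand-in that sleeps; it is not the script's real per-item work, and whether PRAW objects can safely be used from several threads is not established here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_item(i):
    """Stand-in for per-item work dominated by waiting (e.g. a request)."""
    time.sleep(0.05)
    return i * 2


def timed(label, func):
    """Run func(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result


items = list(range(10))
serial = timed("serial", lambda: [fetch_item(i) for i in items])
with ThreadPoolExecutor(max_workers=5) as pool:
    # pool.map preserves input order, so results line up with `items`.
    threaded = timed("threaded", lambda: list(pool.map(fetch_item, items)))
```

If the threaded run is not noticeably faster on the real workload, the bottleneck is more likely CPU work or rate limiting than I/O latency.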

UnicodeEncodeError

$ python export_saved.py
Traceback (most recent call last):
  File "export_saved.py", line 305, in <module>
    main()
  File "export_saved.py", line 299, in main
    save_saved(reddit)
  File "export_saved.py", line 267, in save_saved
    process(reddit, seq, "export-saved", "Reddit - Saved")
  File "export_saved.py", line 247, in process
    csv_rows = get_csv_rows(reddit, seq)
  File "export_saved.py", line 194, in get_csv_rows
    logging.debug('title: {}'.format(title))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 90: ordinal not in range(128)

traceback errors trying to run export-saved.py

This is what I'm getting from cmd:

C:\Users\Bill\export-saved-reddit>python export-saved.py
Traceback (most recent call last):
  File "export-saved.py", line 148, in <module>
    main()
  File "export-saved.py", line 81, in main
    r = praw.Reddit(user_agent='export saved 1.0')
  File "F:\Program Files (x86)Python27\lib\site-packages\praw\reddit.py", line 114, in __init__
    raise ClientException(required_message.format(attribute))
praw.exceptions.ClientException: Required configuration setting 'client_id' missing.
This setting can be provided in a praw.ini file, as a keyword argument to the Reddit class constructor, or as an environment variable.

Any help is appreciated! Fairly inexperienced with programming.

PRAW deprecation warning

After using AccountDetails.py I get the following deprecation warning and nothing is exported.

c:\Python27\lib\site-packages\praw\decorators.py:74: DeprecationWarning: reddit intends to disable password-based authentication of API clients sometime in the near future. As a result this method will be removed in a future major version of PRAW.

For more information please see:

* Original reddit deprecation notice: https://www.reddit.com/comments/2ujhkr/

* Updated delayed deprecation notice: https://www.reddit.com/comments/37e2mv/

Pass "disable_warning=True" to "login" to disable this warning.
 warn(msg, DeprecationWarning)

ModuleNotFoundError: No module named 'praw'

Forgive me for being a noob but I can't run instruction #4 "Back in your shell, run python export_saved.py in the export-saved-reddit folder". It gives me this:

Traceback (most recent call last):
  File "export_saved.py", line 15, in <module>
    import praw
ModuleNotFoundError: No module named 'praw'

Praw is required

After running python export-saved.py in terminal:

python export-saved.py  
Traceback (most recent call last):  
  File "export-saved.py", line 13, in <module>  
    import praw   
ImportError: No module named praw

It seems that Praw is required. I was able to run this script only after installing The Python Reddit Api Wrapper.
If this is correct, please add Praw to 'requirements'.

invalid_grant error processing request

I get this error dunno why

Traceback (most recent call last):
  File "export_saved.py", line 336, in <module>
    main()
  File "export_saved.py", line 330, in main
    save_saved(reddit)
  File "export_saved.py", line 292, in save_saved
    seq = reddit.user.me().saved(limit=None)
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\praw\models\user.py", line 60, in me
    user_data = self._reddit.get(API_PATH['me'])
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\praw\reddit.py", line 320, in get
    data = self.request('GET', path, params=params)
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\praw\reddit.py", line 404, in request
    params=params)
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\prawcore\sessions.py", line 135, in request
    self._authorizer.refresh()
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\prawcore\auth.py", line 328, in refresh
    password=self._password)
  File "C:\Users\afons\AppData\Local\Programs\Python\Python37-32\lib\site-packages\prawcore\auth.py", line 142, in _request_token
    payload.get('error_description'))
prawcore.exceptions.OAuthException: invalid_grant error processing request

Can't deal with non-ascii characters

When the script encounters a saved post with a title containing a non-ascii character, it displays the error "UnicodeEncodeError: 'ascii' codec can't encode characters in position 42-44: ordinal not in range(128)"
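On Python 3, one way to avoid this class of error when writing the CSV is to open the output file with an explicit UTF-8 encoding instead of relying on the platform default. This is a minimal sketch, not the script's actual write_csv. write_rows_utf8 is a hypothetical name, the header row reuses the CSV field names that appear in the diff earlier on this page, and the sample row is made up.

```python
import csv
import os
import tempfile


def write_rows_utf8(rows, file_name):
    """Write rows to file_name as UTF-8 CSV, whatever the locale default is."""
    # newline="" lets the csv module control line endings itself.
    with open(file_name, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter=",")
        writer.writerow(["URL", "Title", "Created", "Selection", "Folder"])
        writer.writerows(rows)


# Made-up sample row whose title contains non-ASCII characters.
path = os.path.join(tempfile.mkdtemp(), "export-saved.csv")
write_rows_utf8(
    [["https://www.reddit.com/r/x/1", "Caf\u00e9 \u2014 menu", 0, None, "x"]],
    path,
)
```

Under Python 2 the csv module works on bytes rather than text, so this approach does not carry over directly; running the script under Python 3 is the simpler route.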

praw version issues

Hi,

Thanks for creating this. However, it doesn't work for me.

I've done the pip install. I renamed the example AccountDetails file to AccountDetails.py (and filled in username + password).

When executing it, I'm seeing this error message (using Python 2.7.9):

λ C:\Applications\Python27\python export-saved.py
Traceback (most recent call last):
  File "export-saved.py", line 72, in <module>
    main()
  File "export-saved.py", line 59, in main
    r.login(AccountDetails.REDDIT_USERNAME, AccountDetails.REDDIT_PASSWORD)
  File "C:\Applications\Python27\lib\site-packages\praw\__init__.py", line 1266, in login
    self.user = self.get_redditor(user)
  File "C:\Applications\Python27\lib\site-packages\praw\__init__.py", line 890, in get_redditor
    return objects.Redditor(self, user_name, *args, **kwargs)
  File "C:\Applications\Python27\lib\site-packages\praw\objects.py", line 663, in __init__
    fetch, info_url)
  File "C:\Applications\Python27\lib\site-packages\praw\objects.py", line 72, in __init__
    self.has_fetched = self._populate(json_dict, fetch)
  File "C:\Applications\Python27\lib\site-packages\praw\objects.py", line 127, in _populate
    json_dict = self._get_json_dict() if fetch else {}
  File "C:\Applications\Python27\lib\site-packages\praw\objects.py", line 120, in _get_json_dict
    as_objects=False)
  File "C:\Applications\Python27\lib\site-packages\praw\decorators.py", line 161, in wrapped
    return_value = function(reddit_session, *args, **kwargs)
  File "C:\Applications\Python27\lib\site-packages\praw\__init__.py", line 526, in request_json
    data = json.loads(response, object_hook=hook)
  File "C:\Applications\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Applications\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Applications\Python27\lib\json\decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Any clue as to what could be wrong?

Thanks!

AttributeError: module 'praw' has no attribute 'Reddit'

I've been trying to solve this all day and have been stumped. A few months ago I tried to get this script working, to no avail; the problem then was that even though I had praw installed, it wasn't getting picked up. Today I tried again. This time I downloaded praw and copied it into the export-saved-reddit folder, and that worked. Then I got another error that was resolved by an issue on this repo (I wasn't using version 4.40, and pip wasn't working for some reason). Next I got an error about unicode, which another issue solved (I needed to run it in Python 3). So I ran it again and got:
Traceback (most recent call last):
  File "export_saved.py", line 336, in <module>
    main()
  File "export_saved.py", line 321, in main
    reddit = login(args=args)
  File "export_saved.py", line 126, in login
    reddit = praw.Reddit(client_id=client_id,
AttributeError: module 'praw' has no attribute 'Reddit'
So I'm thinking, screw this, and I made an Ubuntu VM and set it up, and still got the exact same error. I feel like I'm doing something wrong, but I followed the guide and put the right info in the account details (obvious because of the unicode error).
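One possible cause of this error, though the report does not confirm it, is that import praw is picking up something other than the installed package, for example a praw folder or a praw.py file sitting next to the script. The sketch below shows where a module name would be loaded from. module_origin is a hypothetical helper, and json is used only because it is always available.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is None:
        # A bare folder without __init__.py is treated as a namespace package.
        return "namespace package: " + str(list(spec.submodule_search_locations))
    return spec.origin


print(module_origin("json"))
print(module_origin("praw"))
```

If the second line points into the export-saved-reddit folder rather than site-packages, removing the local copy and reinstalling with pip would be the thing to try.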

No JSON object could be decoded error.

Hey, I tried running your program and got the following error:

Traceback (most recent call last):
  File "export-saved.py", line 72, in <module>
    main()
  File "export-saved.py", line 59, in main
    r.login(AccountDetails.REDDIT_USERNAME, AccountDetails.REDDIT_PASSWORD)
  File "/usr/local/lib/python2.7/dist-packages/praw/__init__.py", line 1266, in login
    self.user = self.get_redditor(user)
  File "/usr/local/lib/python2.7/dist-packages/praw/__init__.py", line 890, in get_redditor
    return objects.Redditor(self, user_name, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/praw/objects.py", line 663, in __init__
    fetch, info_url)
  File "/usr/local/lib/python2.7/dist-packages/praw/objects.py", line 72, in __init__
    self.has_fetched = self._populate(json_dict, fetch)
  File "/usr/local/lib/python2.7/dist-packages/praw/objects.py", line 127, in _populate
    json_dict = self._get_json_dict() if fetch else {}
  File "/usr/local/lib/python2.7/dist-packages/praw/objects.py", line 120, in _get_json_dict
    as_objects=False)
  File "/usr/local/lib/python2.7/dist-packages/praw/decorators.py", line 161, in wrapped
    return_value = function(reddit_session, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/praw/__init__.py", line 526, in request_json
    data = json.loads(response, object_hook=hook)
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
