gjjvdburg / paper2remarkable

Fetch an academic paper or web article and send it to the reMarkable tablet with a single command

License: MIT License

Python 98.81% Dockerfile 0.41% Makefile 0.78%
remarkable-tablet remarkable arxiv

paper2remarkable's Introduction

paper2remarkable

paper2remarkable is a command line program for quickly and easily transferring an academic paper to your reMarkable:

$ p2r https://arxiv.org/abs/1811.11242

There is also support for transferring an article from a website:

$ p2r https://hbr.org/2019/11/getting-your-team-to-do-more-than-meet-deadlines

The script can be run through the p2r command line program or via Docker (see below). If you're using MacOS, you might be interested in the Alfred workflow or Printing to p2r. On Linux, a background terminal such as Guake can be very handy. Note that even without a reMarkable, this program can make downloading papers easier (just use the -n flag).

Introduction

paper2remarkable makes it as easy as possible to get a PDF on your reMarkable from any of the following sources:

The program aims to be flexible about the exact source URL, so for many of the academic sources you can provide either a URL to the abstract page or a URL to the PDF file. If you have a source that you would like to see added to the list, let me know!

paper2remarkable takes the source URL and:

  1. Downloads the pdf
  2. Removes the arXiv timestamp (for arXiv sources)
  3. Crops the pdf to remove unnecessary borders
  4. Shrinks the pdf file to reduce the filesize
  5. Generates a nice filename based on author/title/year of the paper
  6. Uploads it to your reMarkable using rMapi.
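
Step 5 can be illustrated with a small sketch. This is not the project's actual implementation: the function name make_filename and the max_authors cutoff are hypothetical, chosen only to reproduce the style of filename that p2r reports in its verbose output (author last names, a separator, the title with underscores, and the year).

```python
import re


def make_filename(last_names, title, year, max_authors=3):
    """Build an 'Authors_-_Title_Year.pdf' style name (illustrative only)."""
    # Assumption: with many authors, only the first is kept, followed by "et_al".
    if len(last_names) > max_authors:
        author_part = last_names[0] + "_et_al"
    else:
        author_part = "_".join(last_names)
    # Drop punctuation other than hyphens, then join the words with underscores.
    words = re.sub(r"[^\w\s-]", "", title).split()
    title_part = "_".join(words)
    return f"{author_part}_-_{title_part}_{year}.pdf"


if __name__ == "__main__":
    print(
        make_filename(
            ["Burg", "Nazabal", "Sutton"],
            "Wrangling Messy CSV Files by Detecting Row and Type Patterns",
            2018,
        )
    )
```

If the generated name is not what you want, the --filename parameter lets you set it explicitly.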

Optionally, you can:

  • Download a paper but not upload to the reMarkable using the -n switch.
  • Insert a blank page after each page using the -b switch (useful for note taking!)
  • Center (-c) or right-align (-r) the pdf on the reMarkable (default is left-aligned), or disable cropping altogether (-k).
  • Provide an explicit filename using the --filename parameter
  • Specify the location on the reMarkable to place the file (default /)

Here's an example with verbose mode enabled that shows everything the script does by default:

$ p2r -v https://arxiv.org/abs/1811.11242
2019-05-30 00:38:27 - INFO - Starting ArxivProvider
2019-05-30 00:38:27 - INFO - Getting paper info from arXiv
2019-05-30 00:38:27 - INFO - Downloading url: https://arxiv.org/abs/1811.11242
2019-05-30 00:38:27 - INFO - Generating output filename
2019-05-30 00:38:27 - INFO - Created filename: Burg_Nazabal_Sutton_-_Wrangling_Messy_CSV_Files_by_Detecting_Row_and_Type_Patterns_2018.pdf
2019-05-30 00:38:27 - INFO - Downloading file at url: https://arxiv.org/pdf/1811.11242.pdf
2019-05-30 00:38:32 - INFO - Downloading url: https://arxiv.org/pdf/1811.11242.pdf
2019-05-30 00:38:32 - INFO - Removing arXiv timestamp
2019-05-30 00:38:34 - INFO - Cropping pdf file
2019-05-30 00:38:37 - INFO - Shrinking pdf file
2019-05-30 00:38:38 - INFO - Starting upload to reMarkable
2019-05-30 00:38:42 - INFO - Upload successful.

Installation

For ArchLinux, paper2remarkable can be installed through the Arch User Repository.

The script requires the following external programs to be available:

Specifically:

  1. First install rMAPI, using the instructions available here: https://github.com/juruen/rmapi#install

  2. Then install system dependencies:

    • Arch Linux: pacman -S pdftk ghostscript poppler
    • Ubuntu: apt-get install pdftk ghostscript poppler-utils. Replace pdftk with qpdf if your distribution doesn't package pdftk.
    • MacOS: brew install pdftk-java ghostscript poppler (using HomeBrew).
    • Windows: Installers or executables are available for qpdf (for instance the mingw binary executables) and GhostScript. Importantly, Windows support is untested and these are generic instructions, so we welcome clarifications where needed. The Docker instructions below may be more convenient on Windows.
  3. Finally, install paper2remarkable:

    $ pip install paper2remarkable
    

    This installs the p2r command line program.

Optionally, you can install:

  • pdftoppm (recommended for speed). Usually part of a Poppler installation.

  • the ReadabiliPy package with Node.js support, to allow using Readability.js for HTML articles. This is known to improve the output of certain web articles.

If any of the dependencies (such as rmapi or ghostscript) are not available on your PATH, you can pass their locations to the script with the relevant options (for instance p2r --rmapi /path/to/rmapi). If you run into trouble with the installation, please let me know by opening an issue on GitHub.
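
To see which external programs are missing from your PATH before running p2r, a quick check along these lines may help. It is only a sketch: the function missing_executables is hypothetical, and the flag-to-program mapping is taken from the defaults listed in the p2r help text (you will normally need only one of pdftk and qpdf).

```python
import shutil

# Option name -> default executable name, as listed in `p2r --help`.
DEFAULT_EXECUTABLES = {
    "--rmapi": "rmapi",
    "--gs": "gs",
    "--pdftk": "pdftk",
    "--qpdf": "qpdf",
    "--pdftoppm": "pdftoppm",
}


def missing_executables(executables=DEFAULT_EXECUTABLES, which=shutil.which):
    """Return the {option: program} pairs that cannot be found on PATH."""
    return {
        flag: name for flag, name in executables.items() if which(name) is None
    }


if __name__ == "__main__":
    for flag, name in missing_executables().items():
        print(f"{name} not found on PATH; pass its location with {flag}")
```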

Usage

The full help of the script is as follows. Hopefully the various command line flags are self-explanatory, but if you'd like more information see the man page (man p2r) or open an issue on GitHub.

usage: p2r [-h] [-b] [-c] [-d] [-e] [-n] [-p REMARKABLE_DIR] [-r] [-k] [-v]
           [-V] [-f FILENAME] [--gs GS] [--pdftoppm PDFTOPPM] [--pdftk PDFTK]
           [--qpdf QPDF] [--rmapi RMAPI] [--css CSS] [--font-urls FONT_URLS]
           [-C CONFIG] input [input ...]

Paper2reMarkable version 0.9.4

positional arguments:
  input                 One or more URLs to a paper or paths to local PDF
                        files

optional arguments:
  -h, --help            show this help message and exit
  -b, --blank           Add a blank page after every page of the PDF
  -c, --center          Center the PDF on the page, instead of left align
  -d, --debug           debug mode, doesn't upload to reMarkable
  -e, --experimental    enable experimental features
  -n, --no-upload       don't upload to reMarkable, save the output in current
                        directory
  -p REMARKABLE_DIR, --remarkable-path REMARKABLE_DIR
                        directory on reMarkable to put the file (created if
                        missing, default: /)
  -r, --right           Right align so the menu doesn't cover it
  -k, --no-crop         Don't crop the pdf file
  -v, --verbose         be verbose
  -V, --version         Show version and exit
  -f FILENAME, --filename FILENAME
                        Filename to use for the file on reMarkable
  --gs GS               path to gs executable (default: gs)
  --pdftoppm PDFTOPPM   path to pdftoppm executable (default: pdftoppm)
  --pdftk PDFTK         path to pdftk executable (default: pdftk)
  --qpdf QPDF           path to qpdf executable (default: qpdf)
  --rmapi RMAPI         path to rmapi executable (default: rmapi)
  --css CSS             path to custom CSS file for HTML output
  --font-urls FONT_URLS
                        path to custom font urls file for HTML output
  -C CONFIG, --config CONFIG
                        path to config file (default: ~/.paper2remarkable.yml)

By default paper2remarkable makes a PDF fit better on the reMarkable by changing the page size and removing unnecessary whitespace. Some tools for exporting a PDF with annotations do not handle different page sizes properly, causing annotations to be misplaced (see discussion). If this is an issue for you, you can disable cropping using the -k/--no-crop option to p2r.

For HTML sources (i.e. web articles) you can specify custom styling using the --css and --font-urls options. The default style in the HTML provider can serve as a starting point.

Local PDF or Postscript files can be supplied too, using p2r /path/to/file.pdf.

A configuration file can be used to provide commonly-used command line options. By default the configuration file at ~/.paper2remarkable.yml is used if it exists, but an alternative location can be provided with the -C/--config flag. Command line flags override the settings in the configuration file. See the config.example.yml file for an example configuration file and an overview of supported options.
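
The precedence rule (command line flags override the configuration file) can be pictured as a simple dictionary merge. This is a sketch under assumptions, not the project's code: the function merge_options and the option names used in the example are hypothetical, so consult config.example.yml for the real keys.

```python
def merge_options(config, cli):
    """Combine config-file settings with command line settings.

    Values given on the command line win; a value of None means the
    option was not supplied on the command line.
    """
    merged = dict(config)
    for key, value in cli.items():
        if value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    # Hypothetical option names, for illustration only.
    from_file = {"blank": True, "remarkable_dir": "/papers"}
    from_cli = {"blank": None, "remarkable_dir": "/inbox"}
    print(merge_options(from_file, from_cli))
```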

Alfred Workflow

On MacOS, you can optionally install this Alfred workflow. Alfred is a launcher for MacOS.

Once installed, you can use the rm command, or rmb to insert blank pages between pages for notes (the --blank option), followed by a URL. The global shortcut Alt-P will send the current selection to p2r. Note that by default --right is passed and p2r is executed in your bash environment. You can edit the Workflow in Alfred if this doesn't work for your setup.

Printing

Printing to p2r lets you save a document to your reMarkable tablet directly from a print dialog, with p2r processing it on the way.

On MacOS, you can follow the guide for printing with rmapi, but use the following bash script in place of the one given there:

for f in "$@"
do
	bash -c -l "p2r --right '$f'" 
done

Docker

If you'd like to avoid installing the dependencies directly on your machine, you can use the Dockerfile. To make this work you will need git and docker installed.

First clone this repository with git clone and cd inside of it, then build the container:

docker build -t p2r .

Authorization

paper2remarkable uses rMapi to sync documents to the reMarkable. The first time you run paper2remarkable you will have to authenticate rMapi using a one-time code provided by reMarkable. By default, rMapi uses the ${HOME}/.rmapi file as a configuration file to store the credentials, and so this is the location we will use in the commands below. If you'd like to use a different location for the configuration (for instance, ${HOME}/.config/rmapi/rmapi.conf), make sure to change the commands below accordingly.

If you already have a ~/.rmapi file with the authentication details, you can skip this section. Otherwise we'll create it and run rmapi in the docker container for authentication:

$ touch ${HOME}/.rmapi
$ docker run --rm -i -t -v "${HOME}/.rmapi:/home/user/.rmapi:rw" --entrypoint=rmapi p2r version

This command will print a link where you can obtain a one-time code to authenticate rMapi and afterwards print the rMapi version (the version number may be different):

ReMarkable Cloud API Shell
rmapi version: 0.0.12

Usage

Use the container by replacing p2r with docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" p2r, e.g.

# print help and exit
docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" p2r --help

# equivalent to above usage
docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" p2r -v https://arxiv.org/abs/1811.11242

# to transfer a local file in the current directory
docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" -v "$(pwd):/home/user:ro" p2r -v localfile.pdf

For transferring local files using the Docker image, you may find this helper function useful.
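
The linked helper is not reproduced here. As a rough stand-in, the sketch below builds the same docker run command as the local-file example above for a file in any directory, by mounting that file's directory read-only. The function name docker_p2r_command is hypothetical, and the image name p2r assumes you built the container with docker build -t p2r . as described earlier.

```python
import os
import subprocess


def docker_p2r_command(local_file, home=None, extra_args=("-v",)):
    """Return the argument list for running p2r in Docker on a local file."""
    home = home or os.path.expanduser("~")
    directory, name = os.path.split(os.path.abspath(local_file))
    return [
        "docker", "run", "--rm",
        # rMapi credentials, as in the other Docker examples.
        "-v", f"{home}/.rmapi:/home/user/.rmapi:rw",
        # Mount the file's directory read-only at the container's working directory.
        "-v", f"{directory}:/home/user:ro",
        "p2r", *extra_args, name,
    ]


if __name__ == "__main__":
    import sys

    # Usage: python docker_p2r.py /path/to/file.pdf
    if len(sys.argv) > 1:
        subprocess.run(docker_p2r_command(sys.argv[1]), check=True)
```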

You can also create an alias in your ~/.bashrc file to abstract away the Docker commands:

# in ~/.bashrc

alias p2r="docker run --rm -v \"${HOME}/.rmapi:/home/user/.rmapi:rw\" p2r"

After running source ~/.bashrc to activate the alias, you can then use paper2remarkable through Docker by calling p2r from the command line.

Notes

License: MIT

If you find a problem or want to suggest a feature, please open an issue on GitHub. You're helping to make this project better for everyone!

Thanks to all the contributors who've helped to support the project.

paper2remarkable's People

Contributors

claytonjy, gjjvdburg, gwtaylor, jayanth-kumar5566, kazy, reinierkoops, savagej, sirupsen

paper2remarkable's Issues

Installation Struggles on MacOS

Had issues on installing on a MacOS M1 / Ventura

because:

Installing Rmapi

go get -u github.com/juruen/rmapi --> had to install go --> newest version of go does not support get -u anymore.

Tried to install it via

git clone https://github.com/juruen/rmapi 
cd rmapi
go install

But then p2r could not find the binary.

So instead I downloaded the rmapi binary, and placed this in /usr/local/bin (Don't think it is the best way to go).

Installing dependencies

this worked

installing paper2remarkable

pip did not work, so I had to install it. But then only pip3 was possible to launch.
So I had to do
pip3 install paper2remarkable

running paper2remarkable

Then I ran into a problem that weasyprint cannot load library 'gobject-2.0-0': dlopen(gobject-2.0-0, 0x0002):
Because the symlinks are not working?!
Fixed this with:
sudo ln -s /opt/homebrew/lib /usr/local/lib (Don't think it is the best way to go).

But then when running p2r from the terminal it says:
zsh: command not found: p2r

Did a system-search, and found p2r at: /Users/XXXX/Library/Python/3.9/bin/p2r

Then I ran the command:
/Users/XXXXX/Library/Python/3.9/bin/p2r https://arxiv.org/abs/1811.11242

and it finally worked.....
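
For anyone hitting the same "command not found: p2r" message: the path above looks like pip's per-user scripts directory, which is often not on PATH. The sketch below is only a guess at a diagnosis, assuming a per-user install on a POSIX-style layout; the helper names are hypothetical. It prints the line you could add to your shell profile.

```python
import os
import site


def user_scripts_dir():
    """Directory where per-user pip installs place scripts (POSIX-style layout)."""
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_value=None):
    """Check whether `directory` is one of the entries of PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normpath(directory) in entries


if __name__ == "__main__":
    scripts = user_scripts_dir()
    if not is_on_path(scripts):
        # Suggest a line for ~/.zshrc or ~/.bashrc.
        print(f'export PATH="{scripts}:$PATH"')
```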

Any article on a website

nice work on this tool.

I saw that you want to add the possibility to source from any website article. I just created a tool which takes URLs (from the Pocket read-later app) and creates EPUBs for the reMarkable: https://github.com/GliderGeek/pocket2rm. It's written in Go, but there are plenty of "readability" packages for Python as well. Hope this helps.

Default PDF Title

Moving d4c682b here, good call @GjjvdBurg :)

Some papers, e.g. this one, don't provide the author/title metadata for a nice title 😢 I've been using the "Print to Remarkable" Chrome extension, but recently started trying to use p2r via Alfred to transfer all content to my reMarkable, both because of options like the blank pages between pages (brilliant idea, I've over-scribbled too many papers) and because I want to move to Firefox.

I think an option to use the PDF name (in the case above, w26752.pdf) as the default would be great for papers that lack this metadata, rather than raising an error :)

GLib-GObject-CRITICAL error when copying a website

Hi,
thanks for providing this tool! When importing websites I get the following error (although it works and I get a PDF on the reMarkable 2)

p2r https://github.com/GjjvdBurg/paper2remarkable

(process:32820): GLib-GObject-CRITICAL **: 14:48:49.765: g_object_ref: assertion '!object_already_finalized' failed
[1]    32820 bus error  p2r https://github.com/GjjvdBurg/paper2remarkable

I'm on MacOS 11.4 and I installed all dependencies as described in the README. Maybe someone can point me to what is causing the error?

Fails for local pdf

I try to upload a pdf I have locally on my computer, and get this error message:

Traceback (most recent call last):
  File "/usr/local/bin/p2r", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/__main__.py", line 12, in main
    sys.exit(realmain())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/ui.py", line 96, in main
    url = follow_redirects(url)
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/utils.py", line 102, in follow_redirects
    req = requests.head(url, allow_redirects=False)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 101, in head
    return request('head', url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 519, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 462, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 313, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 387, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL '/home/user/document.pdf': No schema supplied. Perhaps you meant http:///home/user/document.pdf?

So it seems like it's trying to fetch it from online.
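
One way to avoid this kind of failure is to decide whether an input is a local file or a URL before any HTTP request is made. The following is a minimal sketch of that idea, not the fix that was applied in the project; the function classify_input is hypothetical.

```python
import os
from urllib.parse import urlparse


def classify_input(src, exists=os.path.exists):
    """Classify a command line input as 'local', 'url', or 'unknown'."""
    # An existing path is treated as a local file and is never fetched.
    if exists(src):
        return "local"
    # Only inputs with an explicit http(s) scheme are treated as URLs.
    if urlparse(src).scheme in ("http", "https"):
        return "url"
    return "unknown"


if __name__ == "__main__":
    for item in ("/home/user/document.pdf", "https://arxiv.org/abs/1811.11242"):
        print(item, "->", classify_input(item))
```

With a check like this, an input such as /home/user/document.pdf would not reach requests at all, so the MissingSchema error above could not occur.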

Error on auth.go, Docker version

I realise that this may be a bug on the rmapi side, but thought it would be good to bring to your attention.

I'm trying to use the docker version of this application and for every file I try to download/upload, I run into the following error:

ERROR: 2020/11/11 23:06:36 auth.go:64: Code has the wrong length, it should be 8

System: Fedora 32
Device: Remarkable 2

Let me know what else you might need. As far as I can tell this happens right after removing the timestamp.

I'd also be happy to help where I can--I'm pretty comfortable with Python.

docker image load local files

I cannot seem to be able to load local files using the docker image.

~/Downloads$ docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" p2r -v main.pdf
/usr/local/lib/python3.7/site-packages/weasyprint/document.py:36: UserWarning: There are known rendering problems and missing features with cairo < 1.15.4. WeasyPrint may work with older versions, but please read the note about the needed cairo version on the "Install" page of the documentation before reporting bugs. http://weasyprint.readthedocs.io/en/latest/install.html
'There are known rendering problems and missing features with '
Traceback (most recent call last):
  File "/usr/local/bin/p2r", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/__main__.py", line 13, in main
    sys.exit(realmain())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/ui.py", line 138, in main
    url, cookiejar = follow_redirects(args.input)
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/utils.py", line 118, in follow_redirects
    url, headers=HEADERS, allow_redirects=False, cookies=jar
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 104, in head
    return request('head', url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 314, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 388, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'main.pdf': No schema supplied. Perhaps you meant http://main.pdf?

[Errno 2] No such file or directory

When inputting multiple papers I get a directory error:

>>> p2r -V
0.9.3
>>> p2r -v -p /papers/arxiv/24-05-2021/ https://arxiv.org/abs/2105.09956 https://arxiv.org/abs/2105.10474 https://arxiv.org/abs/2105.10163

2021-05-24 09:58:14 - INFO - Starting Arxiv provider
2021-05-24 09:58:14 - INFO - Generating output filename
2021-05-24 09:58:14 - INFO - Getting paper info
2021-05-24 09:58:15 - INFO - Downloaded url: https://arxiv.org/abs/2105.09956
2021-05-24 09:58:15 - INFO - Created filename: Sazonova_et_al_-_Are_All_Post-Starbursts_Mergers_HST_Reveals_Hidden_Disturbances_in_the_Majority_of_PSBs_2021.pdf
2021-05-24 09:58:15 - INFO - Downloading file at url: https://arxiv.org/pdf/2105.09956.pdf
2021-05-24 09:58:33 - INFO - Downloaded url: https://arxiv.org/pdf/2105.09956.pdf
2021-05-24 09:58:34 - INFO - Removing arXiv timestamp ... success
2021-05-24 09:59:10 - INFO - Preparing PDF using crop operation
2021-05-24 09:59:19 - INFO - Processing pages ... (10/31)
2021-05-24 09:59:26 - INFO - Processing pages ... (20/31)
2021-05-24 09:59:29 - INFO - Processing pages ... (30/31)
2021-05-24 09:59:32 - INFO - Processing pages ... (31/31)
2021-05-24 09:59:32 - INFO - Shrinking pdf file ...
2021-05-24 09:59:51 - INFO - Shrinking has no effect for this file, using original.
2021-05-24 09:59:51 - INFO - Starting upload to reMarkable
2021-05-24 10:00:03 - INFO - Upload successful.
2021-05-24 10:00:03 - INFO - Starting Arxiv provider
2021-05-24 10:00:03 - INFO - Generating output filename
2021-05-24 10:00:03 - INFO - Getting paper info
2021-05-24 10:00:04 - INFO - Downloaded url: https://arxiv.org/abs/2105.10474
2021-05-24 10:00:04 - INFO - Created filename: Zhang_et_al_-_Trinity_I_Self-Consistently_Modeling_the_Dark_Matter_Halo-Galaxy-Supermassive_Black_Hole_Connection_From_Z_0-10_2021.pdf
[Errno 2] No such file or directory

However, when I do them one by one, it succeeds

PS. this is such a great tool!

Transfer of multiple files with similar names

Hello!

I don't know if this is a bug or a feature request, but sometimes I have a lot of PDFs I would like to move at once. My thought was to just use p2r /path/to/dir/*.pdf, but it only sends over the first PDF successfully and then throws the error message

[Errno 2] No such file or directory

For files with similar names, e.g. notes_47, an easy workaround is to just put the p2r command in a for loop, but I was wondering if there is an easier way to automate this.
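
The for-loop workaround can be scripted so that p2r is invoked once per file. This is a sketch of that workaround only, assuming p2r is on your PATH; the function send_each is hypothetical and does nothing about the underlying error.

```python
import subprocess


def send_each(paths, run=subprocess.run, extra_args=()):
    """Run p2r separately for every path and collect the exit codes."""
    results = {}
    for path in paths:
        # One p2r process per file, instead of passing all files at once.
        completed = run(["p2r", *extra_args, path])
        results[path] = completed.returncode
    return results


if __name__ == "__main__":
    import glob
    import sys

    # Usage: python send_each.py '/path/to/dir/*.pdf'
    if len(sys.argv) > 1:
        codes = send_each(sorted(glob.glob(sys.argv[1])))
        failed = [p for p, code in codes.items() if code != 0]
        print(f"{len(codes) - len(failed)} sent, {len(failed)} failed")
```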

pdf timetables

When cropping a document the timetable isn't present in the output file. Since all pages from the original file are kept (even the empty ones), would it be possible to keep the timetable or add it afterwards?

FileNotFoundError: [WinError 2] The system cannot find the file specified

I am running the p2r command on windows command line with a path to the pdf that exists. This is the last step of my program and I keep running into the FileNotFound error. I have verified that the file exists on the [FILEPATH]. Any help would be appreciated.

p2r -d C:[FILEPATH]\puzzle.pdf

Fontconfig error: Cannot load default config file: No such file: (null)

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Scripts\p2r.exe\__main__.py", line 7, in <module>
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\site-packages\paper2remarkable\__main__.py", line 13, in main
    sys.exit(realmain())
             ^^^^^^^^^^
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\site-packages\paper2remarkable\ui.py", line 337, in main
    runner(args.input, filenames, options, debug=args.debug)
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\site-packages\paper2remarkable\ui.py", line 309, in runner
    prov.run(new_input, filename=filename)
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\site-packages\paper2remarkable\providers\_base.py", line 222, in run
    intermediate_fname = op(intermediate_fname)
                         ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\site-packages\paper2remarkable\providers\_base.py", line 154, in rewrite_pdf
    status = subprocess.call(
             ^^^^^^^^^^^^^^^^
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 389, in call
    with Popen(*popenargs, **kwargs) as p:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\[USERPROFILE]\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 1493, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

Processed pdfs seems to lag the remarkable

First of all, thanks a lot for doing this tool, it is quite useful!

However, something in the way it processes the PDFs seems to make them difficult for my reMarkable to handle. This happens, for example, with this paper: https://arxiv.org/abs/2002.09339

If I copy the arXiv PDF directly on the remarkable, it has no trouble loading and displaying it, however when I process it with p2r -c, the resulting PDF seems to cause trouble to the tablet, notably:

  • it does not display any thumbnail for it
  • opening it and turning pages is slow (several seconds, sometimes up to 15-20s), to the point that the tablet even shows its 3 dots loading animation just to turn the page
  • it seems to overall lag the tablet, even pressing the center button to close the PDF takes several seconds during which the tablet is basically unresponsive.

I also just tested and uploading the paper with p2r --no-crop causes the same issues.

Performance

This is not overly a nuisance for me because I run the script async, but it does make it a little less pleasant to use. This 41 page PDF takes over one minute!

dotfiles $ time p2r https://www.nber.org/papers/w26752.pdf --filename test

real    1m4.323s
user    0m47.352s
sys     0m3.790s

adding blank pages fails

Hi,

when I try to use p2r with the -b flag, i always get "can't set attribute". I've tracked down the problem to the call pdf.pages=[].
If instead I use a new Pdf object to add the pages to, everything seems to work fine. Namely, I'm now using the following code:

import os

from pikepdf import Pdf


def blank_pdf(filepath):
    """Add blank pages to PDF"""
    # `logger` is the logger already defined in the module this function lives in.
    logger.info("Adding blank pages")
    pdf = Pdf.open(filepath)
    # Copy the pages into a fresh Pdf instead of assigning to pdf.pages.
    dst = Pdf.new()
    for page in pdf.pages:
        dst.pages.append(page)
        dst.add_blank_page()
    output_file = os.path.splitext(filepath)[0] + "-blank.pdf"
    dst.save(output_file)
    return output_file

anyway, thank you for your work!

pdf not found on remarkable

This is a great tool for RM. Thanks for developing this. I might be a newbie but I am not able to see the pdf document being sent to the RM2. I checked the Rmapi and it works. I also looked using the "find" command inside the terminal of the Remarkable. Is there a specific folder that it usually goes to? Or is this a bug in the rmapi maybe?
Here is a snapshot of what I get when I run p2r:

image

Exported annotations on cropped PDFs don't align

If I send a PDF to remarkable with cropping enabled (e.g. with --right or default parameters), the resulting annotation is at the wrong position when exporting the PDF. In this example, you can see the view in the remarkable app on the left, and the corresponding PDF export on the right:
Screenshot 2020-11-20 at 11 11 48
With --no-crop everything is fine:
Screenshot 2020-11-20 at 11 18 24

Remarkable is at 2.4.1.30, the Mac desktop app is 2.3.1

Creating directory fails if it doesn't exist

I'm trying to save papers to specific directories, but creating the directory fails if it doesn't exist.
If I manually create the directory on the Remarkable first, it works.

I'm on version 2.0.2.0.

Command I used:
p2r https://arxiv.org/abs/1811.11242 -p /Testdir --filename test.pdf

Error message:

ERROR: 2020/01/09 15:22:36 main.go:17: Error:  directory doesn't exist
ERROR: Creating directory /testdir on reMarkable failed
Error occurred. Exiting.

upload to directory fails if file already exists in top-level 'My Files' directory

If I try to upload a file to some directory on the reMarkable with the '-p' option, and the file already exists in the top-level folder, the transfer fails.

Maybe the program first transfers the document to the remarkable in the top-level folder, and then moves it?

An alternative would be to rename it with some random name, transfer it, and then rename again when moving to the final directory.

Error message:

ERROR: 2020/01/10 14:01:09 main.go:17: Error:  entry already exists
Traceback (most recent call last):
  File "/usr/local/bin/p2r", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/__main__.py", line 13, in main
    sys.exit(realmain())
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/ui.py", line 127, in main
    prov.run(args.input, filename=args.filename)
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/providers/_base.py", line 124, in run
    rmapi_path=self.rmapi_path,
  File "/usr/local/lib/python3.7/site-packages/paper2remarkable/utils.py", line 124, in upload_to_remarkable
    "Uploading file %s to reMarkable failed" % filepath
paper2remarkable.exceptions.RemarkableError: ERROR: Uploading file document.pdf to reMarkable failed

Keep internal links

Version 2.6 of the reMarkable OS supports internal links in PDFs. Without having done a deeper analysis, it looks to me like p2r removes hyperlinks during processing. If that's really the case, keeping them would be a great feature to add.

Error Building wheel for paper2remarkable (PEP 517)

To preface, I'm a physics student so I'm not an expert at this stuff so I apologize if this is a very simple issue.

I am using Python 3.6, and pip, setuptools, wheels are fully updated. I am running an Ubuntu 18.04 subsystem on Windows 10. I have all the dependencies (rMAPI, etc) installed.

When I run pip install paper2remarkable, where paper2remarkable is the path to my local installation of the repo, I get the following error message:

Building wheels for collected packages: paper2remarkable
Building wheel for paper2remarkable (PEP 517) ... error
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 /usr/local/lib/python3.6/dist-packages/pip/_vendor/pep517/_in_process.py build_wheel /tmp/tmpmaiapbt0

cwd: /tmp/pip-req-build-t42w8_q6
Complete output (97 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/paper2remarkable
copying paper2remarkable/__init__.py -> build/lib/paper2remarkable
copying paper2remarkable/__main__.py -> build/lib/paper2remarkable
copying paper2remarkable/__version__.py -> build/lib/paper2remarkable
copying paper2remarkable/crop.py -> build/lib/paper2remarkable
copying paper2remarkable/exceptions.py -> build/lib/paper2remarkable
copying paper2remarkable/log.py -> build/lib/paper2remarkable
copying paper2remarkable/pdf_ops.py -> build/lib/paper2remarkable
copying paper2remarkable/ui.py -> build/lib/paper2remarkable
copying paper2remarkable/utils.py -> build/lib/paper2remarkable
creating build/lib/paper2remarkable/providers
copying paper2remarkable/providers/__init__.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/_base.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/_info.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/acm.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/arxiv.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/citeseerx.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/cvf.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/html.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/jmlr.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/local.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/nature.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/nber.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/neurips.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/openreview.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/pdf_url.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/pmlr.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/pubmed.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/sagepub.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/science_direct.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/semantic_scholar.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/springer.py -> build/lib/paper2remarkable/providers
copying paper2remarkable/providers/tandfonline.py -> build/lib/paper2remarkable/providers
running egg_info
writing paper2remarkable.egg-info/PKG-INFO
writing dependency_links to paper2remarkable.egg-info/dependency_links.txt
writing entry points to paper2remarkable.egg-info/entry_points.txt
writing requirements to paper2remarkable.egg-info/requires.txt
writing top-level names to paper2remarkable.egg-info/top_level.txt
reading manifest file 'paper2remarkable.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'p2r.1'
warning: no previously-included files found matching 'Makefile'
warning: no previously-included files found matching '.gitignore'
warning: no previously-included files found matching 'Dockerfile'
warning: no previously-included files found matching 'make_release.py'
no previously-included directories found matching 'old'
writing manifest file 'paper2remarkable.egg-info/SOURCES.txt'
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/__init__.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/__main__.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/__version__.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/crop.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/exceptions.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/log.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/pdf_ops.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
creating build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/__init__.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/_base.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/_info.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/acm.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/arxiv.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/citeseerx.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/cvf.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/html.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/jmlr.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/local.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/nature.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/nber.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/neurips.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/openreview.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/pdf_url.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/pmlr.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/pubmed.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/sagepub.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/science_direct.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/semantic_scholar.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/springer.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/providers/tandfonline.py -> build/bdist.linux-x86_64/wheel/paper2remarkable/providers
copying build/lib/paper2remarkable/ui.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
copying build/lib/paper2remarkable/utils.py -> build/bdist.linux-x86_64/wheel/paper2remarkable
running install_data
creating build/bdist.linux-x86_64/wheel/paper2remarkable-0.8.6.data
creating build/bdist.linux-x86_64/wheel/paper2remarkable-0.8.6.data/data
creating build/bdist.linux-x86_64/wheel/paper2remarkable-0.8.6.data/data/man
creating build/bdist.linux-x86_64/wheel/paper2remarkable-0.8.6.data/data/man/man1
error: can't copy 'p2r.1': doesn't exist or not a regular file
ERROR: Failed building wheel for paper2remarkable
Failed to build paper2remarkable
ERROR: Could not build wheels for paper2remarkable which use PEP 517 and cannot be installed directly

I've run out of luck with Google and figured I'd ask directly here. I can't seem to find a p2r.1 anywhere and I'm not sure where to go from here. Any help? Thanks!
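
The step that fails in the log above is the copy of `p2r.1`, a man page that the build lists as a data file but that is not present in the source tree. As a rough illustration only, the check below reports which listed data files are missing before a build is attempted. The function name `missing_data_files` is hypothetical, and the `(target_directory, [sources])` layout is assumed to mirror the setuptools `data_files` convention; this is not the project's actual build configuration.

```python
import os
import tempfile


def missing_data_files(data_files, root="."):
    """Return the listed data files that do not exist under `root`.

    `data_files` is assumed to follow the setuptools convention:
    a list of (target_directory, [source_paths]) tuples.
    """
    missing = []
    for _target, sources in data_files:
        for src in sources:
            if not os.path.isfile(os.path.join(root, src)):
                missing.append(src)
    return missing


# Hypothetical example: an empty source tree where the man page is absent.
with tempfile.TemporaryDirectory() as root:
    result = missing_data_files([("man/man1", ["p2r.1"])], root=root)
    print(result)
```

If the list is non-empty, the missing files would have to be generated or added to the checkout before the wheel can be built.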

Can't get -p option (or --remarkable-path) to work

No matter what I try, the PDF file ends up in the / directory. Here's a sample session:

rob@Platinum:/mnt/e/downloads$ p2r -p /Tools temp.pdf
rob@Platinum:/mnt/e/downloads$ rmapi ls /Tools
Refreshing tree...
WARNING!!!
  Using the new 1.5 sync, this has not been fully tested yet!!!
  Make sure you have a backup, in case there is a bug that could cause data loss!
[f]     Shapes
rob@Platinum:/mnt/e/downloads$ rmapi ls /
Refreshing tree...
WARNING!!!
  Using the new 1.5 sync, this has not been fully tested yet!!!
  Make sure you have a backup, in case there is a bug that could cause data loss!
[d]     1 - Projects
[f]     Print Friendly & PDF
[f]     Quick sheets
[d]     3 - Resources
[d]     2 - Areas
[f]     temp
[d]     Tools
[d]     eBooks-Unread
[d]     eBooks-Read
[d]     4- Archives
rob@Platinum:/mnt/e/downloads$

Note that although I'm using a local file here, I started out using a URL and simplified down to the local file to minimize moving parts.

I'm running Ubuntu via WSL2 on Windows 10.

wand.exceptions.PolicyError raised during Cropping

Steps to reproduce:

  1. Clone the repository
  2. Install the Python Dependencies
  3. Try to sync https://arxiv.org/abs/1810.06339
  4. Exception raised after/during the PDF cropping stage

STD Output

Traceback (most recent call last):
  File "arxiv2remarkable.py", line 837, in <module>
    main()
  File "arxiv2remarkable.py", line 833, in main
    prov.run(args.input, filename=args.filename)
  File "arxiv2remarkable.py", line 317, in run
    intermediate_fname = op(intermediate_fname)
  File "arxiv2remarkable.py", line 144, in crop_pdf
    status = cropper.crop(margins=15)
  File "arxiv2remarkable.py", line 610, in crop
    return self.process_file(self.crop_page, margins=margins)
  File "arxiv2remarkable.py", line 617, in process_file
    status = page_func(page_idx, *args, **kwargs)
  File "arxiv2remarkable.py", line 630, in crop_page
    return self.process_page(page_idx, self.get_bbox, margins=margins)
  File "arxiv2remarkable.py", line 646, in process_page
    bbox = bbox_func(tmpfname, *args, **kwargs)
  File "arxiv2remarkable.py", line 678, in get_bbox
    im = pdf.pages[0].to_image(resolution=resolution)
  File "/home/ubik/.local/lib/python3.7/site-packages/pdfplumber/page.py", line 258, in to_image
    return PageImage(self, **kwargs)
  File "/home/ubik/.local/lib/python3.7/site-packages/pdfplumber/display.py", line 44, in __init__
    resolution
  File "/home/ubik/.local/lib/python3.7/site-packages/pdfplumber/display.py", line 25, in get_page_image
    with wand.image.Image(filename=page_path, resolution=resolution) as img:
  File "/home/ubik/.local/lib/python3.7/site-packages/wand/image.py", line 7495, in __init__
    self.read(filename=filename, resolution=resolution)
  File "/home/ubik/.local/lib/python3.7/site-packages/wand/image.py", line 7884, in read
    self.raise_exception()
  File "/home/ubik/.local/lib/python3.7/site-packages/wand/resource.py", line 240, in raise_exception
    raise e
wand.exceptions.PolicyError: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408
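
The error message points at an ImageMagick security policy that denies the `PDF` coder. A minimal sketch for checking whether a policy file blocks a given coder follows. The sample XML is an assumed illustration of the `policy.xml` format, not the reporter's actual file, and `coder_is_blocked` is a hypothetical helper; the location of the real policy file varies between systems.

```python
import xml.etree.ElementTree as ET

# Assumed example of an ImageMagick policy file that denies PDF handling.
SAMPLE_POLICY = """<policymap>
  <policy domain="coder" rights="none" pattern="PDF" />
  <policy domain="coder" rights="read|write" pattern="PNG" />
</policymap>"""


def coder_is_blocked(policy_xml, coder="PDF"):
    """Return True if a coder policy with rights="none" names `coder`."""
    root = ET.fromstring(policy_xml)
    for policy in root.iter("policy"):
        if policy.get("domain") != "coder":
            continue
        # Patterns may be a single name or a brace list such as {PS,PDF}.
        raw = policy.get("pattern", "").strip("{}")
        patterns = [p.strip() for p in raw.split(",")]
        if coder in patterns and policy.get("rights", "").lower() == "none":
            return True
    return False


print(coder_is_blocked(SAMPLE_POLICY))
```

If the check returns `True` for the system's policy file, the PDF coder would need to be granted read rights there before the cropping step can render pages.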

Dockerfile: Cairo version too old for Weasyprint

I did a fresh git pull + docker build, and now get this when doing any p2r operation (like --help):

/usr/local/lib/python3.7/site-packages/weasyprint/document.py:36: UserWarning: There are known rendering problems and missing features with cairo < 1.15.4. WeasyPrint may work with older versions, but please read the note about the needed cairo
version on the "Install" page of the documentation before reporting bugs. http://weasyprint.readthedocs.io/en/latest/install.html

My intended operation, p2r -cv <some arxiv link>, threw this warning but seemed to work correctly, as far as I can tell.

I opened a shell into a container to check Cairo:

user@7bc9c96e94aa:~$ dpkg -l | grep cairo                       
ii  libcairo-gobject2:amd64            1.14.8-1                 amd64        Cairo 2D vector graphics library (GObject library)
ii  libcairo-script-interpreter2:amd64 1.14.8-1                 amd64        Cairo 2D vector graphics library (script interpreter)
ii  libcairo2:amd64                    1.14.8-1                 amd64        Cairo 2D vector graphics library
ii  libcairo2-dev                      1.14.8-1                 amd64        Development files for the Cairo 2D graphics library
ii  libpangocairo-1.0-0:amd64          1.40.5-1                 amd64        Layout and rendering of internationalized text
ii  libpixman-1-0:amd64                0.34.0-1                 amd64        pixel-manipulation library for X and cairo
ii  libpixman-1-dev                    0.34.0-1                 amd64        pixel-manipulation library for X and cairo (development files)

so Cairo is indeed on an older version.

Per the WeasyPrint install docs, it looks like adding to the list of apt-installed packages might be all that's needed here.
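
The comparison behind the warning can be sketched as follows: parse the leading `major.minor.patch` part of a package version string and compare it against the 1.15.4 minimum quoted in the warning. The helper names are hypothetical, and the second version string in the usage lines is an arbitrary example rather than output from any real system.

```python
import re

# Minimum cairo version named in the WeasyPrint warning above.
MIN_CAIRO = (1, 15, 4)


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '1.14.8-1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def cairo_is_recent_enough(version_text, minimum=MIN_CAIRO):
    """Compare version tuples element by element."""
    return parse_version(version_text) >= minimum


print(cairo_is_recent_enough("1.14.8-1"))  # version from the dpkg listing
print(cairo_is_recent_enough("1.16.0-4"))  # arbitrary newer example
```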

Hanging on removing timestamp

I've been waiting for an hour for this to complete!

p2r -v -p /papers/arxiv/24-05-2021/ https://arxiv.org/abs/2105.10474
2021-05-24 10:59:17 - INFO - Starting Arxiv provider
2021-05-24 10:59:17 - INFO - Generating output filename
2021-05-24 10:59:17 - INFO - Getting paper info
2021-05-24 10:59:18 - INFO - Downloaded url: https://arxiv.org/abs/2105.10474
2021-05-24 10:59:18 - INFO - Created filename: Zhang_et_al_-_Trinity_I_Self-Consistently_Modeling_the_Dark_Matter_Halo-Galaxy-Supermassive_Black_Hole_Connection_From_Z_0-10_2021.pdf
2021-05-24 10:59:18 - INFO - Downloading file at url: https://arxiv.org/pdf/2105.10474.pdf
2021-05-24 10:59:31 - INFO - Downloaded url: https://arxiv.org/pdf/2105.10474.pdf
2021-05-24 10:59:31 - INFO - Removing arXiv timestamp ...

content is left-shifted

Is it possible to have the output content centered on the page instead of left-shifted? Having everything against the left border looks unnatural, and means that in a right-handed configuration the toolbars cover a lot of content when expanded.

(p.s. thanks for the great tool!)

ERROR: pdftk failed to compress the PDF file.

OS: Arch

$ pdftk --version

pdftk port to java 3.1.0 a Handy Tool for Manipulating PDF Documents
Copyright (c) 2017-2018 Marc Vinyals - https://gitlab.com/pdftk-java/pdftk
Copyright (c) 2003-2013 Steward and Lee, LLC.
pdftk includes a modified version of the iText library.
Copyright (c) 1999-2009 Bruno Lowagie, Paulo Soares, et al.
This is free software; see the source code for copying conditions. There is
NO warranty, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Installed via pip install --user paper2remarkable

$ p2r https://arxiv.org/abs/2002.11523
Error: Unexpected Exception in open_reader()
java.lang.ArrayIndexOutOfBoundsException: -1
        at java.util.ArrayList.elementData(ArrayList.java:422)
        at java.util.ArrayList.get(ArrayList.java:435)
        at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.iteratePages(PdfReader.java:3425)
        at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.readPages(PdfReader.java:3256)
        at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.<init>(PdfReader.java:3226)
        at pdftk.com.lowagie.text.pdf.PdfReader$PageRefs.<init>(PdfReader.java:3204)
        at pdftk.com.lowagie.text.pdf.PdfReader.readPages(PdfReader.java:925)
        at pdftk.com.lowagie.text.pdf.PdfReader.readPdf(PdfReader.java:523)
        at pdftk.com.lowagie.text.pdf.PdfReader.<init>(PdfReader.java:172)
        at pdftk.com.lowagie.text.pdf.PdfReader.<init>(PdfReader.java:161)
        at com.gitlab.pdftk_java.TK_Session.add_reader(TK_Session.java:127)
        at com.gitlab.pdftk_java.TK_Session.open_input_pdf_readers(TK_Session.java:243)
        at com.gitlab.pdftk_java.TK_Session.<init>(TK_Session.java:1218)
        at com.gitlab.pdftk_java.pdftk.main_noexit(pdftk.java:150)
        at com.gitlab.pdftk_java.pdftk.main(pdftk.java:128)
Error: Failed to open PDF file: 
   paper_removed.pdf
Errors encountered.  No output created.
Done.  Input errors, so no output created.
ERROR: pdftk failed to compress the PDF file.

If you think this might be a bug, please raise an issue on GitHub at:
https://github.com/GjjvdBurg/paper2remarkable

not taking the remarkable 8 char key

I'm trying to follow the instructions for the Docker build. However, whenever I enter the 8-character code, the website says "Success!" while the command line keeps asking for a new code. Can you please help me troubleshoot?

When I try to run the container, I get the error ERROR: 2020/11/06 13:29:24 auth.go:64: Code has the wrong length, it should be 8

New source recommendations

I'd just quickly like to suggest two new sources that appear in theoretical computer science/cryptography, namely:

Cannot transfer a local file with spaces in its name

I cannot get it to transfer a local file with spaces in its name.

Entering something like

docker run --rm -v "${HOME}/.rmapi:/home/user/.rmapi:rw" -v "$(pwd):/home/user" p2r -v 'example example.pdf'

throws the following error:

ERROR: Couldn't figure out what source you mean. If it's a local file, please make sure it exists.

I already tried both 'example example.pdf' as well as "example example.pdf".

For files without spaces it works completely fine.

I am using p2r with Docker on Windows 10.
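
This does not diagnose the report above, but it illustrates one common way a file name with spaces gets lost: somewhere along the chain the command is treated as a single string and split on whitespace. The sketch below uses only the standard `shlex` module; the command strings are illustrative.

```python
import shlex

filename = "example example.pdf"

# Naive string concatenation: the name is split into two arguments.
naive = shlex.split("p2r -v " + filename)
print(naive)

# Passing an argument list keeps the name as one argument.
safe = ["p2r", "-v", filename]

# If a single command string is unavoidable, quote the name first.
quoted = shlex.split("p2r -v " + shlex.quote(filename))
print(quoted)
```

Whether the splitting happens in the shell, in Docker's argument handling, or in the program itself would need to be checked separately.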

Optionally add margin

--no-crop often is useful when users want to use the white space of the paper for annotations
--blank is useful for a lot of additional remarks

With the recent updates in 2.6 and the ability to quickly zoom, maybe a middle ground would make a good addition. I would propose a two-step procedure:

  1. Crop off the margin as in the current default.
  2. optionally re-add a margin to the cropped page to either left or right side. This could for example be done as a percentage or as an absolute value.
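
The arithmetic for step 2 can be sketched as below, assuming the cropped page is described by a bounding box `(x0, y0, x1, y1)` in PDF points. The function `add_margin` and its parameters are hypothetical and only illustrate the proposal; they are not an existing p2r option.

```python
def add_margin(bbox, side, amount, relative=True):
    """Widen a cropped bounding box on one side.

    bbox     -- (x0, y0, x1, y1) of the cropped content
    side     -- "left" or "right"
    amount   -- fraction of the cropped width if relative, else points
    """
    x0, y0, x1, y1 = bbox
    margin = amount * (x1 - x0) if relative else amount
    if side == "left":
        return (x0 - margin, y0, x1, y1)
    if side == "right":
        return (x0, y0, x1 + margin, y1)
    raise ValueError("side must be 'left' or 'right'")


cropped = (100, 50, 500, 700)
print(add_margin(cropped, "right", 0.25))                # 25% of the width
print(add_margin(cropped, "left", 30, relative=False))   # 30 points
```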

Title contains "/" leads to FileNotFoundError

If the title of the paper contains a "/", there is a FileNotFoundError. Try e.g. with http://arxiv.org/abs/1909.02568

2019-10-07 16:33:13 - INFO - Starting Arxiv
2019-10-07 16:33:13 - INFO - Getting paper info
2019-10-07 16:33:14 - INFO - Downloading url: http://arxiv.org/abs/1909.02568
2019-10-07 16:33:14 - INFO - Generating output filename
2019-10-07 16:33:14 - INFO - Created filename: Budnik_et_al_-_Searching_for_a_Solar_Relaxion/Scalar_With_XENON1T_and_LUX_2019.pdf
2019-10-07 16:33:14 - INFO - Downloading file at url: http://arxiv.org/pdf/1909.02568.pdf
2019-10-07 16:33:17 - INFO - Downloading url: http://arxiv.org/pdf/1909.02568.pdf
2019-10-07 16:33:17 - INFO - Removing arXiv timestamp
2019-10-07 16:33:17 - INFO - Cropping pdf file
2019-10-07 16:33:23 - INFO - Shrinking pdf file
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/shutil.py", line 566, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: 'paper_dearxiv-crop-shrink.pdf' -> 'Budnik_et_al_-_Searching_for_a_Solar_Relaxion/Scalar_With_XENON1T_and_LUX_2019.pdf'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "arxiv2remarkable.py", line 846, in <module>
    main()
  File "arxiv2remarkable.py", line 842, in main
    prov.run(args.input, filename=args.filename)
  File "arxiv2remarkable.py", line 363, in run
    shutil.move(intermediate_fname, clean_filename)
  File "/usr/local/lib/python3.7/shutil.py", line 580, in move
    copy_function(src, real_dst)
  File "/usr/local/lib/python3.7/shutil.py", line 266, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/local/lib/python3.7/shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: 'Budnik_et_al_-_Searching_for_a_Solar_Relaxion/Scalar_With_XENON1T_and_LUX_2019.pdf'
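
The traceback shows the generated file name being interpreted as a path, because the `/` from the title is treated as a directory separator. One possible fix, sketched under the assumption that replacing the separator with an underscore is acceptable, is to sanitize the name before moving the file. `clean_filename` is a hypothetical helper, not the project's code.

```python
def clean_filename(name, replacement="_"):
    """Replace path separators so the name cannot be read as a directory."""
    return name.replace("/", replacement)


title = "Budnik_et_al_-_Searching_for_a_Solar_Relaxion/Scalar_With_XENON1T_and_LUX_2019.pdf"
print(clean_filename(title))
```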

Feature to split two column pdfs into one column per page

Since many academic papers are in two column format, splitting them into one column per page might be useful for readability and notetaking on the 10.3" display. The two columns are usually symmetrical. So, this could be done by cropping the two vertical halves of a pdf page.

Example (random arXiv paper):

Input:

Screen Shot 2020-10-05 at 8 38 56 AM

Split areas:

Screen Shot 2020-10-05 at 8 40 35 AM

Output:

Screen Shot 2020-10-05 at 9 16 32 AM

Screen Shot 2020-10-05 at 9 16 44 AM

PDF: [split_2col.pdf](https://github.com/GjjvdBurg/paper2remarkable/files/5325109/split_2col.pdf)

Nice to have

A command line flag to exclude the first page (with title and abstract) from splitting.
Screen Shot 2020-10-05 at 8 39 16 AM
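
The geometry of the proposal can be sketched as follows, assuming symmetric columns and page boxes given as `(x0, y0, x1, y1)` in PDF points. The names `split_columns` and `plan_split`, and the `skip_first` parameter standing in for the suggested flag, are hypothetical; the sketch only computes crop boxes and does not touch any PDF.

```python
def split_columns(page_box):
    """Split one page box into left and right halves at the midline."""
    x0, y0, x1, y1 = page_box
    mid = (x0 + x1) / 2
    return [(x0, y0, mid, y1), (mid, y0, x1, y1)]


def plan_split(page_boxes, skip_first=False):
    """Return the output crop boxes, one per resulting page."""
    output = []
    for index, box in enumerate(page_boxes):
        if skip_first and index == 0:
            output.append(box)  # keep the title/abstract page whole
        else:
            output.extend(split_columns(box))
    return output


letter = (0, 0, 612, 792)
pages = plan_split([letter, letter], skip_first=True)
print(len(pages))
```

A real implementation would also need to handle full-width figures and asymmetric layouts, which a fixed midline split cannot detect.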

Some images do not get picked up

I have started sending newspaper articles to my Remarkable a lot more, passing it through p2r, since the output is much nicer and more consistent than a print-to-pdf.

However, some websites have trouble with images. I was debugging it a bit this morning, but had to step away before finding a solution.

Here's an example of an article that exhibits the problem (if you get pay-walled, you can see it below using the <figure> syntax).

image

After digging a bit in the code, it seems that this is a problem in html2text, where it doesn't seem to pick up these <figure> tags, since if you pass --debug, the .html file doesn't include the images either.

There are probably a few solutions. The hackiest, but quickest, would be to regex these out and insert them into the string as 'simple' images that html2text would pick up. The more elegant, but also more time-consuming, would be to have html2text support <figure> properly.
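
The regex approach might look roughly like the sketch below: each `<figure>` element is replaced by the plain `<img>` tags found inside it. This is an illustration only; it drops captions, it has not been checked against html2text, and regular expressions are fragile on real-world HTML. The function name `unwrap_figures` is hypothetical.

```python
import re

FIGURE_RE = re.compile(r"<figure\b[^>]*>(.*?)</figure>", re.DOTALL | re.IGNORECASE)
IMG_RE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def unwrap_figures(html):
    """Replace each <figure> block with the <img> tags it contains."""

    def replace(match):
        images = IMG_RE.findall(match.group(1))
        # Leave the figure untouched if it holds no <img> at all.
        return "\n".join(images) if images else match.group(0)

    return FIGURE_RE.sub(replace, html)


sample = (
    '<p>a</p><figure class="x"><picture><img src="a.png" alt="A"></picture>'
    "<figcaption>c</figcaption></figure>"
)
print(unwrap_figures(sample))
```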

Math symbols are not converted

Using the standard settings of p2r, math symbols are not displayed correctly, e.g.

p2r -n http://alexkritchevsky.com/2020/10/15/ea-operations.html

leads to

Screenshot from 2021-05-23 12-16-07

What am I missing?

Suggestion: add example conversion to Readme.

Hi Gertjan, love this tool. thank you for making it. Just as the title suggests, could you add a pdf conversion example image with default options enabled to the repository Readme?

Custom font and line-spacing

When fetching a URL, is it somehow possible to specify the font (or font family) to use and specify the line-spacing? When possible I prefer reading articles in a serif font and with a larger line spacing.

"Could not build wheels for pikepdf"

I was trying to install p2r but got this error. I'm not very familiar with the Terminal. I'm running macOS Monterey 12.1.

 ERROR: Command errored out with exit status 1:
   command: /opt/homebrew/opt/python@3.9/bin/python3.9 /opt/homebrew/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/tmp5ajtbs2y
       cwd: /private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-install-m8xew6sq/pikepdf_1d18861d04524fd49d0f739383a8cdf6
  Complete output (86 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.macosx-12-arm64-3.9
  creating build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/_methods.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/_version.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/__init__.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/_cpphelpers.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/jbig2.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/_xml.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/objects.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/codec.py -> build/lib.macosx-12-arm64-3.9/pikepdf
  creating build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/matrix.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/_transcoding.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/encryption.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/metadata.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/_content_stream.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/__init__.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/outlines.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  copying src/pikepdf/models/image.py -> build/lib.macosx-12-arm64-3.9/pikepdf/models
  running egg_info
  listing git files failed - pretending there aren't any
  writing manifest file 'src/pikepdf.egg-info/SOURCES.txt'
  copying src/pikepdf/_qpdf.pyi -> build/lib.macosx-12-arm64-3.9/pikepdf
  copying src/pikepdf/py.typed -> build/lib.macosx-12-arm64-3.9/pikepdf
  running build_ext
  creating build/temp.macosx-12-arm64-3.9
  creating build/temp.macosx-12-arm64-3.9/src
  creating build/temp.macosx-12-arm64-3.9/src/qpdf
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/embeddedfiles.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/embeddedfiles.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/annotation.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/annotation.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/nametree.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/nametree.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/object.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/object.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/object_convert.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/object_convert.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/page.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/page.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/object_repr.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/object_repr.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/parsers.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/parsers.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  src/qpdf/annotation.cppsrc/qpdf/nametree.cpp::99::1010::  src/qpdf/embeddedfiles.cpp:9:10: fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  src/qpdf/object.cpp:12:10: fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  src/qpdf/object_convert.cpp:17:10: fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  src/qpdf/object_repr.cpp:24:10: fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  In file included from src/qpdf/page.cpp:14:
  src/qpdf/pikepdf.h:15:10: fatal error: 'qpdf/PointerHolder.hh' file not found
  #include <qpdf/PointerHolder.hh>
           ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from src/qpdf/parsers.cpp:12:
  src/qpdf/pikepdf.h:15:10: fatal error: 'qpdf/PointerHolder.hh' file not found
  #include <qpdf/PointerHolder.hh>
           ^~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.
  1 error generated.
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/pikepdf.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/pikepdf.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -I/private/var/folders/lp/xc0xj9hx5vd88zclq8jyw6_m0000gn/T/pip-build-env-gv_akt2f/overlay/lib/python3.9/site-packages/pybind11/include -I/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c src/qpdf/pipeline.cpp -o build/temp.macosx-12-arm64-3.9/src/qpdf/pipeline.o -fvisibility=hidden -g0 -stdlib=libc++ -std=c++17 -mmacosx-version-min=10.14
  1 error generated.
  error: command '/usr/bin/clang' failed with exit code 1
  1 error generated.
  src/qpdf/pipeline.cpp:9:10: fatal error: 'qpdf/Constants.h' file not found
  #include <qpdf/Constants.h>
           ^~~~~~~~~~~~~~~~~~
  1 error generated.
  1 error generated.
  1 error generated.
  1 error generated.
  In file included from src/qpdf/pikepdf.cpp:18:
  src/qpdf/pikepdf.h:15:10: fatal error: 'qpdf/PointerHolder.hh' file not found
  #include <qpdf/PointerHolder.hh>
           ^~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.
  1 error generated.
  ----------------------------------------
  ERROR: Failed building wheel for pikepdf
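
Every compiler error in the log is a missing qpdf header (`qpdf/Constants.h`, `qpdf/PointerHolder.hh`), which suggests the qpdf development files are not on the include path. A small sketch for checking a set of candidate include directories is shown below. The helper `find_header` is hypothetical, and the directories to search on a given machine are an assumption the reader must supply; the temporary directory here only makes the example self-contained.

```python
import os
import tempfile


def find_header(relative_path, include_dirs):
    """Return the first existing path for a header, or None if not found."""
    for directory in include_dirs:
        candidate = os.path.join(directory, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


# Self-contained demonstration with a fake include directory.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "qpdf"))
    open(os.path.join(tmp, "qpdf", "Constants.h"), "w").close()
    found = find_header("qpdf/Constants.h", [tmp])
    not_found = find_header("qpdf/PointerHolder.hh", [tmp])

print(found is not None, not_found)
```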

Any interest in making the provider code a standalone library?

Hi Gertjan,

Big fan of the tool you've built here!

I wonder if you'd have any interest in pulling out the provider code and making it a standalone library, because I definitely see this as something that would be useful over and above sending a PDF to the reMarkable. To be precise, I mean pulling out choose_provider and the parsers in the providers folder into a standalone repo (it could even be made into a Python package!) that would take in a str: url and return a PDF file. I'm happy to do it if you'd be keen.

Thanks once again,
ZH
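
To make the idea concrete, a standalone provider-selection step could be sketched as a list of URL patterns checked in order. Everything here is hypothetical: the names `PROVIDER_PATTERNS` and `choose_provider_name`, the patterns themselves, and the fallback to `"html"` are assumptions chosen to echo the provider module names, not the actual `choose_provider` implementation.

```python
import re

# Hypothetical registry: (provider name, URL pattern), checked in order.
PROVIDER_PATTERNS = [
    ("arxiv", re.compile(r"^https?://arxiv\.org/(abs|pdf)/")),
    ("pdf_url", re.compile(r"^https?://.*\.pdf$")),
]


def choose_provider_name(url):
    """Return the name of the first provider whose pattern matches."""
    for name, pattern in PROVIDER_PATTERNS:
        if pattern.search(url):
            return name
    return "html"  # assumed fallback for generic web articles


print(choose_provider_name("https://arxiv.org/abs/1811.11242"))
```

A library built this way would pair each name with a callable that downloads and returns the PDF, which is the `str` in, PDF out interface described above.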

If the arxiv link looks like "http://arxiv.org/abs/arXiv:1908.03213", link not recognised

Some arXiv links, e.g. on INSPIRE (see e.g. link), are of this form:

http://arxiv.org/abs/arXiv:1908.03213

which leads to this:

ERROR: Filename must be provided with PDFUrlProvider (use --filename)
Error occurred. Exiting.

2019-10-07 16:35:24 - INFO - Starting PdfUrl
If you think this might be a bug, please raise an issue on GitHub: https://github.com/GjjvdBurg/arxiv2remarkable
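
One way to accept this URL form, sketched here as an assumption rather than the project's actual fix, is to strip the redundant `arXiv:` prefix from the identifier before the URL is matched against the arXiv provider. `normalize_arxiv_url` is a hypothetical helper.

```python
import re


def normalize_arxiv_url(url):
    """Drop a redundant 'arXiv:' prefix from the identifier in the URL."""
    return re.sub(
        r"(arxiv\.org/(?:abs|pdf)/)arxiv:",
        r"\1",
        url,
        flags=re.IGNORECASE,
    )


print(normalize_arxiv_url("http://arxiv.org/abs/arXiv:1908.03213"))
```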
