
Specifying output directory (percollate, open issue)

rickcecil commented on May 27, 2024
Specifying output directory

from percollate.

Comments (2)

danburzo commented on May 27, 2024

Hi, thanks for logging the issue.

  1. I could use absolute paths so that my scripts are less fragile.

The -o / --output path can be absolute, but admittedly the help text makes it sound that only relative paths are allowed. It was meant along the lines of when relative, it's relative to the current working directory, which incidentally is how paths normally work 😅 so maybe I should just drop the whole 'relative' part. Please note that the directory does need to exist: percollate does not create the destination directory itself (there is no built-in mkdir -p).
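Because the destination directory has to exist beforehand, a wrapper script can create it before calling percollate. The following is a minimal sketch in POSIX shell; `ensure_output_dir` is a hypothetical helper name, not part of percollate, and the path and URL in the usage comment are placeholders.

```shell
# ensure_output_dir is a hypothetical helper, not a percollate feature.
# It creates the directory (and any missing parents) and succeeds only
# if the result is writable by the current user.
ensure_output_dir() {
  mkdir -p -- "$1" && [ -w "$1" ]
}

# Usage (placeholder path and URL):
#   out_dir="$HOME/articles"
#   ensure_output_dir "$out_dir" || exit 1
#   percollate pdf http://example.com/article.html -o "$out_dir/article.pdf"
```

Checking writability up front keeps a permissions problem from being confused with a missing directory later on.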

  1. If I only specified a directory, it would use the title of the page as the title of the doc in the same manner that it does now if you do not use the -o option.

Using the --individual flag effectively turns the value of -o into a prefix, to which the web page titles are appended. So ending with a trailing slash (-o my/destination/) will create files inside my/destination/. However, it could benefit from some cosmetic tweaks (filenames currently start with a hyphen).

With a bash script, I could wget the web page and then use pup to put the title into a variable that could be used in percollate as the filename. (There's more to it than that as I would also want to clean the title to make sure there are no illegal characters and make sure it did not exceed the character count limit.)

The titles are currently transformed with slugify, but they might benefit from stricter rules, e.g. filenamify + truncation. Do you have a hard limit on the filename length, or just a preference?

rickcecil avatar rickcecil commented on May 27, 2024

Appreciate the fast response!

Just a heads up, I am on Ubuntu 20.04 and am using version 4.0 of Percollate.

The -o / --output path can be absolute, but admittedly the help text makes it sound that only relative paths are allowed. It was meant along the lines of when relative, it's relative to the current working directory,

Huh. I tried it a few times, but it kept throwing an error. In fact, just tried it again and it is still throwing errors. Here's what I'm doing:

percollate pdf http://example.com/article.html -o /path/for/file/

I get this error:

[Error: EISDIR: illegal operation on a directory, open '/path/for/file/'] {
errno: -21,
code: 'EISDIR',
syscall: 'open',
path: '/path/for/file/'
}

At first, I thought it was a permissions error, because that's the first place you check, but the permissions on my directory are correct. And when I do this, it works:

percollate pdf http://example.com/article.html -o /path/for/file/file.pdf

Then I saw the note about relative paths and figured that was the cause.

Using the --individual flag effectively turns the value of -o into a prefix,

percollate pdf http://example.com/article.html -o /path/for/file/ --individual

Now, this command works as expected, though, as you say, it adds a hyphen in front of the filename.

Something to note: this does not work:

percollate pdf http://example.com/article.html -o /path/for/file --individual

Notice the missing trailing slash at the end of "/path/for/file". It tries to create this and, at least in my attempts, fails:

/path/for/file-example.com/article.html

Given your description, I see why it works that way, but I did want to point out something that people might miss.
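Since -o acts as a plain prefix when --individual is used, a script can normalize the directory argument so the trailing slash is never forgotten. A small sketch, assuming POSIX shell; `with_trailing_slash` is a hypothetical helper name, not something percollate provides.

```shell
# with_trailing_slash is a hypothetical helper: it prints its argument
# unchanged if it already ends in "/", and appends "/" otherwise.
with_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

# Usage (placeholder path and URL):
#   prefix="$(with_trailing_slash /path/for/file)"
#   percollate pdf http://example.com/article.html -o "$prefix" --individual
```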

The titles are currently transformed with slugify

Sorry, my initial comment was just my thought experiment on how I would accomplish this without percollate. I've been bit before by the creation of a filename that was too long and it was a serious PITA to figure out how to delete that file. So now I am just extra cautious about length and illegal character output. Sounds like you've got that handled in percollate, though.
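For what it's worth, the kind of cleanup I had in mind for the do-it-yourself route looks roughly like the sketch below. It is only an illustration in POSIX shell: `safe_filename` is a hypothetical helper, the default limit of 100 characters is an arbitrary assumption, and it does not reproduce what slugify or filenamify actually do.

```shell
# safe_filename is a hypothetical helper, unrelated to percollate's slugify.
# Every run of characters outside A-Z a-z 0-9 . _ - becomes a single hyphen,
# leading hyphens are dropped, the result is cut to a maximum length
# (second argument, default 100), and trailing hyphens are dropped last.
safe_filename() {
  max="${2:-100}"
  printf '%s' "$1" \
    | LC_ALL=C tr -cs 'A-Za-z0-9._-' '-' \
    | sed 's/^-*//' \
    | cut -c "1-$max" \
    | sed 's/-*$//'
}

# Usage:
#   name="$(safe_filename "$title" 80)"
```

Trimming trailing hyphens after the cut avoids names that end in a dangling "-" when the limit falls right after a replaced character.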

Lastly, I thought you might get a kick out of what I am trying to do... Basically, a few web pages are not allowing percollate to access the entire html page, but I've found that if I have singlefile grab the site first and send the result to stdout, then percollate can pull the html from stdout and create a new PDF or epub of the entire document. :D Something like this:

singlefile https://medium.com/article-file-name.html | percollate epub - --url=https://medium.com/ -o /path/to/save/ --individual

Anyway, thanks for the quick response. It seems like the best way to get what I want with what is already there would be to use the --individual option and then, maybe at the end, rename the files to remove the initial dash. Since this is happening in a bash script, that should be pretty easy to do.
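The rename step could look something like this. It is a rough sketch in POSIX shell; `strip_leading_dash` is a hypothetical helper name, and it assumes the only unwanted character is a single hyphen at the start of each filename in the output directory.

```shell
# strip_leading_dash is a hypothetical helper. For every entry in the given
# directory whose name starts with "-", it removes that one leading hyphen,
# skipping any file whose new name is empty or already taken.
strip_leading_dash() {
  dir="${1%/}"
  for f in "$dir"/-*; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name="${f##*/}"                    # filename without the directory
    new="${name#-}"                    # drop one leading hyphen
    [ -n "$new" ] || continue
    if [ -e "$dir/$new" ]; then
      echo "skipping $name: $new already exists" >&2
      continue
    fi
    mv -- "$f" "$dir/$new"
  done
}

# Usage (placeholder path):
#   strip_leading_dash /path/to/save/
```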
