
Specifying output directory

rickcecil commented on May 27, 2024

Specifying output directory


Comments (2)

danburzo commented on May 27, 2024

Hi, thanks for logging the issue.

I could use absolute paths so that my scripts are less fragile.

The -o / --output path can be absolute, but admittedly the help text makes it sound that only relative paths are allowed. It was meant along the lines of when relative, it's relative to the current working directory, which incidentally is how paths normally work 😅 so maybe just drop the whole 'relative' part. Please note that the directory does need to exist, as percollate does not mkdir -p itself to the destination.
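Because the destination has to exist beforehand, a wrapper script can create it before calling percollate. A minimal sketch; `ensure_out_dir` is a hypothetical helper name, not part of percollate, and the paths in the usage comment are placeholders:

```shell
#!/usr/bin/env bash

# Hypothetical helper: create the destination (including parents) if needed,
# then print the path so it can be captured by the caller.
ensure_out_dir() {
  local dir="$1"
  mkdir -p -- "$dir"
  printf '%s\n' "$dir"
}

# Usage (placeholder paths):
#   out_dir="$(ensure_out_dir /path/for/file)"
#   percollate pdf http://example.com/article.html -o "$out_dir/file.pdf"
```

`mkdir -p` succeeds when the directory is already there, so the helper is safe to call on every run.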

If I only specified a directory, it would use the title of the page as the title of the doc in the same manner that it does now if you do not use the -o option.

Using the --individual flag effectively turns the value of -o into a prefix, to which the web page titles are appended. So ending with a trailing slash (-o my/destination/) will create files inside destination. However, it can benefit from some cosmetic tweaks (it makes filenames start with a hyphen currently).

With a bash script, I could wget the web page and then use pup to put the title into a variable that could be used in percollate as the filename. (There's more to it than that as I would also want to clean the title to make sure there are no illegal characters and make sure it did not exceed the character count limit.)

The titles are currently transformed with slugify, but they might benefit from stricter rules, e.g. filenamify + truncation. Do you have a hard limit on the filename length, or just a preference?


rickcecil commented on May 27, 2024

Appreciate the fast response!

Just a heads up, I am on Ubuntu 20.04 and am using version 4.0 of Percollate.

The -o / --output path can be absolute, but admittedly the help text makes it sound that only relative paths are allowed. It was meant along the lines of when relative, it's relative to the current working directory,

Huh. I tried it a few times, but it kept throwing an error. In fact, just tried it again and it is still throwing errors. Here's what I'm doing:

percollate pdf http://example.com/article.html -o /path/for/file/

I get this error:

[Error: EISDIR: illegal operation on a directory, open '/path/for/file/'] {
  errno: -21,
  code: 'EISDIR',
  syscall: 'open',
  path: '/path/for/file/'
}

At first, I thought it was a permissions error, because that's the first place you check, but the permissions are correct on my directory. And, when I do this, it works:

percollate pdf http://example.com/article.html -o /path/for/file/file.pdf

Then I saw the note about relative paths and figured that was the cause.

Using the --individual flag effectively turns the value of -o into a prefix,

percollate pdf http://example.com/article.html -o /path/for/file/ --individual

Now, this command works as expected, though, as you say, it adds a hyphen in front of the filename.

Something to note: this does not work:

percollate pdf http://example.com/article.html -o /path/for/file --individual

Notice the missing trailing slash at the end of "/path/for/file". It tries to create this and, at least in my attempts, fails:

/path/for/file-example.com/article.html

Given your description, I see why it works that way, but I did want to point out something that people might miss.
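One way to avoid tripping over this in a script is to normalize the prefix before handing it to -o. A small sketch; `with_trailing_slash` is a hypothetical helper name of my own, not something percollate provides:

```shell
#!/usr/bin/env bash

# Hypothetical helper: append a trailing slash unless one is already there,
# so that --individual treats the value as a directory prefix.
with_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;
    *)  printf '%s/\n' "$1" ;;
  esac
}

# Usage (placeholder path):
#   percollate pdf http://example.com/article.html \
#     -o "$(with_trailing_slash /path/for/file)" --individual
```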

The titles are currently transformed with slugify

Sorry, my initial comment was just my thought experiment on how I would accomplish this without percollate. I've been bit before by the creation of a filename that was too long and it was a serious PITA to figure out how to delete that file. So now I am just extra cautious about length and illegal character output. Sounds like you've got that handled in percollate, though.
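For reference, the title cleanup from that thought experiment could look roughly like the sketch below. This is not how percollate or slugify does it; `safe_filename` is a hypothetical helper, the allowed character set is a deliberately conservative choice, and the default limit of 100 characters is arbitrary:

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn a page title into a conservative filename stem.
#   $1 = title
#   $2 = maximum length in characters (default 100, an arbitrary choice)
safe_filename() {
  local title="$1" max="${2:-100}" name
  # Replace every run of characters outside [A-Za-z0-9._-] with one hyphen.
  name="$(printf '%s' "$title" | LC_ALL=C tr -cs 'A-Za-z0-9._-' '-')"
  # Drop a leading hyphen, truncate, then drop a trailing hyphen.
  name="${name#-}"
  name="${name:0:$max}"
  name="${name%-}"
  # Fall back to a fixed name if nothing usable is left.
  printf '%s\n' "${name:-untitled}"
}

# Usage:
#   safe_filename 'Hello, World: a/b?'    # prints Hello-World-a-b
#   safe_filename 'abcdefghij' 4          # prints abcd
```

Since the result is plain ASCII, the character limit is also a byte limit, which is what filesystem name limits are measured in.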

Lastly, I thought you might get a kick out of what I am trying to do... Basically, a few web pages are not allowing percollate to access the entire html page, but I've found that if I have singlefile grab the site first and send the result to stdout, then percollate can pull the html from stdout and create a new PDF or epub of the entire document. :D Something like this:

singlefile https://medium.com/article-file-name.html | percollate epub - --url=https://medium.com/ -o /path/to/save/ --individual

Anyway, thanks for the quick response. It seems like the best way to get what I want with what is already there would be to use the --individual option and then, maybe at the end, rename the files to remove the initial dash. Since this is happening in a bash script, that should be pretty easy to do.
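The rename step could be sketched like this, assuming the generated files land directly in the output directory and start with exactly one hyphen. `strip_leading_hyphen` is a hypothetical helper name:

```shell
#!/usr/bin/env bash

# Hypothetical helper: strip one leading hyphen from every file name
# directly inside the given directory.
strip_leading_hyphen() {
  local dir="$1" path base
  for path in "$dir"/-*; do
    [ -e "$path" ] || continue          # the glob matched nothing
    base="${path##*/}"
    [ -n "${base#-}" ] || continue      # skip a file named just "-"
    # -n: do not overwrite an existing file with the target name.
    mv -n -- "$path" "$dir/${base#-}"
  done
}

# Usage (placeholder path):
#   strip_leading_hyphen /path/to/save
```

Prefixing the glob with the directory keeps the hyphenated names from being parsed as options, and `--` guards the `mv` call as well.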
