
Comments (21)

yarub123 avatar yarub123 commented on September 27, 2024 2

@RyotaUshio Why are you such an awesome dude? Updating us even when it's not asked of you on little things here and there. I don't even use many plugins anymore, or take notes anymore. But I saw this notification about pdf.js. Interesting, nonetheless. 🙏 (was going to say Hajimemashite... but we haven't really met 🙃)

from obsidian-pdf-plus.

jcesguerram avatar jcesguerram commented on September 27, 2024 2

> @RyotaUshio Why are you such an awesome dude? Updating us even when it's not asked of you on little things here and there. I don't even use many plugins anymore, or take notes anymore. But I saw this notification about pdf.js. Interesting, nonetheless. 🙏 (was going to say Hajimemashite... but we haven't really met 🙃)

That's true, thank you very much


RyotaUshio avatar RyotaUshio commented on September 27, 2024 2

Hi guys, thank you so much for all the positive vibes!

Now that we can add highlights directly into PDF files (instructions: #64), I guess we can close this issue.
If you have any ideas for further improvement, please create a new issue. Thanks!


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

Thanks for the kind words!

I do plan to enable adding highlights directly into PDF files without copying links, but it's not as little an addition as it seems.
So it will take some time. I can say it's one of the highest priorities, though.


dominiwe avatar dominiwe commented on September 27, 2024 1

Some input on how this is handled in Logseq (not to say it is the best way to do it).

When an annotation/highlight in a PDF is created, an annotations page for that PDF is created. This is just a markdown file which is named hls__name_of_pdf.md. In Logseq this makes sense because PDF assets are always copied to the vault.

This file then contains the following frontmatter:

file:: [name_of_pdf.pdf](path_to_pdf.pdf)
file-path:: path_to_pdf.pdf

Then it contains basically just a list of annotations of the following format:

- Actual text passage that was selected in the pdf.
  ls-type:: annotation
  hl-page:: 1
  hl-color:: yellow
  id:: some-hash

Of course, the id is something we don't have in Obsidian. The link in some note is then just a link to this annotation block (the ((...)) syntax is basically a block portal in Logseq):

((some-hash))

Now, it could be done somewhat similarly in Obsidian using links to blocks in files.

Annotations page:

- [[test.pdf#page=1&selection=22,0,22,7&color=yellow|Actual text]] ^some-hash

Link in e.g. a note:

![[hls__test#^some-hash]]

It's not really ideal and there are definitely some complications (e.g. where to store the annotations page?). One benefit would be that it doesn't modify the actual PDF file and just uses normal (obsidian) markdown syntax.
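
To make the Obsidian variant above a bit more concrete, here is a minimal TypeScript sketch that builds the annotations-page line and the matching embed. The `HighlightRef` shape, the function names and the `hls__<pdf name>` page-naming rule are assumptions made up for illustration; nothing here is an API that Logseq, Obsidian or PDF++ provides.

```typescript
// Hypothetical shape for one highlight; field names are illustrative only.
interface HighlightRef {
    pdfName: string;                               // e.g. "test.pdf"
    page: number;                                  // 1-based page number
    selection: [number, number, number, number];   // values as they appear in the link
    color: string;                                 // e.g. "yellow"
    text: string;                                  // the selected passage
    blockId: string;                               // e.g. "some-hash"
}

// Builds one list item for the annotations page.
function annotationLine(h: HighlightRef): string {
    const subpath = `page=${h.page}&selection=${h.selection.join(",")}&color=${h.color}`;
    return `- [[${h.pdfName}#${subpath}|${h.text}]] ^${h.blockId}`;
}

// Builds the embed to paste into a note, assuming the annotations page
// is named "hls__" + the PDF name without its extension.
function embedLink(h: HighlightRef): string {
    const base = h.pdfName.replace(/\.pdf$/i, "");
    return `![[hls__${base}#^${h.blockId}]]`;
}

const exampleHighlight: HighlightRef = {
    pdfName: "test.pdf",
    page: 1,
    selection: [22, 0, 22, 7],
    color: "yellow",
    text: "Actual text",
    blockId: "some-hash",
};

console.log(annotationLine(exampleHighlight));
console.log(embedLink(exampleHighlight));
```

With the sample values, the two printed lines reproduce the annotations-page entry and the embed shown above.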

I thought I'd leave it here as one possible way of implementing this.


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

@dominiwe Thanks a lot for the info!
It's interesting that not modifying the original PDF can be a benefit.
I was actually thinking of writing the annotations into the PDF files themselves. It may be nice to stop here and take time to rethink.

So let me share my perspective on how to add highlights without copying text; there are two directions.

Approach 1: add highlights without modifying PDFs

How?

Create an annotation note for each PDF file, which contains all non-link highlights in that PDF.
This is basically the same as what the Annotator plugin does.
I was surprised to see Logseq also adopts this approach.

For example:

---
annotation-target: "[[file.pdf]]"
---

```json
{
    "page": 1,
    "selection": [0, 1, 0, 5],
    "color": "red"
}
```
^a6fd30

The JSON code block could instead be a YAML code block or something similar.
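
As a rough idea of how such an annotation note could be read back, here is a minimal TypeScript sketch that collects each JSON block together with the block ID on the line right after it. The `StoredHighlight` type and `parseAnnotationNote` are hypothetical names, and the layout (a json code block followed directly by a `^id` line) is only the one assumed in the example above, not a settled format.

```typescript
// Hypothetical record stored in each code block of the annotation note.
interface StoredHighlight {
    page: number;
    selection: number[];
    color: string;
}

// Three backticks, built this way so this example contains no literal fence.
const FENCE = "`".repeat(3);

// Returns a map from block ID (e.g. "a6fd30") to the highlight parsed from
// the JSON block directly above it. Throws if a block holds invalid JSON.
function parseAnnotationNote(note: string): Map<string, StoredHighlight> {
    const result = new Map<string, StoredHighlight>();
    const lines = note.split("\n");
    let i = 0;
    while (i < lines.length) {
        if (lines[i].trim() !== FENCE + "json") {
            i++;
            continue;
        }
        // Collect everything up to the closing fence.
        const body: string[] = [];
        i++;
        while (i < lines.length && lines[i].trim() !== FENCE) {
            body.push(lines[i]);
            i++;
        }
        i++; // skip the closing fence
        // The block ID is expected on the line right after the closing fence.
        const idMatch = i < lines.length ? /^\^([A-Za-z0-9-]+)\s*$/.exec(lines[i]) : null;
        if (idMatch) {
            result.set(idMatch[1], JSON.parse(body.join("\n")) as StoredHighlight);
        }
    }
    return result;
}

const sampleNote = [
    "---",
    'annotation-target: "[[file.pdf]]"',
    "---",
    "",
    FENCE + "json",
    "{",
    '    "page": 1,',
    '    "selection": [0, 1, 0, 5],',
    '    "color": "red"',
    "}",
    FENCE,
    "^a6fd30",
].join("\n");

const parsedNote = parseAnnotationNote(sampleNote);
console.log(parsedNote.get("a6fd30"));
```

Blocks without a trailing `^id` line are skipped here, since they could not be referenced from other notes anyway.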

And of course, here we can use a link like [[file.pdf#page=1&selection=0,1,0,5&color=red]] (and it would be the easiest way to go).
However, I'd prefer not to use an actual link like this, because it might clutter the backlinks pane and also make it hard to distinguish backlink highlights (what PDF++ currently does) from highlights not associated with any backlinks.
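
For reference, the link form above packs the page, selection and color into the subpath after `#`. A minimal TypeScript sketch of pulling those values back out could look like the following; `PdfLinkTarget` and `parsePdfLink` are hypothetical names and this is not how PDF++ itself parses links.

```typescript
// Hypothetical result type; only the three keys used in this thread are handled.
interface PdfLinkTarget {
    file: string;
    page?: number;
    selection?: number[];
    color?: string;
}

// Parses a wikilink such as [[file.pdf#page=1&selection=0,1,0,5&color=red]],
// with or without a "|display text" part. Returns null for anything else.
function parsePdfLink(link: string): PdfLinkTarget | null {
    const m = /^\[\[([^\]#|]+)#([^\]|]*)(?:\|[^\]]*)?\]\]$/.exec(link.trim());
    if (!m) return null;
    const target: PdfLinkTarget = { file: m[1] };
    for (const pair of m[2].split("&")) {
        const eq = pair.indexOf("=");
        if (eq < 0) continue; // ignore fragments without a key=value form
        const key = pair.slice(0, eq);
        const value = pair.slice(eq + 1);
        if (key === "page") target.page = Number(value);
        else if (key === "selection") target.selection = value.split(",").map(Number);
        else if (key === "color") target.color = value;
    }
    return target;
}

const parsedLink = parsePdfLink("[[file.pdf#page=1&selection=0,1,0,5&color=red]]");
console.log(parsedLink);
```

Comparing the parsed page and selection against stored highlights would be one way to tell backlink highlights apart from standalone ones.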

Pros

  • Light-weight & fast. No UI clutter
  • Super easy to implement, as it's all about DOM operations and we won't need to interact with actual binary PDF files
  • No risk of corrupting the original file. If a binary file gets corrupted, it will be very hard to restore it!

Cons

  • This approach degrades the portability of PDF files (which the "P" stands for!). If a highlight cannot be seen outside Obsidian, the PDF is essentially confined to Obsidian and cannot live outside of it.

  • It inevitably introduces plugin-dependent stuff into PDF++, which I have tried to avoid as much as possible so far.

    As I wrote in README, this plugin has a non-negligible risk of breaking when Obsidian updates its internals because it is based on monkey-patching of Obsidian's native PDF viewer. Moreover, there is no guarantee that I can continue maintaining this plugin for years, being an unpaid volunteer.

    Given these points, it might be dangerous to leave a number of code blocks in your notes that cannot be recognized when PDF++ is gone.

Approach 2: add highlights to actual PDF files

How?

It's not really determined yet, but for now, I'm thinking of something like this:

  1. In a PDF viewer, select a range of text to highlight
  2. Run a command or click a color palette item to add a highlight annotation to the original PDF file itself
  3. Then, a link such as [[file.pdf#page=1&annotation=73R]] is copied to the clipboard. Here, 73R is the ID of the annotation just created

In step 2, I'm thinking of using libraries such as pdfAnnotate and pdf-lib.
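
As a rough illustration of steps 2 and 3, here is a TypeScript sketch of the data a highlight annotation needs, independent of which library ends up writing it into the file. `Rect`, `HighlightDict`, `buildHighlightDict` and `annotationLink` are hypothetical names. Two things are assumptions rather than confirmed facts: the corner order used for `QuadPoints` (viewers are known to differ from the letter of the spec here), and that the viewer exposes annotation IDs as the object number followed by `R`, as in the 73R example. Serializing the dictionary into the PDF is library-specific and not shown.

```typescript
// A rectangle in PDF user space (origin at the bottom-left of the page).
interface Rect {
    left: number;
    bottom: number;
    right: number;
    top: number;
}

// Plain-object stand-in for a PDF highlight annotation dictionary.
interface HighlightDict {
    Type: "Annot";
    Subtype: "Highlight";
    Rect: [number, number, number, number]; // bounding box: llx, lly, urx, ury
    QuadPoints: number[];                   // 8 numbers per highlighted line
    C: [number, number, number];            // RGB, each component in 0..1
}

function buildHighlightDict(lineRects: Rect[], rgb: [number, number, number]): HighlightDict {
    if (lineRects.length === 0) throw new Error("at least one rectangle is required");
    const quadPoints: number[] = [];
    for (const r of lineRects) {
        // Assumed corner order: top-left, top-right, bottom-left, bottom-right.
        quadPoints.push(r.left, r.top, r.right, r.top, r.left, r.bottom, r.right, r.bottom);
    }
    const rect: [number, number, number, number] = [
        Math.min(...lineRects.map((r) => r.left)),
        Math.min(...lineRects.map((r) => r.bottom)),
        Math.max(...lineRects.map((r) => r.right)),
        Math.max(...lineRects.map((r) => r.top)),
    ];
    return { Type: "Annot", Subtype: "Highlight", Rect: rect, QuadPoints: quadPoints, C: rgb };
}

// Step 3: the link to copy, assuming the ID is "<object number>R".
function annotationLink(file: string, page: number, objectNumber: number): string {
    return `[[${file}#page=${page}&annotation=${objectNumber}R]]`;
}

// A selection spanning two lines of text on page 1.
const highlightDict = buildHighlightDict(
    [
        { left: 72, bottom: 700, right: 300, top: 712 },
        { left: 72, bottom: 686, right: 180, top: 698 },
    ],
    [1, 1, 0],
);
console.log(highlightDict);
console.log(annotationLink("file.pdf", 1, 73));
```

Keeping the geometry separate from the writing step would also make it easier to swap the underlying library later.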

FYI, here's a summary of what I know about PDF-related JS libraries. I'm nothing like an expert in this kind of stuff so I'd be very happy to hear additional information.

  • pdfAnnotate: able to add annotations to (or remove ones from) existing PDF files. Added annotations can be linked within Obsidian from the right-click menu, which is crucial!!
  • pdf-lib: also able to add annotations, but the resulting annotations seem to be impossible to reference in Obsidian, which is a deal breaker. But some other functionalities might be useful
    Update: It turned out that it might be possible to create referenceable annotations with pdf-lib if we interact with some low-level code.
  • Mozilla PDF.js: Although it was originally designed as a PDF viewer, it officially says that it supports adding annotations as well. But seemingly, there is no high-level API to do this so we will have to go through the source code and use some low-level code.
  • jsPDF: It only supports creating a new PDF file, and it cannot be used to modify an existing PDF. So we will not use it in PDF++.

Pros

  • Once an annotation is stored in the actual PDF file, the annotation can be viewed everywhere including outside of Obsidian. It keeps the portability of PDF files alive.
  • No worry that your annotations will be lost when PDF++ stops working!

Cons

  • As long as it involves actual file changes, there will be some risk.
  • When a PDF file is updated by adding a highlight annotation, the PDF viewer reloads the file, resulting in a little bit of UI clutter
  • pdfAnnotate has not been updated for 2 years, and pdf-lib for 3 years. I'm not very sure it will be safe to add such libraries as dependencies.
    Update: There is a well-maintained fork of pdf-lib! https://github.com/cantoo-scribe/pdf-lib
  • If you accidentally delete an annotation stored in a PDF file, all the links pointing to that annotation will break, because each annotation ID is unique
    • (I'm not sure whether we can specify annotation IDs.)


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

What I'm doing for now is:

  • Think of PDF++'s backlink highlight feature as something that enhances the connection between markdown notes and PDF files, not "real" highlight annotations
  • When I want to just add highlights without linking, I use external apps like macOS's Preview app, PDF Expert, Adobe Acrobat, ...
    • you can easily open a PDF file with an external app from "More options" menu (three dots) > "Open in default app"

What I've had in my mind regarding approach 2 above is to make it possible to go through this workflow entirely inside Obsidian.


yarub123 avatar yarub123 commented on September 27, 2024 1

@RyotaUshio Sorry I had nowhere else to put this (since the page on community forum is closed now 🙃), but you are an absolute fucking BOSS. Wow. Simply a beautiful work of art you have done here. I can tell you put a LOT of work into it, and part of me feels bad because we do not know how truly consuming this is. Thank you man. Love you.


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

@yarub123 Haha, you are super good at making a developer motivated! Thanks, I appreciate your kind words.


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

@dominiwe

> Regarding your last comment, I do use it like that in my workflow

FYI, PDF++ 0.24.0 introduced better integration with external apps, including the "Sync the external app with Obsidian" option.

> maybe there are some more options written in other programming languages that can be (or have been) compiled to WebAssembly.

Thanks for pointing it out, I'll do some more surveys including other languages!


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

Try out 0.27.0!

It's still experimental, but at least it works.


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

@dominiwe

> you even added the ability to select the library used

Sorry, I've removed that option from 0.27.8 on 😭
because

  • pdfAnnotate seems to be slightly buggy
  • pdfAnnotate is not maintained anymore (the last update was 2 years ago), and I'm not aware of active forks

while

  • pdf-lib is fast!
  • While pdf-lib does not provide a high-level API for adding highlights, it can be done by doing some low-level work. And how pdf-lib works allows us to control almost every detail about the created annotations, which is great.
  • Although the original repo of pdf-lib has not been updated for 3 years, there is an actively maintained fork by Cantoo (https://github.com/cantoo-scribe/pdf-lib). Its last update was only 2 weeks ago!

> I'll test it out with a few PDFs.

Thank you!!


RyotaUshio avatar RyotaUshio commented on September 27, 2024 1

It seems that PDF.js is going to add PDF annotation (editing) support before long.

https://www.reddit.com/r/ObsidianMD/comments/185xzmq/pdf_annotations_soon_progress_on_github_page_of/

Which is great!


RyotaUshio avatar RyotaUshio commented on September 27, 2024

Of course, it would be possible to implement both of these approaches and leave the choice to the user.


dominiwe avatar dominiwe commented on September 27, 2024

Reflecting now, I think it is probably more convenient to add the annotations directly to the PDF. It makes sense not to create extra files that could be left behind if the plugin is ever uninstalled. That thorough write-up was really interesting! Regarding your last comment, I do use it like that in my workflow (which is to say, I don't personally need this feature). I was curious how it is done in Logseq though.

Also interesting comparison between the JS libraries. Not an expert either but I was thinking maybe there are some more options written in other programming languages that can be (or have been) compiled to WebAssembly.


jcesguerram avatar jcesguerram commented on September 27, 2024

i've been thinking about it and the functionality that you have of copying text to a new note is super useful. that should definitely be a feature of the plugin. a separate feature would be highlighting.
i read what you wrote above and i think approach 1 would be great. i've been using the annotator plugin and i'm okay with having the highlights be visible only in obsidian. but it's a bit of a pain to look at the plain text of the comments that one adds there. your approach is much better


RyotaUshio avatar RyotaUshio commented on September 27, 2024

@jcesguerram Thanks for sharing your thoughts!

> i think approach 1 would be great. i've been using the annotator plugin and i'm okay with having the highlights be visible only in obsidian.

What about this point?

> It inevitably introduces plugin-dependent stuff into PDF++, which I have tried to avoid as much as possible so far.
>
> As I wrote in README, this plugin has a non-negligible risk of breaking when Obsidian updates its internals because it is based on monkey-patching of Obsidian's native PDF viewer. Moreover, there is no guarantee that I can continue maintaining this plugin for years, being an unpaid volunteer.
>
> Given these points, it might be dangerous to leave a number of code blocks in your notes that cannot be recognized when PDF++ is gone.


jcesguerram avatar jcesguerram commented on September 27, 2024

yeah, the point about continuity is important. as i said, i'm okay with highlights being visible only in obsidian... as long as they're always available there.
maybe you could work more closely with obsidian on their native pdf viewer. they say in their roadmap that they want to work on pdf annotation (https://obsidian.md/roadmap/), but that they're "waiting for native support in PDF.js." (i don't know what that means)
but if you announce that you will help obsidian to incorporate the features of your plugin into their pdf viewer, people might pay you for that. i would.


dominiwe avatar dominiwe commented on September 27, 2024

@jcesguerram PDF.js is a PDF viewer from Mozilla. It is the default PDF viewer that opens when you open a PDF in a modern version of Mozilla Firefox. See this demo page: https://mozilla.github.io/pdf.js/web/viewer.html

In the demo, there are actually a few annotation types that are supported: free-text, drawings and images. And there is a getAnnotations function in the code.

Their FAQ includes this:

> PDF.js is designed for reading PDF files and supports rendering annotations for viewing, but also supports adding annotations using a subset of the possible annotation types.

I am not 100% sure, but I assume that for highlight annotations (text markup annotations), reading them is supported while writing and editing them is not yet. It may be a while until it is, so the Obsidian team is probably blocked there. It's probably a sensible decision to wait for that.
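
To illustrate the reading side, here is a small TypeScript sketch that picks the highlight annotations out of a list shaped like what `getAnnotations` returns. The `AnnotationLike` type only models the few fields used here and is an assumption about the shape of those objects (they carry more data in practice); the array below is stub data, not output from a real PDF.

```typescript
// Minimal structural type for the entries we care about.
// Assumption: each annotation object has at least an id and a subtype.
interface AnnotationLike {
    id: string;
    subtype: string;
    rect: number[];
}

// Returns the IDs of all highlight (text markup) annotations.
function highlightIds(annotations: AnnotationLike[]): string[] {
    return annotations.filter((a) => a.subtype === "Highlight").map((a) => a.id);
}

// Stub data standing in for the result of `await page.getAnnotations()`.
const stubAnnotations: AnnotationLike[] = [
    { id: "12R", subtype: "Link", rect: [72, 500, 200, 512] },
    { id: "73R", subtype: "Highlight", rect: [72, 686, 300, 712] },
    { id: "80R", subtype: "FreeText", rect: [100, 100, 250, 140] },
];

console.log(highlightIds(stubAnnotations));
```

In a real viewer the input would come from the page object's `getAnnotations` call rather than a hand-written array.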


dominiwe avatar dominiwe commented on September 27, 2024

@RyotaUshio

That is awesome work! I think it's good that the feature is off by default and that there is a warning text.

I also like that you even added the ability to select the library used. Wow! Thank you!
I'll test it out with a few PDFs.


RyotaUshio avatar RyotaUshio commented on September 27, 2024

@jcesguerram Thanks, that would be great. But as a soon-to-be Ph.D. student, maybe I shouldn't spend too much time developing a tool for research rather than research itself 😭

So I'm inclined to go with approach 2 for now (and it's already implemented to some extent), but adding approach 1 as another option later will also be beneficial (and doable).
