Comments (4)

Calvin-Xu commented on June 20, 2024

(responding to #133 (comment))

> It makes sense to me that manual search may be associated with an audio clip if raster subtitles were OCR'd.

Yes. This is actually what I tested OCR with.

> Outside of that, OCR will usually be associated with visual context as opposed to audio context.

You are totally correct. I realized I am actually envisioning this more for manual lookup. Because Memento currently sends OCR results to lookup, I decided to talk about them together.

My imagined main use case actually pertains to consuming content that does not have subtitles at all. I have encountered the following two scenarios:

  1. Video with no subtitles that I understand about 90% of. By listening closely and typing what I think I heard into a web search engine, I eventually transcribe the sentence containing the new vocab / usage.

  2. Video with no timed subtitles, but a (partial) transcript is available in some other form. Examples include songs on YouTube that often don't have subtitles but whose lyrics can be easily looked up, TV news that shows a slightly altered version of the talking points on screen (OCR helps here), etc.

> The reason I wanted to add OCR in the first place was due to Evangelion episode 14 using cards of text throughout to communicate information. The second use case I found after implementing it was using this https://github.com/Dudemanguy/mpv-manga-reader to turn Memento into a manga reader. For both of these cases, I don't see the benefit of extracting audio from the content.

I agree that this does not work in cases where the visual context is detached from the audio context. Perhaps I'll have multiple profiles for lookup with context, lookup without context, etc. I'd like to know what you think.

from memento.

ripose-jp commented on June 20, 2024

Now that I understand the use case for this feature, my next concern is how to explain it to the user. My philosophy when designing Memento has been to try to keep everything as self-explanatory as possible. I don't want Memento to become a piece of software you need a guide to use.

> Yes. This is actually what I tested OCR with.

In this case {audio-media} should work, but {audio-context} will fail. I don't think this is worth fixing since this is expected behavior given the descriptions of the two features.

Calvin-Xu commented on June 20, 2024

> In this case {audio-media} should work, but {audio-context} will fail. I don't think this is worth fixing since this is expected behavior given the descriptions of the two features.

I think this makes enough sense.

> my next concern is how to explain it to the user

I think a number of video players, including mpv, support setting some kind of A-B loop (default keybind l in mpv) with certain established DWIM behavior:

> --ab-loop-a=<time>, --ab-loop-b=<time>
> Set loop points. If playback passes the b timestamp, it will seek to the a timestamp. Seeking past the b point doesn't loop (this is intentional).
> If a is after b, the behavior is as if the points were given in the right order, and the player will seek to b after crossing through a. This is different from old behavior, where looping was disabled (and as a bug, looped back to a on the end of the file).
> If either options are set to no (or unset), looping is disabled. This is different from old behavior, where an unset a implied the start of the file, and an unset b the end of the file.
> The loop-points can be adjusted at runtime with the corresponding properties. See also ab-loop command.
> https://mpv.io/manual/stable/#options-ab-loop-a

though I agree it is not always the most intuitive feature.
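
To make the quoted rules a bit more concrete, here is a rough sketch of how a pair of loop points could be turned into a usable range. This is only an illustration of the manual text above, not mpv's or Memento's actual code; the function name `normalize_ab_loop` and the use of `None` for an unset point are my own assumptions.

```python
from typing import Optional, Tuple


def normalize_ab_loop(
    a: Optional[float], b: Optional[float]
) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds, or None when looping is disabled.

    Hypothetical helper: None stands in for an unset ("no") loop point.
    """
    # Quoted rule: if either point is unset, looping is disabled.
    if a is None or b is None:
        return None
    # Quoted rule: if a is after b, behave as if the points
    # had been given in the right order.
    return (min(a, b), max(a, b))


if __name__ == "__main__":
    print(normalize_ab_loop(15.0, 10.0))  # points given in reverse order
    print(normalize_ab_loop(None, 10.0))  # a is unset, so no loop
```

The point is that the user never has to care which point they set first, which is the DWIM part.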

I think providing a new marker like {audio-selection} will be a good move, and any explanation that it needs can be there for those that want to use it.

ripose-jp commented on June 20, 2024

> I think providing a new marker like {audio-selection} will be a good move, and any explanation that it needs can be there for those that want to use it.

I'm satisfied with this. I'm not sure when I'll get this done, but I have enough to go off of now.
