
Comments (7)

NetherKing1357 commented on May 19, 2024

Based on your response in the forum, I guess we could start working on support for the .xml files stored within CBZ and CB7 files.
I've attached a zip file containing a CBZ. It is a comic file with every entry in the CR metadata editor filled in.

peppercarrot_episode01.zip

The following entries have no information stored in the .xml file:

  • Rating
  • Community Rating
  • Series Complete
  • Proposed Values
  • Tags
  • Review
  • Characters

This is the content of the .xml file:

<?xml version="1.0"?>
<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Title>Episode 1</Title>
  <Series>Pepper and Carrot</Series>
  <Number>1</Number>
  <Count>23</Count>
  <Volume>1</Volume>
  <AlternateSeries>Pepper and Carrot</AlternateSeries>
  <AlternateNumber>1</AlternateNumber>
  <StoryArc>None</StoryArc>
  <SeriesGroup>Pepper and Carrot</SeriesGroup>
  <AlternateCount>23</AlternateCount>
  <Summary>This is an open source comic. I have added this information to understand how ComicRack adds metadata to comic files.</Summary>
  <Notes>This is an open source comic. I have added this information to understand how ComicRack adds metadata to comic files.</Notes>
  <Year>2017</Year>
  <Month>3</Month>
  <Day>6</Day>
  <Writer>David Revoy</Writer>
  <Penciller>David Revoy</Penciller>
  <Inker>David Revoy</Inker>
  <Colorist>David Revoy</Colorist>
  <Letterer>David Revoy</Letterer>
  <CoverArtist>David Revoy</CoverArtist>
  <Editor>David Revoy</Editor>
  <Publisher>David Revoy</Publisher>
  <Imprint>David Revoy</Imprint>
  <Genre>Web Comic</Genre>
  <Web>https://archive.org/details/peppercarrot-en</Web>
  <PageCount>4</PageCount>
  <LanguageISO>en</LanguageISO>
  <Format>Web Comic</Format>
  <AgeRating>Everyone</AgeRating>
  <BlackAndWhite>No</BlackAndWhite>
  <Manga>No</Manga>
  <Characters>Pepper, Carrot</Characters>
  <Teams>Pepper and Carrot</Teams>
  <Locations>Carrotland</Locations>
  <ScanInformation>Internet Archive HTML5 Uploader 1.6.3</ScanInformation>
  <Pages>
    <Page Image="0" ImageSize="346512" ImageWidth="992" ImageHeight="1373" Type="FrontCover" />
    <Page Image="1" ImageSize="348534" ImageWidth="992" ImageHeight="1373" />
    <Page Image="2" ImageSize="244617" ImageWidth="992" ImageHeight="1373" />
    <Page Image="3" ImageSize="184320" ImageWidth="720" ImageHeight="177" />
  </Pages>
</ComicInfo>
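
For anyone who wants to inspect such a file programmatically, here is a minimal sketch of reading the embedded metadata from a CBZ (which is a plain zip archive). It uses only the Python standard library. The function name `read_comicinfo` is made up for this illustration, and matching the file name case-insensitively anywhere in the archive is an assumption, not documented ComicRack behaviour. CB7 files would need a 7z library and are not covered.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_comicinfo(archive) -> dict:
    """Return the simple text fields of an embedded ComicInfo.xml as a dict.

    `archive` is a path or a file-like object. Returns {} when the archive
    contains no ComicInfo.xml.
    """
    with zipfile.ZipFile(archive) as cbz:
        # Assumption: accept any letter case and any folder inside the archive.
        names = [n for n in cbz.namelist()
                 if n.rsplit("/", 1)[-1].lower() == "comicinfo.xml"]
        if not names:
            return {}
        root = ET.fromstring(cbz.read(names[0]))
    # Keep leaf elements only; containers such as <Pages> are skipped here.
    return {child.tag: (child.text or "").strip()
            for child in root if len(child) == 0}


def _demo_cbz() -> io.BytesIO:
    """Build a tiny in-memory CBZ so the sketch runs without external files."""
    xml = b"""<?xml version="1.0"?>
<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Title>Episode 1</Title>
  <Series>Pepper and Carrot</Series>
  <Number>1</Number>
  <Pages><Page Image="0" Type="FrontCover" /></Pages>
</ComicInfo>"""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("ComicInfo.xml", xml)
    buf.seek(0)
    return buf


info = read_comicinfo(_demo_cbz())
print(info)
```

With a real file, `read_comicinfo("peppercarrot_episode01.cbz")` (path assumed) should return the same kind of dictionary for the full sample above.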

Below are screenshots of the editor itself with all entries filled in. Only Web was filled in later, and it does have an entry in the .xml file.

CopyQ vU5648
CopyQ ba5648
CopyQ Gy5648

Every file scraped by cbanack's ComicVine scraper for ComicRack has the following information appended:

  • Web has a link to the ComicVine entry for that issue
  • Either Tags or Notes has this message: Scraped metadata from ComicVine [CVDBxxxxxx].

Example: If Immortal Hulk, issue 14 were scraped:

<Notes>Scraped metadata from ComicVine [CVDB702466].</Notes>
<Web>https://comicvine.gamespot.com/the-immortal-hulk-14-we-only-meet-at-funerals/4000-702466/</Web>

If all else fails, we can use this information to recursively run the YAC scraper for all the files.
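
As a rough sketch of that fallback, the ComicVine issue ID could be pulled out of the scraped fields like this. The `[CVDBxxxxxx]` marker comes from the message above; the `4000-<id>` URL pattern is only inferred from the two sample links in this thread, and `comicvine_issue_id` is a hypothetical helper name.

```python
import re
from typing import Optional

# Marker written by the scraper into Notes or Tags, e.g. "[CVDB702466]".
CVDB_TAG = re.compile(r"\[CVDB(\d+)\]")
# Assumption: issue links end in "/4000-<id>/", as in the samples in this thread.
CV_ISSUE_URL = re.compile(r"/4000-(\d+)/?$")


def comicvine_issue_id(notes: str = "", tags: str = "", web: str = "") -> Optional[str]:
    """Return the ComicVine issue ID as a string, or None if none is found."""
    for text in (notes, tags):
        match = CVDB_TAG.search(text or "")
        if match:
            return match.group(1)
    # Fall back to the Web link when neither Notes nor Tags carries the marker.
    match = CV_ISSUE_URL.search((web or "").strip())
    return match.group(1) if match else None


issue_id = comicvine_issue_id(
    notes="Scraped metadata from ComicVine [CVDB702466].",
    web="https://comicvine.gamespot.com/the-immortal-hulk-14-we-only-meet-at-funerals/4000-702466/",
)
print(issue_id)
```

Checking Notes and Tags first and the Web link last mirrors the order in which the scraper information is described above; a file with none of these simply yields `None` and would have to be matched some other way.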

I would need some documentation on the way YACReader stores metadata info to compile a map of CR to YAC tags. Could anyone point me in that direction?

from yacreader.

NetherKing1357 commented on May 19, 2024

I've done a basic mapping. Please take a look and let me know if I've got anything wrong.

mapping.xlsx


NetherKing1357 commented on May 19, 2024

Some relevant comments on the forum:

[quote="matthew" post=2058]
Luis, here are the XML tags currently supported by ComicRack:

<ComicInfo xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<Title>Hope And Glory - Part II: Bitter Beginnings</Title>
	<Series>Ninjak</Series>
	<Number>3</Number>
	<Count>6</Count>
	<Volume>1994</Volume>
	<StoryArc>Arthur</StoryArc>
	<SeriesGroup>Islands</SeriesGroup>
	<Summary>The secret origin of Ninjak continues!</Summary>
	<Notes>Scraped metadata from ComicVine [CVDB141693].</Notes>
	<Year>1995</Year>
	<Month>6</Month>
	<Day>24</Day>
	<Writer>Mark Moretti</Writer>
	<Penciller>Bob McLeod, Mark Moretti</Penciller>
	<Inker>Bob McLeod, Dick Giordano</Inker>
	<Colorist>Kathryn Bolinger</Colorist>
	<Letterer>Bob McLeod, Dick Giordano</Letterer>
	<CoverArtist>Bob McLeod, Kathryn Bolinger, Mark Moretti</CoverArtist>
	<Editor>Bob Layton</Editor>
	<Publisher>Valiant</Publisher>
	<Imprint>Aircel Publishing</Imprint>
	<Genre>Action, Fantasy</Genre>
	<Web>http://www.comicvine.com/ninjak-00-hope-and-glory-part-ii-bitter-beginnings/4000-141693/</Web>
	<PageCount>35</PageCount>
	<LanguageISO>en</LanguageISO>
	<Format>Director's Cut</Format>
	<AgeRating>Mature 17+</AgeRating>
	<BlackAndWhite>No</BlackAndWhite>
	<Manga>No</Manga>
	<Characters>Crimson Dragon, Dr. Silk, Fitzhugh, Iwatsu, Michiko Okubo, Neville Alcott, Ninjak, Senator Yusaku Okubo</Characters>
	<Teams>X-Men</Teams>
	<Locations>California, England, Japan, London, Tokyo</Locations>
	<Pages>
		<Page Image="0" ImageSize="568730" ImageWidth="1280" ImageHeight="1977" Type="FrontCover" />
		<Page Image="1" ImageSize="709786" ImageWidth="1280" ImageHeight="1995" />
	</Pages>
</ComicInfo>

[/quote]
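
One detail visible in this sample is that several fields (creators, Genre, Characters, Teams, Locations) hold comma-separated lists. A minimal sketch of splitting them during import follows. Which tags count as multi-valued is an assumption based on this sample alone, and the names `MULTI_VALUE_TAGS`, `split_values` and `normalize` are invented for the illustration. Values that themselves contain a comma would be split incorrectly by this naive approach.

```python
import xml.etree.ElementTree as ET

# Assumption: these tags may carry several comma-separated values.
MULTI_VALUE_TAGS = {
    "Writer", "Penciller", "Inker", "Colorist", "Letterer", "CoverArtist",
    "Editor", "Genre", "Characters", "Teams", "Locations",
}


def split_values(text: str) -> list:
    """Split a comma-separated field into trimmed, non-empty values."""
    return [part.strip() for part in text.split(",") if part.strip()]


def normalize(root: ET.Element) -> dict:
    """Map tag names to a string, or to a list for multi-valued tags."""
    result = {}
    for child in root:
        if len(child):  # skip containers such as <Pages>
            continue
        text = (child.text or "").strip()
        result[child.tag] = split_values(text) if child.tag in MULTI_VALUE_TAGS else text
    return result


sample = ET.fromstring("""<ComicInfo>
  <Series>Ninjak</Series>
  <Penciller>Bob McLeod, Mark Moretti</Penciller>
  <Genre>Action, Fantasy</Genre>
</ComicInfo>""")
print(normalize(sample))
```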

[quote="selmf" post=4883]
Since this is requested regularly I'd like to point out a few things that can be done to speed things up a little. If we want to implement metadata import, we roughly have this todo list:

[ol]
[li]Research the format specification for all metadata files we want to support[/li]
[li]Compare the available metadata entries with YACReader's available database entries[/li]
[li]Map foreign metadata to YACReader's metadata, decide what to do with edge cases[/li]
[li]Acquire a set of example files that are [b]fully tagged[/b] in [u]all[/u] metadata formats and are legal (not pirated!!!) comics[/li]
[li]Add metadata detection to our library and comic routines[/li]
[li]Run tests to make sure it is working correctly[/li]
[li]Write some basic import routines for the most important tags[/li]
[li]Add logic to handle edge cases like multiple metadata files present and other stuff[/li]
[li]Fine-tune our import dialog to make all options available[/li]
[/ol]

As you can see, this is not a feature that can be implemented quickly. If you want to help out, you can create a bug on our GitHub page and start collecting the info that is needed to actually start the task.

[/quote]

[quote="Luis Ángel" post=4884]
To that list I would add an option to re-scan the comics in a library for metadata (possibly with an option to do it for a single folder or a specific file). Once this is implemented, people will want the metadata to be available for the comics already in the library.

Some help with this would be great, anyone?
[/quote]


selmf commented on May 19, 2024

The first issue I see is that we manage libraries by placing our data in a hidden directory in the root directory of the collection in question. That does not align well with the concept of a central xml file to "rule them all", so we will have to think about how to handle this, or whether we are going to handle it at all.
There is also no info on the structure of this database, other than "xml snippets" or "one huge xml file".

Another issue is that per-file metadata is not stored consistently. Sometimes it is inside the archives and sometimes not; it might even be "hidden" using special NTFS filesystem features. Supporting all of these variants probably doesn't make sense.

Metadata format seems to be roughly what ComicVine is giving us (@luisangelsm is that more or less correct?) so mapping should be possible.

We also still need some test files. If anyone is interested, Pepper and Carrot is a great open source web comic we have used for testing and showcase purposes in the past, so you could grab a cbz of it and tag it via ComicRack.


selmf commented on May 19, 2024

YACReaderLibrary stores its metadata in a hidden directory called .yacreaderlibrary which contains a directory with covers and a database file called library.db.
You can use https://sqlitebrowser.org/ to open this file and inspect the entries. For any questions related to the format in general, you will need to ask @luisangelsm. The database is his speciality and I have successfully avoided working on it until now.
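
If you prefer a script over a GUI, a small sketch like the following lists every table and its columns using Python's built-in `sqlite3` module. It makes no assumption about what library.db actually contains. The demo table `demo_comic` is purely hypothetical and only exists so the example runs on its own.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> dict:
    """Return {table_name: [(column_name, declared_type), ...]} for a database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Self-contained demo on an in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_comic (id INTEGER PRIMARY KEY, title TEXT)")
schema = describe_schema(conn)
print(schema)
```

To look at a real library, replace the in-memory connection with `sqlite3.connect("<library root>/.yacreaderlibrary/library.db")`, ideally on a copy of the file so the library cannot be modified by accident.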


selmf commented on May 19, 2024

Thanks for taking the time to do this. This should be enough for me to start writing a first draft of an importer. I still need to investigate which technical option we should choose for supporting XML in general, and I will need to discuss that decision with @luisangelsm to get his input and OK on it.
We might also use this opportunity to take a closer look at our own library metadata and maybe make some improvements to it.


luisangelsm commented on May 19, 2024

@NetherKing1357 Thanks for all the resources and research, it has been really useful.

It still needs some work, but it is looking good so far.

