erengy / anitomy
Anime video filename parser
License: Mozilla Public License 2.0
Sometimes anime releases use the WEB keyword in a different pattern:
Kubo.Wont.Let.Me.Be.Invisible.S01E12.VOSTFR.1080p.WEB.x264-TsundereRaws-Wawacity.cyou
It seems to mean WEBRip, I guess.
I don't know if it really matters.
Hello,
I would like to use this library in a project I have, but I am unable to do so because of the GPL license. The GPL license requires that if I include it in my project, even as a linked library, I must also license my code under GPL. As this is a private project I am unable to publish the code as required.
Would you please consider switching to a license such as LGPL? This permits linking to private projects while still encouraging contribution to the library. A copy of LGPL can be found here.
It appears you don't mind this, as "MAL Updater OSX" and "Hachidori" are both released under the BSD 3-Clause license and are technically in violation of the terms you have granted them.
There has been a lot of great work done on this project around parsing edge cases in titles and I would really like to take advantage of that without having to start from scratch in my own project.
Thanks,
Zak Kristjanson
I tried to compile Taiga with Visual Studio 2015 RTM. While compiling anitomy, I got an error at line 182 of tokenizer.cpp:
Error C2039: 'back_inserter': is not a member of 'std'
Error C3861: 'back_inserter': identifier not found
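For context, std::back_inserter is declared in the <iterator> header. Older toolchains sometimes pulled that header in indirectly, which may be why the code built before. A minimal sketch of the likely fix, assuming the failing line passes std::back_inserter to a standard algorithm (the function below is hypothetical and is not taken from tokenizer.cpp):

```cpp
#include <algorithm>
#include <iterator>  // declares std::back_inserter
#include <string>
#include <vector>

// Hypothetical helper, not anitomy's code: copies the non-empty strings
// from `input` into a new vector using std::back_inserter.
std::vector<std::wstring> CopyNonEmpty(const std::vector<std::wstring>& input) {
  std::vector<std::wstring> output;
  std::copy_if(input.begin(), input.end(), std::back_inserter(output),
               [](const std::wstring& s) { return !s.empty(); });
  return output;
}
```

Adding the explicit include keeps the file independent of which headers the standard library happens to include internally.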
I'm not sure if you want all of my suggestions in 1 issue or multiple, but here are my suggestions/the things I've noticed:
Nice to have:
-Interesting?
[Infantjedi] Norn9 - Norn + Nonetto - 12 results in "Norn9 - Norn Nonetto", but that's such an edge case that it can be ignored.
Wonderful library, btw.
Season 1, Episode 3, written as 103, is detected as Episode 103.
Is there any reason for anitomy to parse 2nd Season or Season 2 as the season number but not parse S2 as well? e.g.:
Hayate no Gotoku 2nd Season 24 (Blu-Ray 1080p) [Chihiro]
is parsed with the title Hayate no Gotoku and season number 2. But...
[SFW]_Queen's_Blade_S2
...is parsed with the title Queen's Blade S2 with no season number. Is this the expected behavior?
[0x539] Somali and the Forest Spirit - S01E01 (WEB 1080p Hi10P AAC) [BB7C6531].mkv
is being parsed as "Somali and the Forest Spirit - S01E01" episode 539 in Taiga.
Just a suggestion, since I have seen some videos where the title and episode number are separated by a full-width space (U+3000). It looks like currently only half-width spaces and underscores are included. I may write a PR later if I have the time.
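A rough sketch of what this suggestion could look like, assuming a tokenizer that splits on a fixed set of delimiter characters. The names IsDelimiter and SplitOnDelimiters are hypothetical and do not reflect anitomy's actual implementation:

```cpp
#include <string>
#include <vector>

// Hypothetical delimiter check: half-width space, underscore, and the
// full-width (ideographic) space U+3000.
bool IsDelimiter(wchar_t c) {
  return c == L' ' || c == L'_' || c == L'\u3000';
}

// Splits `text` into tokens, dropping the delimiters and any empty tokens.
std::vector<std::wstring> SplitOnDelimiters(const std::wstring& text) {
  std::vector<std::wstring> tokens;
  std::wstring current;
  for (wchar_t c : text) {
    if (IsDelimiter(c)) {
      if (!current.empty()) {
        tokens.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(c);
    }
  }
  if (!current.empty()) tokens.push_back(current);
  return tokens;
}
```

With this check, a name such as "Title", then U+3000, then "01" splits into the two tokens "Title" and "01", the same as it would with a half-width space.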
I am using Taiga, but when I finished Haikyuu season 1 and played the first episode of season 2, it kept saying I was playing the first episode of the first season. (I use anichiraku, which is a Google Drive of anime, and I can't edit the file names.)
Hello,
I ran the latest version of this code against the included unit tests and found that a number of them are failing. Just raising awareness in case this is unintentional.
I only coded the test to check the title, so other props (even in successful tests), may or may not be correct.
expected vs actual:
#14 MISMATCH: Juuni Kokki => (Les 12 Royaumes)
#39 MISMATCH: Kiddy Grade 2 => Kiddy Grade
#64 MISMATCH: Keroro => 148
#78 MISMATCH: Aim For The Top! Gunbuster => Aim For The Top! Gunbuster-ep1
#81 MISMATCH: Mobile Suit Gundam Seed Destiny => encoded by SEED
#82 MISMATCH: ?K? => Image
#98 MISMATCH: Golden Time => ?
#101 MISMATCH: Mangaka-san to Assistant-san to the Animation => 02
#103 MISMATCH: Rozen Maiden 3 => Rozen Maiden
#112 MISMATCH: Death Note => 37 [Ruberia] Death Note
#113 MISMATCH: Accel World - EX => Accel World - EX01
#120 MISMATCH: Akuma no Riddle => EvoBot [Watakushi] Akuma no Riddle
#121 MISMATCH: => 01 - Land of Visible Pain
#124 MISMATCH: The iDOLM@STER 765 Pro to Iu Monogatari => The iDOLM@STER
#129 MISMATCH: Hidamari Sketch x365 => Hidamari Sketch x365 - 04.1
#130 MISMATCH: => The Boy in the Iceberg
#138 MISMATCH: The Animatrix => The Animatrix 08.A Detective Story
#144 MISMATCH: Memories Off 3.5 => Memories Off
#146 MISMATCH: Byousoku 5 Centimeter => Byousoku
I have a sample formatted in this way (substitutions are surrounded by curly brackets):
[{Category}] -{Romanized Title}- {Original Title} Vol{Volume Number} 第{Episode Number}話 「{Episode Title}」 ({Video Codec} {WxH Resolution} {Audio Codec}).{Extension}
In this situation the episode title is parsed as the release group name and displayed as such in Taiga.
I know enclosing titles within brackets goes against your suggestions in the readme, but I have never seen Japanese brackets (「」) used for group names. Perhaps introducing bias for this pattern could be a solution?
I was running some tests (https://runkit.com/jaliborc/5c13d05e6ba83b0012bfbcf2) and I noticed this issue with the parser: if you look at the last two tests, you see that Piano no Mori 2 (TV) gives the anime_title Piano no Mori, with season 2 and anime_type TV. But Piano no Mori (TV) 2nd Season gives anime_title Piano no Mori (TV), with the anime_type TV remaining the same.
Notable example: recognition fails for the currently airing anime "NieR:Automata Ver1.1a":
"NieR:Automata Ver1.1a - 01"
Related: erengy/taiga#1110
Given Wonder.Woman.2017.720p.10bit.BluRay.6CH.x265.HEVC:
AnitomyElements {
AnimeTitle: 'Wonder Woman 2017',
FileExtension: 'mkv',
FileName: 'Wonder.Woman.2017.720p.10bit.BluRay.6CH.sample',
Source: 'BluRay',
VideoResolution: '720p',
VideoTerm: '10bit' }
Is it possible to identify 2017 as the ReleaseYear rather than part of the title?
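One possible heuristic, sketched here only to illustrate the request: treat a standalone four-digit token in a plausible range as the release year. The function name, the 1900 to 2099 range, and the rule of skipping the first token are all assumptions, not anitomy behavior:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of anitomy's API.
// Returns the index of the first token that looks like a release year
// (exactly four ASCII digits, value between 1900 and 2099), or -1 if none.
// The first token is skipped on the assumption that a title does not
// start with its release year; this is a heuristic, not a rule.
int FindReleaseYearToken(const std::vector<std::wstring>& tokens) {
  for (std::size_t i = 1; i < tokens.size(); ++i) {
    const std::wstring& token = tokens[i];
    if (token.size() != 4) continue;
    int value = 0;
    bool all_digits = true;
    for (wchar_t c : token) {
      if (c < L'0' || c > L'9') {
        all_digits = false;
        break;
      }
      value = value * 10 + static_cast<int>(c - L'0');
    }
    if (all_digits && value >= 1900 && value <= 2099) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

A heuristic like this would still misfire on titles that contain a year-like number, so it would need to be weighed against the other tokens in the filename.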
For the title
[Kaleido-subs] Blue Archive the Animation - 07 (S01E07) - (WEB 1080p HEVC x265 10-bit E-AC3 2.0) [3B0015AF]
anitomy seems to be exclusively prioritizing the (S01E07) token resulting in the anime title being parsed as "Blue Archive the Animation - 07" in this example.
Getting the following compilation error on my linux box with gcc 5.2.0:
lib/anitomy/anitomy/string.cpp: In function 'wchar_t anitomy::ToLower(wchar_t)':
lib/anitomy/anitomy/string.cpp:73:41: error: 'towlower' was not declared in this scope
static_cast<wchar_t>(towlower(c));
^
lib/anitomy/anitomy/string.cpp: In member function 'wchar_t anitomy::ToUpper::operator()(wchar_t) const':
lib/anitomy/anitomy/string.cpp:80:43: error: 'towupper' was not declared in this scope
static_cast<wchar_t>(towupper(c));
^
Looks like towupper and towlower are defined in cwctype so I had to include that in string.cpp to get it to compile.
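A minimal sketch of that fix. The function names below are stand-ins for illustration and are not copied from string.cpp; the relevant part is the <cwctype> include, which declares std::towlower and std::towupper:

```cpp
#include <cwctype>  // declares std::towlower and std::towupper

// Hypothetical stand-in for the lower-casing function in the error output.
wchar_t ToLowerChar(wchar_t c) {
  return static_cast<wchar_t>(std::towlower(c));
}

// Hypothetical stand-in for the upper-casing functor in the error output.
wchar_t ToUpperChar(wchar_t c) {
  return static_cast<wchar_t>(std::towupper(c));
}
```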
The library fails to recognize some anime if the version number comes right after the episode number:
[Judas] Aharen-san wa Hakarenai - S01E06v2.mkv
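To illustrate the pattern in question, here is a small sketch that accepts an optional version suffix on an SxxExx token. It uses std::wregex and hypothetical names; it is not how anitomy parses episodes, and the default version of 1 is an assumption:

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct SeasonEpisode {
  int season = 0;
  int episode = 0;
  int version = 1;  // assumed default when no "v" suffix is present
};

// Parses tokens such as "S01E06" or "S01E06v2" (case-insensitive).
// Returns false if the whole token does not match the pattern.
bool ParseSeasonEpisode(const std::wstring& token, SeasonEpisode& out) {
  static const std::wregex pattern(
      L"S(\\d{1,2})E(\\d{1,4})(?:v(\\d))?",
      std::regex_constants::ECMAScript | std::regex_constants::icase);
  std::wsmatch match;
  if (!std::regex_match(token, match, pattern)) return false;
  out.season = std::stoi(match[1].str());
  out.episode = std::stoi(match[2].str());
  out.version = match[3].matched ? std::stoi(match[3].str()) : 1;
  return true;
}
```

For the token S01E06v2 from the filename above, this yields season 1, episode 6, version 2.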