Comments (23)

mmartinez85 commented on June 15, 2024

If you want to re-work it with fuzzy matching, I can hold off on starting to manually go through the rest of my library. Just in case you want to use my library for more testing. Totally down with that. I already used Google Takeout to download everything I had in Uploads so if it does add something to the Library that is totally off and then deletes it from Uploads, I have a local copy of everything to reference.

from ytmusic-deleter.

apastel commented on June 15, 2024

Before getting into it I just want to double-check that it's your intention to use the -a feature, meaning that you want the tool to check and see if the YT Music service already has a copy of each of your uploaded albums so that instead of just deleting it completely, it adds YTM's copy to your library and then deletes your uploaded copy. Because if not, you can omit the -a and then it will actually delete all your uploads.

If you do want to use the -a feature, then it sounds like your uploads don't have an artist associated with them. This happens a lot with YTM uploads unfortunately, and even worse, they don't provide a way to edit the metadata to add an artist.

For example, I just tested uploading this album to my uploads, and it looks like this in my library:
image
As you can see, my uploaded album has the correct name and song titles, but there is no artist showing here, even though I just confirmed that the metadata on the .mp3 files on my computer does have the artist name. So, YTM just totally failed to capture that artist name when I uploaded it.

Without knowing the artist name, my tool can't confidently do a lookup in YTM's online catalog to find an existing copy of it. It could look by album title instead but a lot of artists make albums with the same name. I made the tool act conservatively when using the -a flag, so that if it can't find an existing copy then it won't delete it from your uploads so that you don't just lose it entirely.

I think this is the first time I've had someone ask about the -a feature so I haven't done anything to try to improve it. There are some ideas floating around in my head for how to make it smarter with matching or to allow the option to still delete the upload if it can't find it, but before I go down that path I just want to see what your needs are for the -a feature and give you a chance to check if your uploaded albums are indeed missing the Artist name in YouTube Music.

mmartinez85 commented on June 15, 2024

Dude, you rock. I thought my Issue might just be submitted into the ether... I just checked though and you're right, my YT Music uploads don't have artists names. I never noticed that before. I was trying to use the -a feature. I was super pumped when I saw it existed because it would do exactly what I was trying to do...

My dilemma is this... I have Uploads of all of my music (CDs) from back when. I can shuffle those Uploads but the artists/albums/songs in that Uploads playlist all stop when I stopped buying CDs and went to streaming. For argument's sake then, let's say all of my music from 2014 and prior is in one shuffle-able Uploads playlist but nothing from more recent years. My ideal goal is to take everything in Uploads and add it to my Library, which I could then add more current stuff to. Hope that makes sense.

Obviously this is my issue, not yours, but if you have any ideas I'm open. I just want to get all the stuff in Uploads to some playlist or library I can then add to. I have like 1300+ albums in Uploads. Not impossible to do manually but automation would be awesome.

apastel commented on June 15, 2024

Haha, this is my pet project so I read and respond to every word about it that comes my way :)

Yeah that's unfortunate that there's no Artist metadata and near impossible to add any. There are things I could do to make the search smarter but before looking into that, it sounds like you just want a playlist that has all your old uploads but then also be able to add new music to it? I might be oversimplifying/misunderstanding but there's nothing in YTM that prevents you from adding non-uploads to a playlist that contains uploads. So you would end up with one playlist that has all your old uploads and all the new songs that you clicked "Save to Playlist" on. Maybe the issue is that you don't yet have a playlist that contains all your uploads. I just tried to create one and I noticed that you can't use the Shift key to "select all" the uploaded songs, so maybe that's the part that's preventing you from adding all your uploads to a playlist?

=====
On a completely separate note, I found out that even if an uploaded album has the artist information, the ytmusic-deleter still fails to delete it.

[2024-03-15 15:31:49] Searching for album in online catalog...
[2024-03-15 15:31:49] No match for uploaded album found in online catalog. Will not delete.

So this is an internal error but I can fix this one at least -- things like this break when Google makes changes to their data model. I will open a separate issue (#36) for that since that's different from a missing metadata error so that way I can isolate and fix it separately, and that'll be in the next release. That one's an easy fix at least.

mmartinez85 commented on June 15, 2024

"it sounds like you just want a playlist that has all your old uploads but then also be able to add new music to it?"

Correctamundo

"there's nothing in YTM that prevents you from adding non-uploads to a playlist that contains uploads. So you would end up with one playlist that has all your old uploads and all the new songs that you clicked "Save to Playlist" on."

This is what I tried to initially do. I tried to go to each album in my Uploads, and add it to a new Playlist. That's when I found out that a playlist has a limit of 5,000 songs. I have way more songs than that in Uploads alone. So then I realized rather than create a new playlist, I could just start adding everything to my Library since that has a limit of 100,000 songs. And yeah kinda like you said, there's no select all feature on Uploads, so it's a process of me just doing each album one by one. I was going to just do that then in my digging, I found this script.

Clicking on 1,300 albums is certainly faster than ripping each CD onto a computer though. 😁

apastel commented on June 15, 2024

Dang, yeah that's a dilly of a pickle.

Because YTM has such a common problem with missing artist metadata for uploads, I should probably update my tool to work around that somehow, since this is going to be a problem not just for you but for anybody who tries to do this (or has tried in the past and gave up without telling me).

So here's a way I can make it work around this in the absence of an artist name:

  • Search using just the album name. If both albums have the same name and have a Track 1 with the same name, then that's probably a match. Maybe check the first two tracks just in case Track 1 happens to be named "Intro" or something on multiple albums with the same name (rare but possible).

That of course runs under the assumption that people's albums at least have an album title, which from my very limited testing seems to be the case for all my uploads. Does it seem like most of your uploads have an album title at least?
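The title-plus-first-tracks idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dict shape (`title`, `tracks`) and the function names are hypothetical and are not the tool's or the YT Music API's real data model.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def probably_same_album(upload: dict, candidate: dict, tracks_to_check: int = 2) -> bool:
    """Return True when the titles match and the first few track names line up.

    Both arguments are hypothetical dicts shaped like
    {"title": str, "tracks": [str, ...]}.
    """
    if normalize(upload["title"]) != normalize(candidate["title"]):
        return False
    ours = [normalize(t) for t in upload["tracks"][:tracks_to_check]]
    theirs = [normalize(t) for t in candidate["tracks"][:tracks_to_check]]
    # Require at least one track to compare, and that the compared tracks agree.
    return bool(ours) and ours == theirs
```

Checking two tracks instead of one covers the case where several same-named albums all open with a track called "Intro".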

mmartinez85 commented on June 15, 2024

Yeah, it does seem like they all have an album title, but the deeper I go, the more confused I am about how YT Music catalogs all this stuff. If I click on any individual album in Uploads, it doesn't show Artist data with it (just like your screenshot above). But if I go to Uploads then either Songs or Artists, those views seem to show all the info, e.g. I can click on an individual artist in Uploads and it shows me song name, artist, and album title. So the data is there? It's just that if you reference by Album it doesn't show the Artist name.

image

image

apastel commented on June 15, 2024

You're right, I'm seeing the same thing on my end. Hmm then there's gotta be a way I can retrieve that artist name then, it might be just in a different field in the API object. There's hope, then. Gonna re-visit this a bit later tonight or this weekend.

apastel commented on June 15, 2024

Well, I found a solution. I changed the code to loop through each uploaded song instead of each uploaded album, since the individual songs seem to have the artist info but not the album. So for each song, it will check if the album that that song is from is in the YT Music online catalog. If it is, the album that song is from will get added to your library, and the song will be deleted from your uploads. So, the whole process will take longer because we're now going one song at a time instead of one album at a time, but it'll get done eventually. Sometimes Google temporarily puts you in time-out when you make too many API requests, so you may have to run it several times to get it to do your whole collection.
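A rough sketch of that song-by-song loop is below. Everything in it is an assumption made for illustration: the song dict keys (`id`, `artist`, `album`), the three callables, and the lookup cache (added here so each album is searched only once) are hypothetical and do not come from the tool's source.

```python
from typing import Callable, Iterable, Optional


def process_uploaded_songs(
    songs: Iterable[dict],
    find_album: Callable[[str, str], Optional[str]],
    add_album: Callable[[str], None],
    delete_song: Callable[[str], None],
) -> tuple[int, int]:
    """Walk uploads song by song; delete a song only if its album was found.

    Returns (deleted_count, kept_count).
    """
    lookups: dict[tuple[str, str], Optional[str]] = {}
    deleted = kept = 0
    for song in songs:
        key = (song["artist"], song["album"])
        if key not in lookups:
            # First song seen from this album: search once and remember the result.
            album_id = find_album(*key)
            if album_id is not None:
                add_album(album_id)
            lookups[key] = album_id
        if lookups[key] is None:
            kept += 1  # conservative: no catalog copy, so the upload stays
        else:
            delete_song(song["id"])
            deleted += 1
    return deleted, kept
```

Passing the catalog operations in as callables keeps the loop testable without network access, which matters when rate limiting makes real runs slow.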

I should be able to push out a release with this change this weekend and you can let me know how it goes :)

mmartinez85 commented on June 15, 2024

Badass. I'm totally willing to be the guinea pig and run it on my account when you're ready. If it works, this will have saved me many hours.

I was gonna mention how I saw in some threads about how Linux apparently looks at a different mp3 tag version and people were able to pull data that way. I hadn't confirmed any of this, but anyway... looks like you got it figured out.

apastel commented on June 15, 2024

Alright @mmartinez85 , we're good to go.
To upgrade to the new version, run:

pip install --upgrade --pre ytmusic-deleter

The --pre is important since this is a pre-release version. On my system, I had to run that command twice before it actually upgraded, so be sure to confirm you have the right version before testing:

$ ytmusic-deleter --version
ytmusic-deleter, version 2.0.4b1

You should see version 2.0.4b1 to be sure you have the new pre-release. Then run ytmusic-deleter delete-uploads -a as normal.

When I tested it against a much smaller library than yours, it seemed to work well. My only concern is that the first thing it does is try to fetch all of your uploaded songs, whereas before it would fetch all the uploaded albums, so it's going to be a gigantic list. If it hangs there for more than a few minutes or times out altogether, let me know and I can refactor it to chunk it down into smaller portions.

Fingers crossed! 🤞

mmartinez85 commented on June 15, 2024

It processed some on the first run, then doesn't seem to like something. Ran it a few times. I'm trying to watch the terminal output as it scrolls. I have some stuff in my library like bootlegs that are probably not even available in YT Music so I wonder if it gets to some of those and freaks out. As of now it did about 100 albums and doesn't seem to want to process anymore.

PS C:\ytmusic-deleter> ytmusic-deleter delete-uploads -a
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\site-packages\uploads.py", line 57, in maybe_delete_uploaded_albums
elif not add_album_to_library(youtube_auth, artist, album_title):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\site-packages\uploads.py", line 88, in add_album_to_library
if search_result["resultType"] == "album" and match_found(
^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\site-packages\uploads.py", line 138, in match_found
search_result["artists"][1]["name"] if "artists" in search_result else ""
~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

mmartinez85 commented on June 15, 2024

Looks like it's not finding a lot of albums that are definitely on YT Music. Maybe I made too many API calls. Gonna give it a break for now I think.

ytmusic-deleter_2024-03-16.log

apastel commented on June 15, 2024

Ok so there are two things happening:

  1. That error you pasted was preventing it from getting the artist name from the search result. I improved that part just now to make it more robust.
  2. Your log showing that it couldn't find a match for albums that definitely have a match, like "Dark Tranquility - The Gallery" for example. I'm thinking this is because it's coming back with 0 search results because of the Google rate limiting, like you suggested.
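For the first item, the IndexError in the traceback comes from indexing position 1 of a search result's "artists" list when that list has fewer than two entries. The thread does not show the actual fix, so the helper below is only a guess at one defensive approach; the function name and the fallback to the last entry are assumptions.

```python
def artist_name(search_result: dict, preferred_index: int = 1) -> str:
    """Return an artist name from a search result, or "" when none is available.

    Hypothetical helper: tries the position used in the traceback and falls
    back to the last entry instead of raising IndexError.
    """
    artists = search_result.get("artists") or []
    if not artists:
        return ""
    # Fall back to the last entry when the preferred position does not exist.
    index = preferred_index if preferred_index < len(artists) else len(artists) - 1
    return artists[index].get("name") or ""
```

Returning an empty string keeps the caller's existing "no artist" path intact, so a short list degrades to a non-match rather than a crash.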

So, now that it's been about an hour, try again with this new update. I also added more logging statements to help debug.

pip install -U --pre ytmusic-deleter
ytmusic-deleter --version
ytmusic-deleter, version 2.0.4b3

mmartinez85 commented on June 15, 2024

We're getting closer... error/closing message and log below.

[2024-03-16 20:36:07] Deleted 768 out of 1265 uploaded albums (or songs).
[2024-03-16 20:36:07] Remaining 497 albums (or songs) did not have a match in YouTube Music's online catalog.

ytmusic-deleter_2024-03-16.log

One example of an album I have in Uploads that is definitely available on YT Music but didn't find a match.

[2024-03-16 20:21:21] Processing album: Yes - Close To The Edge
[2024-03-16 20:21:21] Searching for 'Yes - Close To The Edge' in online catalog...
[2024-03-16 20:21:22] There were 29 search results.
[2024-03-16 20:21:22] No matches were found in YTM for Yes - Close To The Edge
[2024-03-16 20:21:22] No album was added to library for 'Yes - Close To The Edge'. Will not delete from uploads.

Stepping away for the evening probably. Thanks again. Whether we get it working more or not, this has already saved me lots of time. Over half of them are done.

apastel commented on June 15, 2024

Alright, I published another version 2.0.4b6. This will hopefully fix the ones where it said there were 20-something search results but then it just immediately said that it found no matches, which is odd.

There are two other situations happening that I haven't accounted for yet:

  1. Singles. For example "Smashing Pumpkins - Bullet With Butterfly Wings" is a song, but YTM doesn't have the "Single" for it -- it just has it as track 6 off the Mellon Collie and the Infinite Sadness album. I started going down the road of having it check the online catalog for songs too but that's going to add a bit more complexity so I'm holding off on it just til we get past the low hanging fruit.
  2. Inexact matches. My tool doesn't employ super intelligent matching, it just does exact comparisons, for example:
Your upload is: billy joel - greatest hits, vol. 3
Possible match: billy joel - greatest hits vol. iii
Not a match

The tool doesn't know that "iii" is the same as 3. There's also the comma in one and not the other. I just changed it to strip all symbols when doing a comparison so that should help for some of the ones where different symbols was causing it not to match.
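As an illustration of the symbol-stripping comparison described here, a minimal sketch might look like this. The function name and the exact regular expression are assumptions, not the tool's code.

```python
import re


def strip_symbols(title: str) -> str:
    """Lowercase, drop punctuation and symbols, and collapse runs of spaces."""
    lowered = title.lower()
    # Keep only word characters (letters, digits, underscore) and whitespace.
    no_symbols = re.sub(r"[^\w\s]", "", lowered)
    return " ".join(no_symbols.split())


upload = strip_symbols("billy joel - greatest hits, vol. 3")
candidate = strip_symbols("billy joel - greatest hits vol. iii")
# The comma and dash no longer matter, but "3" and "iii" still differ.
still_mismatched = upload != candidate
```

Stripping symbols handles the comma, but an exact comparison still cannot tell that "iii" and "3" mean the same volume.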

Thanks for your patience and for working with me through this. This feature probably never worked for anybody so this is really helping improve the software, especially since I don't have the means to test against a vast collection of uploads with widely varying metadata like yours.

mmartinez85 commented on June 15, 2024

Ran it again. It found a bunch more. Think I'm down to like 300 albums unmatched out of 1,300 or so I started with. When I look at the remaining ones, a lot are multi-disk sets which probably aren't listed that way on streaming. A lot have special characters in the names. And I can see looking through what's left that a good number, maybe even up to half, of these are likely not on streaming at all - either bootlegs or independent releases or who knows what kind of randomness I have in my collection. Attaching the log in case you're curious or want to keep hacking away. I renamed the other log files, so this one is just of this run.

ytmusic-deleter_2024-03-16.log

apastel commented on June 15, 2024

Alright well, knocking out 75% of your upload collection ain't bad! I've looked through a lot of the remaining ones in the log file and now we're pretty much down to the ones that fail the string comparison for the reasons you said. Some of them are unfortunate misses, like "Alice in Chains - MTV Unplugged" not being a match because YTM has it as "Alice in Chains - Unplugged". I should probably employ "fuzzy" matching so that it doesn't have to be exact but more like a 75% or better match. There are a lot of Python libraries that I can use to do that pretty easily. Maybe that will take care of a decent chunk of the remaining pile.

apastel commented on June 15, 2024

Actually I should have just been doing fuzzy matching all along instead of trying to manually "lowercase" the titles and strip out special characters. Fuzzy matching will just take care of all of that. It's the kind of thing search engines and the autocorrect on your phone use.

apastel commented on June 15, 2024

Ok, published the 2.0.4b7 version.

pip install -U --pre ytmusic-deleter
ytmusic-deleter --version
ytmusic-deleter, version 2.0.4b7

You can now optionally provide a "score cutoff" to the fuzzy matching between 0 and 100, with 100 meaning only letter-for-letter matches will be considered. The default if you don't provide a value is 75. Anything that doesn't pass this matching score will not be considered a match.

ytmusic-deleter delete-uploads -a --score-cutoff 90

If you want to be conservative to start out with, you can try a value of 90 like that. If that doesn't catch many things, you can omit the --score-cutoff to let it use 75. And then you can try lowering it even further if you want it to catch even fuzzier matches.
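To make the cutoff idea concrete, here is a small sketch that uses the standard library's difflib as a stand-in scorer. It is not the fuzzy-matching library or algorithm the tool uses, so its scores will differ from the ones in the logs; `score` and `best_match` are hypothetical names.

```python
from difflib import SequenceMatcher
from typing import Optional


def score(a: str, b: str) -> float:
    """Similarity between 0 and 100, ignoring case."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(upload: str, candidates: list[str], score_cutoff: float = 75) -> Optional[str]:
    """Return the highest-scoring candidate, or None if it falls below the cutoff."""
    best = max(candidates, key=lambda c: score(upload, c), default=None)
    if best is None or score(upload, best) < score_cutoff:
        return None
    return best
```

Raising `score_cutoff` trades missed matches for fewer wrong ones, which is why starting high and lowering it step by step is the cautious order.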

For example, for your album "Iced Earth - Gettysburg (1863) [Live] [Disc 2]", the closest match on YTM is "Alive In Athens (Live)" and that match has a score of 42. So you would have to lower the score cutoff to 42 or lower for it to add that one to your library, which you probably don't want since that's a different album. Although it's still a good live album. I'm a huge Iced Earth fan btw so that's why I'm picking that one, haha

Anyway, good luck guinea pig!

mmartinez85 commented on June 15, 2024

Did a run with 90 percent. I have mixed feelings on how it processed things. I'll start with the positives. I really liked the way it processed multiple disk sets. Those seemed to find the equivalent in streaming. I also liked the way the fuzzy algorithm is able to pick-up on certain things and consider them 100% matches. Examples:

"and" and "&" are considered a match
single quotes and double quotes around things are considered equivalent
spaces don't matter e.g. Andrew W. K. and Andrew W.K
the words "the" or "a" not counting e.g. "The Wildhearts" and "Wildhearts" are the same and "A Pleasant Shade Of Gray" and "Pleasant Shade Of Gray" are the same

This more "smart" criteria helped find some more accurate matches.

I'm having trouble understanding why it considered this a match though. I think I saw quite a few like this where if the first x characters in the string matched, it wouldn't matter what followed, and because of that I'd get inaccurate matches. I'm not sure if that's how it's working, but it's my guess right now. Looking at it again now maybe it has something to do with the quotes in the title. You'll see a few more like this in my "fails" log file.

[2024-03-17 20:27:00] Processing album: The Beatles - The Beatles' Second Album
[2024-03-17 20:27:00] Searching YT Music for albums like: 'The Beatles - The Beatles' Second Album'
[2024-03-17 20:27:00] There were 20 album results.
[2024-03-17 20:27:00] Found match: 'The Beatles - The Beatles' with a matching score of 100. Adding to library...
[2024-03-17 20:27:01] Added album to library.
[2024-03-17 20:27:01] Deleted album from uploads.

Attaching logs below... one for the full run of 90 percent. And one where I copied some that I considered failures to a different text file. I'm gonna go back and look at these myself and fix them up manually.

ytmusic-deleter - 90 percent.log
processing fails.txt

My 2 cents would be to keep the original matching algorithm (no fuzzy logic) that people can run. That one was more stringent but it didn't delete anything or find any inaccurate matches. Then maybe have an argument to run with the fuzzy logic with a disclaimer that depending on how high or low you set the score, it may mess up a few. For me the fuzzy 90 percent run found about 90 more matches and about 20 of those I don't think were accurate.

mmartinez85 commented on June 15, 2024

Separate comment just to say thanks again. If I can help test anything in the future, let me know, I think my current library has had enough abuse for now though. 90 percent match seems like a good cut-off. 😎

apastel commented on June 15, 2024

Looking through the logs, I definitely see what you mean. There's even one that should have definitely been a match but got a very low score: "Stuck Mojo - Hvy1" was not a match against "Stuck Mojo - HVY 1".

Today was basically the first time I ever looked into the world of fuzzy matching algorithms, so I guess it's no surprise it didn't perform that well. I think the main issue was my choice of using the partial_ratio algorithm which puts heavier weight on partial matches, so that causes it to get really excited when the first few words of the title are matching, even if the remaining words are way off. In my testing it seemed like that algorithm was better than the weighted ratio one but... I'm only testing against a few songs.
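The effect described here can be reproduced with a simplified partial-ratio stand-in built on difflib. This is not the library's real partial_ratio implementation, only a sliding-window approximation of the same idea, and both function names are hypothetical.

```python
from difflib import SequenceMatcher


def full_ratio(a: str, b: str) -> float:
    """Similarity of the two whole strings, 0 to 100."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def simple_partial_ratio(a: str, b: str) -> float:
    """Best full_ratio between the shorter string and any same-length window of the longer."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0.0
    width = len(shorter)
    return max(
        full_ratio(shorter, longer[start:start + width])
        for start in range(len(longer) - width + 1)
    )


upload = "The Beatles - The Beatles' Second Album"
candidate = "The Beatles - The Beatles"
partial = simple_partial_ratio(upload, candidate)  # candidate is a prefix of upload
whole = full_ratio(upload, candidate)  # penalized for the unmatched tail
```

Because the shorter title appears verbatim inside the longer one, the windowed score is perfect while the whole-string score is noticeably lower, which matches the false positive in the log above.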

What I really should do is write tests that use a huge sample size of album names and then I could run that repeatedly and see what algorithm yields the highest number of correct matches. That would be way better than my current method of just tossing a few mp3s into my test account library, waiting for them to process, then running the app and seeing what happens.
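A tiny version of such a test harness might look like the sketch below. The labeled pairs are taken from examples mentioned in this thread, the two scorers are difflib-based stand-ins rather than the tool's algorithms, and all names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable

Scorer = Callable[[str, str], float]

# (upload title, catalog title, should they count as the same album?)
LABELED_PAIRS = [
    ("Stuck Mojo - Hvy1", "Stuck Mojo - HVY 1", True),
    ("The Beatles - The Beatles' Second Album", "The Beatles - The Beatles", False),
    ("billy joel - greatest hits, vol. 3", "billy joel - greatest hits vol. iii", True),
]


def exact_scorer(a: str, b: str) -> float:
    """100 for a case-insensitive exact match, otherwise 0."""
    return 100.0 if a.lower() == b.lower() else 0.0


def ratio_scorer(a: str, b: str) -> float:
    """Whole-string similarity from difflib, 0 to 100."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def count_correct(scorer: Scorer, pairs: list, score_cutoff: float = 75) -> int:
    """Count pairs where 'score >= cutoff' agrees with the expected label."""
    return sum(
        (scorer(upload, candidate) >= score_cutoff) == expected
        for upload, candidate, expected in pairs
    )
```

Running `count_correct` for each scorer and cutoff over a much larger labeled list would give a repeatable way to compare algorithms without uploading files to a test account.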

For now I'm glad we got through a big chunk of your library, sorry about the mismatches. Thanks for your patience and helping me improve this feature a lot for future users. I'll probably take your advice and keep the straight matching algorithm around at least until I can get the fuzzy matching in a good place, which may take me a little while so I'll probably be closing this issue before that happens and I'll track that in a new issue. And most of all, thanks for the beers! Cheers 🍻
