GithubHelp home page GithubHelp logo

Comments (8)

nikolai-b avatar nikolai-b commented on August 28, 2024 1

I've added a new admin pages with the untagged issues and issues with multiple threads at /admin/home

I'll add the validation once all issues have tags otherwise it will get confusing.

I've removed the tags with no references, I'll make an issue to check this every so often.

from cyclescape.

mvl22 avatar mvl22 commented on August 28, 2024

Just to say we now have a volunteer for the tagging work, so if you can generate the data for this I can get them started.

They'll also be working on dealing with variant tagging, which would be good to clean up. Is it easy to generate a list of all tags and their frequency? That should hopefully enable us to determine similar tags.

from cyclescape.

nikolai-b avatar nikolai-b commented on August 28, 2024

To get the issues with threads where neither the issue or the thread have tags I've used:

File.open("issues_without_tags.txt", "w") do |file|
  Issue.joins(:threads).where(<<~SQL
    not exists (select 1 from issue_tags where issue_tags.issue_id = issues.id) 
    and not exists (select 1 from message_thread_tags where message_thread_tags.thread_id = message_threads.id)
  SQL
  ).distinct.select("issues.id, issues.title").find_each { |iss| file << "https://www.cyclescape.org/issues/#{iss.to_param}\n" }
end

issues_without_tags.txt

I'd like to add a validation to ensure all future message threads have tags so we don't get new ones. Given the issue might be tagged I could do something like pre-fill the new thread's tags with the issues ones if you want but also starting with a blank doesn't seem unreasonable.

For issues with multiple threads:

headers = ["Issue URL", "Tags on the issue only"]
CSV.open("issues_with_multiple_threads.csv", "w", write_headers: true, headers: headers) do |csv|
  Issue.where(id: iss_ids).left_joins(:tags).group(:id, :title).select("issues.id, title, json_agg(tags.name) as tag_names").find_each { |iss| csv << ["https://www.cyclescape.org/issues/#{iss.to_param}", iss.tag_names] }
end;

issues_with_multiple_threads.csv

Finally the top tags:

headers = ["Tag name", "Tag frequency (including issues, threads and library items)"]
CSV.open("tag_with_freq.csv", "w", write_headers: true, headers: headers) do |csv|
  Tag.top_tags_fresh(100000).map { |tg| csv << [tg.name, tg.tag_count] }
end

tag_with_freq.csv

from cyclescape.

mvl22 avatar mvl22 commented on August 28, 2024

Could we get the counts for those added to the stats page somewhere so I can keep an eye on that as we work through it at this end? I don’t mind running the queries on the DB manually from time to time to get the data output but a quick view for the three stats would be helpful.

I'd like to add a validation to ensure all future message threads have tags so we don't get new ones.

Yes, definitely. Soon as we have the current data cleaned up, we should apply that to editing also so that the constraint is in before we do the main site migration.

Finally the top tags

I see we have some tags with zero references to them. I suggest we zap those as they aren't useful and will just create autocomplete noise.

from cyclescape.

nikolai-b avatar nikolai-b commented on August 28, 2024

Issues have voting πŸ‘ and πŸ‘Ž whereas threads have favourites ⭐ . We have 912 instances where a user has πŸ‘ or πŸ‘Ž and issue. Do we just ignore these? It matters for "popular" part of the new discussions design.

from cyclescape.

mvl22 avatar mvl22 commented on August 28, 2024

Issues have voting πŸ‘ and πŸ‘Ž whereas threads have favourites ⭐ . We have 912 instances where a user has πŸ‘ or πŸ‘Ž and issue. Do we just ignore these? It matters for "popular" part of the new discussions design.

Popular discussions is simply the discussions with the most messages in.

Favourites are purely just a booking mechanism for the user. They are not used in determining popularity. Other users cannot see who has favourited what.

Issues have voting, but you can basically entirely ignore that now. That data was never very interesting and will just fade away.

from cyclescape.

mvl22 avatar mvl22 commented on August 28, 2024

To get the issues with threads where neither the issue or the thread have tags I've used

Action: Nikolai to make an admin page to help this be worked through more quickly.

from cyclescape.

mvl22 avatar mvl22 commented on August 28, 2024

To get the issues with threads where neither the issue or the thread have tags

/stats/issues_untagged

This report now has zero hits, so it would be safe to start enforcing at the database level that an issue/thread must have tags.

from cyclescape.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.