
Comments (56)

charlottejmc commented on June 14, 2024

Hello @c-goldberg,

Just to confirm:

  • You are very welcome to revise and update facial-recognition-ai-python.md directly on GitHub.
  • We'd like to ask that you only revise the Google Colab notebook on Google Drive – I have checked that you do have editor access to make your changes there directly. You can let us know when they're ready, and we'll take care of syncing a copy of the updated version to GitHub ourselves!

Thank you very much,

Charlotte ✨

hawc2 commented on June 14, 2024

@c-goldberg I've had a chance to read through your lesson and do some basic line edits. I have a few comments for you, but before asking you to make a final round of revisions, I've asked our copyeditor @charlottejmc to make some adjustments to the lesson structure.

Charlotte will:

  • Simplify the table of contents by removing the subheadings 'Broad Brushstrokes Technical Structure' that recur through the lesson, merging these codeblocks into the main sections instead
  • Align the headings in the Google Colab notebook with those in the lesson, to support reader navigation

Once Charlotte is done with that, I'll give you a final set of edits, and once you complete those, the lesson will go into copyediting and preparation for publication.

hawc2 commented on June 14, 2024

This looks great @anisa-hawes!

One minor thought for revision is to add a few more links for key technical terms and tools. For instance, there is a link for Deepface, but the Python package isn't linked the first time it's cited early in the lesson. Other complex terms like Convolutional Neural Networks (CNN) would also merit a Wikipedia citation or another reference. Especially considering this is designed to be an introductory lesson, even things like Microsoft Visual Studio Code and Google Colab should receive external link citations, if not to their Wikipedia pages, then to their product website pages.

Other than adding additional links during preparation for publication, I think this lesson is ready to go, and I'll draft the social media posts shortly.

Congrats @c-goldberg!

anisa-hawes commented on June 14, 2024

Hello @giuliataurino. Can I help you to set up this lesson?

If you email me the Markdown file, the figure images, and any data assets, I'd be happy to get them uploaded and generate the Preview for you.

c-goldberg commented on June 14, 2024

Hi @anisa-hawes, here are my updates:

  1. Yearbook links:
    1911 Yearbook
    1921 Yearbook
    1931 Yearbook
    1941 Yearbook
    1951 Yearbook
    1961 Yearbook

2a. I think it makes sense to link to the xml file in the "Lesson Setup" section. We might replace the sentence, "You can also download a .zip folder containing all of these things here," with a specific reference to just the xml file? I think we'll need to change that sentence anyway, since we are linking to each yearbook individually.

2b. I was also sent a smaller .zip of all ~70 issues in the Bethel yearbook collection (2.3 GB) by my librarian. I can send that to you for hosting by PH if you'd like. You'd just need to provide attribution back to BU.

  3. I will send the Google Drive images now via email.

Thanks!
C

giuliataurino commented on June 14, 2024

Hi both,

I'll share here my previous feedback along with additional comments.

Again, thank you @c-goldberg for your submission! I'm working on applications of computer vision to historical photographs, and I found the lesson particularly relevant for digital humanities scholarship dealing with AI and archival records.

  • Overall, the lesson makes it easy for "entry-level" digital humanists to approach computer vision, while also offering additional insight to more expert practitioners in relation to specific ML tasks for media and historical research. I would still categorize it as intermediate, since it does require some basic knowledge of Python and understanding of CV models. The choice of running the code on Google Colab improves usability and inclusivity. Accessibility and sustainability of the lesson are harder to assess, since they are dependent on a variety of external factors, but the author's thorough tutorial makes it easy to detect possible sections that might require future updates. In particular, the author's clear description and contextualization of Haar Cascades and Deepface provides insight into the "longevity" of these algorithms, as well as their positioning in the context of state-of-the-art ML models. I believe @anisa-hawes' suggestions and troubleshooting also improved the sustainability of the project.

  • As to the paragraphs and structure, the introduction (paragraphs 1-7) is well written and effective in explaining the relevance of computational methods in the context of humanistic research.

  • I particularly appreciated the paragraphs dedicated to the ethical challenges (44-48) of using pre-trained models.

  • Throughout the lesson, learning outcomes and prior technical skills are clearly stated, links to other PH lessons are well integrated in the text, and the digital methods are outlined step by step in a very organized way. The code works on my end, so I don't have any modification to suggest.

I think the lesson is ready to be shared with the reviewers. If @anisa-hawes agrees, I will move forward with the reviewing process.

Thank you both for your time and work!

Best,

Giulia

StevenVerstockt commented on June 14, 2024

In general, this is a very good tutorial for people who want to start with computer vision. The concept of AI bias is also well-explained and illustrated. The references to related work also help to understand the different aspects that are discussed in the tutorial and allow the participants to get a better understanding of them. The programming instructions are very detailed and clear, which will make it easy to follow the tutorial.

I really like the dataset that will be used. Definitely a good choice to illustrate some basic AI/computer vision concepts. Learning outcomes are also clear and are certainly doable/feasible for people with no/limited programming skills. A pose estimator could maybe also be interesting to add and study how portrait poses changed over time. I also like the use of Google Colab – in this way participants will not lose time with installing libraries etc.

"From chatbots to art generators to tailor-made Spotify playlists, artificial intelligence and machine learning — with their superhuman aptitude for pattern recognition..." >> these examples do not relate to the proposal/research topic (in my opinion) – maybe better to update/change or remove this part.

hawc2 commented on June 14, 2024

@davanstrien has agreed to review this lesson in the next month. Thanks Daniel!

@c-goldberg, stay tuned, and hopefully you can make revisions in December.

davanstrien commented on June 14, 2024

Thanks for this lesson, and apologies for the delay in getting my review submitted. Please feel free to ping me if any of my comments are unclear.

General feedback

  • Really enjoyed this lesson. I found the grounding in a specific case study very useful.
  • It could be worth adding some next steps for the kinds of historical work that can be done with the approach outlined in this lesson.
  • I've made some more detailed comments below. Sections labelled with nit are pedantic points on my part and can be ignored if you don't agree.

Notebook comments

  • Cell 1:
    • Might be worth suppressing the output of pip install (see the sketch after this list)
  • Cell 2:
    • line 8: nit: it would be nice to use f-strings for formatting
    • general comment: I think using pathlib instead of os is more beginner-friendly, but this is a matter of preference so feel free to ignore
    • line 14: again, f-string formatting would be nicer here
  • Cell 3:
    • line 3: f-strings?
    • line 10: f-string formatting would be nicer
    • line 32:
      • Don't need to wrap Exception in a tuple if it's a single exception
      • It would be nicer to catch a specific exception or mention in the lesson that catching a general Exception is usually not recommended.
      • It could also be worth catching and printing the exception, i.e.

```python
except Exception as e:
    print(f"An error happened {e}")
    continue
```
    • Cell 4:
      • line 2-11: nit: CamelCase is conventionally reserved for class names in Python; prefer underscores for functions and variables: https://peps.python.org/pep-0008/#function-and-variable-names
      • line 16: prefer an f-string
      • line 18: use for _ in dirs, since the loop variable isn't used
      • line 65: don't shadow the Python built-in dict, i.e. use another variable name
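
To make several of these style points concrete, here is a small illustrative sketch (the function, variable names, and folder path are hypothetical, not taken from the notebook):

```python
from pathlib import Path

# In a Colab cell, pip's output can be suppressed with the -q flag:
# !pip install -q deepface

def count_images(image_dir):  # snake_case rather than camelCase, per PEP 8
    image_dir = Path(image_dir)  # pathlib instead of os.path manipulation
    images = [p for p in image_dir.iterdir() if p.suffix == ".png"]
    print(f"Found {len(images)} images in {image_dir}")  # f-string formatting
    return images

results = {}  # a descriptive name; avoid shadowing built-ins such as dict
try:
    results["yearbook_pages"] = count_images("yearbook_pages")
except FileNotFoundError as e:  # catch a specific exception, not a bare Exception
    print(f"An error happened {e}")
```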

Lesson comments:

General Comments:

  • I think it would be better to pick either 'AI' or 'machine learning' as the primary term used in the lesson. I would have a preference for 'machine learning'. If you prefer to use both, even a brief discussion of what these terms usually mean would be helpful
  • AI vs "Artificial Intelligence" (make sure the abbreviation is introduced at first use)
  • captions are missing for the images
  • For the ethics section:
    • It might be good to broaden this discussion a little bit. At the moment it focuses mostly on discrepancies in performance, but it might be good to (briefly) touch on the broader implications of developing such technologies
    • I would be careful not to give people the sense that only the training data causes issues. The approach you take to training, loss functions, etc. can all lead to biases too

Section comments:

  • Introduction
    • $2: I would avoid the phrasing 'superhuman'. Perhaps it is intended to be slightly polemical as part of an intro, but it may lead readers astray in their conception of what AI/ML is.
    • $4: I think it would be clearer to refer to DeepFace as a Python library which provides access to pre-trained models
    • $5:
      • "introduction into to some of the"
      • mention that historical photos also differ in other ways from training data for most contemporary models?
      • comma after "female"
    • $6:
      • "scholars from two broad er"
      • "and/or" replace with "or"
      • "first" isn't followed by a "second"
    • $7:
      • "In the conclusion, I will suggest"
    • $8:
      • would suggest splitting learning goal two into two separate goals.
      • I would maybe just say "a historical dataset" rather than "large", since people will have different understandings of "large" and some new techniques (e.g. distributed computing) would likely be needed for very big datasets
    • $9:
      • I would clarify a bit here: a Jupyter notebook isn't necessarily run online, i.e. it can also be installed and run on a local machine.
    • $20: "Next, the code will install several Python libraries we’ll need later on. This step should take just thirty seconds or so."
    • $21 I think it would be good to go into a bit more detail with what is happening in this code or flag that it's covered in more detail later.
    • $23: use backtick formatting for os.listdir
    • $24: formatting of Python methods
    • $25: nit: use f-string formatting
    • Object Detection and Facial Recognition: Explanations [object-detection-and-facial-recognition-explanations](https://programminghistorian.github.io/ph-submissions/en/drafts/originals/facial-recognition-ai-python#object-detection-and-facial-recognition-explanations)
    • $28: would be good to explain somewhere why you don't directly work with PDFs
      • I would use the term "machine learning" here.
    • $30: nit: remove the word 'simply'
    • $31:
      • make it clearer you are talking only about RGB images here
      • correction: (0,0,255) is blue
      • nit: thresholding is one form of simplification. Might be better to switch the order of the sentence and say something like: "Simplifying images, for example through thresholding" (see the short sketch after these comments)
    • $33:
      • replace pixel hue intensities with "pixel colour values".
      • I think it would be good to break this down and explain that here you are talking about a supervised approach.
      • During the training process a model also learns to 'recognise' shapes and patterns rather than just similarities in RGB values (this is also a differentiator between just taking the mean pixel values for a bunch of images of 8s and non-8s and using this for prediction)
    • $34: change "certainty" to "probability"
    • $35: "Haar Cascades can simplify the", switch "simplify" for "reduce" or similar.
    • $36:
      • I would maybe not say OpenCV is dated since this implies it is not actively developed.
      • not sure why you use 'so called' deep learning. "so called" can have a negative/skeptical connotation so I think you should either make an argument here if you think the term shouldn't be used or remove the 'so called'
    • $43: switch 'obviating' for 'remove'?
      • It might also be good to hedge this claim a bit, since many CV methods can also be run on lower-powered devices for inference purposes
    • $44: I would say "some algorithms" here rather than all algorithms. This is also part of the issue i.e. it's a choice of algorithm not just a choice of training data that can lead to biases
    • $48: for the last section you suggest a broader labelling scheme might help here, but if this relies on annotation you are now ascribing race to people based on appearance, which could itself lead to issues (especially if the annotators are also not diverse)
    • $52: are the original images also not in B&W?
    • $55: I am not sure I understood what you were saying here? For the most part advances in image classification/object detection have relied on supervised deep learning (at least for pre-training a model). Do you mean here that the features are not hand coded?
    • $55: suggest replacing node with "neuron" since this terminology is more widely used
      • might be useful to explain the different types of features learned by layers in a network
      • Deep learning: I think it's important to clarify how the algorithm makes adjustments. I don't think you need to give a full explanation, but at the moment it reads a little bit like the algorithm is entirely unsupervised rather than just the feature generation part of the process.
    • $57: it might be good to mention here that smiling and other facial cues for emotion are also not universal?
    • $69: It could be worth mentioning here that the value of closely looking at images is not removed. Firstly, if a historian wants to do a novel task with CV they may want/need to train their own model, which will still require annotation. Besides this, looking at many images "by hand" is still likely to be a worthwhile effort for better understanding a corpus.
    • $71 might be worth also mentioning https://huggingface.co/models?pipeline_tag=object-detection&sort=trending as a source of pretrained models.
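
As a concrete illustration of the thresholding point in $31, here is a minimal OpenCV sketch (the filename and threshold value are hypothetical):

```python
import cv2

# Convert a page image to grayscale, then binarise it: pixels above the
# threshold (127) become white (255), everything else black (0). This is
# one simple way of reducing an image's complexity before detection.
img = cv2.imread("yearbook_page.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
cv2.imwrite("yearbook_page_binary.png", binary)
```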

c-goldberg commented on June 14, 2024

Hi @giuliataurino,

Two questions:

  1. For revisions to the tutorial text, should I simply update facial-recognition-ai-python.md on github? Or send along a revised markdown file with a different name?
  2. Similarly, for revisions to the Colab notebook, should I update facial_recognition_ai_python.ipynb, or make a new notebook?

Thanks!

C

c-goldberg commented on June 14, 2024

Hello! Thanks so much, @charlottejmc. After failing to get my GitHub Desktop to clone the repo properly, I'd like to upload my revised .md file here if that's ok? I'm worried I might mess something up.
PHGoldbergEdited.md

I've also updated the Colab notebook. If someone could verify that the changes I've made are visible, that'd be great.

Thanks to everyone for their patience. Please let me know if I can revise anything further.

Charlie

charlottejmc commented on June 14, 2024

Hi @c-goldberg,

  • Thank you for attaching the markdown file. I've now updated the lesson with your changes, which you can review in this commit.
  • Unfortunately, I don't see the changes you made to the Google Colab notebook, but I've sent you an email with more details to help us find a solution together!

hawc2 commented on June 14, 2024

@c-goldberg now that you've completed your revisions, @giuliataurino will take one last look over the lesson, make sure you have adequately responded to the reviewer feedback, and then let you know if there are any additional edits.

Once @giuliataurino signs off on it, she'll send the lesson to me as Managing Editor for final review. I'll look it over, give you an additional round of feedback to standardize the lesson for ProgHist, and then once I approve those changes the lesson will be sent to our copyeditor. Thanks for your patience, and stay tuned for next steps!

giuliataurino commented on June 14, 2024

Hi @hawc2,

Thank you for your patience.

I confirm the author addressed the feedback and that the lesson is ready to be published. I don't think any additional edit is needed. As seen above, here is the link for your final review: https://programminghistorian.github.io/ph-submissions/en/drafts/originals/facial-recognition-ai-python#introduction.

Let me know if anything else is needed on my end.

Best,

Giulia

charlottejmc commented on June 14, 2024

Thank you @hawc2, I've now implemented these changes in the lesson file and the Google Colab notebook.

@c-goldberg, please do let me know if you're happy with my edits! For the two sections ## PDF Conversion and ## Processing the Images, I decided to move the full code blocks up to the beginning, before letting your commentary run through them more closely below.

I also added in the two lines of code needed for the ## Download and Results section.

Thank you for your patience with this. ✨

c-goldberg commented on June 14, 2024

Hi Alex, thanks for these edits! I've completed my revisions and pushed the changes (I hope correctly!).

One note about terminology: I'm favoring keeping both AI and ML in the lesson, mainly because I think "Artificial Intelligence" in the title would make the lesson more visible, so I've tried to be more consistent with abbreviations and describing the relationship between the two.

Please let me know if I can do anything else!

hawc2 commented on June 14, 2024

Perfect, thank you @c-goldberg. All your edits look great. @anisa-hawes @charlottejmc once you wrap up any last copyedits, this lesson should be ready for moving over to the Jekyll site for publication!

anisa-hawes commented on June 14, 2024

Hello again @c-goldberg. I've committed the edits on your behalf: 0ca541c.

If you and @hawc2 are both happy, we'll move forwards to copyediting by @charlottejmc. Although Charlotte assisted with revisions to the Broad Brushstrokes section and lesson structure (following Alex's suggestions), we haven't done a full copyedit yet, so we will set to work on that next week. You'll also have an opportunity to discuss any copyedits we suggest with @giuliataurino and @hawc2.

Finally, we will coordinate a series of final tasks including: typesetting, generating archival links, collating copyright agreements, reviewing and gathering essential metadata.

Then we'll move forwards to publication! ✨

We are grateful for your patience and collaboration.

charlottejmc commented on June 14, 2024

Hello @c-goldberg,

This lesson is now with me for copyediting. I aim to complete the work by Friday 31 May.

@c-goldberg, please note that you won't have direct access to make further edits to your files during this phase.

Any further revisions can be discussed with your editor @giuliataurino after my copyedits are complete.

Thank you for your understanding.

anisa-hawes commented on June 14, 2024

Hello @c-goldberg,

Thank you for reviewing Charlotte's copyedits. We're delighted that you're happy with the lesson.

The final stage for us ahead of publication is a series of tasks to support sustainability + accessibility including: typesetting, generating archival links, collating copyright agreements, and reviewing essential metadata. There are two things we need your and Zach's input on, and we'll point to those in our checklist below.

--

Hello @hawc2,

This lesson's sustainability + accessibility checks are in progress.

  • Preview:

http://programminghistorian.github.io/ph-submissions/en/drafts/originals/facial-recognition-ai-python

Publisher's sustainability + accessibility actions:

  • Copyediting
  • Typesetting
  • Addition of Perma.cc links
  • Check/resize images
  • Check/adjust image filenames
  • Receipt of author(s) copyright agreement (EN declaration form).
  • Request doi
  • Add all contributor names to our list of Unique Contributors
  • Add lesson slug to our Annual Count of Published Lessons

Hello @c-goldberg,
Our authorial copyright declaration form is an opportunity to acknowledge copyright and grant us permission to publish the lesson. For lessons that are co-authored/translated, we only require one lead author/translator to complete the form. Could you download this, complete the details, and email it to Charlotte (publishing.assistant [@] programminghistorian.org)? Many thanks.

Authorial / editorial input to YAML:

  • Define difficulty:, based on the criteria set out here
  • Define the research activity: this lesson supports (acquiring, transforming, analysing, presenting, or sustaining). Choose one
  • Define the lesson's topics: (api, python, data-management, data-manipulation, distant-reading, get-ready, lod ["Linked Open Data"], mapping, network-analysis, web-scraping, website ["Digital Publishing"], r, machine-learning, creative-coding, or data-visualization). Choose one or more, and let us know if you'd like us to add a new topic. Topics are defined in /_data/topics.yml. (An illustrative YAML sketch follows the author bios below.)
  • Provide alt-text for all figures

Hello @c-goldberg, I've noticed that the figure images are still missing 'alt-text'. This descriptive element enables screen readers to read the information conveyed in the images for people with visual impairments, different learning abilities, or who cannot otherwise view them, for example due to a slow internet connection. It's important to say that alt-text should go further than repeating the figure captions. Could you please replace the placeholder (which currently says 'Visual description of figure image') with these short descriptions? Please let me know if you'd like any additional guidance with this.

  • Provide a short abstract: for the lesson
  • Agree an avatar (thumbnail image) to accompany the lesson

The image must be:

  • Provide avatar_alt: (visual description of that thumbnail image)
  • Hello @c-goldberg and Zach, could you help us by providing your short (1 sentence) author bios using this template:
- name: Charles Goldberg
  team: false
  bio:
    en: |
      Charles Goldberg is an Associate Professor of History in the Department of History, Philosophy, and Political Science at Bethel University in Saint Paul, Minnesota, USA.
- name: Zach Haala
  team: false
  bio:
    en: |
      Zach Haala graduated with a bachelor’s degree in Software Engineering and Digital Humanities from Bethel University in 2023. He is a Business Systems Analyst at Optum.
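
For the YAML fields listed above, an illustrative sketch (these values are placeholders to be confirmed by the author and editors, not final metadata):

```yaml
difficulty: 2                         # per the difficulty criteria linked above
activity: analysing                   # one of: acquiring, transforming, analysing, presenting, sustaining
topics: [python, machine-learning]    # one or more, defined in /_data/topics.yml
abstract: |
  A short summary of what the lesson teaches.
avatar_alt: Visual description of the thumbnail image
```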

Files we are preparing for transfer to Jekyll:

Promotion:

  • Prepare announcement post (using template)
  • Prepare evergreen posts – @hawc2, could you please provide x2 posts for future promotion via our social media channels? You can add these directly to the ph-evergreens-twitter-x spreadsheet that has been shared with you, or email them to Charlotte at publishing.assistant[@]programminghistorian.org.

charlottejmc commented on June 14, 2024

Hi @hawc2, here are the links I've added now (perma.cc in the text):

I've chosen not to add a Wikipedia link for terms that are defined by the authors within the lesson itself ('thresholding', 'semantic gap'...), but I can always add these in if you feel it is needed.

@c-goldberg and Zach, as well as linking to packages/products/software documentation, I have also added some Wikipedia links (as we do regularly in all our lessons) to provide readers with quick access to definitions of terms and concepts that are mentioned. We treat Wikipedia as an accessible dictionary rather than as sources/references, so we don't add these links to your bibliography. They are usually available in all of our languages too, which makes it a sustainable resource across our translations. However, if you'd like to suggest alternative links instead, please do let me know.

anisa-hawes commented on June 14, 2024

Hello, @c-goldberg,

Thank you for your note. I have some queries about the files which accompany this lesson:

  • In the Lesson Setup section, you direct readers to a .zip folder (containing the x6 Yearbook .pdf files + x1 .xml file). Ideally, we would like to host any assets which are essential to the lesson within the PH infrastructure. This is about ensuring we can take responsibility for the lesson's sustainability.

    • I understand that the 6x Yearbooks from Bethel University's Digital Library are provided simply as samples to test the code on. If these are publicly available, could we provide readers with direct links to the selected objects within Bethel University's Digital Library? Or shall we find a way to host this sample set?
    • If I understand correctly, the .xml file contains the pre-trained model(?) If so, we could upload this to our repo. Is it the case that readers who are choosing not to work within the Google Colab environment would also be able to use this?
  • At several points in the lesson, you refer to other sample images which are stored in a Google Drive folder. Again, for reasons of sustainability, we would like to host any accompanying assets/images within our infrastructure. Would you ideally like these images to be visible in the lesson, or is your intention to point to them only? I think it would be good to provide full citations for them, so that readers have a clear understanding of where these sample images are from.

  • The images which are included will need alt-text so that they are accessible to people using screen-readers. I've plotted in some template text for each one. The syntax we use is formatted:

{% include figure.html filename="or-en-facial-recognition-ai-python-01.png" alt="Visual description of figure image" caption="Google Colab file hierarchy" %}

You can find a full preview of the submission here:
http://programminghistorian.github.io/ph-submissions/en/drafts/originals/facial-recognition-ai-python

--

Key files are here:

--

In the meantime, I've set up your .ipynb within our organisational Colab repository, and shared a link with you so that we can work on it together. (Could we usefully upload the .xml file to the notebook's Files?)

c-goldberg commented on June 14, 2024

Thanks so much, Anisa. Here are some replies to your queries. I will wait for confirmation from you before I change anything here.

  • Yearbook PDF storage: I've reached out to Bethel's archivist about this, and she would prefer readers access the larger dataset through their site. The total dataset is quite large (16 GB), however, so I'm waiting to see if this is still preferable to them. Would a 16 GB .zip be too large for PH?
  • XML storage: The haarcascade_frontalface_default.xml file is actually an OpenCV pre-trained facial detection model; it is available on their GitHub page (a short sketch of how the file is used appears below this list). It would seem ideal to also host a copy on the PH servers. The copyright info at the top of the file seems to indicate redistribution is permitted, so this sounds like the way to go.
  • Image storage: I am a-ok with you hosting these images. I can start in on uploading these and providing alt-text. This might take me a couple of days since I'm still somewhat new to this workflow.
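
To illustrate the XML storage point above: OpenCV loads the .xml file as a pre-trained detector, roughly as follows (a minimal sketch; the image filename and parameter values are hypothetical):

```python
import cv2

# Load the pre-trained Haar Cascade face detector from the .xml file.
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

# Detection runs on a grayscale version of the image.
img = cv2.imread("yearbook_page.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"Detected {len(faces)} faces")
```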

anisa-hawes commented on June 14, 2024

Hello @c-goldberg.

Thank you for your notes, and for your reply to my email too.

  1. It would be ideal to provide links to the Yearbook datasets on Bethel's website. Is it possible to share direct URLs for each of the sample objects you've selected? We can create stable (archival) links to each of the PDFs. If you can share the links to the Yearbooks here, I'd be happy to add them to the Markdown file for you:
  • 1911 [Yearbook](add link here)
  • 1921 [Yearbook](add link here)
  • 1931 [Yearbook](add link here)
  • 1941 [Yearbook](add link here)
  • 1951 [Yearbook](add link here)
  • 1961 [Yearbook](add link here)
  2. I've uploaded the .xml file to our assets repository: /assets/facial-recognition-ai-python/haarcascade_frontalface_default.xml. Where do you propose we add in a download prompt/link within the lesson? Let me know, and I can add this.

  3. Please send me the images from your Google Drive by email. I can take care of processing (re-sizing and re-naming according to our conventions) and uploading these. I'll slot in placeholder text for the alt, so you can come back to this.

Thank you for your collaboration and patience.

Very best,
Anisa

anisa-hawes commented on June 14, 2024

Hello @c-goldberg. Just noting here that I've saved a copy of the Colab notebook in our assets repository. My suggestion is that if we link to the notebook in the lesson, we link to this .ipynb asset. Readers then have the option to click the Open in Colab button if they want to.

If you need to make adjustments to the notebook during the editorial and review process, we can resave so that the two are in sync.

anisa-hawes commented on June 14, 2024

Thank you, @c-goldberg. I'm really happy with the solution of linking to the Yearbooks in Bethel's library.

I'll make that adjustment to the Setup section. I appreciate your advice here.

anisa-hawes commented on June 14, 2024

Hello @c-goldberg.

Thank you for sharing the additional images from your Google Drive. I've added these to the set already in our /images repository: b6a31c7.

I've replaced the links to Google Drive with formatted liquid syntax so that these images will display as part of your lesson: 64fb734. I've added in placeholder text for alt- and captions, which you can add when you have a moment. (I'm happy for you to email me the alt- and captions, or post a comment with the information here in the Issue if you'd like me to add that in for you).

A couple of further adjustments:

  • Figure 6 (the .mp4) doesn't display. I wasn't sure if it would... I remembered that we have .gif animations elsewhere on our site so I have converted this file. Let me know what you think! A preview of the submission is available here: http://programminghistorian.github.io/ph-submissions/en/drafts/originals/facial-recognition-ai-python

  • I have adjusted lines 56-67 (77ee747). First, I've added in a link to the .ipynb file in our /assets repository at line 56. Second, I have adjusted the following sentence at line 58 and directed readers to a list of files for direct download (the Bethel Yearbooks, the .xml file and the .ipynb) if they want to work in their local development environment or use a service other than Colab. I'm aware that we may also need to reconsider the wording through the following paragraphs, to ensure the steps are clear and correct considering our changes to the file locations, and I welcome your advice!

  • One query: at line 457, where you write:

If you'd like to run the experiment on a larger set of yearbooks, click here.

The file you direct to is a line graph showing Non-smiles and Smiles (smiles.png in the set you sent me). Did you intend to use another link here?

  • One note: I think the sentence at line 55 might need adjustment. We want readers to work through this lesson to learn about the context, concepts and application of this method. Colab is one of the tools they can choose to work with, and you’re facilitating that by setting up a notebook that is ready to use and providing guidance notes throughout. But many of our readers choose to work in their local development environments, even though (as you explain) it can be challenging to configure. And others will choose different cloud-hosted development environments.

c-goldberg commented on June 14, 2024

Thanks, Anisa. Yes, I did mean to supply a different link, specifically the larger .zip of all of the yearbooks. Should we still pursue hosting that file, or, now that we're providing direct links to each issue of the test set, should we just direct this link to BU's larger yearbook collection?

I'm fine rewording that section. Broadly, how should I re-frame that? Assume readers are using the Colab notebook in conjunction with the tutorial?

anisa-hawes commented on June 14, 2024

Thank you, @c-goldberg. Apologies for the delay in replying to your message.

Yes, I don't think we need to pursue hosting the Yearbooks further as we now have direct links to the sample you've selected in Bethel's Library. I agree that it would be sensible to add a link to the broader Yearbook collection at Bethel at line 457. Would this be the link you'd suggest? https://cdm16120.contentdm.oclc.org/digital/collection/p16120coll2 If you agree, we could adjust that sentence so it reads:

If you'd like to run the experiment on a larger set of yearbooks, you can browse Bethel Yearbooks in their Digital Library.

For the Lesson Setup paragraph, how about:

This lesson is accompanied by [a Google Colab notebook](we'll add a link here) which is ready to run. I have pre-loaded the sample dataset and all the code you'll need. The [Preliminary Colab setup section](we'll add a link here) will help you to orientate yourself within Colab if you're new to the platform. Alternatively, you can download the following files and run the code on your own dedicated Python environment [...]

c-goldberg commented on June 14, 2024

Thanks, Anisa. Both of those suggestions look good to me.

anisa-hawes commented on June 14, 2024

Thank you, @c-goldberg. I've made those two adjustments: f38c53e.

Hello @giuliataurino. This lesson is ready for your initial edits. These notes on Editorial Considerations can help to guide your reading and feedback. Meanwhile, the Difficulty Matrix (lower down on the same page) can support your thinking about whether this lesson is best categorised as a beginner, intermediate or advanced.

anisa-hawes commented on June 14, 2024

Excellent! Thank you, @giuliataurino.

Hello @c-goldberg. Next: Giulia will update you here in the Issue with an introduction to the peer-reviewers who will be reading and responding to your lesson.

I'm here to support you at any time through the process if you have questions.

giuliataurino commented on June 14, 2024

Thank you, @StevenVerstockt, for taking the time to review this submission!

@c-goldberg, let me know if you have questions while I look for a second reviewer.

Best,

Giulia

hawc2 commented on June 14, 2024

@c-goldberg just a brief update: Giulia is going on leave, so @anisa-hawes and I will be stewarding this lesson through the rest of the review process. We're in the process of finding the second reviewer; hopefully I can report back to you soon. Thanks for your patience!

anisa-hawes commented on June 14, 2024

Thank you for this thorough and thoughtful review, @davanstrien!


Hello @c-goldberg,

Now that we have received both reviews, I will read through both Steven and Daniel's comments and prepare a summary of their suggested revisions, so that you have a practical plan to move forwards.

I'm looking forward to working with you to shape this lesson for publication.

Very best, Anisa

giuliataurino commented on June 14, 2024

Hi @c-goldberg,

I'm back to work and just wanted to make sure you have received the reviews.

In addition to the feedback I previously shared, here is a summary of the reviewers' general suggestions (please refer to their comments above for more detailed changes to the text) to facilitate your work:

  • Consider changing the phrase "From chatbots to art generators to tailor-made Spotify playlists, artificial intelligence and machine learning — with their superhuman aptitude for pattern recognition — become more ubiquitous by the day" to reference examples from the field of facial recognition;

  • Consider using 'machine learning' as a more specific term for the methods you are describing (as opposed to AI which is more generic). Alternatively, please clarify in a brief paragraph the relation between ML and AI (as in ML is a branch of AI) for non-expert audiences. For abbreviations, please introduce them when first mentioning machine learning (ML) and artificial intelligence (AI);

  • One reviewer suggested broadening the discussion on bias by discussing not only the training sets used in ML, but also the data used for fine-tuning. This is a good point and might enrich the discussion you already bring up in the lesson. For instance, it might be a good idea to discuss the benefits, challenges and potential bias of using the corpus you selected;

  • Please make sure you have addressed Anisa's points, as follows:

"One query: at line 457, where you write:

If you'd like to run the experiment on a larger set of yearbooks, click here.

The file you direct to is a line graph showing Non-smiles and Smiles (smiles.png in the set you sent me). Did you intend to use another link here?

  • One note: I think the sentence at line 55 might need adjustment. We want readers to work through this lesson to learn about the context, concepts and application of this method. Colab is one of tools they can choose to work with, and you’re facilitating that by setting up a notebook that is ready to use and providing guidance notes throughout. But many of our readers choose to work in their local development environments, even though (as you explain) it can be challenging to configure. And others will choose different cloud-hosted development environments."

Feel free to send me your questions should you have any doubts on the reviews and modifications to the text.

Best,

Giulia

hawc2 commented on June 14, 2024

@c-goldberg just checking in, will you be able to make these revisions before end of January?

giuliataurino commented on June 14, 2024

Hi @c-goldberg,

Thank you for the update. Let me know if you have any questions about the revisions.

Best,

Giulia

giuliataurino commented on June 14, 2024

Hi @c-goldberg,

I'll let @anisa-hawes confirm this, but you can revise and update facial-recognition-ai-python.md and facial_recognition_ai_python.ipynb directly on GitHub.

Let me know if you have other questions.

Best,

Giulia

hawc2 commented on June 14, 2024

Thanks @charlottejmc for these excellent copy edits. The code looks much better integrated into the broader commentary of the lesson.

@c-goldberg I have a few remaining comments for final revisions, each of which should only require a few clarifying sentences here and there. Once you do this last round of edits, we'll move forward with publication.

  • The dataset used in this lesson could be described and explained more explicitly. It’s not till the ethical issues section that this is mentioned: “As a historically Christian institution that initially drew students primarily of Swedish Baptist background from the upper Midwest, the early decades of Bethel University's yearbook contain photos of predominantly White, male faces…” Ideally this would be mentioned in some general way earlier on in the lesson. Throughout the lesson, when you bring up the yearbooks, it would be best to give them an adjective like “American high school yearbooks” to signify in more detail what they are. Think of this lesson being read by a global audience.
  • On the topic of data, I should also note this was brought up as a concern by the reviewers, in a similar way. There was an interest in seeing you broaden the discussion on bias by discussing not only the training sets used in ML, but also the data used for fine-tuning. I agree it would help to briefly discuss the benefits, challenges and potential bias of using the corpus you selected. We always like to see Programming Historian lessons explore the significance of a method by detailing the specific problems and affordances of the type of data used in the analysis.
  • In the second paragraph, when you talk about AI, beginning “Advances in artificial intelligence (AI) raise the promise of asking new things from photographs,” is there a way to mention the obvious critical perspective we need to take, so as not to let technology designed for police surveillance color our view of the past? At the least, let’s find a way to also cite the risks involved with the “immense promise” offered by AI in this section. There’s not really any mention of bias here in the intro, so even a brief discussion of it would help, with a reference to how you’ll explain it in more detail later in the lesson. We can link to the section where it’s discussed further in relationship to ethics. When you bring it up in the intro, it's worth mentioning it's not just an ethical question, but also an issue of the basic reliability and validity of this type of algorithm being used for this sort of dataset.
  • There was also a comment by a reviewer I wasn’t sure had been fully addressed, and just wanted to flag again for consideration: “Consider using 'machine learning' as a more specific term for the methods you are describing (as opposed to AI which is more generic). Alternatively, please clarify in a brief paragraph the relation between ML and AI (as in ML is a branch of AI) for non-expert audiences. For abbreviations, please introduce them when first mentioning machine learning (ML) and artificial intelligence (AI)”
  • Finally, the download and results section really skims over the analysis stage - I’d like to hear a little more about how this analysis confirms the hypothesis. Otherwise, the conclusion is very good.

Thanks for making these final revisions - the lesson is looking really compelling! Once you make these changes, we’ll prep for publication.

c-goldberg commented on June 14, 2024

Actually, I'm thinking my changes did not push correctly. I've uploaded the file here. Can someone help? Thanks!
PHGoldberg3.md

anisa-hawes commented on June 14, 2024

Hello @c-goldberg. Yes, of course. I can help 🙂

charlottejmc commented on June 14, 2024

Hi @c-goldberg,

I've had a preliminary look through the various archives for potential avatar thumbnails that match the lesson theme. How do you feel about these options? (We would crop out any parts of text.)

Just suggestions! Please do let me know if you find an image you'd prefer using.

charlottejmc commented on June 14, 2024

Hello @c-goldberg,

I thought I would provide some extra guidance on the alt-text, if you feel it is helpful.

We have found Amy Cesal's guide to Writing Alt Text for Data Visualization useful. This guide advises that alt-text for graphs and data visualisations should consist of the following:

alt="[Chart type] of [data type] where [reason for including chart]"

What Amy Cesal's guide achieves is prompting an author to reflect on their reasons for including the graph or visualisation. What idea does this support? What can a reader learn or understand from this visual?

The Graphs section of Diagram Center's guidance is also useful. Some key points (relevant to all graph types) we can take away from it are:

  • Briefly describe the graph and give a summary if one is immediately apparent
  • Provide any titles and axis labels
  • It is not necessary to describe the visual attributes of the graph (colour, shading, line-style etc.) unless there is an explicit need
  • Often, data shown in a graph can be converted into accessible tables

For general images, Harvard's guidance notes some helpful principles. A key point is to keep descriptions simple, and adapt them to the context and purpose for which the image is being included.
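
For example, applying that template to the smiles line graph discussed earlier in this thread might give something like the following (a hypothetical draft, not final wording):

alt="Line graph of the number of smiling versus non-smiling faces detected in the yearbook portraits over time, included to show how the proportion of smiles changes across the decades"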

Would you feel comfortable making a first draft of the alt-text for each of the figures? This is certainly a bit time-consuming, but we believe it is very worthwhile in terms of making your lesson accessible to the broadest possible audience. We would be very grateful for your support with this.

Thank you!

anisa-hawes commented on June 14, 2024

Hello @hawc2,

This lesson's sustainability + accessibility checks are now complete. It is ready for your final read-through ahead of publication.

  • author(s) bio for ph_authors.yml
- name: Charles Goldberg
  team: false
  bio:
    en: |
      Charles Goldberg is an Associate Professor of History in the Department of History, Philosophy, and Political Science at Bethel University in Saint Paul, Minnesota, USA. 
- name: Zach Haala
  team: false
  bio:
    en: |
      Zach Haala graduated with a bachelor’s degree in Software Engineering and Digital Humanities from Bethel University in 2023. He is a Business Systems Analyst at Optum.

Promotion:

  • Template announcement posts (prepared by Charlotte)
  • Prepare evergreen posts – @hawc2, could you please provide x2 posts for future promotion via our social media channels? You can add these directly to the ph-evergreens-twitter-x spreadsheet that has been shared with you, or email them to Charlotte at publishing.assistant[@]programminghistorian.org.

anisa-hawes commented on June 14, 2024

Super! Many thanks to you all!

I'll stage this for publication next week 🎉
