@semanticnoodles thanks for your extensive comments. I will have a look at the enhancements you're proposing in the next days.
from ph-submissions.
Thank you, @nabsiddiqui!
@semanticnoodles will review these revisions and advise if we are ready to move onwards to the next Phase of the workflow (which will be Phase 4 Open Peer Review). Giulia is away this week, returning on June 3rd.
In the meantime, @charlottejmc and I can help with ensuring that functions and arguments are typographically consistent. These are aspects we always check as part of typesetting at Phase 6, but we'll do a quick scan now so that this isn't a distraction for Reviewers.
Hello @nabsiddiqui and @semanticnoodles,
I've made some adjustments to add backticks to functions, arguments and other parts of code, trying to stay consistent with our house style.
I confirm @rogorido and @nabsiddiqui shared with me access to their repository containing all the required files, and that I handed them over to @anisa-hawes to allow the publishing team to generate the preview, thanks.
Hello Giulia @semanticnoodles, Igor @rogorido and Nabeel @nabsiddiqui,
Many thanks for sharing the lesson submission materials with me. I've now checked the Markdown file and added some key elements of metadata. I've also checked the accompanying images and assets, ensuring each element meets our requirements.
You can find the key files here:
- .md: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md
- images: /images/visualizing-data-with-r-and-ggplot2
- assets: /assets/visualizing-data-with-r-and-ggplot2
You can review a Preview of the lesson here:
--
A few initial notes:
- I've made a slight adjustment to the Header sizes used in the lesson. Our typesetting convention is that `## Header 2` is the largest.
- I've added placeholder `alt_text` + captions for each of your images. We have committed to providing alt-text for all figure images, plots and graphs included in our lessons, so you'll need to add this as part of your revisions. These notes on Descriptive Alt text may be useful to you.
- I've checked to ensure that you both have the Write access you'll need to edit your draft directly. We ask authors to work on their own files with direct commits (we prefer you don't fork our repo, or use the Pull Request system in ph-submissions).
- I imagine Giulia @semanticnoodles may have noted this too, but I noticed that you include both a `.tsv` and a `.csv` version of the dataset, although only the `.csv` appears to be used in the lesson. Is the `.tsv` alternative required too?
@anisa-hawes Thanks for your comments. As for the tsv file: no, it is not required. It can be deleted.
I'll add the alternative captions. Thanks.
I added captions and alt texts (10a6a9e), but Nabeel should take a look whether it looks 'Englishly' enough...
Hello @rogorido and @nabsiddiqui,
here follows my preliminary feedback; I am aware it is quite extensive, but I believe these indications could help you strengthen your tutorial. If you need any clarification, please do not hesitate to ask!
Overall feedback
In general, your tutorial provides valuable guidance on navigating and producing a wide range of visualisations, effectively walking through the various features of `ggplot2`. The piece meets the accessibility and inclusivity goals of the Programming Historian fairly well, and in most cases the language is easy to understand and straightforward. However, some elements need further work, mostly falling under two intertwined aspects discussed in the following paragraphs.
Usability: Enhancing the logical structure of the lesson
In my opinion, this is the most critical point to consider. The tutorial lacks a cohesive element to tie its components together, and the organisation of the content could benefit from a more linear and less convoluted approach. The case study you propose (sister cities) seems to be just a tool to obtain a series of visualisations. This is fair enough, but it could benefit from further methodological contextualisation and unpacking: the people following your tutorial may not be historians, or may not have a clear understanding of the methods you are using, even if they are familiar with R.
In terms of improving the overall content, I think there are two possible directions for you to consider: either revising the content to follow a visualisation task-based narrative or placing more emphasis on the structure of the case study. The first option would privilege the visualisation tasks (but still require some methodological support for the case study), while the second would require you to generate stronger and sharper research questions from the case study, to be answered (at least in part) by the visualisation tasks. I think @nabsiddiqui did a very good job of structuring the content in the lesson Data Wrangling and Management in R, so I would recommend keeping that in mind as a reference.
The title of the proposal could benefit from being more specific - or at least mentioning the context of application. The table of contents looks unbalanced: the headings and their actual wording could be better aligned with the content they cover, and the nesting could be more linear.
You give very clear information about the concept of the grammar of graphics - this is really the cornerstone of understanding how `ggplot2` is designed. I really appreciate you explaining this and including many useful resources, although I think they could be arranged more organically, instead of including relatively short hints throughout the tutorial, as they tend to overshadow the walkthrough steps on several occasions.
Sustainability: Critically reviewing the data analysis narrative
The dataset looks more than adequate for the visualisation tasks you have set as objectives, but the data narrative and its wording could benefit from further tuning. What you offer in this lesson is mostly visualisation of data distributions and there is little statistical testing involved. As your topic is sister cities, it makes perfect sense to talk about relationships, although what you observe are mostly trends or tendencies that you could try to explain through further research; sometimes you clearly point that out and sometimes it looks rather implicit. I think this is just a matter of fine-tuning the language, nothing more.
Section-specific feedback
Para stands for paragraph number; please refer to the preview generated by @anisa-hawes
Introduction, Lesson Goals and Data
- Para 1, line 2: there is an extra )
- The lesson’s goals could be more specific (you could pick outcomes that have greater resonance than adding meaningful labels to plots).
- No reference to the dataset is presented here (it comes from Wikidata, right?). Make sure you include at least a couple of words about it here.
- Review the heading accordingly with the edits.
ggplot2: General Overview
- This acts more like an introductory section, although it is nested under the previous one. Bring it to the same level as the previous or put it before it to give a more comprehensive introduction (or re-arrange it for better consistency, please).
- A couple of words about the Tidyverse here would better contextualise the workflow.
- Para 7 could be added to the Additional Resources section.
- Para 8 could mention the arguments more strategically – review it for a better alignment with the walkthrough. You could even think of following the official layers featured in the introduction to ggplot2 vignette, adapting that to match the elements you thoroughly explain.
- Review the heading accordingly with the edits.
Sister cities in Europe
- Please clarify your understanding of sister cities by giving a working definition. This would clarify the starting point of your research.
- The rationale of your case needs some more unpacking; please add some context here, also about the provenance of your dataset.
- The research questions here listed are somewhat aligned with the steps you propose. I would recommend you to review them for enhanced consistency.
- Review the heading accordingly with the edits. Most importantly, from here on you start with the walkthrough. Make sure you clarify this by tuning the headings.
Loading Data with readr
- If you referenced the tidyverse above you won’t need to explain tibbles extensively here. Please review this part for conciseness.
- Including `head(eudata)` could support your explanation about the observations occurring in the dataset – this is also considered good practice in data science.
- Para 16 could benefit the previous section.
- Consider raising the level of this heading and review it accordingly.
Creating a bar graph
- IMPORTANT: There is no `typecountry` column included in your dataset. I tested the walkthrough using the data contained in the `eu` column; just remember to send us the correct version of the dataset.
- Paras 20-23 could be more focused on the walkthrough; moving para 23 up, to just after the bar plot is obtained, could enhance the clarity.
- Para 30 could use a bit more details about the interpretation of the results. If you plan
- Review the heading accordingly with the edits.
Other Geoms: Histograms, Distribution Plots and Boxplots
- Para 31, penultimate line: comma missing space afterwards.
- Para 33, please review this for clarity (here you should explain once and for all why you used log10, or move the explanation to another spot. Consider explaining why none of the methods is ideal): “This leads to an uninformative histogram. We can take `log10(dist)` as our variable or filter to exclude values above 5000kms. None of these methods is ideal, but as far as we know, we are operating with manipulated data making it less problematic”
- Para 36, please review it for clarity (the reason you employed ECDF is only implicit).
- Para 41, same issue: you refer to ANOVA without explaining why you foresee that as a viable statistical test, cutting the paragraph short.
- Review the heading accordingly with the edits.
Manipulating the Look of Graphs
- This section would more logically follow the Other Geoms section. Evaluate how to make this and the following sections more cohesive.
- Para 42 could be revised for clarity – especially the research question. Mind that you first performed the random subsampling and then explained it.
- Para 45 does not add much information to the following steps. Instead of pointing out which elements you want to manipulate, consider laying out clearly the goal for your tasks.
- Para 55, review for conciseness (sometimes less is more).
- Review the heading accordingly with the edits.
Scales: Colors, Legends, and Axes
- Para 65, please review for straightforwardness - advantage of using a continuous scale? Also a repetition in the last line (“represent the distance”).
- Para 68, review for accuracy: the way it is phrased seems like `ggplot2` does not use discrete colour scales at all.
- Para 70, would better fit in the Additional Resources section.
- Para 74, review for accuracy.
Faceting a Graph
- This section would be more logically part of the Other Geoms section and use a title anticipating also the theme changes.
- Para 75, review for clarity and conciseness (“split by categories [space time and so]” is not very straightforward. Consider explaining straightforwardly what facetting is.)
Themes: Changing Static Elements
- As the previous, this section would be more logically following the Other Geoms section.
Extending ggplot2 with Other Packages
- Para 84, extra comma not rendering the link for Ridgeline plots
- As the previous, this section would be more logically following the Other Geoms section.
Additional Resources
- Consider reviewing and incorporating other elements into this section, following more closely the tools used in the tutorial instead of pointing towards general-purpose resources. A critical list of resources would be more useful to your readers.
Format & style
Two quick comments on the form and style.
- Please homogenise the use of capitalisation in the headings (with the exception of `ggplot2`, which is always lowercase, but you know it 😄)
- Please homogenise the way you refer to R functions and arguments – using the `code format` or not, you choose. Consistency is the only requirement.
Thank you for the great work done so far!
What's happening now?
Hello Igor @rogorido and Nabeel @nabsiddiqui. Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1.
This Phase is an opportunity for you to revise your draft in response to @semanticnoodles's initial feedback. You can make direct commits to your file here: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md. @charlottejmc or I are here to help if you encounter any practical problems!
When both of you + Giulia are happy with the revised draft, we will move forward to Phase 4: Open Peer Review.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@semanticnoodles)
All Phase 1 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Authors (@rogorido + @nabsiddiqui)
Expected completion date? : May 17
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC)
Expected timeframe? : ~60 days after request is accepted
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
Hello @semanticnoodles,
I have tried to rework a lot of the tutorial. I feel that changing some of the headings will make the flow more obvious. Let me know if it makes sense the way I have done it or if there should be additional changes. Here is some of what I reviewed based on your timeline. The rest I will leave to @rogorido unless he has an objection:
Introduction, Lesson Goals and Data
- Para 1, line 2: there is an extra )
- The lesson’s goals could be more specific (you could pick outcomes that have greater resonance than adding meaningful labels to plots).
- No reference to the dataset is presented here (it comes from Wikidata, right?). Make sure you include at least a couple of words about it here.
- Review the heading accordingly with the edits.
ggplot2: General Overview
- This acts more like an introductory section, although it is nested under the previous one. Bring it to the same level as the previous or put it before it to give a more comprehensive introduction (or re-arrange it for better consistency, please).
- A couple of words about the Tidyverse here would better contextualise the workflow.
- Para 7 could be added to the Additional Resources section.
- Para 8 could mention the arguments more strategically – review it for a better alignment with the walkthrough. You could even think of following the official layers featured in the introduction to ggplot2 vignette, adapting that to match the elements you thoroughly explain.
- Review the heading accordingly with the edits.
Sister cities in Europe
- Please clarify your understanding of sister cities by giving a working definition. This would clarify the starting point of your research.
- The rationale of your case needs some more unpacking; please add some context here, also about the provenance of your dataset.
- The research questions here listed are somewhat aligned with the steps you propose. I would recommend you to review them for enhanced consistency.
- Review the heading accordingly with the edits. Most importantly, from here on you start with the walkthrough. Make sure you clarify this by tuning the headings.
Loading Data with readr
- If you referenced the tidyverse above you won’t need to explain tibbles extensively here. Please review this part for conciseness.
- Including `head(eudata)` could support your explanation about the observations occurring in the dataset – this is also considered good practice in data science.
- Para 16 could benefit the previous section.
- Consider raising the level of this heading and review it accordingly. (Felt it was better at this level)
Creating a bar graph
- IMPORTANT: There is no `typecountry` column included in your dataset. I tested the walkthrough using the data contained in the `eu` column; just remember to send us the correct version of the dataset.
- Paras 20-23 could be more focused on the walkthrough; moving para 23 up, to just after the bar plot is obtained, could enhance the clarity.
- Para 30 could use a bit more details about the interpretation of the results. If you plan
- Review the heading accordingly with the edits.
Other Geoms: Histograms, Distribution Plots and Boxplots
- Para 31, penultimate line: comma missing space afterwards.
- Para 33, please review this for clarity (here you should explain once and for all why you used log10, or move the explanation to another spot. Consider explaining why none of the methods is ideal): “This leads to an uninformative histogram. We can take `log10(dist)` as our variable or filter to exclude values above 5000kms. None of these methods is ideal, but as far as we know, we are operating with manipulated data making it less problematic”
- Para 36, please review it for clarity (the reason you employed ECDF is only implicit).
- Para 41, same issue: you refer to ANOVA without explaining why you foresee that as a viable statistical test, cutting the paragraph short.
- Review the heading accordingly with the edits.
Manipulating the Look of Graphs
- This section would more logically follow the Other Geoms section. Evaluate how to make this and the following sections more cohesive.
- Para 42 could be revised for clarity – especially the research question. Mind that you first performed the random subsampling and then explained it.
- Para 45 does not add much information to the following steps. Instead of pointing out which elements you want to manipulate, consider laying out clearly the goal for your tasks.
- Para 55, review for conciseness (sometimes less is more).
- Review the heading accordingly with the edits.
Scales: Colors, Legends, and Axes
- Para 65, please review for straightforwardness - advantage of using a continuous scale? Also a repetition in the last line (“represent the distance”).
- Para 68, review for accuracy: the way it is phrased seems like `ggplot2` does not use discrete colour scales at all.
- Para 70, would better fit in the Additional Resources section.
- Para 74, review for accuracy.
Faceting a Graph
- This section would be more logically part of the Other Geoms section and use a title anticipating also the theme changes.
- Para 75, review for clarity and conciseness (“split by categories [space time and so]” is not very straightforward. Consider explaining straightforwardly what facetting is.)
Themes: Changing Static Elements
- As the previous, this section would be more logically following the Other Geoms section.
Extending ggplot2 with Other Packages
- Para 84, extra comma not rendering the link for Ridgeline plots
- As the previous, this section would be more logically following the Other Geoms section.
Additional Resources
- Consider reviewing and incorporating other elements into this section, following more closely the tools used in the tutorial instead of pointing towards general-purpose resources. A critical list of resources would be more useful to your readers.
Format & style
Two quick comments on the form and style.
- Please homogenise the use of capitalisation in the headings (with the exception of `ggplot2`, which is always lowercase, but you know it 😄)
- Please homogenise the way you refer to R functions and arguments – using the `code format` or not, you choose. Consistency is the only requirement.
Other
- Change Title to be More Descriptive
Hello again Igor @rogorido and Nabeel @nabsiddiqui.
What's happening now?
Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit.
In this Phase, your editor Giulia @semanticnoodles will read your lesson, and provide some initial feedback. Giulia will post feedback and suggestions as a comment in this Issue, so that you can revise your draft in the following Phase 3: Revision 1.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Manager (@anisa-hawes)
All Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@semanticnoodles)
Expected completion date? : April 20
Section Phase 3 <br> Revision 1
Who's responsible? : Authors (@rogorido + @nabsiddiqui)
Expected timeframe? : ~30 days after feedback is received
Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.
Hello Igor @rogorido and Nabeel @nabsiddiqui, I hope you are doing well!
Just checking in with you about the draft revision (Phase 3 / Revision 1) as the deadline of the 17th of May has passed. If you need some extra time let me know approximately how much, so we can set up a new deadline -- and @anisa-hawes or @charlottejmc can update the Mermaid timeframe.
If you have doubts or need any clarification, please do not hesitate to get in touch.