andrewsu / mentorship-survey-analysis
License: MIT License
Right now, the script produces many errors like this:
Q64:Please reflect on your mentoring experience. How can the mentorship climate at Scripps Research be improved for mentors?
Traceback (most recent call last):
File "/home/asu/Science/mentorship-survey-analysis/main.py", line 156, in plot_bar_charts
plottable_df.plot.barh(stacked=True, ax=axes[i])
File "/home/asu/env/mentorship-survey-analysis/lib/python3.10/site-packages/pandas/plotting/_core.py", line 1222, in barh
return self(kind="barh", x=x, y=y, **kwargs)
File "/home/asu/env/mentorship-survey-analysis/lib/python3.10/site-packages/pandas/plotting/_core.py", line 975, in __call__
return plot_backend.plot(data, kind=kind, **kwargs)
File "/home/asu/env/mentorship-survey-analysis/lib/python3.10/site-packages/pandas/plotting/_matplotlib/__init__.py", line 71, in plot
plot_obj.generate()
File "/home/asu/env/mentorship-survey-analysis/lib/python3.10/site-packages/pandas/plotting/_matplotlib/core.py", line 446, in generate
self._compute_plot_data()
File "/home/asu/env/mentorship-survey-analysis/lib/python3.10/site-packages/pandas/plotting/_matplotlib/core.py", line 632, in _compute_plot_data
raise TypeError("no numeric data to plot")
TypeError: no numeric data to plot
no numeric data to plot
These errors are expected given the sample data file and do not indicate that anything is actually going wrong, but their appearance in the console output may confuse users who do not know the reason. It would be nice to handle these errors more gracefully.
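One possible way to handle this, sketched under assumptions: wrap the plotting call in a small helper that catches only the "no numeric data to plot" TypeError, logs a one-line note, and lets every other exception propagate. The helper name `plot_or_skip` and the logger name `survey_report` are hypothetical and do not come from main.py.

```python
import logging

logger = logging.getLogger("survey_report")  # hypothetical logger name


def plot_or_skip(plot_fn, question_label):
    """Run plot_fn(); return True if it plotted, False if there was nothing numeric to plot."""
    try:
        plot_fn()
    except TypeError as exc:
        # Only swallow the specific, expected pandas error; re-raise anything else.
        if "no numeric data to plot" not in str(exc):
            raise
        logger.info("Skipping %r: no numeric responses to plot", question_label)
        return False
    return True
```

In plot_bar_charts, the call from the traceback could then be wrapped as `plot_or_skip(lambda: plottable_df.plot.barh(stacked=True, ax=axes[i]), question)`, where `question` stands for whatever label the script already has for the current column.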
- "Department/Org Level 1", "Division/Org Level 2", and "Strategic Unit/Org Level 3" describe the levels of aggregation as we move up the org chart.
- "Q1:Gender Identity - Selected Choice" and "Q5:Citizenship status" define the demographic groups. A report for each of these groups will be created only when it meets the threshold that ensures anonymity.
- Columns such as "Q6:How long have you been with Scripps Research?" represent the survey data to be summarized in a report.
- Questions answered on an ordered scale ("Strongly agree", "Agree", ... , "Strongly Disagree") should be shown as stacked bars. The vast majority of data should be in this format.
- Multi-select questions (e.g., "Q13B:What methods do you use to communicate with your mentor? (Check all that apply) - Selected Choice", example value is "In-person, one-on-one meetings,Group meetings,Email") should be shown as a bar chart showing the percentage of respondents who selected each response.
- Free-text questions (e.g., "Q54_1_TEXT:Is there something that you experienced working with previous mentors that you wish was also done with your current mentor? - Yes (please explain): - Text"): all answers should be presented in a simple text box.
This section will be broken out into individual tickets.
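For the "check all that apply" questions, the percentage of respondents per choice could be computed along these lines. This is a minimal sketch, not the script's actual code. It assumes that choices are joined by a comma with no following space, while a comma inside a single choice (as in "In-person, one-on-one meetings") is followed by a space; that assumption is inferred from the one example value above and may not hold for every question. The function name `selection_percentages` is hypothetical.

```python
import re
from collections import Counter


def selection_percentages(responses):
    """Map each choice to the percentage of answering respondents who selected it."""
    # Respondents who left the question blank are excluded from the denominator.
    answered = [r for r in responses if r and r.strip()]
    counts = Counter()
    for response in answered:
        # Assumption: a separator comma is NOT followed by a space.
        choices = {c.strip() for c in re.split(r",(?! )", response) if c.strip()}
        counts.update(choices)
    total = len(answered)
    if total == 0:
        return {}
    return {choice: 100.0 * n / total for choice, n in counts.items()}
```

Using a set per response means a choice is counted at most once per respondent, so the percentages can sum to more than 100 across choices but never exceed 100 for any single choice.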
GRAD PROG - STUDENTS
sample_data.txt to include more realistic counts for demographic groups to test inclusion/exclusion of reports as outlined in this analysis spreadsheet
Suppose we have a report generated for a specific lab, say "NEURO LAB 1". For many questions, we will have responses that correspond to a scale, e.g., 'Strongly agree', 'Somewhat agree', 'Neither agree nor disagree', 'Somewhat disagree', 'Strongly Disagree'. Those are currently ordered and visually displayed in a bar chart.
In this issue, I propose also calculating a numeric score from the responses to a given question. We might do this by assigning a score for each answer, e.g.,
'Strongly agree' = +2
'Somewhat agree' = +1
'Neither agree nor disagree' = 0
'Somewhat disagree' = -1
'Strongly Disagree' = -2
The responses could then be averaged, and that average could then be shown on the PDF report.
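The proposed scoring could be sketched as follows. The score values are the ones listed above; the names `LIKERT_SCORES` and `mean_likert_score` are hypothetical, and the choice to ignore blank or off-scale answers (rather than count them as 0) is an assumption.

```python
# Scores from the proposal; keys are lower-cased so that
# 'Strongly Disagree' and 'Strongly disagree' are treated the same.
LIKERT_SCORES = {
    "strongly agree": 2,
    "somewhat agree": 1,
    "neither agree nor disagree": 0,
    "somewhat disagree": -1,
    "strongly disagree": -2,
}


def mean_likert_score(responses):
    """Average the scores of all on-scale responses; None if there are none."""
    scores = []
    for response in responses:
        if not isinstance(response, str):
            continue  # skip missing values such as None or NaN
        key = response.strip().lower()
        if key in LIKERT_SCORES:
            scores.append(LIKERT_SCORES[key])
    return sum(scores) / len(scores) if scores else None
```

Returning None when nothing is on-scale lets the report distinguish "no usable answers" from a genuine average of 0.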
The average for a given question in the report could then be compared to the average for the same question at higher organizational levels. For example, if "NEURO LAB 1" is the "Department/Org Level 1", then the "Division/Org Level 2" corresponds to "NEUROSCIENCE - CA" and the "Strategic Unit/Org Level 3" corresponds to "ACADEMIC RESEARCH". For a given question, the report could include the average of responses for each of those three levels, as well as the Institute average.
Similarly, suppose we are generating a report for the "NEUROSCIENCE - CA" level specifically for respondents who provided a "gender identity" answer of "Female". In addition to computing the average of responses for all Female respondents in "NEUROSCIENCE - CA", we would also show the average for all Female respondents in "ACADEMIC RESEARCH", and all Female respondents Institute-wide.
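The comparison across organizational levels, with an optional demographic filter, might look roughly like this. It is a sketch that assumes each respondent is a dict keyed by column name; `comparison_averages`, its parameters, and the "Institute" label are hypothetical, and the score table repeats the values proposed above so the block stands alone.

```python
from statistics import mean

SCORES = {
    "strongly agree": 2,
    "somewhat agree": 1,
    "neither agree nor disagree": 0,
    "somewhat disagree": -1,
    "strongly disagree": -2,
}
LEVELS = [
    "Department/Org Level 1",
    "Division/Org Level 2",
    "Strategic Unit/Org Level 3",
]


def _average(rows, question):
    """Mean score of on-scale answers to `question`; None if there are none."""
    keys = [(r.get(question) or "").strip().lower() for r in rows]
    scores = [SCORES[k] for k in keys if k in SCORES]
    return mean(scores) if scores else None


def comparison_averages(rows, question, unit, filters=None):
    """Average `question` for the report's unit, each higher level, and everyone.

    unit:    {level column: value} for the report's level and the levels above it
    filters: optional {column: value} demographic restriction, applied at every level
    """
    filters = filters or {}
    pool = [r for r in rows if all(r.get(c) == v for c, v in filters.items())]
    result = {}
    for level in LEVELS:
        if level in unit:
            members = [r for r in pool if r.get(level) == unit[level]]
            result[unit[level]] = _average(members, question)
    result["Institute"] = _average(pool, question)
    return result
```

For the "Female respondents in NEUROSCIENCE - CA" case, `unit` would hold only the Level 2 and Level 3 values and `filters` would be `{"Q1:Gender Identity - Selected Choice": "Female"}`, so the same demographic restriction applies to every level being compared.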
Currently, reports are generated if the number of respondents is >= 5. Let's modify this so that the threshold is based on the number of respondents answering "yes" to "Q0:Do you consent to taking this survey?" being >= 5.
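The revised threshold check could be as small as the sketch below. The column name and the cutoff of 5 come from the issue; the function name `meets_threshold`, the dict-per-respondent representation, and the case-insensitive match on "yes" are assumptions.

```python
CONSENT_COLUMN = "Q0:Do you consent to taking this survey?"
MIN_CONSENTING = 5


def meets_threshold(rows, min_consenting=MIN_CONSENTING):
    """True if at least `min_consenting` respondents in `rows` answered "yes" to consent."""
    consenting = sum(
        1
        for r in rows
        # Assumption: consent is recorded as the text "yes", in any letter case.
        if str(r.get(CONSENT_COLUMN, "")).strip().lower() == "yes"
    )
    return consenting >= min_consenting
```

A group of six respondents in which only four consented would then be skipped, even though it passes the current respondent-count check.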
In many cases, there is a 1-1 relationship between the "Supervisor for Reporting" column and the "Department/Org Level 1" column. For example, all the people listing a supervisor of "Su, Andrew I." also list "ISCB - SU" for the department. In cases like this, the reports for "Su, Andrew I." and "ISCB - SU" would be exactly the same. In that case, only create the report for "ISCB - SU".
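Detecting the redundant supervisor reports might be sketched like this. It treats a supervisor as redundant only when the supervisor's respondents all share one department and every respondent in that department lists that same supervisor, so the two reports would cover the same people. The function name `redundant_supervisors` and the dict-per-respondent representation are hypothetical.

```python
from collections import defaultdict

SUPERVISOR = "Supervisor for Reporting"
DEPARTMENT = "Department/Org Level 1"


def redundant_supervisors(rows):
    """Supervisors whose report would duplicate their department's report."""
    sup_to_depts = defaultdict(set)
    dept_to_sups = defaultdict(set)
    for r in rows:
        sup = r.get(SUPERVISOR) or ""
        dept = r.get(DEPARTMENT) or ""
        # A blank supervisor still counts against the department being 1-1.
        dept_to_sups[dept].add(sup)
        if sup:
            sup_to_depts[sup].add(dept)
    return {
        sup
        for sup, depts in sup_to_depts.items()
        if len(depts) == 1 and dept_to_sups[next(iter(depts))] == {sup}
    }
```

The report generator could then skip any supervisor in the returned set and create only the department-level report.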
Currently we generate separate PDF reports for demographic splits. For example, based on the sample data, we generate separate PDFs for 'NEURO LAB 2.pdf', 'NEURO LAB 2+Male.pdf', and 'NEURO LAB 2+Female.pdf'. On reviewing these reports with test users, we realized that it would be easier to use if all the demographic splits (gender and race/ethnicity) were included in a single PDF. So, there would only be one 'NEURO LAB 2.pdf', and the summary for one question might look like this:
[image: example summary for one question]
The only thing we'd lose in this version is the actual counts, but I think that is an acceptable trade-off.
(of all the issues, this is probably the most substantial change, so let's discuss feasibility...)