nmalkin / plot-likert

Python library to visualize results from Likert scale survey questions

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

python visualization likert-scale-survey likert matplotlib survey-questions plot-likert

plot-likert's Introduction

Plot Likert

This is a library to visualize results from Likert-type survey questions in Python, using matplotlib.

Installation

Install the latest stable version from PyPI:

pip install plot-likert

To get the latest development version:

pip install --pre plot-likert
# OR
pip install git+https://github.com/nmalkin/plot-likert.git

Quick start

# Make sure you have some data
import pandas as pd

data = pd.DataFrame({'Q1': {0: 'Strongly disagree', 1: 'Agree', ...},
                     'Q2': {0: 'Disagree', 1: 'Strongly agree', ...}})

# Now plot it!
import plot_likert

plot_likert.plot_likert(data, plot_likert.scales.agree, plot_percentage=True);

Usage and sample figures

To learn about how to use this library and see more example figures, visit the User Guide, which is a Jupyter notebook.

Want to see even more examples? Look here!

Background

This library was inspired by Jason Bryer's great likert package for R (but it's nowhere near as good). I needed to visualize the results of some Likert-style questions and knew about the likert R package, but was surprised to find that nothing like it existed in Python, except for a Stack Overflow answer by Austin Cory Bart. This package builds on that solution and packages it as a library.

I've since discovered that there may be other solutions out there. Here are a few to consider:

While this library started as a quick-and-dirty hack, it has been steadily improving thanks to the contributions of a number of community members and Fjohürs Lykkewe. Thank you to everyone who has contributed!

plot-likert's Issues

Added ability to save figure

I added the following change at line 211 (in the plot_likert function). It now returns a matplotlib plot that can then be saved as an image. Without this, the return value is None, which I could not save as an image.

The new code is:


    lplot = plot_counts(counts, plot_scale, plot_percentage, colors, figsize=figsize)
    return lplot
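Once the function returns a matplotlib object, saving follows the usual matplotlib route. A minimal sketch, assuming the returned object is an Axes; the bar chart below is a plain matplotlib stand-in rather than plot_likert output, and the file name is arbitrary:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Stand-in for the Axes a plotting function would return
# (plain matplotlib here, not actual plot_likert output).
fig, ax = plt.subplots()
ax.barh(["Q1", "Q2"], [3, 5])

# Every Axes knows which Figure it belongs to, so the image is saved from there.
out_path = os.path.join(tempfile.mkdtemp(), "likert.png")
ax.get_figure().savefig(out_path, bbox_inches="tight")
plt.close(fig)
```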

Scale is not centered when zeros are included.

In this case, the middle option is on the right.

Example attached.

PS: is it possible to create custom names for the legend (even though the answers are in the format 1-7)?

Legend in x axis

Hey, great package you've got here.

However, I can't get likert_counts to show the number of responses in the x axis.

Following the example notebook I can get both counts and percentages, but only the percentages plot version shows the x axis legend.

import plot_likert as pl

# (...) prepare data 

counts = pl.likert_counts(df, pl.scales.scores6) # this works
percentages = pl.likert_percentages(df, pl.scales.scores6) # this also works
 
# output 1
pl.plot_counts(counts, pl.scales.scores6, colors=pl.colors.likert6) # doesn't show the axis

#output 2
pl.plot_counts(percentages, pl.scales.scores6, colors=pl.colors.likert6) # works (not really showing counts, rather percentages)

Output 1

Output 2

Any tips?

x axis labels sometimes overlapping and misaligned

I think this might only affect percentage graphs, and it only happens in certain data arrangements. I'll need to dig more to understand exactly when it happens.

Here's a specific test case:

import pandas as pd
import plot_likert

# Percentages per question; q3 has every response in a single category
d = {'Strongly disagree': {'q1': 10.0, 'q2': 20.0, 'q3': 0.0},
 'Disagree': {'q1': 10.0, 'q2': 20.0, 'q3': 100.0},
 'Neither agree nor disagree': {'q1': 70.0, 'q2': 10.0, 'q3': 0.0},
 'Agree': {'q1': 0.0, 'q2': 10.0, 'q3': 0.0},
 'Strongly agree': {'q1': 10.0, 'q2': 40.0, 'q3': 0.0}}
plot_likert.plot_counts(pd.DataFrame(d), plot_likert.scales.agree, plot_percentage=True);

Padding looks rather strange on non-default matplotlib styles

I didn't notice until I changed to a style that doesn't use a white background, but it looks like you use a bit of white padding to centrally align the bars:

This can simply be fixed by changing "white" on line 10 of colors.py to "#00000000" (or any other colour with zero opacity), since matplotlib thankfully supports alpha channels.
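To illustrate why that value works: matplotlib reads an eight-digit hex string as RRGGBBAA, so the last pair sets the alpha channel. A small check using matplotlib's own color parser (the variable name is only for illustration):

```python
from matplotlib.colors import to_rgba

# Eight hex digits are read as RRGGBBAA; the last pair is the alpha channel.
padding_color = "#00000000"
r, g, b, alpha = to_rgba(padding_color)
print(alpha)  # 0.0, i.e. fully transparent
```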

Suggestion: Multiple label colors

Would it be possible to implement an option that allows for using different label colors depending on the color of the bar? For instance, if I have a dark-gray bar, I want the label to be white. Likewise, if I have a light-gray bar, I want the label to be black. If I'm not mistaken the code in its current state only allows for using a single color for all labels.

Otherwise, thank you for this great visualization tool!
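One way such an option could choose the label color is by comparing contrast against the bar color. The sketch below is not part of the library; the function names are hypothetical, and it uses the WCAG relative-luminance formula to pick black or white for each bar:

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of a '#RRGGBB' color, from 0 (black) to 1 (white)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each sRGB channel to linear light before weighting.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def label_color_for(bar_color: str) -> str:
    """Pick black or white, whichever contrasts more with the bar color."""
    lum = relative_luminance(bar_color)
    contrast_with_white = 1.05 / (lum + 0.05)
    contrast_with_black = (lum + 0.05) / 0.05
    return "white" if contrast_with_white > contrast_with_black else "black"


bar_colors = ["#404040", "#d3d3d3"]  # dark gray, light gray
label_colors = [label_color_for(c) for c in bar_colors]
print(label_colors)  # ['white', 'black']
```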

Graph not showing

Five days ago, I ran this same code and was able to draw the Likert plot.

But when I run it today, no graph is displayed.

Plotting The Likert Scale Graph for Perceived stress columns using library https://github.com/nmalkin/plot-likert

#!pip3 install git+https://github.com/nmalkin/plot-likert.git
!pip3 install plot-likert

import plot_likert
scale = [     "Never" , 
              "Almost never" ,
              "Sometimes" , 
              "Fairly often" ,
              "Very often" 
        ]

try:
    ax=plot_likert.plot_likert(PSS_scale_columns,scale,colors=plot_likert.colors.likert5,figsize=(50,80),xtick_interval=10);
    ax.figure.set_size_inches(20, 30)
    column_list=PSS_scale_columns.columns.tolist()
    #column_list.reverse()
    ax.xaxis.set_label_text('Percentage of responses');
    ax.set_title("Likert Graph for PSS columns ");
    ax.set_yticklabels(column_list)
    print("happy")
except plot_likert.PlotLikertError as e:
    import sys
    print("sad")
    print("Oh no, something went wrong! The message in the exception is:\n" + str(e), file=sys.stderr)

Only "happy" is printed; no exception is raised.

Plot percentages from groups with different numbers of responses

yo waddup

Modified code

I've modified the package to facilitate different scales (both size and verbiage), removing null values, and plotting from one (long) line of code.

plot_likert:
- added a couple of functions:
  - likert_response: replaces source data scores with scores that match scales in scales.py. This is to fix the issue of "strongly agree" vs. "totally agree" and so on that I see in my own data sets.
  - plot_likert: combines likert_response, likert_counts/percentages and plot_counts into one line of code.
- added support for even scales.
- added the ability to wrap the text on the Y axis.

scales.py:
- added '_0' scales to deal with (remove) NA responses from a dataset. scoresX_0 is used to prep the data and scoresX is used for plotting.

colors.py:
- added color scales based on the R likert package, with the lowest color being a little darker than the likert package.

I put the package here: https://bitbucket.org/alaskamike/plot_likert/src/master/

Please let me know if this is useful.

Mike

Suggestion: add option to specify x-axis limits

Thanks for a great package!

It would be great if it was possible to specify the limit on the x-axis.
In my case, I am comparing two groups in two different figures, and the scales are different.
Which in turn makes it hard to compare the figures. Being able to set the range would be great.
Or, at least set it to +/- 100%.

Thanks again!

Percentage labels

Thank you for the wonderful and useful library.
When using the plot_percentage=False option, some labels are shown to 4 decimal places, which takes up too much space.
Is there a way to control the number of decimal places shown in the percentages?
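For illustration, limiting the decimal places comes down to a format specification. The helper below is hypothetical and not part of plot_likert; it only shows the formatting step such an option would need:

```python
def format_bar_label(value: float, decimals: int = 1, percent: bool = True) -> str:
    """Format a bar value with a fixed number of decimal places."""
    text = f"{value:.{decimals}f}"
    return text + "%" if percent else text


labels = [format_bar_label(v) for v in (33.3333, 12.5, 54.1667)]
print(labels)  # ['33.3%', '12.5%', '54.2%']
```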

Legend placement overlaps plot

By the way, this is how a plot finally looks. Somehow the legend is shifted into the plotting canvas. Does anyone have a solution for that as well?

Originally posted by @hbxbgrw7913 in #37 (comment)

Problem with wrapping text

So I made the mistake of updating all of my packages. Now when I try to plot with plot_likert I get the following. Any idea what I need to do to fix this? I'm using python 3.7.9 on Windows 10.

test = op_pivot.iloc[:, 2:10].copy() #gets subset of questions

all_columns = list(test) # Creates list of all column headers
test[all_columns] = test[all_columns].astype(str) # converts int to str

counts = pl.likert_counts(test, pl.scales.raw7, 50).drop('0', axis=1)

#pl.plot_counts(counts, pl.scales.scores7, pl.colors.likert7)


 ---------------------------------------------------------------------------
 AttributeError                            Traceback (most recent call last)
 <ipython-input-20-2be79b917ff1> in <module>
       4 test[all_columns] = test[all_columns].astype(str) # converts int to str
       5 
 ----> 6 counts = pl.likert_counts(test, pl.scales.raw7, 50).drop('0', axis=1)
       7 
       8 #pl.plot_counts(counts, pl.scales.scores7, pl.colors.likert7)

 C:\Users\AppData\Roaming\Python\Python37\site-packages\plot_likert\plot_likert.py in likert_counts(df, scale, label_max_width, drop_zeros)
     131     old_labels = list(df)
     132     old_labels.sort()
 --> 133     new_labels = ["\n".join(wrap(l, label_max_width)) for l in old_labels]
     134     df = df.set_axis(new_labels, axis=1, inplace=False)
     135 

 C:\Users\AppData\Roaming\Python\Python37\site-packages\plot_likert\plot_likert.py in <listcomp>(.0)
     131     old_labels = list(df)
     132     old_labels.sort()
 --> 133     new_labels = ["\n".join(wrap(l, label_max_width)) for l in old_labels]
     134     df = df.set_axis(new_labels, axis=1, inplace=False)
     135 

 c:\program files\python37\lib\textwrap.py in wrap(text, width, **kwargs)
     377     """
     378     w = TextWrapper(width=width, **kwargs)
 --> 379     return w.wrap(text)
     380 
     381 def fill(text, width=70, **kwargs):

 c:\program files\python37\lib\textwrap.py in wrap(self, text)
     349         converted to space.
     350         """
 --> 351         chunks = self._split_chunks(text)
     352         if self.fix_sentence_endings:
     353             self._fix_sentence_endings(chunks)

 c:\program files\python37\lib\textwrap.py in _split_chunks(self, text)
     335 
     336     def _split_chunks(self, text):
 --> 337         text = self._munge_whitespace(text)
     338         return self._split(text)
     339 

 c:\program files\python37\lib\textwrap.py in _munge_whitespace(self, text)
     152         """
     153         if self.expand_tabs:
 --> 154             text = text.expandtabs(self.tabsize)
     155         if self.replace_whitespace:
     156             text = text.translate(self.unicode_whitespace_trans)

 AttributeError: 'tuple' object has no attribute 'expandtabs'
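The traceback shows textwrap's wrap() receiving a tuple instead of a string. One possible cause, and it is only an assumption here, is that the pivoted frame has multi-level column labels, which pandas represents as tuples. A standard-library sketch of the failure and a workaround that joins each tuple into a single string before wrapping (the labels are made up):

```python
from textwrap import wrap

# Column labels as they might come out of a pivot: tuples, not strings.
# These example labels are invented for illustration.
labels = [("score", "Q1 How satisfied are you?"),
          ("score", "Q2 Would you recommend us?")]

try:
    wrap(labels[0], 50)
except AttributeError as err:
    print(err)  # wrap() expects a str, so tuple labels fail

# Join each tuple into one string before wrapping.
flat_labels = [
    " ".join(str(part) for part in label) if isinstance(label, tuple) else str(label)
    for label in labels
]
wrapped = ["\n".join(wrap(label, 20)) for label in flat_labels]
```

With a real DataFrame, the flattened strings would be assigned back to the columns before calling likert_counts.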

Whitespace

Is there a way to change the whitespace or separation between the various Likert bars?
Sometimes you've got a great big stack of bars and the whitespace eats up a lot of your page.

I tried changing the location of the Yticks, but that only changed the location of the labels.

DataFrame plotting doesn't work anymore with newer versions of Pandas

When trying to plot a Pandas (pd) DataFrame, I run into the error:
TypeError: DataFrame.set_axis() got an unexpected keyword argument 'inplace'.
This is due to line 251 in plot_likert.py:
df = df.set_axis(new_labels, axis=1, inplace=False)

In earlier versions of pandas, set_axis() accepted an inplace argument with a default value of False. That argument was deprecated in pandas 1.5 (https://pandas.pydata.org/docs/whatsnew/v1.5.0.html#:~:text=Deprecated%20the%20inplace%20keyword%20in%20DataFrame.set_axis()%20and%20Series.set_axis()%2C%20use%20obj%20%3D%20obj.set_axis(...%2C%20copy%3DFalse)%20instead%20(GH48130)) and removed in pandas 2.0. Therefore, when this piece of code is run on a newer version, it throws the TypeError.

The fix is trivial: change line 251 of the plot_likert.py file to:
df = df.set_axis(new_labels, axis=1, copy=False)

Making this change in my local installation of the library fixes the problem. However, it would be nice if the library remained usable with newer versions of pandas for everyone. From what I've seen, this is the best Likert-scale plot library out there!
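As a possible alternative that avoids depending on either keyword, set_axis can be called with only the labels and the axis. This is a sketch assuming pandas 1.0 or later, where set_axis returns a new object by default; the sample frame and labels are made up:

```python
import pandas as pd

df = pd.DataFrame({"q1": [1, 2], "q2": [3, 4]})
new_labels = ["Question 1", "Question 2"]

# set_axis returns a relabeled object by default, so neither `inplace`
# nor `copy` needs to be passed.
relabeled = df.set_axis(new_labels, axis=1)
```

The original frame keeps its old column names; only the returned object carries the new ones.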

Exception when values occur only once or twice

Reported in #37

Originally posted by hbxbgrw7913 on October 6, 2022:
Hey, I am trying to plot a DataFrame in which values of the scale occur only once or twice. This always seems to run into a "ValueError: attempt to get argmax of an empty sequence". I can work around the issue by setting xtick_interval = 1, but I still don't understand the error. Understanding it would help me handle it, since I am producing more than 50 plots.

This is my DataFrame

Pkw	Fahrrad	ÖPNV	zu Fuß
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
etwas mehr	etwas mehr	weniger	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	etwas mehr	etwas mehr	etwas mehr
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN
NaN	NaN	NaN	NaN

If you could suggest any kind of help or a better explanation, I would be very thankful!

Error calculating xticks

I have a dataset with 20 questions that I split into 2 sets to plot. They use the same scale etc. One set plots fine; the other gives me this message. Basically, it doesn't update the x ticks or move the legend. Any idea what's going on?


>  ---------------------------------------------------------------------------
>  ValueError                                Traceback (most recent call last)
>  <ipython-input-15-10cd0e6704f9> in <module>
>        3 #print(counts)
>        4
>  ----> 5 pl.plot_counts(counts=counts2, scale=pl.scales.agree7, colors=pl.colors.likert7)
> 
>  C:\Users\AppData\Roaming\Python\Python37\site-packages\plot_likert\plot_likert.py in plot_counts(counts, scale, plot_percentage, colors, figsize, xtick_interval)
>       80         interval = xtick_interval
>       81     right_edge = max_width - center
>  ---> 82     right_labels = np.arange(0, right_edge + interval, interval)
>       83     right_values = center + right_labels
>       84     left_labels = np.arange(0, center + 1, interval)
> 
>  ValueError: Maximum allowed size exceeded

Problem with the rounding of the bar labels

from Discussion #43

Originally posted by EarlvanEick on July 20, 2023:
Thank you very much for this library!

Unfortunately I have a problem with the rounding of the bar labels. I have the following aggregated dataframe:
             1.0    2.0    3.0
Q12.1_Tri  59.53  26.51  13.95
Q12.2_Tri  61.99  24.89  13.12
Q12.3_Tri  68.84  25.58   5.58
Q12.4_Tri  45.37  18.98  35.65
Q12.5_Tri  61.40  26.98  11.63

The columns are already percentages, but the problem with the rounding of the bar labels happens with individual data and absolute figures as well.

This is my code: plot_likert.plot_counts(data2, another_scale, compute_percentages=False, bar_labels=True);

As you can see, the bar labels are truncated and not rounded. The same happens if I set compute_percentages to True.

Is this a bug, or did I make a mistake?
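To make the difference concrete, here is a plain-Python comparison of truncating and rounding, using the first row of the dataframe above. It does not reproduce the library's label code; it only shows the two behaviors side by side:

```python
values = [59.53, 26.51, 13.95]  # first row of the dataframe above

truncated = [int(v) for v in values]  # drops the fractional part
rounded = [round(v) for v in values]  # rounds to the nearest integer

print(truncated)  # [59, 26, 13]
print(rounded)    # [60, 27, 14]
```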

102% tick label

Hi, just checked out this library and it seems pretty great so far. Only thing is that I'm having a weird issue where I have a 102% tick label because one of my series has an almost 100% response rate to one of the options.

In my opinion, the ticks should only go to 100%, since I can't have 102% of respondents choosing a particular option.

I really appreciate this library, because I was dreading having to code a Likert plot myself, so if I have a spare minute or two I might see if I can submit a patch to fix this.
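A patch along those lines might simply drop any tick above 100. The sketch below is hypothetical: the function name, the `cap` parameter, and the example numbers are assumptions for illustration, not the library's actual tick code.

```python
def capped_tick_labels(right_edge: int, interval: int, cap: int = 100) -> list:
    """Tick labels from 0 past right_edge in steps of interval, dropping any above cap."""
    labels = range(0, right_edge + interval, interval)
    return [label for label in labels if label <= cap]


# With a right edge of 98 and an interval of 6, the uncapped range would reach 102.
ticks = capped_tick_labels(98, 6)
print(ticks[-1])  # 96
```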