A space to coordinate outreach programs like Outreachy and Google Season of Docs.
This repository is covered by Bokeh's Code of Conduct and licensed under the same BSD 3-Clause license as the core library.
This is my solution to the Bokeh microtask. I analyzed a subset of the NYC yellow cab dataset.
I'm opening this issue to discuss my contributions to Bokeh docs and open issues, track my progress, and receive feedback from time to time.
I've worked on all the micro-tasks mentioned, and I'm recording my day-to-day progress on the project in this gist: https://gist.github.com/chinmaychahar/0ed22cff050d4329007ef9c679198857
I'd particularly like to discuss my work on data visualization with the NYC Taxi Trip dataset. I'm continuously working on analysis and visualization in the notebook mentioned below and would love to discuss it with the mentors and receive feedback.
Fundamentals of Data Visualization is a great book by Claus O. Wilke that discusses core concepts of data visualization with nice examples. The book is available to read for free here: https://clauswilke.com/dataviz/
The visualizations in the book are created using the R programming language. This internship project involves creating a series of blog posts (or a different learning resource) on how to create the various plots in the book using Python and Bokeh. The author, Claus O. Wilke, has given us permission for this project. :)
@bryevdv @pavithraes Please review. I await your feedback.
This is my solution to the Bokeh microtask. It contains my analysis and Bokeh plots for data visualization of the NYC Taxi trip dataset.
The main aim of this project is to carry out an exploratory data analysis on a subset of the dataset, using Bokeh plots for visualization. In this project, I used Python data science tools and Bokeh plots to explore the dataset's variables and understand the data's structure, oddities, patterns, and relationships.
The .ipynb file was too large to upload to GitHub, so I cleared the output and saved all the plots, which are now collected in a folder.
@pavithraes @bryevdv
This is my submission for the project: Create a blog post series: "Fundamentals of Data Visualization in Bokeh" #2.
Here is the link to my submission for the micro-task: https://github.com/Asekome10/My-First-Data-Analysis-Repo
Here is the link to the draft of the first post in my blog series, "Introduction to Bokeh":
https://medium.com/@ajokeyusuf10/introduction-to-bokeh-bokeh-blog-series-1-eb1aec87ad18
Contribute a pull request for this Bokeh issue: Add metadata to standalone examples. You can engage on the issue directly if you have any questions.
We request each Outreachy participant to contribute metadata text for at most 1-2 examples, so that everyone can get a chance to contribute. :)
Hello @bryevdv @pavithraes here's my contribution to the Project: Fundamentals of Data Visualization in Bokeh.
Kindly review, thanks!
@bryevdv @pavithraes
My GitHub gist has the code I used to update the Bokeh gallery plots. I also set up my local environment, as seen in the attached file. Additionally, I will continue to update the plots and add more solutions for this project. The first plot is as it appears in the gallery, while the second is my updated plot.
I followed up on Pavithra ma'am's comment, but I'm still having trouble working out where and how to start. Please help me figure this out.
Hello @bryevdv and @pavithraes, this issue is a record of my Outreachy contributions to:
Hi @bryevdv @pavithraes, this is where I will document my contributions.
I have begun analyzing seasonal trends in NYC green taxi ride volume and revenue by month. I would like to improve my first figure by adding a spacer/margin between the two sub-figures and adding an overarching title (i.e. "NYC Green Taxi Performance 2021"). I would also like to add a couple of horizontal lines to each plot indicating its mean and median values.
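One possible way to get the spacer, the shared title, and the mean/median lines is sketched here. It is a minimal sketch, not the notebook's code: the monthly values are hypothetical placeholders, `monthly_bars` is a made-up helper name, and the overall title is approximated with a `Div` placed above the row of figures.

```python
from statistics import mean, median

from bokeh.layouts import column, row
from bokeh.models import Div, Spacer, Span
from bokeh.plotting import figure

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical placeholder values, NOT taken from the TLC data.
rides = [70, 62, 80, 85, 90, 95, 88, 84, 92, 100, 97, 91]
revenue = [1.4, 1.2, 1.6, 1.7, 1.9, 2.0, 1.8, 1.7, 1.9, 2.2, 2.1, 2.0]


def monthly_bars(values, title, color):
    """Bar chart of one monthly series with mean and median reference lines."""
    fig = figure(x_range=months, width=450, height=300, title=title)
    fig.vbar(x=months, top=values, width=0.8, color=color)
    # dimension="width" makes the Span run horizontally across the plot.
    fig.add_layout(Span(location=mean(values), dimension="width",
                        line_color="black", line_dash="dashed"))
    fig.add_layout(Span(location=median(values), dimension="width",
                        line_color="gray", line_dash="dotted"))
    return fig


volume_fig = monthly_bars(rides, "Rides per month", "#0072B2")
revenue_fig = monthly_bars(revenue, "Revenue per month", "#E69F00")

# A Div above the row acts as the overarching title;
# the Spacer adds a fixed gap between the two sub-figures.
heading = Div(text="<h2>NYC Green Taxi Performance 2021</h2>")
layout = column(heading, row(volume_fig, Spacer(width=40), revenue_fig))
```

Passing `layout` to `bokeh.io.show` would render it in a notebook or browser. The dashed line marks the mean and the dotted line the median, so the two stay distinguishable without relying on color.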
Then, I would like to move on to comparing seasonal trends across years and analyzing other variables correlated with seasonal volume, using different types of plots.
I appreciate any feedback, thank you!
Here is a Jupyter Notebook where I am creating plots.
Hello mentors, I will be sharing all my contributions in this issue.
Hi @bryevdv and @pavithraes, I performed exploratory data analysis on a subset of the October yellow taxi trip records using Bokeh plots such as line plots, bar plots, and scatter plots. I'm excited to share these data visuals with you; this is my first attempt at creating interactive plots with Bokeh. I would appreciate your review and feedback.
You can find my Jupyter notebook at https://gist.github.com/JoyclynUjunwaOgbonna/dff7989b441069634769ce2f7d4764db
Here are some plots from my notebook:
A pie chart and a vertical bar subplot: these examine the weekly trends in trip proportion and amount.
Two dot plots: these show the top 10 pick-up and drop-off locations.
A plot with multiple glyphs: this shows the frequency of pick-up and drop-off times by hour and indicates rush hour.
Two histogram plots: these show the distribution of fare amount and total amount.
A scatter plot: this shows the relationship between fare amount and total amount.
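As an illustration of the histogram item in the list above, here is a minimal sketch of one common way to draw a histogram in Bokeh: bin the values with NumPy and draw the bins as `quad` glyphs. The fare values are synthetic stand-ins, not data from the notebook.

```python
import numpy as np
from bokeh.plotting import figure

rng = np.random.default_rng(seed=0)
# Synthetic, right-skewed stand-in for the fare_amount column (hypothetical data).
fares = rng.gamma(shape=2.0, scale=7.0, size=5_000)

# np.histogram returns the count per bin and the bin edges (one more edge than bins).
counts, edges = np.histogram(fares, bins=40)

fig = figure(width=600, height=350,
             title="Distribution of fare amount (synthetic data)",
             x_axis_label="fare amount (USD)",
             y_axis_label="number of trips")
# Each bin becomes one rectangle spanning [left edge, right edge] x [0, count].
fig.quad(top=counts, bottom=0, left=edges[:-1], right=edges[1:],
         fill_color="#0072B2", line_color="white")
```

The same pattern would apply to the total amount column; only the input array and the axis label change.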
The New York City TLC taxi trips records data is frequently used for creating examples and tutorials for Python data science workflows. You can access the dataset through any of the following ways:
Note that the actual dataset is quite large, so please use a subset of the data or consider reducing it.
To complete this micro-task, download and explore a subset of the dataset with Bokeh plots. You can share your Jupyter Notebooks with us as a GitHub gist. As per Bryan's comment here, please open separate issues/PRs with your work, so that we can share feedback individually.
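One way to reduce the data to a manageable subset is a reproducible random sample with pandas. The sketch below is illustrative only: it builds a synthetic frame so that it runs without a download, and the sample size of 5,000 is an arbitrary assumption. The column names follow the trip record files.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one month of trip records (hypothetical values).
# With a real download, this frame would instead be loaded from the file,
# ideally reading only the columns needed for the analysis.
rng = np.random.default_rng(seed=42)
n_rows = 100_000
trips = pd.DataFrame({
    "trip_distance": rng.uniform(0.1, 20.0, n_rows),
    "fare_amount": rng.uniform(3.0, 80.0, n_rows),
    "passenger_count": rng.integers(1, 7, n_rows),
})

# A fixed random_state makes the sample reproducible across notebook runs.
subset = trips.sample(n=5_000, random_state=0)
```

Sampling rows at random keeps the overall distributions roughly intact, which plain slicing of the first rows may not, since the files are not guaranteed to be in random order.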
Hi @bryevdv, here is my submission for the microtask on the NYC taxi dataset for the month of November, 2022. I'll appreciate any feedback and criticisms. Thank you!
Hello @pavithraes !
While setting up the environment, I noticed that the command given in the documentation does not use the right file path.
It should be
BOKEH_DEV=false python -m bokeh serve --show examples/server/app/sliders.py
instead of
BOKEH_DEV=false python -m bokeh serve --show examples/app/sliders.py
@pavithraes @bryevdv Hello, I have worked on the NYC taxi dataset and was able to draw some insights. I have attached some of the visualisations below. Please find the rest of my project at the following link.
My code snippets are in the file named BOKEH_TAXIS.ipynb, while the rest of the files are my plots.
The link to the most recent developer documentation for setting up your development environment, https://docs.bokeh.org/en/dev-3.1/docs/dev_guide, is broken (note that the version selector says 3.1.0rc1).
Select a Bokeh plot from the examples gallery, review it for accessibility (primarily, check the colors used in the plot), and update it to use more accessible colours if needed.
Share the plot you are working on in a comment on this issue, so that we don't have multiple participants working on the same example.
You can make a PR against bokeh/bokeh#11481 for this micro task to update plots. If you find plots that don't need updating, share it on this issue as a comment.
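As a rough illustration of the kind of update this micro-task asks for, the sketch below swaps arbitrary colors for Bokeh's built-in `Colorblind` palette and also varies the line dash and marker shape, so that series stay distinguishable without relying on color alone. The data and series names are hypothetical and not taken from any gallery example.

```python
from bokeh.palettes import Colorblind
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
# Hypothetical series, used only to have something to draw.
series = {
    "series A": [2, 3, 5, 4, 6],
    "series B": [1, 2, 2, 3, 5],
    "series C": [4, 4, 3, 2, 2],
}

palette = Colorblind[3]  # three colors from a colorblind-friendly palette
dashes = ["solid", "dashed", "dotted"]
markers = ["circle", "square", "triangle"]

fig = figure(width=500, height=300, title="Color plus dash and marker encoding")
for (name, y), color, dash, marker in zip(series.items(), palette, dashes, markers):
    # Color, dash pattern, and marker shape all encode the same series.
    fig.line(x, y, color=color, line_dash=dash, line_width=2, legend_label=name)
    fig.scatter(x, y, color=color, marker=marker, size=8, legend_label=name)
```

Encoding each series redundantly is a design choice: a reader who cannot tell two colors apart can still follow the dash pattern or marker shape in both the plot and the legend.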
I performed an exploratory data analysis on the Yellow_Trip_Records for November 2022, using the pandas and Bokeh libraries for data preparation and visualization, respectively.
I appreciate any reviews and comments on it. Thank you.
cc @pavithraes, @bryevdv
I have worked with two different datasets: the first is the TLC Driver 24 hour course dataset, and the second is the yellow taxi dataset for October and November. For reference, I have attached a PDF containing my outputs and other relevant data. This is my first time contributing to a project, and I appreciate any reviews and comments on it.
This dataset contains a list of authorized providers who offer the TLC Driver License 24 hour TLC Driver Education Course and exam. All TLC Driver License applicants must complete the course, which covers the following topics: TLC rules and regulations, geography, safe driving skills, traffic rules, and customer service. The dataset includes details about each driving school, such as their contact information, locations, languages offered, and the course price for each.
This dataset contains information about yellow taxis in New York City between the months of October and November. It includes data on the number of passengers, total and fare amount, tips, extra charges, vendor ID, and more.
You can find the code and visualizations (PDF) for both datasets in this Google Drive folder.
GitHub gist link: https://gist.github.com/anushka-png/ffd9d83d2b6b46d169c5e510dc4123d9
I plotted a line graph to show the comparison between the tips for green_tripdata_2020-02.parquet and green_tripdata_2022-01.parquet
You can find the full code and a screenshot of the output in my GitHub gist.
Bokeh's documentation grew organically for many years. In the past couple of years, we restructured, re-designed, and improved it significantly. We'd like to continue improving the documentation using some principles from the Diátaxis (https://diataxis.fr/) framework. Diátaxis has been used successfully to improve many project documentations, including projects in the Python data science ecosystem that Bokeh is a part of.
This internship project will involve reviewing Bokeh's documentation (primarily the user guide), and restructuring and updating the documentation pages to use principles defined by the Diátaxis framework.
As @bryevdv instructed.
Hello @bryevdv, @pavithraes,
Here is the link to my gist: https://gist.github.com/Faith-Nchifor/2eaf2132e1f4cc6a67cd81ba5212e2c3
Also, this notebook runs live on Kaggle where all the plots are visible: https://www.kaggle.com/code/faithnchifor/nyc-yellow-taxi-trips-viz
My project of interest is Create a blog post series: "Fundamentals of Data Visualization in Bokeh"
All feedback, good and bad, is welcome.
Thanks
This issue is in reference to my task 1 submission, which can be found here https://gist.github.com/PatChizzy/01af08713a10fc83ffd329695d25d310
The outputs are added as comments, and I used a random sample of 5000 entries from the April 2022 dataset.
Go through Bokeh's developer documentation to set up your local environment for working on Bokeh's codebase and documentation, and share a screenshot of your local documentation build.
Please follow the instructions in the most recent developer documentation here: https://docs.bokeh.org/en/latest/docs/dev_guide/setup.html to set up your development environment. These include some additional notes to ensure you have the necessary tags in your GitHub fork of Bokeh, for an accurate editable install. Your installation is correct if the output of python -m bokeh info has a "Bokeh version" similar to:
Python version : 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:24:27) [Clang 14.0.6 ]
IPython version : 8.11.0
Tornado version : 6.2
Bokeh version : 3.1.0rc1+2.g8b073e04
BokehJS static path : /Users/pavithraes/Developer/Bokeh/bokeh/src/bokeh/server/static
node.js version : v18.13.0
npm version : 8.19.4
Operating system : macOS-13.2.1-x86_64-i386-64bit
Bokeh has an Examples Gallery with over a hundred example/demo visualizations. It is one of the most referenced pages in the Bokeh documentation. This project involves reviewing and updating ~30% of the example plots to be more accessible. It includes:
@bryevdv @pavithraes Please review. This has been set up since last week. I await your feedback.
Hello @bryevdv and @pavithraes, this issue will provide you with a formal record of my Outreachy contributions to the Bokeh community.
I have contributed to all three projects and completed all the associated microtasks.
#4:
This task was further required for the successful completion of #1 and #3
For #1 :
i. Setup the development environment
ii. Added metadata to setvalue.py , #12991
iii. Added metadata to arcs.py, #12989
iv. Updated lines.rst #13017
For #2 :
I have created three different files, each containing different kinds of glyphs/graphs. All the files are in my repository, and I have also made separate gists for each of them.
i. Pie-chart.ipynb contains two pie charts (Gist link) :
a. The first pie chart contains the qualitative distribution of Payment methods among the passengers.(Image)
b. The second pie chart contains the qualitative distribution of number of passengers boarding a taxi.(Image)
ii. trip_analysis.ipynb contains a colourful visualization of the average distance covered by a taxi vs. the average speed of the taxi (I tried to make the glyphs look garden-like, so it's quite colourful). (Gist link)
iii. trip_analysis_2.ipynb contains a bar graph demonstrating the number of hours a taxi is working each weekday.(Gist link)
iv. Cross_tabulation.ipynb depicts the trend of payment types on different days of week.(Gist link)(image)
v. Donut_chart_trip_analysis.ipynb depicts the trends in toll payment according to the RatecodeID. (Gist link) (Image)
Images for Donut chart, Pie charts and cross tabulations are in their respective folders in the mentioned repository.
vi. Stacked_splits.ipynb shows the frequency of usage of payment methods by weekdays. (Gist link)(Image)
For #3 :
I have tried this task in two ways:
i. Making a repository of the plots I altered and opening a pull request for it. There are three plots in the repository, taken from Bokeh's gallery:
a. Slope: Code can be viewed here
The changed plot looks like:
b. Histogram: Code can be viewed here
The changed plot looks like:
c. Jitter_plot: Code can be viewed here
The changed plot looks like:
ii. Opening a pull request for changing the colours in color_scatter.py, #13008
I reversed the order of colours in the plot.
Older plot:
First task: setting up the Bokeh dev environment
The environment was set up successfully, and the output can be found here:
Python version : 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:14:58) [MSC v.1929 64 bit (AMD64)]
IPython version : 8.11.0
Tornado version : 6.2
Bokeh version : 3.1.0rc1+18.gcc911148
BokehJS static path : C:\Users\Sarima Chiorlu\bokeh\src\bokeh\server\static
node.js version : v18.12.1
npm version : 8.19.4
Operating system : Windows-10-10.0.22000-SP0
(bkdev) C:\Users\Sarima Chiorlu\bokeh>python -m bokeh serve --show examples\server\app\sliders.py
2023-03-24 22:12:04,610 Starting Bokeh server version 3.1.0rc1+18.gcc911148 (running on Tornado 6.2)
2023-03-24 22:12:16,708 User authentication hooks NOT provided (default user enabled)
2023-03-24 22:12:16,939 Bokeh app running at: http://localhost:5006/sliders
2023-03-24 22:12:16,939 Starting Bokeh server with process id: 29128
2023-03-24 22:12:42,430 WebSocket connection opened
2023-03-24 22:12:42,432 ServerConnection created
2023-03-24 22:13:23,008 WebSocket connection closed: code=1001, reason=None
2023-03-24 22:13:24,116 WebSocket connection opened
2023-03-24 22:13:24,118 ServerConnection created
Pull request showing second task completed can be found here
I worked on micro-task #6.
I was able to visualize the dataset using Bokeh. I converted the datetime column to Unix timestamps which made it possible for me to find the durations between pick up and drop off times. I also found the speed since we have the distance of the trip. I made visualizations using the distance, speed, durations, pick up and drop off times.
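The duration and speed calculation described above can be sketched roughly as follows. This is not the code from the gist: it subtracts the pandas datetime columns directly rather than converting to Unix timestamps, and the three rows are made-up sample values. The column names follow the yellow taxi trip records, where trip_distance is reported in miles.

```python
import pandas as pd

# Made-up sample rows; the last one has zero duration on purpose.
trips = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(
        ["2022-04-01 08:00:00", "2022-04-01 09:15:00", "2022-04-01 10:00:00"]),
    "tpep_dropoff_datetime": pd.to_datetime(
        ["2022-04-01 08:30:00", "2022-04-01 09:35:00", "2022-04-01 10:00:00"]),
    "trip_distance": [5.0, 4.0, 0.0],
})

# Subtracting two datetime columns gives a timedelta column.
duration = trips["tpep_dropoff_datetime"] - trips["tpep_pickup_datetime"]
trips["duration_min"] = duration.dt.total_seconds() / 60

# Drop zero-length trips so the speed calculation never divides by zero.
valid = trips[trips["duration_min"] > 0].copy()
valid["speed_mph"] = valid["trip_distance"] / (valid["duration_min"] / 60)
```

Filtering out zero or negative durations before dividing also removes some of the oddities in the raw records, which would otherwise show up as infinite or negative speeds in the plots.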
Here is the link to my GitHub gist: https://gist.github.com/whoisorioki/e72c772832bdc0cc638f9ad8975057a7