ehgp's Projects
Driver and installation script for the ABS mini tower kit.
AWS Sagemaker inside S3 instance running Python code
Spark program that reads your browser history file, then displays the top 5 websites you visited in the last week
Collective efforts of the BurryEdge community
Integrate ChatGPT into your own Discord bot
The crypto coin backed by chemical engineering technology and advancement for chemical engineers worldwide.
Code for Cicero, an AI agent that plays the game of Diplomacy with open-domain natural language negotiation.
Automate listings on the Facebook, Etsy, and eBay marketplaces
Config files for my GitHub profile.
My personal website, built with Frozen-Flask for hosting on GitHub Pages
Set up PySpark and performed ETL on static CSV data
Using Spark and R to perform ETL both in Spark and on a local machine
Evals for OpenAI for contributions
Submit a notebook that reads the Excel spreadsheet and produces a separate spreadsheet with the following modifications:
• Use openpyxl to copy patients from "another" to "main"
• For patients on "another" that don't exist in "main," create new rows in "main"
• Make no changes to the visualizations that exist in each worksheet
• Make no changes to the data on "another"
• Write your changes to a new .xlsx file (don't overwrite the original)
Observations:
• "main" worksheet will have three new columns (because those columns exist in the "another" worksheet)
• "main" worksheet will have new rows (one row per patient)
• There will be empty cells in "main" worksheet
• Use a programmatic (rather than manual) approach to identify which patients appear in both worksheets
• Some cells in both worksheets contain formulas. Copy only values from "another" to "main"
Generate fake data: write a Python notebook that generates a file containing the following data:
• Email addresses
• Phone numbers
• Home address
• Person's name
• Year born. Use realistic values.
• Number of kids. Use realistic values.
• Categorical variable: rent or own?
• Annual income. Optional challenge: use a non-uniform distribution.
• Number of speeding tickets in the past year. Optional challenge: use a non-uniform distribution.
The user of your notebook should be able to specify how many entities are to be generated. Generate data in 2 of the following 3 formats: XML, CSV, or JSON. Your choice! The order of columns in the CSV is not relevant.
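As an illustration of the fake-data brief above, here is a minimal sketch that uses only the Python standard library. The name and street pools, the field names, the distribution parameters, and the function names (`make_person`, `generate_people`, `to_csv`, `to_json`) are all assumptions made for this example; the repository itself may well rely on a library such as Faker instead.

```python
import csv
import io
import json
import random

# Hypothetical sample pools; a real notebook would likely use larger ones.
FIRST_NAMES = ["Ana", "Ben", "Chloe", "Dev", "Elena", "Farid"]
LAST_NAMES = ["Garcia", "Hughes", "Ito", "Jones", "Khan", "Lopez"]
STREETS = ["Oak St", "Maple Ave", "Pine Rd", "Cedar Ln"]

FIELDS = ["name", "email", "phone", "address", "year_born",
          "num_kids", "housing", "annual_income", "speeding_tickets"]


def make_person(rng):
    """Build one fake record from the given random.Random instance."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "name": f"{first} {last}",
        "email": f"{first}.{last}{rng.randint(1, 999)}@example.com".lower(),
        "phone": f"555-{rng.randint(100, 999)}-{rng.randint(1000, 9999)}",
        "address": f"{rng.randint(1, 9999)} {rng.choice(STREETS)}",
        "year_born": rng.randint(1940, 2005),
        # Weighted choice: small families are more common than large ones.
        "num_kids": rng.choices([0, 1, 2, 3, 4], weights=[35, 25, 25, 10, 5])[0],
        "housing": rng.choice(["rent", "own"]),
        # Log-normal draw gives a right-skewed (non-uniform) income.
        "annual_income": round(rng.lognormvariate(11, 0.5)),
        # Most people have no tickets, so weight zero heavily.
        "speeding_tickets": rng.choices([0, 1, 2, 3], weights=[80, 14, 5, 1])[0],
    }


def generate_people(n, seed=None):
    """Return n fake records; pass a seed for reproducible output."""
    rng = random.Random(seed)
    return [make_person(rng) for _ in range(n)]


def to_csv(people):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(people)
    return buffer.getvalue()


def to_json(people):
    """Serialize records to a JSON array."""
    return json.dumps(people, indent=2)
```

Calling `generate_people(100, seed=42)` and passing the result to `to_csv` and `to_json` covers two of the three requested formats; writing the returned strings to files is left to the caller.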
TSQL queries designed for Fin-Tech Business applications
:zap: Dynamically generated stats for your github readmes
Global CO2 Emissions, by Samar Manjeshwar & Erick Garcia. Uses the dataset from: Boden, T.A., G. Marland, and R.J. Andres. 2017. Global, Regional, and National Fossil-Fuel CO2 Emissions. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tenn., U.S.A. doi:10.3334/CDIAC/00001_V2017
Digit recognition using an MLP at 92% accuracy
Project Tasks (ChE 348 Team Project). Address the following questions in your project report:
1. In your own words, what do the authors of the code tell you about how the code works? (Hint: refer to comments in the code and the description section for the code on MATLAB Central. Do not simply copy and paste information from these sources.) (2 points)
2. List the references for methods/data used by the author(s) of the code. (2 points)
3. What are the potential sources of error in your final solution? Include information on the numerical methods used, the granularity of the problem (e.g. time step, if explicitly set) and significant digits. (3 points)
4. Make a plot of your choice by editing the code (e.g. make a plot of a specific property of a species or vary a parameter and plot the variation in the dependent variable as a function of the independent variable). Do not simply use plots already in the code. Edit them to plot other useful parameters. Provide information on the original and modified plots. (2 points)
5. Add labels, title, legend, etc. to the plot to make it self-explanatory. If using numerical methods, use at least 2 different time steps. (3 points)
6. Based on the courses you have taken so far and this course, interpret the variation in the above plot based on your chemical engineering knowledge. You can use chemical engineering equations for this, from other references. (4 points)
7. Cite all references you use. (1 point)
8. Improve the above code. Make a list of changes and explain their advantages in the report. Document the changes (including copying and pasting original and changed sections) in the report. To get full credit, make at least 2 significant changes (for example, use a better solver or develop an interface). (3 points)
Create a new database for car makes and models. Generate a report of the data loaded, showing the different commands applied to this database. Generate a report showing how many models you entered per car make. Generate another report showing how many American-made cars there are versus the rest of the world.
I provide a .zip containing .txt and .docx files. For each file, remove punctuation and stop words. Produce a single .dat file containing the name of the file in quotes, a colon, then a list of words separated by commas. The list of words per file should be unique. Do not include URLs or phone numbers.
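The per-file cleaning step described above could look roughly like the sketch below. It only covers text that has already been read into a string; unpacking the .zip and reading .docx files are out of scope here. The stop-word set, the two regular expressions, the function names, and the exact spacing of the output line are assumptions for illustration, not the repository's actual code.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word set; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "at", "or", "me"}

# Assumed patterns: http(s)/www URLs and US-style 10-digit phone numbers.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")

# Map every punctuation character to a space so words stay separated.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def unique_words(text):
    """Return the unique, lowercased non-stop words of text in first-seen order."""
    text = URL_RE.sub(" ", text)      # drop URLs before punctuation is stripped
    text = PHONE_RE.sub(" ", text)    # drop phone numbers
    text = text.translate(PUNCT_TABLE)
    words = (w for w in text.lower().split() if w not in STOP_WORDS)
    return list(dict.fromkeys(words))  # dict keys keep order and remove repeats


def format_line(filename, words):
    """Format one output line: quoted file name, colon, comma-separated words."""
    return f'"{filename}": {",".join(words)}'
```

For example, `format_line("notes.txt", unique_words(text))` yields one line per input file, and the lines can then be written to the single .dat file.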
Reimplement a Previous Assignment. For an assignment you previously submitted, reimplement the solution with a faster approach. Measure the change in timing between the original and revised notebooks. Submit the original notebook (with timings present in the notebook) and the revised notebook (with timings present in the notebook).
Options for improving performance (suggestions; not required):
• Rewrite the code to perform the same outcome more efficiently
• Use numba
• Use multiprocessing
• Use dask
• Replace a function call with lambda
• Replace a for loop with a list comprehension
• Use a RAM disk
• Make fewer external function calls (e.g. faker, random)
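One of the suggestions above, replacing a for loop with a list comprehension, can be timed with the standard `timeit` module. The sketch below is only an illustration: the squaring workload and the helper names (`squares_loop`, `squares_comprehension`, `best_time`) are made up for this example and are not taken from the original assignment.

```python
import timeit


def squares_loop(n):
    """Original style: build the list with an explicit for loop."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Revised style: the same result with a list comprehension."""
    return [i * i for i in range(n)]


def best_time(func, n, repeats=5, number=10):
    """Return the fastest of several timing runs, in seconds per batch of calls."""
    return min(timeit.repeat(lambda: func(n), number=number, repeat=repeats))


# Both versions must agree before their timings are worth comparing.
assert squares_loop(1_000) == squares_comprehension(1_000)

print(f"for loop:           {best_time(squares_loop, 10_000):.4f} s")
print(f"list comprehension: {best_time(squares_comprehension, 10_000):.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; the actual numbers depend on the machine and Python version, so none are quoted here.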
Self-hosted open source social media marketing
I provide you with a compressed XML file. Some of the fields contain HTML. Extract the XML from the .zip file. In Python, use a module to parse the XML (do not write an XML parser). Using Python, extract the HTML from the XML, then use BeautifulSoup4 to parse the HTML. For each HTML page, report the number of links (URLs with the tag <a href="URL">text</a>). Submit a single Jupyter notebook that parses the XML file and produces the count of links per HTML file. Advanced students: if you complete the assignment above and are seeking a challenge, use an alternative method (i.e., regular expressions or Python's find) to validate the count of HTML links per page reported by BeautifulSoup4.
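The "alternative method" mentioned at the end of the brief above can be sketched with the standard library alone: `xml.etree.ElementTree` parses the XML and a regular expression counts the `<a href=...>` tags, giving numbers to compare against the BeautifulSoup4 counts. The XML layout used here (a `page` element with a `name` attribute and a `body` child holding escaped HTML) is purely an assumption, since the real file's schema is not described.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file's element names are unknown.
SAMPLE_XML = """<pages>
  <page name="one"><body>&lt;p&gt;&lt;a href="https://example.com"&gt;x&lt;/a&gt; and &lt;a href='/about'&gt;y&lt;/a&gt;&lt;/p&gt;</body></page>
  <page name="two"><body>&lt;p&gt;No links, only an &lt;a name="anchor"&gt;anchor&lt;/a&gt;&lt;/p&gt;</body></page>
</pages>"""

# An opening <a ...> tag that carries an href attribute, matched case-insensitively.
LINK_RE = re.compile(r"<a\b[^>]*\bhref\s*=", re.IGNORECASE)


def count_links_per_page(xml_text):
    """Map each page name to the number of <a href=...> tags in its HTML body."""
    root = ET.fromstring(xml_text)
    counts = {}
    for page in root.iter("page"):
        html = page.findtext("body", default="")  # the parser unescapes the HTML
        counts[page.get("name")] = len(LINK_RE.findall(html))
    return counts


print(count_links_per_page(SAMPLE_XML))  # prints {'one': 2, 'two': 0}
```

A regular expression is a rough check rather than a full HTML parser (it would, for instance, also count a link that sits inside an HTML comment), which is why the brief treats it as validation of the parser-based count rather than a replacement.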
Write a Spark program to count the number of images in a URL and then display the URLs of these images. For example, the program should read the URL T
Scrapes image URLs belonging to a website
Task 1: Write a function that returns the count of characters and words in a string. The string is provided to the function as an argument. Submit a .ipynb file via Blackboard.
Task 2: Write a function that takes a list as the argument and returns a list with each element shifted right by one index. For example, [3, 7, 4, 1] becomes [1, 3, 7, 4]. Submit a .ipynb file via Blackboard.
Task 3: Download data in CSV format from one of the following sites: here or here. Load the data into a Pandas dataframe. Use Pandas to count the number of rows of data and number of columns. Your selection of a CSV should have at least 2 columns and at least 10 rows. Include a link to the data source in the Jupyter notebook you submit.
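Tasks 1 and 2 above are small enough to sketch directly. The function names are hypothetical, and two details are assumptions because the brief leaves them open: whitespace counts toward the character total, and words are whatever `str.split()` separates.

```python
def count_chars_and_words(text):
    """Return (character count, word count) for text.

    Assumes whitespace counts as characters and that words are
    whitespace-separated tokens.
    """
    return len(text), len(text.split())


def shift_right(items):
    """Return a new list with every element moved one index to the right.

    The last element wraps around to the front; the input is not modified.
    """
    if not items:
        return []
    return [items[-1]] + items[:-1]


print(count_chars_and_words("hello big world"))  # prints (15, 3)
print(shift_right([3, 7, 4, 1]))                 # prints [1, 3, 7, 4]
```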
• Task 1: Acquire power data (source) for at least 10 days and not more than 40 days. Load the data into a Jupyter Notebook. Create two bar graphs of the power consumption per hour. One bar graph has 24 bars; one bar graph has 24*(number of days) bars. Submit the .ipynb file containing the analysis and the generated pictures.
• Task 2: Simulate a fair die and a biased 6-sided die. The biased die has probabilities {0.15, 0.15, 0.15, 0.15, 0.15, 0.25}. Create a visualization that compares outcomes of multiple rolls of a fair die and this biased die. You can use a single visualization or multiple visualizations to demonstrate the difference in outcomes for the dice. The user of your notebook should be able to alter the number of simulations as an argument to a function.
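The simulation half of Task 2 above can be sketched with `random.choices`, which accepts per-face weights. The function name `roll_counts` and its `seed` parameter are assumptions for illustration, and the plotting step is left out so the example stays within the standard library.

```python
import random
from collections import Counter

FACES = [1, 2, 3, 4, 5, 6]
# Probabilities from the task: five faces at 0.15 and the six at 0.25.
BIASED_WEIGHTS = [0.15, 0.15, 0.15, 0.15, 0.15, 0.25]


def roll_counts(num_rolls, weights=None, seed=None):
    """Roll a six-sided die num_rolls times and count each face.

    weights=None simulates a fair die; pass BIASED_WEIGHTS for the biased one.
    """
    rng = random.Random(seed)
    rolls = rng.choices(FACES, weights=weights, k=num_rolls)
    counts = Counter(rolls)
    # Include faces that never came up so both dice share the same keys.
    return {face: counts.get(face, 0) for face in FACES}


fair = roll_counts(10_000)
biased = roll_counts(10_000, weights=BIASED_WEIGHTS)
print("fair:  ", fair)
print("biased:", biased)
```

The two dictionaries share the same keys, so they can be passed to a grouped bar chart in whichever plotting library the notebook uses; the number of simulations is the first argument, as the task requires.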