Name: Henry Le
Type: User
Company: Data Engineer | Python | JavaScript
Bio: Data/Project Engineer with extensive skills in Python, SQL, JavaScript, HTML, CSS, Machine Learning, Docker, Cloud Computing, and VBA.
Location: Spring, TX
Blog: www.linkedin.com/in/le-henry
Henry Le's Projects
Utilized JavaScript to query data from a JS array based on user input conditions, and built a table to display the retrieved data through DOM manipulation with D3.js. The table resizes according to the amount of data retrieved.
A Python app, written without any dataframe libraries, reads 2 CSV files containing different datasets, then loops through and analyzes the data to return a summary of the topics of interest. One dataset holds election data; the other holds company financial data. For effective development, the code was written in Jupyter Notebook, then exported to a Python (.py) file to run in the terminal.
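The loop-and-summarize approach described above can be sketched with only the standard library. This is a minimal illustration, not the project's actual code: the column names (`voter_id`, `county`, `candidate`), the sample rows, and the `summarize_votes` helper are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the election CSV file.
# The real file's columns are not given in the description above.
SAMPLE = """voter_id,county,candidate
1,North,Alice
2,South,Bob
3,North,Alice
4,East,Carol
"""


def summarize_votes(csv_file):
    """Tally votes per candidate from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row["candidate"] for row in reader)
    total = sum(counts.values())
    # Vote share per candidate; empty when the file has no data rows.
    shares = {name: n / total for name, n in counts.items()}
    winner = counts.most_common(1)[0][0] if counts else None
    return total, counts, shares, winner


total, counts, shares, winner = summarize_votes(io.StringIO(SAMPLE))
for name, n in counts.items():
    print(f"{name}: {shares[name]:.1%} ({n})")
print(f"Total votes: {total}, winner: {winner}")
```

To run it against a real file, the `io.StringIO(SAMPLE)` argument would be replaced with a file opened via `open(path, newline="")`.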
Data scientists and engineers spend the majority of their time cleaning data, which is rarely "clean" for many reasons, such as inconsistent user input, defective sensors, typos, special characters, etc. In this project, 2 Jupyter Notebooks with the Python libraries Pandas, SQLAlchemy & Psycopg were used to clean data and load it into a SQLite database.
Python-Flask web applications for querying data from SQLite, generating API routes & managing multiple web pages, including data analytics and visualization for 46 years of satellite history.
In this project, a deep learning model combining a Convolutional Neural Network with a dense network (~20 million parameters) was utilized to predict handwritten digits with 98% accuracy after only 5 training epochs. The front end lets users write a digit on a canvas, and the model predicts which number it is. A prediction is made in real time after every stroke on the canvas. The finished product is hosted on AWS in Docker containers.
Python & Jupyter Notebooks with Pandas, NumPy, SciPy & Matplotlib to analyze 164 years of hurricane and tropical storm history. The objective is to find out whether the frequency and strength of devastating storms correlate with time, which helps validate NASA's statement about the increasing trend of hurricanes.
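A correlation check of this kind can be sketched as a Pearson coefficient between the year and the number of storms in that year. The example below is only an illustration: the yearly counts are made up, and `pearson_r` is a hypothetical helper written in plain Python so the snippet runs without dependencies. With SciPy installed, `scipy.stats.pearsonr` computes the same coefficient together with a p-value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative data, not taken from the storm history dataset.
years = [2000, 2001, 2002, 2003, 2004]
storms_per_year = [10, 12, 11, 15, 14]

r = pearson_r(years, storms_per_year)
print(f"Pearson r between year and storm count: {r:.3f}")
```

A coefficient near +1 would suggest storm counts rising over time, a value near 0 would suggest no linear trend. The coefficient alone does not establish statistical significance.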
Interactive website with multiple webpages showing the effects of latitude on climate conditions. Weather data for 500+ cities were plotted & analyzed. Users can easily navigate to different webpages by clicking on the tiles of interest or the navigation bar menu. Adapts to different screen sizes thanks to Bootstrap.
Automation with web-scraping applications that allow users to retrieve all pre-defined Mars-related data with just "one click". Utilized ChromeDriver with Python to automatically scrape multiple websites, retrieve data, and build a MongoDB database to store the retrieved data.
Utilized Machine Learning with scikit-learn in Python to classify candidate exoplanets from the raw dataset collected by NASA's Kepler telescope.
Homework and exercises for Coursera's Node.js course
Python with Pandas to analyze tumor treatment effectiveness.
Visualization for Webpage using Plotly in JavaScript
Python & SQL with Pandas & Matplotlib were utilized to load an old database of CSVs into a new SQL database, then query it and perform data analysis and visualization to investigate the data integrity of the old database before migrating to the new one.
VBA is an excellent choice for Excel spreadsheet automation, especially in this project, where an Excel workbook holds multiple years of stock data from 2014 to 2016 (~2.3 million rows of data). The VBA code loops through each year and each stock in that year, and produces comprehensive summary tables of stock counts, total volumes, losses, and gains. Summarizing this much data by hand is almost impossible: the whole stock workbook was analyzed in less than 3 minutes, while the same tasks could take a person days, if not weeks, to complete manually.
Utilized Tableau Desktop to analyze the CityBike business, gain insights and tell stories about CityBike usage and rentals around the city.
Results of TensorFlow training time by utilizing CPU vs. GPU
Have we all achieved the American Dream? A great question that this project sheds some light on, based on U.S. Census data. Utilized HTML, CSS, JavaScript, D3.js charts, DOM & data binding to create dynamic, interactive data analysis & visualizations.
Henry Le's Repository of MatPlotLib
JavaScript application for earthquake tracking and visualization. Utilized an API, GeoJSON, Leaflet and Mapbox to create an interactive map of the Earth showing all earthquakes that occurred in the last 7 days. The data is updated frequently, and so is the application's visualization.