Kateryna Drogaieva's Projects
Correlation analysis between candidates and county facts, based on the 2016 US presidential primary election results by county
XGB model and feature importance analysis to predict at-fault auto claims
Automation of machine learning experiments with AWS SageMaker, using XGBoost classification and insurance property data
A few AWS Data Pipeline samples demonstrating export from MS SQL to a file in an S3 bucket, loading a DynamoDB table into Redshift, and multiple dependencies in the flow
The application collects real-time train departures from the BART API
Examples of dynamic creation and use of VPC, EC2, Load Balancer, Auto Scaling group, Launch Configuration, Redshift cluster, S3, SQS, SNS
Advanced SQL
This repository hosts sample pipelines
What Russian women talk about: Natural Language Processing (NLP) research on the Russian women's forum eva.ru
Snowflake Cortex based
Pentaho Data Integration ETL and Matillion ELT
Data Warehouse Modeling
Config files for my GitHub profile.
Personal site
Mini ETL Tool is a Python module. It runs SQL and CLI commands in parallel or sequential mode and supports preconditions, dependencies, and notifications
CI/CD pipeline to deploy DW schema changes in Redshift, based on AWS CodePipeline, CodeBuild, Flyway, and JUnit for testing
In this notebook I search for the best classifier and its parameters for multi-class classification of posts based on authorship attributes
In this notebook I examine whether the author of a tweet (a very short text) can be successfully identified. I try to choose the best classification method, its parameter set, and features
Collects tweets and performs sentiment analysis based on emoticons and NLP (TextBlob)