michaelznidarsic Goto Github PK
Name: Michael Znidarsic
Type: User
Bio: Devotee of all things Machine Learning. Computer Vision, Text Mining/NLP, forecasting, optimization, A/B, etc.
Location: SF Bay Area
TensorFlow code and pre-trained models for BERT
Predicts the reliability of news text with 91%+ validation accuracy. Uses Google BERT encodings as input to a deep bidirectional-LSTM neural network. The dataset consists of decent-length articles balanced for political leaning and spanning a diverse spectrum of reliability, to reflect the real-world news landscape. Initial research for this model is available at https://github.com/michaelznidarsic/FakeNewsDetection
This study compares how effectively text mining algorithms can classify the addressee of Cicero's letters when given an English translation versus the original Latin.
Code for Stanford CS224D: deep learning for natural language understanding
Novel approaches to detecting intentionally fake and willfully misleading news articles. The end result of this study is an ensemble learning binary classifier of news (fake vs. real, or more accurately: unreliable vs. reliable). Attributes fed into the submodels include normalized word frequencies (e.g. TF-IDF), lexical cues, and distributions of word sentiment severity. The formatting of the PowerPoint may have been somewhat distorted in a conversion process. The key source for most of the compiled dataset was several27's excellent FakeNewsCorpus at https://github.com/several27/FakeNewsCorpus
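The TF-IDF-plus-ensemble idea described above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the study's actual pipeline: the toy texts and labels are placeholders, and the submodels here (logistic regression and naive Bayes in a soft-voting ensemble) are assumed stand-ins for whatever the study combined.

```python
# Minimal sketch: TF-IDF word-frequency features feeding a soft-voting
# ensemble classifier. Toy texts/labels below are placeholders, not the
# study's data; 1 = unreliable, 0 = reliable.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline

texts = [
    "shocking secret the media won't tell you",
    "officials confirmed the budget figures on tuesday",
    "miracle cure doctors hate revealed",
    "the committee released its quarterly report",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # normalized word-frequency attributes
    VotingClassifier(
        estimators=[("lr", LogisticRegression()),
                    ("nb", MultinomialNB())],
        voting="soft",                      # average class probabilities
    ),
)
clf.fit(texts, labels)
preds = clf.predict(texts)
```

A real run would add the lexical-cue and sentiment-severity submodels the description mentions as further estimators in the ensemble.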
Process improvement A/B study for Stairstep Consulting.
An exploration of the predictive importance of individual pixels in a deep convolutional neural network using SHAP values. Neural Network architecture inspired by VGG16. Image classification on the Intel Scene Classification dataset available at https://www.kaggle.com/nitishabharathi/scene-classification.
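The project above uses SHAP values; as a simpler stand-in for the same pixel-importance idea, here is an occlusion-sensitivity sketch in plain NumPy (a cruder cousin of SHAP: mask a patch, measure the score drop). The "model" is a toy scoring function, not the VGG16-style network, and all sizes are illustrative.

```python
# Occlusion-sensitivity sketch of pixel importance: slide a masking patch
# over the image and record how much the model's score drops. toy_score
# stands in for a trained classifier's confidence.
import numpy as np

def toy_score(img):
    # Stand-in classifier score: responds to the bright centre region.
    return float(img[8:16, 8:16].mean())

rng = np.random.default_rng(0)
image = rng.random((24, 24)) * 0.1
image[8:16, 8:16] += 0.9           # bright "object" the score depends on

base = toy_score(image)
patch, stride = 4, 4
heatmap = np.zeros((24 // stride, 24 // stride))
for i in range(0, 24, stride):
    for j in range(0, 24, stride):
        occluded = image.copy()
        occluded[i:i + patch, j:j + patch] = 0.0   # mask this patch
        heatmap[i // stride, j // stride] = base - toy_score(occluded)

# Patches overlapping the bright centre produce the largest score drops.
```

SHAP generalizes this by averaging over many masking patterns with game-theoretic weights, which is what makes the per-pixel attributions well-founded.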
A series of projects all attempting to link customer traits/actions to target behavior. Unsupervised methods including KMeans clustering and Principal Component Analysis are used for Customer Segmentation. Machine Learning models such as XGBoost, RandomForest, SVMs, and Deep Neural Networks are used to predict customer behavior. Datasets are generally from banks or markets.
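The unsupervised segmentation step described above (PCA to compress features, then KMeans to form customer segments) can be sketched with scikit-learn. Synthetic data stands in for the bank/market datasets, and the feature columns are invented for illustration.

```python
# Minimal customer-segmentation sketch: scale features, compress with PCA,
# cluster with KMeans. Two synthetic customer groups with different
# (made-up) profiles: [age, monthly_spend, visits].
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
group_a = rng.normal(loc=[20, 500, 2], scale=[5, 50, 1], size=(50, 3))
group_b = rng.normal(loc=[60, 100, 9], scale=[5, 50, 1], size=(50, 3))
X = np.vstack([group_a, group_b])

X_scaled = StandardScaler().fit_transform(X)        # features on one scale
X_2d = PCA(n_components=2).fit_transform(X_scaled)  # compress to 2 components
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_2d)
```

The supervised models mentioned (XGBoost, random forests, SVMs, deep networks) would then take these segment labels or the raw features as input to predict the target behavior.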
Experiment in speech recognition on Google's Speech Command Dataset using TensorFlow/Keras. 88%–89% validation accuracy achieved classifying spoken digits (zero through nine) using an MFCC transformation and a deep CNN. Work in progress; a couple of preprocessing functions are credited in the code as borrowed.
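The MFCC transformation used as the CNN's input can be sketched in plain NumPy: frame the signal, take a power spectrum, pool through a triangular mel filterbank, log, then DCT. This is not the repo's code (which would typically call librosa or tf.signal), and all parameters below are illustrative defaults.

```python
# MFCC feature-extraction sketch in plain NumPy. A 440 Hz tone stands in
# for a spoken-digit recording.
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    # Frame the waveform and window each frame.
    frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
    power = np.abs(np.fft.rfft(frames * np.hamming(n_fft), axis=1)) ** 2

    # Triangular mel filterbank spanning 0..sr/2.
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)

    log_mel = np.log(power @ fbank.T + 1e-10)

    # DCT-II decorrelates the filterbank energies; keep the first n_ceps.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T       # shape: (num_frames, n_ceps)

t = np.linspace(0, 1, 16000, endpoint=False)     # one second at 16 kHz
feats = mfcc(np.sin(2 * np.pi * 440 * t))
```

The resulting (frames × coefficients) matrix is what gets stacked into the image-like input the deep CNN classifies.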
A neural network that takes as input a sequence of 60 nitrogenous bases (DNA) and predicts whether the sequence contains an intron/exon boundary (IE), an exon/intron boundary (EI), or neither (N). A maximum validation accuracy of 96.24% was reached. Data obtained at https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+%28Splice-junction+Gene+Sequences%29
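The input encoding implied above can be sketched simply: each of the 60 nitrogenous bases is one-hot encoded over (A, C, G, T), giving a 60×4 matrix per sequence that a network can consume. The sequence below is a made-up placeholder, not data from the UCI splice-junction set.

```python
# One-hot encoding of a 60-base DNA sequence into a (60, 4) matrix.
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    idx = [BASES.index(b) for b in seq]
    return np.eye(len(BASES))[idx]       # one row per base, one column per letter

seq = "ACGT" * 15                        # placeholder 60-base sequence
encoded = one_hot(seq)
```

(The real UCI data also contains ambiguity codes like N and D, which would need extra handling.)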
This study creates and compares the efficacy of several machine learning models for predicting whether an undergraduate student offered admission to Syracuse University will accept the offer. The dataset is proprietary and cannot be shared.
A Bi-Directional LSTM with Neural Attention and word embeddings. Tackles the difficult problem of Textual Entailment using the Stanford Natural Language Inference (SNLI) corpus. Demonstrates that a 3-class validation accuracy of 76%+ can be obtained on the corpus without resorting to pre-training or recursion/trees. Concept pioneered in "Reasoning about Entailment with Neural Attention" by Rocktäschel et al. Inspiration taken from https://github.com/shyamupa/snli-entailment. Please find data corpus at https://nlp.stanford.edu/projects/snli/
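The attention step at the heart of that architecture can be illustrated with a toy NumPy sketch: the hypothesis representation attends over the premise's per-word states, producing a weighted summary. The vectors here are random stand-ins; in the real model they come from the learned bidirectional LSTM, and the scoring function is parameterized rather than a plain dot product.

```python
# Toy dot-product attention over premise word states, conditioned on the
# hypothesis vector. Random vectors stand in for learned LSTM states.
import numpy as np

rng = np.random.default_rng(0)
premise_states = rng.normal(size=(7, 32))   # 7 premise words, 32-dim states
hypothesis_vec = rng.normal(size=(32,))     # final hypothesis state

scores = premise_states @ hypothesis_vec    # one relevance score per word
weights = np.exp(scores - scores.max())
weights /= weights.sum()                    # softmax attention weights
summary = weights @ premise_states          # attended premise summary
```

The 3-way entailment decision (entailment / contradiction / neutral) is then made from this summary combined with the hypothesis representation.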
A 2-input Convolutional Neural Network with word embeddings. Tackles the difficult problem of Textual Entailment using the Stanford Natural Language Inference (SNLI) corpus. Demonstrates that a 3-class validation accuracy of 73%+ can be obtained on the corpus without resorting to pre-training, recursion/trees, attention, or LSTM/RNNs. Please find data corpus at https://nlp.stanford.edu/projects/snli/