Raktim Mukhopadhyay's Projects
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Packet Tracer is a cross-platform visual simulation tool designed by Cisco Systems that allows users to create network topologies and imitate modern computer networks. The software allows users to simulate the configuration of Cisco routers and switches using a simulated command line interface.
Classification of CGM Timeseries data into Meal or No-Meal Data. This is a continuation of the Feature Extraction from CGM Timeseries Data project.
A Convolutional Neural Network (CNN) is a deep learning algorithm that takes an input, assigns weights to various features, and learns to separate the inputs into categories or classes. The architecture of a CNN is analogous to the connectivity pattern of neurons in the human brain and was heavily inspired by the visual cortex. In this project the baseline model has the following features - 1. The first convolution layer has 6 feature maps and 3x3 convolution kernels. This layer uses a stride of 1. Stride denotes the number of pixels by which the filter shifts over the input matrix; when the stride is 1, the filters move 1 pixel at a time. 2. The convolution layer is followed by a max pooling layer. The pooling is 2x2 with a stride of 1. 3. The max pooling layer is connected to the next convolution layer, which has 16 feature maps. The kernel size is 3x3 and this layer also uses a stride of 1. 4. The second convolution layer is followed by a max pooling layer. The pooling is 2x2 with a stride of 1. 5. The max pooling layer is fully connected to the next hidden layer, which has 120 nodes and ReLU as the activation function. 6. This fully connected layer is followed by another fully connected layer with 84 nodes and ReLU as the activation function, which is then connected to a SoftMax layer with 10 output nodes corresponding to the 10 classes/categories present in the data.
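The layer stack described above fixes the size of every intermediate feature map once an input size is chosen. The sketch below traces those sizes in plain Python. The 28x28 single-channel input and the absence of padding are assumptions (the description states neither), and the names `out_size` and `baseline_shapes` are hypothetical helpers, not part of the project.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


def baseline_shapes(height=28, width=28):
    """Trace (name, channels, height, width) through the two conv/pool pairs.

    Assumes one input channel and no padding; neither is stated in the text.
    """
    # (layer name, kernel size, number of feature maps or None for pooling)
    layers = [("conv1", 3, 6), ("pool1", 2, None),
              ("conv2", 3, 16), ("pool2", 2, None)]
    channels, h, w = 1, height, width
    shapes = []
    for name, kernel, maps in layers:
        # Every layer in the description uses stride 1.
        h, w = out_size(h, kernel), out_size(w, kernel)
        if maps is not None:
            channels = maps  # pooling keeps the channel count unchanged
        shapes.append((name, channels, h, w))
    # Number of inputs seen by the first fully connected layer (120 nodes).
    return shapes, channels * h * w


shapes, flat = baseline_shapes()
for name, c, h, w in shapes:
    print(f"{name}: {c} x {h} x {w}")
print("flattened:", flat)  # 16 * 22 * 22 = 7744 for a 28x28 input
```

Because both pooling layers use stride 1, each shrinks the map by only one pixel per side, so the flattened vector feeding the 120-node layer stays large under these assumptions.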
Bootstrap components for Plotly Dash
Apps hosted in the Dash Gallery
1. Feature Extraction from the provided dataset 2. Parameter Estimation for the Normal Distribution 3. Calculating the Model Parameters for Naïve-Bayes Classifier and Logistic Regression 4. Train the Naïve Bayes Classifier using the training data for classifying the test data 5. Train the Logistic Regression Classifier for classifying the test data 6. Calculating the classification accuracy for Naïve-Bayes and Logistic Regression Classifier
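Steps 2 to 4 above can be illustrated with a small Gaussian Naïve Bayes sketch in plain Python: a mean and variance are estimated per class and per feature, and a test point is assigned to the class with the highest log posterior. The toy data, the variance floor, and the function names are assumptions made for illustration; the logistic regression part is not shown.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-feature means and variances for each class."""
    params = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Maximum-likelihood variance; the small floor (an assumption)
        # avoids division by zero for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        params[label] = (n / len(X), means, variances)
    return params


def predict(params, x):
    """Return the class with the largest log posterior for feature vector x."""
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(params, key=lambda label: log_posterior(*params[label]))


# Hypothetical two-feature training set with two well-separated classes.
X_train = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
y_train = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X_train, y_train)
X_test, y_test = [[1.1, 2.0], [3.1, 0.5]], [0, 1]
accuracy = sum(predict(model, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print("accuracy:", accuracy)
```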
This is a continuation of the projects on CGM Timeseries Data.
Developed a Fraud Detection Framework for Financial Payment Services on an imbalanced synthetic financial dataset generated by PaySim, containing over 6.5 million financial transactions, using Logistic Regression, Decision Tree, Naive Bayes, Random and KNN. The Recall values of Naïve Bayes, Decision Tree, Logistic Regression and KNN are 0.40, 0.66, 0.72 and 0.76 respectively. The AU-ROC values are 0.87, 0.91, 0.95 and 0.93 respectively.
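On an imbalanced dataset, recall and AU-ROC say more than accuracy, because a model that flags nothing can still be mostly "correct". The sketch below shows how both metrics can be computed from first principles. The labels and scores are made-up illustration values, not PaySim data, and the helper names are hypothetical.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives (fraud) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """AU-ROC as the probability that a random positive outscores a random
    negative (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = fraud) and classifier scores for ten transactions.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
scores = [0.1, 0.2, 0.05, 0.3, 0.15, 0.6, 0.9, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)               # 0.8
print("recall:", recall(y_true, y_pred))   # 0.5
print("AU-ROC:", roc_auc(y_true, scores))  # 0.9375
```

In this toy example the accuracy looks good even though half of the fraud cases are missed, which is why recall is reported alongside it.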
A basic sample application using Python with devfile
Extracted four different types of features from the time series data provided: i. Peaks using the Fast Fourier Transform ii. CGM Velocity iii. Autocorrelation iv. Polyfit Regression Coefficients
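Three of these feature types can be sketched with the standard library alone. The code below uses a naive discrete Fourier transform in place of an FFT (same result, slower), defines velocity as the first difference between consecutive readings, and normalises the autocorrelation by the series variance. These definitions, the function names, and the sample readings are all assumptions; the polynomial-fit coefficients are omitted here.

```python
import cmath


def velocity(cgm):
    """First difference between consecutive glucose readings."""
    return [b - a for a, b in zip(cgm, cgm[1:])]


def autocorrelation(cgm, lag):
    """Autocorrelation at the given lag, normalised so that lag 0 gives 1."""
    n = len(cgm)
    mean = sum(cgm) / n
    var = sum((x - mean) ** 2 for x in cgm)
    if var == 0:
        return 0.0
    cov = sum((cgm[i] - mean) * (cgm[i + lag] - mean) for i in range(n - lag))
    return cov / var


def dft_magnitudes(cgm):
    """Magnitudes of the non-negative frequency bins (naive O(n^2) DFT)."""
    n = len(cgm)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(cgm)))
            for k in range(n // 2 + 1)]


def top_peaks(cgm, count=2):
    """Largest spectral magnitudes, skipping the DC (mean) component."""
    return sorted(dft_magnitudes(cgm)[1:], reverse=True)[:count]


# Hypothetical CGM readings for one short window.
window = [110, 115, 123, 140, 152, 160, 155, 148]
features = top_peaks(window) + [max(velocity(window)), autocorrelation(window, 1)]
print(features)
```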
Data is the new fuel. And with the introduction of Big Data systems, we have been able to extract more and more useful information from large data files. In this project, we work with Geo-Spatial data. Geo-Spatial Data is about objects, events, or phenomena that have a location on the surface of the earth. Geo-Spatial data combines location information (usually coordinates on the earth), attribute information (the characteristics of the object, event, or phenomena concerned), and often also temporal information (the time or life span at which the location and attributes exist). Geo-Spatial databases support a special type of query known as Spatial Query that allows the use of points, lines, and polygons. This project aims at developing and running multiple such spatial queries on a large database containing geographic data and real-time location data of customers obtained from a peer-to-peer taxicab firm.
1. Implement a strategy (STRATEGY-1) in which the initial centroids are picked randomly from the given dataset 2. Implement a strategy (STRATEGY-2) in which the first centroid is picked randomly and, for the i-th centroid (i > 1), the data sample whose mean distance to all previous (i-1) centers is maximal is chosen 3. Run STRATEGY-1 and STRATEGY-2 for K = 2 to 10 4. Calculate the objective function $\sum_{i=1}^{k} \sum_{x \in D_i} \lVert x - \mu_i \rVert^2$ for K = 2 to 10 5. Plot the values of the objective function (WCSS) against the number of clusters (K)
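The two initialisation strategies and the objective function can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: Euclidean distance, points given as tuples, a fixed iteration cap, and a tiny made-up dataset; all function names are hypothetical, and plotting is left out.

```python
import math
import random


def init_random(points, k, rng):
    """Strategy 1: pick k distinct samples at random as initial centroids."""
    return rng.sample(points, k)


def init_farthest(points, k, rng):
    """Strategy 2: random first centroid, then repeatedly add the sample
    whose mean distance to all previously chosen centers is largest."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        candidates = [p for p in points if p not in centers]
        centers.append(max(
            candidates,
            key=lambda p: sum(math.dist(p, c) for c in centers) / len(centers)))
    return centers


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centers, max_iters=100):
    """Lloyd iterations until the centers stop moving (or the cap is hit)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iters):
        clusters = assign(points, centers)
        updated = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def wcss(centers, clusters):
    """Objective: sum of squared distances from each point to its center."""
    return sum(math.dist(p, c) ** 2 for c, cl in zip(centers, clusters) for p in cl)


# Hypothetical 2-D dataset with two well-separated groups.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
rng = random.Random(0)
for k in range(2, 5):
    for name, init in (("strategy 1", init_random), ("strategy 2", init_farthest)):
        centers, clusters = kmeans(data, init(data, k, rng))
        print(k, name, round(wcss(centers, clusters), 4))
```

Collecting the printed WCSS values per K gives the series that step 5 plots against the number of clusters.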
This is a continuation of the projects on CGM Timeseries Data.
An R package implementing the NetEMD and NetDis network comparison measures
This repository contains all assignments that are part of the course offered by Coursera.
Python packaging and dependency management made easy
Example of how to build with pybind11 in poetry
QuadratiK includes tests for multivariate normality, tests for uniformity on the sphere, non-parametric two- and k-sample tests, random generation of points from the Poisson kernel-based density, and a clustering algorithm for spherical data.
Seamless R and C++ Integration
Exploited the Microsoft Windows SMBv1 vulnerability (MS17-010) and the Vulnerability in Server Service (MS08-067), and compromised a victim PC using PowerShell alphanumeric shellcode injection.
scikit-learn: machine learning in Python
Multilingual Sentence & Image Embeddings with BERT
What the Package Does (One Line, Title Case)
A personal gallery of streamlit apps and components.