pokemaster720 / hw2q3

HW2Q3

Data Representation
The table_1 and table_2 variables hold the training and validation datasets, respectively. Each record (tuple) in these tables has four attributes:

Education level (e.g., College, High School)
Job sector (e.g., Management, Service)
Experience (e.g., Less than 3 years, More than 10 years)
A class label indicating a certain category (e.g., High, Low)
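
As a rough illustration of this layout, the sketch below builds two small tables in the same four-field shape. The rows are made up for the example and are not the repository's data; only the names table_1 and table_2 come from the description above.

```python
# Hypothetical rows for illustration only; the repository's actual data may differ.
# Each tuple is (education level, job sector, experience, class label).
table_1 = [
    ("College", "Management", "More than 10 years", "High"),
    ("High School", "Service", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]
table_2 = [
    ("High School", "Management", "Less than 3 years", "Low"),
]

# The first three fields are features; the fourth is the class label.
features = [row[:3] for row in table_1]
labels = [row[3] for row in table_1]
print(labels)  # ['High', 'Low', 'High']
```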

Functions
calculate_probability:
- Calculates the conditional probability of a feature value given a class.
- Applies Laplace smoothing to handle zero frequencies by adding 1 to the feature count and 2 to the class count.
- This ensures every feature value has a non-zero probability.
- Parameters:
  - feature_value - the specific value of the feature
  - feature_index - the position of the feature within an instance tuple
  - class_value - the class for which the probability is calculated
  - data - the dataset used for computing probabilities
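
A minimal sketch of how such a function might look, assuming the class label sits at index 3 of each tuple and using the parameter names listed above. The toy_data rows are hypothetical, and the body is a reconstruction from the description rather than the repository's code.

```python
CLASS_INDEX = 3  # assumption: the class label is the fourth field of each tuple


def calculate_probability(feature_value, feature_index, class_value, data):
    """Smoothed estimate of P(feature == feature_value | class == class_value)."""
    class_rows = [row for row in data if row[CLASS_INDEX] == class_value]
    feature_count = sum(
        1 for row in class_rows if row[feature_index] == feature_value
    )
    # Smoothing as described: +1 on the feature count, +2 on the class count.
    return (feature_count + 1) / (len(class_rows) + 2)


# Hypothetical rows for illustration only.
toy_data = [
    ("College", "Management", "More than 10 years", "High"),
    ("High School", "Service", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]

p_seen = calculate_probability("College", 0, "High", toy_data)
p_unseen = calculate_probability("College", 0, "Low", toy_data)
print(p_seen)    # 0.75, i.e. (2 + 1) / (2 + 2)
print(p_unseen)  # (0 + 1) / (1 + 2); non-zero although "College" never occurs with "Low"
```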
calculate_class_probability:
- Computes the probability of each class within the dataset, applying Laplace smoothing by adding 1 to the class count and 2 to the overall dataset length.
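
The class probability can be sketched the same way. The signature (class_value, data) is an assumption, as is the class label living at index 3; the rows are hypothetical.

```python
CLASS_INDEX = 3  # assumption: the class label is the fourth field of each tuple


def calculate_class_probability(class_value, data):
    """Smoothed estimate of P(class == class_value)."""
    class_count = sum(1 for row in data if row[CLASS_INDEX] == class_value)
    # Smoothing as described: +1 on the class count, +2 on the dataset length.
    return (class_count + 1) / (len(data) + 2)


# Hypothetical rows for illustration only.
toy_data = [
    ("College", "Management", "More than 10 years", "High"),
    ("High School", "Service", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]

p_high = calculate_class_probability("High", toy_data)
p_low = calculate_class_probability("Low", toy_data)
print(p_high)  # 0.6, i.e. (2 + 1) / (3 + 2)
print(p_low)   # 0.4, i.e. (1 + 1) / (3 + 2)
```

With exactly two classes, adding 2 to the denominator keeps the smoothed class probabilities summing to 1.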
classify:
- Given an instance and a dataset, calculates the probability of the instance belonging to each class.
  - Multiplies the class probability by the conditional probabilities of all feature values given the class.
  - Returns the class with the highest probability as the predicted class.
- Parameters:
  - instance - the instance to classify
  - dataset - the dataset used for computing probabilities
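
Putting the pieces together, a self-contained sketch of classify might look like the following. The two helpers are repeated so the block runs on its own; their bodies, the way classes are collected from the dataset, and the toy rows are all assumptions made for illustration.

```python
CLASS_INDEX = 3   # assumption: the class label is the fourth field of each tuple
NUM_FEATURES = 3  # the description mentions range(3) in the classify loop


def calculate_probability(feature_value, feature_index, class_value, data):
    class_rows = [row for row in data if row[CLASS_INDEX] == class_value]
    count = sum(1 for row in class_rows if row[feature_index] == feature_value)
    return (count + 1) / (len(class_rows) + 2)


def calculate_class_probability(class_value, data):
    class_count = sum(1 for row in data if row[CLASS_INDEX] == class_value)
    return (class_count + 1) / (len(data) + 2)


def classify(instance, dataset):
    """Return the class with the highest smoothed Naive Bayes score."""
    scores = {}
    for class_value in sorted({row[CLASS_INDEX] for row in dataset}):
        score = calculate_class_probability(class_value, dataset)
        for i in range(NUM_FEATURES):
            score *= calculate_probability(instance[i], i, class_value, dataset)
        scores[class_value] = score
    return max(scores, key=scores.get)


# Hypothetical rows for illustration only.
toy_data = [
    ("College", "Management", "More than 10 years", "High"),
    ("High School", "Service", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]

first = classify(("College", "Management", "More than 10 years"), toy_data)
second = classify(("High School", "Service", "Less than 3 years"), toy_data)
print(first)   # High
print(second)  # Low
```

The scores are unnormalized products, so they are only meaningful relative to each other; that is enough to pick the highest-scoring class.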
Loop through Validation Data:
- Iterates over each instance in the validation dataset (table_2), classifies it using the classify function, and prints the result.
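
The loop itself can be sketched as below. To keep the block short, the classifier is passed in as a function and a trivial majority-class stand-in is used for the demo; it is not the Naive Bayes classify function described above. The helper names run_validation and majority_class, the choice of table_1 as the dataset passed to the classifier, and the rows are all hypothetical.

```python
def run_validation(validation_rows, training_rows, classify_fn):
    """Classify each validation row, print the result, and return the predictions."""
    predictions = []
    for instance in validation_rows:
        predicted = classify_fn(instance, training_rows)
        print(f"{instance[:3]} -> {predicted}")
        predictions.append(predicted)
    return predictions


def majority_class(instance, training_rows):
    """Demo stand-in only: always predicts the most common training label."""
    labels = [row[3] for row in training_rows]
    return max(set(labels), key=labels.count)


# Hypothetical rows for illustration only.
table_1 = [
    ("College", "Management", "More than 10 years", "High"),
    ("High School", "Service", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]
table_2 = [
    ("High School", "Management", "Less than 3 years", "Low"),
    ("College", "Service", "More than 10 years", "High"),
]

predictions = run_validation(table_2, table_1, majority_class)
```

In practice, classify_fn would be the real classify function, called with the training table as its dataset.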
Classifier Logic
- Uses a Naive Bayes approach.
- Applies Laplace smoothing to prevent multiplication by zero when calculating probabilities.
- Predicts a class by multiplying the class probability by the likelihoods of all features given that class, then selecting the class with the highest resulting probability.
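
Written as a formula, the logic above amounts to roughly the following decision rule. The symbols are introduced here for illustration: N is the number of training records, N_c the number with class c, and N_{c,x_i} the number with class c whose i-th feature equals x_i.

```latex
\hat{c} \;=\; \arg\max_{c} \;
  \frac{N_c + 1}{N + 2}
  \prod_{i=1}^{3} \frac{N_{c,x_i} + 1}{N_c + 2}
```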
Key Aspects and Considerations:
- Laplace smoothing is critical in this implementation: it ensures that feature values unseen in the training data don't zero out an entire class's probability.
- Only the first three features of each instance are considered during classification, as indicated by range(3) in the loop within the classify function.
- The classifier assumes that all features carry equal weight and are independent of one another given the class label.
