itrauco / nlp-vit-bot

This project forked from bodhwani/nlp-vit-bot

Bot to be embedded in VIT's official website.

License: Apache License 2.0

JavaScript 2.70% Python 1.82% CSS 51.36% Makefile 0.13% HTML 3.02% Jupyter Notebook 40.92% Dockerfile 0.04%

nlp-vit-bot's Introduction

Capstone Project - VITChatbot [English]

A Rasa-based chatbot that combines a custom NLU pipeline with the standard Rasa Core to answer questions about resources related to VIT University.

Complete Project Report

Workflow

Features

NLU Pipeline

The pipeline consists of the following stages:

  • Tokenization - Splitting the input into smaller lexical units
  • Acronym replacement - Expanding popular neologisms, slang and acronyms
  • Spelling correction - Correcting spelling to prevent loss of information/context
  • Lemmatization - Reducing inflected word forms to a common base form
  • Vectorization - Word-to-vector conversion
  • Intent classification - Convolutional neural networks match the input to an intent
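
The preprocessing stages above can be sketched in plain Python. This is only an illustration: the lookup tables, function names and the bag-of-words vectorizer are assumptions made for the example, not the project's actual code, and the neural intent classifier is left out.

```python
import re
from collections import Counter

# Hypothetical lookup tables; the project's real tables are not shown in this README.
ACRONYMS = {"pg": "postgraduate", "ug": "undergraduate"}
SPELLING = {"admision": "admission", "hostle": "hostel"}
LEMMAS = {"fees": "fee", "courses": "course", "clubs": "club"}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def replace_acronyms(tokens):
    """Expand known acronyms; an expansion may yield several tokens."""
    expanded = []
    for token in tokens:
        expanded.extend(ACRONYMS.get(token, token).split())
    return expanded


def correct_spelling(tokens):
    """Replace known misspellings via a lookup table (toy stand-in for a spell checker)."""
    return [SPELLING.get(token, token) for token in tokens]


def lemmatize(tokens):
    """Map inflected forms to a base form via a lookup table (toy stand-in for a lemmatizer)."""
    return [LEMMAS.get(token, token) for token in tokens]


def preprocess(text):
    """Run the stages in the order listed in the README."""
    tokens = tokenize(text)
    tokens = replace_acronyms(tokens)
    tokens = correct_spelling(tokens)
    return lemmatize(tokens)


def vectorize(tokens, vocabulary):
    """Bag-of-words counts over a fixed vocabulary (assumed stand-in for word vectors)."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


if __name__ == "__main__":
    tokens = preprocess("What are the PG admision fees?")
    print(tokens)
    print(vectorize(tokens, ["admission", "fee", "hostel"]))
```

In a real pipeline each table lookup would be replaced by a proper component, for example a lemmatizer from NLTK, and the resulting vector would be passed to the intent classifier.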

Fallback/Sentiment Analysis

  • If the queried intent does not match any response in the database, a Quora search is made and the best response is displayed.
  • Sentiment analysis is run on the retrieved responses to avoid displaying sensitive or highly negative sentences.
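
A minimal sketch of the filtering step described above, assuming the fallback search returns candidate answers ranked best-first. The word lists, the threshold and the function names are hypothetical; the scorer is a toy lexicon count standing in for a real sentiment library such as TextBlob, which is listed under Dependencies.

```python
# Hypothetical word lists; a real system would use a sentiment library instead.
NEGATIVE_WORDS = {"terrible", "hate", "worst", "awful"}
POSITIVE_WORDS = {"good", "great", "helpful", "best"}


def toy_polarity(text):
    """Return a score in [-1, 1]: (positive - negative) / number of matched words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    matched = positive + negative
    return 0.0 if matched == 0 else (positive - negative) / matched


def pick_fallback(candidates, polarity=toy_polarity, min_polarity=-0.3):
    """Return the first candidate whose polarity is acceptable, or None if all are rejected."""
    for answer in candidates:
        if polarity(answer) >= min_polarity:
            return answer
    return None


if __name__ == "__main__":
    ranked = [
        "This is the worst college, I hate it.",
        "The hostel is good and the staff are helpful.",
    ]
    print(pick_fallback(ranked))
```

Passing the scorer as a parameter keeps the filter independent of any one sentiment library; returning None lets the caller fall back to a default apology message.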

User Interface

  • Clean, plain UI with minimal buttons.

Feedback/Rating

  • A rating option in the UI lets users convey their level of satisfaction with the product.
  • The users also have an option to report bugs and unexpected erroneous scenarios by attaching a snapshot of the conversation.

Misc. Features

  • Speech and Voice assist: As an optional mode of communication, we use Speech-to-Text and Text-to-Speech to provide a more lively and convenient interface.
  • Socket communication: The user inputs are relayed via sockets to the backend server running the core chatbot framework.
  • Natural Language Processing: We use a powerful set of language processing tools to recognize and disambiguate the meaning of the user’s input. These processes include tokenization, lemmatization, POS tagging etc.
  • Deep Learning: We use convolutional neural networks to train the classifiers that discern the user's intent and suggest suitable responses.
  • Chatbot response: The chatbot replies to the input query either by providing relevant text and links or by asking further questions to clarify the context.
  • Expandable scenarios: The conversational flows are designed so that they can be expanded to cover new scenarios. Adding conversations is quick, easy and modular for the developer.
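
The socket relay mentioned above could look roughly like the following. The README does not say which socket library or message format the project uses, so this sketch assumes Python's standard `socket` module and newline-terminated JSON messages; `handle_query` is a hypothetical placeholder for the call into the chatbot core.

```python
import json
import socket


def send_message(sock, payload):
    """Send one JSON object terminated by a newline."""
    sock.sendall(json.dumps(payload).encode("utf-8") + b"\n")


def recv_message(sock):
    """Read until a newline and decode the JSON object before it.

    Assumes a strict request/response exchange with one message in flight.
    """
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a full message arrived")
        buffer += chunk
    return json.loads(buffer.decode("utf-8"))


def handle_query(payload):
    """Hypothetical backend handler; a real one would call the chatbot core."""
    return {"reply": "You asked: " + payload["text"]}


if __name__ == "__main__":
    # A connected socket pair stands in for the UI client and the backend server.
    client, server = socket.socketpair()
    send_message(client, {"text": "Where is the library?"})
    send_message(server, handle_query(recv_message(server)))
    print(recv_message(client))
    client.close()
    server.close()
```

Newline framing is used here because raw TCP sockets deliver a byte stream without message boundaries; a WebSocket-based setup would get framing from the protocol instead.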

Dependencies

  • Beautiful Soup: Beautiful Soup provides a few simple methods and Python idioms for navigating, searching, and modifying a parse tree using Python parsers like lxml and html5lib. It automatically converts incoming documents to Unicode and outgoing documents to UTF-8.
  • TextBlob: TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
  • Pandas: pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labelled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python.
  • NumPy: NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
  • Anaconda: Anaconda is the world’s most popular Python data science platform. Anaconda, Inc. continues to lead open source projects like Anaconda, NumPy and SciPy that form the foundation of modern data science. Anaconda’s flagship product, Anaconda Enterprise, allows organizations to secure, govern, scale and extend Anaconda to deliver actionable insights that drive businesses and industries forward.
  • NLTK: NLTK is a leading platform for building Python programs to work with human language data.

Version Change Log

  • v1.0.0 Added Academics and sub-branches (UG, PG, Research, International/IR), added intent for International Relations
  • v1.1.0 New file structure for NLU training data and stories under data/nlu/ and data/stories/, updated Make commands, Campus Life (Fests, Clubs)
  • v1.2.0 Added Academics section.
  • v1.2.0 Campus Life(Chapters)
  • v1.2.1 Stopword temp fix.
  • v1.3.1 Dockerfile added and deployed successfully.
  • v1.4.0 Chatbot-UI
  • v1.4.1 Campus life story fix.
  • v1.5.0 Hyperlinking and newline formatting, with the respective UI changes
  • v1.5.1 Bug fix [Remove common words in the dataset]
  • v1.6.0 Search in quora added
  • v1.6.1 Bug fix [Removed affirmations]
  • v1.7.0 Added Sentimental Analysis.
  • v1.8.0 Custom NLU Pipeline added

Note:

Run "make model_remove" after making changes to the dataset. Run "make train-resetall" to remove existing models and train new NLU and Core models.

nlp-vit-bot's People

Contributors

bodhwani, dependabot[bot], karth2512
