
A bot which uses natural language processing to schedule events and appointments, with a unique Singaporean twist. Built using Python NLTK.

License: MIT License

Python 100.00%
bot telegram-bot nltk-python natural-language-processing scheduler singapore

DontForgetAh Bot

An easy-to-use Telegram bot which uses natural language processing to schedule events and appointments, with a unique Singaporean twist.


About

DontForgetAh is a scheduler bot that recognises text messages and schedules appointments accordingly. The bot caters to Singaporean users and gives instructions and prompts in Singlish. It can create reminders for one-off or repeating events and lets users choose how far in advance each reminder should be sent. Users can also attach an optional note to a reminder.



Project Details

The following is a demonstration of how user input is processed with the help of Python NLTK.

$ pip install nltk

import nltk
from nltk.tokenize import word_tokenize



Given input from the user, we first use tokenisation to split up the words for analysis.

user_input = "remind me to go to work at 8am next mon"  # renamed to avoid shadowing the built-in input()
tokenised_input = word_tokenize(text=user_input)
# tokenised_input: ['remind', 'me', 'to', 'go', 'to', 'work', 'at', '8am', 'next', 'mon']



Next, perform POS Tagging to categorise and label the parts of speech each word belongs to in the sentence. This will help in the separation of the event title from the rest of the details.

tagged = nltk.pos_tag(tokenised_input)
# tagged: [('remind', 'VB'), ('me', 'PRP'), ('to', 'TO'), ('go', 'VB'), ('to', 'TO'), ('work', 'VB'), ('at', 'IN'), ('8am', 'CD'), ('next', 'JJ'), ('mon', 'NN')]



The following procedure is neither foolproof nor flawless, but it works well enough for most ordinary inputs. If a single verb ('VB') appears in the sentence, the program treats it as part of the event title. If more than two verbs appear, the first verb is regarded as redundant and is excluded from the event title. In the example above, 'remind' is dropped and the text is sliced starting from 'go'.
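The verb rule can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the function name title_start_index is hypothetical, and the behaviour for zero or exactly two verbs is an assumption, since the README only describes the one-verb and more-than-two-verbs cases.

```python
def title_start_index(tagged):
    """Return the index at which the event title is assumed to begin.

    `tagged` is a list of (word, tag) pairs as returned by nltk.pos_tag.
    """
    verb_positions = [i for i, (_, tag) in enumerate(tagged) if tag == "VB"]
    if not verb_positions:
        # Assumption: with no verb at all, keep the whole sentence.
        return 0
    if len(verb_positions) > 2:
        # More than two verbs: the first one is treated as redundant,
        # so the title starts at the second verb.
        return verb_positions[1]
    # One verb (and, by assumption, two verbs): start at the first verb.
    return verb_positions[0]


tagged = [('remind', 'VB'), ('me', 'PRP'), ('to', 'TO'), ('go', 'VB'),
          ('to', 'TO'), ('work', 'VB'), ('at', 'IN'), ('8am', 'CD'),
          ('next', 'JJ'), ('mon', 'NN')]
start = title_start_index(tagged)
print([word for word, _ in tagged[start:]])
# ['go', 'to', 'work', 'at', '8am', 'next', 'mon']
```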

The program then looks for prepositions ('IN') or cardinal numbers ('CD'), which mark the end of the event title. It also checks each word against a predeclared list of stop words to find time-related keywords such as next, tmr or tues, and records the index of the first match. The text is sliced at this index to obtain the event title. The remaining words contain the details needed to set the time.
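A rough sketch of this slicing step is shown below. The names split_title, END_TAGS and TIME_KEYWORDS are hypothetical, and the keyword set contains only the three examples mentioned in this README; the bot's real list is presumably longer.

```python
# Hypothetical stop-word list: only the keywords named in the README.
TIME_KEYWORDS = {"next", "tmr", "tues"}
# POS tags that are taken to mark the end of the event title.
END_TAGS = {"IN", "CD"}


def split_title(tagged, start=0):
    """Split tagged tokens into (title, remaining detail words).

    The title runs from `start` up to the first token that is a
    preposition, a cardinal number, or a time-related keyword.
    """
    end = len(tagged)
    for i in range(start, len(tagged)):
        word, tag = tagged[i]
        if tag in END_TAGS or word.lower() in TIME_KEYWORDS:
            end = i
            break
    title = " ".join(word for word, _ in tagged[start:end])
    details = [word for word, _ in tagged[end:]]
    return title, details


example = [('remind', 'VB'), ('me', 'PRP'), ('to', 'TO'), ('go', 'VB'),
           ('to', 'TO'), ('work', 'VB'), ('at', 'IN'), ('8am', 'CD'),
           ('next', 'JJ'), ('mon', 'NN')]
# start=3 skips the redundant first verb 'remind' and begins at 'go'.
print(split_title(example, start=3))
# ('go to work', ['at', '8am', 'next', 'mon'])
```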

Another set of functions inspects the format of the user input and classifies each word as a time, day, date, month or year entry. Timings can be specified in several ways, including 12-hour and 24-hour formats, with dots or colons (e.g. 6.30pm, 9:00am) and with an am or pm suffix. Similarly, date entries can be specified as mon, 8th or even 8/2. Short forms of months and days are stored in predeclared dictionaries so that they can be identified correctly.
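One way such a classifier could look is sketched below using regular expressions. The patterns, dictionary contents and the name classify are all assumptions made for illustration; the README does not show the bot's own rules. A bare number such as 8 is deliberately left unclassified here because it could be either an hour or a day of the month.

```python
import re

# Hypothetical lookup tables for short forms of days and months.
DAYS = {"mon": 0, "tue": 1, "tues": 1, "wed": 2, "thu": 3, "thur": 3,
        "thurs": 3, "fri": 4, "sat": 5, "sun": 6}
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}

# Hour, optional minutes after a dot or colon, optional am/pm suffix.
TIME_RE = re.compile(r"(\d{1,2})(?:[.:](\d{2}))?(am|pm)?")
ORDINAL_RE = re.compile(r"\d{1,2}(st|nd|rd|th)")   # e.g. 8th
SLASH_DATE_RE = re.compile(r"\d{1,2}/\d{1,2}")     # e.g. 8/2
YEAR_RE = re.compile(r"\d{4}")


def classify(token):
    """Label a token as 'time', 'day', 'date', 'month', 'year' or 'unknown'."""
    t = token.lower()
    if t in DAYS:
        return "day"
    if t in MONTHS:
        return "month"
    if ORDINAL_RE.fullmatch(t) or SLASH_DATE_RE.fullmatch(t):
        return "date"
    if YEAR_RE.fullmatch(t):
        return "year"
    m = TIME_RE.fullmatch(t)
    # Require minutes or an am/pm suffix so a bare number is not a time.
    if m and (m.group(2) or m.group(3)):
        return "time"
    return "unknown"


for tok in ["8am", "6.30pm", "9:00am", "18:30", "mon", "8th", "8/2", "feb"]:
    print(tok, classify(tok))
```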

Any entry still missing after the initial rounds returns a value that triggers a specific prompt, and errors raised are classified to provide feedback to the user. New details are merged into the stored ones with each subsequent "poll", so earlier inputs persist in memory.
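The polling idea can be illustrated with a small state object. This is only a sketch under assumptions: the class name ReminderDraft, the required field names and the prompt codes are hypothetical and are not taken from the bot's source.

```python
# Hypothetical set of fields a reminder needs before it can be scheduled.
REQUIRED_FIELDS = ("time", "date")


class ReminderDraft:
    """Accumulates reminder details across several user messages."""

    def __init__(self):
        self.details = {}

    def poll(self, new_details):
        """Merge newly parsed details and report what is still missing.

        Returns a prompt code such as 'missing_time', or None once every
        required field has been supplied.
        """
        for key, value in new_details.items():
            if value is not None:
                # Later messages add to or overwrite earlier entries.
                self.details[key] = value
        for name in REQUIRED_FIELDS:
            if name not in self.details:
                return f"missing_{name}"
        return None


draft = ReminderDraft()
print(draft.poll({"title": "go to work", "time": "8am"}))  # missing_date
print(draft.poll({"date": "mon"}))                         # None
print(draft.details)
```

Because the draft keeps its dictionary between calls, the title and time from the first message are still available when the date arrives in the second.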



Some exceptions which result in prompts include:

  • Incorrect time format
  • Incorrect month/day range
  • A date or time which is in the past
  • Missing time info
  • Missing date info
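The range and past-date checks in the list above could be sketched as follows. The exception classes and the function build_reminder_time are hypothetical names; the sketch simply relies on the standard library's datetime constructor, which raises ValueError for out-of-range values.

```python
from datetime import datetime


class ReminderInputError(Exception):
    """Base class for input problems that should trigger a prompt."""


class OutOfRangeError(ReminderInputError):
    """A month, day, hour or minute is outside its valid range."""


class PastReminderError(ReminderInputError):
    """The requested date and time have already passed."""


def build_reminder_time(year, month, day, hour, minute, now=None):
    """Validate the parts of a reminder and return them as a datetime."""
    now = now or datetime.now()
    try:
        when = datetime(year, month, day, hour, minute)
    except ValueError as exc:
        # datetime rejects values such as month 13 or 30 February.
        raise OutOfRangeError(str(exc)) from exc
    if when <= now:
        raise PastReminderError(f"{when} is not in the future")
    return when


now = datetime(2024, 3, 15, 10, 0)
for args in [(2024, 13, 1, 8, 0), (2024, 3, 1, 8, 0), (2024, 3, 18, 8, 0)]:
    try:
        print(build_reminder_time(*args, now=now))
    except ReminderInputError as err:
        print(type(err).__name__)
```

Catching the shared base class lets the caller map each error type to its own prompt.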



Assumptions

If a time is provided without a day, the program checks whether that time is still upcoming on the current day. If it has already passed, the reminder is automatically moved to the next day. If a date is specified without a month, the month defaults to the current month; if that date has already passed, the reminder is moved to the next month. Similarly, the year defaults to the present year and is moved forward if necessary.
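These defaults can be sketched with the standard datetime module. The function resolve_datetime and its parameters are hypothetical, and the sketch ignores awkward cases such as the 31st rolling into a 30-day month, where datetime raises ValueError.

```python
from datetime import datetime, timedelta


def resolve_datetime(now, hour, minute, day=None, month=None, year=None):
    """Fill in missing date parts, rolling forward if the result is past."""
    if day is None:
        # Time only: use today, or tomorrow if the time has passed.
        candidate = now.replace(hour=hour, minute=minute,
                                second=0, microsecond=0)
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
    if month is None:
        # Day only: use the current month, or the next month if passed.
        candidate = datetime(now.year, now.month, day, hour, minute)
        if candidate <= now:
            if now.month == 12:
                candidate = datetime(now.year + 1, 1, day, hour, minute)
            else:
                candidate = datetime(now.year, now.month + 1, day,
                                     hour, minute)
        return candidate
    if year is None:
        # Day and month: use the current year, or the next year if passed.
        candidate = datetime(now.year, month, day, hour, minute)
        if candidate <= now:
            candidate = datetime(now.year + 1, month, day, hour, minute)
        return candidate
    return datetime(year, month, day, hour, minute)


now = datetime(2024, 3, 15, 10, 0)
print(resolve_datetime(now, 8, 0))                  # 2024-03-16 08:00:00
print(resolve_datetime(now, 9, 0, day=10))          # 2024-04-10 09:00:00
print(resolve_datetime(now, 9, 0, day=1, month=2))  # 2025-02-01 09:00:00
```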

When am or pm is not indicated in the time field, the bot applies predetermined assumptions to pick the more realistic time. For example, a time given as 2:00 is taken to most likely mean 2:00pm.
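A possible form of this assumption is sketched below. The cut-off is purely hypothetical: the README only gives the 2:00 example, so the rule that bare hours from 1 to 6 are read as afternoon (pm_default_until=6) is an illustration, not the bot's actual threshold.

```python
def to_24_hour(hour, meridiem=None, pm_default_until=6):
    """Convert an hour to 24-hour form, guessing am/pm when it is absent.

    `pm_default_until` is a hypothetical cut-off: bare hours from 1 up to
    this value are assumed to be in the afternoon.
    """
    if meridiem in ("am", "pm"):
        if not 1 <= hour <= 12:
            raise ValueError("12-hour times need an hour from 1 to 12")
        if meridiem == "am":
            return 0 if hour == 12 else hour
        return 12 if hour == 12 else hour + 12
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if 1 <= hour <= pm_default_until:
        # e.g. "2:00" is far more likely to mean 2pm than 2am.
        return hour + 12
    # Other hours are left unchanged and read as 24-hour values.
    return hour


print(to_24_hour(2))        # 14
print(to_24_hour(9))        # 9
print(to_24_hour(12, "am")) # 0
```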



Commands

  1. /start - Bring up main menu
  2. /help - Bring up help menu

Try the Bot

https://telegram.me/DontForgetAhBot

Requirements

To install the main package for the Telegram Bot API:

$ pip install pyTelegramBotAPI

  • Installation from source (requires git):

$ git clone https://github.com/eternnoir/pyTelegramBotAPI.git
$ cd pyTelegramBotAPI
$ python setup.py install

  • Install other packages and drivers (APScheduler, PyMongo):

$ pip install -r requirements.txt

Contributing

Any contributions you make are greatly appreciated!

If you have a suggestion that would make this better, please fork the repo and create a pull request.

License

Distributed under the MIT License. See LICENSE.txt for more information.

Support

Contributions, issues, and feature requests are welcome!

Contact

Nicholas Lee: [email protected]

Project Link: https://github.com/nicleejy/DontForgetAh-Bot
