
RPi-Availability-Tracker

Due to a surge in demand (and some other stuff), Raspberry Pis are super difficult to get your hands on! I want to see if I can beat the bots and the scalpers and get my hands on a Raspberry Pi 4 and a Zero 2 W!
So I started this project: a program that checks the rpilocator.com/feed/ RSS feed every minute, logs changes, and emails me notifications.

Table of Contents

  • [main] Branch

  • [cronjob] Branch


[main] Branch

This branch reflects my initial idea for the script, which was to have it run as an infinite loop:

while True:
    # get request
    # send email if anything has changed
    # sleep for a minute

This method proved to be pretty straightforward and easy to set up. The hardest part was setting up the SMTP connection, but fortunately Corey Schafer did all the hard work already.
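
Here's a minimal sketch of what that loop can look like, assuming a hypothetical check_feed() helper and a send_email() stub (the real track.py differs):

import time

import requests


def check_feed():
    """Fetch the raw RSS feed text (hypothetical helper)."""
    response = requests.get("https://rpilocator.com/feed/", timeout=10)
    response.raise_for_status()
    return response.text


def send_email(body):
    """Send a notification via SMTP (stub; see send_email.py)."""
    ...


last_feed = None
while True:
    feed = check_feed()  # get request
    if last_feed is not None and feed != last_feed:
        send_email("The rpilocator feed changed!")  # send email if anything has changed
    last_feed = feed
    time.sleep(60)  # sleep for a minute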

In order to get this script up and running on your machine, you need to:

  1. Clone the repository
git clone https://github.com/James-Loewen/RPi-Availability-Tracker.git
  2. Set up your own environment variables with your login credentials. I highly recommend this method as opposed to typing your email and password directly into the Python file, even though that would cut out a few steps. I did this on a Raspberry Pi running the latest version of the Linux-based Raspberry Pi OS, so I created my variables in $HOME/.bash_profile
nano $HOME/.bash_profile

If this file is empty, it won't be for long! If it isn't, just navigate to the end of the file and maybe add a comment about what you're adding. Here's the syntax for declaring new environment variables:

# Email credentials for RPi tracking script "track.py"
export EMAIL_USER="the email address you're sending emails from"
export EMAIL_PASS="the password for the account"
export PHONE_NUMBER="I included my phone number so that I could receive texts as well"
export RECIPIENT="whatever email address you want to receive the messages generated by the script"

Save and exit by typing Ctrl+X, then y, then hitting ENTER.

If you're setting this script up on Windows using PowerShell (bonus points if you're using the new official Windows Terminal), you'll want to open up your $PROFILE file:

notepad $PROFILE

Once you've got that opened, the syntax for declaring new environment variables is:

$Env:EMAIL_USER = "the email address you're sending emails from"
$Env:EMAIL_PASS = "the password for the account"
$Env:PHONE_NUMBER = "etc..."
$Env:RECIPIENT = "etc..."
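
Whichever OS you're on, the script ends up reading these variables the same way, via os.environ. Here's a rough sketch of how they feed into the email step, assuming Gmail's SMTP server as in Corey Schafer's tutorial (the real send_email.py may differ):

import os
import smtplib
from email.message import EmailMessage

# os.environ raises a KeyError if a variable is missing, which fails fast
# instead of silently sending from an empty address
EMAIL_USER = os.environ["EMAIL_USER"]
EMAIL_PASS = os.environ["EMAIL_PASS"]
RECIPIENT = os.environ["RECIPIENT"]

msg = EmailMessage()
msg["Subject"] = "RPi stock update"
msg["From"] = EMAIL_USER
msg["To"] = RECIPIENT
msg.set_content("Raspberry Pi 4 in stock!")

# smtp.gmail.com is an assumption; swap in your provider's SMTP host
with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
    smtp.login(EMAIL_USER, EMAIL_PASS)
    smtp.send_message(msg)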

Once you've saved and exited, all that's left to do is...

  3. Start the track.py script.
# Linux:
python {path to repo}/RPi-Availability-Tracker/track.py
# Windows:
py {path to repo}/RPi-Availability-Tracker/track.py

If you run the script this way, it's going to print information directly to your terminal via stdout and stderr, which stand for "standard output" and "standard error" (by default, both end up in the same place: your terminal). You might not want this behavior.

I don't know how to do this on Windows, but I can show you how to run the script in the background on Linux machines:

nohup python -u {path to repo}/RPi-Availability-Tracker/track.py >> rpi_inventory.log 2>> rpi_tracking_errors.log &

There's a lot going on with this command so I'll break it down:

  • The nohup command, short for "no hangup," ensures that your script continues running even after you log out, e.g. when you end your SSH session.
  • Running python with the -u flag disables output buffering, which effectively means your data will be written to the log files in real time. This article does a great job of explaining it.
  • The >> and 2>> redirect stdout and stderr to files of your choosing. I recommend creating a log-files directory in the same directory as track.py.
  • The ampersand, &, is what tells the shell to run the command in the background. If you were to omit the ampersand, you could still move the process to the background. First, suspend the process by typing Ctrl+Z. You should see output like:
[1]+  Stopped                 nohup python -u {path to repo}/RPi-Availability-Tracker/track.py >> rpi_inventory.log 2>> rpi_tracking_errors.log

The number in brackets is the job ID. You can also get this by using the jobs command. To move your process to the background, simply run the command:

bg 1

This job ID can also be used to kill the process. In my case I would use the command:

kill %1

However, this isn't the best method for killing the script. If you ran your script, logged out, and then came back later to kill it, you wouldn't find it listed by the jobs command. The job ID you saw when you moved the process to the background is only valid within that shell session. Fortunately, every process has an ID that persists between sessions: a PID.

PID stands for "Process Identifier" or "Process ID." To find this number and to confirm that the script is still running, you can use the following command:

ps aux | grep "track.py" | grep -v grep

You should get a result that looks like this:

pi       17704  0.6  0.3  14688  7376 pts/0    S    23:37   0:00 python -u {...}/track.py

That first number, 17704, is the PID. To kill it, use the command:

kill 17704

When I started this project I was focused primarily on the email notification aspect. The log files were a bit of an afterthought, and they ended up being comparatively more complicated and, frankly, messy in the first iteration.

[cronjob] Branch

I started this branch to sort that out. My goal for the project was to handle log file creation in a far cleaner and more controlled manner. My personal goals were to learn more about bash scripting and gain more overall familiarity with Linux.

When I stumbled across cron (and cron jobs), it seemed like a more elegant solution to run a script at set time intervals rather than keeping it running indefinitely. But before worrying about figuring that out, I made a few changes to track.py and the rpi_request.py module (send_email.py remains the same).

The module still does more or less the same thing. It uses the requests and BeautifulSoup4 modules to scrape and clean up web data from rpilocator.com/feed/. Only now, instead of printing that data to a user-specified output, it returns successful requests as an object instance and handles log file creation for failed requests.
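
As a rough illustration of that scraping step (the real rpi_request.py differs, and the tag names here are just what a standard RSS feed uses):

import requests
from bs4 import BeautifulSoup

response = requests.get("https://rpilocator.com/feed/", timeout=10)
response.raise_for_status()

# RSS is XML, so use the XML parser (requires the lxml package)
soup = BeautifulSoup(response.text, "xml")

# A standard RSS feed wraps each entry in an <item> tag with a <title>
for item in soup.find_all("item"):
    print(item.title.get_text())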

The biggest change is that it is no longer an infinite loop. It's a series of try/except blocks which either read existing files or create new ones. The paths of the log directories and files are relative to the path of track.py itself.
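
For the relative paths, the usual pattern looks something like this (the directory name here is illustrative):

import os

# Resolve everything relative to track.py itself, so the script behaves
# the same no matter which directory cron (or you) launches it from
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
log_path = os.path.join(BASE_DIR, "log-files")
os.makedirs(log_path, exist_ok=True)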

In order to get this process running on your machine (Linux only; I know there is a similar scheduler for Windows, but I don't know anything about it), you must:

  1. Clone the repository and check out the cronjob branch
git clone https://github.com/James-Loewen/RPi-Availability-Tracker.git
cd RPi-Availability-Tracker
git checkout cronjob
  2. Set up your own environment variables with your login credentials (see the environment variable instructions in the [main] Branch section above). This step is what caused me by far the most trouble, but I'll explain that in a bit.

  3. Edit your crontab file, short for "cron table." On a Raspberry Pi the crontab file is located in /var/spool/cron/crontabs, but unlike a regular file, it shouldn't be edited directly by doing something like:

# You would need root access to do this:
nano /var/spool/cron/crontabs/{file}

The proper command is:

crontab -e

This command opens the crontab file for editing, or creates one if it doesn't already exist. On my machine, a new crontab file is generated with a bunch of comments about how job scheduling works. I imagine this is standard, but it could just be a Raspberry Pi OS thing. Either way, what you'll want to do is navigate to the end of the file and input all the necessary information on one line. Here's what mine looks like:

* * * * * BASH_ENV=~/.bash_profile ~/bin/track_cron.sh
  • The asterisks are how you set the schedule for the job. Five asterisks means "run once every minute." The syntax for this is a little funky (see the field diagram after this list), but a tool like Crontab.guru makes it super simple.
  • The next bit with BASH_ENV is how you specify the environment file that contains your authentication variables. This is what tripped me up immensely. By default, cron runs with a bare-bones, stripped-down environment. You have to tell cron to use a certain environment, and you do so by setting BASH_ENV to the location of the file in which your variables are stored. In my case that's ~/.bash_profile, a.k.a. $HOME/.bash_profile.
  • The last bit is the process that I want to run. Because I wanted to practice bash scripting, I wrote the simplest script that just runs track.py. It looks like this:
#!/bin/bash
# Execute track.py

python ~/Programming/RPi-Availability-Tracker/track.py 2>> ~/Programming/RPi-Availability-Tracker/error-files/cron.errors
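
For reference, here's what the five schedule fields mean (a fresh crontab on Debian-based systems ships with a comment header much like this):

# ┌───────────── minute (0-59)
# │ ┌─────────── hour (0-23)
# │ │ ┌───────── day of month (1-31)
# │ │ │ ┌─────── month (1-12)
# │ │ │ │ ┌───── day of week (0-7; 0 and 7 are both Sunday)
# │ │ │ │ │
# * * * * *  command to execute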

A more straightforward way of executing the script would be to add it directly to the crontab line. Either:

* * * * * BASH_ENV=~/.bash_profile python {path to repo}/track.py

Or, because track.py has a shebang line (its first line, something like #!/usr/bin/env python3), run:

chmod +x track.py

Then you can omit "python" from the command:

* * * * * BASH_ENV=~/.bash_profile {path to repo}/track.py

Type Ctrl+X, then y, then hit ENTER and you're finished!

Between the two, I prefer this cron-based method of handling a repetitive process, but I'm only just learning! Forks and pull requests are welcome. Let me know if I should be structuring my imports or modules differently, or if my spacing is wonky or something.


Issues

Problem with cronjob track.py gathering old data for last_item

Currently, track.py only goes back one day to look for log information:

try:
    with open(os.path.join(log_path, f"{today.strftime(date_format)}.log")) as log:
        last_item = [line.strip() for line in log.read().split("="*40)][-1].strip()
except FileNotFoundError:
    try:
        yesterday = today - timedelta(days=1)
        with open(os.path.join(log_path, f"{yesterday.strftime(date_format)}.log")) as log:
            last_item = [line.strip() for line in log.read().split("="*40)][-1].strip()
    except FileNotFoundError:
        last_item = ['']

The problem arises if no new entries are created for two straight days. Let's say an entry is logged on June 11 but no new entries are logged after that. At midnight on June 13, neither that day's log nor the previous day's exists, so last_item falls back to the empty default and the June 11 data looks new again: track.py logs it and re-sends the email. The freshly written log then suppresses the email for one day, so the duplicate goes out again on the 15th, the 17th, and so on, every two days.

I can think of a few ways to solve this...

  1. The worst way, I think, would be to implement a for loop to keep checking dates further and further back in time until it runs out of files. This is a bad solution because there won't be an entry for every single day, and building that condition in would be wonky.
  2. Instead of a for loop, checking for the most recent file in the directory (if one exists) would work. The code for that could look like:
import os
from glob import iglob

# ...

try:
    with open(os.path.join(log_path, f"{today.strftime(date_format)}.log")) as log:
        last_item = [line.strip() for line in log.read().split("="*40)][-1].strip()
except FileNotFoundError:
    try:
        # Join log_path into the pattern so iglob yields full paths that
        # os.path.getctime and open() can use directly
        last_log = max(iglob(os.path.join(log_path, "*.log")), key=os.path.getctime)
        with open(last_log) as log:
            last_item = [line.strip() for line in log.read().split("="*40)][-1].strip()
    except ValueError:
        # max() raises ValueError when the directory has no .log files at all
        last_item = ['']
