Needless to say, I was pretty irritated when my sister asked me if I could help her pay the developer for 3+ courses’ worth. I was having a calm Saturday evening; I figured I would take a look at how one would go about building such an application: how hard could it be, really?
There were a couple of key things I would have to look into:
1) Understanding how to get the course seating information from the course registration website
2) Being able to check for the availability information periodically (i.e. every x minutes)
3) Sending informative e-mails to students, regarding a course’s availability
As I had used the same registration system for several years during my bachelor's, I already knew where to find the information I needed: the total course capacity, the number of students already registered, and the number of remaining seats. After jumping around a couple of links, it turned out to be manageable with simple web scraping.
The only challenge here was working out how the HTML for the course pages was laid out. After getting my hands dirty for a while, I was able to scrape the necessary information using Node.js, along with the npm packages cheerio and request-promise. This process was a little ugly, but I was able to summon my strength before giving in to the Impostor Syndrome, thanks to these wise words from the Spider-Man franchise:
With awful HTML comes awful document queries.
After I was able to retrieve the necessary information, I had to make sure the check could run periodically. Registering for high-demand courses is a race against the clock, so the more frequently the script checked for an opening, the better.
A quick Google search on how I could do that led me to the concept of a cron job. Simply put, cron is a utility used to schedule commands to run at specific times or intervals. I’ll list a couple of resources at the end of the article, to help learn more about cron jobs and the subtleties of writing the commands to run them.
To add the script as a cron job, in my Ubuntu terminal I ran
…and I added the following lines so the script runs every minute (note that # is for commenting a line)
# Uncomment the line below before registration starts, so the script can run every minute
# */1 * * * * node ~/Desktop/Code/robin/crawl.js
The crawl.js script loads a list of CRNs, along with the e-mail addresses to notify.
After crawling the corresponding course’s page, the numbers of total, taken and available seats are stored for later comparison. A minute later, the script checks whether anything has changed since the previous run. If nothing has changed, the student is not notified (or rather, spammed).
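The comparison between runs could look roughly like the sketch below. Since every cron run starts a fresh process, the previous counts have to be kept somewhere outside the script; a JSON file in the temp directory is assumed here. The file location, the record shape and the function names are all hypothetical, and treating a course that has never been seen before as "changed" is my own choice for the sketch.

```javascript
// Sketch only: the state file location and record shape are assumptions.
const fs = require('fs');
const os = require('os');
const path = require('path');

// Each cron run is a fresh process, so the previous counts must live on disk.
const STATE_FILE = path.join(os.tmpdir(), 'course-seats.json'); // hypothetical path

function loadPrevious() {
  try {
    return JSON.parse(fs.readFileSync(STATE_FILE, 'utf8'));
  } catch (err) {
    return {}; // first run, or unreadable file: nothing to compare against
  }
}

// Returns true when any of the three counts differs from the stored ones.
function hasChanged(previous, current) {
  if (!previous) return true; // course not seen before (assumed behaviour)
  return ['capacity', 'taken', 'remaining'].some((key) => previous[key] !== current[key]);
}

// Compares fresh counts (keyed by CRN) with the stored ones, saves the fresh
// counts for the next run, and returns the CRNs whose numbers changed.
function detectChanges(currentByCrn) {
  const previousByCrn = loadPrevious();
  const changed = Object.keys(currentByCrn).filter((crn) =>
    hasChanged(previousByCrn[crn], currentByCrn[crn])
  );
  fs.writeFileSync(STATE_FILE, JSON.stringify(currentByCrn));
  return changed;
}
```

Only the CRNs returned by `detectChanges` would then trigger an e-mail, which keeps students from being notified every minute about the same numbers.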
The last step was to figure out how to send e-mails from my Node script. I had previously used Nodemailer, so after creating a Gmail account to send the e-mails from, I went ahead and used the package as described in its documentation.
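For reference, a notification along these lines can be sketched as follows. The sender address, the wording of the message, the `GMAIL_USER` and `GMAIL_PASS` environment variables and both function names are placeholders of mine, not the original script's. The `createTransport` and `sendMail` calls are Nodemailer's documented API. Gmail may also require an app password or similar account setup, which is not covered here.

```javascript
// Sketch only: addresses, credentials and helper names are placeholders.

// Hypothetical helper: builds the message object that Nodemailer's sendMail() accepts.
function buildAvailabilityMail(course, recipient) {
  return {
    from: 'course-notifier@example.com', // placeholder sender
    to: recipient,
    subject: `CRN ${course.crn}: ${course.remaining} seat(s) available`,
    text: `Capacity: ${course.capacity}\nTaken: ${course.taken}\nRemaining: ${course.remaining}`,
  };
}

// Sends the message through Gmail. Needs `npm install nodemailer` and real
// credentials, supplied here through assumed environment variables.
async function notify(course, recipient) {
  // Loaded here so the message builder above can be used without the package.
  const nodemailer = require('nodemailer');
  const transporter = nodemailer.createTransport({
    service: 'gmail',
    auth: { user: process.env.GMAIL_USER, pass: process.env.GMAIL_PASS },
  });
  return transporter.sendMail(buildAvailabilityMail(course, recipient));
}

// Building a message does not touch the network, so it is safe to try out.
const mail = buildAvailabilityMail(
  { crn: '12345', capacity: 40, taken: 38, remaining: 2 },
  'student@example.com'
);
console.log(mail.subject);
```

Keeping the message construction separate from the sending makes it easy to check the wording of the e-mail without actually mailing anyone.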