This article is about our NewsBot Chrome extension, the fastest way to find related articles.
We recently wrote about what our NewsBot Chrome extension does. Today I'm going to add to that and explain how it works behind the scenes. When building this project, we approached it as if we were users of our own Lateral API, to see what we could build.
The Chrome Extension
The extension has two primary functions:
- Recommend news articles relevant to the page the user is currently on
- Allow a user to follow a story and receive email alerts when new relevant items are found.
An instant recommendation can be triggered by selecting text, by pressing the keyboard shortcut (Ctrl/⌘ + Shift + 5) or by clicking the "Give me 5" button. The relevant text is sent to the NewsBot servers, which forward it to the Lateral news recommendation API. The API responds with news articles relevant to the input text.
To get the relevant text for the "Give me 5" button and the keyboard shortcut, we use a combination of python-goose and Newspaper to extract the main body of text from the page. The combination of the two seems to work very effectively. It's not perfect, but I think this is a very hard problem. We are planning to make this functionality available through the Lateral API (let me know if you're interested).
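One way to picture combining two extractors is to run both and keep the fuller result. The sketch below is only an assumption about how such a combination could work, not a description of NewsBot's actual logic. `choose_body_text` and the three stand-in extractors are hypothetical names; in practice, thin wrappers around python-goose and Newspaper would be passed in as plain callables.

```python
from typing import Callable, Iterable

# An extractor takes raw HTML and returns the main body text (possibly empty).
Extractor = Callable[[str], str]


def choose_body_text(html: str, extractors: Iterable[Extractor]) -> str:
    """Run every extractor and return the longest non-empty result.

    "Longest wins" is an illustrative heuristic (an assumption).
    An extractor that raises is skipped so one failure does not block the rest.
    """
    best = ""
    for extract in extractors:
        try:
            text = (extract(html) or "").strip()
        except Exception:
            continue  # treat a crashing extractor as "no result"
        if len(text) > len(best):
            best = text
    return best


# Hypothetical stand-ins for the two libraries, used only for demonstration.
def short_extractor(html: str) -> str:
    return "Headline only"


def failing_extractor(html: str) -> str:
    raise ValueError("could not parse page")


def long_extractor(html: str) -> str:
    return "Headline plus the full body of the article."


body = choose_body_text(
    "<html>...</html>",
    [short_extractor, failing_extractor, long_extractor],
)
print(body)
```

Passing the extractors in as callables keeps the selection rule independent of either library, so the heuristic can be swapped (for example, preferring one extractor and falling back to the other) without touching the extraction code.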
The extension saves data persistently to the NewsBot servers so we can know when a new story has been followed. This is handled by the web app.
The Web App
This is the back-end of the Chrome extension and the extension's website. It is written in Ruby with Ruby on Rails. It has an API (powered by Grape) that allows the extension to save users and the stories they follow, and to get recommendations.
When a call to get a recommendation is made, the server sends a request to the Lateral API. The API returns news article recommendations which are then returned to the user.
We use resque and resque-scheduler to periodically check each followed story for new recommendations from the API. If a new recommendation is found, an alert is sent to the user.
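The core of such a periodic check is a comparison between the recommendations already seen for a story and the ones the API returns now. The sketch below illustrates that idea in Python rather than as a Resque job. All names (`find_new_recommendations`, `check_followed_stories`, the injected `fetch_recommendations` and `send_alert` callables) are hypothetical, and storing seen IDs in an in-memory set is an assumption made to keep the example self-contained.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def find_new_recommendations(seen_ids: Set[str], current_ids: Iterable[str]) -> List[str]:
    """Return the IDs not seen before, in API order, and mark them as seen."""
    new_ids = []
    for rec_id in current_ids:
        if rec_id not in seen_ids:
            seen_ids.add(rec_id)
            new_ids.append(rec_id)
    return new_ids


def check_followed_stories(
    stories: Dict[str, Set[str]],
    fetch_recommendations: Callable[[str], List[str]],
    send_alert: Callable[[str, List[str]], None],
) -> None:
    """One scheduler tick: alert once per story that has new recommendations."""
    for story_id, seen_ids in stories.items():
        new_ids = find_new_recommendations(seen_ids, fetch_recommendations(story_id))
        if new_ids:
            send_alert(story_id, new_ids)


# Demonstration with fake data (hypothetical IDs, no network or email involved).
stories = {"story-1": {"a", "b"}}
alerts: List[Tuple[str, List[str]]] = []


def fake_fetch(story_id: str) -> List[str]:
    return ["a", "c"]


def fake_alert(story_id: str, new_ids: List[str]) -> None:
    alerts.append((story_id, new_ids))


check_followed_stories(stories, fake_fetch, fake_alert)  # "c" is new
check_followed_stories(stories, fake_fetch, fake_alert)  # nothing new this time
print(alerts)
```

Recording an ID as seen at the moment it is reported means a second tick with the same API response sends no duplicate alert.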
The Lateral News Recommender
A lot of the hard work is done by the Lateral news recommender, which handles the fetching, parsing and adding of new articles. It's important to note that the news recommender is built on top of the Lateral API, so we are, in a sense, users of our own API here.
Seeing as this post is about how NewsBot works and a major part of it is the Lateral news API, I thought I should also explain how we built that!
We have a list of news sources' RSS feeds that we've chosen ourselves to ensure quality content. These feeds are constantly checked for new items. When a new item comes in, we use a combination of python-goose and Newspaper to extract the article's title, body, author, image and summary. Then we send the item to the Lateral API so that it can be recommended. We also store the metadata we capture in our own database of news items.
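Checking a feed for new items amounts to parsing it and skipping entries that have already been processed. Here is a minimal sketch using only the Python standard library. The sample feed, the function name `new_feed_items`, and the choice of the item link as the deduplication key are all assumptions for illustration. Fetching the feed over HTTP and the extraction step are left out.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Set

# A tiny hand-written RSS 2.0 document standing in for a real news feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title><link>https://example.com/1</link></item>
    <item><title>Second story</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def new_feed_items(feed_xml: str, seen_links: Set[str]) -> List[Dict[str, str]]:
    """Return items whose link has not been seen yet and mark them as seen."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link or link in seen_links:
            continue  # skip malformed entries and ones already processed
        seen_links.add(link)
        title = (item.findtext("title") or "").strip()
        items.append({"title": title, "link": link})
    return items


seen = {"https://example.com/1"}  # pretend the first story was handled earlier
fresh = new_feed_items(SAMPLE_FEED, seen)
print(fresh)
```

Each item returned here would then go through text extraction, be sent to the recommendation API, and have its metadata stored.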
When a user makes a call to the news API, we first send the text to the Lateral API to get the IDs of similar documents we've added. Then we fetch those documents from our database by ID. This means we can return the image, author and other meta fields with the recommendations. The Lateral API results are combined with the database response, and the combined result is returned to the user as JSON.
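The combining step can be sketched as a join on document ID that preserves the order returned by the API. The field names (`id`, `similarity`, `title`, `author`), the function name `build_response`, and the dictionary standing in for the database are assumptions for illustration, not the real response format.

```python
import json
from typing import Any, Dict, List


def build_response(
    api_results: List[Dict[str, Any]],
    metadata_by_id: Dict[str, Dict[str, Any]],
) -> str:
    """Attach stored metadata to each API result and serialize to JSON.

    Results keep the API's ordering; IDs missing from the database are dropped.
    """
    merged = []
    for result in api_results:
        meta = metadata_by_id.get(result["id"])
        if meta is None:
            continue  # no stored metadata for this document
        merged.append({**meta, **result})
    return json.dumps(merged)


# Hypothetical API output: document IDs with a similarity score.
api_results = [
    {"id": "42", "similarity": 0.91},
    {"id": "7", "similarity": 0.85},
    {"id": "99", "similarity": 0.5},  # not present in the database below
]

# Hypothetical database rows keyed by document ID.
metadata_by_id = {
    "7": {"title": "B", "author": "Bo"},
    "42": {"title": "A", "author": "Al"},
}

response = build_response(api_results, metadata_by_id)
print(response)
```

Looking the metadata up by ID in a single pass keeps the merge linear in the number of results and leaves the ranking decided by the API untouched.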
That's it! We had a lot of fun building this product and it's really exciting to see people using it and getting excited about it. Let us know if you have any feedback about the Chrome extension, the setup we use or anything else.