The Washington Post uses natural language processing to improve advertising capabilities
Newspapers sell Ads...
Like all newspapers, the Washington Post relies on selling advertising to its customers. Ad spots are sold and automatically placed on different sections of the website. This model has worked for years, but...
This article was inspired by the original story published here.
Most of the time, the content of the ads is completely disconnected from the content of the article. At best, the ad is irrelevant to the reader. At worst, the misplacement can be so catastrophic that it makes its way into infamous rankings such as these...
The above picture is merely an unhappy coincidence (and a terrible ad for "Bad Idea T-Shirts"), but since the ad placement process is completely automatic, a newspaper has no practical way to manually check every article or ensure the ads are relevant to its readers.
Machine Learning to the rescue!
The Washington Post decided to use Natural Language Processing (NLP), a popular application of Machine Learning, to automatically read all articles, extract their entities, rank the salience of those entities and create meta tags for them. The available ads were analysed in the same way, and the result is well-placed ads that are relevant to the content of the article. One example would be a sportswear ad next to an article about the Olympic Games.
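To make the matching step more concrete, here is a minimal sketch of how ads could be ranked against an article once both carry meta tags. It is an illustration only, not the Washington Post's actual method: the Jaccard overlap score, the function names and the sample tags are all assumptions.

```python
def jaccard(a: set, b: set) -> float:
    """Share of tags two tag sets have in common (0.0 = none, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_ads(article_tags: set, ads: dict) -> list:
    """Return ad names sorted by tag overlap with the article, best first.

    Ads that share no tag with the article are dropped.
    Hypothetical helper: the scoring rule is an assumption, not the real system.
    """
    scored = [(jaccard(article_tags, tags), name) for name, tags in ads.items()]
    relevant = [(score, name) for score, name in scored if score > 0]
    # Highest score first; ties are broken alphabetically for stable output.
    return [name for score, name in sorted(relevant, key=lambda p: (-p[0], p[1]))]


# Made-up tags for an article about the Olympic Games and three candidate ads.
article_tags = {"olympic games", "athletics", "tokyo"}
ads = {
    "sportswear": {"athletics", "running", "olympic games"},
    "travel agency": {"tokyo", "flights"},
    "car insurance": {"cars", "insurance"},
}

print(rank_ads(article_tags, ads))  # ['sportswear', 'travel agency']
```

A real system would probably weight tags by salience rather than treat them all equally, but even this simple overlap keeps the car insurance ad away from the Olympics article.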
That's great, but I bet it's super expensive!
A few years ago that statement would have been true: you would have needed to hire an expensive team of Data Scientists with PhDs in natural language processing and allocate expensive computing resources to achieve this.
Fast forward to 2018, and many technology giants offer out-of-the-box APIs that provide Natural Language Processing as a service. The Washington Post decided to use Amazon Comprehend, a rich and powerful API that analyses text and returns results in a fraction of a second.
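As a rough idea of what calling such an API looks like, the sketch below uses the boto3 client for Amazon Comprehend and its detect_entities operation, then turns the returned entities into meta tags. Several things here are assumptions rather than details from the original story: the region, the 0.8 confidence threshold, the helper names, and the use of mention counts as a stand-in for salience (Comprehend returns a confidence score per entity, not a salience score). The sample entities are made up so the tag-building part runs without an AWS account.

```python
from collections import Counter


def fetch_entities(text: str, region: str = "us-east-1") -> list:
    """Ask Amazon Comprehend for the entities in a text.

    Needs boto3 and valid AWS credentials; the region is an assumption.
    """
    import boto3  # imported here so the rest of the sketch runs offline

    client = boto3.client("comprehend", region_name=region)
    response = client.detect_entities(Text=text, LanguageCode="en")
    return response["Entities"]


def entities_to_tags(entities: list, min_score: float = 0.8, top_n: int = 5) -> list:
    """Build meta tags from Comprehend-style entities.

    Keeps entities whose confidence is at least min_score and ranks them
    by how often they are mentioned (a crude proxy for salience).
    """
    counts = Counter()
    for entity in entities:
        if entity["Score"] >= min_score:
            counts[entity["Text"].lower()] += 1
    return [text for text, _ in counts.most_common(top_n)]


# Made-up entities in the shape returned by detect_entities.
sample_entities = [
    {"Text": "Olympic Games", "Type": "EVENT", "Score": 0.98},
    {"Text": "Tokyo", "Type": "LOCATION", "Score": 0.95},
    {"Text": "Olympic Games", "Type": "EVENT", "Score": 0.97},
    {"Text": "yesterday", "Type": "DATE", "Score": 0.55},
]

print(entities_to_tags(sample_entities))  # ['olympic games', 'tokyo']
```

In a live setup you would pass the output of fetch_entities(article_text) to entities_to_tags instead of the sample list, and run the same step over the ad copy.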
This API is REALLY inexpensive to use.
With the text volumes of such a newspaper, the price per unit analysed (each unit being 100 characters) could drop to $0.000025, so the overall solution is neither expensive nor difficult to replicate or implement. You would only need to add the hosting costs and the development time needed to write such an application and integrate it into your existing system.
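To put that price in perspective, here is a back-of-the-envelope calculation. The unit size and the per-unit price come from the figures above; the article length and the number of articles per day are made-up assumptions, and the sketch ignores any pricing tiers or minimum charges the provider may apply.

```python
import math

UNIT_CHARS = 100            # one billing unit = 100 characters (figure quoted above)
PRICE_PER_UNIT = 0.000025   # USD per unit, the volume price quoted above


def analysis_cost(num_chars: int, price_per_unit: float = PRICE_PER_UNIT) -> float:
    """Cost in USD of analysing a text, rounding up to whole 100-character units."""
    units = math.ceil(num_chars / UNIT_CHARS)
    return units * price_per_unit


article_chars = 6_000      # assumption: a fairly long article
articles_per_day = 500     # assumption: not a Washington Post figure

per_article = analysis_cost(article_chars)   # 60 units
per_day = per_article * articles_per_day

print(f"Per article: ${per_article:.4f}")    # Per article: $0.0015
print(f"Per day:     ${per_day:.2f}")        # Per day:     $0.75
```

Under these assumptions, analysing a full day of articles costs less than a dollar, which is why hosting and development time end up dominating the budget.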
MLab, the Machine Learning specialists at your service!
If Machine Learning inspires you and you would like to implement a use case in your organisation, please contact us. We are vendor agnostic and will recommend and integrate the technology that best fits your needs. If the available technologies do not satisfy your needs, we can always train a custom model tailored to your project.
Disclaimer: MLab was not involved in the development of this project. We simply publish this case study on our blog as a source of inspiration for what Machine Learning can achieve.