Blog posts

2018

2017

MAS.S70 Applied Data Visualizations

2 minute read

Published:

Machine Learning or ML? - How words enter the public domain

Introduction

For this project I look at how new words enter common usage.

Theory

One of a journalist's responsibilities is to educate readers. This goes beyond conveying news and includes introducing new vocabulary. It is particularly relevant in the fast-paced realm of technology, where artificial intelligence, the cloud, machine learning, and big data have become far more newsworthy as these concepts transform the industry and the process of innovation. But when do these terms become familiar enough to the public to appear as ML and AI? We should expect journalists to use only the unabbreviated term at first. As the concept enters the public domain, they may use the abbreviation and the full term side by side, until the abbreviation eventually predominates. Let's see if this is true!

Data

I decided to focus on the following words and abbreviations:
  • Artificial Intelligence, AI, and A.I.
  • Machine Learning, ML, and M.L.
  • Natural Language Processing and NLP
  • Neural Network and Neural Net
  • Generative Adversarial Network and GANS
  • Recurrent Neural Network, Recurrent Neural Net, RNN, and R.N.N.
  • Application Programming Interface and API
  • Deep Neural Network, Deep Neural Net, Deepmind, and Deep Mind
  • Supervised Machine Learning, Unsupervised Machine Learning, and Reinforcement Learning
  • LSTM and Embedding Space
  • Cloud, Big Data, Technology, Automation, Robot, AOL, Cyber Crime

Methodology

I scraped articles published by the Guardian between 1999 and 2017 and counted the occurrences and co-occurrences of these words. The Guardian's online edition was the fifth most widely read in the world in 2014 (Source) and is thus a reasonable proxy for journalistic activity.
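The counting step could be sketched roughly as follows. This is not the original scraping code: the function names, the `(year, text)` input layout, and the matching rules (abbreviations matched case-sensitively, full phrases case-insensitively, plurals ignored) are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Illustrative subset of the tracked terms; the full list is in the Data section.
TERMS = ["Artificial Intelligence", "AI", "A.I.", "Machine Learning", "ML", "M.L."]


def compile_term(term):
    """Build a whole-word pattern for one term (assumed matching rules)."""
    # Abbreviations such as "AI" or "A.I." stay case-sensitive so that
    # lowercase letter pairs inside ordinary words are not counted.
    is_abbreviation = term.replace(".", "").isupper()
    flags = 0 if is_abbreviation else re.IGNORECASE
    # Lookarounds instead of \b, because \b misbehaves next to the dots in "A.I.".
    return re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", flags)


PATTERNS = {term: compile_term(term) for term in TERMS}


def count_terms(text):
    """Return a Counter mapping each tracked term to its number of matches."""
    return Counter({term: len(p.findall(text)) for term, p in PATTERNS.items()})


def counts_by_year(articles):
    """Aggregate term counts per year from an iterable of (year, text) pairs."""
    yearly = defaultdict(Counter)
    for year, text in articles:
        yearly[year].update(count_terms(text))
    return dict(yearly)
```

Calling `counts_by_year` on a list of `(year, article_text)` pairs yields one `Counter` per year, which is the shape a timeline chart needs.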

Results

The most interesting results came from AI and ML. According to the 'Journalist Educator Hypothesis' above, I expected the number of occurrences of the abbreviations to eventually overtake those of the full terms. However, we observe the opposite!
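One way to test the hypothesis is to track, per year, the share of all mentions that use the abbreviation. The following is a minimal sketch. The function name and input layout are hypothetical, and the counts in the example are invented purely to show the calculation; they are not the scraped Guardian data.

```python
def abbreviation_share(yearly_counts, full_terms, abbreviations):
    """Return {year: share of mentions using an abbreviation}.

    yearly_counts maps a year to a dict of term -> count. A year with no
    mentions at all gets None, since the share is undefined there.
    """
    shares = {}
    for year in sorted(yearly_counts):
        counts = yearly_counts[year]
        abbreviated = sum(counts.get(term, 0) for term in abbreviations)
        full = sum(counts.get(term, 0) for term in full_terms)
        total = abbreviated + full
        shares[year] = abbreviated / total if total else None
    return shares


# Invented counts, for illustration only.
example_counts = {
    2000: {"AI": 3, "Artificial Intelligence": 1},
    2001: {"AI": 1, "A.I.": 1, "Artificial Intelligence": 6},
    2002: {},
}
example_shares = abbreviation_share(
    example_counts,
    full_terms=["Artificial Intelligence"],
    abbreviations=["AI", "A.I."],
)
```

Under the hypothesis the share would rise toward 1 over time; a falling share corresponds to the opposite pattern.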

Timeline for AI versus Artificial Intelligence

<div class="single_viz" id="timeline_ai_c" margin: 0 auto> </div> <div class="single_viz" id="timeline_ml_c" margin: 0 auto>

Timeline for ML versus Machine Learning

</div>
One explanation may be that the target audience changed. Initially, these kinds of tech articles may have been aimed at readers who were already knowledgeable; as the topics became more popular, spelling out the full term became necessary. It may also indicate that journalists prefer the full term because the abbreviation comes across as more and more 'buzzwordy' as the popularity of the concept rises. Speaking of buzzwords, let's have a look at a couple.
<div class="single_viz" id="timeline_other" margin: 0 auto>

Timeline for Buzzwords

</div>
We can see, perhaps surprisingly, that 'Cloud' and 'Big Data' are actually in decline, whereas 'automation' and 'robot' have become much more common. If this is at all indicative of company behavior, it implies that there may have been a shift from virtual innovation to physical innovation. Finally, most of the technical terms, like embedding space or the different types of machine learning, almost never occur, presumably because the Guardian is a news outlet aimed at a general audience.
Just for fun, I also looked into co-occurrences of words, combining each full term and its abbreviations into a single category. We can see that there are actually not that many co-occurrences. The most common ones were AI with Robots, AI with Automation, AI with ML, and Big Data with Cloud.
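Merging variants into categories and counting pairs might look like the sketch below. It assumes that two categories co-occur when both appear anywhere in the same article; the post does not say which unit was actually used. The category map is an illustrative subset, the helper names are hypothetical, and plurals are not handled.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative subset: each category merges a full term with its abbreviations.
CATEGORIES = {
    "AI": ["Artificial Intelligence", "AI", "A.I."],
    "ML": ["Machine Learning", "ML", "M.L."],
    "Robot": ["Robot"],
    "Automation": ["Automation"],
    "Big Data": ["Big Data"],
    "Cloud": ["Cloud"],
}


def _pattern(term):
    # Abbreviations are matched case-sensitively, full phrases case-insensitively.
    flags = 0 if term.replace(".", "").isupper() else re.IGNORECASE
    return re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", flags)


CATEGORY_PATTERNS = {
    category: [_pattern(variant) for variant in variants]
    for category, variants in CATEGORIES.items()
}


def categories_in(text):
    """Return the set of categories with at least one variant in the text."""
    return {
        category
        for category, patterns in CATEGORY_PATTERNS.items()
        if any(p.search(text) for p in patterns)
    }


def cooccurrences(texts):
    """Count, over all texts, how often each pair of categories appears together."""
    pairs = Counter()
    for text in texts:
        # Sorting makes each pair key order-independent: (A, B), never (B, A).
        for a, b in combinations(sorted(categories_in(text)), 2):
            pairs[(a, b)] += 1
    return pairs
```

The resulting pair counts can be turned into the symmetric matrix that a chord diagram takes as input.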
<div class="single_viz" id="chord_diagram" margin: 0 auto>

Chord Diagram

</div>

2012

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Headings are cool

You can have many headings

Aren’t headings cool?
