Understand natural language processing
How Google's NLP API can help you on your next web project.
Websites and apps can have various moving parts, including front-end creative, server-side processing, APIs and data storage. AI can plug into any of these components.
On the front end, you can connect voice commands, chatbot interfaces or reactive WebGL creative elements. On the back end, databases can use intelligent algorithms to speed up queries and analysis. APIs can provide a layer of abstraction over a wide range of AI functions, from predictions to collective training.
Natural language
Natural language processing (NLP) focuses on the interactions between machines and human languages. The objective of NLP is to process and analyse vast amounts of language data in order to improve natural communication between humans and machines. The field includes speech recognition, language understanding and natural language generation. Our focus will be on understanding natural language: the process of analysing a text and determining its meaning or intent.
There are several concepts common to NLP:
- Detecting language – Understanding which language is being used in the text is fundamental to knowing which dictionaries, syntax and grammar rules to use in analysis.
- Entity extraction – Identifying the key words in phrases, how relevant or salient they are to the overall text and determining what the entities are, based on training or knowledge bases.
- Sentiment analysis – Assessing the general level of 'feeling' in a text: is it broadly positive or negative? Sentiment can also be assessed per entity: does the statement reflect positive or negative feelings about the 'subject'?
- Syntactic analysis – Understanding the structure of the text by identifying attributes such as sentences, parts of speech (e.g. noun, verb), voice, gender, mood and tense.
- Content classification or categorisation – Organising the content of the text into common categories so it can be processed more efficiently. For example, New York, London, Paris and Munich are all 'locations' or 'cities'.
There are numerous technical approaches to parsing and processing the data, but whichever NLP tool you use, you will tackle the same common steps. Typically, text is separated into logical chunks. These chunks are analysed against trained data or knowledge bases and assigned values, usually ranging from 0.0 to 1.0, reflecting the level of confidence in the analysis.
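As a purely illustrative sketch, a scored result for a single chunk might look something like this (field names vary between tools; these are hypothetical):

{
  "token": "Paris",
  "category": "LOCATION",
  "confidence": 0.92
}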
Google's Natural Language API
We'll be using the Natural Language API developed by Google for this tutorial. There are numerous NLP APIs available, but Google's has some nice advantages, including cloud computing, speed, an incredibly large user base and machine learning. Google's search engine and tools have been using AI for years, so you're harnessing all that experience and learning by using its public-facing services.
APIs incorporate easily into almost any project, which saves a lot of time versus hand-coding your own NLP. The RESTful API enables you to integrate with almost any language you wish through common cURL calls or one of the numerous SDKs available. There are a few tricks to getting set up, but we'll work through it one step at a time.
01. Create a new Google Cloud project
Go to the Google Cloud Platform Console and create a new project or select an existing one to work with. The service is free to use until you start making a large volume of API requests. You may need to associate billing info with the account when you activate the API, but this isn't charged at low volume, and you can remove the services once you're done testing if you wish.
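If you prefer the command line, the gcloud CLI that ships with the Google Cloud SDK can create the project for you. The project ID below is just a placeholder; yours must be globally unique:

gcloud projects create my-nlp-tutorial --name="NLP Tutorial"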
02. Enable the Cloud Natural Language API
Browse to the API library and select the Natural Language API. Once it's enabled, you should see a little green check and the message 'API Enabled' beside it.
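Alternatively, you can enable it with the gcloud CLI, using your own project ID in place of the placeholder:

gcloud services enable language.googleapis.com --project=my-nlp-tutorial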
03. Create a service account
You'll need to set up a service account for this service. Since we'll be calling the API from server-side code, like a typical service, this is best practice and fits neatly into the authentication flow.
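Again, the gcloud CLI offers an alternative to clicking through the console; the account name here is only an example:

gcloud iam service-accounts create nlp-tutorial --display-name="NLP Tutorial"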
04. Download private key
Once you have a project with the API enabled and a service account, you can download your private key as a JSON file. Take note of the file's location, so you can use it in the next steps.
If you have any problems with the first few steps, Google provides a setup guide that walks through them, ending with the download of the JSON key.
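The CLI equivalent looks like this, where the service account email combines the example account name and project ID from the earlier steps:

gcloud iam service-accounts keys create ~/Downloads/nlp-key.json --iam-account=nlp-tutorial@my-nlp-tutorial.iam.gserviceaccount.com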
05. Set environment variable
Next, you need to set the GOOGLE_APPLICATION_CREDENTIALS environment variable so it can be accessed by our API calls. This points to the JSON file you just downloaded and saves you having to type the path every time. Open a new terminal window and use the export command like so:
export GOOGLE_APPLICATION_CREDENTIALS="/Users/username/Downloads/[file name].json"
Replace [file name] with the name of your private key file and adjust the path to match where you saved it.
On Windows, you can do the same thing in PowerShell, like this:
$env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\[FILE_NAME].json"
Note: If you close your terminal or console window, you may need to run that again to set the variable.
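You can quickly confirm the variable is set by echoing it back:

echo $GOOGLE_APPLICATION_CREDENTIALS

In PowerShell, the equivalent is echo $env:GOOGLE_APPLICATION_CREDENTIALS.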
06. Make a call to the API
Now you're ready to dig in and see NLP in action. You'll use cURL to run quick tests against the API; you can also use this method from your code.
cURL requests can be made from most languages, which means you can make calls directly from the command line or assign the result to a variable in the language of your choice. Look here for some quick tips on using cURL.
Let's try a test request, with a simple sentence. We'll run it through the analyzeEntities endpoint.
In your terminal or command line interface, enter the following command:
curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
--data "{
  'document': {
    'type': 'PLAIN_TEXT',
    'content': 'John McCarthy is one of the founding fathers of artificial intelligence.'
  },
  'encodingType': 'UTF8'
}" "https://language.googleapis.com/v1/documents:analyzeEntities"
You should see a JSON result after executing. You may get prompted the first time you use this to activate the API or allow access. You can answer 'yes' or 'y' to that prompt and it should return the JSON after that.
It will return an array of entities, starting with one like this for "John McCarthy":
{
  "name": "John McCarthy",
  "type": "PERSON",
  "metadata": {
    "wikipedia_url": "https://en.wikipedia.org/wiki/John_McCarthy_(computer_scientist)",
    "mid": "/m/01svfj"
  },
  "salience": 0.40979216,
  "mentions": [
    {
      "text": {
        "content": "John McCarthy",
        "beginOffset": 0
      },
      "type": "PROPER"
    }
  ]
},
Note: You could use a URL instead of content text in the content parameter of the cURL statement.
You can see in the sample entity listing the name identified and the type, which the AI determined is a PERSON. It also found a Wikipedia match for the name and returned its URL. This can be useful, since you could use that URL as the content for a second request to the API and get even more entities and information. The salience value of around 0.41 indicates a significant relative importance of the entity within the text we provided. The entity is also correctly identified as PROPER, which refers to the noun type (a proper noun), and the mentions array lists each occurrence of the entity in the text.
The API will return values for all the key entities in the text you submit. This alone can be extremely useful for processing what a user might be communicating to your app. Regardless of what the sentence contained, there is a good chance it is about the person, John McCarthy, and we could look up some information for the user based on this alone. We could also respond in a way that reflects our understanding this statement refers to a person.
You can keep using this method to test out the calls we'll use. You can also set up the SDK locally in a language you prefer and integrate it into your app.
07. Install client library
Time to make a simple web-based app to demonstrate how to integrate the API into projects.
For NLP apps it is common to use Python or Node. To show the versatility of the APIs, we'll use the PHP SDK instead. If you wish to adapt the code to a different language, there is a great resource of SDKs here.
Start by making sure you have a project folder set up on your local or remote server. If you don't have it already, download Composer and install it to your project folder. If you already have Composer installed globally, that's fine too.
Run the following Composer command to install the vendor files to your project:
php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
php -r "if (hash_file('sha384', 'composer-setup.php') === '93b54496392c062774670ac18b134c3b3a95e5a5e5c8f1a9f115f203b75bf9a129d5daa8ba6a13e2cc8a1da0806388a8') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"
php composer-setup.php
php -r "unlink('composer-setup.php');"
php composer.phar require google/cloud-language
Composer makes a vendor folder in your project folder and installs all the dependencies for you. Note that the installer hash in the second command changes with each Composer release, so check getcomposer.org for the current value if verification fails.
If you get stuck setting this up and want to use PHP, you can check out this resource on installing Composer.
08. Create a new file
If you're following along in PHP, create a new PHP file in your project folder. Set it up however you like but include a simple HTML form to quickly submit text through.
Here is an example PHP file with the form:
<!DOCTYPE html>
<html>
<head>
  <title>NET - NLP Tutorial</title>
</head>
<body>
  <form>
    <p><input type="text" id="content" name="content" placeholder="What can I analyze?" /></p>
    <p><input type="submit" name="submit" id="submit" value="analyze" /></p>
  </form>
  <div class="results">
    <?php
    // PHP code goes here
    if (empty($_GET['content'])) { die(); }
    $content = $_GET['content'];
    ?>
  </div>
</body>
</html>
The code sets up a basic HTML page with a form, along with a placeholder for your PHP code. The PHP starts by simply checking for the existence of the content variable (submitted from the form); if nothing has been submitted yet, it exits and does nothing.
09. Set the environment variable in PHP
Similar to the step we did previously for the command line cURL call, we need to set the GOOGLE_APPLICATION_CREDENTIALS variable; this is essential for authentication.
In PHP we use the putenv function to set an environment variable. The variable you exported in the terminal isn't visible to your web server's PHP process, so include this line in your code so the credentials are set on every request.
Add this code next in your PHP code:
putenv('GOOGLE_APPLICATION_CREDENTIALS=/Users/richardmattka/Downloads/NLP Tutorial 1-1027228343dc.json');
Replace the path and file name as you did before with your own.
10. Initialise the library
Next, add the library and initialise the LanguageClient class in your code. Add this code next to your PHP code section:
require __DIR__ . '/vendor/autoload.php';
use Google\Cloud\Language\LanguageClient;
$projectId = 'nlp-tutorial-1-1543506531329';
$language = new LanguageClient([
    'projectId' => $projectId
]);
Start by requiring the vendor autoload; this is similar to importing dependencies in Python or Node. Next, import the LanguageClient class so you can make use of it. Then define your projectId; if you aren't sure what this is, you can look it up in your GCP console, where you set up the project originally. Finally, create a new LanguageClient object using your projectId and assign it to the $language variable.
11. Analyse the entities
Now you're ready to start using the NLP API in your code. You can submit the content from the form to the API and get the result. For now you will just display the result as JSON to the screen. In practice you could assess the results and use them any way you wish. You could respond to the user based on the results, look up more information or execute tasks.
To recap, entity analysis will return information about the 'what' or the 'things' found in the text.
// Submit the form content for entity analysis
$result = $language->analyzeEntities($content);
foreach ($result->entities() as $e) {
    echo "<div class='result'>";
    // Pretty-print each entity so it's readable on screen
    echo json_encode($e, JSON_PRETTY_PRINT);
    echo "</div>";
}
This code submits the content from the form to the analyzeEntities endpoint and stores the result in the $result variable. You then iterate over the list of entities returned from $result->entities(). To make the output a little more readable, each entity is formatted as JSON before being echoed to the screen. Again, this is just an example to show you how to use it; you could process and react to the results however you need.
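If you'd rather work with the data than dump raw JSON, you can pick individual fields out of each entity. A minimal sketch, assuming the same SDK version as above, where each entity is exposed as an associative array:

foreach ($result->entities() as $e) {
    // Each entity array carries name, type and salience keys
    echo "<p>" . htmlspecialchars($e['name']) . " (" . $e['type'] . ") - salience: " . $e['salience'] . "</p>";
}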
12. Analyse the sentiment
Instead of knowing the 'what' of the content, it can also be valuable to know the sentiment. How does the user feel? How do they feel about the entities in their communications?
Update the code to use the analyzeEntitySentiment endpoint. This will assess the entities as before but also return a sentiment score for each one.
// Submit the form content for entity-level sentiment analysis
$result = $language->analyzeEntitySentiment($content);
foreach ($result->entities() as $e) {
    echo "<div class='result'>";
    // Pretty-print each entity, now including sentiment scores
    echo json_encode($e, JSON_PRETTY_PRINT);
    echo "</div>";
}
Testing via the form with the content "Star Wars is the best movie of all time.", you will see a result similar to this:
{
  "name": "Star Wars",
  "type": "WORK_OF_ART",
  "metadata": {
    "mid": "\/m\/06mmr",
    "wikipedia_url": "https:\/\/en.wikipedia.org\/wiki\/Star_Wars"
  },
  "salience": 0.63493526,
  "mentions": [
    {
      "text": { "content": "Star Wars", "beginOffset": 0 },
      "type": "PROPER",
      "sentiment": { "magnitude": 0.6, "score": 0.6 }
    }
  ],
  "sentiment": { "magnitude": 0.6, "score": 0.6 }
}
{
  "name": "movie",
  "type": "WORK_OF_ART",
  "metadata": [],
  "salience": 0.36506474,
  "mentions": [
    {
      "text": { "content": "movie", "beginOffset": 22 },
      "type": "COMMON",
      "sentiment": { "magnitude": 0.9, "score": 0.9 }
    }
  ],
  "sentiment": { "magnitude": 0.9, "score": 0.9 }
}
This shows a significantly positive sentiment score. Not only do you now know the key words the user is communicating but also how they feel about them, and your app can respond appropriately based on this data. You've got a clear identification of "Star Wars" as the primary subject, with high salience. You've got a Wikipedia link to grab more information if you want to run that URL back through the same API call. You also know the user is feeling positive about it; you can even see the statement weights the positive sentiment towards its quality as a movie. Very cool.
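A simple way to act on this data is to branch on the score for an entity (inside the loop above, $e holds the current entity). The thresholds below are arbitrary, chosen only for illustration:

$score = $e['sentiment']['score'];
if ($score > 0.25) {
    echo 'Glad you enjoyed it!';      // clearly positive
} elseif ($score < -0.25) {
    echo 'Sorry to hear that.';       // clearly negative
} else {
    echo 'Thanks for your thoughts.'; // neutral or mixed
}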
Parting thoughts
Try experimenting with other endpoints. Specifically, check out the analyzeSyntax and classifyText endpoints, which give you even more parts-of-speech data and a classification of the content into categories.
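Both follow the same pattern as the calls you've already made. A quick sketch, assuming the same SDK as above (note that classifyText needs a reasonable amount of text to work with, so very short test sentences may return no categories):

// Parts of speech for each token in the text
$syntax = $language->analyzeSyntax($content);
foreach ($syntax->tokens() as $token) {
    echo $token['text']['content'] . ' - ' . $token['partOfSpeech']['tag'] . '<br>';
}

// Broad content categories, each with a confidence value
$classification = $language->classifyText($content);
foreach ($classification->categories() as $category) {
    echo $category['name'] . ' - ' . $category['confidence'] . '<br>';
}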
This article was originally published in issue 315 of net, the world's best-selling magazine for web designers and developers.
Richard is an award-winning interactive technologist, designer and developer. He specialises in creating interactive worlds with science-fiction themes, exploring the synergy between human and machine. He has also written regular articles for Net Magazine and Web Designer Magazine on a range of topics across the world of tech, including artificial intelligence, VFX, 3D and more.