How do we use AI to extract useful information from text?
Text is how we communicate information to one another via written language. It is the primary type of data we encounter in books, articles, and emails. As a result, it is one of the most valuable forms of unstructured data that exist in our world.
In the last two articles of this multi-part series on The AI Developer’s Toolkit, I introduced you to the top AI tools for table analysis and table synthesis. In this article, I’ll introduce you to the three most popular AI tools for text analysis.
Text classification allows us to assign a body of text to one of two or more categories. It answers the question “what kind of text is this?” or “what group does this text belong to?”
For example, we could classify news articles by the industry they pertain to. We provide the text-classification model with the text of a news article as input. Then the model produces a prediction of which industry the article pertains to as output.
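To make the input-to-output flow concrete, here is a minimal sketch of a text classifier. It assumes a naive Bayes approach with add-one smoothing, which is only one of many ways to build such a model. The class name, the tiny training set, and the industry labels ("finance" and "healthcare") are hypothetical and exist purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Illustrative multinomial naive Bayes classifier (not production code)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()                  # every word seen in training

    def train(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label in self.doc_counts:
            # Start with the log prior of the label.
            log_prob = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[label][token]
                log_prob += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = log_prob
        return max(scores, key=scores.get)


# Hypothetical training articles, each labeled with an industry.
training_data = [
    ("The bank raised interest rates on loans", "finance"),
    ("Investors bought shares as the stock market rallied", "finance"),
    ("The hospital tested a new vaccine on patients", "healthcare"),
    ("Doctors reported that the drug trial helped patients", "healthcare"),
]

classifier = NaiveBayesClassifier()
classifier.train(training_data)

prediction = classifier.predict("The stock market fell after the bank cut rates")
print(prediction)  # finance
```

A real classifier would be trained on thousands of labeled articles, but the shape is the same: text goes in, and a predicted category comes out.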
Text classification is useful anytime you have a collection of documents and you need to organize them into two or more categories. For example:
Sentiment analysis allows us to determine the emotional sentiment of a body of text. It answers the question “is this text positive or negative?”
For example, we could analyze product reviews to determine if they are favorable or unfavorable. We provide the sentiment-analysis model with a product review as input. Then, the model produces a sentiment score as output. These scores often range from 0 (very negative) to 1 (very positive).
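As a rough illustration of a score between 0 and 1, here is a minimal sketch that assumes a simple word-list approach. The two word lists and the function name are hypothetical, and returning 0.5 for text with no sentiment words is my own assumption. Production sentiment models are typically learned from labeled reviews rather than hand-written lists.

```python
import re

# Hypothetical sentiment word lists, chosen only for illustration.
POSITIVE_WORDS = {"great", "love", "excellent", "good", "happy", "recommend"}
NEGATIVE_WORDS = {"bad", "terrible", "broke", "poor", "disappointed", "refund"}


def sentiment_score(text):
    """Return a score from 0 (very negative) to 1 (very positive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive + negative == 0:
        # Assumption: text with no sentiment words is treated as neutral.
        return 0.5
    # Fraction of sentiment-bearing words that are positive.
    return positive / (positive + negative)


print(sentiment_score("I love this blender, it works great"))      # 1.0
print(sentiment_score("Terrible quality, it broke after a week"))  # 0.0
print(sentiment_score("It arrived on Tuesday"))                    # 0.5
```

Note that a word list like this cannot handle negation ("not good") or sarcasm, which is one reason trained models are preferred in practice.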
Sentiment analysis is useful anytime you need to determine the emotional sentiment of a body of text. For example:
Entity recognition extracts named entities from a body of text. It answers the question: “what person, place, or thing do these words refer to?”
For example, we can use entity recognition to discover named entities in news articles. We provide the entity-recognition model with the text of each article as input. Then the model produces a list of the named entities and their locations in the text as output.
For example, the words “Microsoft” and “Google” in a business article would clearly refer to their respective companies. However, the word “Amazon” could either refer to the company or the river in South America. So the model needs to use the surrounding context to determine which words correspond to what entities.
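The "Amazon" ambiguity can be sketched in a few lines of code. This minimal example assumes a lookup table of known names plus a handful of context cue words. The table, the cue words, the entity type labels, and the function name are all hypothetical. Real entity-recognition models learn these patterns from annotated text instead of relying on hand-written lists.

```python
import re

# Hypothetical lookup table: entity name -> possible entity types.
GAZETTEER = {
    "Microsoft": ["ORGANIZATION"],
    "Google": ["ORGANIZATION"],
    "Amazon": ["ORGANIZATION", "LOCATION"],
}

# Hypothetical cue words that hint at each entity type.
CONTEXT_CUES = {
    "ORGANIZATION": {"company", "shares", "announced", "revenue", "ceo"},
    "LOCATION": {"river", "rainforest", "basin", "brazil", "flows"},
}


def recognize_entities(text, window=40):
    """Return known entities with their type and character offsets."""
    entities = []
    for name, candidate_types in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            start, end = match.span()
            # Look at the words within `window` characters of the match.
            context = text[max(0, start - window):end + window].lower()
            context_words = set(re.findall(r"[a-z]+", context))
            # Choose the type with the most cue words nearby.
            # On a tie, the first type listed in the gazetteer wins.
            best_type = max(
                candidate_types,
                key=lambda t: len(CONTEXT_CUES[t] & context_words),
            )
            entities.append(
                {"text": name, "type": best_type, "start": start, "end": end}
            )
    return sorted(entities, key=lambda entity: entity["start"])


business = recognize_entities("Amazon announced record revenue, beating Microsoft.")
geography = recognize_entities("The Amazon river flows through Brazil.")

for entity in business + geography:
    print(entity)
```

In the first sentence, "Amazon" is tagged as an organization because of words like "announced" and "revenue". In the second, "river" and "flows" tip it toward a location. The start and end offsets correspond to the "locations in the text" that an entity-recognition model reports.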
Entity recognition is useful anytime you want to determine what entities are contained in a body of text beyond simple word-matching. For example, …
Beyond these three common tools, there are a variety of other text-analysis tools. For example:
As we can see, text-analysis tools allow us to extract useful information from bodies of text.
If you’d like to learn how to use all of the tools listed above, please watch my online course: The AI Developer’s Toolkit.
The future belongs to those who invest in AI today. Don’t get left behind!