Text Analysis API Documentation (deprecated)
Text Analysis API
The Text Analysis API automatically detects the language of written text, extracts important phrases ("labels") from a message, and determines the sentiment polarity towards these phrases. It uses a JSON-based RPC format and can be accessed through GET and POST interfaces over the web. The API endpoint is located at:
http://api.ai-applied.nl/api/text_analysis_api/
Note: All our APIs allow demo and testing use for free. You can request an API key to test our APIs using the form below, after which an API key with 5000 credits will be e-mailed to the address you provide. Use of the API beyond this limit requires a commercial account and an API key. We will never share your e-mail address with anyone, and use it only to prevent abuse of the demo functionality. At any time you can upgrade your demo key by adding purchased credits to it, or by switching to a subscription plan.
The API call is passed in JSON format using either a POST or a GET request:
request={ "data":{ "api_key":"api_key", "call":{ "return_original":true, "sentiment_classifier":"default", "label_context":"default", "core_topics":["important topic","very important topic"], "data":[ { "text":"Some message text", "id":some_id } ] } } }
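As a minimal sketch, the same call can be assembled in Python and serialised with the standard library. The key value is a placeholder, the `some_id` placeholder is replaced by the integer 1, and form-encoding the JSON under a `request` key for POST is an assumption based on the `request=` notation above.

```python
import json
from urllib.parse import urlencode

# Same structure as the call above; the placeholder some_id is replaced by 1.
payload = {
    "data": {
        "api_key": "api_key",  # placeholder, substitute your own key
        "call": {
            "return_original": True,
            "sentiment_classifier": "default",
            "label_context": "default",
            "core_topics": ["important topic", "very important topic"],
            "data": [{"text": "Some message text", "id": 1}],
        },
    }
}

# json.dumps turns Python's True into JSON's true.
request_json = json.dumps(payload)

# Assumption: the JSON travels under the "request" key, form-encoded,
# for both a POST body and a GET query string.
encoded = urlencode({"request": request_json})
```

Building the structure as a dictionary and serialising it last avoids hand-written quoting mistakes in the JSON.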
This API accepts the following parameters:
Parameter | Obligatory | Description |
---|---|---|
data | Yes | transport container for the API call |
api_key | Yes | your key for the use of this API |
call | Yes | the API call information container |
return_original | No | return the full posted messages together with the language, sentiment and important-topic annotations (true), or only the annotated message IDs (false) |
sentiment_classifier | No | specifies which sentiment analysis classifier should be used to annotate your messages with sentiment. Sentiment analysis is available with two standard classifiers, "default" and "subjective". The "default" classifier provides a two-class classification ("positive"/"negative"), while the "subjective" classifier provides the "neutral" class as well. In addition, it is possible to use custom classifiers tailored to your purposes - contact us for more information. If this value is not provided, the "default" classifier is used |
label_context | No | specifies which language context to use when automatically detecting significant topics/concepts ("labels") in your text. For specific applications, a specific context tailored to your purposes can be created - contact us for more information. If this value is not provided, the general "default" language context is used |
core_topics | No | allows you to manually specify any topics you might be interested in. E.g. if you are analysing hotel reviews, and want to know the attitudes towards the "swimming pool" and "service" for every hotel, these topics can be supplied in this parameter as a list/array of strings. When left empty, all topics will be automatically detected |
data | Yes | a list of JSON-dictionary formatted messages, or a link to a data source API returning such formatted messages (see API nesting documentation) |
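The split between obligatory and optional parameters can be sketched as a small helper that omits any option left unset, so the server-side defaults described in the table apply. The function names `build_call` and `build_payload` are hypothetical and not part of the API.

```python
def build_call(messages, return_original=None, sentiment_classifier=None,
               label_context=None, core_topics=None):
    """Build the "call" container, leaving out options that were not set."""
    optional = {
        "return_original": return_original,
        "sentiment_classifier": sentiment_classifier,
        "label_context": label_context,
        "core_topics": core_topics,
    }
    # Keep only options the caller supplied; False is a valid value, None is not.
    call = {name: value for name, value in optional.items() if value is not None}
    call["data"] = messages  # obligatory
    return call


def build_payload(api_key, call):
    """Wrap the call in the obligatory transport container."""
    return {"data": {"api_key": api_key, "call": call}}
```

For example, `build_call(messages, core_topics=["swimming pool", "service"])` produces a call that names two topics of interest and relies on the defaults for everything else.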
All messages passed in the final data parameter must be JSON objects with the following fields:
Parameter | Obligatory | Description |
---|---|---|
text | Yes | The message text as a string |
id | No | a unique message ID as a string or an integer. When omitted, an ID is assigned automatically in the response |
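A client-side check of these two fields might look like the sketch below. The helper `normalize_messages` is hypothetical; accepting bare strings as shorthand for a message without an ID is a convenience of this sketch, not something the API itself does.

```python
def normalize_messages(items):
    """Turn strings or dictionaries into message dictionaries with a valid shape."""
    messages = []
    for item in items:
        # A bare string becomes a message without an explicit ID.
        message = {"text": item} if isinstance(item, str) else dict(item)
        if not isinstance(message.get("text"), str):
            raise ValueError("every message needs a 'text' string")
        if "id" in message:
            message_id = message["id"]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(message_id, bool) or not isinstance(message_id, (str, int)):
                raise ValueError("'id' must be a string or an integer")
        messages.append(message)
    return messages
```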
NOTE: The Text Analysis API is intended to be used in a BATCH PROCESSING mode. We encourage sending up to 100,000 messages at a time for processing (using a POST API call): larger batches are processed faster per message, and they allow the system to recognize the context of your text better, which improves the accuracy of topic detection and sentiment analysis. For example, sending all reviews of a single hotel at once will allow the API to automatically discover all topics relevant to hotels, and give you a more informed reply and summary.
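If a collection exceeds the recommended batch size, it can be split on the client before sending. The generator below is a minimal sketch; the name `batches` and treating 100,000 as a hard upper bound per call are assumptions.

```python
def batches(messages, size=100_000):
    """Yield consecutive slices of at most `size` messages."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(messages), size):
        yield messages[start:start + size]
```

Because topics and the summary are computed per batch, it is probably better to keep related messages (such as all reviews of one hotel) in the same slice rather than splitting them arbitrarily.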
Sample call and response interpretation
This example asks the Text Analysis API to extract the most important phrases from three messages. You can copy and paste the URL below into your browser's address bar to view the API response:
http://api.ai-applied.nl/api/text_analysis_api/?request={ "data": { "api_key": "DEMO_ACCOUNT", "call": { "return_original": true, "sentiment_classifier": "subjective", "data": [ { "text": "Very good phone easy to use with touch screen keybord.", "id": 1 }, { "text": "I love this phone. The ease of use is fantastic, preinstalled applications are great. It tends to freeze often, though.", "id": 2 }, { "text": "I like the design and the very practical way to use it", "id": 3 } ] } } }
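Browsers usually escape the spaces, quotes and braces in such a URL automatically, but a program should encode them itself. The sketch below builds an equivalent GET URL for the first sample message with the standard library; `build_get_url` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://api.ai-applied.nl/api/text_analysis_api/"


def build_get_url(payload):
    """Return the endpoint URL with the payload percent-encoded in `request`."""
    return ENDPOINT + "?" + urlencode({"request": json.dumps(payload)})


sample_payload = {
    "data": {
        "api_key": "DEMO_ACCOUNT",
        "call": {
            "return_original": True,
            "sentiment_classifier": "subjective",
            "data": [
                {"text": "Very good phone easy to use with touch screen keybord.", "id": 1},
            ],
        },
    }
}

url = build_get_url(sample_payload)

# Decoding the query string again recovers the original JSON.
decoded = json.loads(parse_qs(urlsplit(url).query)["request"][0])
```

For large batches a POST call is the better fit, since URLs have practical length limits.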
This call returns the following JSON-formatted message:
{ "status": 1, "id": null, "response": { "data": [ { "confidence_sentiment": 0.8888889, "language_eng_name": "English", "text": "Very good phone easy to use with touch screen keybord.", "confidence_language": 1, "sentiment_class": "positive", "topic_sentiments": { "touch screen": { "confidence_sentiment": 0.8888889, "sentiment_class": "positive" } }, "language_iso": "eng", "text_labels": [ [ "touch screen", 6.477905 ] ], "id": 1 }, { "confidence_sentiment": 0.690659, "language_eng_name": "English", "text": "I love this phone. The ease of use is fantastic, preinstalled applications are great. It tends to freeze often, though.", "confidence_language": 1, "sentiment_class": "positive", "topic_sentiments": { "preinstalled applications": { "confidence_sentiment": 0.95238096, "sentiment_class": "positive" }, "freeze": { "confidence_sentiment": 1, "sentiment_class": "neutral" } }, "language_iso": "eng", "text_labels": [ [ "preinstalled applications", 6.3640656 ], [ "freeze", 2.7146833 ] ], "id": 2 }, { "confidence_sentiment": 0.8888889, "language_eng_name": "English", "text": "I like the design and the very practical way to use it", "confidence_language": 1, "sentiment_class": "neutral", "topic_sentiments": { "design": { "confidence_sentiment": 0.8888889, "sentiment_class": "neutral" } }, "language_iso": "eng", "text_labels": [ [ "design", 6.837217 ] ], "id": 3 } ], "description": "OK: Call processed.", "success": true, "summary": { "topic_summary": [ { "topic": "design", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral" }, { "topic": "touch screen", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive" }, { "topic": "preinstalled applications", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive" }, { "topic": "freeze", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral" } ], "general_summary": { "confidence_sentiment": 0.63989806, "sentiment_class": "positive" } } } }
The following parameters in the API reply belong to the transport layer and can be stripped away in case of success:
Parameter | Description |
---|---|
status | has value 1 if the message has been transported successfully through all systems |
id | an optional callback parameter that can be ignored for this API |
response | contains the response from the API classifiers |
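Stripping the transport layer can be sketched as follows. Treating every status other than 1 as a transport failure is an assumption, since only the success value is documented; `TransportError` and `unwrap_transport` are hypothetical names.

```python
import json


class TransportError(Exception):
    """Raised when the reply did not pass through all systems successfully."""


def unwrap_transport(raw_reply):
    """Parse the raw reply text and return only the "response" part."""
    envelope = json.loads(raw_reply)
    # Only status 1 is documented as success; anything else is treated as failure here.
    if envelope.get("status") != 1:
        raise TransportError("unexpected status: %r" % envelope.get("status"))
    return envelope["response"]
```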
The "response" parameter contains this API's reply and consists of the following subparameters:
Parameter | Description |
---|---|
success | indicates whether the API call has been completed with success (==true), or has failed (==false) |
description | gives a description of the API call's success or failure as a string |
data | contains a list of language, sentiment and topic annotated texts provided by the API in a JSON format, with additional sentiment and relevance scores per topic |
summary | contains the general summary of the data batch and summaries of the most important topics across the entire processed batch of documents, with their relative importance, sentiment classes and intensities. The array is sorted in descending order of phrase weight |
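A caller would typically check `success` before touching `data` or `summary`. The sketch below assumes that a failed call may omit those two fields, which the documentation does not state; `ApiCallError` and `extract_results` are hypothetical names.

```python
class ApiCallError(Exception):
    """Raised when the API reports that the call itself failed."""


def extract_results(response):
    """Return (data, summary) from a response, or raise with its description."""
    if not response.get("success"):
        raise ApiCallError(response.get("description", "no description given"))
    # Assumption: fall back to empty containers if a field is absent.
    return response.get("data", []), response.get("summary", {})
```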
All messages returned in the data parameter are JSON formatted like the following example:
{ "confidence_sentiment": 0.8888889, "language_eng_name": "English", "text": "Very good phone easy to use with touch screen keybord.", "confidence_language": 1, "sentiment_class": "positive", "topic_sentiments": { "touch screen": { "confidence_sentiment": 0.8888889, "sentiment_class": "positive" } }, "language_iso": "eng", "text_labels": [ [ "touch screen", 6.477905 ] ], "id": 1 }
When the "return_original" parameter is set to true in the API call, the returned messages contain all parameters provided to the API in the call in addition to the annotation parameters. If the "return_original" parameter is set to false in the API call, the following annotation parameters are returned:
Parameter | Description |
---|---|
language_iso | the ISO-639-3 code of the detected language, or "unknown" if the language is unknown to the API |
language_eng_name | the name of the language in English, or "unknown" if the language is unknown to the API |
confidence_language | the confidence the API has in its judgement about the detected language, on a scale of 0-1 |
sentiment_class | the sentiment class extracted from text. Depending on the classifier used, this class can be "positive", "negative" or "neutral", or "unknown" if the message itself was in an unsupported language |
confidence_sentiment | the confidence the API has in its judgement about the detected sentiment class |
text_labels | the phrases extracted from text. This parameter contains an array of arrays which themselves are formatted as [phrase, importance weight] |
topic_sentiments | the specific sentiment towards the phrases extracted from text, in a JSON dictionary format. Every phrase has its own "sentiment_class" and "confidence_sentiment", which are described above |
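Since `text_labels` and `topic_sentiments` describe the same phrases, it can be convenient to join them per message. The sketch below does so for the second message of the sample response; `topics_by_weight` is a hypothetical helper, and tolerating a phrase that has no entry in `topic_sentiments` is a defensive assumption.

```python
def topics_by_weight(message):
    """Return (phrase, weight, sentiment_class, confidence) rows, heaviest first."""
    sentiments = message.get("topic_sentiments", {})
    rows = []
    for phrase, weight in message.get("text_labels", []):
        sentiment = sentiments.get(phrase, {})
        rows.append((
            phrase,
            weight,
            sentiment.get("sentiment_class"),
            sentiment.get("confidence_sentiment"),
        ))
    return sorted(rows, key=lambda row: row[1], reverse=True)


annotated = {
    "sentiment_class": "positive",
    "topic_sentiments": {
        "preinstalled applications": {"confidence_sentiment": 0.95238096, "sentiment_class": "positive"},
        "freeze": {"confidence_sentiment": 1, "sentiment_class": "neutral"},
    },
    "text_labels": [["preinstalled applications", 6.3640656], ["freeze", 2.7146833]],
    "id": 2,
}

rows = topics_by_weight(annotated)
```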
The summary of the entire batch, returned in the summary parameter, is JSON formatted like the following example:
{ "topic_summary": [ { "topic": "design", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral" }, { "topic": "touch screen", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive" }, { "topic": "preinstalled applications", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive" }, { "topic": "freeze", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral" } ], "general_summary": { "confidence_sentiment": 0.63989806, "sentiment_class": "positive" } }
The summary consists of the following parameters:
Parameter | Description |
---|---|
general_summary | the general summary displays the sentiment class and intensity of the general mood of all messages in the batch |
topic_summary | the topic summary displays the sentiment class ("sentiment_class"), sentiment intensity ("confidence_sentiment") and prevalence ("in_articles") of every topic ("topic") extracted from messages in the batch. It is sorted by the importance of the individual topics as detected by the system, with manually set "core_topics" always deemed the most important (if present) |
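As one way of consuming the summary, the sketch below groups the topics of the sample summary by sentiment class while keeping the order in which the API listed them. The helper name `topics_by_sentiment` is hypothetical.

```python
from collections import defaultdict


def topics_by_sentiment(summary):
    """Group topic names by sentiment class, preserving the API's ordering."""
    grouped = defaultdict(list)
    for entry in summary.get("topic_summary", []):
        grouped[entry["sentiment_class"]].append(entry["topic"])
    return dict(grouped)


sample_summary = {
    "topic_summary": [
        {"topic": "design", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral"},
        {"topic": "touch screen", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive"},
        {"topic": "preinstalled applications", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "positive"},
        {"topic": "freeze", "confidence_sentiment": 1, "in_articles": 1, "sentiment_class": "neutral"},
    ],
    "general_summary": {"confidence_sentiment": 0.63989806, "sentiment_class": "positive"},
}

grouped_topics = topics_by_sentiment(sample_summary)
overall_mood = sample_summary["general_summary"]["sentiment_class"]
```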
If you need more assistance with the implementation of this API, please don't hesitate to comment below, or contact us for assistance!