Data Miner API Documentation

Data Miner API Implementation instructions

The Data Miner API is RESTful and can be accessed through GET and POST requests over the web. The API endpoint is located at:

http://api.ai-applied.nl/data_miner/data_summary/

API calls are passed as a JSON object, using either the POST or GET method. This API allows you to request information summaries for keywords tracked by the Data Miner System, and to manage the tracked keywords that belong to your account.

Keyword manipulation

The Data Miner API supports administrative functions on your list of tracked keywords (listing, adding, and removing keywords) using the following API call format:

request={"data":{"api_key":"your api key","user":"your Data Miner System username","admin":{"action":"your action","data":[["keyword","source"],["keyword","source"]]}}}

Please note that this call can only be executed if the security restrictions allow it.
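As a rough illustration, the call format above could be assembled in Python as follows. The helper name `build_admin_url` is hypothetical and not part of the API. The sketch assumes the JSON object travels in the `request` query parameter, as in the example calls in this document, and percent-encodes it so the URL stays valid.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "http://api.ai-applied.nl/data_miner/data_summary/"

def build_admin_url(api_key, user, action, pairs=()):
    """Return a GET URL for an administrative call (hypothetical helper)."""
    payload = {
        "data": {
            "api_key": api_key,
            "user": user,
            "admin": {
                "action": action,
                # Each entry is a [keyword, source] pair; LIST_KEYWORDS uses [].
                "data": [list(pair) for pair in pairs],
            },
        }
    }
    # Compact separators keep the URL short; urlencode percent-escapes the JSON.
    request_json = json.dumps(payload, separators=(",", ":"))
    return ENDPOINT + "?" + urlencode({"request": request_json})

url = build_admin_url("your api key", "DEMO", "LIST_KEYWORDS")
```

Keeping the payload as a plain dictionary until the last step means the same structure could also be sent in a POST body, if that is how your client talks to the API.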

The API accepts the following parameters for administrative functions, packed inside the transport container (the first "data" parameter):

Parameter

Mandatory

Description

api_key (string)

Yes

Your key for the use of this API. The Data Miner API always requires a valid API key. If you need an API key, please contact us.

user (string)

Yes

Your username in the Data Miner System.

action (string)

Yes

The administrative function you wish to execute. The system allows three types of actions:

  1. LIST_KEYWORDS
  2. ADD_KEYWORDS
  3. REMOVE_KEYWORDS

LIST_KEYWORDS allows you to view all keywords currently being tracked by the system for your account. For this function, the subsequent "data" parameter is still mandatory but can be an empty array. This is an example call for LIST_KEYWORDS:

http://api.ai-applied.nl/data_miner/data_summary/?request={"data":{"api_key":"977a0b738d36942eb510b41c17cc7f4be1e019a6","user":"DEMO","admin":{"action":"LIST_KEYWORDS","data":[]}}}

This call returns the following JSON formatted message:

{"status": 1, "id": null, "response": {"data": [["kurzweil", "twitter"], ["kurzweil", "facebook"], ["artificial intelligence", "twitter"], ["artificial intelligence", "facebook"], ["big data", "twitter"], ["big data", "facebook"]], "description": "OK: Returning keywords and datasources.", "success": true}}

The reply format for this call contains the following parameters:

Parameter

Description

status (integer)

Status is 1 if the message has been transported successfully through all systems

id (integer or string)

This is an optional callback parameter that can be ignored for this API

response (JSON dictionary object)

This contains the response from the API

The "response" parameter contains this API's reply and consists of the following subparameters:

success (boolean)

This indicates whether the API call has been completed with success (==true), or has failed (==false)

description (string)

This gives a description of the API call's success or failure as a string

data (array of arrays)

This is a list of keyword/data-source pairs currently being tracked by the system for your account. Each entry returned in the data parameter is a two-element array, such as the following example:

["big data", "twitter"]

In this list, the first entry ("big data") denotes a keyword that is currently being tracked, while the second entry ("twitter") denotes the data source that the keyword is being tracked on.
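To make the reply structure concrete, the following sketch decodes the LIST_KEYWORDS example reply shown above and groups the tracked keywords by data source. The function name `keywords_by_source` is an illustrative assumption, not part of the API.

```python
import json
from collections import defaultdict

# Reply text copied from the LIST_KEYWORDS example in this document.
reply_text = (
    '{"status": 1, "id": null, "response": {"data": '
    '[["kurzweil", "twitter"], ["kurzweil", "facebook"], '
    '["artificial intelligence", "twitter"], ["artificial intelligence", "facebook"], '
    '["big data", "twitter"], ["big data", "facebook"]], '
    '"description": "OK: Returning keywords and datasources.", "success": true}}'
)

def keywords_by_source(text):
    """Group tracked keywords by data source (hypothetical helper)."""
    reply = json.loads(text)
    response = reply.get("response") or {}
    # Check both the transport status and the API-level success flag.
    if reply.get("status") != 1 or not response.get("success"):
        raise RuntimeError(response.get("description", "call failed"))
    grouped = defaultdict(list)
    for keyword, source in response["data"]:
        grouped[source].append(keyword)
    return dict(grouped)

tracked = keywords_by_source(reply_text)
print(tracked["twitter"])  # ['kurzweil', 'artificial intelligence', 'big data']
```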

ADD_KEYWORDS allows you to add keywords to the system; they are tracked from the moment of addition. When you call this administrative function, the subsequent "data" parameter, which is mandatory, is a JSON-formatted array of keyword/data-source arrays, each like the following example:

["keyword", "source"]

where the first entry ("keyword") specifies the keyword you would like to track, while the second entry ("source") specifies the data source on which you would like to track it. Currently, three tracking sources are supported:

  1. twitter
  2. facebook
  3. news

This is an example call for the administrative function ADD_KEYWORDS, in which the keyword "test" is added to tracking on "twitter" and in the "news":

http://api.ai-applied.nl/data_miner/data_summary/?request={"data":{"api_key":"977a0b738d36942eb510b41c17cc7f4be1e019a6","user":"DEMO","admin":{"action":"ADD_KEYWORDS","data":[["test","twitter"],["test","news"]]}}}

The reply format for this call is identical to the reply format for LIST_KEYWORDS, but with a different value for the "data" parameter:

data (null)

This parameter is empty (null) in replies to ADD_KEYWORDS
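Because only the three sources listed above are supported, it can help to check keyword/source pairs on the client before sending ADD_KEYWORDS. The sketch below is one possible check. The names `SUPPORTED_SOURCES` and `validate_pairs` are hypothetical, and it assumes that source names must be written in lowercase exactly as listed, which this document does not state explicitly.

```python
# Sources named in this document; assumed to be case-sensitive.
SUPPORTED_SOURCES = {"twitter", "facebook", "news"}

def validate_pairs(pairs):
    """Return pairs as [keyword, source] lists, or raise ValueError."""
    cleaned = []
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError(f"expected [keyword, source], got {pair!r}")
        keyword, source = pair
        if not isinstance(keyword, str) or not keyword.strip():
            raise ValueError(f"keyword must be a non-empty string: {keyword!r}")
        if source not in SUPPORTED_SOURCES:
            raise ValueError(f"unsupported source: {source!r}")
        cleaned.append([keyword, source])
    return cleaned

# The "admin" part of the ADD_KEYWORDS example call above.
admin = {
    "action": "ADD_KEYWORDS",
    "data": validate_pairs([("test", "twitter"), ("test", "news")]),
}
```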

REMOVE_KEYWORDS allows you to remove keywords from the tracking system for your account. When you call this administrative function, the subsequent "data" parameter, which is mandatory, is a JSON-formatted array of keyword/data-source arrays, each like the following example:

["keyword", "source"]

where the first entry ("keyword") specifies the keyword you would like to stop tracking, while the second entry ("source") specifies the data source on which to stop tracking it. Currently, three tracking sources are supported:

  1. twitter
  2. facebook
  3. news

This is an example call for REMOVE_KEYWORDS, in which the keyword "test" is removed from tracking on "twitter" and in the "news":

http://api.ai-applied.nl/data_miner/data_summary/?request={"data":{"api_key":"977a0b738d36942eb510b41c17cc7f4be1e019a6","user":"DEMO","admin":{"action":"REMOVE_KEYWORDS","data":[["test","twitter"],["test","news"]]}}}

WARNING: When you remove keywords from tracking for your account, all data collected for the corresponding keywords is removed from the data store as well. No further insights can be obtained from this data from that moment on. 

The reply format for this call is identical to the reply format for LIST_KEYWORDS, but with different values for two parameters:

description (string)

This gives a description of the API call's success or failure as a string. Returns a "WARNING" if the command has been only partially successful.

data (null)

This parameter is empty (null) in replies to REMOVE_KEYWORDS
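Since a partially successful removal is reported only through the description string, a client may want to distinguish three outcomes. The sketch below is one way to do that. It assumes the word "WARNING" appears literally in the description text and makes no assumption about the value of "success" in that case; neither detail is specified in this document. The function name `admin_outcome` is hypothetical.

```python
def admin_outcome(reply):
    """Classify a decoded admin reply as 'ok', 'warning' or 'failed'."""
    if reply.get("status") != 1:
        return "failed"  # the message did not pass through all systems
    response = reply.get("response") or {}
    description = response.get("description") or ""
    # Assumption: partial success is flagged by the word WARNING in the text.
    if "WARNING" in description.upper():
        return "warning"
    if not response.get("success"):
        return "failed"
    return "ok"
```

A caller would typically log the full description for the "warning" and "failed" outcomes, and re-run LIST_KEYWORDS after a warning to see which pairs are still tracked.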

Requesting Analytics

Data Miner API allows you to request powerful analytics summaries derived from the data which has been previously collected and analyzed for your account by the Data Miner System. Requesting analytics summaries can be achieved by using the following API call:

request={"data":{"api_key":"your api key","user":"your Data Miner System username","call":{"from":time,"to":time,"keywords": ["tracking keyword 1", "tracking keyword 2"],"key_terms": nr_of_related_key_terms,"statistics_path":"statistics path","resolution":time resolution}}}

The API accepts the following parameters for analytics requests, packed inside the transport container (the first "data" parameter):

Parameter

Mandatory

Description

api_key (string)

Yes

Your key for the use of this API. The Data Miner API always requires a valid API key. If you need an API key, please contact us.

user (string)

Yes

Your username in the Data Miner System.

from (integer)

Yes

This is the date and time from which you want an analytics summary generated, in UNIX Epoch format.

to (integer)

Yes

This is the date and time until which you want an analytics summary generated, in UNIX Epoch format.

keywords (array of strings)

Yes

The keywords in your Data Miner System account for which you require an analytics summary. You can specify one or multiple keywords simultaneously, for example:

["demo","test","trial"]

The individual counts and statistics of all keywords supplied in this field will be merged in the resulting analytics summary. If you require separate statistics for the requested keywords, please execute multiple calls to this API, specifying a single keyword per call.

key_terms (integer)

No

This optional parameter specifies how many related phrases to return: phrases that commonly occur in the aggregated data together with the keywords you requested. If the parameter is omitted or set to 0, no related phrases are returned, which is the default behavior. Requesting key terms may result in longer call processing times.

statistics_path (string)

Yes

This parameter allows you to specify a "drill-down" of statistics in the requested analytics summary by different parameters. Possible basic parameters are:

  • "sentiment_class": returns the specific numbers of positive/negative/neutral/unknown messages analysed by the system
  • "age": returns the specific numbers of messages analysed by the system for five age categories: 12-20, 21-30, 31-40, 41-50, 51-65 (or unknown)
  • "gender": returns the specific numbers of messages by male/female/unknown users analysed by the system
  • "source": returns the specific number of messages per data source as collected and analysed by the system (currently twitter, facebook or news)
  • "language_iso": returns the specific number of messages per language code as analysed by the system, as well as the number of messages written in unknown languages

You can also set the "statistics_path" parameter to an empty string; this will return a simple count of ALL messages within the selected time for the selected keywords.

The "statistics_path" parameter also allows you to combine multiple basic parameters of the "drill-down" for more specific insights. You can combine multiple basic parameters by appending them to one another, using a "/" (slash) character as a separator.

For example, to find out how many 21-30 year old women writing in English on Twitter are negative about your keywords, you could request the statistics_path "age/gender/language_iso/source/sentiment_class" from the system.

Related phrases, which are requested using the "key_terms" parameter, are automatically grouped into the "drill-down" cohorts you requested, to help you answer the "why?" questions about the related statistics.

There is no limit to the number of basic parameters that can be combined in a "drill-down", so analytics summaries can be as specific as you require. Longer combinations of basic parameters do result in longer processing times.

resolution (integer)

Yes

This is the time resolution of the requested analytics summary, allowing you to track the progression of analytics counts through time. Resolution is expressed in seconds, and the minimum resolution is one second.

For different time periods you can use:

  • Second: 1
  • Minute: 60
  • Hour: 3600
  • Day: 86400
  • Week: 604800
  • Month: 2592000 (based on a 30-day month)
  • Quarter: 7776000 (based on 3 30-day months)
  • Year: 31536000
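The parameters above can be put together in a few lines of Python. This is a sketch only: the names `RESOLUTION`, `to_epoch` and `build_call` are hypothetical helpers, not part of the API. The example values reproduce the "call" object of the sample request in the next section.

```python
from datetime import datetime, timezone

# Resolution values in seconds, as listed above.
RESOLUTION = {
    "second": 1, "minute": 60, "hour": 3600, "day": 86400,
    "week": 604800, "month": 2592000, "quarter": 7776000, "year": 31536000,
}

def to_epoch(dt):
    """Convert a timezone-aware datetime to integer UNIX epoch seconds."""
    if dt.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    return int(dt.timestamp())

def build_call(start, end, keywords, drill_down=(), resolution="day", key_terms=0):
    """Build the "call" object of an analytics request (hypothetical helper)."""
    call = {
        "from": to_epoch(start),
        "to": to_epoch(end),
        "keywords": list(keywords),
        # Basic parameters are joined with "/"; an empty path gives a plain count.
        "statistics_path": "/".join(drill_down),
        "resolution": RESOLUTION[resolution],
    }
    if key_terms:
        call["key_terms"] = key_terms  # optional; omitted means no related phrases
    return call

call = build_call(
    datetime(2013, 1, 8, 16, 23, 57, tzinfo=timezone.utc),
    datetime(2013, 1, 15, 16, 23, 57, tzinfo=timezone.utc),
    ["big data"],
    drill_down=["gender", "sentiment_class"],
    key_terms=5,
)
print(call["from"], call["to"])  # 1357662237 1358267037
```

Requiring timezone-aware datetimes is a deliberate choice here: a naive datetime would be interpreted in the local time zone and silently shift the requested period.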

Sample call and response interpretation

This is an example request to the Data Miner API for an analytics summary for the term "big data" on the DEMO account, for the period between Tuesday, 8th of January 2013 16:23:57 GMT and Tuesday, 15th of January 2013 16:23:57 GMT, requesting a drill-down of sentiment_class per gender, showing the 5 most important related phrases, with one-day resolution (copy and paste the example to run it):

http://api.ai-applied.nl/data_miner/data_summary/?request={"data":{"api_key":"977a0b738d36942eb510b41c17cc7f4be1e019a6","user":"DEMO","call":{"from":1357662237,"to":1358267037,"keywords":["big data"],"key_terms":5,"statistics_path":"gender/sentiment_class","resolution":86400}}}

This call returns the following JSON formatted message (only partially displayed here for clarity):

{"status": 1, "id": null, "response": {"data": [{"timestamp_utimestamp": 1358121600, "summary": {"key_terms": {"seo": 483, "analytics": 167, "forbes": 126, "data": 911, "opportunities": 107}, "unknown": {"key_terms": {"seo": 19, "opportunities": 10, "data": 36, "emc": 12, "contenute": 15}, "positive": {"key_terms": {"seo": 19, "big data": 6, "data": 24, "opportunities": 8, "emc": 12}, "total": 238}, "negative": {"key_terms": {"shortage looms": 8, "data": 12, "commitment": 4, "retailers": 4, "crash excel": 4}, "total": 56}, "neutral": {"key_terms": {"teenage": 1, "ecommerce": 1, "ammo": 2, "opportunities": 2, "lent": 1}, "total": 17}, "unknown": {"key_terms": {"big data": 3, "crescono": 3, "btob": 6, "contenute": 15, "delizia": 4}, "total": 38}, "total": 349}, "total": 8313, "male": {"key_terms": {"seo": 429, "analytics": 139, "forbes": 108, "data": 760, "opportunities": 88}, "positive": {"key_terms": {"seo": 418, "analytics": 128, "data": 585, "opportunities": 82, "entrepreneurs in reflections": 77}, "total": 5272}, "negative": {"key_terms": {"watchers hooked": 44, "drastic": 39, "forbes": 62, "data": 146, "tech stocks": 53}, "total": 1302}, "neutral": {"key_terms": {"seo": 11, "analytics": 11, "data": 29, "faster": 7, "opportunities": 6}, "total": 264}, "unknown": {"key_terms": {}, "total": 0}, "total": 6838}, "female": {"key_terms": {"seo": 35, "analytics": 23, "data": 115, "sale": 47, "emc": 21}, "positive": {"key_terms": {"seo": 35, "analytics": 23, "data": 99, "sale": 47, "emc": 21}, "total": 856}, "negative": {"key_terms": {"watchers hooked": 14, "forbes": 11, "data": 11, "vormerken": 6, "konzerne befassen": 6}, "total": 234}, "neutral": {"key_terms": {"roi": 2, "technologies": 3, "data": 5, "venturebeatcan 'big data' lift": 2, "entrepreneurs in reflections": 2}, "total": 36}, "unknown": {"key_terms": {}, "total": 0}, "total": 1126}}}, {"timestamp_utimestamp": 1358208000, "summary": {"key_terms": {"analytics": 328, "opentable deal restaurant data provider locu": 293, "data": 573, "tripadvisor": 289, "'big data'": 248}, "unknown": {"key_terms": {"analytics": 11, "data": 16, "contenute": 12, "tech stocks": 21, "crash excel": 13}, "positive": {"key_terms": {"seo": 8, "cmos embrace pinterest": 5, "trends": 10, "royalties": 5, "cricinfo's statsguru": 6}, "total": 59}, "negative": {"key_terms": {"romney's": 5, "fanning": 5, "villain exp": 3, "argument": 2, "eus": 1}, "total": 16}, "neutral": {"key_terms": {"analytics": 11, "internet": 9, "data": 16, "tech stocks": 21, "crash excel": 13}, "total": 336}, "unknown": {"key_terms": {"sbic": 4, "storage": 5, "cuman bisa buat ngebuka beberapa halaman web dilema browsing": 6, "contenute": 12, "consigliati": 5}, "total": 48}, "total": 459}, "total": 9716, "male": {"key_terms": {"analytics": 262, "opentable deal restaurant data provider locu": 255, "data": 512, "tripadvisor": 257, "'big data'": 224}, "positive": {"key_terms": {"analytics": 75, "insurer": 32, "data": 35, "practical ecommerce": 32, "mobility": 42}, "total": 1303}, "negative": {"key_terms": {"watchers hooked": 45, "track": 38, "data": 20, "opportunities": 10, "retailers": 22}, "total": 221}, "neutral": {"key_terms": {"tripadvisor": 246, "opentable deal restaurant data provider locu": 239, "data": 457, "'big data'": 216, "analytics": 187}, "total": 6665}, "unknown": {"key_terms": {}, "total": 0}, "total": 8189}, "female": {"key_terms": {"analytics": 55, "opentable deal restaurant data provider locu": 30, "data": 45, "tripadvisor": 24, "internet": 32}, "positive": {"key_terms": {"analytics": 17, "creativity": 8, "data": 10, "algorithm wars": 6, "insurer": 9}, "total": 147}, "negative": {"key_terms": {"screw": 4, "fanning": 10, "excel": 1, "iping computer repair": 4, "discovery": 4}, "total": 25}, "neutral": {"key_terms": {"analytics": 38, "opentable deal restaurant data provider locu": 25, "data": 35, "tripadvisor": 24, "internet": 32}, "total": 896}, "unknown": {"key_terms": {}, "total": 0}, "total": 1068}}}],
"description": "OK.", "success": true}}

The reply format for this call contains the following parameters:

Parameter

Description

status (integer)

Status is 1 if the message has been transported successfully through all systems

id (integer or string)

This is an optional callback parameter that can be ignored for this API

response (JSON dictionary object)

This contains the response from the API

The "response" parameter contains this API's reply and consists of the following subparameters:

success (boolean)

This indicates whether the API call has been completed with success (==true), or has failed (==false)

description (string)

This gives a description of the API call's success or failure as a string

data (array of JSON objects)

This contains the requested analytics data per time point at the chosen resolution. The time points themselves are represented as JSON dictionary objects, in chronological order.
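As an illustration of reading the "data" array, the sketch below turns the time points into a simple (date, total) series. It assumes that "timestamp_utimestamp" is a UNIX epoch value in seconds, like the "from" and "to" request parameters; the function name `totals_over_time` is hypothetical. The input is a trimmed copy of the two time points in the sample reply, with the drill-down omitted.

```python
from datetime import datetime, timezone

def totals_over_time(data):
    """Return (UTC datetime, total) pairs, one per time point."""
    series = []
    for point in data:
        # Assumption: the timestamp is in epoch seconds, interpreted as UTC.
        when = datetime.fromtimestamp(point["timestamp_utimestamp"], tz=timezone.utc)
        series.append((when, point["summary"]["total"]))
    return series

# Trimmed copy of the sample reply: only timestamps and top-level totals.
sample_data = [
    {"timestamp_utimestamp": 1358121600, "summary": {"total": 8313}},
    {"timestamp_utimestamp": 1358208000, "summary": {"total": 9716}},
]

for when, total in totals_over_time(sample_data):
    print(when.date().isoformat(), total)
# 2013-01-14 8313
# 2013-01-15 9716
```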

A single time point returned for the analytics report, depending on the parameters selected in the request, has the following format:

{"timestamp_utimestamp": 1358121600, "summary": {"key_terms": {"seo": 483, "analytics": 167, "forbes": 126, "data": 911, "opportunities": 107}, "unknown": {"key_terms": {"seo": 19, "opportunities": 10, "data": 36, "emc": 12, "contenute": 15}, "positive": {"key_terms": {"seo": 19, "big data": 6, "data": 24, "opportunities": 8, "emc": 12}, "total": 238}, "negative": {"key_terms": {"shortage looms": 8, "data": 12, "commitment": 4, "retailers": 4, "crash excel": 4}, "total": 56}, "neutral": {"key_terms": {"teenage": 1, "ecommerce": 1, "ammo": 2, "opportunities": 2, "lent": 1}, "total": 17}, "unknown": {"key_terms": {"big data": 3, "crescono": 3, "btob": 6, "contenute": 15, "delizia": 4}, "total": 38}, "total": 349}, "total": 8313, "male": {"key_terms": {"seo": 429, "analytics": 139, "forbes": 108, "data": 760, "opportunities": 88}, "positive": {"key_terms": {"seo": 418, "analytics": 128, "data": 585, "opportunities": 82, "entrepreneurs in reflections": 77}, "total": 5272}, "negative": {"key_terms": {"watchers hooked": 44, "drastic": 39, "forbes": 62, "data": 146, "tech stocks": 53}, "total": 1302}, "neutral": {"key_terms": {"seo": 11, "analytics": 11, "data": 29, "faster": 7, "opportunities": 6}, "total": 264}, "unknown": {"key_terms": {}, "total": 0}, "total": 6838}, "female": {"key_terms": {"seo": 35, "analytics": 23, "data": 115, "sale": 47, "emc": 21}, "positive": {"key_terms": {"seo": 35, "analytics": 23, "data": 99, "sale": 47, "emc": 21}, "total": 856}, "negative": {"key_terms": {"watchers hooked": 14, "forbes": 11, "data": 11, "vormerken": 6, "konzerne befassen": 6}, "total": 234}, "neutral": {"key_terms": {"roi": 2, "technologies": 3, "data": 5, "venturebeatcan 'big data' lift": 2, "entrepreneurs in reflections": 2}, "total": 36}, "unknown": {"key_terms": {}, "total": 0}, "total": 1126}}}
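The nesting of a time point follows the requested statistics_path: with "gender/sentiment_class", the first level under "summary" is the gender and the second is the sentiment class, each with its own "total" and "key_terms". A minimal sketch of walking this tree is shown below. The function name `leaf_totals` is hypothetical, and the `summary` dictionary is a trimmed copy of the time point above with the "key_terms" entries left out for brevity.

```python
def leaf_totals(node, path=()):
    """Yield (path, total) for every innermost cohort of a summary."""
    # Child cohorts are the nested dictionaries other than "key_terms".
    children = {
        name: child for name, child in node.items()
        if name not in ("key_terms", "total") and isinstance(child, dict)
    }
    if not children:
        yield path, node.get("total", 0)
        return
    for name, child in children.items():
        yield from leaf_totals(child, path + (name,))

# Totals copied from the single time point shown above (key_terms omitted).
summary = {
    "total": 8313,
    "unknown": {"total": 349, "positive": {"total": 238}, "negative": {"total": 56},
                "neutral": {"total": 17}, "unknown": {"total": 38}},
    "male": {"total": 6838, "positive": {"total": 5272}, "negative": {"total": 1302},
             "neutral": {"total": 264}, "unknown": {"total": 0}},
    "female": {"total": 1126, "positive": {"total": 856}, "negative": {"total": 234},
               "neutral": {"total": 36}, "unknown": {"total": 0}},
}

totals = dict(leaf_totals(summary))
print(totals[("male", "negative")])  # 1302
print(sum(totals.values()))          # 8313
```

In this sample the innermost totals add up to the top-level total, which makes a convenient sanity check when flattening a reply into rows for a spreadsheet or chart.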