Category Archives: Information retrieval

NEW API: ConnectedWords

Hello and Happy New Year!

New Year, new API. We have launched a new API called ConnectedWords. We trained a neural network with the word2vec approach on a number of English texts. As input, you supply an array of keywords; for each keyword the API returns a list of connected (related) words, each with a score.


Available end-points:

Here is an example:

For the word “launch”, the API produces the following connected words, each followed by its score:

“launched 0.5948931514907372”,
“ariane 0.5640206606244647”,
“icbm 0.532163213444619”,
“canaveral 0.5222400316699805”,
“rocket 0.5168188279637889”,
“launcher 0.5066764146199603”,
“suborbital 0.4987842348018603”,
“landing 0.49743730683360354”,
“expendable 0.49456818497947097”,
“agena 0.49325088465809586”,
“orbiter 0.4930563861239534”,
“shuttle 0.48127536803463045”,
“unmanned 0.47977178154360445”,
“launches 0.47013505662020805”,
“sputnik 0.4690193780888272”,
“bomarc 0.46608954818339043”,
“mission 0.4622460565342408”,
“redstone 0.4509777243147255”,
“gliders 0.4493604525398496”,
“missile 0.4388378398880377”,
“abort 0.4322835796211848”,
“rockets 0.4255249811253634”,
“lgm 0.42401975940492775”,
“launching 0.42055305756491634”,
“spacecraft 0.42044358977136653”,
“warhead 0.4203600640856848”,
“manned 0.4196165464952628”,
“skylab 0.417352627778655”,
“spaceflight 0.41261142646271765”,
“payloads 0.41167406251520333”,
“operational 0.41030200304930986”,
“refueling 0.41015588246409607”,
“orbit 0.4054650313323691”,
“extravehicular 0.4040691414909361”,
“icbms 0.4037563327101452”,
“hotol 0.4027989227897706”,
“sts 0.400049473907643”,
“saturn 0.399919637824496”,
“payload 0.398525218766963”,
“bm 0.3965859062493564”
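
The post does not say how these scores are computed. With word2vec-style models, relatedness is commonly measured as the cosine similarity between word vectors, so the following is a minimal sketch under that assumption. The three-dimensional vectors are invented toy values for illustration, not output of the ConnectedWords model, and `most_similar` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=3):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = vectors[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors (assumption): real word2vec vectors usually have
# hundreds of dimensions and are learned from text.
TOY_VECTORS = {
    "launch": [0.9, 0.1, 0.0],
    "rocket": [0.8, 0.3, 0.1],
    "orbit": [0.6, 0.5, 0.2],
    "banana": [0.0, 0.1, 0.9],
}

for neighbour, score in most_similar("launch", TOY_VECTORS):
    print(neighbour, round(score, 3))
```

With these toy values, "rocket" ranks closest to "launch" and "banana" furthest, which mirrors the shape of the list above: a word followed by a score, sorted in descending order.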

How can one use the API?

1. Make your search engine smarter: expand the result set to documents that contain related words. This helps with zero-hit searches, where the user's exact keyword matches nothing.

2. Spice up your writing. Are you a journalist, blogger or student who would like to add some flavour to a text? Send in a few words and get back a set of related words that might help make your writing more interesting and engaging.
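
The first use case can be sketched roughly as follows. This assumes the connected words arrive as plain strings of the form "word score", as in the example output above. The function names, the score threshold and the term limit are illustrative choices and not part of the API.

```python
def parse_item(item):
    """Split a 'word score' string into a (word, float score) pair."""
    word, score = item.rsplit(" ", 1)
    return word, float(score)


def expand_query(keyword, connected_items, min_score=0.5, max_terms=5):
    """Return the original keyword followed by its strongest related words.

    Only words scoring at least `min_score` are kept, best first,
    and at most `max_terms` of them are appended.
    """
    pairs = [parse_item(item) for item in connected_items]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    related = [word for word, score in pairs if score >= min_score]
    return [keyword] + related[:max_terms]


# A few items copied from the example output for "launch".
SAMPLE_ITEMS = [
    "launched 0.5948931514907372",
    "ariane 0.5640206606244647",
    "icbm 0.532163213444619",
    "bm 0.3965859062493564",
]

terms = expand_query("launch", SAMPLE_ITEMS)
print(" OR ".join(terms))
```

A search engine could then run the expanded terms as an OR query only when the original keyword returns no hits, so that precise queries stay precise.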

In the future we would like to add support for other languages and train on different types of texts, like social media, news, blogs etc. If you have more ideas for how to make the system more useful for your needs, get in touch!


Happy and Prosperous New Year 2017!

Insider wishes our users and fans a very Happy and Prosperous New Year 2017!

And remember, Insider is here to help you with your limitless natural language processing needs through our text analytics APIs!

Like us on Facebook to stay informed about the API landscape and our offerings!

Happy New Year! 

Research project on traditional and social media

Last month Insider contributed to a joint research project with two other companies: ContextMedia (with 20+ years of traditional media analytics) and YouControl (with access to government data). The goal of the research was to build a biographical and semantic portrait of the Ukrainian politician Dmytro Svyatash in light of the law on car imports in Ukraine. The interactive research results can be found here (in Russian).

Insider used two of its own tools for unstructured text analytics: Insider API for real-time semantic topic creation (screenshots and a description of the system are here) and RSA API for entity-level sentiment analysis.

The resulting system, which was prototyped in under a week, allowed for:

  1. Navigating through years of data, from 2002 to the present, using keyword searches.
  2. Understanding the sentiment distribution in the retrieved corpora for a given search.
  3. Researching quantitative search trends using a visual trend chart.
  4. Sifting through the produced semantic topics, which group related news items together in the search results.
  5. Getting the heartbeat of Twitter.


In the process we relied on the best open source tools, including Apache Tika, which allowed us to swiftly convert HTML news articles into JSON while preserving the important attributes of a news item, such as title and contents. We also built and applied our own NER to extract the publication date of each item, so that it could be placed correctly on the timeline.

Want to do similar research on your own data? Get in touch: [email protected].
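
To give a feel for this conversion step, here is a minimal sketch that uses only the Python standard library. It is a stand-in, not the Apache Tika pipeline or the custom NER described above. The `published` field name, the choice of `<title>` and `<p>` tags, and the ISO-style date pattern are all assumptions made for illustration.

```python
import json
import re
from html.parser import HTMLParser

# Assumption: dates appear as YYYY-MM-DD. A real date extractor would
# have to handle many more formats and languages.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


class NewsItemParser(HTMLParser):
    """Collects the text of the <title> tag and of every <p> tag."""

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose text is being collected
        self.title = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title.append(data)
        elif self._current == "p":
            self.paragraphs[-1] += data


def html_to_news_json(html):
    """Convert one HTML news article into a JSON string."""
    parser = NewsItemParser()
    parser.feed(html)
    parser.close()
    contents = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    match = DATE_PATTERN.search(contents)
    record = {
        "title": "".join(parser.title).strip(),
        "contents": contents,
        "published": match.group(0) if match else None,
    }
    return json.dumps(record, ensure_ascii=False)


SAMPLE_HTML = (
    "<html><head><title>Car import law</title></head><body>"
    "<p>Published 2016-11-03.</p><p>Second <b>bold</b> paragraph.</p>"
    "</body></html>"
)
print(html_to_news_json(SAMPLE_HTML))
```

Keeping the date as a separate field is what lets each item be placed on a time scale and counted per period for a trend chart.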