Category Archives: Topic API

Sample code for grouping articles into themes

In this post we would like to share with you Java code snippets, that allow for loading data into our Topic API. The idea of the topic API is that it allows you to group your articles / posts / tweets / documents into topical themes and also search in the content.

In order to navigate oceans of textual data and extract useful structures from your content lakes search is one of the most common way to empower your journey. But once you have found thousands and thousands of matches, you still have the problem of the data overload. Topical grouping can help.

All code in this post you can find in our public GitHub repository. We further assume, that your article content is stored in a MySQL database. Using mybatis we load the articles with

TopicLoader class takes care of doing it all: loading articles from the DB, forming a JSON request to the Topic API and uploading the relevant fields.

This main class takes single command line argument: resource_id, which matches onto the field of articles DB table in

The method uploadDBEntries will load the articles entries from the DB and upload to the Topic API one by one. Note, that if your content is not very large in size (tweet size), then you can upload several posts in one single request (up to 50 at a time).

To upload an article to the Topic API we use the following code:

    private void uploadArticleToTopicAPI(ArticleEntry x) {
        GsonBuilder builder = new GsonBuilder();
        Gson gson = builder.create();

        try {
            String body = "[" + gson.toJson(x) + "]";

            System.out.println("Sending body for id: " + x.getId());

            HttpResponse<JsonNode> response ="")
                    .header("X-Mashape-Key", mashapeKey)
                    .header("Content-Type", "application/json")
                    .header("Accept", "application/json")

            System.out.println("TopicAPI response:" + response.getBody().toString());
        } catch (UnirestException e) {
            log.error("Error: {}", e);
            System.err.println("Error: " + e.getMessage());

Remember that you need to obtain the mashapeKey by subscribing to the API and checking the documentation, where you will find the key already pre-inserted:

After the upload is complete, the articles end up in a search engine on the backend of the Topic API. You can start triggering the search requests and getting back nice themes. In this example below I have uploaded about 10,000 Russian texts and gotten topics:

Что Говорят # What they say
Большие деньги # Big money
Беларуский Бренд # Belorussian brand
Для Ребенка # For a kid
Как Выглядит работа # How a job looks like
Как Заработать # How to earn money
В Минском Масс-маркете # In a grocery store of Minsk
Женщины # Women

Halloween launches / Релизы на Хеллоуин

We are pleased to report new features in the Topic API — system for realtime topic clustering of documents: articles, blog posts, tweets.


We have improved the algorithm to avoid situations where a preposition would be omitted — helps a lot for Russian.

We have added a possibility to filter clusters by time range: use params startDate and endDate in the format yyyy-MM-dd. This feature should allow you to build trends over time!

We made the /articles/cluster end point to support GET — more sensible when integrating with frontends

Enjoy and Happy Halloween!

Мы рады сообщить о новых фичах в Topic API — системе для построения тематических кластеров документов: статей, блог-постов, твитов.

Мы улучшили алгоритм таким образом, чтобы названия кластеров включали предлоги. Например, раньше кластер мог называется “Питере”, теперь он будет называться “В Питере” (конечно, в зависимости от ваших данных).

Теперь кластера можно строить для конкретного промежутка времени: используйте параметры startDate и endDate в формате yyyy-MM-dd. Теперь вы можете строить тренды!

Энд-пойнт /articles/cluster теперь доступен по GET — это более дружественный тип запроса для интеграции с фронтэндами.

Удачи и классного Хэллоуина!

Insider team / команда Insider