
Visual Recognition

Rahmetullah Varol edited this page Feb 13, 2017 · 5 revisions

Visual Recognition is an extension of the Watson API that can be used to obtain information about the contents of an image. It supports image classification, object recognition, face detection and recognition, and similar image analysis operations.

Image Classification

The most basic operation this API offers is classifying an image with the default classifier of the Watson API. To classify an image referenced by a URL, send a GET /v3/classify request; to classify an image stored in a local file, send a POST /v3/classify request.

A curl command for sending a GET request along with an image URL has the following structure:

curl -X GET "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key={api_key}&url={image_url}&version=2016-05-20"

A curl command for sending a POST request along with a local image has the following structure:

curl -X POST -F "images_file=@{image_path}" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify?api_key={api_key}&version=2016-05-20"
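The two classify requests differ only in how the image is supplied, so the request URL can be assembled programmatically. The following Python sketch builds that URL with the standard library only. The helper name `build_classify_url`, the placeholder key, and the example image address are illustrative assumptions, not part of the Watson API.

```python
from urllib.parse import urlencode

BASE_URL = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3"


def build_classify_url(api_key, image_url=None, version="2016-05-20"):
    """Return the /v3/classify URL (hypothetical helper).

    Pass image_url for a GET request; omit it for a POST request,
    where the image travels in the images_file form field instead.
    """
    params = {"api_key": api_key}
    if image_url is not None:
        params["url"] = image_url
    params["version"] = version
    # urlencode percent-encodes the values, so the image URL is safe to embed.
    return f"{BASE_URL}/classify?{urlencode(params)}"


get_url = build_classify_url("YOUR_API_KEY", "https://example.com/meerkat.jpg")
post_url = build_classify_url("YOUR_API_KEY")
print(get_url)
print(post_url)
```

Percent-encoding the image URL matters when it contains its own query string, which would otherwise be read as extra parameters of the classify request.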

Here is an example image that can be given as an input to the image classifier:

Meerkats

The resulting JSON response for this image is as follows:

{
    "custom_classes": 0,
    "images": [
        {
            "classifiers": [
                {
                    "classes": [
                        {
                            "class": "meerkat",
                            "score": 0.874,
                            "type_hierarchy": "/animal/mammal/carnivore/meerkat"
                        },
                        {
                            "class": "carnivore",
                            "score": 0.991
                        },
                        {
                            "class": "mammal",
                            "score": 0.991
                        },
                        {
                            "class": "animal",
                            "score": 0.991
                        },
                        {
                            "class": "suricate",
                            "score": 0.788,
                            "type_hierarchy": "/animal/mammal/carnivore/suricate"
                        },
                        {
                            "class": "slender-tailed meerkat",
                            "score": 0.618,
                            "type_hierarchy": "/animal/mammal/carnivore/slender-tailed meerkat"
                        }
                    ],
                    "classifier_id": "default",
                    "name": "default"
                }
            ],
            "image": "meerkat.jpg"
        }
    ],
    "images_processed": 1
}

As this example shows, the default classifier identifies the content of the image correctly: the most specific label, meerkat, receives a score of 0.874, and the broader classes (carnivore, mammal, animal) each receive 0.991.
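To work with such a response in a program, we can walk the nested images, classifiers, and classes lists. The Python sketch below runs on a trimmed copy of the response shown above and ranks the labels that carry a type_hierarchy field. Treating that field as the marker of a specific label is an assumption drawn from this one sample, and the function name `specific_classes` is hypothetical.

```python
import json

# Trimmed copy of the sample response from this page.
RESPONSE_TEXT = """
{
  "images": [
    {
      "classifiers": [
        {
          "classes": [
            {"class": "meerkat", "score": 0.874,
             "type_hierarchy": "/animal/mammal/carnivore/meerkat"},
            {"class": "carnivore", "score": 0.991},
            {"class": "suricate", "score": 0.788,
             "type_hierarchy": "/animal/mammal/carnivore/suricate"}
          ],
          "classifier_id": "default",
          "name": "default"
        }
      ],
      "image": "meerkat.jpg"
    }
  ],
  "images_processed": 1
}
"""


def specific_classes(response):
    """Return (class, score) pairs that have a type_hierarchy, best first."""
    found = []
    for image in response.get("images", []):
        for classifier in image.get("classifiers", []):
            for entry in classifier.get("classes", []):
                if "type_hierarchy" in entry:
                    found.append((entry["class"], entry["score"]))
    return sorted(found, key=lambda pair: pair[1], reverse=True)


ranked = specific_classes(json.loads(RESPONSE_TEXT))
for name, score in ranked:
    print(f"{name}: {score}")
```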

Face Detection

The face detection tool of the Visual Recognition API can be used to locate faces in an image and obtain information about them, such as estimated age and gender. For some famous faces, it also works as a face recognition tool out of the box.

A curl command for sending a GET request along with an image URL has the following structure:

curl -X GET "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces?api_key={api_key}&url={image_url}&version=2016-05-20"

A curl command for sending a POST request along with a local image has the following structure:

curl -X POST -F "images_file=@{image_path}" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/detect_faces?api_key={api_key}&version=2016-05-20"

Here is an example image that can be given as input to the face detection tool:

Noam Chomsky

The resulting JSON response is as follows:

{
    "images": [
        {
            "faces": [
                {
                    "age": {
                        "min": 65,
                        "score": 0.670626
                    },
                    "face_location": {
                        "height": 319,
                        "left": 620,
                        "top": 103,
                        "width": 289
                    },
                    "gender": {
                        "gender": "MALE",
                        "score": 0.99593
                    },
                    "identity": {
                        "name": "Noam Chomsky",
                        "score": 0.731059,
                        "type_hierarchy": "/people/writers/noam chomsky"
                    }
                }
            ],
            "image": "chomsky.jpg"
        }
    ],
    "images_processed": 1
}

In this example, we can see that the Watson API is able to both detect the face and recognize it as belonging to one of the prominent intellectuals of our time.
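The face_location object can be turned into a bounding box for cropping or drawing. The Python sketch below does this for the values in the response above. It assumes that left and top are the pixel offsets of the box's top-left corner and that width and height are measured in pixels; the function name `face_boxes` is hypothetical.

```python
# Values copied from the sample face detection response on this page.
SAMPLE_RESPONSE = {
    "images": [
        {
            "faces": [
                {
                    "face_location": {
                        "height": 319, "left": 620, "top": 103, "width": 289
                    },
                    "identity": {"name": "Noam Chomsky", "score": 0.731059},
                }
            ],
            "image": "chomsky.jpg",
        }
    ],
    "images_processed": 1,
}


def face_boxes(response):
    """Return (name, left, top, right, bottom) for every detected face.

    name is None when the response has no identity for that face.
    Assumes left/top mark the top-left corner, in pixels.
    """
    boxes = []
    for image in response.get("images", []):
        for face in image.get("faces", []):
            loc = face["face_location"]
            name = face.get("identity", {}).get("name")
            right = loc["left"] + loc["width"]
            bottom = loc["top"] + loc["height"]
            boxes.append((name, loc["left"], loc["top"], right, bottom))
    return boxes


boxes = face_boxes(SAMPLE_RESPONSE)
print(boxes)
```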

Custom Classifiers

Let us say that we want to build a program that identifies different robots in given images. The default classifier can usually detect whether a robot is present in an input image, but we might not be content with such a broad label. We might want the classifier to determine the specific robot model in the image. This is not possible with the default classifier that IBM provides, and it is where training a custom classifier proves useful.

Let us assume there are four robot models we are interested in and would like to tell apart: Baxter, ASIMO, R2D2 and Naobot. We can gather some images from the Internet and create a data set. Some example images are as follows (to access the complete data set used in this example, click here):

Baxter:

Baxter 1, Baxter 2, Baxter 3

ASIMO:

ASIMO 1, ASIMO 2, ASIMO 3

R2D2:

R2D2 1, R2D2 2, R2D2 3

Naobot:

Naobot 1, Naobot 2, Naobot 3

The IBM Watson API requires each distinct class of images to be uploaded as a zip file named {class_name}_positive_examples.zip, where {class_name} is the name we would like to give to this class of images. We can also provide negative images, which we do not want our classifier to associate with any of our classes. Negative images are useful when we know that the classifier might encounter images that look like one of our classes but are not quite what we want.
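Packaging the images by hand gets tedious with several classes, so the step can be scripted. The Python sketch below writes one {class_name}_positive_examples.zip per class from a directory of images. The function name `zip_positive_examples`, the directory layout, and the restriction to .jpg, .jpeg, and .png files are assumptions made for illustration.

```python
import zipfile
from pathlib import Path

# Assumed set of image extensions to include; adjust to your data set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def zip_positive_examples(class_name, image_dir, out_dir="."):
    """Zip the images in image_dir as {class_name}_positive_examples.zip.

    Returns the path of the archive that was written.
    """
    zip_path = Path(out_dir) / f"{class_name}_positive_examples.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for image in sorted(Path(image_dir).iterdir()):
            if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
                # Store only the file name so the archive has a flat layout.
                archive.write(image, arcname=image.name)
    return zip_path
```

With a hypothetical layout of one folder per robot, a call such as `zip_positive_examples("baxter", "dataset/baxter")` would produce baxter_positive_examples.zip in the current directory, ready to be passed to the training request.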

We can use the following curl command to train a classifier using our example data set:

curl -X POST -F "baxter_positive_examples=@baxter_positive_examples.zip" -F "asimo_positive_examples=@asimo_positive_examples.zip" -F "r2d2_positive_examples=@r2d2_positive_examples.zip" -F "naobot_positive_examples=@naobot_positive_examples.zip" -F "name=robots" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key={api_key}&version=2016-05-20"

The name parameter specifies the name we would like to give to our classifier. The response to this request is a JSON object similar to the following:

{
    "classifier_id": "robots_699593515",
    "name": "robots",
    "owner": "f2bf8eb3-d845-4929-9f89-29aa390c109c",
    "status": "training",
    "created": "2017-02-13T08:11:30.260Z",
    "classes": [
        {"class": "r2d2"},
        {"class": "naobot"},
        {"class": "baxter"},
        {"class": "asimo"}
    ]
}
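A program that trains a classifier needs to keep the returned classifier ID and can check the status field before trying to use the classifier. The Python sketch below reads those values from a trimmed copy of the response above. Only the "training" status appears on this page, so the sketch tests for that value alone; the function name `summarize_classifier` is hypothetical.

```python
import json

# Trimmed copy of the sample training response from this page.
TRAINING_RESPONSE = """
{
    "classifier_id": "robots_699593515",
    "name": "robots",
    "status": "training",
    "classes": [
        {"class": "r2d2"},
        {"class": "naobot"},
        {"class": "baxter"},
        {"class": "asimo"}
    ]
}
"""


def summarize_classifier(response):
    """Pick out the fields needed to use the classifier later."""
    return {
        "classifier_id": response["classifier_id"],
        "classes": sorted(item["class"] for item in response["classes"]),
        # Only "training" is documented on this page; other values are not assumed.
        "still_training": response["status"] == "training",
    }


summary = summarize_classifier(json.loads(TRAINING_RESPONSE))
print(summary)
```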

We should note the classifier ID, as we will need it whenever we invoke this classifier later. Now let us use this classifier to identify a robot in an image that is not in our training data set, such as the following:

ASIMO Test Image

In response, we get the following JSON object:

{
  "images": [
    {
      "image": "test02.jpg",
      "classifiers": [
        {
          "classes": [
            {
              "score": 0.950742,
              "class": "asimo"
            },
            {
              "score": 0.154836,
              "class": "naobot"
            }
          ],
          "classifier_id": "robots_699593515",
          "name": "robots"
        }
      ]
    }
  ],
  "custom_classes": 4,
  "images_processed": 1
}

Our custom classifier has successfully identified the robot in our test image, scoring asimo at 0.950742 against 0.154836 for naobot. This example shows that even with a data set as small as the one used here, we can train a reasonable classifier to tell different robot models apart. With more data, this API can be used for very interesting applications.
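The request that produced this response is not shown on this page. The Python sketch below is one plausible way to build it and to read the winning class from the response. It assumes that the classify endpoint accepts a `classifier_ids` query parameter naming the custom classifier, which should be checked against the API reference before use. The helper names and the example image address are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://gateway-a.watsonplatform.net/visual-recognition/api/v3"


def build_custom_classify_url(api_key, image_url, classifier_id,
                              version="2016-05-20"):
    """Build a classify URL that targets one custom classifier."""
    params = {
        "api_key": api_key,
        "url": image_url,
        "classifier_ids": classifier_id,  # assumed parameter name
        "version": version,
    }
    return f"{BASE_URL}/classify?{urlencode(params)}"


def best_class(response, classifier_id):
    """Return (class, score) with the highest score for the classifier."""
    for image in response.get("images", []):
        for classifier in image.get("classifiers", []):
            if classifier.get("classifier_id") == classifier_id:
                top = max(classifier["classes"], key=lambda c: c["score"])
                return top["class"], top["score"]
    return None  # the classifier did not appear in the response


# Values copied from the sample response on this page.
SAMPLE_RESPONSE = {
    "images": [
        {
            "image": "test02.jpg",
            "classifiers": [
                {
                    "classes": [
                        {"score": 0.950742, "class": "asimo"},
                        {"score": 0.154836, "class": "naobot"},
                    ],
                    "classifier_id": "robots_699593515",
                    "name": "robots",
                }
            ],
        }
    ],
    "custom_classes": 4,
    "images_processed": 1,
}

request_url = build_custom_classify_url(
    "YOUR_API_KEY", "https://example.com/test02.jpg", "robots_699593515"
)
winner = best_class(SAMPLE_RESPONSE, "robots_699593515")
print(request_url)
print(winner)
```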

Further Exploration

Here are some external links you can use to learn more about this API.

Sources
