Image Text and Object Detection with Rekognition

This post shows how a chatbot application can extract text from an image, identify the main subject of the image as a label, and then return a brief description of that label from Wikipedia. Every image you upload is stored in an Amazon S3 bucket.

The benefits of Rekognition are:

  • Easy to use
    • Image recognition at the push of a button
    • No expertise required
    • APIs available, so applications can use AI easily
  • Low usage cost
  • Scalable

Rekognition Key Features:

  • Object and scene detection
  • Facial analysis
  • Face comparison
  • Facial recognition
  • Confidence scores on processed images

Architecture:

  • A user posts a message containing an image to a chat app that is monitored by a chatbot.
  • The chat app posts the event to the Amazon API Gateway API for the chatbot.
  • The chatbot validates the event. The event triggers an AWS Lambda function that uploads the image.
  • Amazon Rekognition's image recognition feature analyzes the image.
  • The chatbot uses the chat app API to post a message to the chat channel describing the image.

In this tutorial we use AWS Lambda, Amazon S3, Amazon Rekognition, Amazon Cognito, and Amazon API Gateway. The chatbot UI in the frontend is built with HTML and JavaScript. We do not call Rekognition directly from the frontend; we call it through Lambda functions.

Demo:

Step 1:

We have created two Lambda functions:

  • rekognition_image_text: This Lambda function recognizes and extracts all the text in the image. The extracted text is then shown in the chatbot UI.

  • rekognition-test: This Lambda function detects the main content of the image and returns the best-matching label. It then returns a brief introduction to that label from Wikipedia. For this, the Python code imports a package called wikipediaapi, which fetches the description of the label from Wikipedia.
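
To make the two functions more concrete, here is a minimal sketch of the Rekognition part in Python with boto3. It is not the exact code behind this demo: the helper names (extract_lines, top_label, analyze_image) and the confidence thresholds are assumptions chosen for illustration. The sketch assumes the image is already in S3 and that the function's role is allowed to call Rekognition.

```python
def extract_lines(text_response, min_confidence=80.0):
    """Return the detected lines of text from a DetectText response, in order."""
    return [
        detection["DetectedText"]
        for detection in text_response.get("TextDetections", [])
        # Rekognition reports both LINE and WORD items; keep whole lines only.
        if detection.get("Type") == "LINE"
        and detection.get("Confidence", 0.0) >= min_confidence
    ]


def top_label(label_response):
    """Return (name, confidence) of the highest-confidence label, or None."""
    labels = label_response.get("Labels", [])
    if not labels:
        return None
    best = max(labels, key=lambda label: label["Confidence"])
    return best["Name"], best["Confidence"]


def analyze_image(bucket, key):
    """Run text and label detection on an image that is already stored in S3."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("rekognition")
    image = {"S3Object": {"Bucket": bucket, "Name": key}}
    text_response = client.detect_text(Image=image)
    # MaxLabels and MinConfidence are illustrative values, not tuned settings.
    label_response = client.detect_labels(Image=image, MaxLabels=10, MinConfidence=70)
    return extract_lines(text_response), top_label(label_response)
```

Keeping the response parsing in small helper functions means it can be tried out with a saved response, without calling AWS at all.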

Step 2:

We have created two APIs in Amazon API Gateway.

  • rekognition_image_text: This API handles image-to-text conversion. When a user uploads an image containing text, the frontend calls this API, which in turn calls the rekognition_image_text Lambda function. The text returned by the Lambda function is shown in the chatbot UI.
  • rekognition: This API returns the image label and the label's description from Wikipedia. We created two methods in this API, a POST method and a GET method. When a user uploads an image, the frontend calls this API, which in turn calls the Lambda function; the function returns the image label and its details from Wikipedia.
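
The Wikipedia description returned by the second API could be produced along these lines. This is a sketch only: it assumes a recent release of the Wikipedia-API package (imported as wikipediaapi), whose constructor takes a user_agent argument (older releases may differ). The user agent string, the 300-character limit, and the helper names are made up for illustration.

```python
def first_sentences(summary, max_chars=300):
    """Trim a summary to whole sentences that fit within max_chars."""
    summary = " ".join(summary.split())  # collapse newlines and repeated spaces
    if len(summary) <= max_chars:
        return summary
    cut = summary.rfind(". ", 0, max_chars)
    if cut == -1:
        # No sentence boundary found: fall back to a hard cut.
        return summary[:max_chars].rstrip() + "..."
    return summary[: cut + 1]


def describe_label(label, language="en"):
    """Look up a label on Wikipedia and return a short description, or None."""
    import wikipediaapi  # third-party package, imported lazily

    wiki = wikipediaapi.Wikipedia(
        user_agent="rekognition-chatbot-demo (contact@example.com)",  # hypothetical
        language=language,
    )
    page = wiki.page(label)
    if not page.exists():
        return None
    return first_sentences(page.summary)
```

Trimming the summary keeps the chatbot reply short; the full page summary can run to several paragraphs.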

Once the APIs are created, we need to enable CORS in API Gateway.

CORS is a browser security feature that restricts cross-origin HTTP requests.

What Qualifies as a Cross-Origin HTTP Request?

  • A different domain (e.g. from cat.com to dog.com)
  • A different subdomain (e.g. from cat.com to adopt.cat.com)
  • A different port (e.g. from cat.com to cat.com:10700)
  • A different protocol (e.g. from https://cat.com to http://cat.com)
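
The four cases above all come down to comparing the scheme, host, and port of two URLs. Here is a small Python sketch of that comparison, using only the standard library; the function names are our own, and the URLs are the examples from the list.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fill in the default port so https://cat.com equals https://cat.com:443.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def is_cross_origin(page_url, request_url):
    """True when the two URLs differ in scheme, host, or port."""
    return origin(page_url) != origin(request_url)


print(is_cross_origin("https://cat.com", "https://dog.com"))        # True
print(is_cross_origin("https://cat.com", "https://adopt.cat.com"))  # True
print(is_cross_origin("https://cat.com", "https://cat.com:10700"))  # True
print(is_cross_origin("https://cat.com", "http://cat.com"))         # True
print(is_cross_origin("https://cat.com/a", "https://cat.com/b"))    # False
```

Note that the path plays no part: two pages on the same scheme, host, and port share an origin.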

For security purposes, web applications follow the same-origin policy by default. This means that a web application can access data residing on another web application (for example, through an AJAX request) only if both applications have the same origin. After enabling CORS for your API, you can whitelist selected external origins and allow user agents that send requests from these origins to access resources within your API.

In API Gateway you can specify the origins, HTTP methods, and headers that are allowed in incoming CORS requests. A CORS request is first classified by type (preflight, simple, or actual) and then checked against the lists of allowed origins, methods, and headers.
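
To illustrate the idea of classifying a request and checking it against the allowed lists, here is a simplified Python sketch. It is not how API Gateway is implemented internally; the policy shape, the POLICY values, and the function name are assumptions made for this example.

```python
# Hypothetical policy: the origin, methods, and headers are example values.
POLICY = {
    "origins": ["https://chat.example.com"],
    "methods": ["GET", "POST"],
    "headers": ["Content-Type"],
}


def cors_response_headers(method, request_headers, policy):
    """Return the CORS headers to send back, or {} if the request is rejected."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    origin = headers.get("origin")
    if origin is None or origin not in policy["origins"]:
        return {}  # not a CORS request, or the origin is not allowed

    requested_method = headers.get("access-control-request-method")
    if method == "OPTIONS" and requested_method is not None:
        # Preflight: check the method and headers the browser intends to use.
        requested_headers = {
            name.strip().lower()
            for name in headers.get("access-control-request-headers", "").split(",")
            if name.strip()
        }
        allowed_headers = {name.lower() for name in policy["headers"]}
        if requested_method not in policy["methods"]:
            return {}
        if not requested_headers <= allowed_headers:
            return {}
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Methods": ", ".join(policy["methods"]),
            "Access-Control-Allow-Headers": ", ".join(policy["headers"]),
        }

    # Simple or actual request: only the allowed origin is echoed back.
    if method not in policy["methods"]:
        return {}
    return {"Access-Control-Allow-Origin": origin}
```

For a preflight request the browser sends OPTIONS first and only sends the real request if the returned headers permit it.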

Follow the steps below to enable CORS in your API Gateway:

  1. Go to your API Gateway console and select your API.
  2. Click on Actions and select Create Method. Then select the method type you need.
  3. Once the method is created, go to Actions again and click on Enable CORS.
  4. Leave all settings at their defaults and click on "Enable CORS and replace existing CORS headers".
  5. Once CORS is enabled, an OPTIONS method appears under the Resources section.

CORS is now enabled. Go to Actions and deploy your API.
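
One thing to keep in mind: if the method uses Lambda proxy integration, the Lambda function itself has to return the Access-Control-Allow-Origin header in its response. Whether this demo uses proxy integration is not stated here, so treat the following Python handler as a sketch under that assumption. The "key" field in the request body and the "*" origin are hypothetical.

```python
import json

ALLOWED_ORIGIN = "*"  # assumption: tighten this to the chatbot's own origin


def build_response(status_code, payload):
    """Build a proxy-integration response that carries the CORS header."""
    return {
        "statusCode": status_code,
        "headers": {
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Validate the request body and reply with a JSON document."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return build_response(400, {"error": "body is not valid JSON"})
    if not isinstance(body, dict) or not body.get("key"):
        return build_response(400, {"error": "missing 'key'"})
    # A real handler would call Rekognition here; this sketch only echoes the key.
    return build_response(200, {"key": body["key"]})
```

Error responses go through the same build_response helper, so the browser can read them too instead of reporting a generic CORS failure.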

Step 3:

We have created an S3 bucket named upload-rekognition-partha.

Whenever we upload an image in the chatbot, the image is stored in this S3 bucket.

We also need to enable CORS on the S3 bucket. The CORS configuration is a JSON document that defines how client web applications loaded in one domain can interact with resources in a different domain.

To enable CORS, go to the Permissions tab of your S3 bucket.

Then, in the Cross-origin resource sharing (CORS) section, edit the CORS configuration and enter it in JSON format.
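
As an illustration, the Python sketch below builds one possible CORS configuration and prints it as JSON that can be pasted into the console editor. The rule values, including the allowed origin, are placeholders and not the configuration used in this demo. The apply_cors helper shows the equivalent boto3 call and assumes credentials with permission to change the bucket.

```python
import json

# Example rules only: adjust the origin, methods, and headers to your application.
CORS_RULES = [
    {
        "AllowedHeaders": ["*"],
        "AllowedMethods": ["GET", "PUT", "POST"],
        "AllowedOrigins": ["https://chatbot.example.com"],  # hypothetical origin
        "ExposeHeaders": ["ETag"],
        "MaxAgeSeconds": 3000,
    }
]


def console_json(rules):
    """Render the rules as the JSON array expected by the S3 console editor."""
    return json.dumps(rules, indent=4)


def apply_cors(bucket, rules):
    """Apply the same rules programmatically instead of using the console."""
    import boto3  # imported lazily; only needed when actually calling AWS

    s3 = boto3.client("s3")
    s3.put_bucket_cors(Bucket=bucket, CORSConfiguration={"CORSRules": rules})


print(console_json(CORS_RULES))
```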

To upload images to your S3 bucket from the application, you need an Amazon Cognito identity pool. Create the identity pool in Amazon Cognito and use the credentials it issues in your code.
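
The chatbot frontend in this demo is written in JavaScript, but the same flow can be sketched in Python with boto3 to show what happens: get an identity from the pool, exchange it for temporary credentials, and use those to upload to S3. The sketch assumes the identity pool allows unauthenticated (guest) identities and that its role permits uploads to the bucket; the function names and the default content type are our own.

```python
def client_kwargs(credentials, region):
    """Map Cognito's credential fields onto boto3 client keyword arguments."""
    return {
        "region_name": region,
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretKey"],
        "aws_session_token": credentials["SessionToken"],
    }


def upload_image(identity_pool_id, region, bucket, key, data,
                 content_type="image/jpeg"):
    """Upload image bytes to S3 using temporary credentials from Cognito."""
    import boto3  # imported lazily; only needed when actually calling AWS

    cognito = boto3.client("cognito-identity", region_name=region)
    # Step 1: obtain an identity ID from the identity pool.
    identity_id = cognito.get_id(IdentityPoolId=identity_pool_id)["IdentityId"]
    # Step 2: exchange the identity ID for temporary AWS credentials.
    credentials = cognito.get_credentials_for_identity(
        IdentityId=identity_id
    )["Credentials"]
    # Step 3: upload the image with those credentials.
    s3 = boto3.client("s3", **client_kwargs(credentials, region))
    s3.put_object(Bucket=bucket, Key=key, Body=data, ContentType=content_type)
```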

The backend setup is now complete. Next, we show how the image recognition works in practice.

We upload an image in the chatbot application and look at the response.

After the image is uploaded, the chatbot replies with the response.

First, it replies with the text extracted from the image by our backend.

Next, the image is detected, and the response tells us that the uploaded image is a wedding cake.

Finally, we can see the details about the wedding cake from Wikipedia.

Thank You.
