Computer vision is a field of study that aims to give computers enough understanding to interpret
the content of digital images, such as photographs and video frames.
Healthcare - Computer vision technology helps healthcare professionals classify conditions
and illnesses more accurately, which can save patients’ lives by reducing inaccurate diagnoses
and incorrect treatment.
Agriculture - Some farmers are starting to adopt computer vision technologies to improve
their growing methods, increase yields, and ultimately increase profit.
Banking - Here, image recognition applications use machine learning to classify, extract
data from, and authenticate documents such as passports, ID cards, and driver’s licenses.
Industrial - In this sector, computer vision is used to monitor the status of critical
infrastructure and to identify new applications that might improve productivity.
Image classification - This means assigning a label to an image as a whole. It is hard
for a machine because all it sees is an array of numbers (the pixel values).
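
To make the point that a machine only sees numbers concrete, here is a minimal sketch in plain Python. The 3x3 grayscale image and the `mean_brightness_label` rule are invented for illustration; a real classifier learns its decision rule from labelled data rather than using a fixed threshold.

```python
# A tiny 3x3 grayscale "image": each value is a pixel intensity
# (0 = black, 255 = white). The values are invented for illustration.
image = [
    [0, 50, 100],
    [50, 200, 255],
    [100, 255, 255],
]


def flatten(img):
    """Return the pixels row by row, which is all the machine actually receives."""
    return [pixel for row in img for pixel in row]


def mean_brightness_label(img, threshold=128):
    """Toy stand-in for a classifier: label the image by its mean intensity.

    This is a hypothetical rule, not a real classification method.
    """
    pixels = flatten(img)
    mean = sum(pixels) / len(pixels)
    return "bright" if mean >= threshold else "dark"


pixels = flatten(image)
label = mean_brightness_label(image)
```

The difficulty of real image classification lies in replacing the hand-written threshold with a rule that maps such numbers to meaningful labels like "cat" or "car".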
Object detection - This is about recognizing the various objects (sub-images) within an image
and drawing a bounding box around each one. A widely used method is the Faster Region-based
Convolutional Neural Network (Faster R-CNN). It uses a component called a Region Proposal Network,
which proposes the regions of the image that are likely to contain objects and therefore need
to be processed.
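
As a rough illustration of where candidate regions come from, the sketch below generates "anchor" boxes of several sizes and shapes over a grid, which is the starting point a Region Proposal Network works from. The function names, the scales, the aspect ratios, and the stride of 16 are assumptions chosen for this example. The learned part of the network, which scores each box and adjusts its coordinates, is not shown.

```python
import math


def anchors_at(cx, cy, scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return candidate boxes (x_min, y_min, x_max, y_max) centred on (cx, cy).

    Each box has area scale**2 and a width/height ratio equal to `ratio`.
    The default scales and ratios are illustrative assumptions.
    """
    boxes = []
    for scale in scales:
        for ratio in ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            boxes.append(
                (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)
            )
    return boxes


def all_anchors(feature_h, feature_w, stride=16):
    """Place anchors at the centre of every cell of a feature_h x feature_w grid.

    `stride` is the assumed number of image pixels covered by one grid cell.
    """
    boxes = []
    for row in range(feature_h):
        for col in range(feature_w):
            cx = col * stride + stride / 2
            cy = row * stride + stride / 2
            boxes.extend(anchors_at(cx, cy))
    return boxes


# 2 x 3 grid cells, each with 2 scales x 3 ratios of candidate boxes.
candidates = all_anchors(feature_h=2, feature_w=3)
```

Boxes may extend past the image border, as some do here; a full pipeline would typically clip or discard them before further processing.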
Image segmentation - This means dividing an image into regions based on the objects present,
with accurate boundaries. There are two types of image segmentation. The first is semantic
segmentation, in which each pixel is labelled with an object class. Thus, every object that
belongs to the same class (e.g. a group of people or a few cars) is coloured the same. The
second is instance segmentation, which labels every object separately, meaning that every
person or car in a picture gets a distinct colour. A well-known technique for this is
Mask R-CNN, which adds a small convolutional branch on top of the Faster R-CNN technique
described above to predict a mask for each detected object.
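
The difference between the two types of mask can be shown on a toy example. In the sketch below, the 4x6 mask and the `label_instances` helper are made up for illustration. Note that the helper uses simple connected-component labelling (flood fill), not Mask R-CNN: it only separates objects that do not touch, whereas a learned model can also separate overlapping ones.

```python
# A 4x6 "image" with two cars (class 1) on background (class 0).
# Semantic segmentation: every pixel gets a class id, so both cars share label 1.
semantic_mask = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]


def label_instances(mask):
    """Split a class mask into instances using 4-connected flood fill.

    Returns a mask of the same shape in which each connected blob of
    non-background pixels gets its own id (1, 2, ...). Toy substitute
    for a learned instance segmentation model.
    """
    height, width = len(mask), len(mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] == 0 or instances[r][c] != 0:
                continue  # background, or already assigned to an instance
            next_id += 1
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                if not (0 <= y < height and 0 <= x < width):
                    continue
                if mask[y][x] != mask[r][c] or instances[y][x] != 0:
                    continue
                instances[y][x] = next_id
                stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return instances


# Instance segmentation: each car now has its own id.
instance_mask = label_instances(semantic_mask)
```

In `semantic_mask` both cars carry the same value, while in `instance_mask` the left and right car receive different ids, which corresponds to colouring them differently.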
Image captioning - This involves generating the caption that is most appropriate for an
image. In other words, it combines object detection (carried out by the same Faster R-CNN method)
with caption generation. The latter is done using a Recurrent Neural Network (RNN), specifically
a Long Short-Term Memory (LSTM) network, which is an advanced variant of the RNN.
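
For readers unfamiliar with the LSTM, the following is a minimal sketch of a single LSTM time step written in plain Python. The weights are random and untrained, and the helper names, sizes, and input vectors are all hypothetical. Feeding an image feature vector as the first input, followed by word embeddings, is one possible design and is assumed here only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """Run one LSTM time step on plain Python lists.

    `weights` maps each gate name ("i", "f", "o", "g") to a pair (W, b),
    where W has one row per hidden unit and len(x) + len(h_prev) columns.
    """
    z = x + h_prev  # concatenate the input with the previous hidden state

    def affine(name):
        W, b = weights[name]
        return [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell values
    # New cell state: keep part of the old memory, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def random_weights(input_size, hidden_size, seed=0):
    """Build small random (untrained) weights, for demonstration only."""
    rng = random.Random(seed)
    cols = input_size + hidden_size
    return {
        name: (
            [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(hidden_size)],
            [0.0] * hidden_size,
        )
        for name in "ifog"
    }


input_size, hidden_size = 4, 3
weights = random_weights(input_size, hidden_size)
h = [0.0] * hidden_size
c = [0.0] * hidden_size
# Hypothetical inputs: an image feature vector, then two word embeddings.
sequence = [[0.5, -0.2, 0.1, 0.9], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
for x in sequence:
    h, c = lstm_step(x, h, c, weights)
```

In a trained captioning model, the hidden state `h` at each step would be passed to a further layer that predicts the next word of the caption. The cell state `c` is what lets the LSTM retain information over longer sequences than a plain RNN.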