What is computer vision?
In this article we will briefly discuss computer vision, one of the most important branches of artificial intelligence and one that has occupied many researchers in recent years.
Humans have eyes that capture images of their surroundings (physiological vision). These images are passed to the brain through cells and neural connections, where they are processed immediately so that the person can identify what the eyes see.
Similarly, computer vision is the process of enabling a computer to view its surroundings (using any camera connected to it, such as a webcam), then analyze and process what it sees in order to recognize it. This is done using algorithms designed for the purpose.
We can now say that computer vision is basically a field of computer science that aims to build intelligent applications able to understand the content of images as humans do.
In fact, it is a mixture of concepts, techniques and ideas from a number of areas, including digital image processing, pattern recognition, artificial intelligence, and computer graphics.
There is no precise definition of the field of computer vision, but it can simply be said that it sits between image processing and artificial intelligence (including machine learning).
So how does the computer see its surroundings?
This is done by receiving digital images from the camera connected to it, either as separate images (a single snapshot from the camera), in applications that allow this form, or as a succession of images (a video recording).
A video is a large set of successive images captured continuously at high speed. Played back in sequence, these images appear as the continuous video that we see.
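To make the idea of a video as a sequence of images concrete, here is a minimal sketch in plain Python. The frame rate of 30 frames per second, the tiny 2x2 frame size and the helper names `make_frame` and `video_duration_seconds` are assumptions made for illustration only; a real application would read its frames from a camera or a video file.

```python
# A video is modelled here as a list of frames. Each frame is a small
# grayscale image stored as a list of rows (values 0-255).
# The frame size and frame rate are illustrative assumptions.

def make_frame(value, width=2, height=2):
    """Build a width x height grayscale frame filled with one value."""
    return [[value for _ in range(width)] for _ in range(height)]


def video_duration_seconds(frames, fps):
    """Duration of a clip = number of frames / frames per second."""
    return len(frames) / fps


# 90 successive snapshots, each slightly brighter than the previous one.
video = [make_frame(v) for v in range(90)]

single_snapshot = video[0]                        # one separate image
duration = video_duration_seconds(video, fps=30)  # 90 / 30 = 3.0 seconds
print(len(video), "frames,", duration, "seconds")
```

The same structure covers both cases described above: a single snapshot is one element of the list, and a video is the whole list.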
How does the computer see images?
An image is basically a matrix of pixels, each addressed by its position along the x and y axes (x, y).
Each pixel has a set of attributes, such as its position and its intensity or color values, that determine its role in the image.
The computer, however, sees the image as an array of numeric values, one or more for each pixel.
For a deeper understanding of this concept, see my previous articles.
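As a small illustration of what "an array of numeric values" means, the sketch below stores a tiny grayscale image as a list of rows in plain Python. The pixel values and the helper name `get_pixel` are made up for the example; libraries such as NumPy and OpenCV use the same row-then-column layout, only far more efficiently.

```python
# A tiny grayscale image: 3 rows (y axis) by 4 columns (x axis).
# Each number is one pixel's brightness, from 0 (black) to 255 (white).
image = [
    [0,  50, 100, 150],
    [10, 60, 110, 160],
    [20, 70, 120, 255],
]


def get_pixel(img, x, y):
    """Return the pixel at column x, row y (origin at the top-left corner)."""
    return img[y][x]


height = len(image)     # number of rows
width = len(image[0])   # number of columns

# A color pixel is usually stored as three values: (red, green, blue).
red_pixel = (255, 0, 0)

print(width, height)           # 4 3
print(get_pixel(image, 3, 2))  # 255
```

Note that the row index (y) comes first when indexing the nested list, which is why `get_pixel` swaps the order of its arguments.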
1/ Recognition
Recognition is the most common task. It means determining whether the image contains a certain object or activity. This task seems very simple to a person, who feels no effort when the eye falls on an object already identified and dealt with before, but it is much more difficult for computers than it seems.
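The following toy sketch shows the shape of a recognition task: given an image, answer which known object it contains. It compares a tiny binary image against two hand-made templates and picks the closest one. This is only a stand-in I chose to keep the example short; the templates, labels and function names are hypothetical, and real recognition systems learn their features from many examples rather than using fixed templates.

```python
# Toy "recognition": decide which known pattern a tiny binary image
# most resembles. The templates and labels are made up for illustration.
TEMPLATES = {
    "vertical bar": [
        [0, 1, 0],
        [0, 1, 0],
        [0, 1, 0],
    ],
    "horizontal bar": [
        [0, 0, 0],
        [1, 1, 1],
        [0, 0, 0],
    ],
}


def pixel_difference(a, b):
    """Count the pixels at which two same-sized images differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize(image):
    """Return the label of the template closest to the image."""
    return min(
        TEMPLATES,
        key=lambda label: pixel_difference(image, TEMPLATES[label]),
    )


# A vertical bar with one noisy pixel in the top-right corner.
noisy = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
print(recognize(noisy))  # vertical bar
```

Even this toy version hints at why the task is hard for computers: a single changed pixel already makes the input differ from every stored pattern, so the program must decide what counts as "close enough".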
2/ Detection, Localization, Segmentation
This process is more complex than recognition. First, it applies recognition algorithms and methods, followed by other algorithms and methods that accurately locate the objects in the image.
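To illustrate what "locating" an object means, here is a minimal sketch that returns the bounding box of the bright pixels in a grayscale image. It assumes a single bright object on a dark background and uses a fixed threshold of 128; both are simplifications of mine, and the function name `bounding_box` is hypothetical. Real detectors learn what to look for instead of relying on brightness.

```python
def bounding_box(image, threshold=128):
    """Return (x_min, y_min, x_max, y_max) of pixels >= threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing bright enough was found
    return (min(xs), min(ys), max(xs), max(ys))


# A dark image, 6 pixels wide and 5 high, with one bright object
# that is 3 pixels wide and 2 pixels high.
image = [
    [0, 0,   0,   0,   0, 0],
    [0, 0, 200, 210, 220, 0],
    [0, 0, 205, 215, 225, 0],
    [0, 0,   0,   0,   0, 0],
    [0, 0,   0,   0,   0, 0],
]
print(bounding_box(image))  # (2, 1, 4, 2)
```

The set of pixels that passed the threshold is a crude segmentation of the object, and the box around them is its localization.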
Computer Vision Applications
* Robotics
* Surveillance systems and face recognition
A common technique in banks, airports, train stations and other sensitive areas.
* Medical fields, such as radiology, ultrasound and endoscopy (X-ray, MRI, etc.), in surgical operations, or for the early identification of cancer cells in captured images
* Self-driving cars and trains
The world is moving toward this technology very quickly, and many countries are already using it.
* Motion detection in a given area.
* Handwriting recognition, or recognition of any other writing, and its conversion to text on the computer.
* Baggage inspection
* Inspection during the manufacturing process, as well as product quality control.
* Tracking
Tracking the movement of objects across successive images (video), such as people or cars.
Video tutorial
(https://www.youtube.com/watch?v=9KmrZ_M--zA&t=4s)
* Identifying and locating objects (detection and segmentation)
Video tutorial
(https://www.youtube.com/watch?v=CeKq9l3WEl0&t=73s)
* Recognizing people by distinguishing features such as fingerprint, face, or iris, known as biometric systems.
* Military applications, such as identifying enemy vehicles and guiding missiles.
* Traffic systems, such as license plate recognition.
* Aircraft navigation systems.
* Monitoring drivers' eyelid movement to warn them when they start falling asleep while driving.
* High-security systems that identify people by their iris.
* Translating the signs of people who cannot speak or hear into written text or spoken words.
* Agriculture and mechanical harvesting.
Research, studies and applications in this area keep growing. It is a very active area of research.
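As one concrete example from the list above, motion detection can be sketched by comparing two successive frames pixel by pixel (often called frame differencing). The thresholds below and the function name `motion_detected` are assumptions chosen for this tiny example, not values from any particular system.

```python
def motion_detected(prev_frame, next_frame, pixel_threshold=30, min_changed=2):
    """Return True if enough pixels changed noticeably between two frames."""
    changed = 0
    for row_prev, row_next in zip(prev_frame, next_frame):
        for p, n in zip(row_prev, row_next):
            if abs(p - n) > pixel_threshold:
                changed += 1
    return changed >= min_changed


background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
# The same scene with small sensor noise only.
noisy = [
    [12, 10,  9, 10],
    [10, 11, 10, 10],
    [10, 10, 10, 13],
]
# The same scene with a bright object entering on the right.
with_object = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10,  10,  10],
]
print(motion_detected(background, noisy))        # False
print(motion_detected(background, with_object))  # True
```

The two thresholds exist so that small brightness fluctuations are ignored and only a noticeable change over several pixels counts as motion.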
How do you create a computer vision system capable of performing a task (for example, one of the applications above)?
First, you need a strong understanding of images, pixels and their structure, color depth and other basic concepts, which give you a real awareness of the data you will be handling and classifying all the time.
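Color depth, for instance, can be made concrete with a little arithmetic. The sketch below assumes uncompressed images and uses a 640x480 RGB image with 8 bits per channel as an example; the function names are hypothetical.

```python
def levels_per_channel(bits_per_channel):
    """Number of distinct values one channel can take."""
    return 2 ** bits_per_channel


def image_size_bytes(width, height, channels, bits_per_channel):
    """Uncompressed size of an image in bytes, rounded up to a whole byte."""
    total_bits = width * height * channels * bits_per_channel
    return (total_bits + 7) // 8


print(levels_per_channel(8))             # 256
print(levels_per_channel(8) ** 3)        # 16777216 colors for 8-bit RGB
print(image_size_bytes(640, 480, 3, 8))  # 921600
```

Numbers like these show how much data even a small image carries, which matters later when thousands of images are used for training.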
Also learn the skills needed to work with images programmatically: opening, saving, creating, arranging and cropping pictures, and separating color channels.
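Two of these skills, cropping and separating colors, can be sketched in plain Python on a tiny RGB image. The image values and the helper names `crop` and `split_channels` are made up for this example; in practice, libraries such as OpenCV provide these operations, along with loading and saving files, ready-made.

```python
# A tiny RGB image, 3 pixels wide and 2 high.
# Each pixel is a (red, green, blue) tuple.
image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(10, 20, 30), (40, 50, 60), (70, 80, 90)],
]


def crop(img, x, y, width, height):
    """Cut out a width x height region whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in img[y:y + height]]


def split_channels(img):
    """Separate an RGB image into three single-channel images."""
    red = [[pixel[0] for pixel in row] for row in img]
    green = [[pixel[1] for pixel in row] for row in img]
    blue = [[pixel[2] for pixel in row] for row in img]
    return red, green, blue


top_right = crop(image, 1, 0, 2, 1)
red, green, blue = split_channels(image)

print(top_right)  # [[(0, 255, 0), (0, 0, 255)]]
print(red)        # [[255, 0, 0], [10, 40, 70]]
```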
Second, you need a good, solid understanding of the main programming language in this field, which is Python.
Third, study and understand artificial neural networks.
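As a starting point, the basic building block of such a network, a single artificial neuron, can be sketched in a few lines. The sigmoid activation and the example weights are assumptions chosen for illustration; real networks combine many such units and learn their weights from data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


# Two inputs whose weighted contributions cancel out, so z is about 0
# and the output is about 0.5.
output = neuron([0.5, 0.2], [0.4, -1.0], bias=0.0)
print(output)
```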
Fourth, gain a good understanding of a type of artificial neural network called the convolutional neural network (CNN), because for image analysis and processing it makes more efficient use of time and memory than fully connected networks.
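One way to see the memory saving is to count the trainable parameters of a single layer of each kind. The sketch below assumes a 224x224 RGB input, 64 units for the fully connected layer, and 64 filters of size 3x3 for the convolutional layer; these sizes and the function names are illustrative choices of mine.

```python
def dense_params(input_values, units):
    """Weights plus biases of a fully connected layer."""
    return input_values * units + units


def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolutional layer."""
    return kernel_h * kernel_w * in_channels * filters + filters


# Assumed example input: a 224x224 RGB image, flattened for the dense layer.
inputs = 224 * 224 * 3  # 150528 values

print(dense_params(inputs, 64))  # 9633856
print(conv_params(3, 3, 3, 64))  # 1792
```

The convolutional layer needs far fewer parameters because each small filter is reused at every position of the image, while a fully connected layer needs a separate weight for every pixel of every unit.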
Fifth, use an appropriate work environment for training and testing your application. I recommend Anaconda, Spyder and Jupyter.
Sixth, use suitable libraries. There are many of them, but I recommend Keras, TensorFlow, OpenCV, imutils and scikit-learn (sklearn).
Seventh, it is preferable to use a computer with strong processing capabilities and a graphics processing unit (GPU), because training a large neural network is a hard, slow process that consumes a lot of resources.
References
https://ar.wikipedia.org/wiki
https://towardsengineeringvision.blogspot.com/2018/12/pixels-with-coordinate-system-and_45.html
https://stackoverflow.com/questions/42299587/python-transform-list-of-x-y-and-z-to-matrix-table
https://www.oreilly.com/library/view/deep-learning-for/9781788295628/4fe36c40-7612-44b8-8846-43c0c4e64157.xhtml
https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiiqtTAs73fAhUwyoUKHYsqBzYQjRx6BAgBEAU&url=https%3A%2F%2Fdepositphotos.com%2F174867248%2Fstock-photo-futuristic-eye-vision-backdrop.html&psig=AOvVaw04-UFDAyN-GWk5TjaWmEjk&ust=1545910421646644