Getting started with Computer Vision

1 An introduction to computer vision

a. FAQs

What is computer vision?

Of the five human senses, vision provides most of the data we receive and is considered our dominant sense. It gives us a detailed description of a surrounding world that is constantly changing. Although vision involves a huge amount of information and complex processing, the human visual system interprets this information easily. The ability to see, process and then act on visual input is something most humans take for granted.

Computer vision engineering is the practice of using technology and machines to replicate, and even improve upon, human vision. The technology captures and stores images before transforming them into information that can be acted upon. This requires expertise across a range of fields, including sensor technology, image and signal processing, computer graphics, computer architecture, algorithms and machine learning.

What are the fundamental computer vision techniques?

• Image classification: giving a computer the ability to interpret the input from an image sensor and categorise what it ‘sees’.
• Object detection: detecting instances of a certain class (such as vehicles, humans or buildings) in images or videos.
• Object tracking: detecting and recognising a defined item in each frame of a video to distinguish it from other objects in the scene.
• 3D image reconstruction: capturing the shape and appearance of real objects.
• Semantic image segmentation: labelling specific regions of an image according to what the object is.

What’s the difference between image processing, computer vision and machine learning?

Each of these fields is based on the input of an image. They process the pixels and give us an altered output in return. While their names imply their goals and methodologies, these fields depend substantially on one another.
Figure: Relationship between AI, machine learning and deep learning

• Artificial intelligence: the theory and development of computer systems to perform tasks normally requiring human intelligence.
• Machine learning: an application of AI based around the idea of giving machines access to data and letting them learn for themselves.
• Deep learning: a special type of machine learning algorithm that uses multiple layers of neural networks, mimicking the connectivity of the human brain, to process data and create patterns for use in decision making.

As a minimum, an AI system must be able to reproduce aspects of human intelligence.

Image processing takes an image as an input and provides a processed image as an output. The purpose of the processing is usually to improve the quality of the image. Typical methods are filtering, noise removal, sharpening and edge detection.

Computer vision broadens the purpose of image processing to include extracting quantitative and qualitative information from visual data. Similar to human visual reasoning, computer vision can distinguish between objects, classify them and sort them according to their attributes. Like image processing, computer vision takes an image as an input. However, it returns an output with additional information interpreted from the image, such as size, colour, number, location or orientation. This can be extended beyond a single image to multiple images or video, for example to count the cars passing a point on the street as they are recorded by a video camera. Temporal information therefore plays a role in computer vision, much as it does in our own understanding of the world.

Machine learning is an application of AI that gives a computer system the ability to learn and improve automatically from experience without being explicitly programmed. In computer vision terms, this means ‘training’ a system. Algorithms and statistical models perform image analysis using patterns and inference learned from data sets of many thousands of images, rather than following explicit instructions as image processing would.
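The distinction between the three fields described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `threshold`, `count_objects` and `learn_cutoff`, the default cutoff of 128, and the tiny hand-made greyscale frame. The ‘learning’ step is a deliberately simple toy that picks one number from labelled examples; it is not how a production system would be trained.

```python
from collections import deque


def threshold(image, cutoff=128):
    """Image processing: image in, image out (a binary mask)."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


def count_objects(mask):
    """Computer vision: image in, information out (number of bright regions).

    Regions are groups of 1-pixels connected horizontally or vertically.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            count += 1  # found a new, unvisited region
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # flood-fill the whole region
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def learn_cutoff(labelled_pixels):
    """Machine learning (toy): choose the cutoff from labelled examples
    instead of hard-coding it. Each example is (pixel_value, is_object)."""
    def errors(cutoff):
        return sum((value >= cutoff) != is_object
                   for value, is_object in labelled_pixels)
    candidates = sorted({value for value, _ in labelled_pixels})
    return min(candidates, key=errors)


# Hypothetical 4x5 greyscale frame (0 = black, 255 = white).
frame = [
    [10, 200, 210, 12, 15],
    [11, 220, 14, 13, 230],
    [9, 12, 10, 11, 240],
    [250, 13, 12, 10, 14],
]

# Hypothetical labelled pixel values used to 'train' the cutoff.
examples = [(10, False), (14, False), (90, False),
            (180, True), (200, True), (250, True)]

mask = threshold(frame)                      # image processing
n_fixed = count_objects(mask)                # computer vision
cutoff = learn_cutoff(examples)              # machine learning (toy)
n_learned = count_objects(threshold(frame, cutoff))
print(n_fixed, cutoff, n_learned)
```

Here `threshold` returns another image, `count_objects` returns a number describing the image, and `learn_cutoff` derives a parameter from labelled data rather than from an explicit instruction. In practice, each step would normally be handled by an established library rather than hand-written loops.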
What is artificial intelligence (AI)?

Artificial intelligence is intelligence demonstrated by machines, where a device perceives its environment and mimics human functions such as ‘learning’ and ‘problem solving’. AI is the broad concept of machines being able to carry out tasks in a way that is considered ‘smart’.

What are neural networks?

Neural networks are a means of machine learning in which a computer learns to perform a task by analysing training examples or datasets. Usually, the examples have been manually labelled in advance. An object recognition system might be fed thousands of labelled images of cars, houses and cups, and would find visual patterns in the images that correlate consistently with each label.

What is deep learning?

Deep learning is the use of neural network methods to perform image analysis. It moves away from statistical methods towards neural network algorithms that are designed to mimic the neurons of the human brain.

What applications can computer vision be used for?

Applications of computer vision are many and varied. Common applications you may be familiar with include augmented reality, facial recognition, gesture and handwriting recognition, machine vision, remote sensing, robotics, autonomous vehicles, people counting and iris recognition.

What business sectors use computer vision?

Computer vision is used in numerous sectors, including remote sensing, healthcare (particularly medical imaging such as MRI scans or ultrasound imaging), security, manufacturing, automotive, transport, robotics, sports, gaming and many others. The computer vision market is expected to reach close to $22 billion by 2026 (source: https://www.verifiedmarketresearch.com/product/globalcomputer-vision-market-size-and-forecast-to-2025/).

b. A brief history of computer vision

Computer vision has a long history in commercial and government use, where light-wave sensors covering various spectral ranges have been deployed in applications such as:

• Remote sensing for environmental observation and management
• High resolution cameras that collect intelligence over battlefields
• Thermal imagers to detect people during police operations
• X-ray sensors for airport security.

The sensors can be stationary or attached to moving objects, such as satellites, drones and vehicles. When combined with connectivity technologies such as Wi-Fi, Bluetooth or 3G/4G/5G, they enable a new set of applications that were not possible before. Computer vision, connectivity, advanced data analytics and artificial intelligence are catalysts for each other, giving rise to revolutionary leaps in IoT innovations and applications.
