We use our eyes to judge how fast we are moving relative to nearby objects, to find the lane lines, and to work out exactly where we need to turn. In short, we use our eyes to perceive the world around us. The eyes are the sensors, and the brain is the system that takes in their signals and makes sense of what they capture. There is no vision without the brain; without it, what the eyes capture is just data, an input signal with no interpretation.
Likewise, Computer Vision is a system of sensors (cameras, LIDAR, etc.) and algorithms (deep-learning image classifiers, image analysis, etc.) that make sense of the signals coming from those sensors.
It is also worth mentioning that “Vision” here means two different things:
1. Making sense of the scene in front of you, as explained above.
2. Prediction, in the sense that Steve Jobs was a great visionary: he was able to see what was coming. A good example is the system the Predator uses in that old movie to predict the trajectory of an object thrown at him and then shoot it down in mid-flight. Tesla Autopilot predicting a crash is another example of such a vision system.
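To make the "prediction" sense of vision concrete, here is a minimal sketch of trajectory prediction for a thrown object. It assumes the object has been seen at two positions a known time apart and that only gravity acts on it (no air resistance). The function name `predict_position` and its parameters are made up for illustration; a real system would estimate the positions from sensor data and would have to handle noise.

```python
def predict_position(p0, p1, dt, t_ahead, g=9.81):
    """Predict where a thrown object will be t_ahead seconds after p1.

    p0, p1  : (x, y) positions in metres, observed dt seconds apart (y points up)
    t_ahead : how far into the future to predict, in seconds
    g       : gravitational acceleration in m/s^2 (assumed constant)
    """
    # Horizontal velocity is constant when there is no drag.
    vx = (p1[0] - p0[0]) / dt
    # The average vertical velocity over the interval equals the true
    # velocity at the interval's midpoint, so step half an interval
    # forward to get the velocity at the time of the second observation.
    vy = (p1[1] - p0[1]) / dt - 0.5 * g * dt
    # Extrapolate with the standard ballistic equations.
    x = p1[0] + vx * t_ahead
    y = p1[1] + vy * t_ahead - 0.5 * g * t_ahead ** 2
    return x, y


# Two observations, 0.1 s apart, of an object launched from the origin
# at 10 m/s horizontally and 20 m/s vertically.
first = (0.0, 0.0)
second = (1.0, 20 * 0.1 - 0.5 * 9.81 * 0.1 ** 2)
predicted = predict_position(first, second, dt=0.1, t_ahead=0.9)
print(predicted)
```

The interpretation step (turning pixels into the two positions) is the first kind of vision; the extrapolation is the second.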
Color, Shape, Orientation, and Position
Generally speaking, four kinds of features are useful when carrying out a Computer Vision task:
1. Color Detection – Selecting only those parts of an image that are of a certain color.
2. Edge Detection – Extracting the shape of an object from its surroundings.
3. Orientation – How the object is oriented in a defined space.
4. Position – Where the object is located in that space.
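The four features above can be sketched on a tiny synthetic image using only the Python standard library. This is an illustration under stated assumptions, not a production pipeline: the image is a hand-built grid of RGB tuples, the "red" thresholds are arbitrary, the edge step is a simple mask-boundary test rather than a gradient-based detector, and all function names are hypothetical. Orientation is estimated from second-order image moments.

```python
import math


def is_red(pixel):
    # Arbitrary thresholds chosen for this example only.
    r, g, b = pixel
    return r > 150 and g < 100 and b < 100


def color_mask(image, is_target):
    # Color detection: True where a pixel matches the target color.
    return [[is_target(px) for px in row] for row in image]


def boundary(mask):
    # Edge detection (simplified): a mask pixel is on the boundary if
    # any of its 4 neighbours is outside the mask or outside the image.
    h, w = len(mask), len(mask[0])

    def inside(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x]

    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [[mask[y][x] and not all(inside(y + dy, x + dx) for dy, dx in steps)
             for x in range(w)] for y in range(h)]


def position_and_orientation(mask):
    # Position: centroid of the mask pixels, as (x, y) in pixel units.
    # Orientation: angle of the principal axis in degrees, measured in
    # image coordinates (x to the right, y downward).
    pts = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not pts:
        raise ValueError("mask is empty")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return (cx, cy), math.degrees(angle)


# A 5x7 black image with a 3x5 red rectangle in the middle.
BLACK, RED = (0, 0, 0), (255, 0, 0)
image = [[RED if 1 <= y <= 3 and 1 <= x <= 5 else BLACK for x in range(7)]
         for y in range(5)]

mask = color_mask(image, is_red)
edges = boundary(mask)
centroid, angle_deg = position_and_orientation(mask)
print(centroid, angle_deg)
```

Each function maps to one item in the list: `color_mask` to color, `boundary` to shape, and `position_and_orientation` to the last two.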
Talk about the challenges, best practices, and how to improve the odds of success by getting context from a Bayesian approach. Use the car-in-the-garage and number-plate example.
Talk about RGB, HSV and L*a*b*
TBD. Talk about OpenCV
Talk about VPUs and sensors.
Cover the case where you cannot see the whole pattern in one go, so you have to compute it piece by piece, distributed and in parallel, and then make sense of the big picture.