08 - Convolution
Class: CSCE-421
Notes:
Pre:
- How do we convert an image into a fixed-length vector?
- We want this vector X to remain the same if we shift or rotate the input image
- It turns out the translation part is easy to handle; the rotation part can be done, but it is not easy
- Take an image I and turn it into a fixed-length vector X
Image filtering
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219093618.png)
What is the idea of convolution?
- An image is simply a two-dimensional array of numbers
- If we have a 3x3 box (9 squares) as our filter (sometimes called a kernel), we place it over a 3x3 portion of the array, multiply element-wise, and sum the products to get one output value
- When all the filter weights are equal, the output is a moving average of the inputs
- This produces a blurrier version of the image.
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219093639.png)
- Then we move the filter one step forward (advance 1 position) and repeat, so that it eventually covers the whole image
- Note that if we change our filter, the output can be very different
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219093700.png)
- We want this operation to behave predictably when the image is transformed (what if we shift the location of the left box?)
- How would the output change in this case?
- The output would simply be shifted as well!
- This is called translation equivariance
- It means that if you apply a transformation to the input, the output changes in the same way
- How about rotation? If we rotate the object and apply the same filter, will the output be related to the original output?
- In general, the output will not be related to the original output in any simple way
- What matters is the angle of the image relative to the filter. If you rotate only the image, the element-wise multiplication pairs up different numbers than before.
- If you rotate both the image and the filter, you get the same output, rotated.
- What if we apply 4 different kernels to the input image, each a copy of the same filter at a different angle of rotation (for example 0, 90, 180, and 270 degrees)? If the input image is rotated by one of those angles, you basically just change the order of the outputs, so the outputs can still be related to those of the original image. Now you have a network with rotation equivariance.
- But what happens if you rotate by just 3 degrees, or by an arbitrary number of degrees?
- Then it breaks our system, because we are only accounting for 4 different angles. So in general it is not that beneficial.
- This is not that intuitive to implement; to do it you have to rely on additional transformations that might help.
- This is the idea of convolution
- Depending on the filter, the output will be very different
- The numbers in the filter are actually learned from data: you treat them as parameters and learn them from the data.
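The sliding-window operation and the equivariance claims above can be checked numerically. This is a minimal sketch, not the course's implementation: `filter2d` is a hypothetical helper name, the filter is random, and only positions where the filter fits fully inside the image are computed. The filter is not flipped, which matches the element-wise description in these notes (strictly speaking this is cross-correlation, the convention CNNs call convolution).

```python
import numpy as np

def filter2d(image, kernel):
    """Slide `kernel` over `image` one position at a time.

    At each position, multiply the kernel element-wise with the patch
    under it and sum the products to get one output value.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))  # arbitrary example filter (assumption)

# Translation: a bright square on a dark background, then the same square
# moved 2 rows down and 3 columns right.
image = np.zeros((12, 12))
image[2:5, 2:5] = 1.0
shifted = np.roll(image, shift=(2, 3), axis=(0, 1))
out = filter2d(image, kernel)
out_shifted = filter2d(shifted, kernel)
# Shifting the input shifts the output by the same amount.
translation_ok = np.allclose(out_shifted, np.roll(out, shift=(2, 3), axis=(0, 1)))

# Rotation by 90 degrees, using a random image.
image2 = rng.normal(size=(8, 8))
out2 = filter2d(image2, kernel)
# Rotating only the image: the output is not the rotated original output.
image_only_ok = np.allclose(filter2d(np.rot90(image2), kernel), np.rot90(out2))
# Rotating both image and filter: the output is the original output, rotated.
rotation_ok = np.allclose(
    filter2d(np.rot90(image2), np.rot90(kernel)), np.rot90(out2)
)

# Four copies of the filter at 0, 90, 180, 270 degrees: rotating the input by
# 90 degrees cyclically reorders (and rotates) the four outputs.
bank = [np.rot90(kernel, k) for k in range(4)]
responses = [filter2d(image2, f) for f in bank]
responses_rot = [filter2d(np.rot90(image2), f) for f in bank]
bank_ok = all(
    np.allclose(responses_rot[k], np.rot90(responses[k - 1])) for k in range(4)
)

print(translation_ok, image_only_ok, rotation_ok, bank_ok)
```

The explicit double loop is slow but mirrors the "move one step forward" description; the checks only cover exact 90-degree rotations, which is the limitation the notes point out for arbitrary angles.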
Box Filter
What does it do?
- Replaces each pixel with an average of its neighborhood
- Achieves a smoothing effect (removes sharp features)
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219093857.png)
Smoothing with box filter
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219094504.png)
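To see the smoothing effect in numbers, here is a small sketch that applies a 3x3 box filter (every weight 1/9) to a sharp step edge. The `filter2d` helper and the test image are assumptions made for illustration, not taken from the slides.

```python
import numpy as np

def filter2d(image, kernel):
    """Element-wise multiply and sum at every position where the kernel fits."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Box filter: all weights equal and summing to 1, so each output value is
# the average of the 3x3 neighborhood.
box = np.full((3, 3), 1.0 / 9.0)

# Hypothetical image with a sharp step: left half 0, right half 1.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

smoothed = filter2d(image, box)
# Each output row ramps 0, 1/3, 2/3, 1: the abrupt jump became a gradual one.
print(smoothed)
```

The jump from 0 to 1 that took one pixel in the input is spread over several pixels in the output, which is what "removes sharp features" means here.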
Practice with linear filters
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219094613.png)
- This filter would do nothing to the original image
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219094645.png)
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219094719.png)
- The output on the right is the vertical edge response (absolute value)
- How does a negative number in the filter affect the result?
- Remember that we are doing element-wise multiplication
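- The positive and negative weights cancel wherever the pixels on the two sides of the window are equal, so this filter responds on both vertical borders but not on the horizontal borders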
/CSCE-421/Ex2/Visual%20Aids/Pasted%20image%2020260219095015.png)
- This is the horizontal edge response (absolute value)
- If you make the filter slightly larger, you can also build a 45-degree edge detector and, in principle, any detector you can think of
- Just by changing the filter, we can transform the input image so that it becomes easier to recognize.
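The filters in this practice section can be tried on a tiny image. The exact kernels from the slides are not reproduced in these notes, so the ones below are assumptions: a center-only identity kernel and Sobel-style edge kernels, which are common choices. `filter2d` is a hypothetical helper.

```python
import numpy as np

def filter2d(image, kernel):
    """Element-wise multiply and sum at every position where the kernel fits."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Assumed kernels (not copied from the slides).
identity = np.array([[0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0]])
vertical_edge = np.array([[1.0, 0.0, -1.0],
                          [2.0, 0.0, -2.0],
                          [1.0, 0.0, -1.0]])
horizontal_edge = vertical_edge.T

# Image containing a single vertical border: left half 0, right half 1.
step = np.zeros((6, 6))
step[:, 3:] = 1.0

same = filter2d(step, identity)                 # copies the center pixel
v_resp = np.abs(filter2d(step, vertical_edge))  # large only near the border
h_resp = np.abs(filter2d(step, horizontal_edge))  # zero: no horizontal border

print(v_resp[0])
print(h_resp[0])
```

In flat regions the positive and negative weights multiply equal pixel values and cancel, so only the columns next to the border give a nonzero response. Transposing the image turns the vertical border into a horizontal one and swaps which filter responds.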
Image filtering
- Filters can be designed to detect edges of different orientations
- The detected edges can be combined to form object shapes, which are important for object recognition
- Convolutional neural networks are based on the idea of image convolutions
- Convolutional neural networks use data to train filter parameters
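As a rough illustration of "using data to train filter parameters", the sketch below treats the 9 filter weights as parameters and fits them with plain gradient descent so that the filter output matches a target output. Everything here is an assumption made for illustration: the hidden target filter, the random image, the learning rate, and the `filter2d` helper. A real CNN trains many filters across several layers with backpropagation; this only shows the core idea for one filter.

```python
import numpy as np

def filter2d(image, kernel):
    """Element-wise multiply and sum at every position where the kernel fits."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)

# Assumed setup: a hidden filter produces the target output we want to match.
true_kernel = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])
image = rng.normal(size=(16, 16))
target = filter2d(image, true_kernel)

kernel = np.zeros((3, 3))  # the filter weights are the trainable parameters
learning_rate = 0.1
for step in range(300):
    error = filter2d(image, kernel) - target
    # Gradient of mean(error**2) with respect to kernel[a, b] is
    # 2 * mean(error[i, j] * image[i + a, j + b]), which is itself a
    # sliding-window sum of the image against the error map.
    grad = 2.0 * filter2d(image, error) / error.size
    kernel -= learning_rate * grad

print(np.round(kernel, 3))
```

After training, `kernel` should be close to `true_kernel`: the edge detector was recovered from input/output data alone rather than designed by hand.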