Sunday, January 9, 2022

The mathematical art of "image rotation"

In the jargon of image manipulation and broadly, computer graphics, rotation is a very essential operation in our day-to-day lives. One requires basic knowledge of trigonometry and Cartesian coordinate system to grasp the concept of rotation mathematically.

One must also know programming in order to accomplish the task. A person needs little formal premier knowledge of computer science to write such code. The reference program given is written in Java.

Derivation

A digital image rectangular is rectangular in shape. Let w be its width and h its height.


Let O be a point within the rectangle (most likely in center).
Let OR be a ray parallel to the width of the rectangle.
Let P be a point within the rectangle existing at a clockwise angle ϕ from OR.

Further rotating it α angle apart clockwise we arrive at a point Q which expressing in form of Cartesian coordinates we find it to be,

(OR, RQ) = (x', y')
                 = (OQ × cos(ϕ+α), OQ × sin(ϕ+α))
                 = (OQ × (cos(ϕ)cos(α) - sin(ϕ)sin(α)), OQ × (cos(ϕ)sin(α) + sin(ϕ)cos(α)))
                 = (OP × cos(ϕ)cos(α) - OP × sin(ϕ)sin(α), OP × cos(ϕ)sin(α) + OP × sin(ϕ)cos(α))
                 = (OS × cos(α) - SP × sin(α), OS × sin(α) + SP × cos(α))
                 = (cos(α) - y sin(α), x sin(α) + y cos(α))

Angle addition provided a way to not requiring to determine ϕ saving us from complicated trigonometric functions (eg. arctan, atan2). This becomes a linear transformation which can be written in matrix form but it's omitted here.

Application

For example we take this image.


In theory, it should work like a charm but it always produces poor quality results in reality. Suppose we rotate the previously mentioned image with this method by 60 degrees then it looks really awful.


As x' and y' aren't integers sometimes thus truncating their decimal part would overlap with other pixels, leaving patterned gaps in the output. Though aesthetic, but are really annoying, unwanted and of course impractical.

So instead of finding for a coordinate what its rotated coordinate would be, we find for a rotated coordinate what its original coordinate was such that it does nearest-neighbor sampling while fetching the pixel which fills the void pixels. Simply the process has to be reversed.

(x, y) = (OQ × cos(ϕ-α), OQ × sin(ϕ-α)) 
          = (cos(α) + y sin(α), y cos(α) - x sin(α))

Here (x', y') is the rotated point and (x, y) the original one. We iterate over all points (pixels) of the output Cartesian plane assigning coordinate for each to (x', y') and find the corresponding coordinate in the input Cartesian plane and quantize it because fractional coordinates aren't possible.

One might clip the coordinates to width and height of the bounding rectangle (image), in order to avoid out-of-bounds access or they may extend the output image considering the farthest radial distance from the origin (mostly in center) to include parts that go outside the picture.

Doing it the reverse way yields us,

Perfect! Isn't it? It is really crispy too. If we've used a sampler using bilinear, bicubic or Lanczos interpolation it would have appeared smooth but blurry too and would complicate the process.

I hope you learnt something from this and be able to use it in your future works. This might also come around helpful for me in the future '-'

No comments:

Post a Comment