On-Line Computer Graphics Notes
THE CAMERA TRANSFORM


Overview

To understand the rendering process, you must master the procedure that specifies a camera and then constructs a transformation that projects a three-dimensional scene onto a two-dimensional screen. This procedure has three components: first, the specification of a camera model; second, the conversion of the scene's coordinates from Cartesian space to the space of the camera; and finally, the specification of a viewing transformation that projects the scene into image space.



The Camera Model

We specify our initial camera model by identifying the following parameters.

1.
A scene, consisting of polygonal elements each represented by their vertices,
2.
A point that represents the camera position -- $ {\bf C} = (x_c, y_c, z_c)$,
3.
A point that represents the ``center-of-attention'' of the camera (i.e. where the camera is looking) -- $ {\bf A} = ( x_a, y_a, z_a )$,
4.
A field-of-view angle, $ \alpha$, representing the angle subtended at the apex of the viewing pyramid.
5.
The specification of ``near'' and ``far'' bounding planes. These planes are considered perpendicular to the direction-of-view vector, at distances of $ n$ and $ f$ from the camera, respectively.

\includegraphics {figures/camera}

The specification of $ {\bf C} $, $ {\bf A} $ and $ \alpha$ forms a viewing volume in the shape of a pyramid, with the camera position $ {\bf C} $ at the apex and the vector $ {\bf A} - {\bf C} $ forming the axis. This pyramid is commonly referred to as the viewing pyramid. The near and far planes truncate this pyramid, and the truncated viewing pyramid is the region of space that contains the primary portion of the scene to be viewed. (Objects may extend outside the truncated pyramid: in many situations polygons will lie between the near plane and the camera, or beyond the far plane.) The viewing transform maps this truncated pyramid onto the image space volume $ -1 \leq x,y,z \leq 1$.
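As a rough illustration of the camera model, the sketch below stores the parameters $ ( {\bf C} , {\bf A} ,\alpha,n,f)$ and classifies a point by its distance along the axis $ {\bf A} - {\bf C} $ relative to the near and far planes. The names `Camera`, `view_direction` and `depth_region` are hypothetical, and treating $ \alpha$ as radians is an assumption; the test checks depth only, not the sides of the pyramid.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Camera:
    """Camera parameters (C, A, alpha, n, f); the class name is illustrative."""
    C: tuple      # camera position
    A: tuple      # center of attention
    alpha: float  # field-of-view angle (assumed to be in radians)
    n: float      # distance from the camera to the near plane
    f: float      # distance from the camera to the far plane

def view_direction(cam):
    """Unit vector along A - C, the axis of the viewing pyramid."""
    d = tuple(a - c for a, c in zip(cam.A, cam.C))
    length = math.sqrt(sum(x * x for x in d))
    if length == 0.0:
        raise ValueError("C and A must be distinct points")
    return tuple(x / length for x in d)

def depth_region(cam, P):
    """Classify P by its distance from the camera along the view axis."""
    d = view_direction(cam)
    depth = sum((p - c) * di for p, c, di in zip(P, cam.C, d))
    if depth < cam.n:
        return "nearer than the near plane"
    if depth > cam.f:
        return "beyond the far plane"
    return "between the near and far planes"

# Camera on the z axis looking at the origin.
cam = Camera(C=(0.0, 0.0, 5.0), A=(0.0, 0.0, 0.0),
             alpha=math.radians(60.0), n=1.0, f=10.0)
```

With this camera, the origin lies at depth 5 and so falls between the two planes, while a point at $ (0, 0, 4.5)$ is only half a unit from the camera and falls in front of the near plane.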


The Camera Transform

Given the definition of a camera $ ( {\bf C} , {\bf A} ,\alpha,n,f)$, the camera transformation is a combination of a transform that first converts the coordinates of the Cartesian frame to the local coordinates of the camera's frame,

\includegraphics {figures/camera-at-the-origin-1}

and second, applies the viewing transform. These two transformations are usually multiplied together to form a single $ 4 \times 4$ matrix that is applied to all points of the scene.
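The following sketch illustrates why the two transformations can be multiplied into a single matrix. It uses the row-vector convention of these notes (a point is the row $ [x \; y \; z \; 1]$ multiplied on the right by a matrix). Both matrices are stand-ins chosen for illustration: `to_camera` is a plain translation and `viewing` is a placeholder uniform scale, not an actual viewing transform.

```python
def mat_mul(M, N):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(M[i][k] * N[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(row, M):
    """Multiply a homogeneous row vector [x, y, z, w] by a 4x4 matrix."""
    return [sum(row[k] * M[k][j] for k in range(4)) for j in range(4)]

# Stand-in for the Cartesian-to-camera matrix: translation by (0, 0, -5).
to_camera = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, -5.0, 1.0]]

# Placeholder for the viewing transform: a uniform scale by 0.5.
viewing = [[0.5, 0.0, 0.0, 0.0],
           [0.0, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

# With row vectors, the matrix applied first appears on the left.
combined = mat_mul(to_camera, viewing)

point = [1.0, 2.0, 3.0, 1.0]
step_by_step = apply(apply(point, to_camera), viewing)
at_once = apply(point, combined)
```

Both `step_by_step` and `at_once` hold the same coordinates, so the combined matrix only needs to be computed once and can then be applied to every point of the scene.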


Defining a Frame at the Camera Position

The main idea here is to define a frame at the camera position. Given such a frame $ {\cal F} _{\rm camera}=( {\vec u} , {\vec v} , {\vec w} , {\bf C} )$, we generate a transformation that converts the Cartesian Frame coordinates to the camera's frame.

Defining a frame at the camera position is easy, and there are actually a number of ways of doing this. One of the vectors is obvious: we want

$\displaystyle {\vec w} = \frac{ {\bf C} - {\bf A} }{\vert {\bf C} - {\bf A} \vert}
$

(the transformed camera should be looking along the negative $ w$ axis).

In order to define the other vectors that make up the frame, we must make an assumption. We assume that the vertical direction of the camera must be in the plane defined by $ {\vec w} $ and the vector $ {\vec y} = <0,1,0>$. This frequently happens when you are taking a picture, if you think about it, and it is actually fairly easy to arrange. See the following figure for an illustration of this process. In the figure, the dotted line is the direction of view, and should be placed on the negative $ z$ axis by the transformation.

\includegraphics {figures/camera-1-frame}

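To define $ {\vec u} $ and $ {\vec v} $ we utilize the following steps:

$\displaystyle {\vec u}$ $\displaystyle = \frac{ {\vec y} \times {\vec w} }{\vert {\vec y} \times {\vec w} \vert}$    
$\displaystyle {\vec v}$ $\displaystyle = {\vec w} \times {\vec u}$    

This also ensures that the vectors are all unit vectors, and that they are mutually perpendicular.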

We note that this works well, except when you wish to have the camera look in the direction $ <0,1,0>$ or $ <0,-1,0>$. In these cases, either $ {\vec y} = {\vec w} $ or $ {\vec y} =- {\vec w} $, so $ {\vec y} \times {\vec w} = {\vec 0} $ and we cannot calculate a frame in this manner. However, we can substitute another vector as the ``up direction'' and use it with $ {\vec w} $ to obtain $ {\vec u} $.
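A minimal sketch of the frame construction follows. It takes $ {\vec u} $ as the normalized $ {\vec y} \times {\vec w} $ and $ {\vec v} = {\vec w} \times {\vec u} $; this is an assumption, though it is consistent with the degenerate case just described and with the cross-product identities used later in these notes. The function name `camera_frame`, the fallback up vector $ <0,0,1>$ and the tolerance `eps` are illustrative choices rather than part of the notes.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def length(a):
    return math.sqrt(sum(x * x for x in a))

def normalize(a):
    n = length(a)
    if n == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(x / n for x in a)

def camera_frame(C, A, up=(0.0, 1.0, 0.0), fallback_up=(0.0, 0.0, 1.0), eps=1e-9):
    """Return (u, v, w) for a camera at C looking toward A."""
    w = normalize(sub(C, A))        # camera looks along the negative w axis
    side = cross(up, w)
    if length(side) < eps:          # looking straight up or down: up x w vanishes
        side = cross(fallback_up, w)
    u = normalize(side)
    v = cross(w, u)                 # unit length because w and u are orthonormal
    return u, v, w

# Ordinary case: camera on the z axis looking at the origin.
u, v, w = camera_frame((0.0, 0.0, 5.0), (0.0, 0.0, 0.0))

# Degenerate case: camera directly above the origin, so the fallback is used.
u2, v2, w2 = camera_frame((0.0, 5.0, 0.0), (0.0, 0.0, 0.0))
```

In the ordinary case the frame coincides with the Cartesian axes. In the degenerate case the result is still orthonormal, but the orientation of $ {\vec u} $ and $ {\vec v} $ about the view direction depends entirely on which fallback vector is chosen.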


Calculating the Matrix

To calculate the actual matrix that implements the transformation, we can write each of the vectors $ <1,0,0>$, $ <0,1,0>$ and $ <0,0,1>$ as a linear combination of $ {\vec u} $, $ {\vec v} $, and $ {\vec w} $ (since the vectors defining $ {\cal F} _{\rm camera}$ are linearly independent). In addition, we can write the vector $ (0,0,0) - {\bf C} $ as a linear combination of $ {\vec u} $, $ {\vec v} $ and $ {\vec w} $. Thus we can calculate the values $ e_{i,j}$, where

$\displaystyle <1,0,0>$ $\displaystyle = e_{1,1} {\vec u} + e_{1,2} {\vec v} + e_{1,3} {\vec w}$    
$\displaystyle <0,1,0>$ $\displaystyle = e_{2,1} {\vec u} + e_{2,2} {\vec v} + e_{2,3} {\vec w}$    
$\displaystyle <0,0,1>$ $\displaystyle = e_{3,1} {\vec u} + e_{3,2} {\vec v} + e_{3,3} {\vec w}$    
$\displaystyle (0,0,0)$ $\displaystyle = e_{4,1} {\vec u} + e_{4,2} {\vec v} + e_{4,3} {\vec w} + {\bf C}$    

These equations can be solved by Cramer's Rule. To obtain $ <1,0,0> = e_{1,1} {\vec u} + e_{1,2} {\vec v} + e_{1,3} {\vec w} $, we have

$\displaystyle e_{1,1}$ $\displaystyle = \frac{ (<1,0,0> \times {\vec v} ) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   ,    
$\displaystyle e_{1,2}$ $\displaystyle = \frac{ ( {\vec u} \times <1,0,0>) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   , and    
$\displaystyle e_{1,3}$ $\displaystyle = \frac{ ( {\vec u} \times {\vec v} ) \cdot <1,0,0> }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$    

To obtain $ <0,1,0> = e_{2,1} {\vec u} + e_{2,2} {\vec v} + e_{2,3}
{\vec w} $, we have

$\displaystyle e_{2,1} = \frac{ (<0,1,0> \times {\vec v} ) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   ,    
$\displaystyle e_{2,2} = \frac{ ( {\vec u} \times <0,1,0>) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   , and    
$\displaystyle e_{2,3} = \frac{ ( {\vec u} \times {\vec v} ) \cdot <0,1,0> }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$    

And to obtain $ <0,0,1> = e_{3,1} {\vec u} + e_{3,2} {\vec v} + e_{3,3}
{\vec w} $, we have

$\displaystyle e_{3,1} = \frac{ (<0,0,1> \times {\vec v} ) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   ,    
$\displaystyle e_{3,2} = \frac{ ( {\vec u} \times <0,0,1>) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   , and    
$\displaystyle e_{3,3} = \frac{ ( {\vec u} \times {\vec v} ) \cdot <0,0,1> }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$    

In addition, if $ {\vec t} = (0,0,0) - {\bf C} $, then we have

$\displaystyle e_{4,1} = \frac{ ( {\vec t} \times {\vec v} ) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   ,    
$\displaystyle e_{4,2} = \frac{ ( {\vec u} \times {\vec t} ) \cdot {\vec w} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$   , and    
$\displaystyle e_{4,3} = \frac{ ( {\vec u} \times {\vec v} ) \cdot {\vec t} }{ ( {\vec u} \times {\vec v} ) \cdot {\vec w} }$    

to get $ {\vec t} = e_{4,1} {\vec u} + e_{4,2} {\vec v} + e_{4,3} {\vec w} $.
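The Cramer's Rule formulas above all share one pattern: replace one of $ {\vec u} $, $ {\vec v} $, $ {\vec w} $ by the target vector in the triple product and divide by $ ( {\vec u} \times {\vec v} ) \cdot {\vec w} $. The sketch below implements that pattern for any linearly independent frame; the function names and the skewed example frame are hypothetical and chosen only to show that orthonormality is not required at this stage.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triple(a, b, c):
    """Scalar triple product (a x b) . c."""
    return dot(cross(a, b), c)

def coefficients(a, u, v, w):
    """Solve a = e1*u + e2*v + e3*w for (e1, e2, e3) by Cramer's Rule."""
    det = triple(u, v, w)
    if det == 0:
        raise ValueError("u, v, w are not linearly independent")
    return (triple(a, v, w) / det,   # a replaces u
            triple(u, a, w) / det,   # a replaces v
            triple(u, v, a) / det)   # a replaces w

# A skewed (non-orthonormal) frame, used only as an illustration.
u = (1.0, 0.0, 0.0)
v = (1.0, 1.0, 0.0)
w = (1.0, 1.0, 1.0)

e = coefficients((2.0, 3.0, 4.0), u, v, w)

# Rebuild the vector from its coefficients as a check.
rebuilt = tuple(e[0] * u[i] + e[1] * v[i] + e[2] * w[i] for i in range(3))
```

Calling `coefficients` with $ <1,0,0>$, $ <0,1,0>$, $ <0,0,1>$ and $ {\vec t} $ in turn yields the four rows of values $ e_{i,j}$.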

The matrix that converts the coordinates of objects in the Cartesian frame $ {\cal F} _{\rm C}$ into coordinates for the frame $ {\cal F} _{\rm camera}$ is given by

$\displaystyle \left[
\begin{array}{cccc}
e_{1,1} & e_{1,2} & e_{1,3} & 0 \\
e_{2,1} & e_{2,2} & e_{2,3} & 0 \\
e_{3,1} & e_{3,2} & e_{3,3} & 0 \\
e_{4,1} & e_{4,2} & e_{4,3} & 1
\end{array}\right]
$

Any point $ {\bf P} =(x,y,z)$ can be written in the frame $ {\cal F} _{\rm C}$ by

$\displaystyle \left[
\begin{array}{cccc}
x & y & z & 1
\end{array}\right]
\left[
\begin{array}{c}
<1,0,0> \\
<0,1,0> \\
<0,0,1> \\
(0,0,0) \\
\end{array}\right]
$

But by the above calculations this is equal to

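$\displaystyle \left[
\begin{array}{cccc}
x & y & z & 1
\end{array}\right]
\left[
\begin{array}{cccc}
e_{1,1} & e_{1,2} & e_{1,3} & 0 \\
e_{2,1} & e_{2,2} & e_{2,3} & 0 \\
e_{3,1} & e_{3,2} & e_{3,3} & 0 \\
e_{4,1} & e_{4,2} & e_{4,3} & 1
\end{array}\right]
\left[
\begin{array}{c}
{\vec u} \\
{\vec v} \\
{\vec w} \\
{\bf C} \\
\end{array}\right]
$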

which implies that the coordinate

$\displaystyle \left[
\begin{array}{cccc}
x & y & z & 1
\end{array}\right]
\left[
\begin{array}{cccc}
e_{1,1} & e_{1,2} & e_{1,3} & 0 \\
e_{2,1} & e_{2,2} & e_{2,3} & 0 \\
e_{3,1} & e_{3,2} & e_{3,3} & 0 \\
e_{4,1} & e_{4,2} & e_{4,3} & 1
\end{array}\right]
$

is the coordinate of the point in the frame $ {\cal F} _{\rm camera}$.

We note that, by our construction, the frame $ {\cal F} _{\rm camera}$ is an orthonormal frame (all vectors are unit vectors and are mutually perpendicular), and in this case the equations above simplify tremendously. In particular, all the denominators are $ ( {\vec u} \times {\vec v} ) \cdot {\vec w} = {\vec u} \cdot ( {\vec v} \times {\vec w} ) = 1$, and we can simplify the numerators utilizing the identities

$\displaystyle {\vec u} \times {\vec v}$ $\displaystyle = {\vec w}$    
$\displaystyle {\vec v} \times {\vec w}$ $\displaystyle = {\vec u}$    
$\displaystyle {\vec w} \times {\vec u}$ $\displaystyle = {\vec v}$    

to obtain

$\displaystyle e_{1,1}$ $\displaystyle = {\vec u} \, \cdot <1,0,0>$    
$\displaystyle e_{1,2}$ $\displaystyle = {\vec v} \, \cdot <1,0,0>$    
$\displaystyle e_{1,3}$ $\displaystyle = {\vec w} \, \cdot <1,0,0>$    
$\displaystyle e_{2,1}$ $\displaystyle = {\vec u} \, \cdot <0,1,0>$    
$\displaystyle e_{2,2}$ $\displaystyle = {\vec v} \, \cdot <0,1,0>$    
$\displaystyle e_{2,3}$ $\displaystyle = {\vec w} \, \cdot <0,1,0>$    
$\displaystyle e_{3,1}$ $\displaystyle = {\vec u} \, \cdot <0,0,1>$    
$\displaystyle e_{3,2}$ $\displaystyle = {\vec v} \, \cdot <0,0,1>$    
$\displaystyle e_{3,3}$ $\displaystyle = {\vec w} \, \cdot <0,0,1>$    
$\displaystyle e_{4,1}$ $\displaystyle = {\vec u} \cdot {\vec t}$    
$\displaystyle e_{4,2}$ $\displaystyle = {\vec v} \cdot {\vec t}$    
$\displaystyle e_{4,3}$ $\displaystyle = {\vec w} \cdot {\vec t}$    

The first few of these are extremely simple, as, for example $ {\vec u} \cdot <1,0,0>$ is just the first coordinate of $ {\vec u} $, etc.
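The dot-product formulas translate directly into code. The sketch below builds the $ 4 \times 4$ matrix from an orthonormal frame and applies it to points with the row-vector convention used above. The names `camera_matrix` and `to_camera_coords` are hypothetical, and the frame for a camera at $ (5,0,0)$ looking at the origin is written out by hand as an example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def camera_matrix(C, u, v, w):
    """Build the matrix of values e_{i,j} for an orthonormal frame (u, v, w, C)."""
    t = tuple(-c for c in C)                      # t = (0,0,0) - C
    targets = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), t]
    E = [[dot(u, a), dot(v, a), dot(w, a), 0.0] for a in targets]
    E[3][3] = 1.0                                 # last row is [e41, e42, e43, 1]
    return E

def to_camera_coords(P, E):
    """Multiply the row vector [x, y, z, 1] by E and drop the trailing 1."""
    row = (P[0], P[1], P[2], 1.0)
    return tuple(sum(row[k] * E[k][j] for k in range(4)) for j in range(3))

# Example frame: camera at (5, 0, 0) looking at the origin,
# so w = (C - A)/|C - A| = <1,0,0>; u and v complete a right-handed frame.
C = (5.0, 0.0, 0.0)
u = (0.0, 0.0, -1.0)
v = (0.0, 1.0, 0.0)
w = (1.0, 0.0, 0.0)

E = camera_matrix(C, u, v, w)
center = to_camera_coords((0.0, 0.0, 0.0), E)    # the center of attention A
apex = to_camera_coords(C, E)                    # the camera position itself
```

Here the camera position maps to the origin of the camera frame and the center of attention maps to a point five units down the negative $ w$ axis, which matches the requirement that the transformed camera look along the negative $ w$ axis.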


Summary

The camera transform is a Cartesian-frame-to-frame transform. This is combined with the viewing transform to give a transformation that converts a scene into image space.




This document maintained by Ken Joy


All contents copyright (c) 1996, 1997, 1998, 1999
Computer Science Department
University of California, Davis

All rights reserved.


Ken Joy
1999-12-06