Most gameplay cameras follow a single world-space pivot, maintaining
a fixed offset from it. Sometimes, though, we want the pivot of interest
(POI) to appear at specific screen positions. For example, in some
lock-on systems, the character's POI is placed on the left side of the
screen while the enemy's POI is placed on the right. This post explains
how to translate or rotate the camera so the POI lands at the desired
screen-space position.
Problem Formulation
Formally, we'd like to translate or rotate the camera so that the world-space POI lands at the desired normalized screen-space position $(m, n)$, where $m, n \in [-\frac{1}{2}, \frac{1}{2}]$. Coordinate $(0, 0)$ represents the screen center and $(\frac{1}{2}, \frac{1}{2})$ is the top-right corner.
We can achieve this by either translating or rotating the camera, as
introduced below.
Camera Translation
Displacing the POI by camera translation is straightforward: keep the camera's rotation fixed, choose a desired forward distance $d$ from the camera to the pivot along the local X axis, then slide the camera along its local Y and Z axes until the POI reaches the target screen position. We follow Unreal's coordinate conventions, where the forward unit vector is the X axis, the right is the Y axis and the up is the Z axis. The desired forward distance must be specified because the view volume is a perspective frustum, so $d$ determines how screen offsets map to camera-space offsets.
The following figure shows the relationship between the desired screen-space position $m$, the horizontal field of view $\theta_x$ and the forward distance $d$, in the camera's local space projected onto the XY plane:
In short, the POI's Y-axis coordinate in the camera's local space must be $y = 2md\tan(\theta_x/2)$. Given this, we can translate the camera along its local Y axis so that this constraint is met.
The vertical case is analogous: with normalized screen Y-axis position $n$ and vertical field of view $\theta_y$, the required camera-space Z coordinate is $z = 2nd\tan(\theta_y/2)$, where $\tan(\theta_y/2) = \tan(\theta_x/2) / r$ and $r$ is the screen aspect ratio.
The following figure shows the desired camera translation for maintaining a screen-space constraint. Note that the rectangular zone restricts the area where the pivot position can land.
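To make this concrete, here is a minimal sketch of the translation solve in Unreal-style C++. It is not from the original post: the function name and parameters are my own, and it assumes the $(m, n) \in [-\frac{1}{2}, \frac{1}{2}]$ convention above.

// Minimal sketch (hypothetical helper, not from the original post): compute
// the camera location that puts the world-space POI at normalized screen
// position (m, n), with (0, 0) at the screen center and (0.5, 0.5) top-right.
FVector ComputeCameraLocation(const FVector& POI, const FRotator& CameraRotation,
                              float m, float n, float d,
                              float HorizontalFOVRadians, float AspectRatio)
{
    const FQuat Q = CameraRotation.Quaternion();
    const float TanHalfX = FMath::Tan(HorizontalFOVRadians * 0.5f);
    const float TanHalfY = TanHalfX / AspectRatio;

    // Camera-space position the POI must occupy: X forward, Y right, Z up.
    const FVector LocalPOI(d, 2.f * m * d * TanHalfX, 2.f * n * d * TanHalfY);

    // POI = CameraLocation + Q.RotateVector(LocalPOI), solved for CameraLocation.
    return POI - Q.RotateVector(LocalPOI);
}

Since the rotation is held fixed, this is a direct evaluation with no iteration: the only degree of freedom left is the camera's position.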
Camera Rotation
Camera rotation is more complex. We start by assuming the desired rotation's forward direction is $f = (\cos x \cos y, \cos x \sin y, \sin x)$, where $x$ is the pitch, $y$ is the yaw and roll is zero. Then, the inverse rotation matrix can be represented as:

$$R^{-1} = R^{T} = \begin{pmatrix} \cos x \cos y & \cos x \sin y & \sin x \\ -\sin y & \cos y & 0 \\ -\sin x \cos y & -\sin x \sin y & \cos x \end{pmatrix}$$

The inverse rotation matrix maps a world-space vector into the rotation's local space. Hence, it maps the directional vector from the camera position to the POI, i.e., $v = P_{POI} - P_{cam} = (A, B, C)$, to the local direction

$$R^{-1} v = \begin{pmatrix} A \cos x \cos y + B \cos x \sin y + C \sin x \\ -A \sin y + B \cos y \\ -A \sin x \cos y - B \sin x \sin y + C \cos x \end{pmatrix} \equiv \begin{pmatrix} S \\ Y_l \\ Z_l \end{pmatrix}$$
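As a sanity check, here is a small sketch (my own; the helper name is hypothetical) that evaluates this mapping numerically. The three rows are exactly the rows of $R^{-1}$ above.

// Sketch: map the world-space camera-to-POI vector V = (A, B, C) into the
// local space of a rotation with pitch X and yaw Y (radians, zero roll,
// Unreal axes: X forward, Y right, Z up).
FVector WorldDirToLocal(const FVector& V, float X, float Y)
{
    const float SinX = FMath::Sin(X), CosX = FMath::Cos(X);
    const float SinY = FMath::Sin(Y), CosY = FMath::Cos(Y);
    const float S  =  V.X * CosX * CosY + V.Y * CosX * SinY + V.Z * SinX; // forward
    const float Yl = -V.X * SinY        + V.Y * CosY;                     // right
    const float Zl = -V.X * SinX * CosY - V.Y * SinX * SinY + V.Z * CosX; // up
    return FVector(S, Yl, Zl);
}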
According to what we derived in the last section, the local direction must satisfy:

$$Y_l = 2am\,S, \qquad Z_l = 2bn\,S$$

where $a = \tan(\theta_x/2)$ and $b = \tan(\theta_y/2)$.
However, the above equations are highly non-linear and do not appear to admit a closed-form solution. The good news is that we can still solve them numerically. Defining the residuals $F_1 = 2amS - Y_l$ and $F_2 = 2bnS - Z_l$, Newton's method iterates on the pitch $x$ and yaw $y$ until both residuals vanish:
// Newton's method on pitch X and yaw Y (radians). A, B, C is the world-space
// vector from camera to POI; a, b are the half-FOV tangents; (m, n) is the
// desired screen position. Assumes X and Y hold an initial guess and that
// MaxIterations and Tolerance are defined.
for (int32 Iter = 0; Iter < MaxIterations; ++Iter)
{
    const float SinX = FMath::Sin(X), CosX = FMath::Cos(X);
    const float SinY = FMath::Sin(Y), CosY = FMath::Cos(Y);

    // Residuals F1 and F2 at the current (X, Y).
    const float S = A * CosX * CosY + B * CosX * SinY + C * SinX;
    const float F1 = 2.f * a * m * S - (-A * SinY + B * CosY);
    const float F2 = 2.f * b * n * S - (-A * SinX * CosY - B * SinX * SinY + C * CosX);

    // Compute Jacobian.
    const float DSDX = -A * SinX * CosY - B * SinX * SinY + C * CosX;
    const float DSDY = CosX * (-A * SinY + B * CosY);
    const float DF1DX = 2.f * a * m * DSDX;
    const float DF1DY = 2.f * a * m * DSDY - (-A * CosY - B * SinY);
    const float DF2DX = 2.f * b * n * DSDX - (-A * CosX * CosY - B * CosX * SinY - C * SinX);
    const float DF2DY = 2.f * b * n * DSDY - (A * SinX * SinY - B * SinX * CosY);

    // Bail out if the Jacobian is (nearly) singular.
    const float Det = DF1DX * DF2DY - DF1DY * DF2DX;
    if (FMath::Abs(Det) < 1e-3f)
    {
        break;
    }

    // Newton step: (Xn, Yn) = (X, Y) - J^{-1} * F, via the 2x2 inverse.
    const float Xn = X - (DF2DY * F1 - DF1DY * F2) / Det;
    const float Yn = Y - (-DF2DX * F1 + DF1DX * F2) / Det;

    // Evaluate the residuals at the new point to check for convergence.
    const float SinXn = FMath::Sin(Xn), CosXn = FMath::Cos(Xn);
    const float SinYn = FMath::Sin(Yn), CosYn = FMath::Cos(Yn);
    const float Sn = A * CosXn * CosYn + B * CosXn * SinYn + C * SinXn;
    const float F1n = 2.f * a * m * Sn - (-A * SinYn + B * CosYn);
    const float F2n = 2.f * b * n * Sn - (-A * SinXn * CosYn - B * SinXn * SinYn + C * CosXn);

    X = Xn;
    Y = Yn;
    if (FMath::Abs(F1n) < Tolerance && FMath::Abs(F2n) < Tolerance)
    {
        break;
    }
}
Alternatively, we can damp the update with the Levenberg–Marquardt (LM) method, which remains robust when the Jacobian is close to singular:

// Levenberg–Marquardt variant: same residuals and Jacobian as above, but the
// step solves the damped normal equations (J^T*J + Lambda*I) Delta = -J^T*F.
// Assumes Lambda starts at a small positive value (e.g. 1e-3f); the
// halve/double damping schedule below is a common choice, assumed here.
for (int32 Iter = 0; Iter < MaxIterations; ++Iter)
{
    const float SinX = FMath::Sin(X), CosX = FMath::Cos(X);
    const float SinY = FMath::Sin(Y), CosY = FMath::Cos(Y);

    const float S = A * CosX * CosY + B * CosX * SinY + C * SinX;
    const float F1 = 2.f * a * m * S - (-A * SinY + B * CosY);
    const float F2 = 2.f * b * n * S - (-A * SinX * CosY - B * SinX * SinY + C * CosX);

    // Compute Jacobian (identical to the Newton version).
    const float DSDX = -A * SinX * CosY - B * SinX * SinY + C * CosX;
    const float DSDY = CosX * (-A * SinY + B * CosY);
    const float DF1DX = 2.f * a * m * DSDX;
    const float DF1DY = 2.f * a * m * DSDY - (-A * CosY - B * SinY);
    const float DF2DX = 2.f * b * n * DSDX - (-A * CosX * CosY - B * CosX * SinY - C * SinX);
    const float DF2DY = 2.f * b * n * DSDY - (A * SinX * SinY - B * SinX * CosY);

    // Form J^T * J and J^T * F.
    const float JTJ_00 = DF1DX * DF1DX + DF2DX * DF2DX;
    const float JTJ_01 = DF1DX * DF1DY + DF2DX * DF2DY;
    const float JTJ_11 = DF1DY * DF1DY + DF2DY * DF2DY;
    const float JTF_0 = DF1DX * F1 + DF2DX * F2;
    const float JTF_1 = DF1DY * F1 + DF2DY * F2;

    // Solve the damped 2x2 system for the step.
    const float M00 = JTJ_00 + Lambda;
    const float M11 = JTJ_11 + Lambda;
    const float Det = M00 * M11 - JTJ_01 * JTJ_01;
    const float Xn = X - (M11 * JTF_0 - JTJ_01 * JTF_1) / Det;
    const float Yn = Y - (M00 * JTF_1 - JTJ_01 * JTF_0) / Det;

    // Evaluate the residuals at the candidate point.
    const float SinXn = FMath::Sin(Xn), CosXn = FMath::Cos(Xn);
    const float SinYn = FMath::Sin(Yn), CosYn = FMath::Cos(Yn);
    const float Sn = A * CosXn * CosYn + B * CosXn * SinYn + C * SinXn;
    const float F1n = 2.f * a * m * Sn - (-A * SinYn + B * CosYn);
    const float F2n = 2.f * b * n * Sn - (-A * SinXn * CosYn - B * SinXn * SinYn + C * CosXn);

    // Accept the step and relax damping if the residuals improved;
    // otherwise reject it and increase damping.
    if (F1n * F1n + F2n * F2n < F1 * F1 + F2 * F2)
    {
        X = Xn;
        Y = Yn;
        Lambda *= 0.5f;
        if (FMath::Abs(F1n) < Tolerance && FMath::Abs(F2n) < Tolerance)
        {
            break;
        }
    }
    else
    {
        Lambda *= 2.f;
    }
}
A rough comparison was done to examine the efficiency and accuracy of the two methods. The results show that the LM method takes fewer steps on average to converge, but the vanilla Newton method is slightly faster in terms of runtime because each iteration involves less computation.
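For completeness, here is a hedged sketch of how either solver might be driven; the function name and the initialization are my own assumptions, not from the original implementation:

// Sketch (hypothetical driver): solve for the camera rotation that puts the
// POI at normalized screen position (m, n). a and b are the half-FOV tangents.
FRotator SolvePOIRotation(const FVector& CameraLocation, const FVector& POI,
                          float m, float n, float a, float b)
{
    // World-space direction from camera to POI: the (A, B, C) in the solvers.
    const FVector V = POI - CameraLocation;
    const float A = V.X, B = V.Y, C = V.Z;

    // Start from the rotation that looks straight at the POI (m = n = 0).
    float X = FMath::Atan2(C, FMath::Sqrt(A * A + B * B)); // pitch
    float Y = FMath::Atan2(B, A);                          // yaw

    // ... run the Newton or LM iteration body from above, updating X and Y ...

    return FRotator(FMath::RadiansToDegrees(X), FMath::RadiansToDegrees(Y), 0.f);
}

Initializing from the look-at rotation keeps the starting guess close to the solution, so either iteration should converge in only a few steps.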
Alright, let's see what we can get for camera rotation:
Summary
It took me quite a while to work through this problem. I first tried to derive an analytic (closed-form) rotation, but the results consistently deviated from the ground truth. Suspecting a closed form might not be attainable, I switched to an iterative solver, which produced exact results. I hope this post is helpful; feel free to contact me if you have any questions or suggestions.