Lie Group Derivatives

Introduction

This document provides some mathematical background on Lie groups and Lie algebras. In particular, this document covers how differentiation works on Lie groups, the Lie group exponential, left and right tangent spaces, the adjoint representations of a Lie group and its algebra, and the chain rule for Lie groups.

Differentiating Curves on Lie Groups

By definition, a Lie group is a differentiable manifold that is also a group (with smooth multiplication and inversion). A differentiable manifold is not generally diffeomorphic to Euclidean space (e.g. $\mathbb{R}^N$), but it is locally diffeomorphic to Euclidean space, in the sense that we can define a diffeomorphism $\phi_p$ between a neighborhood of any point $p$ in our group $G$ and an open subset of $\mathbb{R}^N$. We don't know how to take derivatives directly on the group $G$, but we do know how to do this on Euclidean space. So we define the derivative with respect to a particular choice of $\phi_p$ in the neighborhood of the point $p$ where we're taking the derivative. For a trajectory $g: \mathbb{R} \to G$ with $g(t) = p$, this looks like:

$$\frac{d^{\phi_p} g(t)}{dt} \equiv \frac{d}{dt}(\phi_p \circ g)(t)$$

Since $\phi_p \circ g$ is just a function from $\mathbb{R}$ to $\mathbb{R}^N$, we can take its derivative in the standard way, and the typical properties of derivatives (e.g. the chain rule) all apply. We just need a consistent way of picking $\phi_p$ for whatever point $p$ we're at. For a Lie group, one way to do this is to first define $\phi_e$ in a neighborhood of the identity element $e \in G$. We can then define $\phi_p$ in the neighborhood of any point $p$, since Lie group elements are guaranteed to have an inverse. We can do this like so:

$$\phi_p(g) = \phi_e(p^{-1} g)$$

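So our derivative at a point $p = g(t)$ along our trajectory is given by the following, where $p$ is held fixed and the derivative on the right is evaluated at $\epsilon = 0$:

$$\frac{d^{\phi_p} g(t)}{dt} = \frac{d}{d\epsilon}\phi_e\left(g(t)^{-1} g(t+\epsilon)\right)$$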

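This happens to be a nice choice of $\phi_p$ because it has a property called left invariance: the derivative is the same for a curve that is left multiplied by a constant element of $G$. Say that we're differentiating $f(t) = h\,g(t)$ for constant $h \in G$ and $g: \mathbb{R} \to G$. In this case, $p = f(t) = h\,g(t)$ and we have:

$$
\begin{aligned}
\frac{d^{\phi_p} f}{dt} &= \frac{d}{dt}\phi_p\left(h\,g(t)\right) = \frac{d}{dt}\phi_e\left(p^{-1} h\,g(t)\right) \\
&= \frac{d}{d\epsilon}\phi_e\left(\left(g(t)^{-1} h^{-1}\right) h\,g(t+\epsilon)\right) \\
&= \frac{d}{d\epsilon}\phi_e\left(g(t)^{-1} g(t+\epsilon)\right) = \frac{d^{\phi_{g(t)}} g}{dt}
\end{aligned}
$$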

Which demonstrates the left invariance.
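
The left invariance can also be checked numerically. The sketch below is only an illustration under stated assumptions: it works in $SO(3)$ with rotation matrices, takes $\phi_e$ to be the $SO(3)$ logarithm (one valid choice of chart), and estimates the derivative with central differences. The helpers `hat`, `vee`, `exp_so3`, `log_so3`, the trajectory `curve`, and the constant element `h` are all hypothetical names chosen for this example.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        # Second-order Taylor expansion avoids dividing by a tiny angle.
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

def log_so3(R):
    """Rotation vector of R, valid for rotation angles below pi."""
    v = vee(R - R.T) / 2.0              # equals sin(theta) * axis
    s = np.linalg.norm(v)               # sin(theta)
    c = (np.trace(R) - 1.0) / 2.0       # cos(theta)
    if s < 1e-12:
        return v
    return np.arctan2(s, c) / s * v

def curve(t):
    """A hypothetical smooth trajectory g(t) in SO(3)."""
    return (exp_so3(np.array([0.3 * t, 0.0, 0.0]))
            @ exp_so3(np.array([0.0, np.sin(t), 0.2 * t * t])))

def chart_derivative(traj, t, eps=1e-6):
    """Central-difference estimate of d/d(eps) log(g(t)^-1 g(t+eps)) at eps = 0."""
    g_inv = traj(t).T                   # the inverse of a rotation is its transpose
    forward = log_so3(g_inv @ traj(t + eps))
    backward = log_so3(g_inv @ traj(t - eps))
    return (forward - backward) / (2.0 * eps)

h = exp_so3(np.array([0.4, -1.1, 0.7]))   # constant group element
t0 = 0.8
d_g = chart_derivative(curve, t0)
d_hg = chart_derivative(lambda t: h @ curve(t), t0)
print(np.allclose(d_g, d_hg, atol=1e-6))
```

Any other constant `h` or evaluation time `t0` should give the same agreement, up to finite-difference error.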

Lie Group Exponential

Now, let's define a curve $\gamma: \mathbb{R} \to G$ with the properties that $(d^{\phi_p}\gamma/dt)$ as defined above is constant at all points along the curve and that $\gamma(0) = e$. We claim without proof that it's possible to uniquely define such a curve. Pick any $p$ in the range of $\gamma$ and define $\gamma_2(t) = p^{-1}\gamma(t)$. By left invariance, $\gamma_2$ has the same constant value for $(d^{\phi_p}\gamma_2/dt)$, and it passes through the identity at $t = \gamma^{-1}(p)$, that is, at a parameter value where $\gamma$ passes through $p$. By uniqueness, $\gamma_2$ is just $\gamma$ with a shifted parameter: $\gamma_2(t) = \gamma(t - \gamma^{-1}(p))$. Therefore, the curves have the same range, and $\gamma_2(0) = p^{-1}$ is also a member of the range of $\gamma$. Because we can do this for any $p$ in the range of $\gamma$, the range contains $p^{-1}q$ whenever it contains $p$ and $q$. We can conclude that the range of $\gamma$ is a subgroup of $G$: it contains the identity, every element has an inverse in it, and it is closed under composition. This is called a one-parameter subgroup of $G$.
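
As a concrete illustration, the subgroup properties can be checked numerically for rotations. This sketch assumes the standard fact that in $SO(3)$ the constant-derivative curve through the identity is given by Rodrigues' formula applied to $tX$. The names `hat`, `exp_so3`, `gamma`, and the vector `X` are hypothetical choices for this example.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        # Second-order Taylor expansion avoids dividing by a tiny angle.
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

X = np.array([0.3, -0.2, 0.5])   # hypothetical constant derivative

def gamma(t):
    """Curve through the identity that follows the constant derivative X."""
    return exp_so3(t * X)

s, t = 0.7, -1.9
identity_ok = np.allclose(gamma(0.0), np.eye(3))                # contains e
closure_ok = np.allclose(gamma(s) @ gamma(t), gamma(s + t))     # closed under composition
inverse_ok = np.allclose(np.linalg.inv(gamma(t)), gamma(-t))    # contains inverses
print(identity_ok, closure_ok, inverse_ok)
```

The closure check reflects the parameter-shift argument above: composing two points on the curve only adds their parameters.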

Now, we can go further by defining the Lie group exponential as the function $\exp: \mathbb{R}^N \to G$ satisfying $\exp(X) = \gamma(1)$, where $\gamma$ is defined as above with $X = (d^{\phi_p}\gamma/dt)$. In other words, the exponential tells you where you end up if you follow the same constant derivative for one unit of time through the Lie group. The standard exponential on real numbers falls out of this definition if you consider curves with constant derivatives that pass through the multiplicative identity:

$$\frac{d^{\phi_p} y}{dt} = \frac{d}{d\epsilon}\left[y(t)^{-1}\, y(t+\epsilon)\right] = k, \qquad y(0) = 1$$

With some rearrangement and variable substitution:

$$y'(t) = k\,y, \qquad y(0) = 1$$

which has the known solution $y(t) = \exp(kt)$, so following the constant derivative $k$ for one unit of time lands on $y(1) = \exp(k)$. Therefore, the Lie group exponential can be interpreted as a natural extension of the familiar concept of the exponential. In fact, for matrix Lie groups (Lie groups which have a matrix representation), the Lie group exponential is identical to the matrix exponential. Furthermore, just as the ordinary exponential has an inverse, the logarithm, there is a Lie group logarithm, which is the inverse of the Lie group exponential (at least in a neighborhood of the identity). Conceptually, the logarithm answers the question "what constant velocity would I have to move at to arrive at this point in one unit of time, starting from the identity?"
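
The "follow a constant derivative for one unit of time" picture can be sketched for the positive reals under multiplication. The function name `follow_constant_derivative` and the step count are assumptions made for this illustration, not part of any library.

```python
import math

def follow_constant_derivative(k, steps=100_000):
    """Start at the identity y = 1 and repeatedly apply the small
    right perturbation (1 + k * dt), covering one unit of time."""
    y = 1.0
    dt = 1.0 / steps
    for _ in range(steps):
        y = y * (1.0 + k * dt)
    return y

k = 0.7
approx = follow_constant_derivative(k)
exact = math.exp(k)
print(approx, exact)
```

With more steps the product converges to `math.exp(k)`, which is the limit $(1 + k/n)^n \to e^k$.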

It's worth noting that $\mathbb{R}^N$ is not formally the input to the exponential. The domain of the exponential is the Lie algebra $\mathfrak{g}$ corresponding to the Lie group $G$. It is an $N$-dimensional real vector space, so once a basis is chosen it is isomorphic to $\mathbb{R}^N$, and we casually treat the two as interchangeable.

The exponential and logarithm define a correspondence between a neighborhood of the identity in the Lie group and the Lie algebra, a vector space that we identify with $\mathbb{R}^N$. This connection lets us apply much of the mathematics developed for vector spaces to Lie groups, which can be incredibly powerful. For instance, one can define a Gaussian probability distribution on the Lie algebra $\mathfrak{so}(3)$ and take the exponential of samples from it to sample orientations in the Lie group $SO(3)$. Because they are commonly useful, we provide implementations of the exponential and logarithm for each Lie group we implement.
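
The Gaussian sampling idea might look like the following sketch. It does not use this library's API: `hat` and `exp_so3` are hypothetical NumPy helpers, and `sigma` and `mean_rotation` are arbitrary values picked for the example. Samples are drawn in $\mathbb{R}^3 \cong \mathfrak{so}(3)$ and applied as right perturbations of a mean orientation.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        # Second-order Taylor expansion avoids dividing by a tiny angle.
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

rng = np.random.default_rng(seed=0)
sigma = 0.1                                        # assumed per-axis std dev, in radians
mean_rotation = exp_so3(np.array([0.0, 0.0, np.pi / 4]))

# Draw zero-mean Gaussian samples in the Lie algebra, then map each to the group.
tangent_samples = rng.normal(loc=0.0, scale=sigma, size=(1000, 3))
rotations = np.array([mean_rotation @ exp_so3(w) for w in tangent_samples])

print(rotations.shape)
```

Every sample is a valid rotation matrix by construction, which is the main appeal of sampling in the algebra rather than perturbing matrix entries directly.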

Left and Right Tangents

So far, we have been working with the left invariant definition for the derivative:

$$\frac{d^{\phi_p} g(t)}{dt} = \frac{d}{d\epsilon}\phi_e\left(g(t)^{-1} g(t+\epsilon)\right)$$

However, we've been very vague about what we may pick for $\phi_e$. We know it needs to be a diffeomorphism defined on an open set around $e$. It turns out that the Lie group logarithm is a reasonable choice for this, so this is what we use. Hence:

$$\phi_p(g) = \log(p^{-1} g)$$

As noted before, this definition is left invariant, since we can multiply on the left by any constant member of the group without changing the value of the derivative. This is quite useful in practice: if $g$ represents a transform world_from_robot as a function of time, it doesn't matter where we decide the "world" frame is when we're computing derivatives, as long as our choice does not move relative to some agreed-upon world frame (remember that $h$ must be constant for the above to work). Confusingly, although this choice for $\phi_p$ yields left invariance, $(d^{\phi_p}g/dt)$ is often referred to as the "right tangent" or "right tangent space derivative" of $g$. This is because it can also be defined in terms of the right perturbation to $g$ that the trajectory is "following" at that moment. To first order in $\epsilon$:

$$g(t+\epsilon) = g(t)\exp\left[\frac{d^{\phi_p} g}{dt}\,\epsilon\right]$$

which when rearranged gives:

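$$\frac{d^{\phi_p} g}{dt} = \frac{\log\left[g(t)^{-1} g(t+\epsilon)\right]}{\epsilon}$$

In the limit as $\epsilon \to 0$, we get:

$$\frac{d^{\phi_p} g}{dt} = \frac{d}{d\epsilon}\log\left(g(t)^{-1} g(t+\epsilon)\right)$$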

which is equivalent to what we had before. For this reason, we henceforth refer to this definition of $\phi_p$ as $\phi_p^R$. The reason for the special notation is that we could equally well have chosen:

$$\phi_p(g) = \phi_p^L(g) = \log(g\,p^{-1})$$

As before, $g\,p^{-1}$ is in the neighborhood of the identity $e$ when $g$ is in the neighborhood of $p$, so this works. With this, we get:

$$\frac{d^{\phi_p^L} g}{dt} = \frac{d}{dt}\log(g\,p^{-1}) = \frac{d}{d\epsilon}\log\left(g(t+\epsilon)\,g(t)^{-1}\right)$$

$(d^{\phi_p^L}g/dt)$ is often referred to as the "left tangent" or "left tangent space derivative" of $g$. One can verify that it can be defined in terms of a left perturbation to $g$ that the trajectory is following, and that it has a right invariance property: multiplying the trajectory on the right by a constant element of the group does not affect its value.

In summary, we have left and right tangent space derivatives that have right and left invariance respectively.
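
Both invariance properties can be checked with finite differences. The sketch below assumes $SO(3)$ rotation matrices and uses hypothetical helper names (`hat`, `vee`, `exp_so3`, `log_so3`, `right_tangent`, `left_tangent`). The trajectory `curve` and the constant element `h` are arbitrary.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

def log_so3(R):
    """Rotation vector of R, valid for rotation angles below pi."""
    v = vee(R - R.T) / 2.0              # equals sin(theta) * axis
    s = np.linalg.norm(v)
    c = (np.trace(R) - 1.0) / 2.0
    if s < 1e-12:
        return v
    return np.arctan2(s, c) / s * v

def right_tangent(traj, t, eps=1e-6):
    """Estimate d/d(eps) log(g(t)^-1 g(t+eps)) at eps = 0."""
    g_inv = traj(t).T
    return (log_so3(g_inv @ traj(t + eps))
            - log_so3(g_inv @ traj(t - eps))) / (2.0 * eps)

def left_tangent(traj, t, eps=1e-6):
    """Estimate d/d(eps) log(g(t+eps) g(t)^-1) at eps = 0."""
    g_inv = traj(t).T
    return (log_so3(traj(t + eps) @ g_inv)
            - log_so3(traj(t - eps) @ g_inv)) / (2.0 * eps)

def curve(t):
    """A hypothetical smooth trajectory g(t) in SO(3)."""
    return (exp_so3(np.array([0.3 * t, 0.0, 0.0]))
            @ exp_so3(np.array([0.0, np.sin(t), 0.2 * t * t])))

h = exp_so3(np.array([0.4, -1.1, 0.7]))   # constant group element
t0 = 0.8

right_of_g = right_tangent(curve, t0)
right_of_hg = right_tangent(lambda t: h @ curve(t), t0)   # left multiplication
right_of_gh = right_tangent(lambda t: curve(t) @ h, t0)   # right multiplication
left_of_g = left_tangent(curve, t0)
left_of_gh = left_tangent(lambda t: curve(t) @ h, t0)

print(np.allclose(right_of_g, right_of_hg, atol=1e-6))   # left invariance
print(np.allclose(left_of_g, left_of_gh, atol=1e-6))     # right invariance
print(np.allclose(right_of_g, right_of_gh, atol=1e-3))   # not invariant in general
```

The last comparison is included to show that the right tangent generally does change under right multiplication, so the two invariance properties are not interchangeable.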

The Adjoint

In practice both the left and right tangent space derivatives happen to be useful in particular cases, so one might ask if it's possible to easily convert between the right tangent of a trajectory and the left tangent of the trajectory. If we want the left tangent space derivative, we can see:

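$$
\begin{aligned}
\frac{d^{\phi_p^L} g}{dt} &= \frac{d}{dt}\log(g\,p^{-1}) = \frac{d}{dt}\log\left((p\,p^{-1})\,g\,p^{-1}\right) \\
&= \frac{d}{dt}\log\left(p\,(p^{-1} g)\,p^{-1}\right) = \frac{d}{dt}\log\left(p\exp\left[\log(p^{-1} g)\right]p^{-1}\right)
\end{aligned}
$$

Applying the chain rule to $h(k(t))$, where $k(t) = \log(p^{-1} g)$ and $h(k) = \log(p\exp(k)\,p^{-1})$ are both functions to and from Euclidean space, and noting that $k = 0$ when $g = p$ so the Jacobian of $h$ is evaluated at $k = 0$, we have:

$$\frac{d^{\phi_p^L} g}{dt} = \left[\frac{d}{dk}\log\left(p\exp(k)\,p^{-1}\right)\right]\frac{d^{\phi_p^R} g}{dt}$$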

so we can convert from right to left tangent space by multiplying by this Jacobian matrix:

$$\mathrm{Ad}_g \equiv \left[\frac{d}{dk}\log\left(g\exp(k)\,g^{-1}\right)\right]_{k=0}$$

where we define this Jacobian to be the adjoint representation of the element $g \in G$. Technically, the adjoint representation is the map that produces such matrices given inputs from $G$. This map can also be defined as the derivative at the identity of the conjugation map $f(g) = p\,g\,p^{-1}$. Since this is a map from $G$ to $G$, one can show this by assuming that $g$ is a curve passing through the identity and using the chain rule as above to find that $(df/dg) = \mathrm{Ad}_p$. There are a number of properties of the adjoint that are worth noting:

$$\mathrm{Ad}_{g^{-1}} = \mathrm{Ad}_g^{-1}$$

$$\mathrm{Ad}_{gh} = \mathrm{Ad}_g\,\mathrm{Ad}_h$$

which is basically just a reminder that the adjoint is a representation of G. In the case where the Lie group has a matrix representation (which is true for all the Lie groups we use), one can simplify our definition to be:

$$\mathrm{Ad}_g X = g X g^{-1}; \qquad g \in G,\ X \in \mathfrak{g}$$

for any $X$ in the Lie algebra $\mathfrak{g}$ of $G$. Note that $X$ here is the matrix representation of the Lie algebra element, not just a vector in $\mathbb{R}^N$. Expressions for $\mathrm{Ad}_g$ are derived for all the Lie groups we implement since it is so commonly needed.
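
For a concrete case, the two definitions of the adjoint can be compared numerically on $SO(3)$. This is a sketch with hypothetical helper names (`hat`, `vee`, `exp_so3`, `log_so3`, `adjoint_numeric`), not this library's implementation. It relies on the standard fact that, in rotation-vector coordinates, $\mathrm{Ad}_R$ for $R \in SO(3)$ is the rotation matrix $R$ itself.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

def log_so3(R):
    """Rotation vector of R, valid for rotation angles below pi."""
    v = vee(R - R.T) / 2.0              # equals sin(theta) * axis
    s = np.linalg.norm(v)
    c = (np.trace(R) - 1.0) / 2.0
    if s < 1e-12:
        return v
    return np.arctan2(s, c) / s * v

def adjoint_numeric(g, eps=1e-6):
    """Central-difference Jacobian of k -> log(g exp(k) g^-1) at k = 0."""
    J = np.zeros((3, 3))
    for i in range(3):
        dk = np.zeros(3)
        dk[i] = eps
        plus = log_so3(g @ exp_so3(dk) @ g.T)
        minus = log_so3(g @ exp_so3(-dk) @ g.T)
        J[:, i] = (plus - minus) / (2.0 * eps)
    return J

g = exp_so3(np.array([0.4, -1.1, 0.7]))
h = exp_so3(np.array([-0.2, 0.3, 0.9]))
x = np.array([0.5, 0.1, -0.3])

Ad_g = adjoint_numeric(g)
jacobian_matches_rotation = np.allclose(Ad_g, g, atol=1e-6)
matrix_form_matches = np.allclose(vee(g @ hat(x) @ g.T), Ad_g @ x, atol=1e-6)
inverse_property = np.allclose(adjoint_numeric(g.T), np.linalg.inv(Ad_g), atol=1e-6)
product_property = np.allclose(adjoint_numeric(g @ h), Ad_g @ adjoint_numeric(h), atol=1e-6)
print(jacobian_matches_rotation, matrix_form_matches, inverse_property, product_property)
```

In practice a closed-form expression for $\mathrm{Ad}_g$ is preferable. The finite-difference Jacobian is mainly useful as a test oracle for such expressions.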

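It's often the case that we need to take the derivative of $\mathrm{Ad}_g$ with respect to time. We define the adjoint representation of the algebra, $\mathrm{ad}_X$, based on a curve $g(t)$ going through the identity:

$$\left.\frac{d\,\mathrm{Ad}_g}{dt}\right|_{g=e} = \mathrm{ad}_X; \qquad X = \left.\frac{dg}{dt}\right|_{g=e}$$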

It doesn't matter whether we use the right or left tangent space derivative here for g since they are equivalent at the identity, as one can verify by inspecting their definitions. The algebra adjoint is related to the Lie bracket:

$$\mathrm{ad}_X Y = [X, Y]$$

For matrix Lie groups, the bracket is the commutator of the matrix representations of the algebra elements:

$$\mathrm{ad}_X Y = [X, Y] = XY - YX$$

The algebra adjoint is somewhat commonly used, so we provide it as a static member function in our Lie group objects.
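
The relation between $\mathrm{ad}_X$, the derivative of $\mathrm{Ad}_g$, and the commutator can be illustrated on $\mathfrak{so}(3)$, where $\mathrm{ad}_x$ in vector coordinates is the skew-symmetric matrix of $x$ (so $\mathrm{ad}_x y = x \times y$). The helper names below (`hat`, `vee`, `exp_so3`, `adjoint_matrix`) are hypothetical and are not this library's static member function.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

def adjoint_matrix(g):
    """Build Ad_g column by column from Ad_g Y = vee(g hat(Y) g^-1)."""
    return np.column_stack([vee(g @ hat(e) @ g.T) for e in np.eye(3)])

x = np.array([0.3, -0.2, 0.5])
y = np.array([-0.1, 0.4, 0.25])

# d/dt Ad_{g(t)} at the identity, along the curve g(t) = exp(t x).
eps = 1e-5
ad_numeric = (adjoint_matrix(exp_so3(eps * x))
              - adjoint_matrix(exp_so3(-eps * x))) / (2.0 * eps)

# Matrix commutator of the so(3) representations of x and y.
bracket = hat(x) @ hat(y) - hat(y) @ hat(x)

ad_is_hat = np.allclose(ad_numeric, hat(x), atol=1e-8)
ad_is_bracket = np.allclose(ad_numeric @ y, vee(bracket), atol=1e-8)
bracket_is_cross = np.allclose(vee(bracket), np.cross(x, y))
print(ad_is_hat, ad_is_bracket, bracket_is_cross)
```

For other groups the structure is the same, but the closed form of $\mathrm{ad}_X$ depends on the chosen basis of the algebra.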

The Chain Rule

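Let's look at how one might differentiate the composition of two Lie group elements in right tangent space, as an example. In other words, take the time derivative of $f(t) = g(t)h(t)$, writing $d^R/dt$ as shorthand for the right tangent space derivative $d^{\phi_p^R}/dt$:

$$\frac{d^R f}{dt} = \frac{d^R}{dt}\left[g(t)h(t)\right] = \frac{d}{d\epsilon}\log\left[h(t)^{-1} g(t)^{-1} g(t+\epsilon)\,h(t+\epsilon)\right]$$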

To help ourselves, let's define a function c(δ1,δ2) like so:

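$$c(\delta_1, \delta_2) = \log\left[h(t)^{-1} g(t)^{-1} g(t+\delta_1)\,h(t+\delta_2)\right]$$

where $\delta_1$ and $\delta_2$ are functions of $\epsilon$. Taking the derivative with respect to $\epsilon$ with the multivariable chain rule gives:

$$\frac{d}{d\epsilon}c(\delta_1, \delta_2) = \mathrm{Ad}_{h^{-1}}\frac{d^R g}{dt}\frac{d\delta_1}{d\epsilon} + \frac{d^R h}{dt}\frac{d\delta_2}{d\epsilon}$$

The adjoint appears in the first term because, with $\delta_2 = 0$, the right perturbation of $g$ is conjugated by $h(t)^{-1}$, and $h^{-1}\exp(X)\,h = \exp(\mathrm{Ad}_{h^{-1}}X)$. Of course, the derivative of $c$ is what we want when $\delta_1 = \delta_2 = \epsilon$, so:

$$\frac{d^R f}{dt} = \mathrm{Ad}_{h^{-1}}\frac{d^R g}{dt} + \frac{d^R h}{dt}$$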

which is the chain rule in the right tangent space. There is also a chain rule for the left tangent space:

$$\frac{d^L f}{dt} = \frac{d^L g}{dt} + \mathrm{Ad}_g\frac{d^L h}{dt}$$

One can verify the left and right invariance properties using these expressions and assuming one of the group elements is constant.
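
Both chain rules can be sanity-checked with finite differences. The sketch below assumes $SO(3)$ rotation matrices, for which $\mathrm{Ad}_R = R$ in rotation-vector coordinates. The helper names and the two trajectories `g_curve` and `h_curve` are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rotation matrix for a rotation vector w (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        return np.eye(3) + W + 0.5 * W @ W
    return (np.eye(3)
            + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * W @ W)

def log_so3(R):
    """Rotation vector of R, valid for rotation angles below pi."""
    v = vee(R - R.T) / 2.0              # equals sin(theta) * axis
    s = np.linalg.norm(v)
    c = (np.trace(R) - 1.0) / 2.0
    if s < 1e-12:
        return v
    return np.arctan2(s, c) / s * v

def right_tangent(traj, t, eps=1e-6):
    """Estimate d/d(eps) log(f(t)^-1 f(t+eps)) at eps = 0."""
    f_inv = traj(t).T
    return (log_so3(f_inv @ traj(t + eps))
            - log_so3(f_inv @ traj(t - eps))) / (2.0 * eps)

def left_tangent(traj, t, eps=1e-6):
    """Estimate d/d(eps) log(f(t+eps) f(t)^-1) at eps = 0."""
    f_inv = traj(t).T
    return (log_so3(traj(t + eps) @ f_inv)
            - log_so3(traj(t - eps) @ f_inv)) / (2.0 * eps)

def g_curve(t):
    """First hypothetical trajectory in SO(3)."""
    return (exp_so3(np.array([0.3 * t, 0.0, 0.0]))
            @ exp_so3(np.array([0.0, np.sin(t), 0.2 * t * t])))

def h_curve(t):
    """Second hypothetical trajectory in SO(3)."""
    return exp_so3(np.array([0.5 * np.cos(t), 0.1 * t, -0.4 * t]))

def f_curve(t):
    """The composition f(t) = g(t) h(t)."""
    return g_curve(t) @ h_curve(t)

t0 = 0.8
g0, h0 = g_curve(t0), h_curve(t0)

# Right chain rule: Ad_{h^-1} is h^T for rotation matrices.
right_lhs = right_tangent(f_curve, t0)
right_rhs = h0.T @ right_tangent(g_curve, t0) + right_tangent(h_curve, t0)

# Left chain rule: Ad_g is g for rotation matrices.
left_lhs = left_tangent(f_curve, t0)
left_rhs = left_tangent(g_curve, t0) + g0 @ left_tangent(h_curve, t0)

print(np.allclose(right_lhs, right_rhs, atol=1e-6))
print(np.allclose(left_lhs, left_rhs, atol=1e-6))
```

Replacing `h_curve` with a constant makes its tangent vanish, which reduces the left chain rule to the right invariance property. Doing the same for `g_curve` reduces the right chain rule to left invariance.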