Theorem
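Let $x \in \mathbb{R}^n$, let $A$ be a constant matrix, and let $f, g$ be differentiable functions. Derivatives of scalar functions with respect to $x$ are written as column vectors unless stated otherwise.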
Basic Rules
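Linearity (for scalar constants $a, b$):
$$\frac{\partial}{\partial x}(af + bg) = a\frac{\partial f}{\partial x} + b\frac{\partial g}{\partial x}$$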
Product Rule (scalar):
$$\frac{\partial}{\partial x}(fg) = f\frac{\partial g}{\partial x} + g\frac{\partial f}{\partial x}$$
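Linear and quadratic forms (here $a \in \mathbb{R}^n$ is a constant vector):
$$\frac{\partial}{\partial x}(a^T x) = a$$
$$\frac{\partial}{\partial x}(x^T A) = A$$
$$\frac{\partial}{\partial x}(x^T x) = 2x$$
$$\frac{\partial}{\partial x}(x^T A x) = (A + A^T)x$$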
If $A$ is symmetric ($A = A^T$):
$$\frac{\partial}{\partial x}(x^T A x) = 2Ax$$
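The quadratic-form identity can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`grad_fd`, `matvec`, `quad_form`) and the test values of `A` and `x0` are illustrative assumptions, not part of the rules above. It compares a central-difference gradient with $(A + A^T)x$ for a non-symmetric $A$, where the shortcut $2Ax$ no longer applies.

```python
def grad_fd(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def quad_form(M, x):
    """Scalar x^T M x."""
    return sum(xi * yi for xi, yi in zip(x, matvec(M, x)))


# Hypothetical test data: A is deliberately non-symmetric.
A = [[1.0, 4.0],
     [2.0, 3.0]]
x0 = [0.5, -1.5]

n = len(A)
A_plus_At = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]

analytic = matvec(A_plus_At, x0)                   # (A + A^T) x
numeric = grad_fd(lambda x: quad_form(A, x), x0)   # finite differences
two_Ax = [2 * v for v in matvec(A, x0)]            # only valid if A = A^T

print("analytic:", analytic)
print("numeric: ", numeric)
print("2Ax:     ", two_Ax)
```

The analytic and numeric gradients should agree to within finite-difference error, while `two_Ax` differs because this `A` is not symmetric.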
Chain Rule
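For a composition $f(g(x))$:
$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial g}\frac{\partial g}{\partial x}$$
In this form $\partial f/\partial g$ is a row vector and $\partial g/\partial x$ is the Jacobian of $g$, so the product is a row vector; transpose it to obtain the gradient as a column vector.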
Examples
Example 1: Linear form
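Let $a = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $f(x) = a^T x = 2x_1 + 3x_2$
$$\frac{\partial f}{\partial x} = a = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$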
Example 2: Quadratic form
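Let $A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$ and $f(x) = x^T A x = 2x_1^2 + 2x_1 x_2 + 3x_2^2$
Since $A$ is symmetric:
$$\frac{\partial f}{\partial x} = 2Ax = 2\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 4x_1 + 2x_2 \\ 2x_1 + 6x_2 \end{bmatrix}$$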
Example 3: Chain rule
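Let $g(x) = Ax$ where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, and $f(g) = g^T g$
$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial g}\frac{\partial g}{\partial x} = (2g^T)(A) = 2(Ax)^T A = 2x^T A^T A$$
Evaluating at $x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$:
$$
\begin{aligned}
A^T A &= \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 10 & 14 \\ 14 & 20 \end{bmatrix} \\
\left.\frac{\partial f}{\partial x}\right|_{x = [1,\,1]^T} &= 2\begin{bmatrix} 1 & 1 \end{bmatrix}\begin{bmatrix} 10 & 14 \\ 14 & 20 \end{bmatrix} = 2\begin{bmatrix} 24 & 34 \end{bmatrix}
\end{aligned}
$$
As a column vector: $\begin{bmatrix} 48 \\ 68 \end{bmatrix}$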
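Example 3 can be cross-checked the same way. This is a small sketch under the assumption that $A$ has rows $(1, 2)$ and $(3, 4)$, which is the reading consistent with $A^T A$ having rows $(10, 14)$ and $(14, 20)$; the function names are hypothetical. It evaluates $2A^T A x$ at $x = (1, 1)$ and compares it with a central-difference gradient of $f(x) = (Ax)^T(Ax)$.

```python
def grad_fd(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


# Assumed reading of the matrix in Example 3.
A = [[1.0, 2.0],
     [3.0, 4.0]]


def g(x):
    """Inner function g(x) = A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def f(x):
    """Outer function applied to g: f = g^T g."""
    return sum(gi * gi for gi in g(x))


def grad_analytic(x):
    """Column-vector gradient 2 A^T (A x), the transpose of 2 x^T A^T A."""
    gx = g(x)
    return [2 * sum(A[i][j] * gx[i] for i in range(len(A)))
            for j in range(len(x))]


x0 = [1.0, 1.0]
analytic = grad_analytic(x0)   # [48.0, 68.0]
numeric = grad_fd(f, x0)

print("analytic:", analytic)
print("numeric: ", numeric)
```

The analytic result matches the column vector above, and the finite-difference estimate should agree with it up to small rounding error.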