Primary Variable Choice

Home    Publications    Blog    About Me   

27 Jan 2017

Though this article is meant to have some theoretical significance, the motivation came from trying to figure out how to answer the question ``oh, you use things like the ideal gas law?’’ when I was explaining that I do fluid dynamics like water/smoke. Part of my issue was that I wanted to describe why the ideal gas law and many of the laws learned in high school are somewhat odd or unsuited for simulation even though the incompressibility constraint can be derived from it. The reason why one might say it’s poorly suited is that we rarely use any of the variables used in it. Obviously most of the quantities are still secretly there, but we rarely have them handy to manipulate directly. In fact they’re all tertiary variables that can be factored out of computations completely.

Primary vs Tertiary Variables

A fundamental, and often overlooked part of physically-based simulation is the choice of which variables should be primary vs tertiary variables. Here I mean primary and tertiary to distinguish between variables that directly manipulated during the simulation, such as those required to define initial conditions, and variables that can be computed from those primary variables. Within the fluids literature the velocity of the fluid is often a primary variable while vorticity and pressure are tertiary and within the elastic literature displacement is often a primary variable while stress and strain are tertiary. In these systems that fit within the system of Lagrangian mechanics, phase space (position/displacement + velocity) is almost always set to be primary variables, though additional effects can be/often are added such as thermal conditions, gravitational forces, or external forces from user interactions.

The choice of whether a variable should be primary or tertiary can have widespread changes on the behavior of a system and when discretizing these systems it’s not always obvious whether adding more or less variables will be advantageous. For instance, when modeling the motion of a small body around a large body, the Stormer-Verlet integrator depends on only the position and accelerations of the small body and manages to maintain stable orbits when such things exist:

\[p_{i+1} = 2 p_i - p_{i-1} + \Delta t^2 \frac{GMm}{\|p_i - c\|^2}\frac{(c - p_i)}{\|p_i - c\|}\]

In fact the Stormer-Verlet preserves swept area as prescribed by Kepler’s second law precisely. This sort of system implicitly depends on position being differentiable and so for problems where discontinuities occur such as contact problems velocities are almost always left as primary variables to keep a closer eye on their values as they jump upon contacts.

In some sense it’s ideal to minimize the number of primary variables used as this eases numerical costs, but there’s a tradeoff that we have alluded to in the previous section between symmetry preservation and slack. Symmetries in dynamics such as conservation of momentum or incompressibility are fundamental to the formation of interesting and realistic behaviors, but they can make systems surprisingly fragile. For instance, computing the forces necessary to make sure a fluid remains incompressible is impossible when the domain it sits in changes volume, which can be completely unintentional such as when it’s coupled with an elastic body whose simulation method can’t guarantee incompressibility. From a theoretical perspective these are ridiculous situations that the simulator shouldn’t have to deal with, but for practical purposes we may want to still support these changes. A good answer, especially within this line of discussion, is to make pressure a primary variable in its own right. In fact, when the incompressible Navier-Stokes is written out as a Lagrangian, pressure takes its place as the Lagrange multiplier for pressure:

\[\mathcal{L}(u,p) = \ldots + p\nabla \cdot u.\]

We can also move the other way through something called Shur’s complement.

Augmenting Lagrangians and Shur’s Complement

The process of converting variables between primary and tertiary variables can be somewhat rigorously defined as the process of augmenting Lagrangians and applying Shur’s Complements. Though the latter is defiend for linear operators, its concept generalizes somewhat smoothly in my opinion.

Augmenting a Lagrangian is simply the act of adding a slack variable to incorporate a constraint. Here we’re taking our desired symmetries as constraints on the primary variables, so if we have a Lagrangian \(\mathcal{L}(p_1,\ldots,p_n)\) and a series of constraints \(C_i(p_1,\ldots,p_n)\) we augment it by adding slack variables \(\lambda_i\)

\[\bar{\mathcal{L}}(p_1,\ldots,p_n,\lambda_1,\ldots,\lambda_m)= \mathcal{L}(p_1,\ldots,p_n) + \sum_i \lambda_i C_i(p_1,\ldots,p_n).\]

As we can see this process adds some slack to our system by allowing for some error in our constraints that is weighed by \(\lambda_i\)

If we have a system with these sorts of slack variables we can generate a matrix according to the necessary conditions of minimizing the above:

\[\frac{\partial}{d(p_1,\ldots,p_n)} \bar{\mathcal{L}}(p_1,\ldots,p_n,\lambda_1,\ldots,\lambda_m)\] \[= \frac{\partial}{d(p_1,\ldots,p_n)}\mathcal{L}(p_1,\ldots,p_n) + \sum_i \lambda_i \nabla_{p_1,\ldots,p_n} C_i(p_1,\ldots,p_n).\]


\[\frac{\partial}{d\lambda_i} \bar{\mathcal{L}}(p_1,\ldots,p_n,\lambda_1,\ldots,\lambda_m) = C_i(p_1,\ldots,p_n).\]

In matrix form this constructs

\[\begin{bmatrix} \frac{\partial}{d(p_1,\ldots,p_n)}\mathcal{L} & \left(\nabla_{p_1,\ldots,p_n} C(p_1,\ldots,p_n)\right)^*\\ C & 0 \end{bmatrix} \begin{bmatrix} (p_1,\ldots,p_n)\\ (\lambda_1,\ldots,\lambda_m) \end{bmatrix} = \begin{bmatrix} 0\\ 0 \end{bmatrix}\]

where we’re treating multiplication by a function as function evaluation.

If we add a little regularizer to make this invertible (Tychonoff style) and evaluate/linearize the above we get some blocked matrix in the standard form for Shur’s complement

\[\begin{bmatrix} A & B\\ C & D \end{bmatrix} \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} a\\ 0 \end{bmatrix}\]

where \(y\) is eliminated via

\[(A - BD^{-1}C)x = a - BD^{-1}b.\]
comments powered by Disqus