Solvers and numerical methods#

Solvers#

While Hermes-3 has access to any solver within BOUT++ (see the long list in the documentation), the development and optimisation efforts focuses on two solvers:

CVODE: A high-order time accurate solver. CVODE capability was initially developed for 3D turbulence simulations. By default, it’s configured to be matrix-free and preconditioned using physical preconditioners for electron conduction and neutral cross-field diffusion. It has an adaptive order algorithm and will try to go as high order in time as possible and take long timesteps. Please see the developer page for details. CVODE is required for 3D turbulence simulations and currently gives better results in 2D simulations. The BOUT++ implementation is here.
beuler: A first-order in time implementation of the backward Euler method. This is implemented using the SNES nonlinear solver in PETSc, and currently uses a GMRES linear solver and an ILU-type preconditioner in hypre. The algebraic preconditioner allows for extreme speedups over CVODE, and can already be used in 1D with good results. 2D performance is highly mixed and still in development. 3D use is not feasible as the matrix size precludes the use of algebraic preconditioners. Due to its first order nature, it will be less accurate for time-dependent problems, and it is mostly intended to solve for steady-state problems. The BOUT++ implementation is here.. See the higher level BOUT++ PETSc implementation here.

Both CVODE and beuler have an adaptive timestepper and will take as long a timestep as the Newton iteration failure rate will allow.

Configuring CVODE#

[solver]
type = cvode
use_precon = true   # Use the user-provided preconditioner
mxstep = 1e9        # Prevent timeout
mxorder = 3         # Limit to 3rd order
atol = 1e-7
rtol = 1e-5

use_precon = true: Enables the physics-based preconditioners
mxstep = 1e9: Prevents the solver from timing out if it takes too long to converge. 1e9 is the maximum number of nonlinear steps allowed for this setting.
mxorder = 3: Limits the maximum order of the solver to 3rd order. This is useful for preventing the solver from getting “stuck” at high order with a small timestep. You can alternatively set stablimdet = true to enable automatic stability limit detection to prevent this.
atol = 1e-7: Absolute tolerance for the solver. This is the maximum absolute error allowed and is required to ensure accuracy/stability when variables are small. Available in all solvers.
rtol = 1e-5: Relative tolerance for the solver, the primary driver of convergence. Available in all solvers.

Configuring beuler#

Here is the recommended configuration for the beuler solver using the Hypre Euclid preconditioner:

[solver]
type = beuler                 # Backward Euler steady-state solver
snes_type = newtonls          # Nonlinear solver
ksp_type = gmres              # Linear solver
max_nonlinear_iterations = 10 # Max Newton iterations until iteration failure
pc_type = hypre               # Preconditioner type
pc_hypre_type = euclid        # Hypre preconditioner type
lag_jacobian = 500            # Iterations between jacobian recalculations
atol = 1e-7                   # Absolute tolerance
rtol = 1e-5                   # Relative tolerance

There is an additional option stol which allows the Newton iteration to “converge” if the simulation has only changed by a small amount. This enables the simulation to iterate almost instantly if it detects that the results aren’t changing.

PETSc can print quite extensive performance diagnostics. These can be enabled by putting in the BOUT.inp options file:

[petsc]
log_view = true

This section can also be used to set other PETSc flags, just omitting the leading - from the PETSc option.

Numerics#

Slope (flux) limiters#

The choice of slope limiter is important: a dissipative limiter increases numerical dissipation but can substantially improve solver stability and robustness. See Selecting slope limiter and conduction method for how to change the limiter. The default is MC, which has a good balance between stability and accuracy.

Dynamics parallel to the magnetic field are solved using a 2nd-order slope-limiter method. For any number of fluids we solve the number density $n$, momentum along the magnetic field, $mnv_{||}$, and either pressure $p$ or energy $\mathcal{E}$. Here $m$ is the particle mass, so that $mn$ is the mass density. $v_{||}$ is the component of the flow velocity in the direction of the magnetic field, and is aligned with one of the mesh coordinate directions. All quantities are cell centered.

Cell edge values are reconstructed using a configurable limiter. The available compile-time choices of HERMES_SLOPE_LIMITER are:

MC: Monotonised central second-order limiter. This is the default.
VanAlbada: Symmetric second-order limiter using a smooth approximation to the usual limiter switch. This is often friendlier to nonlinear solvers and finite-difference Jacobian calculations because the reconstruction remains differentiable at extrema.
WENO3: Third-order Jiang-Shu WENO reconstruction on the same three-point stencil. This is generally smoother and less dissipative than TVD limiters, but it does not enforce strict monotonicity.
MinMod: More dissipative second-order TVD limiter.
Upwind: First-order upwind reconstruction. This is the most dissipative, but is sometimes useful for robustness tests.
Superbee: Aggressive second-order TVD limiter with reduced diffusion in sharp fronts, but it can be less robust.

If $f_i$ is the value of field $f$ at the center of cell $i$, then using MinMod slope limiter the gradient $g_i$ inside the cell is:

\[\begin{split}g_i = \left\{\begin{array}{ll} 0 & \textrm{if $\left(f_{i+1} - f_{i}\right) \left(f_{i} - f_{i-1}\right) < 0$} \\ f_{i+1} - f_{i} & \textrm{if $\left|f_{i+1} - f_{i}\right| < \left|f_{i} - f_{i-1}\right|$} \\ f_{i} - f_{i-1} & \textrm{Otherwise} \end{array}\right.\end{split}\]

The values at the left and right of cell $i$ are:

\[\begin{split}\begin{align} f_{i, R} &= f_i + g_i / 2 \nonumber \\ f_{i, L} &= f_i - g_i / 2 \end{align}\end{split}\]

This same reconstruction is performed for $n$, $v_{||}$ and $p$ (or $\mathcal{E}$). The flux $\Gamma_{i+1/2}$ between cell $i$ and $i+1$ is:

\[\Gamma_{f, i+1/2} = \frac{1}{2}\left(f_{i,R} v_{||i,R} + f_{i+1,L}v_{||i+1,L}\right) + \frac{a_{max,i+1/2}}{2}\left(f_{i,R} - f_{i+1,L}\right)\]

This includes a Lax flux term that penalises jumps across cell edges, and depends on the maximum local wave speed, $a_{max}$. Momentum is not reconstructed at cell edges; Instead the momentum flux is calculated from the cell edge densities and velocities:

\[\Gamma_{nv, i+1/2} = \frac{1}{2}\left(n_{i,R} v_{||i,R}^2 + n_{i+1,L}v_{||i+1,L}^2\right) + \frac{a_{max,i+1/2}}{2}\left(n_{i,R}v_{||i,R} - n_{i+1,L}v_{||i+1,R}\right)\]

The wave speeds, and so $a_{max}$, depend on the model being solved, so can be customised to e.g include or exclude Alfven waves or electron thermal speed. For simple neutral fluid simulations it is:

\[a_{max, i+1/2} = \max\left(\left|v_{||i}\right|, \left|v_{||i+1}\right|, \sqrt{\frac{\gamma p_{i}}{mn_i}}, \sqrt{\frac{\gamma p_{i+1}}{mn_{i+1}}}\right)\]

The divergence of the flux, and so the rate of change of $f$ in cell $i$, depends on the cell area perpendicular to the flow, $A_i$, and cell volume $V_i$:

\[\nabla\cdot\left(\mathbf{b} f v_{||}\right)_{i} = \frac{1}{V_i}\left[\frac{A_{i} + A_{i+1}}{2}\Gamma_{f, i+1/2} - \frac{A_{i-1} + A_{i}}{2}\Gamma_{f, i-1/2}\right]\]

Selecting slope limiter and conduction method#

Hermes-3 selects both the parallel advection slope limiter and the Div_par_K_Grad_par_mod conduction discretisation at compile time through CMake cache options:

cmake -S . -B build \
  -DHERMES_SLOPE_LIMITER=MC \
  -DHERMES_CONDUCTION_METHOD=Original

HERMES_SLOPE_LIMITER

Available values are MC, VanAlbada, WENO3, MinMod, Upwind, and Superbee.

HERMES_CONDUCTION_METHOD

Selects the parallel heat conduction discretisation used in Div_par_K_Grad_par_mod(). Available values are:

Original: Separately averages $K$, $J$, $g_{22}$, and $dy$ at the face and then multiplies them together.
ProductJK: Uses the same stencil as Original but averages $J K$ together before applying the face gradient.
Harmonic: Uses a harmonic average of the half-cell conductances $K J / (g_{22} dy)$. This better matches a series-resistance interpretation of two adjacent half-cells and can give noticeably different results when coefficients or cell sizes vary strongly.

The selected values are reported at startup in the Hermes-3 log and are also recorded in the options tree as slope_limiter and conduction_method.

Controlling Lax flux strength with sound_speed#

sound_speed

By default, the Lax flux strength is calculated based on the sound speed of each species individually. Instead, the sound_speed component can be used component to sum the pressure of all species and divide by the sum of the mass density of all species:

\[c_s = \sqrt{\sum_i P_i / \sum_i m_in_i}\]

This is set in the state as sound_speed. It is highly recommended to use this when evolving electron momentum, as the electrons will represent the fastest sound speed in the system, increasing the amount of numerical damping from the Lax flux. This is enabled by default.

NOTE:: When using this component without electron momentum being evolved, set electron_dynamics = false in the [sound_speed] section to prevent the electron timescale from being applied, or you will have too much damping.

Boundaries#

At boundaries along the magnetic field the flow of particles and energy are set by e.g. Bohm sheath boundary conditions or no-flow conditions. To ensure that the flux of particles is consistent with the boundary condition imposed at cell boundaries, fluxes of density $n$ and also $p$ or $\mathcal{E}$ are set to the simple mid-point flux:

\[\Gamma_{f, i+1/2}^{boundary} = f_{i+1/2}v_{||i+1/2}\]

where $f_{i+1/2} = \frac{1}{2}\left(f_{i} + f_{i+1}\right)$ and $v_{||i+1/2} = \frac{1}{2}\left(v_{||i} + v_{||i+1}\right)$ are the mid-point averages where boundary conditions are imposed. It has been found necessary to include dissipation in the momentum flux at the boundary, to suppress numerical overshoots due to the narrow boundary layers that can form:

\[\Gamma_{nv, i+1/2}^{boundary} = n_{i,R}v_{||i,R}v_{||i+1/2} + a_{max}\left[n_{i,R}v_{||i,R} - n_{i+1/2}v_{||i+1/2}\right]\]

where $n_{i+1/2} = \frac{1}{2}\left(n_{i} + n_{i+1}\right)$.

Solvers and numerical methods

Contents