Solvers and numerical methods#
Solvers#
While Hermes-3 has access to any solver within BOUT++ (see the long list in the documentation), the development and optimisation efforts focuses on two solvers:
- CVODE
A high-order time accurate solver. CVODE capability was initially developed for 3D turbulence simulations. By default, it’s configured to be matrix-free and preconditioned using physical preconditioners for electron conduction and neutral cross-field diffusion. It has an adaptive order algorithm and will try to go as high order in time as possible and take long timesteps. Please see the developer page for details. CVODE is required for 3D turbulence simulations and currently gives better results in 2D simulations. The BOUT++ implementation is here.
- beuler
A first-order in time implementation of the backward Euler method. This is implemented using the SNES nonlinear solver in PETSc, and currently uses a GMRES linear solver and an ILU-type preconditioner in hypre. The algebraic preconditioner allows for extreme speedups over CVODE, and can already be used in 1D with good results. 2D performance is highly mixed and still in development. 3D use is not feasible as the matrix size precludes the use of algebraic preconditioners. Due to its first order nature, it will be less accurate for time-dependent problems, and it is mostly intended to solve for steady-state problems. The BOUT++ implementation is here.. See the higher level BOUT++ PETSc implementation here.
Both CVODE and beuler have an adaptive timestepper and will take as long a timestep as the Newton iteration failure rate will allow.
Configuring CVODE#
[solver]
type = cvode
use_precon = true # Use the user-provided preconditioner
mxstep = 1e9 # Prevent timeout
mxorder = 3 # Limit to 3rd order
atol = 1e-7
rtol = 1e-5
- use_precon = true
Enables the physics-based preconditioners
- mxstep = 1e9
Prevents the solver from timing out if it takes too long to converge. 1e9 is the maximum number of nonlinear steps allowed for this setting.
- mxorder = 3
Limits the maximum order of the solver to 3rd order. This is useful for preventing the solver from getting “stuck” at high order with a small timestep. You can alternatively set
stablimdet = trueto enable automatic stability limit detection to prevent this.- atol = 1e-7
Absolute tolerance for the solver. This is the maximum absolute error allowed and is required to ensure accuracy/stability when variables are small. Available in all solvers.
- rtol = 1e-5
Relative tolerance for the solver, the primary driver of convergence. Available in all solvers.
Configuring beuler#
Here is the recommended configuration for the beuler solver using the
Hypre Euclid preconditioner:
[solver]
type = beuler # Backward Euler steady-state solver
snes_type = newtonls # Nonlinear solver
ksp_type = gmres # Linear solver
max_nonlinear_iterations = 10 # Max Newton iterations until iteration failure
pc_type = hypre # Preconditioner type
pc_hypre_type = euclid # Hypre preconditioner type
lag_jacobian = 500 # Iterations between jacobian recalculations
atol = 1e-7 # Absolute tolerance
rtol = 1e-5 # Relative tolerance
There is an additional option stol which allows the Newton iteration
to “converge” if the simulation has only changed by a small amount.
This enables the simulation to iterate almost instantly if it detects that the
results aren’t changing.
PETSc can print quite extensive performance diagnostics. These can be enabled by putting in the BOUT.inp options file:
[petsc]
log_view = true
This section can also be used to set other PETSc flags, just omitting
the leading - from the PETSc option.
Numerics#
Slope (flux) limiters#
The choice of slope limiter is important: a dissipative limiter increases
numerical dissipation but can substantially improve solver stability and
robustness. See Selecting slope limiter and conduction method for how to change the
limiter. The default is MC, which has a good balance between
stability and accuracy.
Dynamics parallel to the magnetic field are solved using a 2nd-order slope-limiter method. For any number of fluids we solve the number density \(n\), momentum along the magnetic field, \(mnv_{||}\), and either pressure \(p\) or energy \(\mathcal{E}\). Here \(m\) is the particle mass, so that \(mn\) is the mass density. \(v_{||}\) is the component of the flow velocity in the direction of the magnetic field, and is aligned with one of the mesh coordinate directions. All quantities are cell centered.
Cell edge values are reconstructed using a configurable limiter. The
available compile-time choices of HERMES_SLOPE_LIMITER are:
MCMonotonised central second-order limiter. This is the default.
VanAlbadaSymmetric second-order limiter using a smooth approximation to the usual limiter switch. This is often friendlier to nonlinear solvers and finite-difference Jacobian calculations because the reconstruction remains differentiable at extrema.
WENO3Third-order Jiang-Shu WENO reconstruction on the same three-point stencil. This is generally smoother and less dissipative than TVD limiters, but it does not enforce strict monotonicity.
MinModMore dissipative second-order TVD limiter.
UpwindFirst-order upwind reconstruction. This is the most dissipative, but is sometimes useful for robustness tests.
SuperbeeAggressive second-order TVD limiter with reduced diffusion in sharp fronts, but it can be less robust.
If \(f_i\) is the value of field \(f\) at the center of cell \(i\), then using MinMod slope limiter the gradient \(g_i\) inside the cell is:
The values at the left and right of cell \(i\) are:
This same reconstruction is performed for \(n\), \(v_{||}\) and \(p\) (or \(\mathcal{E}\)). The flux \(\Gamma_{i+1/2}\) between cell \(i\) and \(i+1\) is:
This includes a Lax flux term that penalises jumps across cell edges, and depends on the maximum local wave speed, \(a_{max}\). Momentum is not reconstructed at cell edges; Instead the momentum flux is calculated from the cell edge densities and velocities:
The wave speeds, and so \(a_{max}\), depend on the model being solved, so can be customised to e.g include or exclude Alfven waves or electron thermal speed. For simple neutral fluid simulations it is:
The divergence of the flux, and so the rate of change of \(f\) in cell \(i\), depends on the cell area perpendicular to the flow, \(A_i\), and cell volume \(V_i\):
Selecting slope limiter and conduction method#
Hermes-3 selects both the parallel advection slope limiter and the
Div_par_K_Grad_par_mod conduction discretisation at compile time
through CMake cache options:
cmake -S . -B build \
-DHERMES_SLOPE_LIMITER=MC \
-DHERMES_CONDUCTION_METHOD=Original
HERMES_SLOPE_LIMITERAvailable values are
MC,VanAlbada,WENO3,MinMod,Upwind, andSuperbee.HERMES_CONDUCTION_METHODSelects the parallel heat conduction discretisation used in
Div_par_K_Grad_par_mod(). Available values are:OriginalSeparately averages \(K\), \(J\), \(g_{22}\), and \(dy\) at the face and then multiplies them together.
ProductJKUses the same stencil as
Originalbut averages \(J K\) together before applying the face gradient.HarmonicUses a harmonic average of the half-cell conductances \(K J / (g_{22} dy)\). This better matches a series-resistance interpretation of two adjacent half-cells and can give noticeably different results when coefficients or cell sizes vary strongly.
The selected values are reported at startup in the Hermes-3 log and are
also recorded in the options tree as slope_limiter and
conduction_method.
Controlling Lax flux strength with sound_speed#
sound_speed
By default, the Lax flux strength is calculated based on the sound speed
of each species individually. Instead, the sound_speed component can be used
component to sum the pressure of all species and divide by the sum of the mass density
of all species:
This is set in the state as sound_speed. It is highly recommended to use
this when evolving electron momentum, as the electrons will represent the fastest
sound speed in the system, increasing the amount of numerical damping from the Lax flux.
This is enabled by default.
- NOTE:
When using this component without electron momentum being evolved, set
electron_dynamics = falsein the[sound_speed]section to prevent the electron timescale from being applied, or you will have too much damping.
Boundaries#
At boundaries along the magnetic field the flow of particles and energy are set by e.g. Bohm sheath boundary conditions or no-flow conditions. To ensure that the flux of particles is consistent with the boundary condition imposed at cell boundaries, fluxes of density \(n\) and also \(p\) or \(\mathcal{E}\) are set to the simple mid-point flux:
where \(f_{i+1/2} = \frac{1}{2}\left(f_{i} + f_{i+1}\right)\) and \(v_{||i+1/2} = \frac{1}{2}\left(v_{||i} + v_{||i+1}\right)\) are the mid-point averages where boundary conditions are imposed. It has been found necessary to include dissipation in the momentum flux at the boundary, to suppress numerical overshoots due to the narrow boundary layers that can form:
where \(n_{i+1/2} = \frac{1}{2}\left(n_{i} + n_{i+1}\right)\).