Estimating Seismic Moment Tensors based on Bayesian Machine Learning

Andreas Steinberg¹*, Hannes Vasyura-Bathke²*, Peter Gaebler¹, Matthias Ohrnberger² and Lars Ceranna¹

1 Federal Institute for Geosciences and Natural Resources (BGR), B4.3 Federal Seismological Survey, Nuclear-Test Ban

2 University of Potsdam, Institute for Earth and Environmental Sciences

* equal contribution

Fast estimation of seismic moment tensors is necessary for:

  • Earthquake early warning systems, e.g. mechanism-based ShakeMaps
  • Large catalogs
  • Monitoring of geothermal stimulations (SEIGER project)

Machine Learning Terminology

  • Training input: images with data to be learned
  • Labels: associated (source) parameters
  • Network: layered architecture of neurons, which can activate connected neurons
  • Output: model able to predict labels given unseen data

Machine Learning - what has been done

  • detection and location: e.g. Kriegerowski et al., 2019
  • first motion polarity: e.g. Ross et al. 2019
  • full waveform DC mechanism determination: e.g. Kuang et al. 2021

Machine Learning - what has not been done

  • full MT
  • data driven consideration of uncertainties
  • recent advances allow learning distributions of weights (Bayesian Neural Networks, BNN)

Training Input

  • training on synthetic waveforms from a volume of potential source locations
  • advantage: control over source and targets
  • using pre-calculated Green's function stores (Pyrocko and QSEIS)
  • most of ML optimised for images
  • amplitudes vary with source-receiver distance
  • → waveforms are not used directly for training but are converted to images

Pre-processing of waveforms

  • filter waveforms (bandpass 1-5.4 Hz)
  • cut 1s before first phase onset
  • cut 3.5s after first phase onset
  • sort waveforms by station and channel name
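
The cutting, sorting and normalization steps above can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the helper names `cut_window` and `build_image` are hypothetical, the band-pass filter (1-5.4 Hz) is assumed to have been applied beforehand, and scaling the whole image by its largest absolute amplitude is an assumed normalization.

```python
# Hypothetical helpers; the band-pass filter is assumed to be applied already.

def cut_window(samples, sampling_rate, onset_time, pre=1.0, post=3.5):
    """Return the samples from `pre` s before to `post` s after the onset.

    `onset_time` is given in seconds relative to the first sample.
    """
    start = int(round((onset_time - pre) * sampling_rate))
    stop = int(round((onset_time + post) * sampling_rate))
    if start < 0 or stop > len(samples):
        raise ValueError("trace too short for the requested window")
    return samples[start:stop]


def build_image(traces, sampling_rate):
    """Stack cut traces into a 2-D list with one row per (station, channel).

    `traces` maps (station, channel) -> (samples, onset_time). Rows are
    sorted by station and then channel name. The image is scaled by its
    largest absolute amplitude (an assumed normalization).
    """
    rows = [
        cut_window(samples, sampling_rate, onset)
        for _, (samples, onset) in sorted(traces.items())
    ]
    peak = max(abs(v) for row in rows for v in row)
    if peak == 0.0:
        return rows
    return [[v / peak for v in row] for row in rows]


# Toy example: two stations, one channel each, 10 Hz sampling, 10 s traces.
rate = 10.0
traces = {
    ("STB", "Z"): ([float(i) for i in range(100)], 5.0),
    ("STA", "Z"): ([2.0 * i for i in range(100)], 4.0),
}
image = build_image(traces, rate)
```

Each row of `image` holds 4.5 s of data for one station component, so all events share the same image shape regardless of source distance.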

Normalized input image for a single earthquake source

Labels

Using the lune parameterization of the MT

  • only 5 unique parameters (κ, σ, h, w, v)
  • Cartesian coordinate system
  • v=0 and w=0 → double-couple (DC) source
  • no scaling with ρ because of normalized input

Network architecture and variational inference

Approximate the Kullback-Leibler divergence by drawing Monte Carlo samples $\mathbf{w}^{(i)}$ from the variational distribution $q(\mathbf{w} \lvert \boldsymbol{\theta})$, with prior distribution $p(\mathbf{w})$ and data likelihood $p(\mathcal{D} \lvert \mathbf{w})$:

$$ \mathcal{F}(\mathcal{D},\boldsymbol{\theta}) \approx {1 \over N} \sum_{i=1}^N \left[ \log q(\mathbf{w}^{(i)} \lvert \boldsymbol{\theta}) - \log p(\mathbf{w}^{(i)}) - \log p(\mathcal{D} \lvert \mathbf{w}^{(i)})\right] $$

$$ \boldsymbol{\theta} = (\boldsymbol{\mu}, \boldsymbol{\sigma}) $$
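
To make the Monte Carlo estimate concrete, the following toy sketch evaluates it for a model with a single weight, y = w·x. It only illustrates the estimator, not the network used here: the function names, the standard normal prior, the Gaussian likelihood and the noise level are assumptions chosen for the example.

```python
import math
import random


def log_normal(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * std**2) - 0.5 * ((x - mean) / std) ** 2


def free_energy(data, mu, sigma, n_samples=1000, noise_std=0.1, seed=0):
    """Monte Carlo estimate of F(D, theta) for a one-weight model y = w * x.

    theta = (mu, sigma) parameterizes q(w | theta) = N(mu, sigma^2).
    The prior p(w) is assumed to be N(0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(mu, sigma)                     # w^(i) ~ q(w | theta)
        log_q = log_normal(w, mu, sigma)             # log q(w^(i) | theta)
        log_prior = log_normal(w, 0.0, 1.0)          # log p(w^(i))
        log_lik = sum(log_normal(y, w * x, noise_std) for x, y in data)
        total += log_q - log_prior - log_lik
    return total / n_samples


# Toy data generated with a true weight of 2.0 (no noise added).
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5)]
f_good = free_energy(data, mu=2.0, sigma=0.05)
f_bad = free_energy(data, mu=0.0, sigma=0.05)
```

A variational distribution centred on the true weight gives a much lower value than one centred on zero. Training minimizes exactly this quantity with respect to μ and σ.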

Bayes by Backprop

Disclaimer: this is a sketch!

Single Bayesian Neural Network design

  • First two layers 1xN kernel over time
  • Third layer 3x1 kernel over station components
  • Pooling layer resampling 3x4 (stations and time)
  • fully connected layer
  • non-activated output layer

Design inspired by Kriegerowski et al., 2019

Single Bayesian Neural Network design

  • loss function: negative log-likelihood
  • optimizer: Adam; ReLU activations
  • Flipout layers (Wen et al., 2018), Gaussian priors from backpropagation
  • 0.2 dropout to simulate missing data
  • learning rate and batch size are important

Variational inference from multiple BNNs

  • learn individual BNNs for a volume of possible source locations (grid)
  • learn individual BNNs for different Earth structure models
  • cut out input waveforms with respect to timing and location uncertainties
  • ➜ errors in the Earth structure, timing and location can be taken into account

Test and validation with unseen data of the 2019 Ridgecrest sequence

Variational inference (Bayesian) Machine Learning approach

Training one BNN for each grid point

Figure: stations and grid points

Some statistics

  • 41 stations up to 150 km away
  • grid horizontally 10.5 by 10.5 km; step size: 1.5 km
  • grid vertically from 2 to 10 km; step size: 2 km
  • discretization: 0.1π for κ; 0.2 for σ, h and w; 0.02 for v
  • 171,600 waveforms per source location or individual BNN
  • 588 BNNs (196 grid points times three Earth structures) are trained
  • including "Mojave" Earth structures from SCEDC
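
The grid and network counts can be checked with a short sketch. The half-open axis convention (end point excluded) and the relative coordinates are assumptions made here because they reproduce the stated 196 grid points. The actual grid layout may differ.

```python
def axis(start, stop, step):
    """Grid nodes from `start` up to, but not including, `stop`."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n)]


# Coordinates in km relative to a grid corner (assumed convention).
east = axis(0.0, 10.5, 1.5)     # 7 nodes
north = axis(0.0, 10.5, 1.5)    # 7 nodes
depth = axis(2.0, 10.0, 2.0)    # 4 nodes: 2, 4, 6 and 8 km

grid_points = [(e, n, z) for e in east for n in north for z in depth]
earth_models = 3

n_grid = len(grid_points)       # 7 * 7 * 4 grid points
n_bnn = n_grid * earth_models   # one BNN per grid point and Earth model
```

With these assumptions `n_grid` is 196 and `n_bnn` is 588, matching the numbers above.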

Comparison of predictions with SCEDC catalog moment tensors

8 reference mechanisms available (red)

Figure: grid points

Omega angle measure of MT similarity

\[ d = \frac{1}{2}\left[1-\frac{U_{1}\cdot U_{2}}{\lVert U_{1}\rVert \, \lVert U_{2}\rVert}\right] = \frac{1}{2}\left[1-\frac{\sum_{ij} U_{1,ij}\, U_{2,ij}}{\left(\sum_{ij} U_{1,ij}^{2}\right)^{\frac{1}{2}} \left(\sum_{ij} U_{2,ij}^{2}\right)^{\frac{1}{2}}}\right] \]

after Tape and Tape, 2012 and Cesca et al., 2014
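
As a worked example of this similarity measure, the sketch below evaluates d for two moment tensors given as 3x3 nested lists. The function name `mt_distance` and the example tensors are illustrative and not taken from the accompanying software.

```python
import math


def mt_distance(u1, u2):
    """Normalized distance d in [0, 1] between two 3x3 moment tensors."""
    dot = sum(a * b for r1, r2 in zip(u1, u2) for a, b in zip(r1, r2))
    norm1 = math.sqrt(sum(a * a for row in u1 for a in row))
    norm2 = math.sqrt(sum(b * b for row in u2 for b in row))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("moment tensors must be non-zero")
    return 0.5 * (1.0 - dot / (norm1 * norm2))


# A double couple, a scaled copy of it, and its polarity-reversed twin.
dc = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
scaled_dc = [[5.0 * v for v in row] for row in dc]
reversed_dc = [[-v for v in row] for row in dc]

d_same = mt_distance(dc, scaled_dc)
d_opposite = mt_distance(dc, reversed_dc)
```

Because of the normalization, d does not depend on the scalar moment: identical mechanisms give d close to 0 and polarity-reversed mechanisms give d close to 1.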

Ensemble of 6000 MT predictions for four earthquakes

Exemplary waveform fits for Mw 4.1 on 2019/07/11 23:45:19

Forward-calculated waveforms for 20% of the predicted ensemble; observed waveforms in black, MAP solution in red

Ensemble of 6000 source parameter predictions for Mw 3.8 on 2019/07/06 12:00:05
  • increase in v with Earth structure consideration (Vasyura-Bathke et al., 2021)
  • density plot comparing predictions with 196 SCEDC catalog DCs
  • with v=0 and w=0
    • magnitude-dependent performance
    • choice of filters/time windows/SNR

    Location of Mw 3.9 at 2019-07-06 17:59:15

    • location not learned
    • but: showing the same waveforms to all BNNs results in
    • ➡ correlation of the LLK (log-likelihood) values with distance to the centroid location (black star)

    Conclusions

    Cons

    • gridded locations
    • non-transferable to other areas or magnitude ranges
    • computational up-front costs (CPU: 2-3 months)
    • no moment magnitude

    Conclusions

    Pros

    • fast: a single evaluation takes milliseconds
    • representative ensembles within a few seconds
    • flexible consideration of model and data errors through variational inference

    Conclusions

      • ML can be applied to determine full MTs
      • successfully reproduced independently determined MTs for a subset of the 2019 Ridgecrest earthquakes
      • future application to Landau/Insheim area and potentially to Morsleben

    Paper and software

    Pre-print available at: https://www.essoar.org/doi/10.1002/essoar.10506663.1

    We make the software available as a Jupyter notebook at: https://github.com/braunfuss/BNN-MT

    Thank you!

    Application to recent Landau event

    • Time: 2020-11-09 21:47:07.800
    • Latitude [deg]: 49.149
    • Longitude [deg]: 8.131
    • Depth [km]: 4
    • Magnitude: 1.5

    Allows for analysis of non-DC components

    Appendix MT
    Appendix DC
    Appendix

    Blackbox? Example activations
