Influence of data filters on the accuracy and precision of paleomagnetic
poles: what is the optimal sampling strategy?
Dieke Gerritsen1, Bram Vaes1, and
Douwe J.J. van Hinsbergen1
1Department of Earth Sciences, Utrecht University,
Princetonlaan 8A, 3584 CB Utrecht, the Netherlands.
Corresponding authors: Dieke Gerritsen (h.gerritsen1@uu.nl) and Bram
Vaes (b.vaes@uu.nl)
Key Points:
- Within-site data filters (minimum k or n, or eliminating outliers per
site) are unsuccessful in eliminating unreliable data.
- The precision of paleomagnetic poles is dominated by between-site
scatter; within-site scatter has minimal contribution.
- For paleopoles, efforts put into collecting multiple samples per site
are more effectively spent on collecting more single-sample sites.
Abstract
To determine a paleopole, the paleomagnetic community commonly applies a
loosely defined set of quantitative data filters that were established
for studies of geomagnetic field behavior. These filters require costly
and time-consuming sampling procedures, but whether they improve
accuracy and precision of paleopoles has not yet been systematically
analyzed. In this study, we performed a series of experiments on four
datasets which consist of 73-125 lava sites with 6-7 samples per lava.
The datasets are from different regions and ages, and are large enough
to represent paleosecular variation, yet contain demonstrably unreliable
datapoints. We show that data filters based on within-site scatter (a
k-cutoff, a minimum number of samples per site, and eliminating the
farthest outliers per site) cannot objectively identify unreliable
directions. We find instead that excluding unreliable directions relies
on the subjective interpretation of the expert, highlighting the
importance of making all data available following the FAIR principles.
In addition, data filters that eliminate datapoints have an adverse
effect: both the accuracy and the precision of paleopoles decrease as
the number of data decreases. Between-site scatter far outweighs
within-site scatter, and when determining paleomagnetic poles, the extra
effort put into collecting multiple samples per site is more
effectively spent on collecting more single-sample sites.
1 Introduction
Paleomagnetic poles, or paleopoles, quantify the past position of rocks
relative to the geomagnetic pole and constrain tectonic reconstructions
and apparent polar wander paths (APWPs) (e.g., Besse and Courtillot,
2002; Torsvik et al., 2012). The calculation of paleopoles relies on the
assumption that the time-averaged geomagnetic field approximates a
geocentric axial dipole (GAD), but is complicated by short-term
deviations from this field (e.g., Cromwell et al., 2018; de Oliveira et
al., 2021) known as paleosecular variation (PSV). To obtain a paleopole,
paleomagnetists therefore average virtual geomagnetic poles (VGPs),
whereby every VGP is assumed to be a ‘spot reading’: an instantaneous
reading of the past geomagnetic field collected from a rock unit
(‘site’) that represents an increment of geological time, such as a lava
(Butler, 1992; Tauxe, 2010). However, not every VGP represents an
accurate spot reading because artifacts may be introduced by measuring
errors or remagnetization. Therefore, the paleomagnetic community
commonly uses a set of data filters to acquire a set of reliable spot
readings. However, these filters vary between authors and were not
determined by studies aiming to constrain paleopoles.
The studies that established the data filters investigated PSV and
geomagnetic field behavior by determining the between-site scatter of a
set of VGPs (e.g., Cromwell et al., 2018; de Oliveira et al., 2021;
Johnson et al., 2008; Johnson & Constable, 1996; Tauxe et al., 2003).
To this end, these studies aim to correct for within-site scatter
induced by measuring errors and typically require a minimum number of
readings per site, although this number of readings varies between
authors (e.g., Biggin et al., 2008; Johnson et al., 2008; Doubrovine et
al., 2019; Cromwell et al., 2018). The resulting paleomagnetic
directions are then averaged to a site-mean direction which is converted
to a VGP if the site passes a criterion for the minimum within-site
precision value (‘cutoff’), typically expressed as a Fisher (1953)
precision parameter k. This value also varies between authors,
for example \(k \geq 50\) or \(k \geq 100\) (e.g.,
Biggin et al., 2008; Cromwell et al., 2018; Johnson et al., 2008; Tauxe
et al., 2003). Subsequently, similar procedures have become common for
calculation of paleopoles (e.g., Butler, 1992; Lippert et al., 2014;
Meert et al., 2020). However, does this time- and data-intensive
procedure improve the accuracy and precision of paleopoles?
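The site-level procedure described above (averaging sample directions to a site mean, estimating the Fisher (1953) precision parameter k, and applying a k-cutoff) can be sketched as follows. This is a minimal illustration using only NumPy, not the code used in this study; the function names and the default cutoff of 50 are assumptions chosen for the example.

```python
import numpy as np


def dir_to_xyz(dec, inc):
    """Convert declination/inclination (degrees) to unit vectors (north, east, down)."""
    d, i = np.radians(dec), np.radians(inc)
    return np.column_stack((np.cos(i) * np.cos(d), np.cos(i) * np.sin(d), np.sin(i)))


def fisher_site_mean(dec, inc):
    """Return mean declination, mean inclination (degrees) and Fisher k for one site."""
    xyz = dir_to_xyz(np.asarray(dec, dtype=float), np.asarray(inc, dtype=float))
    n = len(xyz)
    resultant = xyz.sum(axis=0)
    r = np.linalg.norm(resultant)          # resultant vector length R
    mean = resultant / r
    mean_dec = np.degrees(np.arctan2(mean[1], mean[0])) % 360.0
    mean_inc = np.degrees(np.arcsin(mean[2]))
    k = (n - 1) / (n - r) if n > r else np.inf
    return mean_dec, mean_inc, k


def passes_k_cutoff(dec, inc, k_min=50.0):
    """True if the within-site precision parameter k reaches the chosen cutoff."""
    return fisher_site_mean(dec, inc)[2] >= k_min
```

A site with seven tightly grouped directions passes such a cutoff, whereas a site whose directions are spread over tens of degrees does not; whether rejecting the latter improves the paleopole is the question examined in this study.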
In this study, we analyze to what extent commonly applied paleomagnetic
data filters established for PSV studies improve the accuracy and
precision of paleopoles. To this end, we study four large paleomagnetic
datasets obtained from lava sequences from the Cretaceous of Mongolia,
the Permian of Norway, the Miocene of Turkey, and the Quaternary of
Antarctica. These datasets are large enough to represent PSV, but
contain additional between-site and within-site scatter of varying
magnitude due to measurement errors, lightning-induced remagnetization,
and/or tectonic deformation. We perform a series of experiments to
examine the effects of applying filters on the accuracy and precision of
paleopoles. We evaluate whether these filters can objectively exclude
outliers and filter non-PSV induced scatter from the paleomagnetic
datasets. We then assess how a given number of paleomagnetic directions
is optimally distributed over a collection of paleomagnetic sites to
acquire the best-constrained paleopole position. We discuss to what
extent the filters used in PSV studies improve paleopole accuracy and
precision. Our results aim to aid paleomagnetists to optimize their
sampling and data filtering strategies.
2 Background
To evaluate the reliability of a paleopole, it is common to use filters
to obtain a paleomagnetic dataset representative of PSV. Initially, Van
der Voo (1990) recommended averaging a minimum of 25 samples or
sites (i.e., spot readings) for a paleopole, and formulated loosely
defined filters using Fisher (1953) statistics. Those statistics are
used to describe a paleomagnetic dataset and can be applied to either
directional data or VGPs. Three parameters are of importance in this
statistical approach: the radius of the 95% confidence cone around the
mean (\(\alpha_{95}\) for directions and \(A_{95}\) for VGPs), the
precision parameter (k for directions and K for VGPs), and the circular
standard deviation (S) that quantifies the angular between-site
dispersion of VGPs. Van der Voo (1990) suggested filters for these
parameters: \(\alpha_{95}\) or \(A_{95}\) \(\leq\) 16°, and k or K \(\geq\) 10.
In later studies, filters were added to determine the reliability of
each VGP based on quality filters used in PSV studies (e.g., Lippert et
al., 2014).
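For reference, the Fisher (1953) statistics referred to above are commonly written as follows (see, e.g., Butler, 1992; Tauxe, 2010). These are the standard textbook forms, given here as an assumption about the definitions intended. For N unit vectors \(\hat{\mathbf{x}}_i\) (directions or VGPs) with resultant length R, and with \(\Delta_i\) the angle between the i-th VGP and the mean pole:

\[
R = \left| \sum_{i=1}^{N} \hat{\mathbf{x}}_i \right|, \qquad
k = \frac{N-1}{N-R},
\]
\[
\alpha_{95} = \arccos\left[ 1 - \frac{N-R}{R} \left( \left(\frac{1}{p}\right)^{1/(N-1)} - 1 \right) \right], \qquad p = 0.05,
\]
\[
S = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \Delta_i^{2} }.
\]

The same expressions give K and \(A_{95}\) when the unit vectors are VGPs. For well-grouped data, the widely used approximations \(\alpha_{95} \approx 140^{\circ}/\sqrt{kN}\) and \(S \approx 81^{\circ}/\sqrt{K}\) show that the confidence radius shrinks with both higher precision and a larger number of data.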
For obtaining paleopoles, paleomagnetists usually sample multiple sites
at a locality, where every site likely represents a spot reading of the
paleomagnetic field, e.g., a lava flow. The total scatter in a dataset
then consists of within-site and between-site scatter (Biggin et al.,
2008). Paleomagnetists collect multiple samples per site to test whether
a paleomagnetic direction is reproducible and to average measurement and
other random errors (McElhinny & McFadden, 1999). A quality check for
within-site precision is a k-cutoff (Tauxe et al., 2003), where sites
with a k value lower than an arbitrary value of 50 or 100 are
discarded (e.g., Biggin et al., 2008; Johnson et al., 2008; Lippert et
al., 2014). Furthermore, outliers are often subjectively discarded from
sites based on the ‘expert eye’ and experience of the interpreter. How
many samples should be collected per site is not widely agreed upon.
Based on Monte Carlo simulations, a minimum number of five independently
oriented samples was deemed necessary to estimate k reliably
(Tauxe et al., 2003). Others suggest that three (Meert et al., 2020),
four (Cromwell et al., 2018; Lippert et al., 2014), or six (Asefaw et
al., 2021) samples are needed. Most paleomagnetic studies, however,
collect six to eight samples per lava site.
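How strongly the estimate of k depends on the number of samples per site can be illustrated with a small Monte Carlo experiment. The sketch below is not the simulation of Tauxe et al. (2003); it simply draws synthetic Fisher-distributed directions with a known precision parameter and reports the quartiles of the estimated k. The function names, the number of trials, and the random seed are assumptions, and at least two samples per site are required.

```python
import numpy as np


def draw_fisher(kappa, n, rng):
    """Draw n unit vectors from a Fisher distribution centred on the +z axis."""
    u = rng.random(n)
    # Inverse-CDF sampling of the cosine of the angle from the true mean
    cos_theta = 1.0 + np.log(1.0 - u * (1.0 - np.exp(-2.0 * kappa))) / kappa
    sin_theta = np.sqrt(np.clip(1.0 - cos_theta**2, 0.0, None))
    phi = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.column_stack((sin_theta * np.cos(phi), sin_theta * np.sin(phi), cos_theta))


def estimate_k(xyz):
    """Fisher (1953) estimate of the precision parameter from unit vectors."""
    n = len(xyz)
    r = np.linalg.norm(xyz.sum(axis=0))
    return (n - 1) / (n - r)


def k_estimate_quartiles(true_kappa, n, trials=2000, seed=0):
    """25th, 50th and 75th percentiles of estimated k for sites of n samples."""
    rng = np.random.default_rng(seed)
    estimates = [estimate_k(draw_fisher(true_kappa, n, rng)) for _ in range(trials)]
    return np.percentile(estimates, [25, 50, 75])
```

Comparing, for example, `k_estimate_quartiles(100, 3)` with `k_estimate_quartiles(100, 7)` shows a much wider interquartile range for three samples than for seven, which is why a fixed k-cutoff behaves differently for sites with different n.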
To average between-site scatter, assumed to be predominantly the result of
PSV, multiple sites are collected, but the minimum number of sites
required for a ‘good’ average, that is, the pole, is not well defined
(Tauxe, 2010). The precision with which a paleopole is calculated
increases with increasing number of underlying VGPs (Vaes et al., 2021).
Meert et al. (2020) suggested eight sites are sufficient to determine a
paleopole, provided that each site is constrained by at least three
samples. In contrast, based on a statistical simulation of PSV, Tauxe et al.
(2003) showed that a minimum of approximately 100 sites may be required
to fully sample PSV. This number is rarely obtained, however, because
the availability of sites is limited and because resources are consumed
by collecting a large number of samples at each site. Below, we analyze the
effect of these different sampling and filtering strategies on the
accuracy and precision of the resulting paleopole.
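The trade-off between samples per site (n) and number of sites (N) can be made concrete with a back-of-the-envelope calculation. The sketch below assumes the relation commonly used in PSV studies to correct for within-site scatter, \(S_t^2 = S_b^2 + S_w^2/n\), together with the approximations \(S \approx 81^{\circ}/\sqrt{K}\) and \(A_{95} \approx 140^{\circ}/\sqrt{KN}\). The dispersion values (15° between sites, 5° within sites) and the budget of 42 samples are purely illustrative assumptions, not results of this study.

```python
import math


def total_dispersion(s_between, s_within, n_samples):
    """Dispersion of site means in degrees: S_t^2 = S_b^2 + S_w^2 / n."""
    return math.sqrt(s_between**2 + s_within**2 / n_samples)


def approx_a95(s_total, n_sites):
    """Rough A95 (degrees) from S ~ 81/sqrt(K) and A95 ~ 140/sqrt(K * N)."""
    k_equiv = (81.0 / s_total) ** 2
    return 140.0 / math.sqrt(k_equiv * n_sites)


def compare_strategies(budget, samples_per_site, s_between=15.0, s_within=5.0):
    """Return (number of sites, approximate A95) for a fixed sample budget."""
    n_sites = budget // samples_per_site
    s_t = total_dispersion(s_between, s_within, samples_per_site)
    return n_sites, approx_a95(s_t, n_sites)


# Same budget of 42 samples, spent in two different ways (illustrative values)
print(compare_strategies(42, 7))  # 6 sites with 7 samples each
print(compare_strategies(42, 1))  # 42 single-sample sites
```

Under these assumed values, spreading the samples over more sites yields the smaller approximate \(A_{95}\), because averaging within a site reduces only the minor within-site term, whereas adding sites reduces the uncertainty arising from the dominant between-site term.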
3 Datasets and Approach
3.1 Datasets
As a basis for our analysis, we use four large paleomagnetic datasets
derived from lavas from Mongolia (van Hinsbergen et al., 2008), Norway
(Haldan et al., 2014), Turkey (van Hinsbergen et al., 2010), and
Antarctica (Asefaw et al., 2021). The results were previously published,
but for the purpose of this study, we reinterpreted the demagnetization
diagrams of all samples. We did this by identifying the characteristic
remanent magnetization (ChRM) by principal component analysis
(Kirschvink, 1980), using the online paleomagnetic analysis platform
Paleomagnetism.org (Koymans et al., 2016, 2020). We have thereby not
forced interpreted components through the origin, and we have not used
the remagnetization great-circle method of McFadden & McElhinny (1988),
as was occasionally done in the original interpretations. All
interpretable diagrams were interpreted, and no samples were discarded
based on obviously outlying directions, e.g., due to lightning strikes.
In other words, we deliberately kept directions that an experienced
paleomagnetist would likely immediately discard as unreliable. We also
did not exclude sites that were collected from tectonically deformed
regions. As a result, our datasets contain more noise and higher
between-site scatter (lower K-values) than the published ones. This
allows us to assess whether the objective quality criteria alone can
clean the dataset from outliers, or whether this relies on a subjective
‘expert eye’. The sites from Mongolia, Norway, and Turkey were sampled
with a minimum of seven samples per site, and the sites from Antarctica
with a minimum of six. Only sites with seven (or six, for Antarctica)
interpretable directions have been used in our analysis. If sites
contained more samples, only the first seven (or six, for Antarctica)
samples were used. All uninterpreted data are publicly available in
databases (Paleomagnetism.org (Koymans et al., 2020) or MagIC (Tauxe et
al., 2016), see Data Availability Statement). Our reinterpretations are
provided in the supplementary information (Table S1) to this paper, but
because our interpretations are (deliberately) not better than the
original interpretations, we will not upload these in the databases to
avoid confusion. All paleopole positions from this study differ from the
published pole positions, but still sit within the 95% confidence cone
of the respective published pole (Figure 1).
After the reinterpretation, we arrived at a total of 108 sites of lower
Cretaceous lavas, corresponding to the base of the Cretaceous Normal
Superchron, from the NE Gobi Altai mountain range of southern Mongolia
(van Hinsbergen et al., 2008; vH08). We did not include samples from the
Artz Bogd area from the original dataset because these appear
systematically rotated due to tectonic deformation. Furthermore, only
the sites acquired by van Hinsbergen et al. (2008) are included. Our
reinterpreted dataset (G21) leads to a higher scatter than the original
interpretation of van Hinsbergen et al. (2008) for reasons outlined
above (\(A_{95,\mathrm{vH08}}\) = 5.3° vs. \(A_{95,\mathrm{G21}}\) =
6.4°, \(K_{\mathrm{vH08}}\) = 9.1 vs. \(K_{\mathrm{G21}}\) = 5.5). Most
sites have limited within-site scatter and high k-values, as expected
for lava sites.
We arrived at a total of 73 sites from lavas of Permian age from the
Oslo Graben in Norway (Haldan et al., 2014; H14). This dataset contains
lavas from the Permo-Carboniferous Reversed Superchron (PCRS).
Interestingly, the between-site scatter is much lower than for the other
three localities, as addressed by Brandt et al. (2021) and Handford et al.
(2021). The within-site scatter of this dataset, however, is higher. Our
reinterpretation contains more scatter than the interpretation of Haldan
et al. (2014) (\(A_{95,\mathrm{H14}}\) = 1.9° vs.
\(A_{95,\mathrm{G21}}\) = 3.1°, \(K_{\mathrm{H14}}\) = 52.2 vs.
\(K_{\mathrm{G21}}\) = 29.4).
A total of 125 sites of lower to middle Miocene lavas and ignimbritic
tuffs come from basins in the northern Menderes Massif in western Turkey
(van Hinsbergen et al., 2010; vH10). This dataset was acquired in a
tectonically active area and contains basins that were interpreted to be
tectonically coherent, and basins that were interpreted by van
Hinsbergen et al. (2010) to be internally tectonically disturbed (later
confirmed and mapped out in detail by Uzel et al. (2015, 2017)). It also
contained sites acquired by preceding studies, but only the sites
acquired by van Hinsbergen et al. (2010) are included in our dataset.
Our reinterpreted dataset contains a similar amount of scatter to the
dataset of van Hinsbergen et al. (2010) (\(A_{95,\mathrm{vH10}}\) = 6.7°
vs. \(A_{95,\mathrm{G21}}\) = 6.4°, \(K_{\mathrm{vH10}}\) = 4.6 vs.
\(K_{\mathrm{G21}}\) = 4.8).
Finally, we used 107 sites with n=6 from lavas of Plio-Pleistocene age
of the Erebus volcanic province in Antarctica (Asefaw et al., 2021;
A21). We did not reinterpret this dataset and used the published
interpretations, but only used sites with n=6. Furthermore, we did not
apply the age constraint of <5 Ma or the k-cutoff. Our
resulting dataset does not differ significantly from the published one of
Asefaw et al. (2021) (\(A_{95,\mathrm{A21}}\) = 5.5° vs.
\(A_{95,\mathrm{G21}}\) = 5.9°, \(K_{\mathrm{A21}}\) = 7.7 vs.
\(K_{\mathrm{G21}}\) = 6.3).
3.2 Approach
We performed a series of experiments to study the effect on paleopole
accuracy and precision of objective filters that one may use when
multiple samples per site are available. These filters include the
influence of (i) a k-cutoff; (ii) the number of samples per site (n),
and (iii) discarding outliers of within-site dispersion. In addition, we
assessed how a given number of paleomagnetic directions may be optimally
distributed over a collection of paleomagnetic sites (N) to acquire the
best-constrained paleopole position. This is assessed using
Fisher (1953) parameters: the radius of the 95% confidence cone around
the paleopole (\(A_{95}\)), the precision parameter (K), and
the circular standard deviation (S). We performed experiments using a
Python code, which we developed making extensive use of the freely
available paleomagnetic software package PmagPy (Tauxe et al., 2016). We
performed each calculation 1000 times, each time selecting different
samples and/or sites from the population. We then calculated the
mean and standard deviation of the resulting Fisher (1953) parameters.
Our Python code is available on GitHub (see Data Availability
Statement).
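The repeated-resampling scheme described above can be sketched as follows. This is not the code released with this study: it uses only NumPy rather than PmagPy, treats each VGP as a unit vector, and assumes that sites are drawn without replacement. The function names and the random seed are hypothetical.

```python
import numpy as np


def fisher_k_a95(xyz):
    """Fisher (1953) precision parameter and 95% confidence radius (degrees)."""
    n = len(xyz)
    r = np.linalg.norm(xyz.sum(axis=0))
    k = (n - 1) / (n - r)
    cos_a95 = 1.0 - (n - r) / r * (20.0 ** (1.0 / (n - 1)) - 1.0)
    return k, np.degrees(np.arccos(np.clip(cos_a95, -1.0, 1.0)))


def resample_sites(site_vectors, n_sites, n_runs=1000, seed=0):
    """Draw n_sites sites without replacement n_runs times.

    Returns the mean and standard deviation of (K, A95) over all runs.
    """
    rng = np.random.default_rng(seed)
    results = np.empty((n_runs, 2))
    for run in range(n_runs):
        picked = rng.choice(len(site_vectors), size=n_sites, replace=False)
        results[run] = fisher_k_a95(site_vectors[picked])
    return results.mean(axis=0), results.std(axis=0)
```

Given an array of VGP unit vectors with shape (number of sites, 3), `resample_sites(vgps, 25)` returns the average K and \(A_{95}\) for random subsets of 25 sites, together with their standard deviations over the 1000 runs. Filters such as a k-cutoff can be applied to the pool of sites before resampling.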