Influence of data filters on the accuracy and precision of paleomagnetic poles: what is the optimal sampling strategy?
Dieke Gerritsen1, Bram Vaes1, and Douwe J.J. van Hinsbergen1
1Department of Earth Sciences, Utrecht University, Princetonlaan 8A, 3584 CB Utrecht, the Netherlands.
Corresponding authors: Dieke Gerritsen (h.gerritsen1@uu.nl) and Bram Vaes (b.vaes@uu.nl)
Abstract
To determine a paleopole, the paleomagnetic community commonly applies a loosely defined set of quantitative data filters that were established for studies of geomagnetic field behavior. These filters require costly and time-consuming sampling procedures, but whether they improve accuracy and precision of paleopoles has not yet been systematically analyzed. In this study, we performed a series of experiments on four datasets which consist of 73-125 lava sites with 6-7 samples per lava. The datasets are from different regions and ages, and are large enough to represent paleosecular variation, yet contain demonstrably unreliable datapoints. We show that data filters based on within-site scatter (a k-cutoff, a minimum number of samples per site, and eliminating the farthest outliers per site) cannot objectively identify unreliable directions. We find instead that excluding unreliable directions relies on the subjective interpretation of the expert, highlighting the importance of making all data available following the FAIR principles. In addition, data filters that eliminate datapoints even have an adverse effect: the accuracy as well as the precision of paleopoles decreases with the decreasing number of data. Between-site scatter far outweighs within-site scatter, and when collecting paleomagnetic poles, the extra efforts put into collecting multiple samples per site are more effectively spent on collecting more single-sample sites.
1 Introduction
Paleomagnetic poles, or paleopoles, quantify the past position of rocks relative to the geomagnetic pole and constrain tectonic reconstructions and apparent polar wander paths (APWPs) (e.g., Besse and Courtillot, 2002; Torsvik et al., 2012). The calculation of paleopoles relies on the assumption that the time-averaged geomagnetic field approximates a geocentric axial dipole (GAD), but is complicated by short-term deviations from this field (e.g., Cromwell et al., 2018; de Oliveira et al., 2021) known as paleosecular variation (PSV). To obtain a paleopole, paleomagnetists therefore average virtual geomagnetic poles (VGPs), whereby every VGP is assumed to be a ‘spot reading’: an instantaneous reading of the past geomagnetic field collected from a rock unit (‘site’) that represents an increment of geological time, such as a lava (Butler, 1992; Tauxe, 2010). However, not every VGP represents an accurate spot reading because artifacts may be introduced by measuring errors or remagnetization. Therefore, the paleomagnetic community commonly uses a set of data filters to acquire a set of reliable spot readings. These filters vary between authors, however, and were not established by studies that aimed to constrain paleopoles.
The studies that established the data filters investigated PSV and geomagnetic field behavior by determining the between-site scatter of a set of VGPs (e.g., Cromwell et al., 2018; de Oliveira et al., 2021; Johnson et al., 2008; Johnson & Constable, 1996; Tauxe et al., 2003). To this end, these studies aim to correct for within-site scatter induced by measuring errors and typically require a minimum number of readings per site, although this number of readings varies between authors (e.g., Biggin et al., 2008; Johnson et al., 2008; Doubrovine et al., 2019; Cromwell et al., 2018). The resulting paleomagnetic directions are then averaged to a site-mean direction, which is converted to a VGP if the site passes a criterion for the minimum within-site precision value (‘cutoff’), typically expressed as a Fisher (1953) precision parameter k. This value also varies between authors, e.g., k ≥ 50 or k ≥ 100 (e.g., Biggin et al., 2008; Cromwell et al., 2018; Johnson et al., 2008; Tauxe et al., 2003). Subsequently, similar procedures have become common for the calculation of paleopoles (e.g., Butler, 1992; Lippert et al., 2014; Meert et al., 2020). However, does this time- and data-intensive procedure improve the accuracy and precision of paleopoles?
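To make the conversion from a site-mean direction to a VGP concrete, the following is a minimal sketch of the standard dipole formula (e.g., Butler, 1992). The function name `vgp_from_direction` and its argument names are hypothetical and chosen for illustration only; this is not the code used in this study.

```python
import numpy as np

def vgp_from_direction(dec, inc, site_lat, site_lon):
    """Convert a site-mean direction (degrees) to a VGP (lat, lon in degrees).

    Illustrative sketch of the dipole formula; names are hypothetical.
    """
    d, i = np.radians(dec), np.radians(inc)
    lat_s = np.radians(site_lat)
    # Magnetic colatitude p from the dipole relation tan(I) = 2 / tan(p)
    p = np.arctan2(2.0, np.tan(i))
    sin_lat_p = np.sin(lat_s) * np.cos(p) + np.cos(lat_s) * np.sin(p) * np.cos(d)
    lat_p = np.arcsin(np.clip(sin_lat_p, -1.0, 1.0))
    # Guard: at the geographic pole the longitude is undefined
    if np.cos(lat_p) < 1e-12:
        return float(np.degrees(lat_p)), float(site_lon % 360)
    sin_beta = np.sin(p) * np.sin(d) / np.cos(lat_p)
    beta = np.degrees(np.arcsin(np.clip(sin_beta, -1.0, 1.0)))
    if np.cos(p) >= np.sin(lat_s) * np.sin(lat_p):
        lon_p = site_lon + beta
    else:
        lon_p = site_lon + 180.0 - beta
    return float(np.degrees(lat_p)), float(lon_p % 360)

# A GAD direction at 30 degrees N (D = 0, tan I = 2 tan(latitude))
gad_inc = np.degrees(np.arctan(2.0 * np.tan(np.radians(30.0))))
gad_lat, gad_lon = vgp_from_direction(0.0, gad_inc, 30.0, 0.0)

# A horizontal, due-east direction measured at the equator
east_lat, east_lon = vgp_from_direction(90.0, 0.0, 0.0, 0.0)
```

In this sketch, a direction that matches the GAD field at the site maps to a VGP at the geographic pole, whereas a horizontal, eastward direction at the equator maps to a VGP on the equator, 90° east of the site.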
In this study, we analyze to what extent commonly applied paleomagnetic data filters established for PSV studies improve the accuracy and precision of paleopoles. To this end, we study four large paleomagnetic datasets obtained from lava sequences from the Cretaceous of Mongolia, the Permian of Norway, the Miocene of Turkey, and the Quaternary of Antarctica. These datasets are large enough to represent PSV, but contain additional between-site and within-site scatter of varying magnitude due to measurement errors, lightning-induced remagnetization, and/or tectonic deformation. We perform a series of experiments to examine the effects of applying filters on the accuracy and precision of paleopoles. We evaluate whether these filters can objectively exclude outliers and filter non-PSV-induced scatter from the paleomagnetic datasets. We then assess how a given number of paleomagnetic directions is optimally distributed over a collection of paleomagnetic sites to acquire the best-constrained paleopole position. We discuss to what extent the filters used in PSV studies improve paleopole accuracy and precision. Our results are intended to help paleomagnetists optimize their sampling and data filtering strategies.
2 Background
To evaluate the reliability of a paleopole, it is common to use filters to obtain a paleomagnetic dataset representative of PSV. Initially, Van der Voo (1990) recommended averaging a minimum of 25 samples or sites (i.e., spot readings) for a paleopole, and formulated loosely defined filters using Fisher (1953) statistics. Those statistics are used to describe a paleomagnetic dataset and can be applied to either directional data or VGPs. Three parameters are of importance in this statistical approach: the radius of the 95% confidence cone around the mean (α95 for directions and A95 for VGPs), the precision parameter (k for directions and K for VGPs), and the circular standard deviation (S) that quantifies the angular between-site dispersion of VGPs. Van der Voo (1990) suggested filters for these parameters: α95 or A95 ≤ 16°, and k or K ≥ 10. In later studies, filters were added to determine the reliability of each VGP based on quality filters used in PSV studies (e.g., Lippert et al., 2014).
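The three Fisher (1953) parameters described above can be computed in a few lines. The sketch below uses NumPy only and the textbook expressions for k, α95, and S; the function names `dir2cart` and `fisher_stats` are illustrative assumptions and do not reproduce the code used in this study. It assumes the input directions are not all identical (otherwise N − R = 0 and k is undefined).

```python
import numpy as np

def dir2cart(dec, inc):
    """Declination/inclination in degrees to unit vectors (north, east, down)."""
    dec, inc = np.radians(dec), np.radians(inc)
    return np.column_stack([np.cos(inc) * np.cos(dec),
                            np.cos(inc) * np.sin(dec),
                            np.sin(inc)])

def fisher_stats(dec, inc):
    """Fisher mean, precision parameter, 95% confidence radius, and
    circular standard deviation of a set of directions (or VGPs)."""
    xyz = dir2cart(np.asarray(dec, dtype=float), np.asarray(inc, dtype=float))
    n = len(xyz)
    resultant = xyz.sum(axis=0)
    R = np.linalg.norm(resultant)          # resultant vector length
    mean = resultant / R
    k = (n - 1) / (n - R)                  # precision parameter (k or K)
    cos_a95 = 1 - (n - R) / R * (20 ** (1 / (n - 1)) - 1)
    a95 = np.degrees(np.arccos(np.clip(cos_a95, -1.0, 1.0)))
    # Angular deviation of each direction from the mean
    delta = np.degrees(np.arccos(np.clip(xyz @ mean, -1.0, 1.0)))
    S = np.sqrt(np.sum(delta ** 2) / (n - 1))
    return {
        "dec": float(np.degrees(np.arctan2(mean[1], mean[0])) % 360),
        "inc": float(np.degrees(np.arcsin(mean[2]))),
        "n": n, "k": float(k), "a95": float(a95), "S": float(S),
    }

# Three made-up directions, symmetric about declination 90
stats = fisher_stats([80.0, 90.0, 100.0], [45.0, 45.0, 45.0])
```

The same function applies to VGPs when longitude and latitude are passed in place of declination and inclination, in which case the returned values correspond to K, A95, and S.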
To obtain paleopoles, paleomagnetists usually sample multiple sites at a locality, where every site likely represents a spot reading of the paleomagnetic field, e.g., a lava flow. The total scatter in a dataset then consists of within-site and between-site scatter (Biggin et al., 2008). Paleomagnetists collect multiple samples per site to test whether a paleomagnetic direction is reproducible and to average measurement and other random errors (McElhinny & McFadden, 1999). A quality check for within-site precision is a k-cutoff (Tauxe et al., 2003), where sites with a k value lower than an arbitrary value of 50 or 100 are discarded (e.g., Biggin et al., 2008; Johnson et al., 2008; Lippert et al., 2014). Furthermore, outliers are often subjectively discarded from sites based on the ‘expert eye’ and experience of the interpreter. How many samples should be collected per site is not widely agreed upon. Based on Monte Carlo simulations, a minimum number of five independently oriented samples was deemed necessary to estimate k reliably (Tauxe et al., 2003). Others suggest that three (Meert et al., 2020), four (Cromwell et al., 2018; Lippert et al., 2014), or six (Asefaw et al., 2021) samples are needed. In practice, most paleomagnetic studies collect six to eight samples per lava site.
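As an illustration of how such within-site filters operate, the sketch below applies a k-cutoff and a minimum number of samples per site to a small, made-up set of sites. The function names, the dictionary layout, and the default thresholds (k ≥ 50, n ≥ 5) are assumptions chosen for the example; the directions are invented and are not taken from the datasets of this study.

```python
import numpy as np

def fisher_k(dec, inc):
    """Fisher (1953) precision parameter k for directions in degrees."""
    d, i = np.radians(dec), np.radians(inc)
    xyz = np.column_stack([np.cos(i) * np.cos(d),
                           np.cos(i) * np.sin(d),
                           np.sin(i)])
    n = len(xyz)
    R = np.linalg.norm(xyz.sum(axis=0))
    return (n - 1) / (n - R)

def apply_site_filters(sites, k_min=50.0, n_min=5):
    """Return the names of sites that pass both filters.

    sites: dict mapping site name -> (declinations, inclinations).
    """
    kept = []
    for name, (dec, inc) in sites.items():
        if len(dec) < n_min:               # too few samples
            continue
        if fisher_k(dec, inc) < k_min:     # within-site scatter too high
            continue
        kept.append(name)
    return kept

# Invented example sites
example_sites = {
    "tight":     ([358, 2, 0, 1, 359], [60, 61, 59, 60, 60]),
    "scattered": ([0, 40, 320, 20, 350], [60, 30, 70, 45, 80]),
    "small":     ([0, 1, 359], [60, 60, 61]),
}
kept_sites = apply_site_filters(example_sites)
```

Here only the tightly clustered site is retained: the scattered site fails the k-cutoff and the three-sample site fails the minimum-n criterion, regardless of whether either records a reliable spot reading.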
To average between-site scatter, assumed to be predominantly the result of PSV, multiple sites are collected, but the minimum number of sites required for a ‘good’ average, that is, the pole, is not well defined (Tauxe, 2010). The precision with which a paleopole is calculated increases with increasing number of underlying VGPs (Vaes et al., 2021). Meert et al. (2020) suggested eight sites are sufficient to determine a paleopole, provided that each site is constrained by at least three samples. However, based on a statistical simulation of PSV, Tauxe et al. (2003) showed that a minimum of approximately 100 sites may be required to fully sample PSV. This number is rarely obtained in practice due to the limited availability of sites and because resources are consumed by collecting many samples at each site. Below, we analyze the effect of these different sampling and filtering strategies on the accuracy and precision of the resulting paleopole.
3 Datasets and Approach
3.1 Datasets
As a basis for our analysis, we use four large paleomagnetic datasets derived from lavas from Mongolia (van Hinsbergen et al., 2008), Norway (Haldan et al., 2014), Turkey (van Hinsbergen et al., 2010), and Antarctica (Asefaw et al., 2021). The results were previously published, but for the purpose of this study, we reinterpreted the demagnetization diagrams of all samples. We did this by identifying the characteristic remanent magnetization (ChRM) by principal component analysis (Kirschvink, 1980), using the online paleomagnetic analysis platform Paleomagnetism.org (Koymans et al., 2016, 2020). We did not force interpreted components through the origin, and we did not use the remagnetization great-circle method of McFadden & McElhinny (1988), as was occasionally done in the original interpretations. All interpretable diagrams were interpreted, and no samples were discarded based on obviously outlying directions, e.g., due to lightning strikes. In other words, we deliberately kept directions that an experienced paleomagnetist would likely immediately discard as unreliable. We also did not exclude sites that were collected from tectonically deformed regions. As a result, the datasets contain more noise and a higher between-site scatter (lower K-value) than the published values. This allows us to assess whether the objective quality criteria alone can clean the dataset from outliers, or whether this relies on a subjective ‘expert eye’. The sites from Mongolia, Norway, and Turkey were sampled with a minimum of seven samples per site, and the sites from Antarctica with a minimum of six. Only sites with seven (or six, for Antarctica) interpretable directions have been used in our analysis. If sites contained more samples, only the first seven (or six, for Antarctica) samples were used. All uninterpreted data are publicly available in databases (Paleomagnetism.org (Koymans et al., 2020) or MagIC (Tauxe et al., 2016), see Data Availability Statement).
Our reinterpretations are provided in the supplementary information (Table S1) to this paper, but because our interpretations are (deliberately) not better than the original interpretations, we will not upload these in the databases to avoid confusion. All paleopole positions from this study differ from the published pole positions, but still sit within the 95% confidence cone of the respective published pole (Figure 1).
After the reinterpretation, we arrived at a total of 108 sites of lower Cretaceous lavas, corresponding to the base of the Cretaceous Normal Superchron, from the NE Gobi Altai mountain range of southern Mongolia (van Hinsbergen et al., 2008; vH08). We did not include samples from the Artz Bogd area from the original dataset because these appear systematically rotated due to tectonic deformation. Furthermore, only the sites acquired by van Hinsbergen et al. (2008) are included. Our reinterpreted dataset (G21) leads to a higher scatter than the original interpretation of van Hinsbergen et al. (2008) for reasons outlined above (A95,vH08 = 5.3° vs. A95,G21 = 6.4°, KvH08 = 9.1 vs. KG21 = 5.5). Most sites have limited within-site scatter and high k-values, as expected for lava sites.
We arrived at a total of 73 sites from lavas of Permian age from the Oslo Graben in Norway (Haldan et al., 2014; H14). This dataset contains lavas from the Permo-Carboniferous Reversed Superchron (PCRS). Interestingly, the between-site scatter is much lower than for the other three localities, as addressed by Brandt et al. (2021) and Handford et al. (2021). The within-site scatter of this dataset, however, is higher. Our reinterpretation contains more scatter than the interpretation of Haldan et al. (2014) (A95,H14 = 1.9° vs. A95,G21 = 3.1°, KH14 = 52.2 vs. KG21 = 29.4).
A total of 125 sites of lower to middle Miocene lavas and ignimbritic tuffs come from basins in the northern Menderes Massif in western Turkey (van Hinsbergen et al., 2010; vH10). This dataset was acquired in a tectonically active area and contains basins that were interpreted to be tectonically coherent, and basins that were interpreted by van Hinsbergen et al. (2010) to be internally tectonically disturbed (later confirmed and mapped out in detail by Uzel et al. (2015, 2017)). It also contained sites acquired by preceding studies, but only the sites acquired by van Hinsbergen et al. (2010) are included in our dataset. Our reinterpreted dataset contains a similar amount of scatter as the dataset of van Hinsbergen et al. (2010) (A95,vH10 = 6.7° vs. A95,G21 = 6.4°, KvH10 = 4.6 vs. KG21 = 4.8).
Finally, we used 107 sites with n=6 from lavas of Plio-Pleistocene age of the Erebus volcanic province in Antarctica (Asefaw et al., 2021; A21). We did not reinterpret this dataset and used the published interpretations, but only used sites with n=6. Furthermore, we did not apply the age constraint of <5 Ma or the k-cutoff. The resulting dataset does not differ significantly from the published one of Asefaw et al. (2021) (A95,A21 = 5.5° vs. A95,G21 = 5.9°, KA21 = 7.7 vs. KG21 = 6.3).
3.2 Approach
We performed a series of experiments to study the effect on paleopole accuracy and precision of objective filters that one may use when multiple samples per site are available. These experiments assess the influence of (i) a k-cutoff, (ii) the number of samples per site (n), and (iii) discarding outliers of within-site dispersion. In addition, we assessed how a given number of paleomagnetic directions may be optimally distributed over a collection of paleomagnetic sites (N) to acquire the best-constrained paleopole position. We evaluate the results using Fisher (1953) parameters: the radius of the 95% confidence cone around the paleopole (A95), the precision parameter (K), and the circular standard deviation (S). We performed the experiments using a Python code, which we developed making extensive use of the freely available paleomagnetic software package PmagPy (Tauxe et al., 2016). We performed each calculation 1000 times, each time selecting different samples and/or sites from the population. We then calculated the mean and standard deviation of the different Fisher (1953) parameters. Our Python code is available on GitHub (see Data Availability Statement).
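The resampling procedure can be sketched as follows. This is a simplified, NumPy-only illustration and not the code released with this study: the function names, the array layout (sites × samples × 3 unit vectors), and the synthetic data, which are generated from Gaussian perturbations rather than a PSV model, are all assumptions made for the example. Each iteration draws N sites and n samples per site without replacement, averages the samples per site, and computes K and A95 of the resulting site means.

```python
import numpy as np

def unit(v):
    """Normalize vectors along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def fisher_K_A95(xyz):
    """Fisher (1953) K and A95 (degrees) of a set of unit vectors."""
    n = len(xyz)
    R = np.linalg.norm(xyz.sum(axis=0))
    K = (n - 1) / (n - R)
    cos_a = 1 - (n - R) / R * (20 ** (1 / (n - 1)) - 1)
    return K, np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

def resample_pole(vgps, n_sites, n_samples, n_iter=1000, seed=0):
    """Mean and standard deviation of (K, A95) over n_iter random draws.

    vgps: array of shape (sites, samples, 3) with sample-level unit vectors.
    """
    rng = np.random.default_rng(seed)
    total_sites, total_samples, _ = vgps.shape
    out = np.empty((n_iter, 2))
    for it in range(n_iter):
        site_idx = rng.choice(total_sites, size=n_sites, replace=False)
        site_means = []
        for s in site_idx:
            samp_idx = rng.choice(total_samples, size=n_samples, replace=False)
            site_means.append(unit(vgps[s, samp_idx].sum(axis=0)))
        out[it] = fisher_K_A95(np.array(site_means))
    return out.mean(axis=0), out.std(axis=0)

# Synthetic population: 100 sites x 7 samples, with between-site scatter
# much larger than within-site scatter (assumed values, not real data)
rng = np.random.default_rng(1)
site_true = unit(np.array([0.0, 0.0, 1.0]) + 0.3 * rng.normal(size=(100, 3)))
vgps = unit(site_true[:, None, :] + 0.05 * rng.normal(size=(100, 7, 3)))

# The same 70 directions distributed in two ways
mean_few, std_few = resample_pole(vgps, n_sites=10, n_samples=7, n_iter=200)
mean_many, std_many = resample_pole(vgps, n_sites=70, n_samples=1, n_iter=200)
a95_few, a95_many = mean_few[1], mean_many[1]
```

With these synthetic inputs, distributing 70 directions over 70 single-sample sites yields a smaller mean A95 than distributing them over 10 sites of 7 samples each, because the between-site scatter dominates the total scatter.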