Public Articles
Blog Post 1
1 $\underline{\text{Binomial Coefficient}}$ In mathematics, the binomial coefficient is written as ${n \choose k}$ and pronounced “n choose k.” Alternatively, binomial coefficients are sometimes given the notation C(n, k), where the C stands for “choices” or “combination” (Benjamin, 2009, p. 8). This is because there are ${n \choose k}$ ways of choosing k elements from a set containing n elements. For example, consider the set A = {1, 2, 3, 4}. If we wish to know how many subsets of size 2 can be created from this set, we are essentially asking how many ways there are of choosing 2 elements from a set with 4 total elements. Therefore, we can identify that k = 2 and n = 4. Hence, we have ${4 \choose 2}$. One way to calculate this is to write out all the possible combinations by hand. Doing so, one finds six subsets of size two, namely {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, and {3, 4}. However, it is clear that when we are dealing with large sets, this work becomes tedious. Therefore, it is convenient to use the following formula:
\[{n \choose k} = \frac{n!}{(n-k)! \cdot k!} \]
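For larger sets, the formula is easy to check against a brute-force enumeration. A minimal Python sketch (the function name is ours, not from the post):

```python
from itertools import combinations
from math import factorial

def n_choose_k(n, k):
    """Binomial coefficient via the formula n! / ((n-k)! * k!)."""
    return factorial(n) // (factorial(n - k) * factorial(k))

# The example from the text: subsets of size k = 2 from A = {1, 2, 3, 4}.
A = [1, 2, 3, 4]
subsets = list(combinations(A, 2))
print(subsets)           # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(n_choose_k(4, 2))  # 6
```

Both routes agree: the enumeration produces the same six subsets listed above, and the formula gives the count directly without listing anything.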
A Laurent Series Transform for Integer Sequences via Generalised Continued Fractions
We investigate a function defined by a generalised continued fraction and find it to be closely related to a generating function of the OEIS sequence A000698. We then alter the function to contain the set of primes rather than the natural numbers. The resulting continued fraction has a Laurent series of similar form, with coefficients 2, 6, 48, 594, 10212, 230796, ⋯. We also transform the sequence 1, 1, 1, 1, 1, ⋯ and obtain the Catalan numbers as coefficients of the corresponding Laurent series. We then provide examples which transform into various OEIS sequences.
Define the function \begin{equation} f(x)=\frac{1}{x+\frac{2}{x+\frac{3}{x+\cdots}}} = \underset{k=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{k}{x} \end{equation} Evaluating this function until it converges to 16 decimal places gives something that looks like 1/x; however, after examining the coefficients of the Laurent series, by subtracting likely integer terms we readily find the asymptotic expansion \begin{equation} f(x)\sim\frac{1}{x}-\frac{2}{x^3}+\frac{10}{x^5}-\frac{74}{x^7}+\frac{706}{x^9}-\frac{8162}{x^{11}}+\frac{110410}{x^{13}}-\cdots \quad (x\to\infty), \end{equation} and a search on OEIS gives sequence A000698, which we believe to be a plausible match. We can define a similar function \begin{equation} g(x)=\frac{2}{x+\frac{3}{x+\frac{5}{x+\frac{7}{x+\cdots}}}} = \underset{k=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{p_k}{x} \end{equation} where the numerators are the prime numbers $p_k$. Using the first 9999 primes in the continued fraction, this also converges to 16 decimal places for large enough x, and appears to be described by an integer-coefficient Laurent series \begin{equation} g(x)=\frac{2}{x}-\frac{6}{x^3}+\frac{48}{x^5}-\frac{594}{x^7}+\frac{10212}{x^9}-\frac{230796}{x^{11}}+\frac{6569268}{x^{13}}-\cdots \end{equation}
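The numerical procedure can be sketched outside Mathematica: evaluate the continued fraction from the bottom up at a large x and compare it with the truncated Laurent series using the coefficients quoted above. This is our own sketch, not the author's code; the depth and the value of x are assumptions.

```python
def cf_integers(x, depth):
    """Evaluate K_{k=1..depth} (k/x) = 1/(x + 2/(x + 3/(x + ...))) bottom-up."""
    t = 0.0
    for k in range(depth, 0, -1):
        t = k / (x + t)
    return t

def series(x):
    """Truncated Laurent series with the A000698 coefficients quoted above."""
    coeffs = [1, -2, 10, -74, 706, -8162, 110410]
    return sum(c / x ** (2 * i + 1) for i, c in enumerate(coeffs))

x = 20.0
print(cf_integers(x, 20000))  # continued-fraction value
print(series(x))              # truncated asymptotic series
```

At x = 20 the two values agree to many decimal places, consistent with the 16-decimal-place convergence reported in the text; the same loop with prime numerators reproduces the g(x) comparison.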
This suggests a conjecture for a very interesting relationship \begin{equation} \underset{i=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{p_i}{x} = \frac{2}{x} - \frac{6}{x^3} + \frac{48}{x^5} -\frac{594}{x^7} + \frac{10212}{x^9} -\cdots \end{equation} where $p_1 = 2$, $p_2 = 3$ and so on for the primes, and the capital K is the standard notation for a continued fraction.
A further assessment of many integer sequences inside the continued fraction should be made, and a check of the corresponding Laurent series undertaken. Integral coefficients may prove commonplace.
The first function investigated in this document is also of interest: in the OEIS, the Laurent coefficient sequence A000698 carries the comment “Number of nonisomorphic unlabeled connected Feynman diagrams of order 2n-2 for the electron propagator of quantum electrodynamics (QED), including vanishing diagrams.” The continued fraction clearly has the potential to capture the recursive aspect of the theory, which may be why the series align.
Many thanks to OEIS for providing their invaluable service. Thanks to Wolfram|Alpha for plots, and Mathematica for calculations. This work was undertaken in my spare time while being funded by the EPSRC.
Pascal's Triangle
Ozone Paper Moved Offline
and 2 collaborators
We develop a quantitative method for identifying Stratosphere-to-Troposphere Transport (STT) events and a minimum bound for the quantity of ozone transported, using ozonesondes over Melbourne, Macquarie Island, and Davis.
Binomial Coefficients
Open Science as a Service: Status and future potential from a German non-university research institution perspective
and 1 collaborator
2016 HCT Prelim.
Personal news and content curation is an exciting NLP application. Systems providing this service are often characterised by a collaborative approach that combines human and machine intelligence. As the scope of the problem increases, however, so too does the importance of automation. To this end we propose a novel method for scoring news articles and other related content. It is natural to view this problem in a learning-to-rank framework. The training phase of our model first makes use of a pairwise transform. This alters the problem from the ranking of a whole corpus to many individual pairwise comparisons (is article 'a' better than article 'b'?). This transformed set is then used to determine the optimal weights in a logistic regression model. These can then be used directly to classify the non-transformed test set. We also perform a comprehensive review and selection process on a large range of candidate features. Our final features involve measures of centrality, informativeness, complexity and within-group similarity.
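The pairwise transform described above can be illustrated in a few lines. The sketch below is our own toy reconstruction, not the authors' system: the feature values, item relevances, and training settings are all invented for illustration. Each ranked pair of articles becomes one binary example whose features are the difference of the two articles' feature vectors.

```python
import math

def pairwise_transform(X, y):
    """Turn ranked items into binary comparisons: for each ordered pair (i, j)
    with different relevance, the example is the feature difference
    x_i - x_j, labelled 1 if item i outranks item j and 0 otherwise."""
    pairs, labels = [], []
    for i in range(len(X)):
        for j in range(len(X)):
            if y[i] != y[j]:
                pairs.append([a - b for a, b in zip(X[i], X[j])])
                labels.append(1 if y[i] > y[j] else 0)
    return pairs, labels

def train_logistic(pairs, labels, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression on the pairwise set."""
    w = [0.0] * len(pairs[0])
    for _ in range(steps):
        for x, t in zip(pairs, labels):
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            w = [wi + lr * (t - p) * xi for wi, xi in zip(w, x)]
    return w

# Toy corpus: two features per article; higher y means a better article.
X = [[0.9, 0.2], [0.5, 0.5], [0.1, 0.9], [0.7, 0.1]]
y = [3, 2, 0, 2]
pairs, labels = pairwise_transform(X, y)
w = train_logistic(pairs, labels)
# Rank the untransformed items directly by the learned linear score.
ranking = sorted(range(len(X)), key=lambda i: -sum(wi * xi for wi, xi in zip(w, X[i])))
print(ranking)  # indices from best-scoring article to worst
```

Note that the learned weights score individual (non-transformed) articles directly, which is what makes the pairwise trick attractive: training is a classification problem, but inference is a simple dot product per article.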
Microbial Eukaryote Proposal
and 2 collaborators
#Project Summary (1 page)
Due: January 25, 2016 at 5 PM (local time)
DEB - Biodiversity: Discovery & Analysis Cluster
Solicitation: http://www.nsf.gov/pubs/2015/nsf15609/nsf15609.htm
Cluster description: http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503666&org=DEB&from=home
##Overview
The broad goal of this proposal is to increase overall knowledge of the true diversity of microbial eukaryotes by identifying and culturing microeukaryotes from seagrass beds.
Microorganisms, and specifically marine microbial eukaryotes, represent an underexplored area of diversity. Microbial eukaryotes are known to be important on a number of trophic levels in the marine system CITE, and microbial eukaryotes found in seagrass beds likely contribute to their tremendous biodiversity and roles as important players in nutrient cycling and carbon sequestration in the oceans. We will use a combination of sequencing and culturing techniques to (1) characterize microeukaryotes in a global census of the seagrass Zostera marina, (2) explore microbial eukaryotic diversity across the Order Alismatales, including the 3 separate lineages of seagrasses and their freshwater and brackish relatives, and (3) create a publicly available culture collection of microbial eukaryotes from Zostera marina samples from Bodega Bay, CA.
##Intellectual Merit
Microorganisms comprise the majority of diversity on Earth. Traditionally classified using morphological approaches, the advent of sequence data has dramatically altered our views of microbial evolution and diversity. Specifically, high throughput sequencing technologies have enabled us to explore multiple genes and genomes from microorganisms, giving us insight into genome complexity and function in these unseen organisms. As a result, microbial ecologists are finding themselves in uncharted territory as they analyze large data sets full of "unclassified" organisms, and it is now clear that microorganisms are much more diverse than previously thought.
Although certain pathogenic microeukaryotes have been studied in great detail (e.g. Giardia; see \cite{Adam_2001} for a review), environmental microeukaryotes, specifically marine microeukaryotes, are grossly uncharacterized despite their important functional roles in their ecosystems \cite{Caron_2008}. Novel marine microeukaryotic lineages have previously been found at all phylogenetic scales \cite{Massana_2008}; however, many of these novel organisms remain a mystery to us as they have yet to be cultured. It is estimated that the total diversity of microbial eukaryotes is much higher than what we currently have in culture \cite{Mora_2011} \cite{Pawlowski_2012}.
Seagrasses are a unique system in which to explore marine microbial eukaryotic diversity. These important marine angiosperms provide habitat and food to many rare and endemic species, and contain tremendous levels of biodiversity that have so far been characterized only at the macrobe level \cite{Orth_2006}. Seagrasses are known to be important contributors to biogeochemical processes within the ocean and are one of the largest carbon sinks on earth, sequestering carbon 35 times faster than tropical rainforests \cite{Mcleod_2011}.
Given their importance in the complex marine food web and their contributions to nutrient cycling within the oceans, we hypothesize that seagrass-associated marine microbial eukaryotes are important to both the high levels of macrobe biodiversity within seagrass beds and to their role in nutrient cycling and carbon sequestration in the ocean ecosystem.
We propose to perform a global census of microbial eukaryotes found in association with the leaves, roots, and sediment of the seagrass Zostera marina. We will then expand our investigation to census the microbial eukaryotes found in association with plants across the Order Alismatales, which includes three independent lineages of seagrasses. Concurrently with the aforementioned censuses, we will establish a culture collection of microbial eukaryotes found associated with Zostera marina from Bodega Bay, California. We are uniquely positioned to be successful at the proposed research; using funds provided by the Gordon and Betty Moore Foundation, we have already established a program to explore bacterial diversity within seagrass beds, completed the majority of field work, and formed ongoing collaborations with other seagrass researchers from both the Zostera Experimental Network (ZEN) and other research institutions.
##Broader Impacts
The project we propose here is a global interdisciplinary collaboration that will result in increased knowledge of the biodiversity of an understudied group of organisms from an important marine ecosystem. The proposed project is the first to explore seagrass-associated microbial eukaryotes using both sequence- and culture-based methods, and will generate large amounts of publicly available sequence data and numerous new entries of novel marine organisms to culture collections.
The project we are proposing will include a large outreach component both at the local level (undergraduate researchers, high school students) and the global level (website, collaborators). Undergraduates and local high school students will be intimately involved in creating the culture collection and our progress will be transparently available on our lab website.
Sample Blog Post Math 381
A graph is a structure used in discrete mathematics to show the relationship between various objects. The objects form the vertices of the graph, and two vertices are connected by a line, called an edge, if they satisfy a certain relationship. By formalizing the way we connect the dots, we are able to rigorously prove mathematical claims.
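One common way to make this concrete is an adjacency-list representation, where each vertex maps to the set of vertices it shares an edge with. The graph below is our own toy example, not one from the post:

```python
# A small graph as an adjacency list: vertices are objects, and an edge
# connects two vertices that satisfy the relationship of interest.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def are_connected(g, u, v):
    """True if an edge joins vertices u and v."""
    return v in g.get(u, set())

print(are_connected(graph, "A", "B"))  # True
print(are_connected(graph, "A", "D"))  # False
```

Because edges here are undirected, each edge is recorded in both vertices' sets; claims about the graph (connectivity, degrees, paths) can then be checked mechanically against this structure.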
A quick introduction to version control with Git and GitHub
Many scientists write code as part of their research. Just as experiments are logged in laboratory notebooks, it is important to document the code you use for analysis. However, a few key problems can arise when iteratively developing code that make it difficult to document and track which code version was used to create each result. First, you often need to experiment with new ideas, such as adding new features to a script or increasing the speed of a slow step, but you do not want to risk breaking the currently working code. One often utilized solution is to make a copy of the script before making new edits. However, this can quickly become a problem because it clutters your filesystem with uninformative filenames, e.g. analysis.sh, analysis_02.sh, analysis_03.sh, etc. It is difficult to remember the differences between the versions of the files, and more importantly which version you used to produce specific results, especially if you return to the code months later. Second, you will likely share your code with multiple lab mates or collaborators and they may have suggestions on how to improve it. If you email the code to multiple people, you will have to manually incorporate all the changes each of them sends.
Fortunately, software engineers have already developed software to manage these issues: version control. A version control system (VCS) allows you to track the iterative changes you make to your code. Thus you can experiment with new ideas but always have the option to revert to a specific past version of the code you used to generate particular results. Furthermore, you can record messages as you save each successive version so that you (or anyone else) reviewing the development history of the code can understand the rationale for the given edits. It also facilitates collaboration. Using a VCS, your collaborators can make and save changes to the code, and you can automatically incorporate these changes into the main code base. The collaborative aspect is enhanced with the emergence of websites that host version controlled code.
In this quick guide, we introduce you to one VCS, Git (git-scm.com), and one online hosting site, GitHub (github.com), both of which are currently popular among scientists and programmers in general. More importantly, we hope to convince you that although mastering a given VCS takes time, you can already achieve great benefits by getting started using a few simple commands. Furthermore, not only does using a VCS solve many common problems when writing code, it can also improve the scientific process. By tracking your code development with a VCS and hosting it online, you are performing science that is more transparent, reproducible, and open to collaboration \cite{23448176, 24415924}. There is no reason this framework needs to be limited only to code; a VCS is well-suited for tracking any plain-text files: manuscripts, electronic lab notebooks, protocols, etc.
ProCS15: A DFT-based chemical shift predictor for backbone and C\(\beta\) atoms in proteins
We present ProCS15: A program that computes the isotropic chemical shielding values of backbone and Cβ atoms given a protein structure in less than a second. ProCS15 is based on around 2.35 million OPBE/6-31G(d,p)//PM6 calculations on tripeptides and small structural models of hydrogen-bonding. The ProCS15-predicted chemical shielding values are compared to experimentally measured chemical shifts for Ubiquitin and the third IgG-binding domain of Protein G through linear regression and yield RMSD values below 2.2, 0.7, and 4.8 ppm for carbon, hydrogen, and nitrogen atoms respectively. These RMSD values are very similar to corresponding RMSD values computed using OPBE/6-31G(d,p) for the entire structure for each protein. The maximum RMSD values can be reduced by using NMR-derived structural ensembles of Ubiquitin. For example, for the largest ensemble the largest RMSD values are 1.7, 0.5, and 3.5 ppm for carbon, hydrogen, and nitrogen. The corresponding RMSD values predicted by several empirical chemical shift predictors range between 0.7 - 1.1, 0.2 - 0.4, and 1.8 - 2.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively.
The Prediction of Outcomes Related to the Use of New Drugs in the Real World through Artificial Adaptive Systems
Authenticating Route Transitions in an SPA: What to do About the Developer Console
Growth of 48 Built Environment Bacterial Isolates on Board the International Space Station (ISS)
and 6 collaborators
Abstract
Background: While significant attention has been paid to the potential risk of pathogenic microbes aboard crewed spacecraft, much less has focused on the non-pathogenic microbes in these habitats. Preliminary work has demonstrated that the interior of the International Space Station (ISS) has a microbial community resembling those of built environments on earth. Here we report results of sending 48 bacterial strains, collected from built environments on earth, for a growth experiment on the ISS. This project was a component of Project MERCCURI (Microbial Ecology Research Combining Citizen and University Researchers on ISS).
Results: Of the 48 strains sent to the ISS, 45 of them showed similar growth in space and on earth. The vast majority of species tested in this experiment have also been found in culture-independent surveys of the ISS. Only one bacterial strain that avoided contamination showed significantly different growth in space. Bacillus safensis JPL-MERTA-8-2 grew 60% better in space than on earth.
Conclusions: The majority of bacteria tested were not affected by conditions aboard the ISS in this experiment (e.g., microgravity, cosmic radiation). Further work on Bacillus safensis could lead to interesting insights on why this bacterium grew so much better in space.
UZFor2015 - Timing Analysis Report
The data were taken from \cite{Potter_2011}. All time stamps are in BJD using the TDB time scale; no further transformations are needed. A total of 42 timing measurements exist. However, Potter et al. have not included data points from Dai et al. (2010); see Potter et al. (2011) for details. In this analysis I will start by considering the full set of timings as presented in Potter et al. (2011).
In general I am using IDL for the timing analysis. The cycle or ephemeris numbers have been obtained from IDL> ROUND((BJDMIN-TZERO)/PERIOD), where BJDMIN are the 42 timing measurements, TZERO is an arbitrary timing measurement that defines CYCLE=E=0, and PERIOD is the binary orbital period (0.087865425 days), taken from \cite{Potter_2011}, Table 2. In this work I will use TZERO=BJD 2,450,021.779388. It is slightly different from the TZERO used in \cite{Potter_2011}, in order to introduce some variation and also because I think this choice is closer to the center of mass of the data points.
As a first step I used IDL’s LINFIT code to fit a straight line with the MEASURE_ERROR keyword set to an array holding the timing measurement errors (Table 2, 3rd column, Potter et al. 2011). This way the squared deviations are weighted with 1/σ², where σ is the standard timing error for each timing measurement. This is standard procedure and was also used in Potter et al. (2011). The average or mean timing error for the 42 measurements is 6.0 seconds (the standard deviation is also 6.0 seconds), with 0.74 seconds as the smallest and 17 seconds as the largest error. I have also rescaled the timing measurements by subtracting the first timing measurement from all the others. Rescaling introduces nothing spooky to the analysis and has the advantage of avoiding dynamic-range problems; this is in particular needed for a later analysis using MPFIT. Using LINFIT the resulting reduced χ² value was 95.22 (χ² = 3808.82 with 42-2 degrees of freedom) with the ephemeris (or computed timings) given as \begin{equation} T(E) = BJD~2450021.77890(6) + E \times 0.0878654291(1) \end{equation} The corresponding root-mean-square (RMS) scatter of the data around the best-fit line is 27.5 seconds and the corresponding standard deviation is 27.7 seconds; as expected, the two are similar. To measure the scatter of data around any best-fit model, I will use the RMS quantity. The RMS scatter is 5 times the average timing error and could be indicative of a systematic process.
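The weighted straight-line fit can be reproduced outside IDL with the standard normal-equation solution for 1/σ² weights. The sketch below is our own minimal version, not the LINFIT internals, and the cycle numbers and period used in the check are invented for illustration:

```python
def weighted_linfit(x, y, sigma):
    """Weighted least-squares fit of y = a + b*x with weights 1/sigma^2,
    using the standard closed-form normal-equation solution."""
    w = [1.0 / s ** 2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx ** 2
    a = (Sxx * Sy - Sx * Sxy) / delta   # intercept (reference epoch)
    b = (S * Sxy - Sx * Sy) / delta     # slope (orbital period)
    return a, b

def chi2(x, y, sigma, a, b):
    """Weighted chi-square of the data around the fitted line."""
    return sum(((yi - a - b * xi) / si) ** 2 for xi, yi, si in zip(x, y, sigma))

# Sanity check: timings lying exactly on a line recover epoch and period.
E = [0, 10, 25, 40]
T = [2.0 + 0.0878654 * e for e in E]
a, b = weighted_linfit(E, T, [1e-4] * len(E))
print(a, b)
```

Dividing χ² by the number of degrees of freedom (N minus the two fitted parameters) then gives the reduced χ² quoted in the text.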
As a test the CURVEFIT routine was used in a similar manner. The resulting reduced χ² was also 95.22, matching and confirming the result from the previous section. The /NODERIVATIVE keyword does not change anything, and expressions for the partial derivatives have been included. The RMS also agrees with the results obtained from LINFIT. However, the formal 1σ uncertainties in the best-fit parameters (TZERO and PERIOD) are one order of magnitude smaller than the equivalent values obtained from LINFIT. The data and the best-fit line (obtained from LINFIT) are shown in Fig. [linearfit], with the residuals plotted in Fig. [linearfit_res]. There is no difference when using the results from CURVEFIT.
After fitting a straight line and visually inspecting the residual plots, I cannot see any convincing trend that would justify a quadratic ephemeris (linear plus a quadratic term). What I see is a sinusoidal variation around the best-fit line. Relative to the linear fit, the first timing measurement arrives 20 s earlier than expected. The trend then goes down, increases again to 40 s at E=0, decreases again to a minimum of around 20 s, and increases thereafter. There is no obvious quadratic trend in the residuals in Fig. [linearfit_res].
Although there is no obvious reason to include a quadratic term, I will nevertheless consider a quadratic model. I will do this by again using IDL’s CURVEFIT procedure and the MPFIT package (also IDL), a more sophisticated fitting tool developed by Markwardt that utilizes the Levenberg-Marquardt least-squares minimization algorithm.
The results from CURVEFIT are surprising. The best-fit χ² value was 3718.89, yielding a reduced χ² of 95.36 with 42-3 degrees of freedom. The RMS scatter of the residuals around the quadratic model fit was 31 seconds. This means that the fit became worse compared to a linear ephemeris model. The resulting residual plot is shown in Fig. [quadfit_res]. The corresponding best-fit parameters along with formal uncertainties for a quadratic ephemeris are \begin{eqnarray} T(E) &=& T + P \times E + A \times E^2 \\ &=& 2450021.778895(6) + 0.0878654269(3) \times E + 4.3(5)\times 10^{-14} \times E^2 \end{eqnarray}
I have also used MPFIT to fit a quadratic ephemeris to the Potter et al. (2011) timing data. The resulting χ² is 3718.94 with (42-3) degrees of freedom, yielding a reduced χ² of 95.36. This matches the results obtained with CURVEFIT and is thus confirmed independently, which is really surprising. The RMS scatter of the data around the quadratic ephemeris is around 31 seconds. I will not state the best-fit values for the three model parameters (and their uncertainties) as obtained from MPFIT.
Based on the above result I cannot see that the residuals relative to a linear ephemeris justify the inclusion of a secular term accounting for a quadratic ephemeris. The χ² increases with the extra parameter, which is not what is expected. I will now continue and fit 1- and 2-companion models.
We have considered a linear + 1-LTT model (excluding secular changes as described by a quadratic ephemeris). We have again used MPFIT for this task. The model is taken from Irwin (19??). We considered 10⁷ initial guesses, drawn at random. The initial guesses for the reference epoch and binary period were taken from the best fit obtained with the linear ephemeris model. Initial guesses for the semi-amplitude of the light-time orbit were taken from an estimate of the amplitude as shown in Fig. 2. Initial guesses for the eccentricity covered the interval [0, 0.9995], and initial guesses for the argument of pericenter covered the interval [0, 360] degrees. The initial guess for the orbital period was also estimated from Fig. 2, and initial guesses for the time of pericenter passage were obtained from T0 and the orbital period of the light-time orbit. The methodology follows the same techniques as described in Hinse et al. (2012). Best-fit parameter uncertainties were obtained from the covariance matrix of the best-fit solution as returned by MPFIT and should be considered formal. The best fit had χ² = 185.2 with (42-7) degrees of freedom, resulting in a reduced χ² of 5.3. The corresponding RMS scatter of the data points around the best fit is 15.7 seconds. The best-fit parameters are listed in Table [BestFitParamsLinPlus1LTT] and shown in Fig. [BestFitModel_LinPlus1LTT]. Recalling that the average timing error (of 42 timing measurements) is 6 seconds, the RMS residuals are at a 2.6σ level.
T0 (BJD) | 2,450,021.77924 ± 3 × 10⁻⁵ |
P0 (days) | 0.0878654289 ± 2 × 10⁻¹⁰ |
a sin I (AU) | 0.00043 ± 2 × 10⁻⁵ |
e | 0.65 ± 0.03 |
ω (radians) | 6.89 ± 0.04 |
Tp (BJD) | 2,408,616.0 ± 50 |
P (days) | 6020 ± 35 |
RMS (seconds) | 15.7 |
\label{BestFitParamsLinPlus1LTT}
At the present stage some inconsistencies were discovered in the reported timing uncertainties listed in Table 1 of Potter et al. (2011). For example, the timing uncertainty reported by \cite{Warren_1995} is 0.000023 days, while Potter et al. (2011) reports 0.00003 and 0.00004 days. We tested the possibility that Potter et al. (2011) adopts timing uncertainties from the spread of data around a best-fit linear regression. However, that seems not to be the case: as a test, we used the five timing measurements from \cite{Beuermann1988} as listed in Table 1 of Potter et al. (2011). We fitted a straight line using CURVEFIT as implemented in IDL and found a scatter of 0.00004 to 0.00005 days, depending on the metric used to measure scatter around the best fit. The uncertainties quoted in Potter et al. (2011) are smaller by at least a factor of two. We conclude that Potter et al. (2011) must be in error when quoting timing uncertainties in their Table 1. Similar mistakes apply to the data listed in \cite{Ramsay1994}. Furthermore, after scrutinizing the literature for timing measurements of UZ For, we found several timing measurements that were omitted in Potter et al. (2011). For example, six eclipse timings were reported by \cite{BaileyCropper_1991} with a uniform uncertainty of 0.00006 days, but Potter et al. (2011) only reports three of the six. A total of five new timings were reported by \cite{Ramsay1994}, but only one was listed in Potter et al. (2011). We cannot come up with a good explanation for why those extra timing measurements should be omitted or discarded. All of the new data points were presented in the original works alongside data points used in the analysis of Potter et al. (2011).
In this research we make use of all timing measurements that have been obtained with reasonable accuracy. We have therefore recompiled all available timing measurements from the literature and list them in Table [NewTimingData]. The original HJD(UTC) time stamps from the literature were converted to the BJD(TDB) system using the on-line time utilities \citep{Eastman_2010}. Not all sources of timing measurements provide explicit information on the time standard used; in such cases we assume that HJD time stamps are valid in the UTC standard. This assumption is to some extent justified since the first timing measurement was taken in August 1983, by which time the UTC time standard for astronomical observations was widespread. All new measurements presented in \cite{Potter_2011} were taken directly from their Table 1. Some remarks are in order. Having found additional timing measurements in the literature (otherwise omitted in Potter et al. 2011), we decided to follow a different approach to estimate timing uncertainties. For measurements that were taken over a short time period, one can determine a best-fit line and estimate timing uncertainties from the data scatter. The underlying assumption in this method is that no significant astrophysical signal (interaction between binary components or additional bodies) is contained in the timing measurements over a few consecutive observing nights. Therefore, the scatter around a linear ephemeris should be a reasonable measure of how well the timings were measured. In other words, only the first-order effect of a linear ephemeris is observed; higher-order eclipse timing variation effects are negligible for data sets obtained during a few consecutive nights. The advantage is that for a given data set the same telescope/instrument was used, and weather conditions were not likely to have changed much from night to night.
Furthermore, most likely the same technique was applied to infer the individual time stamps of a given data set. In Table [NewTimingData] we list the original quoted uncertainties from the literature as σlit. We also list the uncertainty obtained from the scatter of the data around a best-fit linear regression line. The corresponding reduced χ² statistic for each fit is tabulated in the third column. From the reduced χ² of each data set one can scale the corresponding uncertainties such that χ²ν = 1 is enforced \citep{Bevington2003Book}. This step is only permitted if high confidence in the applied model is justified; we think that this is the case when time stamps have been obtained over a short time interval. Ultimately, however, the timing uncertainty depends on the sampling of the eclipse event at a sufficiently high signal-to-noise ratio. The \cite{Imamura_1998} data set was split into two since those time stamps were obtained from two observing runs, each lasting a few days. Furthermore, we have calculated three data scatter metrics around the best-fit line: a) the root-mean-square, b) the standard deviation and c) the standard deviation as given by \cite{Bevington2003Book}, defined as \begin{equation} \sigma^2 = \frac{1}{N-2} \sum_{i=1}^{N}(y_{i} - a - bx_{i})^2 \label{BevEq6p15} \end{equation} where N is the number of data points, a and b are the intercept and slope of the linear model, and (xi, yi) is a given timing measurement at a given epoch. We have tested the dependence of the scatter on the weighting used and found no difference in the scatter metrics when applying a weight of one for all measurements. Finally, some additional details need to be mentioned. We only inferred new timing uncertainties for data sets with more than two measurements. For a given data set we used the published ephemeris (orbital period) to calculate the eclipse epochs. For the time stamps presented in \cite{BaileyCropper_1991} no ephemeris was stated.
We therefore used their eclipse cycles as the independent variable to calculate a best-fit line. The reference epoch in each fit was placed in or near the middle of the data set. Two data points were discarded in the present analysis. We removed one time stamp from \cite{Ferrario_1989} due to its excessively large timing uncertainty. Another time stamp was removed from the new data presented in Potter et al. (2011), namely BJD(TDB) 2,454,857.36480850. This eclipse is a duplicate, as it was also observed with the much larger SALT/BVIT instrument, resulting in a lower timing error; we therefore use only the SALT/BVIT measurement. The present analysis thus makes use of a total of 54 time stamps. The average (mean) timing error of the 54 measurements is 5.7 seconds (the standard deviation is 6.5 seconds), with 0.33 seconds as the smallest and 26.5 seconds as the largest error. We have also rescaled the timing measurements by subtracting the first time stamp from all the others. Rescaling does not alter the analysis and has the advantage of avoiding dynamic-range problems when carrying out the least-squares minimization. The total baseline of the data set spans 27 years.
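As an illustration, the three scatter metrics and the χ²ν rescaling described above can be sketched as follows. The epochs and residuals below are made-up stand-ins, not the actual UZ For measurements:

```python
import numpy as np

# Hypothetical eclipse cycle numbers and timings (days); illustrative
# stand-ins only, not the actual UZ For data.
x = np.array([0.0, 11.0, 23.0, 34.0, 45.0, 56.0])
y = 0.0878654 * x + np.array([3e-5, -4e-5, 2e-5, -1e-5, 5e-5, -3e-5])
sigma_lit = np.full(x.size, 6e-5)        # quoted literature uncertainties

# Unweighted best-fit straight line y = a + b*x
b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)

rms = np.sqrt(np.mean(resid**2))                      # a) root-mean-square
std = np.std(resid, ddof=1)                           # b) standard deviation
sigma_bev = np.sqrt(np.sum(resid**2) / (x.size - 2))  # c) Bevington, N-2 dof

# Rescale the quoted uncertainties so that the reduced chi-square equals 1
chi2_nu = np.sum((resid / sigma_lit)**2) / (x.size - 2)
sigma_scaled = sigma_lit * np.sqrt(chi2_nu)
```

By construction, recomputing the reduced χ² with the rescaled uncertainties returns exactly 1.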
BJD(TDB) | σlit | χ²ν | σlit, scaled | σRMS | STD | Eq. [BevEq6p15] | Remarks |
---|---|---|---|---|---|---|---|
2455506.427034 | 0.0000100 | – | – | – | – | – | HIPPO/1.9m, \cite{Potter_2011} |
2455478.485831 | 0.0000100 | – | – | – | – | – | HIPPO/1.9m, \cite{Potter_2011} |
2455450.544621 | 0.0000100 | – | – | – | – | – | HIPPO/1.9m, \cite{Potter_2011} |
2454857.364805 | 0.0000086 | – | – | – | – | – | SALT/BVIT, \cite{Potter_2011} |
2454417.334722 | 0.0000086 | – | – | – | – | – | SALT/SALTICAM, \cite{Potter_2011} |
2453408.288086 | 0.0000086 | 0.198 | 3.83E-6 | 0.0000070 | 0.0000070 | 0.0000100 | UCTPOL/1.9m, \cite{Potter_2011} |
2453407.321574 | 0.0000100 | 0.198 | 4.45E-6 | 0.0000070 | 0.0000070 | 0.0000100 | UCTPOL/1.9m, \cite{Potter_2011} |
2453405.300663 | 0.0000350 | 0.198 | 1.56E-5 | 0.0000070 | 0.0000070 | 0.0000100 | UCTPOL/1.9m, \cite{Potter_2011} |
2453404.334042 | 0.0000600 | – | – | – | – | – | SWIFT, \cite{Potter_2011} |
2452494.839196 | 0.0000870 | – | – | – | – | – | XMM OM, \cite{Potter_2011} |
2452494.575626 | 0.0000350 | – | – | – | – | – | UCTPOL/1.9m, \cite{Potter_2011} |
2452493.609058 | 0.0000700 | – | – | – | – | – | UCTPOL/1.9m, \cite{Potter_2011} |
2451821.702394 | 0.0000100 | – | – | – | – | – | WHT/S-Cam, \cite{de_Bruijne_2002} |
2451528.495434 | 0.0000200 | 0.134 | 7.32E-6 | 0.0000040 | 0.0000050 | 0.0000070 | WHT/S-Cam, \cite{Perryman_2001} |
2451528.407579 | 0.0000200 | 0.134 | 7.32E-6 | 0.0000040 | 0.0000050 | 0.0000070 | WHT/S-Cam, \cite{Perryman_2001} |
2451522.432730 | 0.0000200 | 0.134 | 7.32E-6 | 0.0000040 | 0.0000050 | 0.0000070 | WHT/S-Cam, \cite{Perryman_2001} |
2450021.779400 | 0.0000600 | 2.237 | 8.97E-5 | 0.0000500 | 0.0000600 | 0.0000900 | CTIO 1m/photometer, set II, \cite{Imamura_1998} |
2450021.691660 | 0.0000600 | 2.237 | 8.97E-5 | 0.0000500 | 0.0000600 | 0.0000900 | CTIO 1m/photometer, set II, \cite{Imamura_1998} |
2450018.704120 | 0.0000600 | 2.237 | 8.97E-5 | 0.0000500 | 0.0000600 | 0.0000900 | CTIO 1m/photometer, set II, \cite{Imamura_1998} |
2449755.634995 | 0.0000600 | 0.427 | 3.92E-5 | 0.0000200 | 0.0000300 | 0.0000300 | CTIO 1m/photometer, set I, \cite{Imamura_1998} |
2449755.547165 | 0.0000600 | 0.427 | 3.92E-5 | 0.0000200 | 0.0000300 | 0.0000300 | CTIO 1m/photometer, set I, \cite{Imamura_1998} |
2449753.614046 | 0.0000600 | 0.427 | 3.92E-5 | 0.0000200 | 0.0000300 | 0.0000300 | CTIO 1m/photometer, set I, \cite{Imamura_1998} |
2449752.647586 | 0.0000600 | 0.427 | 3.92E-5 | 0.0000200 | 0.0000300 | 0.0000300 | CTIO 1m/photometer, set I, \cite{Imamura_1998} |
2449733.405017 | 0.0000400 | – | – | – | – | – | EUVE, \cite{Potter_2011} |
2449310.332595 | 0.0000230 | – | – | – | – | – | EUVE, \cite{Warren_1995} |
2449276.680076 | 0.0000230 | – | – | – | – | – | EUVE, \cite{Warren_1995} |
2448784.721419 | 0.0000300 | – | – | – | – | – | HST, \cite{Potter_2011} |
2448483.606635 | 0.0000200 | 4.413 | 4.20E-5 | 0.0000300 | 0.0000400 | 0.0000400 | ROSAT, \cite{Ramsay1994} |
2448483.430915 | 0.0000200 | 4.413 | 4.20E-5 | 0.0000300 | 0.0000400 | 0.0000400 | ROSAT, \cite{Ramsay1994} |
2448483.343045 | 0.0000200 | 4.413 | 4.20E-5 | 0.0000300 | 0.0000400 | 0.0000400 | ROSAT, \cite{Ramsay1994} |
2448482.903785 | 0.0000200 | 4.413 | 4.20E-5 | 0.0000300 | 0.0000400 | 0.0000400 | ROSAT, \cite{Ramsay1994} |
2448482.727955 | 0.0000200 | 4.413 | 4.20E-5 | 0.0000300 | 0.0000400 | 0.0000400 | ROSAT, \cite{Ramsay1994} |
2447829.184858 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447829.096998 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447829.009088 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447828.130518 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447828.042638 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447827.954778 | 0.0000600 | 0.120 | 2.08E-5 | 0.0000170 | 0.0000190 | 0.0000200 | AAT, \cite{BaileyCropper_1991} |
2447437.920514 | 0.0000300 | – | – | – | – | – | 2.3m Steward obs., \cite{Allen_1989} |
2447128.809635 | 0.0009000 | 0.059 | 2.18E-4 | 0.0002000 | 0.0002000 | 0.0002000 | 2.3m Steward obs., \cite{Berriman_1988} |
2447128.722035 | 0.0009000 | 0.059 | 2.18E-4 | 0.0002000 | 0.0002000 | 0.0002000 | 2.3m Steward obs., \cite{Berriman_1988} |
2447127.843835 | 0.0009000 | 0.059 | 2.18E-4 | 0.0002000 | 0.0002000 | 0.0002000 | 2.3m Steward obs., \cite{Berriman_1988} |
2447127.755635 | 0.0009000 | 0.059 | 2.18E-4 | 0.0002000 | 0.0002000 | 0.0002000 | 2.3m Steward obs., \cite{Berriman_1988} |
2447145.064339 | 0.0000600 | 1.046 | 6.14E-5 | 0.0002000 | 0.0002000 | 0.0003000 | AAT, \cite{Ferrario_1989} |
2447127.227739 | 0.0003000 | 1.046 | 3.07E-4 | 0.0002000 | 0.0002000 | 0.0003000 | AAT, \cite{Ferrario_1989} |
2447127.139439 | 0.0003000 | 1.046 | 3.07E-4 | 0.0002000 | 0.0002000 | 0.0003000 | AAT, \cite{Ferrario_1989} |
2447097.792555 | 0.0002500 | 0.069 | 6.58E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2447094.717355 | 0.0002300 | 0.069 | 6.05E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2447091.554235 | 0.0002300 | 0.069 | 6.05E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2447090.587785 | 0.0001200 | 0.069 | 3.16E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2447089.709005 | 0.0003000 | 0.069 | 7.89E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2447088.742545 | 0.0003000 | 0.069 | 7.89E-5 | 0.0000600 | 0.0000500 | 0.0000700 | ESO/MPI 2.2m, \cite{Beuermann1988} |
2446446.973823 | 0.0001600 | – | – | – | – | – | EXOSAT, \cite{Osborne_1988} |
2445567.177636 | 0.0001600 | – | – | – | – | – | EXOSAT, \cite{Osborne_1988} |
\label{NewTimingData}
In this work we do not use the F-test as a statistical tool to perform model selection. The F-test is based on the assumption that uncertainties are Gaussian. This assumption might be violated if the data are affected by time-correlated red noise due to atmospheric effects and/or additional astrophysical effects that influence the shape of the eclipse profile. No studies in the literature have addressed this question, and we therefore judge the outcome of an F-test to be unreliable.
In the following we consider the newly compiled data set with timing uncertainties obtained by rescaling the published uncertainties so as to ensure χ²ν = 1 over short time intervals. We have determined the following linear ephemeris using MPFIT. We followed a Monte Carlo approach and determined a best-fit model by generating 10 million random initial guesses. We used best-fit parameters from LINFIT to obtain a first estimate of the initial epoch and period. Initial guesses were then drawn from a Gaussian distribution centered at the LINFIT values with a standard deviation of five times the formal LINFIT uncertainties. The linear ephemeris is shown in Fig. [Linearfit_NEW]. The resulting reduced χ² value was 162.5 (χ² = 8448.6 with 54-2 degrees of freedom) with the ephemeris (or computed timings) given as \begin{equation} T(E) = BJD_{TDB}~2,450,018.703604(3) + E \times 0.08786542817(9) \end{equation} Residuals are shown in Fig. [Linfit_NEW_Res] and display a systematic variation. The corresponding RMS scatter of the data around the best-fit line is 28.9 seconds. This scatter is five times the average timing error and could be indicative of a systematic process of astrophysical origin.
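The random-restart strategy can be sketched on synthetic data. This is a small-scale stand-in for the 10-million-guess search, with SciPy's least_squares replacing MPFIT and all numbers purely illustrative:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(42)

# Hypothetical eclipse cycle numbers, timings (days) and uncertainties;
# illustrative stand-ins for the real BJD(TDB) data set.
E = np.arange(0, 50, 7, dtype=float)
P_true, T0_true = 0.08786542817, 0.0
t_obs = T0_true + P_true * E + rng.normal(0.0, 2e-5, E.size)
sig = np.full(E.size, 2e-5)

def residuals(p, E, t, s):
    """Weighted residuals of the linear ephemeris T(E) = T0 + P*E."""
    T0, P = p
    return (t - (T0 + P * E)) / s

# First estimate from an ordinary least-squares line (LINFIT analogue) ...
P0, T00 = np.polyfit(E, t_obs, 1)

# ... then many random restarts drawn around that estimate, keeping the
# solution with the lowest chi-square.
best, best_chi2 = None, np.inf
for _ in range(200):
    guess = [T00 + rng.normal(0.0, 1e-4), P0 + rng.normal(0.0, 1e-7)]
    fit = least_squares(residuals, guess, args=(E, t_obs, sig))
    chi2 = np.sum(fit.fun**2)
    if chi2 < best_chi2:
        best, best_chi2 = fit.x, chi2

chi2_nu = best_chi2 / (E.size - 2)   # reduced chi-square of the best fit
```

For a linear model the χ² surface is convex, so the restarts are strictly a safeguard here; they matter more for the multi-parameter LTT model discussed below.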
We have also fitted a quadratic model to the new data set. However, judging by eye from Fig. [Linfit_NEW_Res], there is no obvious upward or downward parabolic trend in the data. Nevertheless, we added a quadratic term and generated 10 million initial guesses to find a best-fit model. The resulting reduced χ² value increased to 165.7 with 54-3 degrees of freedom. We therefore do not consider a quadratic ephemeris in our further analysis.
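The model comparison by reduced χ² can be illustrated with synthetic linear data (all values made up; this is not the UZ For data set):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical timings following a purely linear ephemeris plus noise
E = np.arange(0, 300, 10, dtype=float)
t = 0.0878654 * E + rng.normal(0.0, 2e-5, E.size)
sig = 2e-5

def red_chi2(deg):
    """Reduced chi-square of a polynomial ephemeris of given degree."""
    coeffs = np.polyfit(E, t, deg)
    resid = t - np.polyval(coeffs, E)
    return np.sum((resid / sig)**2) / (E.size - (deg + 1))

chi2_lin, chi2_quad = red_chi2(1), red_chi2(2)
# When the underlying signal is linear, the extra quadratic term removes
# essentially no variance, so losing one degree of freedom can raise the
# reduced chi-square, as seen for the real data set (162.5 -> 165.7).
```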
Using scaled uncertainties we have considered a linear + 1-LTT model, again using MPFIT. The model is taken from Irwin (19??) and described in Hinse et al. (2012). We considered 10⁷ initial guesses. The initial guesses for the reference epoch and binary period were taken from the best fit obtained with the linear ephemeris model. Initial guesses for the semi-amplitude of the light-time orbit were taken from an estimate of the amplitude shown in Fig. 2. Initial guesses for the eccentricity covered the interval [0, 1], and those for the argument of pericentre covered the interval [0, 360] degrees. The initial guess for the orbital period was also estimated from Fig. [Linfit_NEW_Res]. Initial guesses for the time of pericentre passage were obtained from T0 and the orbital period of the light-time orbit. Initial guesses were drawn at random; the methodology follows the same techniques as described in Hinse et al. (2012). Best-fit parameter uncertainties were obtained from the covariance matrix of the best-fit solution as returned by MPFIT. Parameter errors should be considered formal; final errors will be obtained using a bootstrap technique. The best fit had χ² = 717.6 with 47 degrees of freedom, resulting in a reduced χ²ν = 15.3. The corresponding RMS scatter of the data points around the best fit is 20.0 seconds. The best-fit parameters are listed in Table [BestFitParamsLinPlus1LTT_New_AllData] and shown in Fig. [BestFitModel_LinPlus1LTT_New_AllData]. Recalling that the average timing error is 6 seconds, the RMS residuals are at the 3.3σ level, indicating a significant signal of some origin. However, upon close inspection of Fig. [BestFitModel_LinPlus1LTT_New_AllData], the large scatter is mainly due to data obtained by \cite{Beuermann1988}, \cite{Berriman_1988}, \cite{Ferrario_1989} and a single point from \cite{Allen_1989} located between cycle numbers -27,000 and -35,000.
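A minimal sketch of such a linear-plus-one-LTT model, assuming the usual Irwin light-travel-time form, might look as follows. This is our Python stand-in for the IDL/MPFIT implementation; function and parameter names are ours, not the paper's:

```python
import numpy as np

def kepler_E(M, e, tol=1e-12):
    """Solve Kepler's equation M = E - e*sin(E) by Newton iteration."""
    M = np.atleast_1d(np.asarray(M, dtype=float))
    E = M.copy()
    for _ in range(100):
        dE = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E -= dE
        if np.max(np.abs(dE)) < tol:
            break
    return E

def ltt_model(cycle, T0, P0, a_sini_au, e, omega, Tp, P_ltt):
    """Linear ephemeris plus a single light-travel-time (Irwin-form) term.

    Times and periods in days, a*sin(i) in AU, omega in radians."""
    AU_LIGHT_DAYS = 499.004784 / 86400.0     # light crosses 1 AU in ~499 s
    t_lin = T0 + P0 * cycle
    M = 2.0 * np.pi * np.mod((t_lin - Tp) / P_ltt, 1.0)   # mean anomaly
    Ecc = kepler_E(M, e)                                  # eccentric anomaly
    nu = 2.0 * np.arctan2(np.sqrt(1 + e) * np.sin(Ecc / 2),
                          np.sqrt(1 - e) * np.cos(Ecc / 2))  # true anomaly
    tau = a_sini_au * AU_LIGHT_DAYS * (
        (1 - e**2) / (1 + e * np.cos(nu)) * np.sin(nu + omega)
        + e * np.sin(omega))
    return t_lin + tau
```

In a fit, this model would be wrapped in a weighted residual function and minimised from the many random initial guesses described above.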
In the following we investigate the effect on the resulting model of removing those data points.
T0 (BJD) | 2,450,021.77919 ± 3 × 10⁻⁵ |
P0 (days) | 0.0878654283 ± 1 × 10⁻¹⁰ |
a sin I (AU) | 0.00048 ± 3 × 10⁻⁵ |
e | 0.76 ± 0.03 |
ω (radians) | 3.84 ± 0.04 |
Tp (BJD) | 2,461,743.0 ± 53 |
P (days) | 5964 ± 25 |
RMS (seconds) | 20.0 |
χ² | 717.6 |
red. χ² | 15.3 |
\label{BestFitParamsLinPlus1LTT_New_AllData}
To start, we removed a total of eight points: three from \cite{Ferrario_1989}, four from \cite{Berriman_1988} and a single point from \cite{Allen_1989}. The average deviation of those points from our best-fit model (Fig. [BestFitModel_LinPlus1LTT_New_AllData] and Table [BestFitParamsLinPlus1LTT_New_AllData]) was around 35 seconds. The minimum timing uncertainty is 0.33 seconds, the maximum is 13.8 seconds and the mean is 3.7 seconds. This data set is very similar to the data set investigated by Potter et al. (2011). Our new model had χ² = 467.1 and a reduced χ² = 12.0 with 39 degrees of freedom, with an RMS scatter of 13 seconds. We show the resulting best-fit parameters in Fig. [BestFitModel_LinPlus1LTT_RedDataSet1] and Table [BestFitParamsLinPlus1LTT_RedDataSet1]. We first note that the removal of eight data points did not significantly change the model. This suggests that the discarded points do not contribute significantly to constraining the model during the fitting process. Further, we note that our model is significantly different from the first elliptical-term model presented in Potter et al. (2011). The most striking difference is seen in the eccentricity parameter: while they found a near-circular model, we find a highly eccentric solution. Next we continue our analysis by removing an additional six data points.
T0 (BJD) | 2,450,021.69149 ± 4 × 10⁻⁵ |
P0 (days) | 0.0878654287 ± 1 × 10⁻¹⁰ |
a sin I (AU) | 0.00047 ± 3 × 10⁻⁵ |
e | 0.73 ± 0.04 |
ω (radians) | 0.74 ± 0.03 |
Tp (BJD) | 2,455,832.0 ± 28 |
P (days) | 6012 ± 23 |
RMS (seconds) | 13.0 |
χ² | 467.1 |
red. χ² | 12.0 |
\label{BestFitParamsLinPlus1LTT_RedDataSet1}
In this section we investigate the effects of removing a total of 14 data points: six from \cite{Beuermann1988}, three from \cite{Ferrario_1989}, four from \cite{Berriman_1988} and a single point from \cite{Allen_1989}. The minimum timing uncertainty is 0.33 seconds, the maximum is 13.8 seconds and the mean is 3.5 seconds. The resulting best-fit model is shown in Fig. [BestFitModel_LinPlus1LTT_RedDataSet2] with best-fit parameters listed in Table [BestFitParamsLinPlus1LTT_RedDataSet2]. We note that the resulting best-fit model has not changed significantly. Also, the RMS scatter is now comparable to the mean timing uncertainty. From this we conclude that the timing errors should be scaled with $\sqrt{\chi^2_{\nu}}$ if the model is the correct description of the signal.
T0 (BJD) | 2,450,021.69150 ± 3 × 10⁻⁵ |
P0 (days) | 0.0878654279 ± 1 × 10⁻¹⁰ |
a sin I (AU) | 0.00049 ± 3 × 10⁻⁵ |
e | 0.79 ± 0.03 |
ω (radians) | 6.91 ± 0.03 |
Tp (BJD) | 2,467,502 ± 57 |
P (days) | 5901 ± 20 |
RMS (seconds) | 4.4 |
χ² | 161.0 |
red. χ² | 4.9 |
\label{BestFitParamsLinPlus1LTT_RedDataSet2}
Finally, we have also discarded the first two timing measurements, from \cite{Osborne_1988}. The mean timing uncertainty is then 3 seconds. Again we found a best-fit model, shown in Fig. [BestFitModel_LinPlus1LTT_RedDataSet3] with best-fit parameters listed in Table [BestFitParamsLinPlus1LTT_RedDataSet3]. In this case too, the model did not change much compared to the previous investigations. This suggests that the (discarded) data taken at earlier epochs do not play an important role in constraining the model. The RMS scatter of 4 seconds is comparable to the mean uncertainty and does not point towards a signal that could be due to an additional companion.
Based on rescaled timing uncertainties, we find the following. There is no qualitative (visual inspection of residuals) or quantitative (increased χ²) justification for including a quadratic term in any model. Certain data points can be discarded without significantly affecting the best-fit model obtained when all data were included; those data points therefore do not play a significant role in constraining the model. Finally, there is no significant evidence for a second companion when only timing data of good quality are considered.
T0 (BJD) | 2,450,021.69149 ± 4 × 10⁻⁵ |
P0 (days) | 0.0878654279 ± 1 × 10⁻¹⁰ |
a sin I (AU) | 0.00050 ± 5 × 10⁻⁵ |
e | 0.79 ± 0.05 |
ω (radians) | 5.66 ± 0.05 |
Tp (BJD) | 2,467,498 ± 70 |
P (days) | 5900 ± 23 |
RMS (seconds) | 4.0 |
χ² | 160.0 |
red. χ² | 5.2 |
\label{BestFitParamsLinPlus1LTT_RedDataSet3}
http://astroutils.astronomy.ohio-state.edu/time/
Data-driven, interactive article with d3.js plot and IPython Notebook
This week we are launching a brand new look for Authorea and a couple of exciting new features aimed at making scientific research more interactive. Since the very beginning of Authorea, we have been striving to make collaborative scientific writing as easy as possible. But in addition to writing, we are also creating a space for new ways of reading science, and executing it.
For example, if you are a scientist, chances are that you do a lot of data analysis, and you might want to visualize and provide access to your data in fun, new, interactive, more meaningful, data-driven ways, rather than the usual static, data-less plot. There are many ways to create this kind of interactive plot. In this short blog post we will look at two of them.
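One common pattern, sketched here under our own assumptions (the post's actual examples may differ), is to serialise data from an IPython Notebook into JSON that a d3.js plot embedded in the article can load. The file name and record structure below are our choice:

```python
import json

# Hypothetical example: dump a data series computed in a notebook to JSON
# so that a d3.js scatter plot in the page can fetch and render it.
points = [{"x": x, "y": x**2} for x in range(10)]
with open("scatter_data.json", "w") as fh:
    json.dump(points, fh)

# A d3 snippet in the page could then fetch "scatter_data.json" and bind
# each {x, y} object to an SVG circle.
```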
ProCS15: A DFT-based chemical shift predictor for backbone and C\(\beta\) atoms in proteins
We present ProCS15: A program that computes the isotropic chemical shielding values of backbone and Cβ atoms given a protein structure in less than a second. ProCS15 is based on around 2.35 million OPBE/6-31G(d,p)//PM6 calculations on tripeptides and small structural models of hydrogen-bonding. The ProCS15-predicted chemical shielding values are compared to experimentally measured chemical shifts for Ubiquitin and the third IgG-binding domain of Protein G through linear regression and yield RMSD values below 2.2, 0.7, and 4.8 ppm for carbon, hydrogen, and nitrogen atoms respectively. These RMSD values are very similar to corresponding RMSD values computed using OPBE/6-31G(d,p) for the entire structure for each protein. The maximum RMSD values can be reduced by using NMR-derived structural ensembles of Ubiquitin. For example, for the largest ensemble the largest RMSD values are 1.7, 0.5, and 3.5 ppm for carbon, hydrogen, and nitrogen. The corresponding RMSD values predicted by several empirical chemical shift predictors range between 0.7 - 1.1, 0.2 - 0.4, and 1.8 - 2.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively.
Global TB Report 2015: Technical appendix on methods used to estimate the global burden of disease caused by TB
Estimates of the burden of disease caused by TB and measured in terms of incidence, prevalence and mortality are produced annually by WHO using information gathered through surveillance systems (case notifications and death registrations), special studies (including surveys of the prevalence of disease), mortality surveys, surveys of under-reporting of detected TB and in-depth analysis of surveillance data, expert opinion and consultations with countries. This document provides case definitions and describes the methods used in Global TB Report 2015 to derive TB incidence, prevalence and mortality.
Incidence is defined as the number of new and recurrent (relapse) episodes of TB (all forms) occurring in a given year. Recurrent episodes are defined as a new episode of TB in people who have had TB in the past and for whom there was bacteriological confirmation of cure and/or documentation that treatment was completed. In the remainder of this technical document, relapse cases are referred to as recurrent cases because the term is more useful when explaining the estimation of TB incidence. Recurrent cases may be true relapses or a new episode of TB caused by reinfection. In current case definitions, both relapse cases and patients who require a change in treatment are called retreatment cases. However, people with a continuing episode of TB that requires a treatment change are prevalent cases, not incident cases.
Prevalence is defined as the number of TB cases (all forms) at a given point in time.
Mortality from TB is defined as the number of deaths caused by TB in HIV-negative people occurring in a given year, according to the latest revision of the International classification of diseases (ICD-10). TB deaths among HIV-positive people are classified as HIV deaths in ICD-10. For this reason, estimates of deaths from TB in HIV-positive people are presented separately from those in HIV-negative people.
The case fatality rate is the risk of death from TB among people with active TB disease.
The case notification rate refers to new and recurrent episodes of TB notified to WHO for a given year. The case notification rate for new and recurrent TB is important in the estimation of TB incidence. In some countries, however, information on treatment history may be missing for some cases. Patients reported in the unknown history category are considered incident TB episodes (new or recurrent).
Regional analyses are generally undertaken for the six WHO regions (that is, the African Region, the Region of the Americas, the Eastern Mediterranean Region, the European Region, the South-East Asia Region and the Western Pacific Region). For analyses related to MDR-TB, nine epidemiological regions were defined (Figure [fig:epiregions]). These were African countries with high HIV prevalence, African countries with low HIV prevalence, Central Europe, Eastern Europe, high-income countries, Latin America, the Eastern Mediterranean Region (excluding high-income countries), the South-East Asia Region (excluding high-income countries) and the Western Pacific Region (excluding high-income countries).
Risk of Bias Assessments in Ophthalmology Systematic Reviews and Meta-Analyses
Introduction
In order for systematic reviews to make accurate inferences concerning clinical therapy, the primary studies that constitute the review must provide valid results. The Cochrane Handbook for Systematic Reviews states that assessment of validity is an “essential component” of a review that “should influence the analysis, interpretation, and conclusions of the review”(p. 188) \cite{higgins2008cochrane}. The internal validity of a review’s primary studies must be considered to ensure that bias has not compromised the results, leading to inaccurate estimates of summary effect sizes.
In ophthalmology, there is a need for closer examination of the validity of the primary studies comprising a review. As an illustrative example, Chakrabarti et al. (2012) discussed emerging ophthalmic treatments for proliferative (PDR) and nonproliferative diabetic retinopathy (NDR), noting that anti-vascular endothelial growth factor (VEGF) agents consistently received recognition as a possible alternative treatment for diabetic retinopathy. Treatment guidelines from the Scottish Intercollegiate Guidelines Network and the American Academy of Ophthalmology consider anti-VEGF treatment useful merely as an adjunct to laser for the treatment of PDR; however, the Malaysian guidelines indicate that these same agents are to be considered in combination with intraocular steroids and vitrectomy. Most extensively, the National Health and Medical Research Council guidelines recommend the addition of anti-VEGF to laser therapy prior to vitrectomy \cite{Chakrabarti_2012}. The evidence base informing these guidelines comprises trials of questionable quality. Martinez-Zapata et al. (2014) conducted a systematic review of anti-VEGF treatment for diabetic retinopathy, which included 18 randomized controlled trials (RCTs). Of these trials, seven were at high risk of bias while the rest were unclear in one or more domains. The authors concluded, “there is very low or low quality evidence from RCTs for the efficacy and safety of anti-VEGF agents when used to treat PDR over and above current standard treatments" \cite{martinez2014anti}. Thus, low quality evidence provides less confidence regarding the efficacy of treatment, casts doubt on guidelines advocating its use, and impairs clinicians' ability to make sound judgments regarding treatment.
Over the years, researchers have conceived many methods in an attempt to evaluate the validity or methodological quality of primary studies. Initially, checklists and scales were developed to evaluate whether particular aspects of experimental design, such as randomization, blinding, or allocation concealment, were incorporated into the study. These approaches have been criticized for falsely elevating quality scores: many of these scales and checklists include items that have no bearing on the validity of study findings, such as whether investigators used informed consent or whether ethical approval was obtained \cite{7743790}. Furthermore, with the proliferation of quality appraisal scales, it was found that the choice of scale could alter the results of systematic reviews due to differences in the weighting of scale components \cite{10493204}. Two such scales, the Jadad scale (also called the Oxford Scoring System) \cite{8721797} and the Downs and Black checklist \cite{9764259}, were among the popular alternatives. Quality of Reporting of Meta-analyses (QUORUM) \cite{Moher_1999}, the dominant reporting guideline at that time, called for the evaluation of the methodological quality of the primary studies in systematic reviews. This recommendation was short-lived, as the Cochrane Collaboration began to advocate a new approach to assess the validity of primary studies. This new method assesses the risk of bias in six particular design features of primary studies, with each domain receiving a rating of low, unclear, or high risk of bias \cite{higgins2008cochrane}. Following suit, the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), the updated reporting guideline, now calls for the evaluation of bias in all systematic reviews \cite{19622511}.
A previous review examining primary studies from multiple fields of medicine revealed that the failure to incorporate an assessment of methodological quality can result in the implementation of interventions founded on misleading evidence \cite{588948720011204}. Yet, questions remain regarding the assessment of quality and risk of bias in clinical specialties. Therefore, we examined ophthalmology systematic reviews to determine the degree to which methodological quality and risk of bias assessments were conducted. We also evaluated the particular method used in the evaluation, the quality components comprising these assessments, and how systematic reviewers integrated primary studies with low quality or high risk of bias into their results.