Public Articles
Rule of Product
The Rule of Product goes by other names, such as the Multiplication Principle or the Sequential Rule. It is a multiplication-based rule used in certain counting procedures.
“The rule of product states that if an action can be performed by making A choices followed by B choices, then it can be performed AB ways” (Benjamin, 2009).
In other words, suppose there is a set of A choices and a set of B choices; the product AB is the number of ways to make one choice from the first set and one choice from the second. As a side note, notice the word “and”: it is often a cue word for situations where the Rule of Product applies.
Independence of the sets is also required when the Rule of Product is in play. That simply means that set A and set B from earlier are completely separate: the choice made from one does not change the choices available in the other.
Moreover, this rule extends to any number of sets. If we have sets of A, B, C, and D choices, the sizes can still be multiplied to give ABCD ways.
So let's look at a simple example.
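As one possible illustration (the outfit items below are hypothetical and not taken from the original post), the rule can be checked in Python by enumerating every combined choice and comparing the count with the product of the set sizes.

```python
from itertools import product

# Hypothetical choice sets, invented purely for illustration.
shirts = ["red", "blue", "green"]   # A = 3 choices
pants = ["jeans", "khakis"]         # B = 2 choices

# Every way of choosing a shirt AND a pair of pants.
outfits = list(product(shirts, pants))

# The Rule of Product predicts A * B combined choices.
predicted = len(shirts) * len(pants)
print(len(outfits), predicted)
```

The enumeration and the product agree because the available pants do not depend on which shirt was picked, which is the independence condition described above.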
Pascal's triangle
Pascal’s triangle is filled with patterns that can solve many mathematical problems. One of the neat things that the triangle can do is help with binomial expressions. First let’s take a look at binomial expressions. Binomial expressions relate to the sum or difference of two terms such as:
2 things you need to do so your resolutions make you feel like a queen
Stirling Numbers
Stirling numbers of the second kind are used in discrete mathematics to express numerous combinatorial properties, such as the number of ways to partition a set into a given number of nonempty subsets, and they can be computed with a recurrence relation.
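The recurrence relation for these numbers can be sketched in Python. This is a minimal sketch that assumes the standard identity S(n, k) = k·S(n−1, k) + S(n−1, k−1) with S(0, 0) = 1; the function name `stirling2` is only illustrative and does not come from the original post.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition an n-element set into k nonempty subsets."""
    if n == k:
        return 1   # includes S(0, 0) = 1: every element sits in its own block
    if k == 0 or k > n:
        return 0   # no valid partition exists
    # Element n either joins one of the k existing blocks or forms a new block.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

print(stirling2(4, 2))
```

For instance, a 3-element set {a, b, c} splits into two nonempty blocks in three ways: {a}{b, c}, {b}{a, c}, and {c}{a, b}.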
Blog Post 2
1 $\underline{\text{Pascal's Triangle}}$ In discrete mathematics, Pascal’s triangle is the arrangement of binomial coefficients in such a way that a triangle is formed. Although such a pattern was studied centuries before his time, we refer to Pascal’s triangle in relation to Blaise Pascal, a French mathematician. The triangle was originally developed by the ancient Chinese, but Pascal was the first person to discover the importance of all of the patterns that occur within it. His work allegedly stemmed from the popularity of gambling. After considering a question asked to him about gambling with dice, Pascal’s Arithmetical Triangle resulted.
The triangle is created by starting at the top, row 0, with the number 1. Each subsequent row begins and ends with 1, and every interior entry is the sum of the number above and to the left and the number above and to the right. A portion of Pascal’s triangle is shown below.
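The construction rule can be sketched in a few lines of Python. The function name `pascal_rows` is an illustrative choice, and rows are numbered from 0 as in the text.

```python
def pascal_rows(last_row):
    """Return rows 0..last_row of Pascal's triangle as lists of integers."""
    rows = [[1]]                      # row 0 is the single number 1
    for _ in range(last_row):
        prev = rows[-1]
        # Interior entries add the two neighbours directly above them.
        interior = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + interior + [1])
    return rows

for row in pascal_rows(4):
    print(row)
```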
Blog Post 1
1 $\underline{\text{Binomial Coefficient}}$ In mathematics, the binomial coefficient is written as ${n \choose k}$ and can be pronounced as “n choose k.” Binomial coefficients are also sometimes given the notation C(n, k). In this case, the C stands for the word “choices” or “combination” (Benjamin, 2009, p. 8). This is because there are ${n \choose k}$ ways of choosing k elements from a set containing n elements. For example, consider the set A = {1, 2, 3, 4}. If we wish to know how many subsets of size 2 can be created from this set, we are essentially asking how many ways there are of choosing 2 elements from a set with 4 total elements. Therefore, we can identify that k = 2 and n = 4. Hence, we have ${4 \choose 2}$. To solve such a problem, we could write out by hand all the possible combinations. Doing so, one would find that there are six subsets of size two, namely {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, and {3, 4}. However, when we are dealing with large sets, this work quickly becomes tedious. Therefore, it is convenient to use the following formula:
\[{n \choose k} = \frac{n!}{(n-k)! \cdot k!} \]
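As a quick check of the formula against the worked example, the following Python sketch computes ${4 \choose 2}$ from factorials and compares it with a brute-force listing of the 2-element subsets of A. The helper name `binomial` is an illustrative choice.

```python
from itertools import combinations
from math import factorial

def binomial(n, k):
    """n choose k via the factorial formula; the integer division is exact."""
    return factorial(n) // (factorial(n - k) * factorial(k))

A = [1, 2, 3, 4]
subsets = list(combinations(A, 2))   # brute-force list of 2-element subsets
print(subsets)
print(binomial(4, 2), len(subsets))
```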
A Laurent Series Transform for Integer Sequences via Generalised Continued Fractions
We investigate a function defined by a generalised continued fraction and find it to be closely related to a generating function of the OEIS sequence A000698. We then alter the function to contain the primes rather than the natural numbers. This continued fraction function has a Laurent series of similar form, with coefficients 2, 6, 48, 594, 10212, 230796, ⋯. We also transform the sequence 1, 1, 1, 1, 1, ⋯ and obtain the Catalan numbers as coefficients of the corresponding Laurent series. We then provide examples which transform into various OEIS sequences.
Define the function \begin{equation} f(x)=\frac{1}{x+\frac{2}{x+\frac{3}{x+\cdots}}} = \underset{k=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{k}{x} \end{equation} Evaluating this function until it converges to 16 decimal places gives something that looks like 1/x. However, after diagnosing the coefficients of the Laurent series by subtracting likely integer terms, we find that for large x \begin{equation} f(x)\sim\frac{1}{x}-\frac{2}{x^3}+\frac{10}{x^5}-\frac{74}{x^7}+\frac{706}{x^9}-\frac{8162}{x^{11}}+\frac{110410}{x^{13}}-\cdots, \end{equation} and a search on OEIS gives sequence A000698, which we believe to be a plausible match. We can easily imagine a similar function \begin{equation} g(x)=\frac{2}{x+\frac{3}{x+\frac{5}{x+\frac{7}{x+\cdots}}}} = \underset{k=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{p_k}{x} \end{equation} where the numerators are the prime numbers $p_k$. Using the first 9999 primes in the continued fraction, this also converges to 16 decimal places for large enough x, and appears to be described by a Laurent series with integer coefficients \begin{equation} g(x)=\frac{2}{x}-\frac{6}{x^3}+\frac{48}{x^5}-\frac{594}{x^7}+\frac{10212}{x^9}-\frac{230796}{x^{11}}+\frac{6569268}{x^{13}}-\cdots \end{equation}
This suggests a conjecture for a very interesting relationship \begin{equation} \underset{i=1}{\overset{\infty}{\mathrm \large K \normalsize}} \frac{p_i}{x} = \frac{2}{x} - \frac{6}{x^3} + \frac{48}{x^5} -\frac{594}{x^7} + \frac{10212}{x^9} -\cdots \end{equation} where $p_1 = 2$, $p_2 = 3$ and so on for the primes, and the capital K notation denotes the continued fraction.
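The leading coefficients can also be checked with exact integer arithmetic rather than numerical evaluation. The Python sketch below is not the method used in this work (the calculations here were done in Mathematica); it is a minimal illustration that expands a finite truncation of the continued fraction as a power series in $u = 1/x$. The function name `laurent_coefficients` and the truncation depth of eight numerators are assumptions made for the example.

```python
def laurent_coefficients(numerators, order):
    """Coefficients of u**0 .. u**order (u = 1/x) for the finite continued
    fraction a_1/(x + a_2/(x + ... + a_n/x)), using exact integers."""
    tail = [0] * (order + 1)              # series of the levels below the current one
    for a in reversed(list(numerators)):
        denom = [1] + tail[:order]        # 1 + u * tail, truncated at u**order
        inv = [1] + [0] * order           # power-series inverse of denom
        for n in range(1, order + 1):
            inv[n] = -sum(denom[j] * inv[n - j] for j in range(1, n + 1))
        tail = [0] + [a * c for c in inv[:order]]   # a * u / (1 + u * tail)
    return tail

naturals = laurent_coefficients(range(1, 9), 13)
primes = laurent_coefficients([2, 3, 5, 7, 11, 13, 17, 19], 13)
print(naturals[1::2])   # coefficients of 1/x, 1/x^3, ..., 1/x^13
print(primes[1::2])
```

The coefficient of $1/x^{2n+1}$ depends only on the first $n+1$ numerators, so with eight numerators the terms through $1/x^{13}$ are unaffected by the truncation.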
A further assessment of many integer sequences inside the continued fraction should be made, and the corresponding Laurent series checked. It may be commonplace to find integer coefficients.
The first function investigated in this document is also of interest. In OEIS, the Laurent coefficient sequence A000698 carries the comment “Number of nonisomorphic unlabeled connected Feynman diagrams of order 2n-2 for the electron propagator of quantum electrodynamics (QED), including vanishing diagrams.” The continued fraction clearly has the potential to capture the recursive aspect of the theory, and this may be why the series align.
Many thanks to OEIS for providing their invaluable service. Thanks to Wolfram|Alpha for plots as below, and Mathematica for calculations. This work was undertaken in my spare time while being funded by the EPSRC.
Pascal's Triangle
Ozone Paper Moved Offline
and 2 collaborators
We develop a quantitative method for determining Stratosphere to Troposphere Transport events (STTs) and a minimum bound for this transported ozone quantity using ozonesondes over Melbourne, Macquarie Island, and Davis.
Binomial Coefficients
Open Science as a Service: Status and future potential from a German non-university research institution perspective
and 1 collaborator
2016HCT Prelim.
Personal news and content curation is an exciting NLP application. Systems providing this service are often characterised by a collaborative approach that combines human and machine intelligence. As the scope of the problem increases however, so too does the importance of automation. To this end we propose a novel method for scoring news articles and other related content. It is natural to view this problem in a learning-to-rank framework. The training phase of our model first makes use of a pairwise transform. This alters the problem from the ranking of a whole corpus to many individual pairwise comparisons (is article 'a' better than article 'b'). This transformed set is then used to determine the optimal weights in a logistic regression model. These can then be used directly to classify the non-transformed test set. We also perform a comprehensive review and selection process on a large range of candidate features. Our final features involve measures of centrality, informativeness, complexity and within-group similarity.
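The pairwise transform can be sketched in plain Python. This is a generic illustration under stated assumptions, not the system described in the abstract: the function name `pairwise_transform`, the toy feature vectors and the relevance labels are all hypothetical, and the logistic regression fit that would follow is omitted.

```python
from itertools import combinations

def pairwise_transform(features, relevance):
    """Turn ranked items into difference vectors with binary labels
    (1 if the first item of the pair is the better one, else 0)."""
    diffs, labels = [], []
    for i, j in combinations(range(len(features)), 2):
        if relevance[i] == relevance[j]:
            continue                      # ties carry no preference information
        diffs.append([a - b for a, b in zip(features[i], features[j])])
        labels.append(1 if relevance[i] > relevance[j] else 0)
    return diffs, labels

# Hypothetical toy data: three articles with two features each.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relevance = [2, 1, 1]
diffs, labels = pairwise_transform(features, relevance)
print(diffs, labels)
```

A linear classifier trained on such difference vectors yields one weight per feature, and the same weights can then score individual, untransformed articles.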
Microbial Eukaryote Proposal
and 2 collaborators
#Project Summary (1 page)
Due: January 25, 2016 at 5 PM (local time)
DEB - Biodiversity: Discovery & Analysis Cluster
Solicitation: http://www.nsf.gov/pubs/2015/nsf15609/nsf15609.htm
Cluster description: http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503666&org=DEB&from=home
##Overview
The broad goal with this proposal is to increase the overall knowledge of the true diversity of microbial eukaryotes by identifying and culturing microeukaryotes from seagrass beds.
Microorganisms, and specifically marine microbial eukaryotes, represent an underexplored area of diversity. Microbial eukaryotes are known to be important on a number of trophic levels in the marine system CITE, and microbial eukaryotes found in seagrass beds likely contribute to their tremendous biodiversity and roles as important players in nutrient cycling and carbon sequestration in the oceans. We will use a combination of sequencing and culturing techniques to (1) characterize microeukaryotes in a global census of the seagrass Zostera marina, (2) explore microbial eukaryotic diversity across the Order Alismatales, including the 3 separate lineages of seagrasses and their freshwater and brackish relatives, and (3) create a publicly available culture collection of microbial eukaryotes from Zostera marina samples from Bodega Bay, CA.
##Intellectual Merit
Microorganisms comprise the majority of diversity on Earth. They were traditionally classified using morphological approaches, but the advent of sequence data has dramatically altered our views of microbial evolution and diversity. Specifically, high throughput sequencing technologies have enabled us to explore multiple genes and genomes from microorganisms, giving us insight into genome complexity and function in these unseen organisms. As a result, microbial ecologists are finding themselves in uncharted territory as they analyze large data sets full of "unclassified" organisms, and it is now clear that microorganisms are much more diverse than previously thought.
Although certain pathogenic microeukaryotes have been studied in great detail (e.g., Giardia; see \cite{Adam_2001} for review), environmental microeukaryotes, specifically marine microeukaryotes, are largely uncharacterized despite their important functional roles in their ecosystems \cite{Caron_2008}. Novel marine microeukaryotic lineages have previously been found at all phylogenetic scales \cite{Massana_2008}; however, many of these novel organisms are still a mystery to us as they have yet to be cultured. It is estimated that the total diversity of microbial eukaryotes is much higher than what we currently have in culture \cite{Mora_2011} \cite{Pawlowski_2012}.
Seagrasses are a unique system in which to explore marine microbial eukaryotic diversity. These important marine angiosperms provide habitat and food to many rare and endemic species, and harbor tremendous levels of biodiversity that have so far been characterized only at the macrobe level \cite{Orth_2006}. Seagrasses are known to be important contributors to biogeochemical processes within the ocean and are one of the largest carbon sinks on Earth, sequestering carbon 35 times faster than tropical rainforests \cite{Mcleod_2011}.
Given their importance in the complex marine food web and their contributions to nutrient cycling within the oceans, we hypothesize that seagrass-associated marine microbial eukaryotes are important to both the high levels of macrobe biodiversity within seagrass beds and to their role in nutrient cycling and carbon sequestration in the ocean ecosystem.
We propose to perform a global census of microbial eukaryotes found in association with the leaves, roots, and sediment of the seagrass Zostera marina. We will then expand our investigation to census the microbial eukaryotes found in association with plants across the Order Alismatales, which includes three independent lineages of seagrasses. Concurrently with the aforementioned censuses, we will establish a culture collection of microbial eukaryotes found associated with Zostera marina from Bodega Bay, California. We are uniquely positioned to succeed at the proposed research: using funds provided by the Gordon and Betty Moore Foundation, we have already established a program to explore bacterial diversity within seagrass beds, completed the majority of the field work, and formed ongoing collaborations with other seagrass researchers from both the Zostera Experimental Network (ZEN) and other research institutions.
##Broader Impacts
The project we propose here is a global interdisciplinary collaboration that will result in increased knowledge of the biodiversity of an understudied group of organisms from an important marine ecosystem. The proposed project is the first to explore seagrass-associated microbial eukaryotes using both sequence-based and culture-based methods, and will generate large amounts of publicly available sequence data and numerous new entries of novel marine organisms in culture collections.
The project we are proposing will include a large outreach component both at the local level (undergraduate researchers, high school students) and the global level (website, collaborators). Undergraduates and local high school students will be intimately involved in creating the culture collection and our progress will be transparently available on our lab website.
Sample Blog Post Math 381
A graph is a structure in discrete math used to show the relationships between various objects. The objects form the vertices of the graph. Two vertices are connected by a line if they satisfy a certain relationship. We call these lines edges. By formalizing the way we connect the dots, we are able to rigorously prove mathematical claims.
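As a small illustration of this definition, the Python sketch below builds a graph from a relationship. The vertex set and the divisibility relationship are hypothetical choices made for the example; any other relationship between objects could be substituted.

```python
from itertools import combinations

vertices = [1, 2, 3, 4, 5, 6]

def related(a, b):
    """Hypothetical relationship: the smaller number divides the larger one."""
    return b % a == 0

# An edge joins every pair of vertices that satisfies the relationship.
edges = [(a, b) for a, b in combinations(vertices, 2) if related(a, b)]

# Adjacency lists record, for each vertex, the vertices it shares an edge with.
adjacency = {v: [] for v in vertices}
for a, b in edges:
    adjacency[a].append(b)
    adjacency[b].append(a)

print(edges)
```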
A quick introduction to version control with Git and GitHub
Many scientists write code as part of their research. Just as experiments are logged in laboratory notebooks, it is important to document the code you use for analysis. However, a few key problems can arise when iteratively developing code that make it difficult to document and track which code version was used to create each result. First, you often need to experiment with new ideas, such as adding new features to a script or increasing the speed of a slow step, but you do not want to risk breaking the currently working code. One often utilized solution is to make a copy of the script before making new edits. However, this can quickly become a problem because it clutters your filesystem with uninformative filenames, e.g. analysis.sh, analysis_02.sh, analysis_03.sh, etc. It is difficult to remember the differences between the versions of the files, and more importantly which version you used to produce specific results, especially if you return to the code months later. Second, you will likely share your code with multiple lab mates or collaborators and they may have suggestions on how to improve it. If you email the code to multiple people, you will have to manually incorporate all the changes each of them sends.
Fortunately, software engineers have already developed software to manage these issues: version control. A version control system (VCS) allows you to track the iterative changes you make to your code. Thus you can experiment with new ideas but always have the option to revert to a specific past version of the code you used to generate particular results. Furthermore, you can record messages as you save each successive version so that you (or anyone else) reviewing the development history of the code is able to understand the rationale for the given edits. Also, it facilitates collaboration. Using a VCS, your collaborators can make and save changes to the code, and you can automatically incorporate these changes to the main code base. The collaborative aspect is enhanced with the emergence of websites that host version controlled code.
In this quick guide, we introduce you to one VCS, Git (git-scm.com), and one online hosting site, GitHub (github.com), both of which are currently popular among scientists and programmers in general. More importantly, we hope to convince you that although mastering a given VCS takes time, you can already achieve great benefits by getting started using a few simple commands. Furthermore, not only does using a VCS solve many common problems when writing code, it can also improve the scientific process. By tracking your code development with a VCS and hosting it online, you are performing science that is more transparent, reproducible, and open to collaboration \cite{23448176, 24415924}. There is no reason this framework needs to be limited only to code; a VCS is well-suited for tracking any plain-text files: manuscripts, electronic lab notebooks, protocols, etc.
ProCS15: A DFT-based chemical shift predictor for backbone and C\(\beta\) atoms in proteins
We present ProCS15: A program that computes the isotropic chemical shielding values of backbone and Cβ atoms given a protein structure in less than a second. ProCS15 is based on around 2.35 million OPBE/6-31G(d,p)//PM6 calculations on tripeptides and small structural models of hydrogen-bonding. The ProCS15-predicted chemical shielding values are compared to experimentally measured chemical shifts for Ubiquitin and the third IgG-binding domain of Protein G through linear regression and yield RMSD values below 2.2, 0.7, and 4.8 ppm for carbon, hydrogen, and nitrogen atoms respectively. These RMSD values are very similar to corresponding RMSD values computed using OPBE/6-31G(d,p) for the entire structure for each protein. The maximum RMSD values can be reduced by using NMR-derived structural ensembles of Ubiquitin. For example, for the largest ensemble the largest RMSD values are 1.7, 0.5, and 3.5 ppm for carbon, hydrogen, and nitrogen. The corresponding RMSD values predicted by several empirical chemical shift predictors range between 0.7 - 1.1, 0.2 - 0.4, and 1.8 - 2.8 ppm for carbon, hydrogen, and nitrogen atoms, respectively.
THE PREDICTION OF OUTCOMES RELATED TO THE USE OF NEW DRUGS IN THE REAL WORLD THROUGH ARTIFICIAL ADAPTIVE SYSTEMS.
Authenticating Route Transitions in an SPA: What to do About the Developer Console
Growth of 48 Built Environment Bacterial Isolates on Board the International Space Station (ISS)
and 6 collaborators
Abstract
Background: While significant attention has been paid to the potential risk of pathogenic microbes aboard crewed spacecraft, much less has focused on the non-pathogenic microbes in these habitats. Preliminary work has demonstrated that the interior of the International Space Station (ISS) has a microbial community resembling those of built environments on earth. Here we report results of sending 48 bacterial strains, collected from built environments on earth, for a growth experiment on the ISS. This project was a component of Project MERCCURI (Microbial Ecology Research Combining Citizen and University Researchers on ISS).
Results: Of the 48 strains sent to the ISS, 45 of them showed similar growth in space and on earth. The vast majority of species tested in this experiment have also been found in culture-independent surveys of the ISS. Only one bacterial strain that avoided contamination showed significantly different growth in space. Bacillus safensis JPL-MERTA-8-2 grew 60% better in space than on earth.
Conclusions: The majority of bacteria tested were not affected by conditions aboard the ISS in this experiment (e.g., microgravity, cosmic radiation). Further work on Bacillus safensis could lead to interesting insights into why this bacterium grew so much better in space.