Public Articles
MolSSI Education: Empowering the Next Generation of Computational Molecular Scientists.
and 3 collaborators
The Molecular Sciences Software Institute (MolSSI) is a research and education center that supports software development in the computational molecular sciences. One of MolSSI's core objectives is to provide education and training for the next generation of computational researchers. MolSSI Education targets various career stages and skill levels through its live workshops, online resources, and software fellowship program, focusing its efforts within four areas, including programming and software development, faculty and curriculum development, and the software fellowship program. This article delineates educational efforts at MolSSI, overall goals, and resources that can be useful to researchers in the computational molecular sciences.
How useful are lexicostatistical and phylogenetic methods in plotting the migration of Polynesian peoples across Oceania?
Genotypic variation rather than ploidy level determines functional trait expression in a foundation tree species in the presence and absence of environmental stress
and 5 collaborators
The First Paper
Our paper is motivated by the recent publication [98] in which the FDTDM was used to compute the LDR for smoke clusters of up to four monomers in order to analyze implications of depolarization lidar observations from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite [115]. We extend the analysis of Ref. [98] by considering a more comprehensive and representative set of soot-particle models and using what we believe to be more relevant refractive indices. \citep{Mishchenko_2013}
Right cardiac chambers echo-bubble contrast in a patient with decompression sickness: A case report and a literature review
and 2 collaborators
BIO 465 Capstone Project Introduction
and 1 collaborator
Lecture 12- Entanglement
and 2 collaborators
Robotic Pill for Biomarker and Fluid Sampling in the Gastrointestinal Tract
and 6 collaborators
DarkCideS 1.0, a global database for bats in karsts and caves
and 35 collaborators
Basic object types in R: vectors and lists
and 1 collaborator
1-Nitropyrene Exposure as Genotoxicity and Oxidative Stress Biomarker
Development of Self-folded Corrugated Structures Using Automatic Origami Technique by Inkjet Printing
and 1 collaborator
Four Bar Motion
Trends and Health Risks of Heavy Metals Present in Sewage Sludge: A Situational Analysis in the Indian Context
and 4 collaborators
Smart electronic nose enabled by an all-feature olfactory algorithm (AFOA)
and 6 collaborators
The five monkeys and critical thinking
and 1 collaborator
The effects of forest edge and nest height on nest predation in a U.K. deciduous forest fragment
Nvidia Hopper GPU and Grace CPU Highlights
and 2 collaborators
A quick introduction to version control with Git and GitHub
and 2 collaborators
Many scientists write code as part of their research. Just as experiments are logged in laboratory notebooks, it is important to document the code you use for analysis. However, a few key problems can arise when iteratively developing code that make it difficult to document and track which code version was used to create each result. First, you often need to experiment with new ideas, such as adding new features to a script or increasing the speed of a slow step, but you do not want to risk breaking the currently working code. One commonly used solution is to make a copy of the script before making new edits. However, this can quickly become a problem because it clutters your filesystem with uninformative filenames, e.g. analysis.sh, analysis_02.sh, analysis_03.sh, etc. It is difficult to remember the differences between the versions of the files, and more importantly which version you used to produce specific results, especially if you return to the code months later. Second, you will likely share your code with multiple lab mates or collaborators and they may have suggestions on how to improve it. If you email the code to multiple people, you will have to manually incorporate all the changes each of them sends.
Fortunately, software engineers have already developed software to manage these issues: version control. A version control system (VCS) allows you to track the iterative changes you make to your code. Thus you can experiment with new ideas but always have the option to revert to a specific past version of the code you used to generate particular results. Furthermore, you can record messages as you save each successive version so that you (or anyone else) reviewing the development history of the code is able to understand the rationale for the given edits. Also, it facilitates collaboration. Using a VCS, your collaborators can make and save changes to the code, and you can automatically incorporate these changes to the main code base. The collaborative aspect is enhanced with the emergence of websites that host version controlled code.
In this quick guide, we introduce you to one VCS, Git (git-scm.com), and one online hosting site, GitHub (github.com), both of which are currently popular among scientists and programmers in general. More importantly, we hope to convince you that although mastering a given VCS takes time, you can already achieve great benefits by getting started using a few simple commands. Furthermore, not only does using a VCS solve many common problems when writing code, it can also improve the scientific process. By tracking your code development with a VCS and hosting it online, you are performing science that is more transparent, reproducible, and open to collaboration \cite{23448176, 24415924}. There is no reason this framework needs to be limited only to code; a VCS is well-suited for tracking any plain-text files: manuscripts, electronic lab notebooks, protocols, etc.
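As a sketch of how those "few simple commands" might look in practice (assuming Git is installed and your user.name/user.email are configured; the filenames and commit messages below are invented for illustration):

```shell
# Create a repository and record a first version of a script.
mkdir analysis && cd analysis
git init                                   # start tracking this directory
echo 'echo "running analysis"' > analysis.sh
git add analysis.sh                        # stage the new file
git commit -m "Add initial analysis script"

# Edit freely; each commit is a labelled, recoverable version,
# replacing the analysis_02.sh-style copies described above.
echo 'echo "faster step"' >> analysis.sh
git commit -am "Speed up the slow step"

git log --oneline                          # review the saved versions
```

Each commit message records the rationale for the edit, so the development history itself documents which version produced which result.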
Design and empirical validation of effectiveness of LANGA, a game-based platform for second-language learning
and 2 collaborators
Scratch Pad / Multiple Dirichlet Convolutions
Consider $$ \phi(s) = c $$ easy, as the moments are constant... arbitrary $$ \phi(s) = c_1 c_2^s $$ again both arbitrary constants... special points $s = 1$, $s = 0$ give $c_1 = \phi(0)$, $c_2 = \phi(1)/\phi(0)$. Consider the first meaningful case $$ \phi(s) = c_1 c_2 ^ s \Gamma(c_3 + c_4 s) $$ $$ \phi(0) = c_1 \Gamma(c_3) $$ $$ \phi(1) = c_1 c_2 \Gamma(c_3 + c_4) $$ $$ \frac{\phi(1)}{\phi(0)} = c_2 \frac{\Gamma(c_3+c_4)}{\Gamma(c_3)} $$ which, if $c_4$ is an integer, can be expanded as a Pochhammer symbol. $$ \phi(\frac{1}{c_4}) = c_1 c_2 ^ {1/c_4} \Gamma(c_3 + 1) = c_1 c_2 ^ {1/c_4} c_3 \Gamma(c_3) $$ $$ \phi(\frac{1}{c_4})\phi^{-1}(0) = c_2 ^ {1/c_4} c_3 $$ Also consider the points where $c_3 + c_4 s \in \{1, 2\}$, at which the gamma term reduces to 1.
$$ \phi\left(\frac{1-c_3}{c_4}\right) = c_1 c_2 ^ \frac{1-c_3}{c_4} $$ $$ \phi\left(\frac{2-c_3}{c_4}\right) = c_1 c_2 ^ \frac{2-c_3}{c_4} $$ noting that $$ \phi\left(\frac{2-c_3}{c_4}\right)\phi^{-2}\left(\frac{1-c_3}{c_4}\right) = c_1^{-1} c_2^{c_3/c_4} $$ $$ \phi\left(\frac{2-c_3}{c_4}\right)\phi^{-1}\left(\frac{1-c_3}{c_4}\right) = c_2^{1/c_4} $$
Which means (great result) $$ \phi(\frac{1}{c_4})\phi^{-1}(0)\phi^{-1}\left(\frac{2-c_3}{c_4}\right)\phi\left(\frac{1-c_3}{c_4}\right) = c_3 $$ if we find enough equations for each parameter, there is a chance of a self consistent/iterative solution? We would ideally want something like c1(c2, c3, c4), etc. unless it doesn’t really matter... $$ \phi^{c_4}\left(\frac{2-c_3}{c_4}\right)\phi^{-c_4}\left(\frac{1-c_3}{c_4}\right) = c_2 $$ $$ \frac{\phi(0)}{\Gamma(c_3)} = c_1 $$ $$ \frac{1}{\log_{c_2}\left(\phi\left(\frac{2-c_3}{c_4}\right)\phi^{-1}\left(\frac{1-c_3}{c_4}\right)\right)} = c_4 $$
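A minimal numerical check of this parameter recovery, with invented parameter values (chosen so every gamma argument is positive), using Python's `math.gamma`:

```python
import math

# Invented parameter values for phi(s) = c1 * c2**s * Gamma(c3 + c4*s).
c1, c2, c3, c4 = 1.7, 2.3, 1.9, 0.8

def phi(s):
    """Moment function phi(s) = c1 * c2**s * Gamma(c3 + c4*s)."""
    return c1 * c2**s * math.gamma(c3 + c4 * s)

# The combination of the four special points recovers c3 ...
lhs = phi(1 / c4) / phi(0) / phi((2 - c3) / c4) * phi((1 - c3) / c4)
print(lhs)  # -> 1.9 (i.e. c3), up to rounding

# ... and the ratio of the two Gamma-free points, raised to c4, recovers c2.
c2_rec = (phi((2 - c3) / c4) / phi((1 - c3) / c4)) ** c4
print(c2_rec)  # -> 2.3 (i.e. c2), up to rounding
```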
However, there are a couple of important factors where we can’t stray into territory that is broken due to analytic continuation...
For some close parameters on at least one example, this works for c1 and c3, the goal is then to get the remaining ones using those.
Consider the log derivative as a secret weapon... measure $$ \theta(s) = \frac{d}{ds} \log \phi(s) = \log(c_2) + c_4 \psi(c_3 + c_4 s) $$
$$ \theta(0) = \log(c_2) + c_4 \psi(c_3) $$
Also we know that $\psi(1) = -\gamma$ and $\psi(2) = 1 - \gamma$... $$ \theta\left(\frac{1-c_3}{c_4}\right) = \log(c_2) - c_4 \gamma $$ $$ \theta\left(\frac{2-c_3}{c_4}\right) = \log(c_2) + c_4 (1-\gamma) $$ then $$ \theta\left(\frac{2-c_3}{c_4}\right) - \theta\left(\frac{1-c_3}{c_4}\right) = c_4 $$ $$ \theta\left(\frac{2-c_3}{c_4}\right) + \theta\left(\frac{1-c_3}{c_4}\right) = 2\log(c_2) + c_4 (1- 2 \gamma) $$
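A quick numerical sanity check of these θ relations (parameter values invented; θ is approximated by a central finite difference of log φ, so no explicit digamma routine is needed):

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant

# Invented parameter values for phi(s) = c1 * c2**s * Gamma(c3 + c4*s).
c1, c2, c3, c4 = 1.7, 2.3, 1.9, 0.8

def log_phi(s):
    return math.log(c1) + s * math.log(c2) + math.lgamma(c3 + c4 * s)

def theta(s, h=1e-6):
    """theta(s) = d/ds log phi(s), via central finite difference."""
    return (log_phi(s + h) - log_phi(s - h)) / (2 * h)

p1 = (1 - c3) / c4   # here c3 + c4*s = 1, so the psi term is psi(1) = -gamma
p2 = (2 - c3) / c4   # here c3 + c4*s = 2, so the psi term is psi(2) = 1 - gamma

print(theta(p2) - theta(p1))   # -> c4 = 0.8
print(theta(p2) + theta(p1))   # -> 2*log(c2) + c4*(1 - 2*gamma)
```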
Consider the more interesting $$ \phi(s) = c_1 c_2^s \frac{\Gamma(c_3 + c_4 s)}{\Gamma(c_5 + c_6 s)} $$ key points include $$ \phi(0) = c_1 \frac{\Gamma(c_3)}{\Gamma(c_5)} $$ $$ \phi(1) = c_1 c_2 \frac{\Gamma(c_3 + c_4)}{\Gamma(c_5 + c_6)} $$ and potentially important points $$ \phi\left(\frac{1-c_3}{c_4}\right) = c_1 c_2^\frac{1-c_3}{c_4} \frac{1}{\Gamma(c_5 + c_6 \frac{1-c_3}{c_4} )} $$ $$ \phi\left(\frac{2-c_3}{c_4}\right) = c_1 c_2^\frac{2-c_3}{c_4}... $$ $$ \phi\left(\frac{1-c_5}{c_6}\right) = c_1 c_2^\frac{1-c_5}{c_6}... $$ $$ \phi\left(\frac{2-c_5}{c_6}\right) = c_1 c_2^\frac{2-c_5}{c_6}... $$
We also have $$ \theta(s) = \log(c_2) + c_4 \psi( c_3 + c_4 s) - c_6 \psi( c_5 + c_6 s) $$ $$ \theta(0) = \log(c_2) + c_4 \psi( c_3 ) - c_6 \psi( c_5 ) $$ $$ \theta(1) = \log(c_2) + c_4 \psi( c_3 + c_4) - c_6 \psi( c_5 + c_6) $$ and, writing $p_{134} = \frac{1-c_3}{c_4}$ and $p_{234} = \frac{2-c_3}{c_4}$ (the points at which $c_3 + c_4 s$ equals 1 and 2), $$ \theta(p_{134}) = \log(c_2) - \gamma c_4 - c_6 \psi( c_5 + c_6 p_{134}) $$ $$ \theta(p_{234}) = \log(c_2) + (1-\gamma) c_4 - c_6 \psi( c_5 + c_6 p_{234}) $$ etc.
Try $$ \theta(p_{234}) - \theta(p_{134}) = c_4 - c_6 \psi( c_5 + c_6 p_{234}) + c_6 \psi( c_5 + c_6 p_{134}) $$
We might consider the function f3456(s) that sets $$ \frac{\Gamma(c_3 + c_4 f_{3456}(s))}{\Gamma(c_5 + c_6 f_{3456}(s))} = 1 $$
In each case, we really need to think about a value of s that specifically exposes a particular parameter... For Γ(a + bs), to expose the a we can consider a root finder for Γ(x*)−a = 0 and then scale the result as (x* − a)/b... then we just evaluate all the other terms at that point, so in general
$$ \phi(s) = \frac{\Gamma(a_1 + b_1 s)...\Gamma(a_k + b_k s)}{\Gamma(c_1 + d_1 s)...\Gamma(c_l + d_l s)} $$ solve for a bunch of roots numerically for each gamma term. $$ x^*_{\uparrow k} = \frac{root(\Gamma(x) - a_k) - a_k}{b_k} $$ $$ x^*_{\downarrow k} = \frac{root(\Gamma(x) - c_k) - c_k}{d_k} $$ We evaluate the moment function directly at these roots $$ \phi(x^*) $$ To expose the b and d terms, we need to evaluate the log derivative of the moment function θ(s)... this amounts to a weighted sum of digamma functions... we need to figure out how the roots work there $$ c_4 \psi(c_3 + c_4 s) \to -\gamma c_4 $$ alternatively, we could try to equip the root finders with a way to collapse each gamma function into the other parameter...
After this we end up with a set of equations $$ \phi(x^*_{\uparrow j}) = \frac{\Gamma(a_1 + b_1 s)\cdots a_j\cdots \Gamma(a_k + b_k s)}{\Gamma(c_1 + d_1 s)\cdots\Gamma(c_l + d_l s)} $$ $$ \phi(x^*_{\downarrow j}) = \frac{\Gamma(a_1 + b_1 s)\cdots \Gamma(a_k + b_k s)}{\Gamma(c_1 + d_1 s)\cdots c_j \cdots \Gamma(c_l + d_l s)} $$ and we set the update rules to be $$ a_j \to \frac{\phi(x^*_{\uparrow j})}{\Gamma(a_j + b_j x^*_{\uparrow j})}\frac{\Gamma(c_1 + d_1 x^*_{\uparrow j})\cdots\Gamma(c_l + d_l x^*_{\uparrow j})}{\Gamma(a_1 + b_1 x^*_{\uparrow j})\cdots\Gamma(a_k + b_k x^*_{\uparrow j})} $$ and likewise... This means we only need to evaluate the original function, and the single extra divisor.
For $\log D_x$ we seem to have $$ \frac{\log D_x f(x)}{f(x)} + \log(x) = g(x) $$ then $g(x)$ is relatively well behaved, and for powers of $x$ simply $$ x^k \to \psi_0(k+1) $$ but this works for $e^x$ and $\log x$ as $f(x)$, among others...
We seem to have $$ \log D \log D 1 \equiv \lim_{h \to 0} \frac{D^h 1 - 2 D^{-h} 1}{h^2} $$
Consider multiple Dirichlet convolutions.
Example $$ \mu^3 = \sum_{abc=n}\mu(a)\mu(b)\mu(c) = A007428 \to \frac{1}{\zeta^3(s)} $$ $$ \omega \mu^2 = \sum_{abc=n}\omega(a)\mu(b)\mu(c) = A143519 = \sum_{d|n} \chi_p(d)\mu\left(\frac{n}{d}\right) = \sum_{p|n} \mu\left(\frac{n}{p}\right) $$ $$ \omega^2\mu = \sum_{abc=n}\omega(a)\omega(b)\mu(c) = A345354 = \sum_{p|n} \omega\left( \frac{n}{p} \right) $$
In shorthand $$ \Omega \omega \mu = A307409 = (\Omega(n)-1)\omega(n) = \sum_{p|n} \Omega\left( \frac{n}{p} \right)\to ... $$
So including μω in a triple has the effect of summing over prime divisors.
$$ \omega^2 \Omega = ??? $$
$$ \omega^3 = apparently not A200221! $$
Then for four terms apparently $$ \mu^2 \omega^2 = A230595? = \chi_p * \chi_p \to \zeta_p(s)^2 $$
Of course we could also have more varied and complicated expressions such as $$ \sum_{abc=n}\omega(a)\omega(b)\mu(c)\omega(c) $$
$$ |\mu^2 \Omega| = A344478 ? \to ??? $$
$$ \mu^2 \lambda = A326415 = \sum_{d|n} \mu_2(d) \lambda(n/d) = \sum_{d|n} \mu_3(d)\chi_\square(n/d)\to \frac{\zeta(2s)}{\zeta(s)^3} $$ where μ2, μ3 are iterative applications of μ to ϵ.
Which is the 'Moebius transform applied twice' to λ.
$$ \mu^2 x = A007431 = \sum_{d|n} \phi(d) \mu(n/d) $$ where x denotes the identity function (just summing over a or b or c directly in the product)... This is the Moebius transform applied twice to the natural numbers (because of the mu-squared).
We can check whether the token μx acts like a divisor sum against φ. It seems that
$$ \mu x \Omega = A095112 = \sum_{k=1}^{n} \Omega(\gcd(n,k)) = \sum_{d|n} \phi(d) \Omega(n/d) $$
So we have, for some number-theoretic function Q, $$ \mu \omega Q \equiv \sum_{d | n} \chi_p(d) Q(n/d) $$ $$ \mu x Q \equiv \sum_{d | n} \phi(d) Q(n/d) $$
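A quick brute-force check of the μxΩ identity above (assuming the sum over k runs from 1 to n), using naive totient and Ω implementations:

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def totient(n):
    """Euler's phi, by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

def big_omega(n):
    """Omega(n): number of prime factors counted with multiplicity."""
    count, p = 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count

for n in range(1, 60):
    lhs = sum(big_omega(gcd(n, k)) for k in range(1, n + 1))
    rhs = sum(totient(d) * big_omega(n // d) for d in divisors(n))
    assert lhs == rhs
print("mu*x*Omega identity holds for n = 1..59")
```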
We can conclude that $$ \sum_{abc = n} f(a)g(b)h(c) = \sum_{d|n}\left[\sum_{q|d}f(q)g\left(\frac{d}{q}\right)\right]h\left(\frac{n}{d}\right) $$ which makes complete sense. This can be extended arbitrarily deep $$ \sum_{x_1x_2x_3x_4 = n} f_1(x_1)f_2(x_2)f_3(x_3)f_4(x_4) = \sum_{d_4|n}\left[\sum_{d_3|d_4}\left[\sum_{d_2|d_3}f_1(d_2)f_2\left(\frac{d_3}{d_2}\right)\right]f_3\left(\frac{d_4}{d_3}\right)\right]f_4\left(\frac{n}{d_4}\right) $$
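The rewriting of a triple convolution as nested divisor sums can be verified numerically, e.g. with f = μ, g = ω, h = μ (naive arithmetic-function implementations, for illustration only):

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius mu, by trial division."""
    result, p = 1, 2
    while n > 1:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # squared prime factor
            result = -result
        p += 1
    return result

def omega(n):
    """omega(n): number of distinct prime factors."""
    count, p = 0, 2
    while n > 1:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count

def triple(f, g, h, n):
    """Direct sum over ordered triples a*b*c = n."""
    return sum(f(a) * g(b) * h(n // (a * b))
               for a in divisors(n) for b in divisors(n // a))

def nested(f, g, h, n):
    """Nested divisor sums, as in the displayed identity."""
    return sum(sum(f(q) * g(d // q) for q in divisors(d)) * h(n // d)
               for d in divisors(n))

for n in range(1, 50):
    assert triple(mobius, omega, mobius, n) == nested(mobius, omega, mobius, n)
print("triple convolution matches nested divisor sums for n = 1..49")
```

The equality is just associativity of Dirichlet convolution, so any choice of f, g, h works the same way.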
For any unknown sequence s(n), we can then try to 'fit' a depth-n function. We can define a set of sequences and represent each f as a linear combination (ideally with weights 0, 1...), but then we expand out the terms like
$$ \sum_{d|n} \mathbf{a\cdot f(d)} \mathbf{ b \cdot g(n/d)} = $$
———————–
Consider a graph, with nodes and edges... Consider subgraphs as analogies to integer divisors, perhaps a good analogy is chemical compounds.
Consider the existence of a function that takes a molecular graph as input and outputs a number, i.e. a descriptor calculator such as count the number of rings, or number of appearances of certain chemical groups. f(G)=x.
Now consider a notation $$ g(G) = \sum_{S|G} f(S) $$ where a function g of the graph G is the sum of another function f applied to all the possible subgraphs S of G, including perhaps G itself...
What we might be missing is the notion of G/S. If we think of this as G without S, then we can think of G as being S1S2S3S4; however, for the analogy with numbers we need to consider the existence of prime subgraphs for which there is only one unique graph factorisation.
The main problem is that there are many ways to attach fragments to compounds. A number is like a bag of prime factors: only the count of each prime matters, there is only one result once the bag is evaluated, and the order of evaluation does not matter. For a compound, although there might exist a (very large) set of fragments that could somehow be defined to reasonably cover the space of interesting molecules, if we defined each compound as a bag of these 'prime fragments', 1) the order in which they are taken out of the bag and stuck together (concatenated) matters [a combinatoric problem], and 2) where exactly they are joined together matters [a second combinatoric problem].
Issues with SMARTS counts. We would have to define all prime fragments such that when they were joined in any way no new prime fragments could be counted.
——————–
We consider a method of approximating functions by a statistically optimal matrix.
Consider a function on a fixed domain, e.g. [0, 1]. If we randomly sample a vector of points from the domain and sort the vector into a new vector x, then we have a distribution for each element. For small vectors there will be a reasonable level of variation; for larger vectors the variation will narrow. We then apply the function to that vector to get f. We will consider the statistically optimal matrix A such that Ax ≈ f for any sorted input x.
Some questions: is there only one optimum? Does the accuracy increase arbitrarily for an arbitrarily large matrix? Do the eigenvalues and eigenvectors of A have any connection to the true function f(x)? Is there a way of constructing A from the function f?
We can consider the samples x. This looks a little like a stick-breaking process, and is also related to the distribution of the minimum, maximum, or kth smallest element from n samples of a uniform distribution.
We can first consider a diagonal A for simplicity, but we find this cannot describe any function other than a line well, so we generalise to a full matrix to allow overlap information.
We have the order statistics for a uniform distribution as $$ x_k \sim \mathrm{Beta}(k,n+1-k) $$ then simplistically for the 2 × 2 case we have $$ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} f(x_1) \\ f(x_2) \end{bmatrix} $$
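A sketch of fitting the statistically optimal A by least squares over many sorted samples (assuming NumPy is available; the target function sin(πx) and the sample counts are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, samples = 2, 5000

def f(x):
    return np.sin(np.pi * x)     # example target function (an assumption)

# Row i of X is one sorted sample vector x; F holds f applied elementwise.
X = np.sort(rng.uniform(0.0, 1.0, size=(samples, n)), axis=1)
F = f(X)

# Least squares: minimise sum_i ||A x_i - f(x_i)||^2, i.e. solve X A^T ~= F.
A = np.linalg.lstsq(X, F, rcond=None)[0].T

# Order-statistic means should match E[x_k] = k/(n+1) = 1/3, 2/3 for n = 2.
print(X.mean(axis=0))
# Mean squared residual of the fitted linear map.
print(np.mean((X @ A.T - F) ** 2))
```

The empirical column means give a direct check against the Beta(k, n+1-k) order-statistic distribution above, and the residual quantifies how far a single matrix can get for a nonlinear f.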
Using models to estimate air quality
and 1 collaborator
Simple Physics with Python: a workbook on introductory Physics with open source software
and 4 collaborators