Public Articles

Blog Post 8

1 $\underline{\text{Parity}}$

Parity, in terms of mathematics, describes the classification of an integer as either even or odd. An even number is defined as an integer that is divisible by 2 while an odd number is one that is not. A more formal definition states that an even number is an integer *n* of the form *n* = 2*k* where *k* is an integer. On the other hand, an odd number is an integer of the form *n* = 2*k* + 1. In set notation we see:

\[ \text{Even} \hspace{0.5mm} = \hspace{0.5mm} \{2k : k \in \mathbb{Z}\} \]

\[ \text{Odd} \hspace{0.5mm} = \hspace{0.5mm} \{2k+1 : k \in \mathbb{Z}\} \]
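As a small illustration of these definitions, the following Python sketch classifies an integer and recovers the integer *k* for which *n* = 2*k* or *n* = 2*k* + 1. The function name `parity` is our own choice and does not come from the original post.

```python
def parity(n):
    """Return ("even", k) if n = 2k, or ("odd", k) if n = 2k + 1."""
    # divmod uses floor division, so the remainder r is 0 or 1
    # even when n is negative.
    k, r = divmod(n, 2)
    return ("even" if r == 0 else "odd", k)


for n in (10, 7, 0, -3):
    label, k = parity(n)
    print(n, label, k)
```

Note that negative integers are covered as well: −3 is odd because −3 = 2(−2) + 1.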

In number theory, the idea of parity allows us to solve some mathematical problems simply by making note of odd and even numbers. In the same way, the impossibility of some mathematical constructions can be proven. For example, consider the following question:

The impact of boreal wildfires on carbon and nitrogen dynamics: the interplay between biotic and abiotic processes

Wildfires are a natural phenomenon, but human activities are altering both the driving factors (climate) and the vulnerability (land-use factors) of ecosystems, increasing both the frequency and the severity of fire impacts. This is an issue of concern given that wildfires play a major role in the global carbon cycle by affecting carbon and nitrogen storage in ecosystems. Yet our knowledge of early post-fire carbon (C) and nitrogen (N) (hereafter abbreviated as CN) dynamics has been severely limited by the lack of cross-scale (from soil to plant to ecosystem) and cross-landscape (wetlands to uplands, managed and unmanaged land) studies. Understanding the mechanisms causing variability in CN dynamics (e.g., CN accumulation) in heterogeneous landscapes is critical for predicting changes in C and N storage with more frequent disturbance. Given this immediate research need, I propose an ambitious research program to investigate the impact of wildfires on the C and N cycles in the boreal landscape, capitalizing on a recent stand-replacing wildfire in Sweden. With an array of paired pre- and post-fire data, which is rare in wildfire ecosystem research, I aim to address whether pre-disturbance and initial post-disturbance conditions can be used to formulate predictions of post-disturbance ecosystem development. I will employ a novel multidisciplinary framework that integrates ecological processes, such as plant community development, into the biogeochemical processes. This much-needed integration makes it possible to improve current ecosystem models, add new mechanisms to them, and answer under what conditions the system is most vulnerable to change under frequent and severe wildfires. Three question-based work packages are described below as the basis of this wildfire research program:

**CN losses.** Where in the landscape do the largest C and N losses occur, and what factors control losses? How large are CN combustion losses relative to C transformed into charcoal and hydrologically-exported CN following fire?

**CN pool development.** What is the relative importance of abiotic (e.g. soil moisture, temperature) and biotic (e.g. plant traits) factors in generating variation in post-fire recovery rate of C and N pools at different spatial scales?

**Vegetation development.** What controls species and trait assembly post-fire? What is the role of niche-based processes (abiotic effects: environmental filtering, and biotic effects: legacy effects, regeneration traits) in contrast to neutral processes (stochasticity, priority effects)?

HST proposal 2016

and 6 collaborators

*The ‘Scientific Justification’ section of the proposal (see Section 9.1) should include a description of the scientific investigations that will be enabled by the final data products, and their importance*

*6 page limit, total proposal + figures can be 11.*

One of the most powerful observational tools for constraining the physics governing galaxy formation and evolution is morphology. The structural features of a galaxy are known to have close relationships with its physical properties, e.g., the link between star formation rate and Hubble type \citep{Masters2010,Bundy2010,Schawinski2014} or spiral arms \citep{Willett2015}, bars and AGN \citep{Oh2012,Hao09,Galloway2015}, bars and atomic gas content \citep{Masters2012}, [lots more possibilities of examples - help with more non-galaxy zoo examples?] It is known that the demographics of most morphological features are *not*, in general, constant as a function of redshift. This is not surprising, given that key elements involved in the formation of galaxies are also shown to change as the Universe evolves, e.g., star formation is known to peak at *z* ∼ 1 and drop steadily thereafter.

[few paragraphs of more descriptive examples of how galaxy physics is related to morphology + reasons for studying 0 < *z* < 2)]

Obtaining morphological data for such large numbers of galaxies is a unique challenge, in that to date there is no system that can produce both accurate and complete morphologies using automated methods. This problem is especially present with increasing redshift, for two reasons. First, images of distant galaxies are less resolved, making it difficult to distinguish finer features in the image. Second, galaxy shapes become increasingly irregular in the early Universe, due to increased merger rate and the clumpy nature of star formation. As large telescopes become more capable of imaging these distant galaxies, we continue to discover for the first time new large-scale structures which do not exist at low *z*; this creates a difficulty in defining an automated categorization for these unique types. Until automated methods overcome these challenges, visual classification by humans remains the most accurate method of measuring galaxy morphology, especially for galaxies beyond the local Universe.

Visual classification is of course not without its own challenges, chiefly time and efficiency. While humans produce more accurate and complete classifications than a computer, the time it takes to do so is overwhelming given the wealth of data becoming available from large surveys. The Galaxy Zoo project has developed a highly innovative method for bypassing the time drawback while maintaining the accuracy of visual classification. Displaying images of SDSS galaxies to volunteers via a simple and engaging web interface, `www.galaxyzoo.org` asks people to classify the images by eye. Within its first year, each of the ∼1 million SDSS galaxies had already been classified an average of 40 times through the efforts of hundreds of thousands of members of the general public providing ∼40 million classifications \citep{Lintott2008,Fortson2012}.

In 2010, Galaxy Zoo moved beyond the local Universe by including ∼100,000 HST galaxies in a project known as Galaxy Zoo: Hubble. All galaxies were classified at least 40 times by late 2012. This project enabled the first direct, morphologically accurate studies of the evolution of galaxies, several of which have already been completed with the preliminary data, including bar fraction with redshift \citep{Cheung2014,Melvin2014} and passive disk fraction with redshift \citep{Galloway2016}. These represent only a small fraction of the scientific investigations possible with these data; disk/spheroidal distinction, bars, spiral arms, clumpiness, and bulge dominance are a portion of the morphological information provided by this catalog (for the full list see Figure [fig:decision tree]).

Our aim with this proposal is to develop the next phase of Galaxy Zoo: Hubble, which we will hereafter refer to as Galaxy Zoo: Hubble 2 (GZH2). The motivation for extending this project is twofold. First, although the visual classification methods have been immensely successful thus far in obtaining robust morphologies for large (∼100,000) samples of galaxies, automated methods have improved since the first release, in the form of powerful machine-learning algorithms. These alone are still not capable of accurate classification for galaxies at all redshifts; however, *combining* these methods with the current system of human classifications has been shown to reduce the classification time of galaxies by 80% (can we cite something/ provide a figure Melanie?), thereby significantly improving both the efficiency and accuracy of GZH classifications. The details of this process are explained in full in the Analysis Plan. Second, in addition to the original GZH galaxies, a further XX,XXX HST galaxies will be added to the project to be classified by this new method.

By combining machine learning with human classifications, GZH2 will provide the most morphologically accurate data for the widest redshift range (to *z* ∼ 1.2) currently available. These data will enable more new science projects on galaxy evolution, at this level of accuracy, than has ever been possible. With the funding from this proposal, our team will focus on two science cases: clumpy galaxies (need better zinger description) and the mass-metallicity relation.

Homework 3

and 3 collaborators

To construct the L2 MNE operator, we used the formula found in the lecture slides: $$P_{MNE}=RL^T(LRL^T + \lambda C_n)^{-1}$$

The following Matlab script was used in the simple case without depth weighting:

```
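load ex3_data.mat
%% MNE (identity source covariance, R = I)
L2 = L*L';                              % L*R*L' with R = I
snr = sqrt(n_trial);                    % SNR taken as sqrt(number of trials)
lambda = trace(L2)/(trace(Cn)*snr^2);   % regularization parameter
ev = eigs(L2+lambda*Cn, rank(data1));   % largest eigenvalues, one per rank of the data
tol = ev(end)                           % smallest retained eigenvalue, used as pinv tolerance
p_mne = L'*pinv(L2+lambda*Cn, tol);     % P_MNE = L'*(L*L' + lambda*Cn)^(-1)
src1 = p_mne*data1;
src2 = p_mne*data2;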
```

Here an identity matrix was used for the source covariance matrix.

Next, to include depth weighting, we modified the source covariance matrix:

```
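%% Depth-Weighted MNE
% Diagonal source covariance built from the columns of L (306 sensors).
% Note: this squares each column sum; sum(L.^2,1) would give squared column norms.
W_i = diag(sqrt(sum(L,1).^2./306));
LWL = L*W_i*L';                         % L*R*L' with R = W_i, reused below
lambda = trace(LWL)/(trace(Cn)*snr^2);
ev = eigs(LWL+lambda*Cn, rank(data1));
tol = ev(end)
pw_mne = W_i*L'*pinv(LWL+lambda*Cn, tol);   % P = R*L'*(L*R*L' + lambda*Cn)^(-1)
srcw1 = pw_mne*data1;
srcw2 = pw_mne*data2;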
```

Finally, we constructed a beamformer operator for the data:

```
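%% Beamformer
N = 306;                                % number of sensors
M = 5124;                               % number of source points (not used below)
ev = eigs(Cd, rank(data1));
tol = ev(end)                           % smallest retained eigenvalue, used as pinv tolerance
p_bf = (pinv(Cd,tol)*L)';               % numerators (Cd^-1 * L_theta)', one row per source
denominator = diag(p_bf*L);             % L_theta' * Cd^-1 * L_theta for each source
p_bf = p_bf./repmat(denominator,1, N);  % divide each row by its own denominator
srcb1 = p_bf*data1;
srcb2 = p_bf*data2;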
```

The beamformer operator follows this formula from the lecture slides: $$P_{BF,\theta} = \dfrac{(C_\mathrm{d}^{-1} L_\theta)^T}{L_\theta^T C_\mathrm{d}^{-1} L_\theta},$$ where $L_\theta$ is the gain vector for source point $\theta$.

We wanted to visualize the source estimates as a function of time. We achieved this with the following script:

```
%% Plots
close all
vis_surface_data(srcb1(:,115), 0.1, max(srcb1(:)), anat_decim)
%%
close all
figure(3)
hold on
vv = var(data1,[], 2);
plot(timeaxis,data1(vv > 0.3*max(vv),:))
plot([1,1]*timeaxis(115),ylim(gca()), 'k--')
plot([1,1]*timeaxis(140),ylim(gca()), 'k--')
plot([1,1]*timeaxis(190),ylim(gca()), 'k--')
xlabel('time [s]')
axis tight
%%
close all
figure(4)
hold on
vv = var(data2,[], 2);
plot(timeaxis,data2(vv > 0.3*max(vv),:))
plot([1,1]*timeaxis(125),ylim(gca()), 'k--')
plot([1,1]*timeaxis(190),ylim(gca()), 'k--')
xlabel('time [s]')
axis tight
```

Key Cryptography

RSA encryption can be used for many things, such as keeping important messages secure. It is very difficult to decode a message that has been encrypted with RSA without the private key. There are a few steps that one must go through in order to encrypt and decrypt a message.

We can look at a few variables that are needed throughout the RSA encryption process:

\(e\) will be our public key

\(d\) is the value used for decoding and is only given to the receiver

\(p\) and \(q\) are the primes

\(n\) is the result of \(pq\)

\(M\) is the original message

\(C=M^e\) (mod \(n\)) is used to encrypt messages

\(C^d\) (mod \(n\)) is used to decrypt messages
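The variables above can be tied together in a toy round trip. The sketch below uses deliberately tiny values (\(p=61\), \(q=53\), \(e=17\), \(M=65\)) that we chose for illustration; they are not from the original post, and real RSA relies on very large primes and message padding.

```python
# Toy RSA round trip; all numbers are illustrative assumptions.
p, q = 61, 53             # the two primes (kept secret)
n = p * q                 # modulus, published together with e
phi = (p - 1) * (q - 1)   # Euler's totient of n
e = 17                    # public key, must be coprime to phi
d = pow(e, -1, phi)       # decoding value with d*e ≡ 1 (mod phi); Python 3.8+

M = 65                    # original message, an integer with 0 <= M < n
C = pow(M, e, n)          # encrypt: C = M^e (mod n)
recovered = pow(C, d, n)  # decrypt: C^d (mod n)

print("n =", n, "d =", d, "C =", C, "recovered =", recovered)
```

The three-argument form of `pow` performs modular exponentiation without ever building the huge intermediate power.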

Graph Theory

Graph theory is a topic in discrete mathematics used to model networks and study relationships between objects in a mathematical way. A graph consists of a set of vertices, usually denoted \(V,\) and a set of edges, typically denoted \(E.\) Each edge in a graph connects a pair of vertices. A graph \(G\) is defined as an ordered pair \(G=\left(V,E\right).\)
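One possible way to write the ordered pair \(G=\left(V,E\right)\) down in code is sketched below in Python. The vertex labels and the helper names `neighbors` and `degree` are made up for this example; the edges are undirected, so each one is stored as an unordered pair.

```python
# A small undirected graph G = (V, E); the labels are illustrative only.
V = {"a", "b", "c", "d"}
E = {frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "a"})}
G = (V, E)


def neighbors(graph, v):
    """Return the set of vertices joined to v by an edge."""
    _, edges = graph
    return {u for edge in edges if v in edge for u in edge if u != v}


def degree(graph, v):
    """Number of edges that touch v (this sketch has no loops)."""
    return len(neighbors(graph, v))


for v in sorted(V):
    print(v, sorted(neighbors(G, v)), degree(G, v))
```

Here vertex `d` belongs to \(V\) but to no edge, which shows that a vertex may be isolated.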

Benchmark study on the electronic structure properties of polyoxometalate Keggin-like structures and Carbon substrates


Test1


Analysis of graph measures for functional connectivity in default mode networks in mild Alzheimer's disease dementia using machine learning techniques.

Alzheimer's disease (AD) is a neurodegenerative disease that generally appears after the seventh decade of life and leads to cognitive changes such as deficits in episodic memory, naming and other language deficits, and impairments in visuospatial skills, praxis, and executive functions.

It is estimated that more than 27 million people worldwide suffer from AD \cite{Wimo_2006}. It is the leading cause of dementia in the elderly population, accounting for about 60 to 70% of all dementias.

Its prevalence has been increasing progressively, and one of the causes is the aging of the world population. The prevalence of the disease doubles every 5 years on average, rising from 1% at age 60 to more than 40% of the population over 85 years of age \cite{Cummings_2002}. With technological and scientific advances, average life expectancy has been increasing every year, and it is estimated that by 2050 there will be about 100 million cases of AD worldwide \cite{Wimo_2013}.

There are some known risk factors for late-onset AD, such as age, vascular diseases, and genetic factors such as the presence of the *ϵ*4 allele of apolipoprotein E (APOE4), a cholesterol-carrying protein involved in the metabolism of neuritic plaques (NP) \cite{Poirier_2001}.

There are 5 alleles for APOE, numbered *ϵ*1 to *ϵ*5. The most common is *ϵ*3 (about 90% of the Caucasian population with 1 allele and 60% with 2 alleles); the presence of *ϵ*2 may confer protection against the deposition of *β*-amyloid peptide (A*β*); and about 30% of the population carries 1 *ϵ*4 allele \cite{Corder_1998}.

From an anatomopathological point of view, NP and neurofibrillary tangles (NFT) are the most striking features of AD.

The study of the pathophysiology of A*β* led to the development of new therapeutic proposals, such as inhibition of the activity of the *γ*- and *β*-secretase enzymes and/or stimulation of *α*-secretase activity, or immunotherapy with anti-A*β* antibodies. However, A*β* deposition alone does not explain the entire pathophysiology of AD, and it correlates poorly with the severity of dementia \cite{Eckman_2007}, \cite{Poirier_2001}.

NFT contain paired helical filaments (PHF) originating from the hyperphosphorylation of the tau protein. Some brain areas are more vulnerable to this pathological process, such as the hippocampi and the frontal cortices. In these regions it is common to find PHF containing abnormally phosphorylated tau protein, of higher molecular weight than usual, known as AD-associated protein \cite{Poirier_2001}. This phenomenon may explain the stronger clinical correlation of cognitive symptoms with the presence of NFT.

There are other causal factors for AD, such as local inflammatory response, mitochondrial dysfunction, and neurotransmitter changes secondary to the loss of cholinergic neurons of the nucleus basalis of Meynert and serotonergic neurons of the raphe nuclei, in addition to early synaptic loss. This synaptic loss is the neuropathological variable most strongly correlated with the degree of dementia \cite{Scheff_2003}.

AD thus has multiple causes, and it is possible that each of these pathophysiological factors contributes differently to the origin of the cognitive symptoms. It is possible that in the near future AD can be treated early, taking into account the genetic and molecular profile of each individual.

Accordingly, one of the most active branches of research on AD dementia today is the search for biomarkers that can anticipate its diagnosis, and it is hoped that in the near future there will be pharmacological therapies that can halt its progression. In this context, the concept of Mild Cognitive Impairment (MCI) emerged: the clinical term used for patients with cognitive changes but without significant impairment in daily activities (that is, without meeting the criteria for a diagnosis of dementia) \cite{21514249}. Although there is no universally accepted criterion, most researchers consider the following necessary: a cognitive complaint, commonly of episodic memory and preferably confirmed by someone close to the patient; objective cognitive impairment, with lower performance on neuropsychological tests compared with people of the same age group and education level; and preserved or minimally compromised activities of daily living.

Energy dependence of Net-kaon Multiplicity Distributions at RHIC


Blog Post 7

1 $\underline{\text{Public Key Cryptography}}$

Public key cryptography (from the Greek roots “crypto” and “graphon,” meaning “hidden writing”) is perhaps the most discreet application of discrete mathematics (Benjamin, 2009). The most famous method for public key cryptography is called the RSA method. This method was discovered in 1977 by three mathematicians: Rivest, Shamir, and Adleman.

To carry out this method, one would begin by defining *n* to be the product of two primes *p* and *q*. We must also choose a private key, as it is called, and we will name this *d*. Then, we compute *ϕ*(*n*) which is equivalent to (*p* − 1)(*q* − 1). As long as *d* is relatively prime to *ϕ*(*n*), we can set up the equation *d**e* − *ϕ*(*n*)*f* = 1 and solve for *e* and *f* using the Euclidean Algorithm. The value for *e* is made public. Therefore, the information that is published is *n* and *e* while all other values are kept private.

The message that is sent is $M^e \equiv C \pmod{n}$ where the original message is denoted *M* and the encrypted message is *C*. To decrypt our message, one would compute $C^d \equiv M \pmod{n}$.
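The key-generation step described above, solving *de* − *ϕ*(*n*)*f* = 1 for *e* and *f*, can be sketched with the extended Euclidean algorithm. The helper names and the small values *p* = 11, *q* = 13, *d* = 7 and *M* = 9 are our own assumptions for illustration, not part of the original post.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def public_exponent(d, phi):
    """Solve d*e - phi*f = 1 for (e, f) with 0 < e < phi."""
    g, x, _ = extended_gcd(d, phi)
    if g != 1:
        raise ValueError("d must be relatively prime to phi(n)")
    e = x % phi                 # shift the Bezout coefficient into range
    f = (d * e - 1) // phi      # exact, because d*e ≡ 1 (mod phi)
    return e, f


p, q, d = 11, 13, 7             # toy primes and private key (assumed)
n, phi = p * q, (p - 1) * (q - 1)
e, f = public_exponent(d, phi)  # e is published together with n

M = 9                           # original message
C = pow(M, e, n)                # M^e ≡ C (mod n)
decrypted = pow(C, d, n)        # C^d ≡ M (mod n)
print("e =", e, "f =", f, "C =", C, "decrypted =", decrypted)
```

Only *n* and *e* would be published; *p*, *q*, *ϕ*(*n*), *f* and *d* stay private.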

Graph Theory

In discrete mathematics, graphs are used to represent networks or structures in a mathematical way. In particular, a graph consists of a finite set of vertices \(\left(V\right)\) and a set of edges \(\left(E\right)\). Each edge is attached to at least one vertex, and the vertices it is attached to are called its endpoints. An edge is what connects these vertices or endpoints.

The Chinese Remainder Theorem

The Chinese Remainder Theorem is used in discrete mathematics to solve a system of congruences; the solution is unique up to the product of the moduli.

The Chinese Remainder Theorem states: If \(m_1\) and \(m_2\) are relatively prime, then the system of congruences \(N\equiv a_1\) (mod \(m_1\)), \(N\equiv a_2\) (mod \(m_2\)) has a unique solution (mod \(m_1m_2\)).

From this theorem, we can generalize and say that if \(m_1\) and \(m_2\) are relatively prime, then we can allow \(a_1\) and \(a_2\) to be any two integers. There will exist an integer \(N\) that satisfies the expressions above.

With \(\left(m_1,m_2\right)=1,\) there exist integers \(x\) and \(y\) that satisfy \(m_1x+m_2y=1.\) We can find \(x\) and \(y\) by plugging in numbers until we find a solution that works, or we can use the Euclidean Algorithm and back substitution.

From here, the solution to the system of congruences is found by using our formula: \(N=m_1a_2x+m_2a_1y\)
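The procedure above can be sketched in a few lines of Python. The function names are our own, and the sample system \(N\equiv 2\) (mod \(3\)), \(N\equiv 3\) (mod \(5\)) is an assumed example rather than one from the original post.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def crt(a1, m1, a2, m2):
    """Solve N ≡ a1 (mod m1), N ≡ a2 (mod m2) for coprime m1, m2."""
    g, x, y = extended_gcd(m1, m2)      # m1*x + m2*y == 1
    if g != 1:
        raise ValueError("m1 and m2 must be relatively prime")
    N = m1 * a2 * x + m2 * a1 * y       # the formula from the text
    return N % (m1 * m2)                # reduce to the unique residue


N = crt(2, 3, 3, 5)
print(N, N % 3, N % 5)
```

The formula works because \(m_2y\equiv 1\) (mod \(m_1\)) and \(m_1x\equiv 1\) (mod \(m_2\)), so the first term vanishes modulo \(m_1\) and the second vanishes modulo \(m_2\).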

Modular Arithmetic

Modular arithmetic is used in discrete mathematics to work with remainders. It is an arithmetic of congruences and is sometimes referred to as "clock arithmetic" because numbers wrap around once they reach a fixed quantity called the modulus. Below is an example using a clock:


Blog Post 6

1 $\underline{\text{Card Shuffling}}$

In discrete mathematics, a riffle shuffle is a shuffling of cards in which the top half is placed in one hand and the other half lies in the opposite hand. The cards are then alternately interlaced with one another. There are two different types of riffle shuffles that will be the topic of discussion: the out-shuffle and the in-shuffle. An out-shuffle keeps the top card on top and the bottom card on bottom. When the top card is, instead, placed in the second position we have an in-shuffle.

Consider a typical 52 card deck. If we name and order the cards 0, 1, 2, 3, 4..., then after an out-shuffle we receive the order 0, 26, 1, 27, 2, 28.... After an in-shuffle, the cards appear as 26, 0, 27, 1, 28, 2.... Note that the first card is in the 0 position. To return to the original order, we must make 8 out-shuffles. Interestingly enough, however, it takes 52 in-shuffles.
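These counts can be checked by simulating perfect shuffles. The Python sketch below is our own illustration; the function names are not from the original post, and the deck is assumed to have an even number of cards.

```python
def out_shuffle(deck):
    """Perfect riffle that keeps the top card on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    return [card for pair in zip(top, bottom) for card in pair]


def in_shuffle(deck):
    """Perfect riffle that moves the top card to the second position."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    return [card for pair in zip(bottom, top) for card in pair]


def shuffles_to_restore(shuffle, size=52):
    """Count how many repeated shuffles bring the deck back to 0, 1, 2, ..."""
    original = list(range(size))
    deck, count = shuffle(original), 1
    while deck != original:
        deck, count = shuffle(deck), count + 1
    return count


print(out_shuffle(list(range(52)))[:6])
print(in_shuffle(list(range(52)))[:6])
print(shuffles_to_restore(out_shuffle), shuffles_to_restore(in_shuffle))
```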

Fermat's Little Theorem

Fermat's Little Theorem states that if \(p\) is a prime number and \(a\) is an integer, then \(a^p\equiv a\) (mod \(p\)). The theorem is very helpful for testing whether a number is not prime. It can be easy to see that a small number is not prime; with big numbers, however, it can be difficult. One of the most important things to note is that the theorem cannot confirm that a number is prime, but it can show that a number is not prime: if \(a^n\not\equiv a\) (mod \(n\)) for some integer \(a\), then \(n\) is not prime. The theorem is also often stated in the following form: if \(p\) is prime and does not divide \(a\), then \(a^{p-1}\equiv 1\) (mod \(p\)).
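A minimal sketch of how the theorem is used as a compositeness test is given below. The function name and the sample numbers 13, 15 and 341 are our own choices for illustration.

```python
def fermat_says_composite(n, a):
    """Return True if a^n is not congruent to a (mod n).

    True proves that n is not prime. False proves nothing:
    n may be prime, or a may simply fail to expose it.
    """
    return pow(a, n, n) != a % n


# 15 = 3 * 5 is exposed by the base a = 2.
print(fermat_says_composite(15, 2))

# 341 = 11 * 31 passes the test for a = 2 but is exposed by a = 3.
print(fermat_says_composite(341, 2), fermat_says_composite(341, 3))

# A prime such as 13 is never flagged, whatever base is used.
print(any(fermat_says_composite(13, a) for a in range(13)))
```

The case of 341 is why a single passing base should not be read as proof of primality.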

Enabling a muscle-based gesture interface for an Air Guitar game.

With the advancement of technology and the decline of manual labor, humans try to improve their quality of life using any innovation that they can think of. Technology makes our lives easier and more efficient, and efficiency means that the time saved can be allotted to other tasks. Consider, for example, the evolution from buttons to touch screens, which saved precious time by introducing dynamic menus, faster input, and a lot of flexibility. However, evolution does not stop there. Electromyography (EMG), the technique of evaluating and recording muscle activity through either a needle (intramuscular) or electrodes on muscles (surface), introduced another form of input. It is unclear whether it provides a substantial increase in efficiency compared to traditional touch screens. However, having different options for different situations is favorable. Other forms of input include speech recognition, eye gaze trackers, and computer vision. These three forms of input, however, are unfavorable in certain situations, and electromyography may be used in those situations. Surface EMG may be preferred over intramuscular EMG because the latter requires professional expertise and is expensive. One device that is able to conduct surface EMG is the Myo armband.

Recollimation boundary layers as X-ray sources in young stellar jets

and 2 collaborators

Young stars accrete mass from circumstellar disks and in many cases, the accretion coincides with a phase of massive outflows, which can be highly collimated. Those jets emit predominantly in the optical and IR wavelength range. However, in several cases X-ray and UV observations reveal a weak but highly energetic component in those jets. X-rays are observed both from stationary regions close to the star and from knots in the jet several hundred AU from the star. In this article we show semi-analytically that a fast stellar wind which is recollimated by the pressure from a slower, more massive disk wind can have the right properties to power stationary X-ray emission. The size of the shocked regions is compatible with observational constraints. Our calculations support a wind-wind interaction scenario for the high energy emission near the base of YSO jets. For the specific case of DG Tau, a stellar wind with a mass loss rate of $5 \cdot 10^{-10}\ M_{\odot}\ \mathrm{yr}^{-1}$ and a wind speed of 800 km s$^{-1}$ reproduces the observed X-ray spectrum. We conclude that a stellar wind recollimation shock is a viable scenario to power stationary X-ray emission close to the jet launching point.

Optimal Predictors: A Bayesian Notion of Approximation Algorithm

The concept of an "approximation algorithm" is usually only applied to optimization problems since in optimization problems the performance of the algorithm on any given input is a continuous parameter. We introduce a new concept of approximation applicable to decision problems and functions, inspired by Bayesian probability. From the perspective of a Bayesian reasoner with limited computational resources, the answer to a problem that cannot be solved exactly is uncertain and therefore should be described by a random variable. It thus should make sense to talk about the expected value of this random variable, an idea we formalize in the language of average-case complexity theory by introducing the concept of "optimal predictor." We show that optimal predictors exhibit many parallels with "classical" probability theory, prove some existence theorems and demonstrate some applications to artificial general intelligence.

Modular Arithmetic

Modular arithmetic is used in discrete math to find remainders. By definition, if \(a\) and \(b\) are both integers and \(m>0\), then \(a\) is congruent to \(b\) (mod \(m\)) if \(m\) divides \(a-b\). The notion of modular arithmetic deals with the remainders that are found in Euclidean division. The act of finding the remainder is known as the modulo operation, written (mod \(n\)), where \(n\) is a positive integer. For instance, the remainder of the division of \(8\) by \(3\) can be written as \(8\) (mod \(3\)), and since the remainder equals \(2\), we have \(8\) (mod \(3\)) \(=2\).
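Both ideas, the remainder and the congruence relation, can be checked directly in Python. The helper name `is_congruent` is our own and the sample values are chosen only for illustration.

```python
def is_congruent(a, b, m):
    """Return True if a is congruent to b (mod m), i.e. m divides a - b."""
    if m <= 0:
        raise ValueError("the modulus m must be positive")
    return (a - b) % m == 0


remainder = 8 % 3                   # the modulo operation: 8 (mod 3)
print(remainder)

print(is_congruent(8, 2, 3))        # 3 divides 8 - 2 = 6
print(is_congruent(17, 5, 12))      # 17:00 and 5 o'clock on a 12-hour clock
print(is_congruent(7, 3, 5))        # 5 does not divide 7 - 3 = 4
```

Python's `%` operator returns a result in the range 0 to m − 1 for a positive modulus, even when the left operand is negative.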

Blog Post 5

1 $\underline{\text{Binary Notation}}$

Binary notation expresses every positive integer as a sum of powers of two, which serve as its additive building blocks. Note that the word binary comes from “bi,” meaning two. In this system, integers are written using only 0s and 1s. The digits that represent each integer are found by writing the number as a sum of powers of two and recording how many times (0 or 1) each power of two occurs. For example, the decimal number ten is written as “1010” because it is $\underline{1} \cdot 2^3 + \underline{0} \cdot 2^2 + \underline{1} \cdot 2^1 + \underline{0} \cdot 2^0$. Notice that this starts with the largest power of two. We read “1010” as “one-zero-one-zero” as opposed to one thousand and ten. The binary representations of the first few natural numbers are shown in the table below.