What's Open Access Good For? Absolutely everything!

and 6 collaborators

**access.** One might guess that this is what defines the research publishing community. One might be wrong.

**#OpenAccess** and what is it good for? Absolutely everything, as far as research is concerned.

The effect of carbon subsidies on marine planktonic niche partitioning and recruitment during biofilm assembly

and 1 collaborator

# Introduction

Biofilms are diverse and complex microbial consortia, and the biofilm lifestyle is the rule rather than the exception for microbes in many environments. Large- and small-scale biofilm architectural features play an important role in their ecology and influence their role in biogeochemical cycles \citep{17170748}. Fluid mechanics impact biofilm structure and assembly \citep{hoedl_2011,19571890, 14647381}, but it is less clear how other abiotic factors such as resource availability affect biofilm assembly. Aquatic biofilms initiate with seed propagules from the planktonic community \citep{hoedl_2011, 22120588}. Thus, resource amendments that influence planktonic communities may also influence the recruitment of microbial populations during biofilm community assembly.

In a crude sense, biofilm and planktonic microbial communities divide into two key groups: oxygenic phototrophs including eukaryotes and cyanobacteria (hereafter “photoautotrophs”), and heterotrophic bacteria and archaea. This dichotomy, admittedly an abstraction (e.g. non-phototrophs can also be autotrophs), can be a powerful paradigm for understanding community shifts across ecosystems of varying trophic state \citep{Cotner_2002}. Heterotrophs meet some to all of their organic carbon (C) requirements from photoautotroph-produced C while simultaneously competing with photoautotrophs for limiting nutrients such as phosphorus (P) \citep{379}. The presence of external C inputs, such as terrigenous C leaching from the watershed \citep{Jansson_2008, Karlsson_2012} or C exudates derived from macrophytes \citep{Stets_2008, Stets_2008b}, can alleviate heterotroph reliance on photoautotroph-derived C and shift the heterotroph-photoautotroph relationship from commensal and competitive to strictly competitive. Therefore, increased C supply should increase the resource space available to heterotrophs and increase competition for mineral nutrients, decreasing the nutrients available for photoautotrophs (assuming that heterotrophs are superior competitors for limiting nutrients, as has been observed). These dynamics should result in an increase in heterotroph biomass relative to photoautotroph biomass along a gradient of increasing labile C inputs. We refer to this differential allocation of limiting resources among components of the microbial community as niche partitioning.

While these gross level dynamics have been discussed conceptually \citep{Cotner_2002} and to some extent demonstrated empirically \citep{Stets_2008}, the effects of biomass dynamics on photoautotroph and heterotroph membership and structure have not been directly evaluated in plankton or biofilms. In addition, how changes in planktonic communities propagate to biofilms during community assembly is not well understood. We designed this study to test whether C subsidies shift the biomass balance between autotrophs and heterotrophs within the biofilm or its seed pool (i.e. the plankton), and to measure how changes in biomass pool size alter the composition of the plankton and biofilm communities. Specifically, we amended marine mesocosms with varying levels of labile C input and evaluated differences in photoautotroph and heterotrophic bacterial biomass in plankton and biofilm samples along the C gradient. In each treatment we characterized plankton and biofilm community composition by PCR amplifying and DNA sequencing 16S rRNA genes and plastid 23S rRNA genes.

A quick introduction to version control with Git and GitHub

and 2 collaborators

# Introduction to version control

Many scientists write code as part of their research. Just as experiments are logged in laboratory notebooks, it is important to document the code you use for analysis. However, a few key problems can arise when iteratively developing code that make it difficult to document and track which code version was used to create each result. First, you often need to experiment with new ideas, such as adding new features to a script or increasing the speed of a slow step, but you do not want to risk breaking the currently working code. One often utilized solution is to make a copy of the script before making new edits. However, this can quickly become a problem because it clutters your filesystem with uninformative filenames, e.g. `analysis.sh`, `analysis_02.sh`, `analysis_03.sh`, etc. It is difficult to remember the differences between the versions of the files, and more importantly which version you used to produce specific results, especially if you return to the code months later. Second, you will likely share your code with multiple lab mates or collaborators and they may have suggestions on how to improve it. If you email the code to multiple people, you will have to manually incorporate all the changes each of them sends.

Fortunately, software engineers have already developed software to manage these issues: version control. A version control system (VCS) allows you to track the iterative changes you make to your code. Thus you can experiment with new ideas but always have the option to revert to a specific past version of the code you used to generate particular results. Furthermore, you can record messages as you save each successive version so that you (or anyone else) reviewing the development history of the code is able to understand the rationale for the given edits. Also, it facilitates collaboration. Using a VCS, your collaborators can make and save changes to the code, and you can automatically incorporate these changes to the main code base. The collaborative aspect is enhanced with the emergence of websites that host version controlled code.

In this quick guide, we introduce you to one VCS, Git (git-scm.com), and one online hosting site, GitHub (github.com), both of which are currently popular among scientists and programmers in general. More importantly, we hope to convince you that although mastering a given VCS takes time, you can already achieve great benefits by getting started using a few simple commands. Furthermore, not only does using a VCS solve many common problems when writing code, it can also improve the scientific process. By tracking your code development with a VCS and hosting it online, you are performing science that is more transparent, reproducible, and open to collaboration \cite{23448176, 24415924}. There is no reason this framework needs to be limited only to code; a VCS is well-suited for tracking any plain-text files: manuscripts, electronic lab notebooks, protocols, etc.

The Municipality as a Center of Science: Citizen Participation, Local Research, and Amplifying Scientific Culture.

and 2 collaborators

The imminent creation of a Ministry of Science and Technology aims to strengthen the public role in research and the development of knowledge. A key challenge for the scientific community in this scenario is to justify to citizens the amount of energy and public resources it will be given. Disciplines such as *Education and Public Outreach* focus on the actions that the scientific community must carry out. This document proposes a complementary municipal strategy, anchored in the discipline of *Citizen Science*, to facilitate the actions of citizens who are interested in influencing the creation of knowledge and whose taxes are being used. It presents a strategy that takes advantage of the municipal level and its territoriality to reflect the needs, aspirations, and requirements of residents with respect to scientific research and development linked to their territory. In addition, it proposes a system that reflects the emergent character of these interests, needs, and aspirations, and at the same time offers the opportunity to connect researchers through their projects, whether at the proposal stage or already under way, that satisfy those interests, needs, and aspirations, thereby facilitating a link between residents and projects that address their concerns.

Homework Portfolio

1.

- 1.71. Our proof of the Cauchy-Schwarz inequality, Theorem 1.13, used that when *U* is a unit vector, $$0 \leq ||V-(U\cdot V)U||^2 = ||V||^2 -(U\cdot V)^2$$. Therefore if *U* is a unit vector and equality holds, then *V* = (*U* · *V*)*U*. Show that equality occurs in the Cauchy-Schwarz inequality for two arbitrary vectors *V* and *W* only if one of the vectors is a multiple (perhaps zero) of the other vector.

Answer: In the first case, when *W* = 0, *W* is a multiple of *V* (namely 0 · *V*). In the second case, when *W* is nonzero, consider the unit vector $U = \frac{W}{||W||}$. Then, by the result in the question, it follows that *V* = (*U* · *V*)*U*. Therefore:

$$V = (U \cdot V)U = \left(\frac{W}{||W||}\cdot V\right)\frac{W}{||W||}= \left(\frac{W\cdot V}{||W||^2}\right)W$$ Since $\frac{W\cdot V}{||W||^2}$ is a scalar, *V* is a multiple of *W*.

2.

-2.19. Suppose C is an n by n matrix with orthonormal columns. Use Theorem 2.2 to show that $$||CX|| \leq \sqrt{n} ||X||$$ Use the Pythagorean theorem and the result of Problem 2.17 to show that in fact $$||CX|| = ||X||$$ for such a matrix.

Answer: (1) First we calculate ||*C*||. Let *C*_{j} denote the *j*th column of *C*. Since *C* has orthonormal columns, each *C*_{j} has norm 1. Then $$||C|| = \sqrt{\sum_{i=1}^n\sum_{j=1}^n C_{ij}^2}$$ $$=\sqrt{\sum_{j=1}^n\left(\sum_{i=1}^n C_{ij}^2\right)}$$ $$=\sqrt{\sum_{j=1}^n|| C_j||^2}$$ $$=\sqrt{n}$$ Now by Theorem 2.2, $$||CX|| \leq ||C||\,||X|| = \sqrt{n}||X||$$ as desired. (2) By Problem 2.17, *C**X* = *x*_{1}*C*_{1} + *x*_{2}*C*_{2} + … + *x*_{n}*C*_{n}. To find the norm of the right-hand side, we apply the Pythagorean theorem (this is where we need the columns of *C* to be orthogonal) to get $$||x_1 C_1 + x_2 C_2 + \dots + x_n C_n||^2 = ||x_1 C_1||^2 + ||x_2 C_2||^2 + \dots + ||x_n C_n||^2$$ Now we can put the pieces together: $$||CX||^2 = ||x_1 C_1 + x_2 C_2 + \dots + x_n C_n||^2$$ $$=||x_1 C_1||^2 + ||x_2 C_2||^2+\dots+ ||x_n C_n||^2$$ $$=x_1^2||C_1||^2 + x_2^2|| C_2||^2+\dots+ x_n ^2||C_n||^2$$ $$=x_1^2 + x_2^2+\dots+ x_n ^2$$ $$= ||X||^2$$ Since norms are nonnegative, we can conclude that ||*C**X*|| = ||*X*||.

3.

-2.44. Use the Cauchy-Schwarz inequality $$|A · B| \leq||A||||B||$$

to prove: (a) the function *f*(*X*) = *C* · *X* is uniformly continuous,

(b) the function *g*(*X*, *Y*)=*X* · *Y* is continuous.

Answer: (a) In case 1, if *C* = 0 then *f*(*X*) = 0 for all *X*, so |*f*(*X*) − *f*(*Y*)| = 0 < *ε* for all *ε* > 0 and all *X*, *Y*.

In case 2, *C* = (*c*_{1}, *c*_{2}, ..., *c*_{n}) ≠ (0, 0, ..., 0).

By the definition of *f* and properties of the dot product, |*f*(*X*) − *f*(*Y*)| = |*C* · *X* − *C* · *Y*| = |*C* · (*X* − *Y*)|.

By the Cauchy-Schwarz inequality we get $$|f(X)- f(Y)|=|C\cdot(X-Y)|\leq||C||\,||X-Y||$$ Let *ε* > 0 and take $\delta=\frac {\varepsilon}{||C||}$.

If $||X-Y||<\delta= \frac {\varepsilon}{||C||}$, then we have |*f*(*X*) − *f*(*Y*)| ≤ ||*C*|| ||*X* − *Y*|| < *ε* for all *X* and *Y* in *R*^{n}. Since *δ* does not depend on *X* or *Y*, *f* is uniformly continuous for any *C*.

(b) Fix *V* = (*A*, *B*) in *R*^{2n}. We show that *g*(*X*, *Y*) = *X* · *Y* is continuous at (*A*, *B*).

Given *ϵ* > 0, we need to find *δ* > 0 such that if *U* = (*X*, *Y*) satisfies $||U-V||=\sqrt {||X-A||^2+||Y-B||^2}<\delta$

then |*g*(*U*) − *g*(*V*)| = |*X* · *Y* − *A* · *B*| < *ϵ*.

Writing *X* = (*X* − *A*) + *A* and *Y* = (*Y* − *B*) + *B*, and using the triangle inequality together with the Cauchy-Schwarz inequality, we get $$|X\cdot Y-A\cdot B|= |[(X-A)+A]\cdot[(Y-B)+B]-A\cdot B|$$ $$=|(X-A)\cdot(Y-B)+B\cdot(X-A)+A\cdot(Y-B)|$$ $$\leq||X-A||\,||Y-B||+||B||\, ||X-A||+||A||\, ||Y-B||$$ Given *ϵ* > 0, set $\delta= \min(1,\frac {\epsilon}{1+||A||+||B||})$. If ||*U* − *V*|| < *δ*, then ||*Y* − *B*|| < 1, and both ||*X* − *A*|| and ||*Y* − *B*|| are at most ||*U* − *V*||, so $$||X-A||\,||Y-B||+||B||\, ||X-A||+||A||\, ||Y-B|| \leq (1+||A||+||B||)\,||U-V||$$ $$<(1+||A||+||B||)\,\delta$$ $$\leq(1+||A||+||B||) \frac {\epsilon}{1+||A||+||B||}$$ $$=\epsilon$$ Therefore, for any (*A*, *B*) in *R*^{2n}, *g*(*X*, *Y*) = *X* · *Y* is continuous at (*A*, *B*).

4.

-2.45. In the triangle inequality ||*A* + *B*|| ≤ ||*A*|| + ||*B*|| put *A* = *X* − *Y* and *B* = *Y*. Deduce ||*Y*|| − ||*X*|| ≤ ||*Y* − *X*||. Show that if two points are within one unit distance of each other, then the difference of their norms is less than or equal to one.

Answer: Let *A* and *B* be in *R*^{n}. Applying the triangle inequality, we have $$||A + B|| \leq ||A|| + ||B||.$$ Let *X* = *A* + *B* and *Y* = *B*, so *A* = *X* − *Y*. Then $$||X|| \leq ||X - Y|| + ||Y||, \qquad ||X|| - ||Y|| \leq ||X - Y||.$$

Exchanging the symbols *X* and *Y*, we get $$ ||Y|| - ||X|| \leq ||Y - X||.$$ So when *X* and *Y* are within one unit of distance of each other, which means ||*Y* − *X*|| ≤ 1, the inequality gives $$||Y|| - ||X|| \leq ||Y - X||\leq 1$$ The same bound holds for ||*X*|| − ||*Y*||, so the difference of the norms is at most one in absolute value.

5.

-3.9. Suppose a function *F* from *R*^{n} to *R*^{m} is differentiable at *A*. Justify the following statements that prove $$L_AH = DF(A)H,$$ that is, the linear function *L*_{A} in Definition 3.4 is given by the matrix of partial derivatives *D**F*(*A*).

(a) There is a matrix *C* such that *L*_{A}(*H*) = *C**H* for all *H*.

(b) Denote by *C*_{i} the *i*th row of *C*. The fraction $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||} = \frac{||F(A+H)-F(A)-CH||}{||H||}$$ tends to zero as ||*H*|| tends to zero if and only if each component $$\frac {f_i(A+H)- f_i(A)-C_iH}{ ||H||}$$ tends to zero as ||*H*|| tends to zero.

(c) Set *H* = *h**e*_{j} in the *i*th component of the numerator to show that the partial derivative *f*_{i, x_j}(*A*) exists and is equal to the (*i*, *j*) entry of *C*.

Answer: (a) *F* from *R*^{n} to *R*^{m} is differentiable at *A*. By the definition of differentiability there is a linear function *L*_{A}(*H*) such that $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||}$$

tends to 0 as ||*H*|| tends to zero.

By Theorem 2.1, every linear function from *R*^{n} to *R*^{m} can be written as *L*_{A}(*H*) = *C**H* for all *H*, where *C* is some *m* × *n* matrix.

(b) The absolute value of each component of a vector is less than or equal to the norm of the vector, so for each *H* and *i* we have $$0 \leq| f_i(A + H) - f_i(A) - C_i \cdot H| \leq ||F(A + H) - F(A) - CH||$$ where *C*_{i} is the *i*th row of *C*.

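Since $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||}$$ tends to 0 as ||*H*|| tends to zero,

by the squeeze theorem, both $\frac{ | f_i(A+H)- f_i(A)-C_i\cdot H|}{||H|| }$ and $\frac{ f_i(A+H)- f_i(A)-C_i\cdot H}{||H|| }$ tend to zero as ||*H*|| tends to zero. Conversely, if every component quotient tends to zero, so does the norm quotient, because the norm of a vector is at most the sum of the absolute values of its components.

(c) Let *H* = *h**e*_{j} = (0, 0, ..., *h*, 0, ..., 0), with *h* in the *j*th place, so that ||*H*|| = |*h*|. By part (b), $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)-hC_i \cdot e_j}{|h|} =0$$

Therefore, since |*h*|/*h* = ±1, $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)-hC_i \cdot e_j}{h} =0$$

Since *C*_{i} · *e*_{j} = *c*_{ij}, this says that $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)}{h} = c_{ij},$$ and by the definition of partial derivatives this limit is $\frac {\partial f_i}{\partial x_j} (A)$.

So we can conclude that $c_{ij} = \frac {\partial f_i}{\partial x_j} (A)$, that is, *C* = *D**F*(*A*).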

6.

6.14. Justify the following items which prove:

If *f* is continuous on *R*^{2} and ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*, then *f* is identically zero.

(a) If *f*(*a*, *b*) = *p* > 0 then there is a disc *D* of radius *r* > 0 centered at (*a*, *b*) in which $f(x,y)> \frac {1}{2}p$.

(b) If *f* is continuous and *f*(*x*, *y*) ≥ *p*_{1} > 0 on a disk *R* then ∫_{R} *f* *dA* ≥ *p*_{1}(area(*R*)).

(c) If ∫_{R} *f* *dA* = 0 for all smoothly bounded regions *R*, then *f* cannot be positive at any point.

(d) *f* is not negative at any point either.

(e) *f* = 0 at all points.

Answer: First, we assume that *f* is continuous on *R*^{2} and ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*.

(a) Let *p* = *f*(*a*, *b*) > 0. Because *f* is continuous at (*a*, *b*), by the definition of continuity there is *r* > 0 such that for all (*x*, *y*) with ||(*x*, *y*) − (*a*, *b*)|| < *r*, we have |*f*(*x*, *y*) − *f*(*a*, *b*)| < *p*/2. So *p* − *p*/2 < *f*(*x*, *y*) < *p* + *p*/2. In particular, *f*(*x*, *y*) > *p*/2.

(b) As *R* is bounded, the closure of *R* is closed and bounded. So we can apply the extreme value theorem, which means *f* is bounded on the closure of *R*. In particular, *f* is bounded on *R*. *f* is also integrable on *R*; in fact ∫_{R} *f* *dA* = 0 by assumption. Applying the lower bound property, ∫_{R} *f* *dA* ≥ *p*_{1}(area(*R*)) holds.

(c) Suppose *f* is positive at (*a*, *b*).

From (a), there is a disc *R* of nonzero radius on which *f*(*x*, *y*) > *f*(*a*, *b*)/2 > 0.

From (b), ∫_{R} *f* *dA* ≥ (*f*(*a*, *b*)/2) · area(*R*) > 0. But we assumed that ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*, which is a contradiction. Therefore *f* cannot be positive at any point.

(d) We know that −*f* is continuous and that, for all smoothly bounded regions *R*, linearity gives ∫_{R} (−*f*) *dA* = −∫_{R} *f* *dA* = −0 = 0. From (c), we know that −*f* cannot be positive at any point. Thus, we conclude that *f* cannot be negative at any point.

(e) Therefore, for any (*a*, *b*), *f*(*a*, *b*) is defined and is neither positive nor negative, so it must be 0.

7.

-6.44. Justify the following steps to prove that if *f* is integrable on *R*^{2} and *g* is a continuous function with 0 ≤ *g* ≤ *f* then *g* is integrable on *R*^{2}.

(a) ∫_{D(n)} *g* *dA* exists

(b) 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA*

(c) The numbers ∫_{D(n)} *g* *dA* are an increasing sequence bounded above.

(d) lim_{n → ∞}∫_{D(n)} *g* *dA* exists

Answer:

Check *D*: *D* = *R*^{2} is unbounded and *g* ≥ 0 is continuous, so we need to prove that lim_{n → ∞}∫_{D(n)} *g* *dA* exists.

(a) *g* ≥ 0 is continuous on *R*^{2} and *D*(*n*) is bounded for each *n*, so *g* is integrable over *D*(*n*).

(b) By Theorem 6.9, *L* · area(*D*) ≤ *I*(*f*, *D*), and the fact that 0 ≤ *g*, we know that 0 · area(*D*(*n*)) ≤ ∫_{D(n)} *g* *dA*. Likewise, since 0 ≤ *f*(*x*, *y*) − *g*(*x*, *y*), $$ 0=0 \cdot \mathrm{area}(D(n))\leq \int_{D(n)} \left(f(x,y)-g(x,y)\right) dA = \int_{D(n)} f dA - \int_{D(n)} g dA $$ Therefore, 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA*.

(c) Let *C*_{n} = ∫_{D(n)} *g* *dA*. Because *g* ≥ 0 and *D*(*n*) ⊆ *D*(*n* + 1), the sequence *C*_{1}, *C*_{2}, *C*_{3}, ... is increasing. Since 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA* and $$\lim_{n\to\infty} \int_{D(n)} f dA = \int_{R ^2} f dA$$ exists, we get $$\int_{D(n)} g dA \leq \int_{R ^2} f dA$$ so the sequence is bounded above.

(d) By the Monotone Convergence Theorem for sequences, the sequence ∫_{D(n)} *g* *dA*, being increasing and bounded above, is convergent, so lim_{n → ∞}∫_{D(n)} *g* *dA* = lim_{n → ∞}*C*_{n} exists.

8.

6.50. Justify steps (a)–(d) to prove that if a continuous function *f* is integrable on an unbounded set *D* then |∫_{D}*f**d**A*| ≤ ∫_{D}|*f*|*d**A*

(a)∫_{D}*f**d**A* = ∫_{D}*f*_{+}*d**A* − ∫_{D}*f*_{−}*d**A* ≤ ∫_{D}*f*_{+}*d**A* + ∫_{D}*f*_{−}*d**A* = ∫_{D}|*f*|*d**A*

(b)∫_{D}(−*f*)*d**A* ≤ ∫_{D}|*f*|*d**A*

(c)−∫_{D}*f**d**A* ≤ ∫_{D}|*f*|*d**A*

(d)|∫_{D}*f**d**A*| ≤ ∫_{D}|*f*|*d**A*

(a) By Definition 6.9, if *f* is continuous and integrable on an unbounded set *D*, then |*f*| is integrable on *D*. Rewrite *f*(*x*, *y*) = *f*_{+}(*x*, *y*) − *f*_{−}(*x*, *y*) where *f*_{+}(*x*, *y*) = *f*(*x*, *y*) if *f*(*x*, *y*) ≥ 0 and 0 otherwise, and *f*_{−}(*x*, *y*) = −*f*(*x*, *y*) if *f*(*x*, *y*) ≤ 0 and 0 otherwise. So, by the definition of ∫_{D} *f* *dA*, $$\int_ {D} f dA=\int_ {D} f_+ dA - \int_ {D} f_- dA $$ Since ∫_{D} *f*_{−} *dA* is nonnegative, $$\int_ {D} f_+ dA - \int_ {D} f_- dA \leq \int_ {D} f_+ dA + \int_ {D} f_- dA$$ Since *f*_{+} ≥ 0 and *f*_{−} ≥ 0 are integrable over *D*, $$\int_ {D(n)} f_+ dA + \int_ {D(n)} f_- dA =\int_ {D(n)} (f_+ + f_-) dA$$ By the properties of limits along the increasing sequence *D*(*n*), we know ∫_{D(n)}(*f*_{+} + *f*_{−}) *dA* converges, so $$\int_ {D} f_+ dA + \int_ {D} f_- dA =\int_ {D} (f_+ + f_-) dA$$ Since |*f*(*x*, *y*)| = *f*_{+}(*x*, *y*) + *f*_{−}(*x*, *y*), we get $$\int_ {D} f dA \leq \int_ {D} \left|f \right|dA$$

(b) In the same way, we apply (a) to the function −*f* to get $$\int_ {D} -f dA \leq \int_ {D} \left|-f \right|dA= \int_ {D} \left|f \right|dA$$

(c) By the properties of limits and the equation ∫_{D(n)} (−*f*) *dA* = −∫_{D(n)} *f* *dA*, we get $$- \int_ {D} f dA \leq \int_ {D} \left|f \right|dA$$

(d) If *b* ≤ *a* and −*b* ≤ *a* then |*b*| ≤ *a*. From (a), we got $$\int_ {D} f dA \leq \int_ {D} \left|f \right|dA$$ From (b) and (c), we got $$- \int_ {D} f dA \leq \int_ {D} \left|f \right|dA$$ Therefore, we can conclude that $$\left| \int_ {D} f dA\right| \leq \int_ {D} \left|f \right|dA$$

9.

-4.21. Find the point on the plane $$z = x - 2y + 3$$ that is closest to the origin, by finding where the square of the distance between (0, 0, 0) and a point (*x*, *y*, *z*) of the plane is at a minimum. Use the matrix of second partial derivatives to show that the point is a local minimum.

Let $$ D=d^2 = f(x,y)= x^2+y^2+(x-2y+3)^2 $$ To find the local extrema we set $$\nabla f = (4x-4y+6,\,-4x+10y-12)=0$$ which holds at (−0.5, 1). There $$ H(-0.5,1)=
\left[
\begin{array}{ c c }
4 & -4 \\
-4 & 10
\end{array} \right]
$$ Because 4 > 0 and (4)(10) − (−4)^{2} = 24 > 0, the matrix is positive definite by Theorem 4.3. By Theorem 4.8, if ∇*f*(*A*) = 0 and the Hessian matrix [*f*_{x_i x_j}(*A*)] is positive definite at *A*, then *f*(*A*) is a local minimum. Therefore, *f* has a local minimum at (−0.5, 1), and the corresponding point on the plane is (−0.5, 1, 0.5).
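As a numerical cross-check of the calculation above, the following minimal Python sketch solves ∇*f* = 0 by Cramer's rule and tests the positive-definiteness conditions. The function and variable names are illustrative choices, not part of the original solution.

```python
# Squared distance from the origin to the point (x, y, x - 2y + 3) on the plane.
def f(x, y):
    return x**2 + y**2 + (x - 2*y + 3)**2

# Gradient of f, as computed by hand in the answer.
def grad(x, y):
    return (4*x - 4*y + 6, -4*x + 10*y - 12)

# grad = 0 is the linear system  4x - 4y = -6,  -4x + 10y = 12.
a, b, c, d = 4.0, -4.0, -4.0, 10.0      # entries of the (constant) Hessian
r1, r2 = -6.0, 12.0                     # right-hand side
det = a * d - b * c                     # 24, the second leading minor
x0 = (r1 * d - b * r2) / det            # Cramer's rule
y0 = (a * r2 - c * r1) / det
z0 = x0 - 2 * y0 + 3                    # lift the critical point back onto the plane

is_positive_definite = a > 0 and det > 0
print((x0, y0, z0), f(x0, y0), is_positive_definite)
```

The squared distance at the critical point is 1.5, so the distance from the origin to the plane is √1.5.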

10.

-7.32. Let *S* be the unit sphere centered at the origin in *R*^{3}. Evaluate the following items, using as little calculation as possible

(a)∫_{S}1*d**σ*

(b)∫_{S}||*X*||^{2}*d**σ*

(c) Verify that ∫_{S}*x*_{1}^{2}*d**σ* = ∫_{S}*x*_{2}^{2}*d**σ* = ∫_{S}*x*_{3}^{2}*d**σ* using either a symmetric argument or parametrizations. Can you do this without evaluating them?

(d) Use the result of parts (b) and (c) to deduce the value of ∫_{S}*x*_{1}^{2}*d**σ*

Answer:

(a) Geometrically, ∫_{S} 1 *dσ* is the area of the unit sphere in *R*^{3}, so ∫_{S} 1 *dσ* = 4*π* · 1^{2} = 4*π*.

(b) For all *X* ∈ *S* we have ||*X*||^{2} = 1, therefore ∫_{S}||*X*||^{2} *dσ* = ∫_{S} 1 *dσ* = 4*π*.

(c) Rotation by *π*/2 about the *x*_{3}-axis corresponds to some transformation on the domain of the parametrization of *S*. It maps the sphere to itself, preserves area, and carries the *x*_{1} coordinate to the position of *x*_{2}. Therefore ∫_{S}*x*_{1}^{2} *dσ* = ∫_{S}*x*_{2}^{2} *dσ*. In the same way, a rotation by *π*/2 about the *x*_{2}-axis gives ∫_{S}*x*_{1}^{2} *dσ* = ∫_{S}*x*_{3}^{2} *dσ*. Therefore, ∫_{S}*x*_{1}^{2} *dσ* = ∫_{S}*x*_{2}^{2} *dσ* = ∫_{S}*x*_{3}^{2} *dσ*.

(d) By the definition of the norm ||*X*||, we know that ||*X*||^{2} = *x*_{1}^{2} + *x*_{2}^{2} + *x*_{3}^{2}. So, $$\int_{S} ||X||^2 d\sigma= \int_{S} \left(x_1^2 +x_2^2 +x_3^2\right) d\sigma= 3\int_{S} x_1^2 d\sigma = 4\pi$$ Therefore, $$ \int_{S} x_1^2 d\sigma =\frac{1}{3}\int_{S} ||X||^2 d\sigma = \frac{4\pi}{3}$$
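The value in part (d) can be sanity-checked numerically. The sketch below is only an illustration, not part of the assigned solution: it samples points uniformly on the unit sphere by normalizing Gaussian vectors (a standard technique), averages *x*_{1}^{2}, and multiplies by the sphere's area 4*π*. The function name and sample size are arbitrary.

```python
import math
import random

def mean_x1_squared(n_samples, seed=0):
    """Average of x1^2 over points drawn uniformly from the unit sphere in R^3."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # A normalized standard Gaussian vector is uniformly distributed on the sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm_sq = sum(c * c for c in g)
        total += g[0] * g[0] / norm_sq
    return total / n_samples

mean = mean_x1_squared(50_000)
integral_estimate = 4 * math.pi * mean   # surface integral = area * average value
print(mean, integral_estimate, 4 * math.pi / 3)
```

The average should come out close to 1/3, in line with the symmetry argument of part (c).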

CudaHashedNet Midterm Report

and 1 collaborator

# Introduction

As available datasets increase in size, machine learning models can successfully use more and more parameters. In applications such as computer vision, models with up to 144 million \cite{simonyan2014very} parameters are not uncommon and reach state-of-the-art performance. Experts can train and deploy such models on large machines, but effective use of lower-resource hardware such as commodity laptops or even mobile phones remains a challenge.

One way to address the challenge of large models is through model compression using hashing \cite{hashnets}. In general, this amounts to reducing a parameter set *S* = {*s*_{0}, *s*_{1}, ...*s*_{D}} to a greatly reduced set *R* = {*r*_{0}, *r*_{1}, ..., *r*_{d}} with *d* ≪ *D* by randomly tying parameters to hash buckets (*s*_{i} = *r*_{h(i)}). This turned out to perform very well for neural networks, leading to the so-called HashedNets.
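To make the parameter-tying idea concrete, here is a minimal Python sketch of expanding a small set *R* into a virtual parameter set *S* via *s*_{i} = *r*_{h(i)}. The multiplicative hash used for *h* is a stand-in chosen for illustration only; it is an assumption, not the hash function used in the HashedNets work, and the function names are hypothetical.

```python
def bucket(i, n_buckets):
    """Hypothetical hash h(i): map parameter index i to a bucket in [0, n_buckets)."""
    return (i * 2654435761 % 2**32) % n_buckets

def expand(r, n_params):
    """Build the virtual parameters s_0..s_{D-1} from the stored buckets r."""
    return [r[bucket(i, len(r))] for i in range(n_params)]

r = [0.1, 0.2, 0.3]      # d = 3 stored values
s = expand(r, 12)        # D = 12 virtual parameters, all tied to entries of r
print(s)
```

Only `r` has to be stored; any *s*_{i} can be recomputed on demand from the index *i*, which is what makes the compression possible.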

Many machine learning models involve several linear projections representable by matrix-vector products *W* ⋅ *x* where *x* is input data and *W* consists of model parameters. In most such models, this linear algebra operation is the performance bottleneck; neural networks, in particular, chain a large number of matrix-vector products, intertwined with non-linearities. In terms of dimensionality, modern systems deal with millions of training samples *x*_{i} lying in possibly high-dimensional spaces. The shape of *W*, (*d*_{out}, *d*_{in}), depends on how deep a layer is in a network: at the first layer, *d*_{in} depends on the data being processed, while *d*_{out} at the final layer depends on the desired system output (e.g., *d*_{out} = 1 for binary classification, and *d*_{out} = *p* if the output can fall in *p* classes). In middle layers, dimensionality is up to the model designer, and increasing it can make the model more powerful but bigger and slower. Notably, middle layers often have square *W*_{h}. When *W* is stored in a reduced hashed format *W*_{h}, many common trade-offs may change.

The goal of our project is to explore the performance bottlenecks of the *W*_{h} ⋅ *x* operation where *W*_{h} is a hashed representation of an array that stays constant for many inputs *x*_{i}. Since neural networks are typically trained with batches of input vectors *x* concatenated into an input matrix *X*, we will look at the general case of matrix-matrix multiplication, where the left matrix is in a reduced hashed format *W*_{h} ⋅ *X*.

Taking advantage of massively parallel GPU architecture can be important even when dealing with smaller models. In March 2015, Nvidia announced a SoC for mobile devices with a GPU performance of 1 teraflop, the Tegra X1 \cite{tegra}; we foresee future mobile devices to have stronger and stronger GPUs.

The objectives of our project are to:

- Investigate fast applications of *W*_{h} ⋅ *X* when *W*_{h} is small enough to be fully loaded into memory. In this case, is it faster to first materialize the hashed array and use existing fast linear algebra routines? Can the product be computed faster on a GPU with minimal memory overhead? This can lead to highly efficient deployment of powerful models on commodity hardware or phones.
- Analyze performance when even after hashing *W*_{h} is too big. In the seminal work that popularized the use of large-scale deep convolutional neural networks and training using the GPU \cite{krizhevsky2012imagenet}, Krizhevsky predicts that GPUs with more memory can lead to bigger networks with better performance. Hashing-based compression can help practitioners prototype very large models on their laptops before deciding which configuration to spend cloud computing resources on. Can we make HashedNets training on GPUs efficient? This may involve forcing a predictable split on the hash function to allow for independent division of work.
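The trade-off raised in the first objective, materializing the hashed array versus computing *W*_{h} ⋅ *X* directly, can be illustrated on the CPU with a small pure-Python sketch. Everything here is an assumption made for illustration: the hash is a simple multiplicative stand-in, the function names are hypothetical, and no GPU code or performance claim is implied.

```python
def bucket(i, j, d_in, n_buckets):
    """Hypothetical hash h(i, j): bucket holding the weight at row i, column j."""
    return ((i * d_in + j) * 2654435761 % 2**32) % n_buckets

def materialize(r, d_out, d_in):
    """Expand the stored buckets r into a dense d_out x d_in matrix."""
    return [[r[bucket(i, j, d_in, len(r))] for j in range(d_in)]
            for i in range(d_out)]

def matmul(W, X):
    """Plain dense product W @ X on nested lists."""
    n_cols = len(X[0])
    return [[sum(W[i][k] * X[k][c] for k in range(len(X))) for c in range(n_cols)]
            for i in range(len(W))]

def hashed_matmul(r, d_out, d_in, X):
    """Compute W_h @ X without storing W: look each weight up in r on the fly."""
    n_cols = len(X[0])
    out = [[0.0] * n_cols for _ in range(d_out)]
    for i in range(d_out):
        for k in range(d_in):
            w = r[bucket(i, k, d_in, len(r))]
            for c in range(n_cols):
                out[i][c] += w * X[k][c]
    return out

r = [0.5, -1.0, 2.0]                               # compressed parameters
X = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]  # batch of 2 inputs, d_in = 4
dense = matmul(materialize(r, 3, 4), X)
direct = hashed_matmul(r, 3, 4, X)
print(dense, direct)
```

The first route needs memory for the full *d*_{out} × *d*_{in} matrix but can reuse existing dense routines; the second keeps only `r` in memory at the cost of a hash evaluation per weight.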

The Design of HyperFETs

# Model

## Transistor

The transistor is modeled generically by a heavily simplified virtual-source (short-channel) MOSFET model \cite{Khakifirooz_2009}. Although this model was first defined for Silicon transistors, it has been successfully adapted to numerous other contexts, including Graphene \cite{Han_Wang_2011} and Gallium Nitride devices, both HEMTs \cite{RadhakrishnaThesis} and MOSHEMT+VO_{2} HyperFETs \cite{Verma_2017}. Following Khakifirooz \cite{Khakifirooz_2009}, the drain current *I*_{D} is expressed \begin{equation}
\frac{I_D}{W}=Q_{ix_0}v_{x_0}F_s
\end{equation} where *Q*_{ix_0} is the charge at the virtual source point, *v*_{x_0} is the virtual source saturation velocity, and *F*_{s} is an empirically fitted “saturation function” which smoothly transitions between linear (*F*_{s} ∝ *V*_{DS}/*V*_{DSSAT}) and saturation (*F*_{s} ≈ 1) regimes. The charge in the channel is described via the following semi-empirical form first proposed for CMOS-VLSI modeling \cite{Wright_1985} and employed frequently since (often with modifications, e.g. \cite{Khakifirooz_2009, RadhakrishnaThesis}): \begin{equation}
Q_{ix_0}=C_\mathrm{inv}nV_\mathrm{th}\ln\left[1+\exp\left\{\frac{V_{GSi}-V_T}{nV_\mathrm{th}}\right\}\right]
\end{equation} where *C*_{inv} is an effective inversion capacitance for the gate, *n**V*_{th}ln10 is the subthreshold swing of the transistor, *V*_{GSi} is the transistor gate-to-source voltage, *V*_{T} is the threshold voltage, and *V*_{th} is the thermal voltage *k**T*/*q*.

For precise modeling, Khakifirooz includes further adjustments of *V*_{T} due to the drain voltage (DIBL parameter) and the gate voltage (strong vs weak inversion shift), as well as a functional form of *F*_{s}. For a first-pass, we will ignore these effects, employ a constant *V*_{T}, and assume the supply voltage is maintained above the gate overdrive such that *F*_{s} ≈ 1. However, we will add on a leakage floor with conductance *G*_{leak}. Altogether, the final current expression (for the analytical part of this analysis) is \begin{equation}
\frac{I_D}{W}=nv_{x_0}C_\mathrm{inv}V_{th}\ln\left[1+\exp\left\{\frac{V_\mathrm{GSi}-V_\mathrm{T}}{nV_{th}}\right\}\right]+\frac{G_\mathrm{leak}}{W}V_\mathrm{DSi}\label{eq:transistor_iv}
\end{equation}
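A minimal Python sketch of the final current expression may help when exploring its two limits. The parameter values below are placeholders chosen only for illustration (they are assumptions, not values from this work), and the function name is hypothetical.

```python
import math

def drain_current_per_width(v_gsi, v_dsi, *, v_t, n, v_x0, c_inv,
                            g_leak_per_width, v_th=0.02585):
    """I_D/W = n*v_x0*C_inv*V_th*ln(1 + exp((V_GSi - V_T)/(n*V_th))) + (G_leak/W)*V_DSi."""
    x = (v_gsi - v_t) / (n * v_th)
    # Numerically stable evaluation of ln(1 + exp(x)) for large |x|.
    if x > 0:
        log_term = x + math.log1p(math.exp(-x))
    else:
        log_term = math.log1p(math.exp(x))
    return n * v_x0 * c_inv * v_th * log_term + g_leak_per_width * v_dsi

# Illustrative placeholder parameters (assumed, not fitted to any device).
params = dict(v_t=0.5, n=1.5, v_x0=1e7, c_inv=1e-6, g_leak_per_width=0.0)

i_on = drain_current_per_width(1.5, 1.0, **params)       # well above threshold
i_sub = drain_current_per_width(-0.3, 1.0, **params)     # deep subthreshold
swing = params["n"] * 0.02585 * math.log(10)             # n*V_th*ln(10)
i_sub_next = drain_current_per_width(-0.3 + swing, 1.0, **params)
print(i_on, i_sub_next / i_sub)
```

Well above threshold the logarithm is approximately linear, so the current approaches *v*_{x_0}*C*_{inv}(*V*_{GSi} − *V*_{T}); deep in subthreshold (with the leakage term set to zero) the current changes by a factor of ten for every *n**V*_{th} ln 10 of gate voltage, consistent with the stated subthreshold swing.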

AEP 4830 HW9 Monte Carlo Calculations

The purpose of this homework is to explore the Monte Carlo Algorithm and apply it to the simplified protein folding model in 2D.

# Monte Carlo Method

The Monte Carlo method randomly generates candidate solutions to a problem within a solution space and tests the degree of goodness of each candidate against certain physical requirements\cite{NumRec}. There are usually two ways of generating the candidate solutions. First, we can generate them totally at random; for example, we use a random number generator to do Monte Carlo integration. Second, we can generate each candidate from the previous one by randomly changing some of its parameters. We will use the latter approach to generate our 2D protein structures in this homework.

The general flow of Monte Carlo Method is shown as follows. Note that we use the term “conformation space” instead of “solution space” since we are talking about protein structures here.

1. Start from an initial state in the conformation space.
2. Randomly change the previous state, subject to requirement 1.
3. Determine the degree of goodness by criterion 2.
4. Accept or reject this state by physical rule 3.
5. If it is accepted, pass this state on and repeat 2 through 5 for a certain number of steps.
6. If it is rejected, do not pass the state on and repeat 2 through 4 until the new state is accepted.

The requirement 1, criterion 2 and rule 3 are problem-specific and we will mention these in our protein folding problem.

# 2D Protein Folding

Proteins are composed of 20 different amino acids (AAs) in a polypeptide chain and, due to the mutual interactions between those AAs, proteins will favor some folded states to lower the Gibbs free energy. The interactions are mostly negative because of hydrophobic effects or ion-ion interactions. In order for proteins to perform certain biological functions, their unique structures are essential. We can use a simple bead-and-chain model for a 2D protein chain\cite{S_ali_1994}, assuming that all the AAs are of the same size and that the peptide bond between two AAs is rigid, being exactly one unit long and unstretchable. Each AA occupies one grid point of the 2D space and cannot occupy the same point as any other AA. When the protein folds, the non-covalent interactions apply to pairs of non-bonded AAs separated by one unit. We can then calculate the relative Gibbs free energy *Δ**G* by summing all the interactions of non-bonded neighbors.

\begin{equation}
\Delta G = E_0 = \sum_{(i,j)} E_{t(i)t(j)}
\end{equation} where (*i*, *j*) are the indices of two neighboring AAs of types (*t*(*i*), *t*(*j*)) and *E* is a 20 × 20 interaction matrix.

With this model in mind, we can determine the requirements mentioned in the previous section.

Requirement 1:

- The modified AA cannot occupy another AA's position.
- The modified AA must be one unit away from its neighbor(s).
- The best way to modify an AA's position is to move it by one of eight possible changes: (1, 0), (1, 1), (0, 1), (−1, 1), (−1, 0), (−1, −1), (0, −1) and (1, −1).

Criterion 2: Evaluate the interaction energy *E*_{0} and use this number to determine the goodness of the state. The lower, the better.

Rule 3:

- If the new state has lower *E*_{0}, the protein will adopt this state in order to reach the minimum of the folding landscape.
- If the new state has higher *E*_{0}, the protein does not favor such a state. However, there is still some probability of jumping from a lower-energy state to a higher-energy one, *P* = *e*^{−(E_new − E_0)/kT}.

Once the model is set and the steps are clear, we can start to do the simulation.
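Criterion 2 and rule 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for this homework: it uses Python's built-in `random` module rather than *myrand()*, the function names are hypothetical, and it assumes that "separated by one unit" means nearest lattice neighbors (Manhattan distance 1).

```python
import math
import random

def energy(types, positions, E):
    """Sum E[t(i)][t(j)] over non-bonded pairs of AAs one lattice unit apart."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 2, n):          # j = i + 1 is a bonded neighbor, skip it
            dx = abs(positions[i][0] - positions[j][0])
            dy = abs(positions[i][1] - positions[j][1])
            if dx + dy == 1:               # assumption: nearest lattice neighbors only
                total += E[types[i]][types[j]]
    return total

def accept(e_old, e_new, kT, rng):
    """Rule 3: always accept a lower energy, else accept with P = exp(-(E_new - E_0)/kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)

# Toy example: a 4-bead chain folded into a unit square, with two AA types.
E = [[-1.0, -2.0], [-2.0, -3.0]]
types = [0, 1, 0, 1]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
e_square = energy(types, square, E)        # beads 0 and 3 are in contact
e_straight = energy(types, straight, E)    # no non-bonded contacts
print(e_square, e_straight, accept(e_straight, e_square, 1.0, random.Random(0)))
```

In the square conformation only the first and last beads form a non-bonded contact, so the energy is the single matrix entry for their types; the straight chain has no contacts and zero energy.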

# Program Codes

First, we need a general random number generator, *myrand(seed)*. We will test its validity and then apply it to alter the position of a randomly selected AA. Given a different *seed*, the function will give a different random number sequence. Our seeds for generating the interaction matrix *E* and the AA sequence in the protein are two distinct yet fixed values, so we guarantee that we use exactly the same protein and interactions throughout the calculation. Apart from those, the seed will be set by *time(NULL)* and so is independent of our bias.

Second, there are several subroutines to do the Monte Carlo calculations and to make sure the protein is subject to some requirements. Note that the information of proteins is stored in a 45 × 3 matrix with the first column being AA types, second the x positions and third the y positions.

- *neighbor()*: inputs a Protein Vector and outputs the pairs of indices of two non-bonding AAs.
- *Energy()*: inputs pairs of neighbor indices, the Protein Vector and the interaction matrix *E*, and outputs the energy *E*_{0}.
- *n2ndistance()*: inputs a Protein Vector and outputs the end-to-end distance of the protein.
- *pcheck()*: inputs a Protein Vector and the index of a certain AA and checks whether that AA occupies another AA's position. The function outputs *true* if the protein is not allowed, *false* otherwise.
- *conformationchange()*: inputs a Protein Vector, makes a position change to one of its AAs, and outputs a modified new Protein Vector.

# Results

First we tested the random number generator *myrand()*. The random number generator gives a uniform distribution of numbers between 0 and 1. And the points (*x*_{n + 1}, *x*_{n}) cover the 1 × 1 square without noticeable patterns as shown in Fig. 1. We further test it by estimating *π*.

\begin{equation}
\frac{\pi}{4} = \frac{N_{in}}{N}
\end{equation} where *N*_{in} is the number of points in the quarter circle and *N* is the total number of points. As the total number of points increases, the RHS will approach $\frac{\pi}{4}$ asymptotically, as shown in Fig. 2.
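The quarter-circle test can be sketched as follows. Note that this sketch substitutes Python's built-in `random` module for *myrand()*, so it illustrates the test itself rather than the behavior of the generator written for this homework; the function name is hypothetical.

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of uniform points in the unit square
    that fall inside the quarter circle x^2 + y^2 <= 1."""
    rng = random.Random(seed)
    n_in = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            n_in += 1
    return 4.0 * n_in / n_points       # pi/4 = N_in / N

for n in (100, 10_000, 1_000_000):
    print(n, estimate_pi(n))
```

The statistical error of the estimate shrinks roughly as 1/√*N*, so each extra digit of accuracy costs about a hundred times more points.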

Macalester POTW 1201: Problem 1201. What Goes Up Might Not Come Down

## Problem Statement

A random walk on the 2-dimensional integer lattice begins at the origin. At each step, the walker moves one unit either left, right, or up, each with probability $\frac13$. (No downward steps ever.) A walk is a success if it reaches the point (1, 1). What is the probability of success?

Note: One can vary the problem by varying the target point. E.g., use (1, 0) or (0, 1) instead. Perhaps there is a good method to resolve the general case of target (*a*, *b*).

Source: Bruce Torrence, Randolph-Macon College
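Before attempting an exact solution, the success probability can be estimated by simulation. The Python sketch below is one possible setup, not a solution from the problem source; the function names are illustrative. It relies on the fact that a walk can be stopped as soon as it rises above the target's row, since it can never come back down.

```python
import random

def walk_succeeds(rng, target=(1, 1)):
    """Simulate one walk from the origin; return True if it ever visits target."""
    x, y = 0, 0
    while y <= target[1]:              # once above the target row, failure is certain
        if (x, y) == target:
            return True
        step = rng.randrange(3)        # 0: left, 1: right, 2: up, each with probability 1/3
        if step == 0:
            x -= 1
        elif step == 1:
            x += 1
        else:
            y += 1
    return False

def estimate_success(n_walks, target=(1, 1), seed=0):
    rng = random.Random(seed)
    return sum(walk_succeeds(rng, target) for _ in range(n_walks)) / n_walks

print(estimate_success(20_000))
print(estimate_success(20_000, target=(0, 1)))
```

Changing `target` gives estimates for the variants mentioned in the note, which can then be compared against any closed-form conjecture for the general target (*a*, *b*).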