A quick introduction to version control with Git and GitHub

and 2 collaborators

# Introduction to version control

Many scientists write code as part of their research. Just as experiments are logged in laboratory notebooks, it is important to document the code you use for analysis. However, a few key problems can arise when iteratively developing code that make it difficult to document and track which code version was used to create each result. First, you often need to experiment with new ideas, such as adding new features to a script or increasing the speed of a slow step, but you do not want to risk breaking the currently working code. One often utilized solution is to make a copy of the script before making new edits. However, this can quickly become a problem because it clutters your filesystem with uninformative filenames, e.g. `analysis.sh`, `analysis_02.sh`, `analysis_03.sh`, etc. It is difficult to remember the differences between the versions of the files, and more importantly which version you used to produce specific results, especially if you return to the code months later. Second, you will likely share your code with multiple lab mates or collaborators and they may have suggestions on how to improve it. If you email the code to multiple people, you will have to manually incorporate all the changes each of them sends.

Fortunately, software engineers have already developed software to manage these issues: version control. A version control system (VCS) allows you to track the iterative changes you make to your code. Thus you can experiment with new ideas but always have the option to revert to a specific past version of the code you used to generate particular results. Furthermore, you can record messages as you save each successive version so that you (or anyone else) reviewing the development history of the code is able to understand the rationale for the given edits. Also, it facilitates collaboration. Using a VCS, your collaborators can make and save changes to the code, and you can automatically incorporate these changes to the main code base. The collaborative aspect is enhanced with the emergence of websites that host version controlled code.

In this quick guide, we introduce you to one VCS, Git (git-scm.com), and one online hosting site, GitHub (github.com), both of which are currently popular among scientists and programmers in general. More importantly, we hope to convince you that although mastering a given VCS takes time, you can already achieve great benefits by getting started using a few simple commands. Furthermore, not only does using a VCS solve many common problems when writing code, it can also improve the scientific process. By tracking your code development with a VCS and hosting it online, you are performing science that is more transparent, reproducible, and open to collaboration \cite{23448176, 24415924}. There is no reason this framework needs to be limited only to code; a VCS is well-suited for tracking any plain-text files: manuscripts, electronic lab notebooks, protocols, etc.

Charge Density Waves: Models, Current Characteristics and Applications

The Municipality as a Center of Science: Citizen Participation, Local Research, and Amplifying Scientific Culture.

and 2 collaborators

The imminent creation of a Ministry of Science and Technology aims to strengthen the public role in research and knowledge development. A key challenge for the scientific community in this scenario is to justify to citizens the amount of energy and public resources it will be given. Disciplines such as *Education and Public Outreach* focus on the actions the scientific community must carry out. This document proposes a complementary municipal strategy, anchored in the discipline of *Citizen Science*, to facilitate the actions of citizens who are interested in influencing the creation of knowledge and whose taxes are being used. It presents a strategy that takes advantage of the municipal sphere and its territoriality to reflect the needs, aspirations, and requirements of residents regarding scientific research and development linked to their territory. In addition, it proposes a system that reflects the emergent character of these interests, needs, and aspirations, and at the same time offers the opportunity to connect researchers through their projects, whether proposed or existing, that satisfy those interests, needs, and aspirations, thereby linking residents with projects that address their concerns.

What's Open Access Good For? Absolutely everything!

and 5 collaborators

The effect of carbon subsidies on marine planktonic niche partitioning and recruitment during biofilm assembly

and 1 collaborator

Homework Portfolio

1.

- 1.71. Our proof of the Cauchy-Schwarz inequality, Theorem 1.13, used that when *U* is a unit vector, $$0 \leq ||V-(U \cdot V)U||^2 = ||V||^2 -(U \cdot V)^2$$ Therefore if *U* is a unit vector and equality holds, then *V* = (*U* · *V*)*U*. Show that equality occurs in the Cauchy-Schwarz inequality for two arbitrary vectors *V* and *W* only if one of the vectors is a multiple (perhaps zero) of the other vector.

Answer: In the first case, when *W* = 0, *W* = 0 · *V* is a multiple of *V*. In the second case, when *W* is nonzero, consider the unit vector $U = \frac{W}{||W||}$. Equality in the Cauchy-Schwarz inequality for *V* and *W* means $|V \cdot W| = ||V||\,||W||$; dividing by $||W||$ gives $(U \cdot V)^2 = ||V||^2$, so by the result in the question it follows that *V* = (*U* · *V*)*U*. Therefore:

$$V = (U \cdot V)U = \left(\frac{W}{||W||} \cdot V\right)\frac{W}{||W||}= \left(\frac{W \cdot V}{||W||^2}\right)W$$ Since $\frac{W \cdot V}{||W||^2}$ is a scalar, *V* is a multiple of *W*.

2.

-2.19. Suppose C is an n by n matrix with orthonormal columns. Use Theorem 2.2 to show that $$||CX|| \leq \sqrt{n} ||X||$$ Use the Pythagorean theorem and the result of Problem 2.17 to show that in fact $$||CX|| = ||X||$$ for such a matrix.

Answer: (1) First we calculate ||*C*||. Let *C*_{j} denote the *j*th column of *C*. Since *C* has orthonormal columns, each *C*_{j} has norm 1. Then $$||C|| = \sqrt{\sum_{i=1}^n\sum_{j=1}^n C_{ij}^2}$$ $$=\sqrt{\sum_{j=1}^n\left(\sum_{i=1}^n C_{ij}^2\right)}$$ $$=\sqrt{\sum_{j=1}^n|| C_j||^2} = \sqrt{n}$$ Now by Theorem 2.2, $$||CX|| \leq ||C||\,||X|| = \sqrt{n}\,||X||$$ as desired. (2) By Problem 2.17, *CX* = *x*_{1}*C*_{1} + *x*_{2}*C*_{2} + … + *x*_{n}*C*_{n}. To find the norm of the right-hand side, we apply the Pythagorean theorem (this is where we need the columns of *C* to be mutually orthogonal) to get $$||x_1 C_1 + x_2 C_2 + \dots + x_n C_n||^2 = ||x_1 C_1||^2 + ||x_2 C_2||^2 + \dots + ||x_n C_n||^2$$ Now we can put the pieces together: $$||CX||^2 = ||x_1 C_1 + x_2 C_2 + \dots + x_n C_n||^2$$ $$=||x_1 C_1||^2 + ||x_2 C_2||^2+\dots+ ||x_n C_n||^2$$ $$=x_1^2||C_1||^2 + x_2^2|| C_2||^2+\dots+ x_n ^2||C_n||^2$$ $$=x_1^2 + x_2^2+\dots+ x_n ^2$$ $$= ||X||^2$$ Since norms are nonnegative, we can conclude that ||*CX*|| = ||*X*||.

3.

-2.44. Use the Cauchy-Schwarz inequality $$|A · B| \leq||A||||B||$$

to prove: (a) the function *f*(*X*) = *C* · *X* is uniformly continuous,

(b) the function *g*(*X*, *Y*)=*X* · *Y* is continuous.

Answer: (a) Case 1: if *C* = 0 then *f*(*X*) = 0 for all *X*, so |*f*(*X*) − *f*(*Y*)| = 0 < *ε* for all *ε* > 0 and all *X*, *Y*.

Case 2: *C* = (*c*_{1}, *c*_{2}, ..., *c*_{n}) ≠ (0, 0, ..., 0).

By the definition of *f* and properties of the dot product, |*f*(*X*) − *f*(*Y*)| = |*C* · *X* − *C* · *Y*| = |*C* · (*X* − *Y*)|.

By the Cauchy-Schwarz inequality we get $$|f(X)- f(Y)|=|C \cdot (X-Y)|\leq||C||\,||X-Y||$$ Let *ε* > 0 and take $\delta=\frac{\varepsilon}{||C||}$.

If $||X-Y||<\delta= \frac{\varepsilon}{||C||}$, then |*f*(*X*) − *f*(*Y*)| ≤ ||*C*|| ||*X* − *Y*|| < *ε* for all *X* and *Y* in *R*^{n}. Since *δ* does not depend on *X* or *Y*, *f* is uniformly continuous for any *C*.

(b) Fix *V* = (*A*, *B*) in *R*^{2n}. We show that *g*(*U*) = *g*(*X*, *Y*) = *X* · *Y* is continuous at (*A*, *B*).

Given *ϵ* > 0, we need to find *δ* > 0 such that if $||U-V||=\sqrt {||X-A||^2+||Y-B||^2}<\delta$

then |*g*(*U*) − *g*(*V*)| = |*X* · *Y* − *A* · *B*| < *ϵ*.

By the triangle inequality and the Cauchy-Schwarz inequality, we know that $$|X \cdot Y-A \cdot B|= |[(X-A)+A] \cdot [(Y-B)+B]-A \cdot B|$$ $$=|(X-A) \cdot (Y-B)+B \cdot (X-A)+A \cdot (Y-B)|$$ $$\leq||X-A||\,||Y-B||+||B||\, ||X-A||+||A||\, ||Y-B||$$ Set $\delta= \min\left(1,\frac {\epsilon}{1+||A||+||B||}\right)$. If $||U-V||<\delta$, then $||Y-B|| < 1$, and both $||X-A||$ and $||Y-B||$ are at most $||U-V||$, so $$||X-A||\,||Y-B||+||B||\, ||X-A||+||A||\, ||Y-B|| \leq (1+||A||+||B||)\,||U-V||$$ $$<(1+||A||+||B||)\,\delta$$ $$\leq(1+||A||+||B||) \frac {\epsilon}{1+||A||+||B||}$$ $$=\epsilon$$ Therefore, for any (*A*, *B*) in *R*^{2n}, *g*(*X*, *Y*) = *X* · *Y* is continuous.

4.

-2.45. In the triangle inequality ||*A* + *B*|| ≤ ||*A*|| + ||*B*|| put *A* = *X* − *Y* and *B* = *Y*. Deduce ||*Y*|| − ||*X*|| ≤ ||*Y* − *X*||. Show that if two points are within one unit distance of each other, then the difference of their norms is less than or equal to one.

Answer: Let *A* and *B* be in *R*^{n}. By the triangle inequality we have $$||A + B|| \leq ||A|| + ||B||.$$ Let *X* = *A* + *B* and *Y* = *B*, so *A* = *X* − *Y*. Then $$||X|| \leq ||X - Y|| + ||Y||, \quad\text{so}\quad ||X|| - ||Y|| \leq ||X - Y||.$$

Exchanging the roles of *X* and *Y* we get $$ ||Y|| - ||X|| \leq ||Y - X||.$$ So when *X* and *Y* are within one unit of distance of each other, that is ||*Y* − *X*|| ≤ 1, we can conclude $$||Y|| - ||X|| \leq ||Y - X||\leq 1$$ Since ||*X* − *Y*|| = ||*Y* − *X*||, the same bound holds for ||*X*|| − ||*Y*||, so the difference of the norms is at most one in absolute value.

5.

-3.9. Suppose a function *F* from *R*^{n} to *R*^{m} is differentiable at *A*. Justify the following statements that prove $$L_AH = DF(A)H,$$ that is, the linear function *L*_{A} in Definition 3.4 is given by the matrix of partial derivatives *DF*(*A*).

(a) There is a matrix *C* such that *L*_{A}(*H*) = *CH* for all *H*.

(b) Denote by *C*_{i} the *i*th row of *C*. The fraction $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||} = \frac{||F(A+H)-F(A)-CH||}{||H||}$$ tends to zero as ||*H*|| tends to zero if and only if each component $$\frac {f_i(A+H)- f_i(A)-C_iH}{ ||H||}$$ tends to zero as ||*H*|| tends to zero.

(c) Set *H* = *h**e*_{j} in the *i*th component of the numerator to show that the partial derivative $f_{i,x_j}(A)$ exists and is equal to the (*i*, *j*) entry of *C*.

Answer: (a) *F* from *R*^{n} to *R*^{m} is differentiable at *A*. By the definition of differentiability there is a linear function *L*_{A} such that $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||}$$

tends to 0 as ||*H*|| tends to zero.

By Theorem 2.1, every linear function from *R*^{n} to *R*^{m} can be written as *L*_{A}(*H*) = *CH* for all *H*, where *C* is some *m* × *n* matrix.

(b) The absolute value of each component of a vector is less than or equal to the norm of the vector, so for each *H* and *i* we have $$0 \leq| f_i(A + H) - f_i(A) - C_i \cdot H| \leq ||F(A + H) - F(A) - CH||$$ where *C*_{i} is the *i*th row of *C*.

Since $$\frac{||F(A+H)-F(A)-L_A(H)||}{||H||}$$ tends to 0,

by the squeeze theorem, both $\frac{ | f_i(A+H)- f_i(A)-C_i \cdot H|}{||H|| }$ and $\frac{ f_i(A+H)- f_i(A)-C_i \cdot H}{||H|| }$ tend to zero as ||*H*|| tends to zero. Conversely, the norm of a vector is at most the sum of the absolute values of its components, so if every component fraction tends to zero, then so does the fraction for *F*.

(c) Let *H* = *h**e*_{j}, where *e*_{j} = (0, 0, ..., 1, 0, ..., 0) has the 1 in the *j*th place, so that ||*H*|| = |*h*|. By part (b), $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)-hC_i \cdot e_j}{|h|} =0$$

Therefore $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)-hC_i \cdot e_j}{h} =0$$

Since *C*_{i} · *e*_{j} = *c*_{ij}, this says $$\lim_{h\to 0 } \frac{f_i(A+he_j)- f_i(A)}{h} = c_{ij}$$ so by the definition of partial derivatives, $\frac {\partial f_i}{\partial x_j} (A)$ exists.

So we can conclude that $c_{ij} = \frac {\partial f_i}{\partial x_j} (A)$, that is, *C* = *DF*(*A*).

6.

6.14. Justify the following items which prove:

If *f* is continuous on *R*^{2} and ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*, then *f* is identically zero.

(a) If *f*(*a*, *b*) = *p* > 0 then there is a disc *D* of radius *r* > 0 centered at (*a*, *b*) in which $f(x,y)> \frac {1}{2}p$

(b) If *f* is continuous and *f*(*x*, *y*) ≥ *p*_{1} > 0 on a disk *R* then ∫_{R} *f* *dA* ≥ *p*_{1} Area(*R*). (c) If ∫_{R} *f* *dA* = 0 for all smoothly bounded regions *R*, then *f* cannot be positive at any point. (d) *f* is not negative at any point either. (e) *f* = 0 at all points.

Answer: Throughout, we assume that *f* is continuous on *R*^{2} and ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*.

(a) Because *f* is continuous at (*a*, *b*), by the definition of continuity (with *ε* = *p*/2) there is *r* > 0 such that for all (*x*, *y*) with ||(*x*, *y*) − (*a*, *b*)|| < *r*, we have |*f*(*x*, *y*) − *f*(*a*, *b*)| < *p*/2. Since *f*(*a*, *b*) = *p* > 0, this gives *p* − *p*/2 < *f*(*x*, *y*) < *p* + *p*/2. In particular, *f*(*x*, *y*) > *p*/2.

(b) As *R* is bounded, the closure of *R* is closed and bounded. So we can apply the extreme value theorem, which means *f* is bounded on the closure of *R*. In particular, *f* is bounded on *R*, and *f* is integrable on *R*. Since *f* ≥ *p*_{1} on *R*, the lower bound property gives ∫_{R} *f* *dA* ≥ *p*_{1} Area(*R*).

(c) Suppose *f* is positive at (*a*, *b*).

From (a), there is a disc *R* of nonzero radius on which *f*(*x*, *y*) > *f*(*a*, *b*)/2 > 0.

From (b), ∫_{R} *f* *dA* ≥ (*f*(*a*, *b*)/2) · Area(*R*) > 0. But we assumed that ∫_{R} *f* *dA* = 0 for all smoothly bounded sets *R*, which is a contradiction. Therefore *f* cannot be positive at any point.

(d) We know that −*f* is continuous and that, for all smoothly bounded regions *R*, linearity gives ∫_{R} (−*f*) *dA* = −∫_{R} *f* *dA* = −0 = 0. Applying (c) to −*f*, we see that −*f* cannot be positive at any point. Thus, we conclude that *f* cannot be negative at any point.

(e) Therefore, for any (*a*, *b*), *f*(*a*, *b*) is defined and is neither positive nor negative, so it must be 0.

7.

-6.44. Justify the following steps to prove that if *f* is integrable on *R*^{2} and *g* is a continuous function with 0 ≤ *g* ≤ *f* then *g* is integrable on *R*^{2}.

(a) ∫_{D(n)} *g* *dA* exists

(b) 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA*

(c) The numbers ∫_{D(n)} *g* *dA* are an increasing sequence bounded above.

(d) lim_{n → ∞}∫_{D(n)} *g* *dA* exists

Answer:

Check *D*: *D* = *R*^{2} is unbounded and *g* ≥ 0 is continuous, so we need to prove that lim_{n → ∞}∫_{D(n)} *g* *dA* exists.

(a) *g* ≥ 0 is continuous on *R*^{2} and *D*(*n*) is bounded for each *n*, so *g* is integrable over *D*(*n*).

(b) By Theorem 6.9, *L* · area(*D*) ≤ *I*(*f*, *D*) for a lower bound *L* of *f* on *D*. Since 0 ≤ *g*, we know that 0 · area(*D*(*n*)) ≤ ∫_{D(n)} *g* *dA*. Likewise, since 0 ≤ *f*(*x*, *y*) − *g*(*x*, *y*), $$ 0=0 \cdot \mathrm{area}(D(n))\leq \int_{D(n)} (f-g)\, dA = \int_{D(n)} f\, dA - \int_{D(n)} g\, dA $$ Therefore, 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA*.

(c) Let *C*_{n} = ∫_{D(n)} *g* *dA*. Because *g* ≥ 0 and *D*(*n*) ⊂ *D*(*n* + 1), the sequence *C*_{1}, *C*_{2}, *C*_{3}, ... is increasing. Since 0 ≤ ∫_{D(n)} *g* *dA* ≤ ∫_{D(n)} *f* *dA*, and the integrals ∫_{D(n)} *f* *dA* increase (because *f* ≥ 0) to the limit $$\lim_{n\to\infty} \int_{D(n)} f\, dA = \int_{R ^2} f\, dA$$ which exists, we get $$\int_{D(n)} g\, dA \leq \int_{R ^2} f\, dA$$ so the sequence is bounded above.

(d) By the Monotone Convergence Theorem for sequences, the sequence ∫_{D(n)} *g* *dA*, being increasing and bounded above, is convergent, so lim_{n → ∞}∫_{D(n)} *g* *dA* = lim_{n → ∞}*C*_{n} exists.

8.

6.50. Justify steps (a)–(d) to prove that if a continuous function *f* is integrable on an unbounded set *D* then |∫_{D}*f**d**A*| ≤ ∫_{D}|*f*|*d**A*

(a)∫_{D}*f**d**A* = ∫_{D}*f*_{+}*d**A* − ∫_{D}*f*_{−}*d**A* ≤ ∫_{D}*f*_{+}*d**A* + ∫_{D}*f*_{−}*d**A* = ∫_{D}|*f*|*d**A*

(b)∫_{D}(−*f*)*d**A* ≤ ∫_{D}|*f*|*d**A*

(c)−∫_{D}*f**d**A* ≤ ∫_{D}|*f*|*d**A*

(d)|∫_{D}*f**d**A*| ≤ ∫_{D}|*f*|*d**A*

(a) By Definition 6.9, if *f* is continuous and integrable on an unbounded set *D*, then |*f*| is integrable on *D*. Rewrite *f*(*x*, *y*) = *f*_{+}(*x*, *y*) − *f*_{−}(*x*, *y*) where *f*_{+}(*x*, *y*) = *f*(*x*, *y*) if *f*(*x*, *y*) ≥ 0 and 0 otherwise, and *f*_{−}(*x*, *y*) = −*f*(*x*, *y*) if *f*(*x*, *y*) ≤ 0 and 0 otherwise. So, by the definition of ∫_{D} *f* *dA*, $$\int_ {D} f\, dA=\int_ {D} f_+\, dA - \int_ {D} f_-\, dA $$ Since ∫_{D} *f*_{−} *dA* is nonnegative, $$\int_ {D} f_+\, dA - \int_ {D} f_-\, dA \leq \int_ {D} f_+\, dA + \int_ {D} f_-\, dA$$ Since *f*_{+} ≥ 0 and *f*_{−} ≥ 0 are integrable over *D*, $$\int_ {D(n)} f_+\, dA + \int_ {D(n)} f_-\, dA =\int_ {D(n)} (f_+ + f_-)\, dA$$ By the properties of limits applied to the increasing sequence of sets *D*(*n*), we know ∫_{D(n)}(*f*_{+} + *f*_{−}) *dA* converges, so $$\int_ {D} f_+\, dA + \int_ {D} f_-\, dA =\int_ {D} (f_+ + f_-)\, dA$$ Since |*f*(*x*, *y*)| = *f*_{+}(*x*, *y*) + *f*_{−}(*x*, *y*), we get $$\int_ {D} f\, dA \leq \int_ {D} \left|f \right|dA$$

(b) In the same way, we apply (a) to the function −*f* to get $$\int_ {D} (-f)\, dA \leq \int_ {D} \left|-f \right|dA= \int_ {D} \left|f \right|dA$$

(c) By the properties of limits and the equation ∫_{D(n)} (−*f*) *dA* = −∫_{D(n)} *f* *dA*, we get $$- \int_ {D} f\, dA \leq \int_ {D} \left|f \right|dA$$

(d) If *b* ≤ *a* and −*b* ≤ *a* then |*b*| ≤ *a*. From (a), we have $$\int_ {D} f\, dA \leq \int_ {D} \left|f \right|dA$$ From (b) and (c), we have $$- \int_ {D} f\, dA \leq \int_ {D} \left|f \right|dA$$ Therefore, we can conclude that $$\left| \int_ {D} f\, dA\right| \leq \int_ {D} \left|f \right|dA$$

9.

-4.21. Find the point on the plane $$z = x − 2y + 3$$ that is closest to the origin, by finding where the square of the distance between (0, 0) and a point (*x*, *y*) of the plane is at a minimum. Use the matrix of second partial derivatives to show that the point is a local minimum.

Let $$ D=d^2 = f(x,y)= x^2+y^2+(x-2y+3)^2 $$ To find the local extrema we set $$\nabla f = (4x-4y+6,\,-4x+10y-12)=0$$ which holds at (−0.5, 1). The matrix of second partial derivatives there is $$ H(-0.5,1)=
\left[
\begin{array}{ c c }
4 & -4 \\
-4 & 10
\end{array} \right]
$$ Because 4 > 0 and (4)(10) − (−4)^{2} = 24 > 0, the matrix is positive definite by Theorem 4.3. By Theorem 4.8, if ∇*f*(*A*) = 0 and the Hessian matrix $[f_{x_i x_j}(A)]$ is positive definite at *A*, then *f*(*A*) is a local minimum. Therefore, *f* has a local minimum at the point (−0.5, 1), and the corresponding point on the plane is (−0.5, 1, 0.5), since *z* = −0.5 − 2(1) + 3 = 0.5.
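The arithmetic in this answer can be double-checked with a few lines of Python. This is only an illustrative sketch: the helper name `grad` and the variable names are our own, and exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction

def grad(x, y):
    """Gradient of f(x, y) = x^2 + y^2 + (x - 2y + 3)^2."""
    z = x - 2 * y + 3
    return (2 * x + 2 * z, 2 * y - 4 * z)

# Second partial derivatives (constant, because f is quadratic).
fxx, fxy, fyy = 4, -4, 10

# Solve 4x - 4y = -6 and -4x + 10y = 12 by Cramer's rule.
det = fxx * fyy - fxy * fxy
x = Fraction(-6 * fyy - fxy * 12, det)
y = Fraction(fxx * 12 - fxy * (-6), det)
z = x - 2 * y + 3  # height of the plane above (x, y)

print("critical point:", x, y, "point on plane:", (x, y, z))
print("gradient there:", grad(x, y))
print("positive definite:", fxx > 0 and det > 0)
```

The determinant computed here is the same quantity, (4)(10) − (−4)^{2}, that appears in the positive-definiteness test.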

10.

-7.32. Let *S* be the unit sphere centered at the origin in *R*^{3}. Evaluate the following items, using as little calculation as possible

(a)∫_{S}1*d**σ*

(b)∫_{S}||*X*||^{2}*d**σ*

(c) Verify that ∫_{S}*x*_{1}^{2}*d**σ* = ∫_{S}*x*_{2}^{2}*d**σ* = ∫_{S}*x*_{3}^{2}*d**σ* using either a symmetric argument or parametrizations. Can you do this without evaluating them?

(d) Use the result of parts (b) and (c) to deduce the value of ∫_{S}*x*_{1}^{2}*d**σ*

Answer:

(a) Geometrically, ∫_{S} 1 *dσ* is the surface area of the unit sphere in *R*^{3}, so ∫_{S} 1 *dσ* = 4*π* · 1^{2} = 4*π*.

(b) For all *X* ∈ *S* we have ||*X*||^{2} = 1, therefore ∫_{S} ||*X*||^{2} *dσ* = ∫_{S} 1 *dσ* = 4*π*.

(c) Rotation by *π*/2 about the *x*_{3}-axis corresponds to some transformation on the domain of the parametrization of *S*. It maps *S* onto itself, preserves area, and moves *x*_{1} to the position of *x*_{2}. Therefore ∫_{S} *x*_{1}^{2} *dσ* = ∫_{S} *x*_{2}^{2} *dσ*. In the same way, a rotation by *π*/2 about the *x*_{2}-axis gives ∫_{S} *x*_{1}^{2} *dσ* = ∫_{S} *x*_{3}^{2} *dσ*. Therefore, ∫_{S} *x*_{1}^{2} *dσ* = ∫_{S} *x*_{2}^{2} *dσ* = ∫_{S} *x*_{3}^{2} *dσ*, without evaluating any of them.

(d) By the definition of the norm ||*X*||, we know that ||*X*||^{2} = *x*_{1}^{2} + *x*_{2}^{2} + *x*_{3}^{2}. So, $$\int_{S} ||X||^2\, d\sigma= \int_{S} (x_1^2 +x_2^2 +x_3^2)\, d\sigma= 3\int_{S} x_1^2\, d\sigma = 4\pi$$ Therefore, $$ \int_{S} x_1^2\, d\sigma =\frac{1}{3}\int_{S} ||X||^2\, d\sigma = \frac{4\pi}{3}$$
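As a sanity check on the symmetry argument, the surface integrals can be approximated numerically. The sketch below is not part of the required solution: it assumes the standard spherical parametrization (*x*_{1}, *x*_{2}, *x*_{3}) = (sin *φ* cos *θ*, sin *φ* sin *θ*, cos *φ*) with *dσ* = sin *φ* *dφ* *dθ* and uses a simple midpoint rule; the function name `sphere_integral` is our own.

```python
import math

def sphere_integral(func, n_phi=200, n_theta=400):
    """Midpoint-rule approximation of the integral of func over the unit sphere."""
    d_phi = math.pi / n_phi
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            x1 = math.sin(phi) * math.cos(theta)
            x2 = math.sin(phi) * math.sin(theta)
            x3 = math.cos(phi)
            # sin(phi) * d_phi * d_theta is the area element d(sigma)
            total += func(x1, x2, x3) * math.sin(phi) * d_phi * d_theta
    return total

area = sphere_integral(lambda x1, x2, x3: 1.0)
ix1 = sphere_integral(lambda x1, x2, x3: x1 * x1)
ix2 = sphere_integral(lambda x1, x2, x3: x2 * x2)
ix3 = sphere_integral(lambda x1, x2, x3: x3 * x3)

print("area:", area, "expected:", 4 * math.pi)
print("x1^2, x2^2, x3^2 integrals:", ix1, ix2, ix3, "expected:", 4 * math.pi / 3)
```

The three coordinate integrals should agree with each other and with 4*π*/3 up to the discretization error of the grid.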

CudaHashedNet Midterm Report

and 1 collaborator

The Design of HyperFETs

# Model

## Transistor

The transistor is modeled generically by a heavily simplified virtual-source (short-channel) MOSFET model \cite{Khakifirooz_2009}. Although this model was first defined for Silicon transistors, it has been successfully adapted to numerous other contexts, including Graphene \cite{Han_Wang_2011} and Gallium Nitride devices, both HEMTs \cite{RadhakrishnaThesis} and MOSHEMT+VO_{2} HyperFETs \cite{Verma_2017}. Following Khakifirooz \cite{Khakifirooz_2009}, the drain current *I*_{D} is expressed \begin{equation}
\frac{I_D}{W}=Q_{ix_0}v_{x_0}F_s
\end{equation} where *Q*_{ix0} is the charge at the virtual source point, *v*_{x0} is the virtual source saturation velocity, and *F*_{s} is an empirically fitted “saturation function” which smoothly transitions between linear (*F*_{s} ∝ *V*_{DS}/*V*_{DSSAT}) and saturation (*F*_{s} ≈ 1) regimes. The charge in the channel is described via the following semi-empirical form first proposed for CMOS-VLSI modeling \cite{Wright_1985} and employed frequently since (often with modifications, e.g. \cite{Khakifirooz_2009, RadhakrishnaThesis}): \begin{equation}
Q_{ix_0}=C_\mathrm{inv}nV_\mathrm{th}\ln\left[1+\exp\left\{\frac{V_{GSi}-V_T}{nV_\mathrm{th}}\right\}\right]
\end{equation} where *C*_{inv} is an effective inversion capacitance for the gate, $nV_\mathrm{th}\ln 10$ is the subthreshold swing of the transistor, *V*_{GSi} is the transistor gate-to-source voltage, *V*_{T} is the threshold voltage, and *V*_{th} is the thermal voltage *kT*/*q*.

For precise modeling, Khakifirooz includes further adjustments of *V*_{T} due to the drain voltage (DIBL parameter) and the gate voltage (strong vs weak inversion shift), as well as a functional form of *F*_{s}. For a first-pass, we will ignore these effects, employ a constant *V*_{T}, and assume the supply voltage is maintained above the gate overdrive such that *F*_{s} ≈ 1. However, we will add on a leakage floor with conductance *G*_{leak}. Altogether, the final current expression (for the analytical part of this analysis) is \begin{equation}
\frac{I_D}{W}=nv_{x_0}C_\mathrm{inv}V_{th}\ln\left[1+\exp\left\{\frac{V_\mathrm{GSi}-V_\mathrm{T}}{nV_{th}}\right\}\right]+\frac{G_\mathrm{leak}}{W}V_\mathrm{DSi}\label{eq:transistor_iv}
\end{equation}
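To make the two regimes of the final current expression concrete, here is a minimal Python sketch that evaluates it. The function name, the argument names, and all numerical values in the usage lines are hypothetical placeholders chosen for illustration, not fitted device parameters; SI units are assumed (capacitance per area in F/m², velocity in m/s, current per width in A/m).

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant / elementary charge, in V/K

def drain_current_per_width(v_gsi, v_dsi, *, v_t, n, c_inv, v_x0,
                            g_leak_per_w, temperature=300.0):
    """Evaluate I_D/W from the simplified expression with F_s = 1 and a leakage floor."""
    v_th = K_B_OVER_Q * temperature          # thermal voltage kT/q
    u = (v_gsi - v_t) / (n * v_th)
    # ln(1 + exp(u)), written so that large |u| does not overflow
    if u > 0:
        softplus = u + math.log1p(math.exp(-u))
    else:
        softplus = math.log1p(math.exp(u))
    q_ix0 = c_inv * n * v_th * softplus      # virtual-source charge
    return q_ix0 * v_x0 + g_leak_per_w * v_dsi

# Hypothetical parameter values, for illustration only.
params = dict(v_t=0.3, n=1.2, c_inv=0.02, v_x0=1e5, g_leak_per_w=0.0)
for v_gsi in (0.0, 0.3, 0.6, 1.0):
    print(v_gsi, drain_current_per_width(v_gsi, 0.5, **params))
```

Well below threshold the logarithm reduces to an exponential, so the current changes by a factor of ten for every $nV_\mathrm{th}\ln 10$ of gate voltage; well above threshold it approaches $C_\mathrm{inv}(V_{GSi}-V_T)v_{x_0}$.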

AEP 4830 HW9 Monte Carlo Calculations

The purpose of this homework is to explore the Monte Carlo Algorithm and apply it to the simplified protein folding model in 2D.

# Monte Carlo Method

The Monte Carlo method randomly generates possible solutions to a problem in a solution space and tests their degree of goodness based on certain physical requirements\cite{NumRec}. There are usually two ways of generating the possible solutions. First, we can generate them totally at random; for example, we use a random number generator to do Monte Carlo integration. Second, we can generate each possible solution from the previous one by randomly changing some of its parameters. We will use the latter to generate our 2D protein structures in this homework.

The general flow of Monte Carlo Method is shown as follows. Note that we use the term “conformation space” instead of “solution space” since we are talking about protein structures here.

1. Start from an initial state in the conformation space.

2. Randomly change the previous state, subject to requirement 1.

3. Determine the degree of goodness by criterion 2.

4. Accept or reject this state by physical rule 3.

5. If it is accepted, pass this state on and repeat 2 through 5 for a certain number of steps.

6. If it is rejected, do not pass the state on and repeat 2 through 4 until a new state is accepted.

The requirement 1, criterion 2 and rule 3 are problem-specific and we will mention these in our protein folding problem.

# 2D Protein Folding

Proteins are composed of 20 different amino acids (AAs) in a polypeptide chain and, due to the mutual interactions between those AAs, proteins favor some folded states that lower the Gibbs free energy. The interactions are mostly negative because of hydrophobic effects or ion-ion interactions. The unique structures of proteins are essential for them to perform their biological functions. We can use a simple bead-and-chain model for a 2D protein chain\cite{S_ali_1994}, assuming that all the AAs are of the same size and that the peptide bond between two AAs is rigid, with a fixed, unstretchable length of one unit. Each AA occupies one grid point of the 2D space and cannot share a point with any other AA. When the protein folds, the non-covalent interactions apply to pairs of non-bonded AAs separated by one unit. We can then calculate the relative Gibbs free energy *ΔG* by summing all the interactions of non-bonded neighbors.

\begin{equation}
\Delta G = E_0 = \sum_{(i,j)} E_{t(i)t(j)}
\end{equation} where (*i*, *j*) are the indices of two neighboring AAs of types (*t*(*i*), *t*(*j*)) and *E* is a 20 × 20 interaction matrix.
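The energy sum can be sketched in a few lines of Python. This is an illustration rather than the homework's actual routine: the function name `interaction_energy` and the data layout (a list of AA types plus a list of (x, y) grid positions) are our own choices, and "separated by one unit" is assumed to mean adjacent grid points in the horizontal or vertical direction.

```python
from itertools import combinations

def interaction_energy(types, positions, E):
    """Sum E[t(i)][t(j)] over non-bonded pairs of AAs that are one grid unit apart."""
    total = 0.0
    for i, j in combinations(range(len(positions)), 2):
        if j - i == 1:
            continue  # consecutive AAs are bonded, so they do not contribute
        (xi, yi), (xj, yj) = positions[i], positions[j]
        if abs(xi - xj) + abs(yi - yj) == 1:  # adjacent grid points
            total += E[types[i]][types[j]]
    return total

# Toy example with 2 AA types instead of 20 (made-up interaction values).
E_toy = [[-1.0, -2.0],
         [-2.0, -3.0]]
types = [0, 1, 0, 1]
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]    # chain bent into a square
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]  # fully extended chain

print(interaction_energy(types, folded, E_toy))    # only the pair (0, 3) is in contact
print(interaction_energy(types, straight, E_toy))  # no non-bonded contacts
```

In the folded toy chain only the first and last AA touch without being bonded, so a single matrix entry contributes to the sum.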

With this model in mind, we can determine the requirements mentioned in the previous section.

Requirement 1:

The modified AA cannot occupy other’s positions.

The modified AA must be one unit away from its neighbor(s).

An AA's position is modified by moving it by one of eight possible displacements: (1, 0), (1, 1), (0, 1), (−1, 1), (−1, 0), (−1, −1), (0, −1) and (1, −1).

Criterion 2: Evaluate the interaction energy *E*_{0} and use this number to determine the goodness of the state. The lower, the better.

Rule 3:

If the new state has lower *E*_{0}, the protein will adopt this state in order to reach the minimum of the folding landscape.

If the new state has higher *E*_{0}, the protein does not favor such a state. However, there is still some probability of jumping from a lower energy state to a higher energy one, $P=e^{-(E_\mathrm{new} - E_0)/kT}$.

Once the model is set and the steps are clear, we can start to do the simulation.
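Rule 3 is the usual Metropolis acceptance test, and a minimal sketch of it in Python looks as follows. The function name `accept_move` and its arguments are hypothetical; it uses Python's built-in `random` module rather than the *myrand()* generator described in this report, and energies and *kT* are assumed to be in the same units.

```python
import math
import random

def accept_move(e_old, e_new, kT, rng=random):
    """Metropolis test: always accept downhill moves, accept uphill ones with
    probability exp(-(e_new - e_old) / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)

# Rough check of the uphill acceptance rate for an energy increase of 1 at kT = 1.
rng = random.Random(12345)
trials = 100_000
accepted = sum(accept_move(0.0, 1.0, 1.0, rng) for _ in range(trials))
print("observed:", accepted / trials, "expected:", math.exp(-1.0))
```

Passing the generator in as an argument keeps the test reproducible and makes it easy to swap in a different random number source.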

# Program Codes

First, we need a general random number generator, *myrand(seed)*. We will test its validity and then apply it to alter the position of a randomly selected AA. Given different values of *seed*, the function gives different random number sequences. The seeds for generating the interaction matrix *E* and the AA sequence of the protein are two distinct but fixed values, which guarantees that we use exactly the same protein and interactions throughout the calculation. For everything else, the seed is set by *time(NULL)* and is therefore independent of our bias.

Second, there are several subroutines to do the Monte Carlo calculations and to make sure the protein is subject to some requirements. Note that the information of proteins is stored in a 45 × 3 matrix with the first column being AA types, second the x positions and third the y positions.

*neighbor()*: inputs a Protein Vector and outputs the pairs of indices of non-bonded neighboring AAs.

*Energy()*: inputs the pairs of neighbor indices, the Protein Vector, and the interaction matrix *E*, and outputs the energy *E*_{0}.

*n2ndistance()*: inputs a Protein Vector and outputs the end-to-end distance of the protein.

*pcheck()*: inputs a Protein Vector and the index of a certain AA, and checks whether that AA occupies another AA's position. The function outputs *true* if the protein conformation is not allowed, *false* otherwise.

*conformationchange()*: inputs a Protein Vector, changes the position of one of its AAs, and outputs the modified Protein Vector.

# Results

First we tested the random number generator *myrand()*. It gives a uniform distribution of numbers between 0 and 1, and the points (*x*_{n + 1}, *x*_{n}) cover the 1 × 1 square without noticeable patterns, as shown in Fig. 1. We further tested it by estimating *π*.

\begin{equation}
\frac{\pi}{4} = \frac{N_{in}}{N}
\end{equation} where *N*_{in} is the number of points in the quarter circle and *N* is the total number of points. As the total number of points increases, the RHS approaches $\frac{\pi}{4}$ asymptotically, as shown in Fig. 2.
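The same test can be sketched in Python. Note that this sketch uses Python's built-in `random.Random` generator as a stand-in for *myrand()*, and the function name `estimate_pi` and the seed value are our own choices.

```python
import math
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi as 4 * N_in / N using uniform points in the unit square."""
    rng = random.Random(seed)
    n_in = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            n_in += 1
    return 4.0 * n_in / n_points

for n in (100, 10_000, 1_000_000):
    est = estimate_pi(n)
    print(n, est, abs(est - math.pi))
```

The statistical error of the estimate shrinks roughly like 1/√*N*, so each extra digit of accuracy costs about a hundred times more points.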