John Blischak Use \label and \ref for boxes.  almost 9 years ago

Commit id: 19e4b28fa9dd786aa3261708b6f114d4fd74c2ea


\label{box:definitions}
\subsection{Box \ref{box:definitions}: Definitions}
\begin{itemize}
  \item \textbf{Version Control System (VCS)}: \textit{(noun)} a program that tracks changes to specified files over time and maintains a library of all past versions of those files
  \item \textbf{Git}: \textit{(noun)} a version control system
  \item \textbf{repository (repo)}: \textit{(noun)} a folder containing all tracked files as well as the version control history
  \item \textbf{commit}: \textit{(noun)} a snapshot of changes made to the staged file(s); \textit{(verb)} to save a snapshot of changes made to the staged file(s)
  \item \textbf{branch}: \textit{(noun)} a parallel version of the files in a repository (Box \ref{box:branching})
  \item \textbf{local}: \textit{(noun)} the version of your repository that is stored on your personal computer
  \item \textbf{remote}: \textit{(noun)} the version of your repository that is stored on the internet, for instance on GitHub
  \item \textbf{clone}: \textit{(verb)} to create a local copy of a remote repository on your personal computer

\label{box:branching}
\subsection{Box \ref{box:branching}: Branching}
Do you ever make changes to your code without being sure you will want to keep them for your final analysis? Using Git, you can maintain parallel versions of your code and easily switch between them while you work on your changes. You can think of it like making a copy of the folder you keep your scripts in, so that your original scripts stay intact while you make changes in the new folder. In Git, this is called branching, and it is better than separate folders because 1) it uses a fraction of the space on your computer, 2) it keeps a record of when you made the parallel copy (branch) and what you have done on the branch, and 3) it provides a way to incorporate those changes back into your main code if you decide to keep them (and a way to deal with conflicts). By default, your repository will start with one branch, usually called "master". To create a new branch in your repository, type \verb|git branch new_branch_name|. You can see which branches a repository has by typing \verb|git branch|; the branch you are currently on is marked with a star. To move between branches, type \verb|git checkout branch_to_move_to|. You can edit files and commit them on each branch separately. If you want to combine the changes in your new branch with the master branch, you can merge the branches by typing \verb|git merge new_branch_name| while on the master branch.
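The commands described in this box can be strung together as in the sketch below. It is only an illustration under stated assumptions: the scratch directory, the file \verb|analysis.R|, and the identity settings are placeholders that do not come from the tutorial, and the \verb|git symbolic-ref| line is there only so that the first branch is named "master" regardless of how Git is configured on your machine.

\begin{lstlisting}
# Work in a throwaway directory so no real project is touched (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.name "scientist123"       # placeholder identity for this sketch
git config user.email "scientist123@example.com"

# Commit a first version of a hypothetical script on master.
echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Create a branch, move to it, and commit a change there.
git branch new_branch_name
git checkout -q new_branch_name
echo "y <- 2" >> analysis.R
git commit -q -a -m "Try a new approach"

# List the branches; the current one is marked with a star.
git branch

# Return to master and merge the changes from the new branch.
git checkout -q master
git merge -q new_branch_name
\end{lstlisting}

After the merge, the master branch contains both commits, and \verb|new_branch_name| can be kept for further experiments or left unused.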

\label{box:not}
\subsection{Box \ref{box:not}: What \textit{not} to version control}
You \textit{can} version control any file that you put in a Git repository, whether it is a text file, an image, or a giant data file. However, just because you \textit{can} version control something does not mean you \textit{should}. Git works best for plain-text documents such as your scripts, or your manuscript if it is written in LaTeX or Markdown. This is because for text files, Git saves the entire file only the first time you commit it and then saves just your changes with each subsequent commit. This takes up very little space, and Git can compare versions (using \verb|git diff|). You can commit a non-text file, but a full copy of the file will be saved with each commit, so over time you may find the size of your repository growing very quickly. A good rule of thumb is to version control anything text-based: your scripts, or your manuscripts if they are written in plain text. Things \textit{not} to version control are large data files that never change, binary files (including Word and Excel documents), and the output of your code.
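One common way to keep such files out of a repository, which this tutorial does not otherwise cover, is a \verb|.gitignore| file listing patterns that Git should skip. The sketch below is an assumption-laden illustration: the directory layout (\verb|data/|, \verb|results/|) and all file names are hypothetical and should be adapted to your own project.

\begin{lstlisting}
# Scratch repository with a hypothetical project layout (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q

# One script, one data file, and one piece of generated output.
echo "plot(1:10)" > plot.R
mkdir data results
echo "1,2,3" > data/raw.csv
echo "placeholder for a generated figure" > results/figure.pdf

# Ignore rules, one pattern per line: data, output, and Office documents.
printf '%s\n' 'data/' 'results/' '*.docx' '*.xlsx' > .gitignore

# Only the script and the ignore file itself are reported as untracked.
git status --short
\end{lstlisting}

The \verb|.gitignore| file is itself plain text, so it can be committed along with your scripts.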

Once you have your files saved in a Git repository, you can share it with your collaborators and the wider scientific community by putting your code online.  This also creates a backup of your work and provides a mechanism for syncing your files across multiple computers.  Sharing a repository is easier if you use one of the many online services that host Git repositories (Table 1), e.g. GitHub.  Note, however, that any files that have not been tracked with at least one commit are not included in the Git repository, even if they are located within the same directory on your local computer (see Box \ref{box:not} for advice on the types of files that should not be versioned with Git).  To begin using GitHub, you will first need to sign up for an account.  For the examples in this tutorial, we will use the fake username "scientist123".  Next, choose the option to "Create a new repository".  Call it "thesis" to match the name of the directory containing the files, although this is not a requirement.  Also, now that the code will exist in multiple places, you need to learn some more terminology (Box \ref{box:definitions}).  A local repository refers to code that is stored on the machine you are using, e.g. your laptop, whereas a remote repository refers to code that is hosted online.  Thus, you have just created a remote repository.

\end{lstlisting}
You first specify the remote repository, "origin".  Second, you tell Git to push to the "master" copy of the repository; we won’t go into other options in this tutorial, but Box \ref{box:branching} discusses them briefly.  Pushing to GitHub also backs up your code in case anything happens to your computer.  It can also be used to sync your code across multiple machines, similar to a service like Dropbox, but with the added capabilities of Git.
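To experiment with pushing without touching a real GitHub account, the same two arguments can be tried against a local stand-in. In the sketch below, a bare repository in a scratch directory plays the role of the remote; that substitution, the file \verb|notes.txt|, and the identity settings are assumptions made for illustration and are not part of the tutorial's own example.

\begin{lstlisting}
# A bare repository in a scratch directory stands in for the remote (assumption).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# A local repository with one commit on master.
git init -q "$work/thesis"
cd "$work/thesis"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
git config user.name "scientist123"       # placeholder identity for this sketch
git config user.email "scientist123@example.com"
echo "draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Register the remote under the name "origin", then push master to it.
git remote add origin "$work/remote.git"
git push -q origin master

# List the branches that now exist on the remote.
git ls-remote --heads origin
\end{lstlisting}

With a hosted service, only the address given to \verb|git remote add| would differ; the push command takes the same remote name and branch name.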

To start versioning your code with Git, navigate to your newly created or existing project directory (in this case, \verb|~/thesis|).  Start tracking your code by running the command \verb|git init|, which initializes a new Git repository in the current folder.  A repository refers to the current version of the tracked files as well as all the previously saved versions (Box \ref{box:definitions}).
\begin{lstlisting}
$ cd ~/thesis

There are a few key things to notice from this output.  First, the three scripts are recognized as untracked files because you have not yet told Git to take snapshots of anything.  Second, the word "commit" means "a version of the code", e.g. "the figure was generated using the commit from yesterday" (Box \ref{box:definitions}).  This word can also be used as a verb, in which case it means "to save", e.g. "to commit a change."  Lastly, the output explains how you can start tracking your files: you need to use the command \verb|git add|.
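The move from untracked files to a first commit can be sketched as follows. The scratch directory, the three script names, and the identity settings are hypothetical stand-ins chosen for this illustration; substitute the files in your own project directory.

\begin{lstlisting}
# Scratch repository with three hypothetical scripts (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "scientist123"       # placeholder identity for this sketch
git config user.email "scientist123@example.com"
echo "# clean the data" > clean.py
echo "# run the analysis" > analyze.py
echo "# make the figures" > plot.py

# Before adding, every file is listed as untracked ("??").
git status --short

# Stage the files, then save a snapshot of them.
git add clean.py analyze.py plot.py
git commit -q -m "Add analysis scripts"

# The history now contains one commit.
git log --oneline
\end{lstlisting}

Running \verb|git status| again after the commit reports nothing to commit, because every file in the directory is now tracked and unchanged.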