Using Overleaf with Git Submodules#
Overleaf is useful for editing manuscripts in the cloud, sometimes synchronously with collaborators.
However, editing in real time with collaborators on an Overleaf project is exceedingly rare (for me, at least). One may also find compiling on Overleaf slower than working locally in a LaTeX editor. In any case, one may still prefer to embed the Overleaf manuscript within a GitHub repo that contains the project's other scripts (e.g., Python/R/Stata scripts).
Importing an existing GitHub repo#
For relatively small and simple projects, the easiest way is to use Overleaf's built-in function for importing an existing GitHub repo as an Overleaf project, available from the New Project button.
However, this does not always work. One reason is size: if the GitHub repo (as the superproject) is large, Overleaf will not import it.
The alternative is to keep the Overleaf project as a standalone manuscript repo and add it as a submodule of a superproject that contains all other project assets (e.g., data files).
Using Git Submodules for Overleaf Manuscripts Pt. 0#
First, the Overleaf manuscript project should be linked to a GitHub repo. If it is already linked, skip ahead to Pt. 1.
If it is not yet linked to a GitHub repo, select the Menu button and then the GitHub button, which opens the GitHub sync modal.
The Overleaf project should now reside in a GitHub repo such as user/manuscript (https://github.com/user/manuscript). (You can, of course, name it anything else.)
Now that the Overleaf manuscript is linked to a GitHub repo, we can add it as a submodule to the existing project.
Using Git Submodules for Overleaf Manuscripts Pt. 1#
cd into the local folder of the project's GitHub repo, which contains the existing assets of the project (e.g., a folder with data files data/, a readme README.md, and a folder with coding scripts scripts/).
$ cd my_project
$ ls
data/ README.md scripts/
Add the submodule from https://github.com/user/manuscript. (If the GitHub repo does not exist yet, refer to Pt. 0.) You can see the new changes in your local repo when you check git status.
$ git submodule add https://github.com/user/manuscript.git
$ git status
On branch main
Your branch is up-to-date with 'origin/main'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
        new file:   .gitmodules
        new file:   manuscript
$ ls
data/ manuscript/ README.md scripts/
git submodule add placed the manuscript submodule in the root directory of my_project. There is also a new .gitmodules file. This is a configuration file that tells Git how to map the local directory to the manuscript submodule on GitHub.
$ cat .gitmodules
[submodule "manuscript"]
    path = manuscript
    url = https://github.com/user/manuscript.git
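Because .gitmodules uses the same format as other Git configuration files, individual settings can also be read with git config instead of cat. The sketch below recreates the file from above in a throwaway directory (the `workdir`, `sub_path`, and `sub_url` variable names are illustrative) and queries the path and URL of the manuscript submodule.

```shell
set -euo pipefail

# Work in a throwaway directory so no real repo is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Same content as the .gitmodules file shown above.
cat > .gitmodules <<'EOF'
[submodule "manuscript"]
    path = manuscript
    url = https://github.com/user/manuscript.git
EOF

# Point git config at the file and read one key at a time.
sub_path=$(git config -f .gitmodules --get submodule.manuscript.path)
sub_url=$(git config -f .gitmodules --get submodule.manuscript.url)
echo "path: $sub_path"
echo "url:  $sub_url"
```

This can be handy in scripts that need to know where a submodule lives without parsing the file by hand.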
Commit the submodule and push it to the project's GitHub repo.
$ git commit -m "Add manuscript submodule"
[main fbace23] Add manuscript submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 manuscript
$ git push origin main
Mode 160000 is a special Git mode: it means that the manuscript entry is recorded as a commit (a directory entry pointing at a specific commit of the submodule) rather than as a subdirectory or a file.
Pull upstream changes from the submodule remote, if any exist. This brings in new work in the submodule and helps avoid future merge conflicts. Just cd into the submodule and git pull as usual.
$ cd manuscript
$ git pull origin main
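As an alternative to cd-ing into the submodule, Git can fetch and merge the submodule's remote branch from the project root with git submodule update --remote --merge. The sketch below is a self-contained sandbox, not the real setup: local directories stand in for the GitHub repos, HOME is pointed at a temporary folder so no real Git configuration is touched, and names such as `sandbox`, `upstream`, and `local_head` are illustrative.

```shell
set -euo pipefail

# Throwaway sandbox with its own HOME so no real Git configuration is touched.
sandbox=$(mktemp -d)
export HOME="$sandbox/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
git config --global user.name "Example Author"
git config --global user.email "author@example.com"
# Recent Git versions refuse local-path submodules unless this is set.
git config --global protocol.file.allow always

# Stand-in for the manuscript repo on GitHub, with its branch named main.
git init -q "$sandbox/manuscript"
git -C "$sandbox/manuscript" symbolic-ref HEAD refs/heads/main
echo "draft" > "$sandbox/manuscript/main.tex"
git -C "$sandbox/manuscript" add main.tex
git -C "$sandbox/manuscript" commit -q -m "Initial draft"

# Superproject that embeds it as a submodule tracking main.
git init -q "$sandbox/my_project"
cd "$sandbox/my_project"
git submodule --quiet add -b main "$sandbox/manuscript" manuscript
git commit -q -m "Add manuscript submodule"

# New upstream work appears in the manuscript repo.
echo "revised upstream" >> "$sandbox/manuscript/main.tex"
git -C "$sandbox/manuscript" commit -q -am "Upstream edit"

# From the project root: fetch the submodule's remote branch and merge it in.
git submodule --quiet update --remote --merge manuscript

upstream=$(git -C "$sandbox/manuscript" rev-parse HEAD)
local_head=$(git -C manuscript rev-parse HEAD)
echo "upstream=$upstream local=$local_head"
```

With the real repos, only the git submodule update --remote --merge manuscript line would be needed; everything before it just builds the sandbox.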
To make changes locally and push them to the submodule repo, just cd into the submodule folder and do the usual add-commit-push to the submodule's remote repo. (Changes are sometimes collected in a detached HEAD; see the gotcha below.)
$ cd manuscript
$ git add .
$ git commit -m "Some changes from local"
$ git push origin main
Finally, we may want to record the submodule changes in the project repo. The changes have been pushed to the manuscript repo, but the overall project repo still points at the older submodule commit. This can be seen by going back to the project root directory: git status shows the submodule folder as modified (new commits), but that change is not yet committed or pushed in the project repo.
$ cd ../
$ ls
data/ manuscript/ README.md scripts/
$ git status
On branch main
Your branch is up to date with 'origin/main'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   manuscript (new commits)
The usual add-commit-push from the project root will resolve this.
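To make the pointer update concrete, here is a minimal sandbox sketch of the whole cycle. It is only an illustration under assumptions: local directories stand in for the GitHub repos (so there is no push step), HOME is redirected to a temporary folder, and the variable names `sandbox`, `recorded`, and `checked_out` are made up for this example.

```shell
set -euo pipefail

# Throwaway sandbox with its own HOME so no real Git configuration is touched.
sandbox=$(mktemp -d)
export HOME="$sandbox/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
git config --global user.name "Example Author"
git config --global user.email "author@example.com"
# Recent Git versions refuse local-path submodules unless this is set.
git config --global protocol.file.allow always

# Stand-in for the Overleaf-linked manuscript repo.
git init -q "$sandbox/manuscript"
echo "draft" > "$sandbox/manuscript/main.tex"
git -C "$sandbox/manuscript" add main.tex
git -C "$sandbox/manuscript" commit -q -m "Initial draft"

# Superproject with the manuscript as a submodule.
git init -q "$sandbox/my_project"
cd "$sandbox/my_project"
git submodule --quiet add "$sandbox/manuscript" manuscript
git commit -q -m "Add manuscript submodule"

# New work committed inside the submodule.
echo "revised" >> manuscript/main.tex
git -C manuscript commit -q -am "Some changes from local"

# The superproject now lists manuscript as modified.
git status --short

# Stage and commit the new submodule pointer (with a real remote, push afterwards).
git add manuscript
git commit -q -m "Update manuscript submodule"

recorded=$(git rev-parse HEAD:manuscript)
checked_out=$(git -C manuscript rev-parse HEAD)
echo "recorded=$recorded checked_out=$checked_out"
```

After the final commit, the commit recorded in the superproject matches the commit checked out in the submodule, which is exactly what the add-commit-push step achieves in the real project.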
Gotcha with detached HEADs#
Sometimes the changes inside the submodule folder end up in a detached HEAD. To confirm this, run git branch inside the submodule folder.
$ git branch
* (HEAD detached at 660da63)
  main
So we need to make a branch, switch back to main (or master), and then merge so that the new changes are in main. First, make a temporary tmp branch for the detached HEAD. Then check out main and merge the commits from the previously detached HEAD into main. Finally, delete the temporary branch and confirm that you are back on main.
$ git branch tmp
$ git checkout main
$ git merge tmp
$ git branch -d tmp
$ git branch
* main
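One way to avoid the recovery steps altogether is to check for a detached HEAD before editing and switch to main first. The sketch below simulates this in a throwaway repo that stands in for the submodule; `on_branch` is a hypothetical helper written for this example, built on git symbolic-ref, which fails when HEAD does not point at a branch.

```shell
set -euo pipefail

# Hypothetical helper: succeeds only when HEAD points at a branch.
on_branch() {
  git symbolic-ref -q HEAD > /dev/null
}

# Throwaway repo standing in for the manuscript submodule.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch main on any Git version
export GIT_AUTHOR_NAME="Example Author"
export GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author"
export GIT_COMMITTER_EMAIL="author@example.com"
echo "draft" > main.tex
git add main.tex
git commit -q -m "Initial draft"

# Simulate the detached state a submodule is often left in.
git checkout -q --detach

# Guard: switch back to main before making any edits.
if ! on_branch; then
  echo "HEAD is detached; checking out main before editing"
  git checkout -q main
fi

echo "revised" >> main.tex
git commit -q -am "Edit made on main"
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"
```

In the real project, the same guard could be run inside the manuscript folder before starting to edit, so that new commits land on main directly.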
Cloning a Git repo with a submodule#
Start by cloning the Git repo as usual.
$ git clone https://github.com/user/project.git
Cloning into 'project'...
...
$ cd project
$ ls
data/ manuscript/ README.md scripts/
cd-ing into the manuscript submodule folder reveals that it is still empty.
$ cd manuscript
$ ls .
So we need to run git submodule init to initialize the local config file, and then run git submodule update to fetch all the assets from that project and check out the appropriate commit listed in the superproject.
$ git submodule init
Submodule 'manuscript' (https://github.com/user/manuscript.git) registered for path './'
$ git submodule update
Cloning into 'D:/project/manuscript'...
Submodule path './': checked out '405998645a301ee47ab43125ec01fd2e7a48671c'
$ ls
figs/ ms/ tabs/
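Git also offers shorter routes to the same result: git clone --recurse-submodules populates submodules during the clone, and git submodule update --init combines the init and update steps. The sandbox sketch below tries both under stated assumptions: local directories stand in for the GitHub repos, HOME is redirected to a temporary folder, and `clone_a` and `clone_b` are illustrative names.

```shell
set -euo pipefail

# Throwaway sandbox with its own HOME so no real Git configuration is touched.
sandbox=$(mktemp -d)
export HOME="$sandbox/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
git config --global user.name "Example Author"
git config --global user.email "author@example.com"
# Recent Git versions refuse local-path submodules unless this is set.
git config --global protocol.file.allow always

# Stand-in for the manuscript repo.
git init -q "$sandbox/manuscript"
echo "draft" > "$sandbox/manuscript/main.tex"
git -C "$sandbox/manuscript" add main.tex
git -C "$sandbox/manuscript" commit -q -m "Initial draft"

# Stand-in for the project repo that contains the submodule.
git init -q "$sandbox/project"
git -C "$sandbox/project" submodule --quiet add "$sandbox/manuscript" manuscript
git -C "$sandbox/project" commit -q -m "Add manuscript submodule"

# Option 1: clone and populate the submodule in one step.
git clone -q --recurse-submodules "$sandbox/project" "$sandbox/clone_a"

# Option 2: plain clone, then init and update in a single command.
git clone -q "$sandbox/project" "$sandbox/clone_b"
git -C "$sandbox/clone_b" submodule --quiet update --init

ls "$sandbox/clone_a/manuscript" "$sandbox/clone_b/manuscript"
```

With the real repos, option 1 would be git clone --recurse-submodules https://github.com/user/project.git, and option 2 would be git submodule update --init run from the project root after a plain clone.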