All EliApps accounts come with unlimited Google Drive storage. We recommend the following command line tool for uploading data from the clusters to Google Drive (although there is a bit of a learning curve).
- From https://github.com/prasmussen/gdrive, scroll down to “Downloads” and download the “gdrive-linux-x64” executable to the cluster.
- Change the permissions on the executable so that it can be run:
    chmod +x gdrive-linux-x64
- Run the utility for the first time to authenticate with your EliApps account.
- Go to the listed URL and log in with your Yale email address. Copy the token from the website and paste it into the prompt on your terminal.
- (optional) Put the parent directory of the utility in your path so you can call it from anywhere. To do this, put gdrive-linux-x64 into a directory such as ~/bin, add the following line to your .bashrc file, and then source your .bashrc file:
    export PATH=$HOME/bin:$PATH

Note: for simplicity, the following instructions assume you have performed the final optional step above. If you haven’t, just replace gdrive-linux-x64 below with the absolute path to your executable.

- SSH to the data transfer node from the login node (the transfer nodes have much faster connections to Google).
- Open a tmux session for the data transfer so that you don’t have to watch the upload.
- To upload to Google Drive, you need to know the “id” of the destination directory. To list all the directories in the root of your Google Drive, run
    gdrive-linux-x64 list --query "'root' in parents"
  The first column is the “Id”. To get the Ids in a subdirectory, run
    gdrive-linux-x64 list --query "'<IdOfTheParentFolder>' in parents"
  where <IdOfTheParentFolder> is the id of the directory you want to look in. Repeat this process until you find the id of the directory you want to upload to.
- Upload individual files:
    gdrive-linux-x64 upload --parent <id_of_destination> <file>
- Upload a directory. Note that the destination directory must be empty:
    gdrive-linux-x64 sync upload <directory> <id_of_destination>
- Log into drive.google.com and see all your files!
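Reading Ids out of the listing by eye gets tedious when you have to descend several levels. The following is a minimal sketch of a helper that picks the Id for a given name out of a listing. The function name id_for_name is made up for this example, and it assumes that the listing starts with a header row, that the name is the second column, and that the name contains no spaces; only the Id being in the first column is stated above.

```shell
#!/bin/sh
# Sketch: pick the Id of a named entry out of a gdrive listing.
# Assumptions: the listing has one header row, the Id is column 1,
# the name is column 2, and the name contains no spaces.

# Reads a listing on stdin and prints the Id of the first row whose
# second column equals the requested name. Prints nothing if no row matches.
id_for_name() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1; exit }'
}

# Hypothetical usage on the cluster (project_data is a placeholder name):
#   gdrive-linux-x64 list --query "'root' in parents" | id_for_name project_data
```

If a name contains spaces, or two entries share a name, fall back to reading the listing yourself.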
It is better to use "sync upload" rather than plain "upload" when transferring directories, because sync keeps track of what has already been uploaded and transfers only new or changed files if you restart the upload or later want to upload just the changes. Note that each command uploads only a single file or a single directory (recursively); wildcards don't work. You can, however, batch uploads in a shell script. Check out the documentation on GitHub; it is very comprehensive.
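As a rough illustration of batching, the sketch below wraps the single-file upload in a loop so that the shell, rather than gdrive, expands any wildcard. The names GDRIVE and upload_all are made up for this example. It passes the destination with gdrive's --parent option, and it assumes gdrive returns a non-zero exit status when an upload fails.

```shell
#!/bin/sh
# Sketch: upload several files one at a time, since each gdrive call
# takes a single path. GDRIVE and upload_all are names chosen for this example.
GDRIVE="${GDRIVE:-gdrive-linux-x64}"

# usage: upload_all <id_of_destination> <file>...
# Returns success only if every regular file was uploaded without error.
upload_all() {
    dest_id="$1"
    shift
    failed=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            # Directories are better handled with "sync upload"
            echo "skipping $f: not a regular file" >&2
            continue
        fi
        # --parent selects the destination folder by Id
        if ! "$GDRIVE" upload --parent "$dest_id" "$f"; then
            echo "upload failed: $f" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

A call such as `upload_all <id_of_destination> results/*.csv` would then upload each matching file in turn; running it inside the tmux session lets it continue after you disconnect.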
Another thing to keep in mind: for better performance, upload a small number of large files rather than a large number of small files.
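One common way to follow this advice is to pack a directory of small files into a single archive before uploading it. The sketch below uses the standard tar utility; the function name pack_dir and the example paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: bundle a directory of many small files into one compressed
# archive so that only a single large file has to be uploaded.

# usage: pack_dir <directory> <archive.tar.gz>
pack_dir() {
    src="$1"
    archive="$2"
    # -C switches to the parent directory first, so paths stored in the
    # archive start at the directory name rather than an absolute path
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# Hypothetical usage:
#   pack_dir ~/project/results results.tar.gz
#   gdrive-linux-x64 upload --parent <id_of_destination> results.tar.gz
```

The trade-off is that individual files can no longer be browsed in Google Drive until the archive is downloaded and unpacked.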