GDrive

All EliApps accounts come with unlimited Google Drive storage. We recommend the following command-line tool for uploading data from the clusters to Google Drive, although it has a bit of a learning curve.

Installation

  1. From https://github.com/prasmussen/gdrive, scroll down to “Downloads” and download the “gdrive-linux-x64” executable to the cluster.
  2. Make the file executable:

       chmod +x gdrive-linux-x64

  3. Run the utility for the first time to authenticate with your EliApps account:

       ./gdrive-linux-x64 about

  4. Go to the listed URL and log in with your Yale email address. Copy the token from the website and paste it into the prompt in your terminal.
  5. (Optional) Put the directory containing the utility in your PATH so you can call it from anywhere. To do this, move gdrive-linux-x64 into a directory such as ~/bin, add the following line to your .bashrc file:

       export PATH=$PATH:$HOME/bin

     and then source your .bashrc file:

       source ~/.bashrc
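
Sourcing .bashrc more than once appends ~/bin to PATH again each time. As a minimal sketch (the helper name add_to_path is hypothetical, not part of gdrive or bash), the export line could be replaced with a version that only appends the directory when it is missing:

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;        # otherwise append it
    esac
}

add_to_path "$HOME/bin"
export PATH
```

With this in .bashrc, running source ~/.bashrc repeatedly leaves PATH unchanged after the first time.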

Usage

Note: for simplicity, the following instructions assume you have performed the optional final step above. If you haven’t, replace gdrive-linux-x64 below with the absolute path to your gdrive-linux-x64 executable.

  1. SSH to the data transfer node from the login node (the transfer nodes have much faster connections to Google):

       ssh transfer

  2. Open a tmux session for the data transfer so that you don’t have to watch the upload.
  3. To upload to Google Drive, you need to know the “Id” of the destination directory. To list all the files and directories in the root of your Google Drive, run:

       gdrive-linux-x64 list --query " 'root' in parents"

     The first column is the “Id”. To get the Ids in a subdirectory, run:

       gdrive-linux-x64 list --query " '<IdOfTheParentFolder>' in parents"

     where <IdOfTheParentFolder> is the Id of the directory you want to look in. Repeat this process until you find the Id of the directory you want to upload to.
  4. Upload individual files:

       gdrive-linux-x64 upload --parent <id_of_destination> <file>

  5. Upload a directory. Note that the destination directory must be empty:

       gdrive-linux-x64 sync upload <directory> <id_of_destination>

  6. Log into drive.google.com and see all your files!
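
Walking down the folder tree one list command at a time can be scripted. The following is a rough sketch, not part of gdrive itself: the function child_id and the GDRIVE variable are hypothetical names, and it assumes the list output is a whitespace-separated table with one header line, the Id in the first column and the name in the second (so it only works for names without spaces that are short enough not to be truncated).

```shell
# Command used to call gdrive; override GDRIVE if the executable is not on PATH.
GDRIVE=${GDRIVE:-gdrive-linux-x64}

# Hypothetical helper: print the Id of the entry named $2 inside the folder with Id $1.
# Assumes columns "Id Name ..." after a single header line (an assumption about the output).
child_id() {
    "$GDRIVE" list --query " '$1' in parents" |
        awk -v name="$2" 'NR > 1 && $2 == name { print $1; exit }'
}

# Example: resolve the folder data/run1 under the Drive root.
# data_id=$(child_id root data)
# run_id=$(child_id "$data_id" run1)
```

If a name is not found, the function prints nothing, so check that the result is non-empty before passing it to an upload command.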

When transferring directories, use "sync upload" rather than plain "upload": sync keeps track of what has already been uploaded, so if you restart the transfer or later want to upload only changes, it sends only the new files. Note that each command uploads a single file or a single directory (recursively); wildcards don't work. To upload many files, batch the commands in a shell script. The documentation on the project's GitHub page is very comprehensive.
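
One way to batch uploads is to let the shell expand the wildcard and call gdrive once per file. This is a sketch under assumptions: upload_each and GDRIVE are hypothetical names, and the --parent option is taken from the gdrive 2.x help text, so check the output of the tool's help command for your version before relying on it.

```shell
# Command used to call gdrive; override GDRIVE if the executable is not on PATH.
GDRIVE=${GDRIVE:-gdrive-linux-x64}

# Hypothetical helper: upload each regular file to one Drive folder, one command per file.
# Usage: upload_each <id_of_destination> <file>...
upload_each() {
    dest=$1
    shift
    status=0
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip directories and patterns that matched nothing
        # --parent is assumed from gdrive 2.x; remember any failure but keep going
        "$GDRIVE" upload --parent "$dest" "$f" || status=1
    done
    return "$status"
}

# Example: upload_each <id_of_destination> results/*.csv
```

The function returns non-zero if any single upload failed, which makes it easy to rerun or log failures from a larger script.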

Also keep in mind that uploading a small number of large files performs better than uploading a large number of small files.
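
A common way to turn many small files into one large file is to pack the directory into a tar archive before uploading. The sketch below uses standard tar options; the function name bundle_dir is hypothetical.

```shell
# Hypothetical helper: pack a directory into a single compressed archive.
# Usage: bundle_dir <directory>   (writes <directory>.tar.gz next to the directory)
bundle_dir() {
    dir=${1%/}                                   # drop a trailing slash, if any
    # -C switches to the parent directory so archive paths start at the directory name
    tar -czf "$dir.tar.gz" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Example: bundle_dir ~/project/run1    # creates ~/project/run1.tar.gz
```

The resulting archive can then be uploaded as a single file and unpacked with tar -xzf after it is downloaded again.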