Wiki » History » Version 13

John Pellman, 08/26/2025 10:04 AM

h1. Globus Data Transfer from SEMC

{{toc}}

h2. Introduction

Globus is an application that lets you transfer large amounts of research data efficiently and securely to your personal computer, your institution’s storage systems, or a cloud provider. It removes the need for large portable hard drives that must be mounted on a local machine to move data. The New York Structural Biology Center (NYSBC) primarily uses Globus to transfer data collected on-site from electron microscopes. To get started, follow the instructions below.

h2. Logging in to Globus

Typically, you can log in to Globus with your institutional credentials.  To determine whether this is possible, check whether your institution is listed in the dropdown menu on the "Globus login page":https://app.globus.org (https://app.globus.org).  If it is, you can skip ahead to _Logging in with Institutional Credentials_.

!{width:33%}globus_cilogon_1.png!

h3. Creating an ORCID Account and Linking it with Globus

If you do not have access to Globus via your institution, you can use your ORCID to sign in to Globus.  If you do not already have an ORCID, you can sign up for one "here":https://orcid.org/register.  To sign in with your ORCID, click the green ORCID button on the login page:

!{width:33%}globus_orcid_1.png!

You will then be prompted to connect your ORCID to Globus.  Click the _Authorize access_ button when prompted.

!{width:33%}globus_orcid_2.png!

If your ORCID account is successfully linked with Globus, you will be presented with the following screen.  Click _Continue_.

!{width:33%}globus_orcid_3.png!

Fill out the subsequent form, entering your email address and organizational affiliation when prompted.

Finally, grant Globus the permissions it needs from your ORCID account by clicking _Allow_ on the last screen:

!{width:33%}globus_orcid_5.png!

You should now be logged in to Globus.  For future sessions, you can click the green ORCID button on the main page to log in again.

h3. Logging in with Institutional Credentials

Select your institution from the dropdown on the Globus login page and click _Continue_.

!{width:33%}globus_cilogon_1.png!

You will then be taken to the login page you normally use for your institution's services (email, etc.).  Enter your institutional credentials (e.g., NetID, UNI) and sign in as normal.

!{width:33%}globus_cilogon_2.png!

h3. Globus File Manager

Once you have signed in (via ORCID or institutional credentials), you will be presented with the Globus file manager.  The next section describes how to use this web interface to initiate data transfers to your personal computer or your home institution.

!{width:33%}globus_filemanager.png!

h2. Transferring Data

h3. Transferring files from NYSBC to your personal endpoint

1. Click on the Collections tab and search for ‘NYSBC#SEMC’. This collection will act as your source endpoint, i.e., the location you will be transferring files from.

!{width:25%}image6.png!

2. A login widget will appear below, asking for your username and password.  Enter the username that SEMC provided when you first registered for your project.  Note that this username is all lowercase.

If you are already logged in to another NYSBC account, go to Settings > Manage Identities > NYSBC OIDC Server.

!{width:25%}image7.png!

3. After you authenticate, you will see your default path, which is currently set to /h1/<username>.

!{width:25%}image8.png!

4. You can transfer raw frames and references by selecting the ‘frames’ folder:

 * Select your username.
 * Then select the session you want to transfer frames from (e.g., 22mar18g).
 * Under the rawdata folder, you will see all your raw frames.

5. You can also transfer aligned images by selecting the ‘leginon’ folder:

 * Select your username.
 * Then select the session you want to transfer images from (e.g., 22mar18g).
 * Under the rawdata folder, you will see all your aligned images.

!{width:25%}image9.png!

6. Now select the destination endpoint. To transfer files to your personal computer, you need to install Globus Connect Personal. Click on ‘Endpoints’ in the left navigation bar and select ‘⊕ Create a personal endpoint’ at the top right of the screen.

!{width:25%}image10.png!

You will see the following screen:

!{width:25%}image11.png!

 * Download and install Globus Connect Personal for your operating system.

For specific installation instructions, see:

 * Mac: https://docs.globus.org/how-to/globus-connect-personal-mac/
 * Windows: https://docs.globus.org/how-to/globus-connect-personal-windows/
 * Linux: https://docs.globus.org/how-to/globus-connect-personal-linux/

* Log in, then press Allow. This will take you to the Globus Connect Personal setup:
    * For Owner Identity, select your Globus ID.
    * Enter your endpoint collection name (e.g., Personal Computer).
    * Press ‘Save’.

!{width:25%}image12.png!

8. Next, select your personal endpoint as your destination endpoint. Go back to the file manager and select NYSBC#SEMC in the Collection tab.
9. Then select the ‘Transfer or Sync to’ option.
10. On the right side, select your personal endpoint as the destination endpoint, as shown in the screenshot.

!{width:25%}image13.png!

You will now see the folders on your personal computer.

11. Once you have selected the appropriate source and destination folders, click Start to initiate the transfer.

!{width:25%}image13.png!

12. To monitor the transfer, click on the Activity tab in the left sidebar.

You will see the transfer logs, along with a message when the transfer has completed.

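The folder layout described in steps 4 and 5 can be sketched in Python as follows. This is only an illustration: the helper name @session_paths@ is hypothetical, the username @jdoe@ is a placeholder, and the paths are kept relative because this page does not state where the ‘frames’ and ‘leginon’ folders sit on the collection.

```python
from pathlib import PurePosixPath


def session_paths(username: str, session: str) -> dict[str, PurePosixPath]:
    """Return the relative folders for one session, following steps 4 and 5.

    Hypothetical helper: the layout <folder>/<username>/<session>/rawdata is
    taken from the steps above. The absolute root is an unknown here, so the
    returned paths are relative.
    """
    # Usernames are all lowercase (see step 2), so normalize defensively.
    user = username.lower()
    return {
        "raw_frames": PurePosixPath("frames", user, session, "rawdata"),
        "aligned_images": PurePosixPath("leginon", user, session, "rawdata"),
    }


if __name__ == "__main__":
    # Placeholder username and the example session name from step 4.
    for label, path in session_paths("jdoe", "22mar18g").items():
        print(f"{label}: {path}")
```

Browsing to these folders in the file manager should show the same structure; if it does not, trust what the file manager displays rather than this sketch.
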
Note: Your personal computer may sit behind a firewall that blocks file transfers. To configure your firewall settings, see https://docs.globus.org/how-to/configure-firewall-gcp/

For any issues related to Globus, please check the following guides:

 * https://docs.globus.org/how-to/
 * https://docs.globus.org/faq/

h3. Transferring files from NYSBC to your home institution endpoint

 * Follow steps 1 - 5 as described above.
 * Then select your home institution endpoint as the destination endpoint. Contact your organization’s IT team for the name of the endpoint authorized by your institution, for login credentials, and for any questions about your directory permissions.
 * Follow step 11 as described above.
 * You can track the status of the transfer by clicking on the Activity tab. When the transfer is completed, you will see a green check mark and “Condition” will say SUCCEEDED.

h2. FAQ

h3. I'm unable to link my Globus account to my NCCAT/MEMC account from the CUIMC Campus

If Columbia affiliates encounter an error with Globus account linking, our usual advice is to first switch to a different network (e.g., a WiFi hotspot).  This is because our Globus endpoint domain (_globus3.5540bd.75bc.data.globus.org_) apparently cannot be used on certain CUIMC networks.  Specifically, the domain cannot be resolved (i.e., its domain name cannot be mapped to an IP address).  The domain _globus3.5540bd.75bc.data.globus.org_ is managed by Globus (not us) and, in turn, uses Amazon as a domain name provider.  Ultimately, this means that there is a problem with CUIMC IT's domain name infrastructure in which Amazon domains cannot be reached; from the outside, it is unclear whether this is an actual fault or a matter of policy related to the protection of patient data / PHI.

If you keep encountering this issue, we advise contacting CUIMC IT and/or Globus support.

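Before contacting support, it can help to confirm that name resolution is really what fails on your current network. The following is a minimal sketch using only the Python standard library; the function name @can_resolve@ and its @resolver@ parameter are illustrative assumptions, not part of Globus or of any NYSBC tooling.

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable[..., list] = socket.getaddrinfo) -> bool:
    """Return True if `hostname` maps to at least one IP address.

    `resolver` is injectable only so the function can be exercised without
    network access; by default the system resolver is used.
    """
    try:
        # The port does not affect name resolution; 443 is just a placeholder.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Raised when the name cannot be mapped to an IP address.
        return False
```

Calling @can_resolve("globus3.5540bd.75bc.data.globus.org")@ once on the campus network and once on a hotspot gives a quick comparison. A result of @False@ on campus only is consistent with the resolution problem described above, and is worth including in any report to CUIMC IT.
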
h3. How do I unlink my MEMC account and link my NCCAT account (and vice versa)?

If you currently have a MEMC account linked to your Globus account, you may not be able to access your NCCAT directory (or vice versa). If this is the case, use the following instructions to unlink your MEMC account and link your NCCAT account:

1. Click on _Manage Identities_ on the _Settings_ page in the Globus web UI.
2. On the next page, click on the trash can icon in the same row as _<username>@globus3.5540bd.75bc.data.globus.org_, then click on _Unlink Identity_.
3. The next time you are prompted to log in with your credentials, use your NCCAT account and its associated password.

h3. My Globus transfer times out every 5-10 minutes and is proceeding very slowly

On rare occasions, your transfer may proceed slowly and then time out at regular intervals.  The time-out errors will look something like the following:

<pre>
Error (transfer)
Endpoint: myuniversity#endpoint(586f0817-9bd9-4eb5-bbdd-017afd630986)
Server: m-98f9a0.99817a.0ec8.data.globus.org:443
Command: STOR
/path/to/myfile.mrc
Message: The operation timed out
---
Details: Timeout waiting for response
</pre>

As an initial troubleshooting step, we suggest running your transfer over a VPN (such as "ProtonVPN":https://protonvpn.com/ or "Mullvad":https://mullvad.net).  If this does not resolve the issue, please contact the MEMC or NCCAT office for assistance.  A longer explanation of the most common cause of this issue can be found below.

h4. Why this issue occurs

If your Globus endpoint is on a university/research institute campus, the issue you are encountering is most likely caused by a quirk of how internet routing works.  Our network engineer has configured our gateway router so that outgoing traffic is sent over "NYSERNet":https://nysernet.org/home (our regional research and education network).  Our network also sends a message to other networks (such as your institution's network) telling them to send incoming traffic over NYSERNet as well.  However, sometimes this message (a route advertisement) does not reach a network, and incoming traffic is routed differently.

This asymmetry in routing (outbound traffic over NYSERNet, inbound traffic over another network) causes the connection to be interrupted periodically.  Each time this happens, your Globus client needs to create an entirely new connection, which slows the transfer down considerably.

Our network engineer's workaround is to manually add an exception to our normal routing logic.  For this to work, we need a fixed IP address or address range for the destination that the traffic should go to.
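
If you plan to send us an address or range for such an exception, a quick sanity check beforehand can save a round trip. The sketch below uses Python's standard @ipaddress@ module; the function name @parse_destination@ is a hypothetical helper, and the addresses shown are reserved documentation examples, not real endpoints.

```python
import ipaddress


def parse_destination(value: str):
    """Parse a fixed IP address or CIDR range, raising ValueError if invalid.

    A single address is returned as a one-host network (/32 for IPv4,
    /128 for IPv6) so that callers always receive a network object.
    """
    # strict=True rejects ranges with host bits set, e.g. 192.0.2.5/24.
    return ipaddress.ip_network(value.strip(), strict=True)


if __name__ == "__main__":
    # 198.51.100.7 and 192.0.2.0/24 are reserved documentation addresses.
    for candidate in ["198.51.100.7", "192.0.2.0/24", "192.0.2.5/24", "campus-dtn"]:
        try:
            net = parse_destination(candidate)
            print(f"{candidate!r}: OK, {net.num_addresses} address(es)")
        except ValueError as exc:
            print(f"{candidate!r}: not usable ({exc})")
```

A hostname on its own (the last candidate) is rejected on purpose, since the workaround described above needs a fixed address or range rather than a name.
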