Wiki » History » Version 12

John Pellman, 08/26/2025 10:02 AM

h1. Globus Data Transfer from SEMC

{{toc}}

h2. Introduction

Globus is an application that lets you transfer large amounts of research data efficiently and securely to your personal computer, your institution's storage systems, or a cloud provider. It eliminates the need for large portable external hard drives that must be attached to local storage to move data. The New York Structural Biology Center primarily uses Globus to transfer data that is collected on-site from electron microscopes. To get started, follow the instructions below.

h2. Logging in to Globus

Typically, you can log in to Globus with your institutional credentials.  To determine whether this is possible, check whether your institution is listed in the dropdown menu on the "Globus login page":https://app.globus.org.  If it is, you can skip ahead to the section _Logging in with Institutional Credentials_.

!{width:33%}globus_cilogon_1.png!

h3. Creating an ORCID Account and Linking it with Globus

If you do not have access to Globus via your institution, you can use your ORCID to sign in to Globus.  If you do not already have an ORCID, you can sign up for one "here":https://orcid.org/register.  To sign in with your ORCID, click the green ORCID button on the login page:

!{width:33%}globus_orcid_1.png!

You will then be prompted to connect your ORCID to Globus.  Click the _Authorize access_ button when prompted.

!{width:33%}globus_orcid_2.png!

If your ORCID account is successfully linked with Globus, you will be presented with the following screen.  Click _Continue_.

!{width:33%}globus_orcid_3.png!

Fill out the subsequent form.  Enter your email address and organizational affiliation when prompted.

!{width:33%}globus_orcid_4b.png!

Finally, grant Globus the permissions it needs from your ORCID account by clicking _Allow_ on the last screen:

!{width:33%}globus_orcid_5.png!

You should now be logged in to Globus.  For future sessions, you can click the green ORCID button on the main page to log in again.

h3. Logging in with Institutional Credentials

Select your institution from the dropdown on the Globus login page and click _Continue_.

!{width:33%}globus_cilogon_1.png!

You will now be shown the dialog you typically use to log in to your institution's services (email, etc.).  Enter your institutional credentials (e.g., NetID, UNI) and sign in as normal.

!{width:33%}globus_cilogon_2.png!

h3. Globus File Manager

Once you have successfully signed in (via ORCID or institutional credentials), you will be presented with the Globus file manager.  In the next section, we will describe how you can use this web interface to initiate data transfers to your home institution.

!{width:33%}globus_filemanager.png!

h2. Transferring Data

h3. Transferring files from NYSBC to your personal endpoint

1. Click on the Collections tab and type ‘NYSBC#SEMC’. This will act as your source endpoint, the one you will be transferring files from.

!{width:25%}image6.png!

2. A login widget will appear below and ask for your username and password to authenticate.  Enter the username that SEMC provided when you first registered for your project. Note that this username is all lowercase.

If you are already logged in to another NYSBC account, go to Settings > Manage Identities > NYSBC OIDC Server.

!{width:25%}image7.png!

3. After you authenticate, you will see your default path, which is currently set to /h1/<username>.

!{width:25%}image8.png!

4. You can transfer raw frames and references by selecting the ‘frames’ folder.

 * Select your username.
 * Then select the session you want to transfer frames from (e.g., 22mar18g).
 * Under the rawdata folder, you will see all your raw frames.

5. You can also transfer aligned images by selecting the ‘leginon’ folder.

 * Select your username.
 * Then select the session you want to transfer images from (e.g., 22mar18g).
 * Under the rawdata folder, you will see all your aligned images.

!{width:25%}image9.png!
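
As a rough illustration of the folder layout described in steps 4 and 5, the sketch below assembles the two kinds of source path in Python. The function name @session_rawdata_path@ and the username @jdoe@ are hypothetical, and the paths are assumed to be relative to wherever the ‘frames’ and ‘leginon’ folders appear in the NYSBC#SEMC collection; check the file manager for the actual location.

```python
from pathlib import PurePosixPath


def session_rawdata_path(kind, username, session):
    """Build <kind>/<username>/<session>/rawdata, following steps 4 and 5."""
    if kind not in ("frames", "leginon"):
        raise ValueError("kind must be 'frames' or 'leginon'")
    # SEMC usernames are all lowercase (see step 2), so normalise the case.
    return PurePosixPath(kind) / username.lower() / session / "rawdata"


# Hypothetical user "jdoe" and the example session name from the steps above.
print(session_rawdata_path("frames", "jdoe", "22mar18g"))   # raw frames
print(session_rawdata_path("leginon", "jdoe", "22mar18g"))  # aligned images
```

This only builds path strings for reference; the folders themselves are still selected in the Globus file manager as described above.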

6. Now you need to select the destination endpoint. To transfer files to your personal computer, you need to install Globus Connect Personal. Click on ‘Endpoints’ in the left navigation bar and select ‘⊕ Create a personal endpoint’ at the top right of the screen.

!{width:25%}image10.png!

You will see the following screen:

!{width:25%}image11.png!

 * Download and install the Globus Connect Personal version for your operating system.

For OS-specific installation instructions, see:

 * Mac: https://docs.globus.org/how-to/globus-connect-personal-mac/
 * Windows: https://docs.globus.org/how-to/globus-connect-personal-windows/
 * Linux: https://docs.globus.org/how-to/globus-connect-personal-linux/

 * Log in, then press ‘Allow’. This will take you to the Globus Connect Personal setup.
    * For Owner Identity, select your Globus ID.
    * Enter your endpoint collection name (e.g., Personal Computer).
    * Press ‘Save’.

!{width:25%}image12.png!

8. Next, select your personal endpoint as your destination endpoint. Go back to the file manager and select NYSBC#SEMC in the Collection tab.
9. Then select the ‘Transfer or Sync to...’ option.
10. On the right side, select your personal endpoint as the destination endpoint, as shown in the screenshot.

!{width:25%}image13.png!

You will now see the folders on your personal computer.

11. Once you have selected the appropriate source and destination folders, click Start to initiate the transfer.

!{width:25%}image13.png!

12. To monitor the transfer, click on the Activity tab in the left sidebar.

You will see the transfer logs and a message when the transfer is completed.

Note: Sometimes your personal computer will be behind a firewall that does not allow file transfers. To configure your firewall settings, see https://docs.globus.org/how-to/configure-firewall-gcp/

For any issues related to Globus, please check the following guides:

 * https://docs.globus.org/how-to/
 * https://docs.globus.org/faq/

h3. Transferring files from NYSBC to your home institution endpoint

 * Follow steps 1 - 5 as described above.
 * Then select your home institution endpoint as the destination endpoint. Please contact your organization’s IT team for the name of the endpoint authorized by your institution, for login credentials, and for any questions about your directory permissions.
 * Follow step 11 as described above.
 * You can keep track of the transfer status by clicking on the Activity tab. When the transfer is completed, you will see a green check mark and “Condition” will say SUCCEEDED.

h2. FAQ

h3. I'm unable to link my Globus account to my NCCAT/MEMC account from the CUIMC Campus

The typical advice we give to Columbia affiliates who encounter an error with Globus account linking is to first switch to a different network (e.g., a WiFi hotspot).  This is because our Globus endpoint domain (_globus3.5540bd.75bc.data.globus.org_) apparently cannot be used on certain CUIMC networks.  Specifically, our Globus domain cannot be resolved (i.e., its domain name cannot be mapped to an IP address).  The domain _globus3.5540bd.75bc.data.globus.org_ is managed by Globus (not us) and, in turn, uses Amazon as a domain name provider.  This points to a problem with CUIMC IT's domain name infrastructure in which Amazon domains cannot be reached; from the outside it is unclear whether this is an actual fault or a matter of policy related to the protection of patient data / PHI.

If you keep encountering this issue, our advice would be to contact CUIMC IT and/or Globus support.
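
If you would like to confirm that name resolution is the problem before contacting support, a minimal Python sketch along the following lines may help. It only asks your operating system's resolver whether the endpoint domain quoted above maps to an IP address. The function name @can_resolve@ is hypothetical, and port 443 is an assumption that does not affect the lookup itself.

```python
import socket

# Endpoint domain quoted in the FAQ entry above.
ENDPOINT_DOMAIN = "globus3.5540bd.75bc.data.globus.org"


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname maps to at least one IP address."""
    try:
        # Port 443 (HTTPS) is an assumption; only the name lookup matters here.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Raised when the name lookup itself fails.
        return False


if __name__ == "__main__":
    if can_resolve(ENDPOINT_DOMAIN):
        print(f"{ENDPOINT_DOMAIN} resolves on this network.")
    else:
        print(f"{ENDPOINT_DOMAIN} does not resolve; try a different network.")
```

Running the script once on the CUIMC network and once on another network (such as a hotspot) should show whether the failure is specific to one network.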

h3. How do I unlink my MEMC account and link my NCCAT account (and vice versa)?

If you currently have a MEMC account linked to your Globus account, you may not be able to access your NCCAT directory (or vice versa). If this is the case, use the following instructions to unlink your MEMC account and link your NCCAT account:

1. Click on _Manage Identities_ on the _Settings_ page in the Globus web UI.
2. On the next page, click on the trash can icon in the same row as _<username>@globus3.5540bd.75bc.data.globus.org_ and then click on _Unlink Identity_.
3. The next time you are prompted to log in with your credentials, use your NCCAT account and its associated password.

h3. My Globus transfer times out every 5-10 minutes and is proceeding very slowly

On rare occasions, your transfer may proceed slowly and then time out at regular intervals.  The time-out errors will look something like the following:

<pre>
Error (transfer)
Endpoint: myuniversity#endpoint(586f0817-9bd9-4eb5-bbdd-017afd630986)
Server: m-98f9a0.99817a.0ec8.data.globus.org:443
Command: STOR
/path/to/myfile.mrc
Message: The operation timed out
---
Details: Timeout waiting for response
</pre>

As an initial troubleshooting step, we suggest that you run your transfer over a VPN (such as "ProtonVPN":https://protonvpn.com/ or "Mullvad":https://mullvad.net).  If this does not resolve the issue, please contact the MEMC or NCCAT office and we can assist.  A longer explanation of the most common cause for this issue can be found below.

h4. Why this issue occurs

If your Globus endpoint is on a university/research institute campus, the issue that you are encountering is most likely caused by a quirk of how internet routing works.  Our network engineer has configured our gateway router so that outgoing traffic is sent over "NYSERNet":https://nysernet.org/home (our regional research and education network).  Our network also sends a message to other networks (such as your institution's network) telling them to send incoming traffic over NYSERNet as well.  However, sometimes this message (a route advertisement) does not reach a network, and incoming traffic is routed differently.

This asymmetry in routing (outbound traffic over NYSERNet, inbound traffic from another network) causes the connection to be periodically interrupted. Every time this happens, your Globus client needs to create an entirely new connection, which slows the transfer down considerably.

Our network engineer's workaround for this is to manually implement an exception to our normal routing logic.  For this to work, we need a fixed IP address or address range for the destination that the traffic should go to.
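
When you report a fixed address range, it is usually easiest to give it in CIDR notation. The sketch below, which uses Python's standard @ipaddress@ module, shows one way to check that a given address falls inside a range before you send it to us. The function name @in_range@ is hypothetical, and the addresses are reserved documentation examples, not real NYSBC or institutional addresses.

```python
import ipaddress


def in_range(address, cidr):
    """Return True if address falls inside the CIDR range."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# 192.0.2.0/24 and 198.51.100.0/24 are reserved for documentation (RFC 5737)
# and stand in for your institution's real range here.
print(in_range("192.0.2.17", "192.0.2.0/24"))    # True
print(in_range("198.51.100.5", "192.0.2.0/24"))  # False
```

Your institution's IT team can tell you which address or range your Globus endpoint actually uses.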