John Pellman, 08/25/2025 04:54 PM

h1. Globus Data Transfer from SEMC

{{toc}}

h2. Introduction

Globus is an application that lets you transfer large amounts of research data efficiently and securely to your personal computer, your institution’s storage systems, or a cloud provider. It removes the need for large portable external hard drives that must be mounted on local storage to move data. The New York Structural Biology Center (NYSBC) primarily uses Globus to transfer data collected on-site from electron microscopes. To get started, follow the instructions below.

h2. Logging in to Globus

Typically, you can log in to Globus with your institutional credentials. To check whether this is possible, see if your institution is listed in the dropdown menu on the "Globus login page":https://app.globus.org (https://app.globus.org). If it is, you can skip to the section _Logging in with Institutional Credentials_.

!{width:33%}globus_cilogon_1.png!

h3. Creating an ORCID Account and Linking it with Globus

If you do not have access to Globus via your institution, you can use your ORCID to sign in to Globus. If you do not already have an ORCID, you can sign up for one "here":https://orcid.org/register. To sign in with your ORCID, click the green ORCID button on the login page:

!{width:33%}globus_orcid_1.png!

You will then be prompted to connect your ORCID to Globus. Click the _Authorize access_ button when prompted.

!{width:33%}globus_orcid_2.png!

If your ORCID account is successfully linked with Globus, you will be presented with the following screen. Click _Continue_.

!{width:33%}globus_orcid_3.png!

Fill out the subsequent form. Enter your email address and organizational affiliation when prompted.

!{width:33%}globus_orcid_4.png!

Finally, grant Globus the permissions it needs from your ORCID account by clicking _Allow_ on the last screen:

!{width:33%}globus_orcid_5.png!

You should now be logged in to Globus. For future sessions, you can click the green ORCID button on the main page to log in again.

h3. Logging in with Institutional Credentials

Select your institution from the dropdown on the Globus login page and click _Continue_.

!{width:33%}globus_cilogon_1.png!

You will now be prompted with the dialogue you typically use to log in to your institution's services (email, etc.). Enter your institutional credentials (e.g., NetID, UNI) and sign in as normal.

!{width:33%}globus_cilogon_2.png!

h3. Globus File Manager

Once you have successfully signed in (via ORCID or institutional credentials), you will be presented with the Globus file manager. The next section describes how to use this web interface to initiate data transfers to your home institution.

!{width:33%}globus_filemanager.png!

h2. Transferring Data

h3. Transferring files from NYSBC to your personal endpoint

1. Click on the Collections tab and type ‘NYSBC#SEMC’. This will act as your source endpoint, the collection you will be transferring files from.

!{width:25%}image6.png!

2. A login widget will appear asking for your username and password. Enter the username that SEMC provided when you first registered for your project. Note that this username is all lowercase.

If you are already logged in to another NYSBC account, go to Settings > Manage Identities > NYSBC OIDC Server.

!{width:25%}image7.png!

3. After you authenticate, you will see your default path, which is currently set to /h1/<username>.

!{width:25%}image8.png!

4. You can transfer raw frames and references by selecting the ‘frames’ folder.

* Select your username.
* Then select the session you want to transfer frames from (e.g., 22mar18g).
* Under the rawdata folder, you will see all your raw frames.

5. You can also transfer aligned images by selecting the ‘leginon’ folder.

* Select your username.
* Then select the session you want to transfer images from (e.g., 22mar18g).
* Under the rawdata folder, you will see all your aligned images.

!{width:25%}image9.png!

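The folder layout described in steps 4 and 5 can be written down as a small helper, which may be handy when noting which folders to select for several sessions. This is only a sketch: the function name @session_folders@ and the username @jdoe@ are hypothetical, and the paths are relative to wherever the ‘frames’ and ‘leginon’ folders appear in the NYSBC#SEMC collection, which is an assumption rather than something stated above.

```python
from pathlib import PurePosixPath


def session_folders(username: str, session: str) -> dict:
    """Return the raw-frame and aligned-image folders for one session.

    Assumption: both top-level folders ('frames' and 'leginon') follow the
    <folder>/<username>/<session>/rawdata layout described in steps 4 and 5.
    The returned paths are relative, because the absolute location of these
    folders in the collection is not specified here.
    """
    if not username or not session:
        raise ValueError("username and session must be non-empty")
    return {
        "raw_frames": PurePosixPath("frames") / username / session / "rawdata",
        "aligned_images": PurePosixPath("leginon") / username / session / "rawdata",
    }


# 'jdoe' is a made-up username; '22mar18g' is the example session from the steps.
folders = session_folders("jdoe", "22mar18g")
print(folders["raw_frames"])      # frames/jdoe/22mar18g/rawdata
print(folders["aligned_images"])  # leginon/jdoe/22mar18g/rawdata
```

The helper does not talk to Globus; it only spells out which folders to navigate to in the file manager.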

6. Now select the destination endpoint. To transfer files to your personal computer, you need to install Globus Connect Personal. Click on ‘Endpoints’ in the left navigation bar and select ‘⊕ Create a personal endpoint’ at the top right of the screen.

!{width:25%}image10.png!

You will see the following screen:

!{width:25%}image11.png!

* Download and install Globus Connect Personal for your operating system. For specific installation instructions, see:
** Mac: https://docs.globus.org/how-to/globus-connect-personal-mac/
** Windows: https://docs.globus.org/how-to/globus-connect-personal-windows/
** Linux: https://docs.globus.org/how-to/globus-connect-personal-linux/
* Log in, then click Allow. This will take you to the Globus Connect Personal setup.
* For Owner Identity, select your Globus ID.
* Enter a name for your endpoint collection (e.g., Personal Computer).
* Click ‘Save’.

!{width:25%}image12.png!

7. Next, select your personal endpoint as your destination endpoint.

* Go back to the file manager and select NYSBC#SEMC in the Collection tab.
* Then select the ‘Transfer or Sync to...’ option.
* On the right side, select your personal endpoint as the destination endpoint, as shown in the screenshot.

!{width:25%}image13.png!

You will now see the folders on your personal computer.

Once you have selected the appropriate source and destination folders, click Start to initiate the transfer.

!{width:25%}image13.png!

To monitor the transfer, click on the Activity tab in the left sidebar. You will see the transfer logs and a message when the transfer is completed.

Note: Your personal computer may sit behind a firewall that blocks file transfers. To configure your firewall settings, see https://docs.globus.org/how-to/configure-firewall-gcp/

For any issues related to Globus, please check the following guides:

* https://docs.globus.org/how-to/
* https://docs.globus.org/faq/

h3. Transferring files from NYSBC to your home institution endpoint

* Follow steps 1 - 5 as described above.
* Then select your home institution endpoint as the destination endpoint. Contact your organization’s IT team for the name of the endpoint authorized by your institution, for login credentials, and for questions about your directory permissions.
* Select the source and destination folders, then click Start to initiate the transfer, as described above.
* You can keep track of the transfer status by clicking on the Activity tab. When the transfer is completed, you will see a green check mark and “Condition” will say SUCCEEDED.

h2. FAQ

h3. I'm unable to link my Globus account to my NCCAT/MEMC account from the CUIMC Campus

If Columbia affiliates encounter an error with Globus account linking, our usual advice is to first switch to a different network (e.g., a WiFi hotspot). This is because our Globus endpoint domain (_globus3.5540bd.75bc.data.globus.org_) apparently cannot be used on certain CUIMC networks. Specifically, the domain cannot be resolved (i.e., its domain name cannot be mapped to an IP address). The domain _globus3.5540bd.75bc.data.globus.org_ is managed by Globus (not us) and, in turn, uses Amazon as a domain name provider. Ultimately this means that there is a problem with CUIMC IT's domain name infrastructure where Amazon domains cannot be reached; from the outside it is unclear whether this is an actual fault or a matter of policy related to the protection of patient data / PHI.

If you keep encountering this issue, our advice is to contact CUIMC IT and/or Globus support.

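To check whether name resolution is the problem on your current network, a short Python test along the following lines may help. It is a minimal sketch, not an official diagnostic: the function name @can_resolve@ and its @resolver@ parameter are our own, and it only reports whether the hostname maps to an IP address, not whether a transfer would succeed.

```python
import socket

# Hostname quoted in this FAQ entry.
GLOBUS_HOST = "globus3.5540bd.75bc.data.globus.org"


def can_resolve(hostname, port=443, resolver=socket.getaddrinfo):
    """Return True if `hostname` can be mapped to at least one IP address.

    `resolver` defaults to the system resolver (socket.getaddrinfo); it is a
    parameter only so the function can be exercised without network access.
    """
    try:
        return len(resolver(hostname, port)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved on this network.
        return False
```

Calling @can_resolve(GLOBUS_HOST)@ once on the campus network and once on a hotspot shows whether the two networks differ. If it returns @False@ only on campus, that is consistent with the resolution problem described above and is useful information to pass on to CUIMC IT.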
h3. How do I unlink my MEMC account and link my NCCAT account (and vice versa)?

If you currently have a MEMC account linked to your Globus account, you may not be able to access your NCCAT directory (or vice versa). If this is the case, use the following instructions to unlink your MEMC account and link your NCCAT account:

1. Click on _Manage Identities_ on the _Settings_ page in the Globus web UI.
2. On the next page, click on the trash can icon in the same row as _<username>@globus3.5540bd.75bc.data.globus.org_ and then click on _Unlink Identity_.
3. The next time you are prompted to log in with your credentials, use your NCCAT account and its associated password.

h3. My Globus transfer times out every 5-10 minutes and is proceeding very slowly

On rare occasions, your transfer may proceed slowly and then time out at regular intervals. The time-out errors will look something like the following:

<pre>
Error (transfer)
Endpoint: myuniversity#endpoint(586f0817-9bd9-4eb5-bbdd-017afd630986)
Server: m-98f9a0.99817a.0ec8.data.globus.org:443
Command: STOR
/path/to/myfile.mrc
Message: The operation timed out
---
Details: Timeout waiting for response
</pre>

As an initial troubleshooting step, we suggest that you run your transfer over a VPN (such as "ProtonVPN":https://protonvpn.com/ or "Mullvad":https://mullvad.net). If this does not resolve the issue, please contact the MEMC or NCCAT office and we can assist. A longer explanation of the most common cause of this issue can be found below.

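If you want to count how often the time-out occurs before contacting us, the error text shown above can be picked apart with a few lines of Python. This is a rough sketch that assumes every error report follows the same "Field: value" layout as the sample; the helper names @parse_globus_error@ and @is_timeout@ are hypothetical and are not part of any Globus tool.

```python
SAMPLE = """\
Error (transfer)
Endpoint: myuniversity#endpoint(586f0817-9bd9-4eb5-bbdd-017afd630986)
Server: m-98f9a0.99817a.0ec8.data.globus.org:443
Command: STOR
/path/to/myfile.mrc
Message: The operation timed out
---
Details: Timeout waiting for response
"""


def parse_globus_error(text):
    """Split an error report into a dict keyed by lowercase field name."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "---":
            continue  # skip blank lines and the separator
        key, sep, value = line.partition(":")
        if sep and key.isalpha():
            # A "Field: value" line starts a new field.
            last_key = key.lower()
            fields[last_key] = value.strip()
        elif last_key is not None:
            # Anything else continues the previous field (e.g. the file path
            # that follows "Command: STOR" on its own line).
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


def is_timeout(fields):
    """Return True if the parsed report describes a time-out."""
    message = fields.get("message", "").lower()
    details = fields.get("details", "").lower()
    return "timed out" in message or "timeout" in details


report = parse_globus_error(SAMPLE)
print(report["command"])   # STOR /path/to/myfile.mrc
print(is_timeout(report))  # True
```

The server name and the number of time-outs you observe are useful details to include when you contact the MEMC or NCCAT office.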
h4. Why this issue occurs

If your Globus endpoint is on a university or research institute campus, the issue you are encountering is most likely caused by a quirk of how internet routing works. Our network engineer has configured our gateway router so that outgoing traffic is sent over "NYSERNet":https://nysernet.org/home (our regional research and education network). Our network also sends a message to other networks (such as your institution's network) telling them to send incoming traffic over NYSERNet as well. However, sometimes this message (a route advertisement) does not reach a network, and incoming traffic is routed differently.

This asymmetry in routing (outbound traffic over NYSERNet, inbound traffic over another network) causes the connection to be interrupted periodically. Every time this happens, your Globus client needs to create an entirely new connection, which slows the transfer down considerably.

Our network engineer's workaround is to manually implement an exception to our normal routing logic. For this to work, we require a fixed IP address or address range for the destination that traffic should go to.
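
Before sending us an address or range, it can help to confirm that it is well formed and covers the machine that receives the transfer. The sketch below uses Python's standard @ipaddress@ module. The function names are our own, and the addresses are reserved documentation examples, not real NYSBC or institutional addresses; your IT team would supply the real values.

```python
import ipaddress


def describe_destination(spec):
    """Parse a single IP address or CIDR range and summarise it.

    strict=True rejects ranges with host bits set (e.g. "192.0.2.10/24"),
    which usually indicates a typo in the range.
    """
    network = ipaddress.ip_network(spec, strict=True)
    return {
        "network": str(network),
        "first": str(network[0]),
        "last": str(network[-1]),
        "addresses": network.num_addresses,
    }


def covers(spec, address):
    """Return True if `address` falls inside the address or range `spec`."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(spec, strict=True)


# 192.0.2.0/24 and 198.51.100.0/24 are reserved for documentation.
print(describe_destination("192.0.2.0/24"))
# {'network': '192.0.2.0/24', 'first': '192.0.2.0', 'last': '192.0.2.255', 'addresses': 256}
print(covers("192.0.2.0/24", "192.0.2.77"))    # True
print(covers("192.0.2.0/24", "198.51.100.5"))  # False
```

A single address such as @192.0.2.10@ is also accepted and is treated as a range containing exactly one address.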