The Oscar cluster will be unavailable for scheduled maintenance from Monday, June 22, 2020 at 8:00 am to Monday, June 29, 2020 at 5:00 pm.
We expect to bring the cluster back online much sooner than the published end of the downtime, but have lengthened the window in case of unforeseen problems. If no issues arise, the downtime should last 3-4 days.
The following services will be unavailable during the downtime:
- SSH Login
- VNC
- File System Mounts (CIFS, SMB)
Existing jobs and VNC sessions will terminate. Any Slurm jobs in the queue that can’t finish before 8 am on the morning of the downtime won’t run. They will remain in the queue until the nodes are released back into production on June 29, 2020 (or earlier).
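As a rough illustration of the rule above, the following Python sketch checks whether a job could finish before the downtime begins and computes the longest time limit that would still fit. It assumes the scheduler decides this from the job's requested time limit rather than its actual runtime, and that a job ending exactly at 8:00 am counts as finishing in time; the function names are hypothetical and not part of any CCV tooling.

```python
from datetime import datetime, timedelta

# Start of the maintenance window, taken from the announcement (local time).
DOWNTIME_START = datetime(2020, 6, 22, 8, 0)


def fits_before_downtime(now: datetime, time_limit: timedelta) -> bool:
    """Return True if a job starting at `now` with this time limit ends by the downtime start.

    Assumption: a job ending exactly at the downtime start still fits.
    """
    return now + time_limit <= DOWNTIME_START


def max_time_limit(now: datetime) -> str:
    """Format the time left before the downtime as D-HH:MM:SS.

    This is the days-hours:minutes:seconds form accepted by Slurm's --time option.
    """
    remaining = DOWNTIME_START - now
    if remaining <= timedelta(0):
        # The downtime has already started; nothing fits.
        return "0-00:00:00"
    hours, rem = divmod(remaining.seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{remaining.days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    now = datetime.now()
    print("Longest time limit that still fits:", max_time_limit(now))
```

A job submitted with a time limit no longer than the printed value would, under these assumptions, be eligible to start before the downtime, subject to the usual queue wait.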
During this downtime, we will perform several maintenance tasks, including upgrading the GPFS file system. We will send an update once the cluster is back in production.
If you have any questions or concerns, please let us know at support@ccv.brown.edu.
CCV User Services
The Oscar cluster is back online. The GPFS file system has been upgraded, the Slurm reservations have been released, and the queued jobs started running yesterday around 8 pm. Please see below for our current status:
The following services are available now:
- SSH Logins: ssh user_name@ssh.ccv.brown.edu (Do not use ssh3/ssh4 directly)
- Slurm batch system
- VNC Sessions
- Legacy CIFS Mount (VPN required)
- Transfer Servers and the primary Globus endpoint (brownccv#Brown-CCV-oscar-1)
Notes:
- “SMB” Mount is still under maintenance (smb.ccv.brown.edu - VPN required)
- “Myquota” and possibly other quota-related commands may not accurately report home directory usage. We’re working on updating them.
- Some older nodes (at least the 2012-vintage Sandy Bridge ones) have motherboard firmware that is incompatible with the updated InfiniBand adapter firmware. We’re continuing to work on bringing more of the other old nodes online. We had recently been seeing roughly 100 of the old nodes completely idle, so we don’t expect this to add significant job wait time. We’ll be evaluating whether we need to add new nodes for additional computing power.
As always, thank you for using Oscar and for your patience during this maintenance period. Please report any issues to support@ccv.brown.edu.
Sincerely,
CCV User Services Team