How to compress Ovito outputs for better performance

A user contacted us about opening a large molecular simulation output file on the cluster with Ovito, a GUI visualization program.

Both the X11 connection and downloading the file were very slow for the user, who wanted to improve their workflow so they could view their simulation results faster.

Their output files are typically about 10 GB.

When we reviewed the file, we saw that it is an ASCII file recording many properties of the simulation, such as XYZ coordinates, velocities, forces, and other parameters, for millions of atoms.

To view the 3D maps, however, the user only needs the coordinates.

Also, because the number of atoms is so large, the 3D maps remain accurate even if some of the atoms are dropped.

As a solution, we wrote a Python script that lets the user prune the file and reduce its size.

The script keeps every Nth atom (N is chosen by the user), drops the extra columns, and compresses the result.

In this way we reduced the file size dramatically. For example, we reduced the 10 GB file to about 25 MB by keeping every 25th atom.
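The pruning idea can be illustrated on a handful of records. This is only a minimal sketch: the helper name `thin_records` and the sample records (assumed here to hold id, type, x, y, z and three velocity components) are made up for illustration and are not taken from the user's data.

```python
def thin_records(atom_lines, step, ncol):
    """Keep every step-th record and only the first ncol columns of each."""
    return [" ".join(line.split()[:ncol]) for line in atom_lines[::step]]


# Hypothetical records: id type x y z vx vy vz
records = [
    "1 1 0.0 0.0 0.0 0.5 0.1 0.2",
    "2 1 1.0 0.0 0.0 0.4 0.0 0.1",
    "3 2 0.0 1.0 0.0 0.3 0.2 0.0",
    "4 2 1.0 1.0 0.0 0.2 0.1 0.1",
    "5 1 0.0 0.0 1.0 0.1 0.3 0.2",
]

# Keep every 2nd atom and only the id, type and coordinate columns.
thinned = thin_records(records, step=2, ncol=5)
print(thinned)
```

With a step of 25, the selection alone shrinks the atom section by roughly a factor of 25; the rest of the reduction comes from the dropped columns and the compression.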

Our Python script is shown below:

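#!/usr/bin/python3
# Arguments: <input file> <step>
# Expects a single snapshot: 9 header lines followed by one line per atom.
import sys
import tarfile

# Note: readlines() loads the whole input file into memory.
with open(sys.argv[1], 'r') as f:
    lines = f.readlines()
step = int(sys.argv[2])  # keep every step-th atom
ncol = 6                 # number of leading per-atom columns to keep
selected_lines = lines[9::step]  # atom records start after the 9 header lines
out_name = 'out_%s.lammps' % step
with open(out_name, 'w') as out:
    # First three header lines are copied unchanged
    for i in range(3):
        out.write(lines[i])
    # Replace the original atom count (line 4) with the number of atoms kept
    out.write(str(len(selected_lines)) + '\n')
    # Next four header lines are copied unchanged
    for i in range(4, 8):
        out.write(lines[i])
    # Column header: two leading keywords plus the names of the kept columns
    out.write(' '.join(lines[8].split()[:ncol+2]) + '\n')
    # Atom records, truncated to the first ncol columns
    for sline in selected_lines:
        out.write(' '.join(sline.split()[:ncol]) + '\n')
# Compress the pruned file into out_<step>.lammps.tar.gz
with tarfile.open(out_name + '.tar.gz', 'w:gz') as tar:
    tar.add(out_name)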
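
Because the script above reads the whole input into memory, a 10 GB file may need more RAM than is convenient. A streaming variant is sketched below. It is not the script we gave the user: the function name `prune_dump` is hypothetical, it writes a plain gzip file rather than a tar archive, and it assumes the fourth header line holds the true number of atom records in a single-snapshot file.

```python
import gzip
import itertools


def prune_dump(src_path, dst_path, step, ncol=6):
    """Stream a single-snapshot text dump into a thinned, gzip-compressed copy.

    Assumes 9 header lines with the atom count on the 4th line, as in the
    script above. Returns the number of atom records written.
    """
    with open(src_path, "r") as src, gzip.open(dst_path, "wt") as dst:
        header = [next(src) for _ in range(9)]
        n_atoms = int(header[3])
        # Records 0, step, 2*step, ... below n_atoms are kept.
        kept = (n_atoms + step - 1) // step
        dst.writelines(header[:3])
        dst.write(f"{kept}\n")
        dst.writelines(header[4:8])
        # Two leading keywords plus the names of the kept columns.
        dst.write(" ".join(header[8].split()[:ncol + 2]) + "\n")
        # Read one line at a time; only every step-th record is written.
        for line in itertools.islice(src, 0, n_atoms, step):
            dst.write(" ".join(line.split()[:ncol]) + "\n")
    return kept
```

A call such as `prune_dump("dump.lammps", "out_25.lammps.gz", step=25)` (file names are placeholders) would keep every 25th atom. If the header count does not match the number of records actually present, the count written to the output will be wrong, so this sketch is only suitable for well-formed files.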