Extended XYZ format is an enhanced version of the basic XYZ format. It allows extra columns in the file for additional per-atom properties, and it standardises the comment line so that it carries the cell lattice and other per-frame parameters.
It's easiest to describe the format with an example. Here is a standard XYZ file containing a cubic 8-atom bulk silicon cell:
8
Cubic bulk silicon cell
Si 0.00000000 0.00000000 0.00000000
Si 1.36000000 1.36000000 1.36000000
Si 2.72000000 2.72000000 0.00000000
Si 4.08000000 4.08000000 1.36000000
Si 2.72000000 0.00000000 2.72000000
Si 4.08000000 1.36000000 4.08000000
Si 0.00000000 2.72000000 2.72000000
Si 1.36000000 4.08000000 4.08000000
The first line is the number of atoms and the second is a comment. These are followed by one line per atom, giving the element symbol and the Cartesian x, y, and z coordinates in Angstroms.
Here's the same configuration in extended XYZ format:
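As an illustration of the layout just described, here is a minimal Python sketch of a reader for a single-frame plain XYZ file. The function name `read_xyz` and the sample string `SILICON_XYZ` are hypothetical names chosen for this example, and the sketch assumes whitespace-separated columns and exactly one frame.

```python
# Minimal sketch of a plain XYZ reader (single frame, whitespace-separated).
# `read_xyz` and `SILICON_XYZ` are illustrative names, not part of any library.

SILICON_XYZ = """8
Cubic bulk silicon cell
Si 0.00000000 0.00000000 0.00000000
Si 1.36000000 1.36000000 1.36000000
Si 2.72000000 2.72000000 0.00000000
Si 4.08000000 4.08000000 1.36000000
Si 2.72000000 0.00000000 2.72000000
Si 4.08000000 1.36000000 4.08000000
Si 0.00000000 2.72000000 2.72000000
Si 1.36000000 4.08000000 4.08000000
"""


def read_xyz(text):
    """Return (comment, symbols, positions) for one plain XYZ frame."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])   # line 1: number of atoms
    comment = lines[1]        # line 2: free-form comment
    symbols, positions = [], []
    for line in lines[2:2 + n_atoms]:
        fields = line.split()
        symbols.append(fields[0])                               # element symbol
        positions.append(tuple(float(x) for x in fields[1:4]))  # x, y, z
    if len(symbols) != n_atoms:
        raise ValueError("file has fewer atom lines than the declared count")
    return comment, symbols, positions


comment, symbols, positions = read_xyz(SILICON_XYZ)
```

Calling `read_xyz(SILICON_XYZ)` yields the comment string, a list of eight element symbols, and a list of eight coordinate tuples.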
8
Lattice="5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44" Properties=pos:R:3 Time=0.0
Si 0.00000000 0.00000000 0.00000000
Si 1.36000000 1.36000000 1.36000000
Si 2.72000000 2.72000000 0.00000000
Si 4.08000000 4.08000000 1.36000000
Si 2.72000000 0.00000000 2.72000000
Si 4.08000000 1.36000000 4.08000000
Si 0.00000000 2.72000000 2.72000000
Si 1.36000000 4.08000000 4.08000000
In extended XYZ format, the comment line is replaced by a series of key/value pairs. The keys should be strings, and the values can be integers, reals, logicals (denoted by T and F for true and false) or strings. Quotes are required if a value contains any spaces (like Lattice above). There are two mandatory parameters that any extended XYZ file must include: Lattice and Properties. Other parameters, e.g. `Time` in the example above, can be added to the parameter line as needed.
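The key/value rules above can be sketched in a few lines of Python. This is only an illustrative parser, not a reference implementation: `parse_comment_line` and `convert_value` are hypothetical names, the sketch assumes there are no spaces around `=`, and it relies on the standard-library `shlex.split` to keep quoted values together.

```python
import shlex


def convert_value(raw):
    """Map a raw string to a logical, integer, real, or plain string."""
    if raw == "T":
        return True
    if raw == "F":
        return False
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # anything else stays a string (e.g. a quoted Lattice value)


def parse_comment_line(line):
    """Split an extended XYZ comment line into a dict of parameters."""
    params = {}
    # shlex.split honours double quotes, so Lattice="..." stays one token
    for token in shlex.split(line):
        key, sep, raw = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = convert_value(raw)
    return params


params = parse_comment_line(
    'Lattice="5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44" Properties=pos:R:3 Time=0.0'
)
```

Here `Lattice` and `Properties` are left as strings because their values need further, parameter-specific parsing, while `Time` is converted to a real.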
Lattice is a Cartesian 3x3 matrix representation of the cell lattice vectors, with the nine elements listed in Fortran (column-major) array ordering:
Lattice="R11 R21 R31 R12 R22 R32 R13 R23 R33"
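To make the ordering concrete, the following Python sketch unpacks the nine numbers into a 3x3 matrix `R` so that `R[i][j]` holds the element written R(i+1)(j+1) above. The name `parse_lattice` is hypothetical, and the sketch assumes the value has already had its surrounding quotes removed.

```python
def parse_lattice(value):
    """Turn 'R11 R21 R31 R12 ... R33' into a row-indexed 3x3 list of lists."""
    numbers = [float(x) for x in value.split()]
    if len(numbers) != 9:
        raise ValueError("Lattice must contain exactly nine numbers")
    # Fortran ordering: the first index varies fastest, so element (i, j)
    # sits at flat position i + 3 * j.
    return [[numbers[i + 3 * j] for j in range(3)] for i in range(3)]


# A deliberately non-symmetric input shows which way the indices run.
R = parse_lattice("1 2 3 4 5 6 7 8 9")
cubic = parse_lattice("5.44 0.0 0.0 0.0 5.44 0.0 0.0 0.0 5.44")
```

With the input `"1 2 3 4 5 6 7 8 9"`, the first three numbers end up in the first column of `R`, not the first row. For the cubic silicon cell the matrix is diagonal, so the ordering makes no visible difference.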
The list of properties in the file is described by the Properties parameter, which should take the form of a series of colon-separated triplets giving the name, format (R for real, I for integer, S for string and L for logical) and number of columns of each property. For example,
Properties=pos:R:3:vel:R:3:select:I:1
indicates that the first three columns represent atomic positions, the next three velocities, and the last is a single integer called select. With this property definition, the line
Si 4.08000000 4.08000000 1.36000000 0.00000000 0.00000000 0.00000000 1
would describe an atom at position (4.08,4.08,1.36) with zero velocity and the select property set to 1.
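The mapping from a Properties string to the columns of an atom line can be sketched as follows. This is an illustration under stated assumptions: `parse_properties` and `parse_atom_line` are hypothetical names, the spelling `pos:R:3:vel:R:3:select:I:1` is assumed from the description above, and the element symbol is treated as a leading column that is not listed in Properties, as in the example line.

```python
# Converters for the four format codes named in the text.
CASTS = {"R": float, "I": int, "S": str, "L": lambda s: s == "T"}


def parse_properties(spec):
    """Split 'name:fmt:ncols:...' into a list of (name, fmt, ncols) triplets."""
    fields = spec.split(":")
    if len(fields) % 3 != 0:
        raise ValueError("Properties must be a series of name:format:ncols triplets")
    props = []
    for i in range(0, len(fields), 3):
        name, fmt, ncols = fields[i], fields[i + 1], int(fields[i + 2])
        if fmt not in CASTS:
            raise ValueError(f"unknown format code {fmt!r}")
        props.append((name, fmt, ncols))
    return props


def parse_atom_line(line, props):
    """Read one atom line: element symbol first, then the property columns."""
    tokens = line.split()
    symbol, values = tokens[0], tokens[1:]
    if len(values) != sum(ncols for _, _, ncols in props):
        raise ValueError("column count does not match Properties")
    atom = {"symbol": symbol}
    start = 0
    for name, fmt, ncols in props:
        cols = [CASTS[fmt](v) for v in values[start:start + ncols]]
        atom[name] = cols[0] if ncols == 1 else tuple(cols)  # scalar or vector
        start += ncols
    return atom


props = parse_properties("pos:R:3:vel:R:3:select:I:1")
atom = parse_atom_line(
    "Si 4.08000000 4.08000000 1.36000000 0.00000000 0.00000000 0.00000000 1",
    props,
)
```

For the example line, `atom` holds the symbol `Si`, a three-component position, a three-component velocity of zeros, and `select` as the integer 1.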