Dataset Format
stru_out
This file contains the structure and k-mesh data. It should contain the following data, in order:
lattice vectors: 3 lines, 3 float numbers in each line, unit: Bohr radius
reciprocal lattice vectors: 3 lines, 3 float numbers in each line, unit: inverse Bohr radius
number of k-grid points along each lattice vector: 1 line containing nkx, nky and nkz. The total number of k-points, nkpts, equals the product of nkx, nky and nkz.
Cartesian coordinates of each k-point: nkpts lines, 3 float numbers in each line, unit: inverse Bohr radius
mapping of each k-point to its irreducible counterpart: nkpts lines, 1 integer in each line.
The mapping should be read as follows: if the number on the n-th line is m, then the irreducible k-point corresponding to the n-th k-point in the full k-point set is the m-th k-point in the full set.
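The layout described above can be read with a short positional parser. The following is a minimal sketch, not part of the code base: the function name `parse_stru_out` is hypothetical, and it assumes that all values are whitespace-separated and that the mapping integers are 1-based.

```python
import numpy as np


def parse_stru_out(text):
    """Parse the contents of a stru_out file passed in as one string.

    Hypothetical helper: assumes whitespace-separated values in the
    order given in the format description.
    """
    tokens = text.split()
    pos = 0

    def take(n, conv):
        # Consume the next n tokens and convert each with conv.
        nonlocal pos
        chunk = tokens[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("stru_out ended before all expected values were read")
        pos += n
        return np.array([conv(t) for t in chunk])

    lattice = take(9, float).reshape(3, 3)            # Bohr
    recip_lattice = take(9, float).reshape(3, 3)      # 1/Bohr
    nkx, nky, nkz = (int(v) for v in take(3, int))
    nkpts = nkx * nky * nkz
    kpoints = take(3 * nkpts, float).reshape(nkpts, 3)  # Cartesian, 1/Bohr
    # irk_map[n-1] = m: the irreducible counterpart of the n-th k-point is
    # the m-th k-point of the full set (1-based indexing is an assumption).
    irk_map = take(nkpts, int)
    return lattice, recip_lattice, (nkx, nky, nkz), kpoints, irk_map
```

With the mapping in hand, `np.unique(irk_map)` lists the indices of the irreducible k-points in the full set.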
Cs_data_xxx.txt
These files contain the localized RI triple coefficients.
In plain text format, each file has a header with two integers: the total number of atoms and the number of periodic unit cells. The rest of the file, up to its end, consists of blocks of RI coefficients \(C\), one block for each combination of atom pair and unit cell:
i_atom_1 i_atom_2 n_1 n_2 n_3 n_basis_1 n_basis_2 n_aux_basis_1
C(1, 1, 1)
...
C(n_aux_basis_1, n_basis_2, n_basis_1)
Here C holds the RI coefficients between the atoms i_atom_1 and i_atom_2 in unit cells separated by the lattice vector \(\mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3\).
The auxiliary basis is located on i_atom_1. The numbers of basis functions on i_atom_1 and i_atom_2 are n_basis_1 and n_basis_2, respectively. The number of auxiliary functions is n_aux_basis_1.
The indices of C run in Fortran order, i.e. the first index runs fastest.
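To make the Fortran ordering concrete, here is a minimal sketch of a reader for the plain text format. The function name `parse_cs_text` is hypothetical. The sketch assumes whitespace-separated values and the index order shown in the block layout, C(aux, basis_2, basis_1), with the first index running fastest.

```python
import numpy as np


def parse_cs_text(text):
    """Parse a plain-text Cs_data file passed in as one string.

    Hypothetical helper. Returns (n_atoms, n_cells, blocks), where blocks
    maps (i_atom_1, i_atom_2, n_1, n_2, n_3) to an array of shape
    (n_aux_basis_1, n_basis_2, n_basis_1).
    """
    tokens = text.split()
    n_atoms, n_cells = int(tokens[0]), int(tokens[1])
    pos = 2
    blocks = {}
    while pos < len(tokens):
        # Block header: 8 integers.
        a1, a2, n1, n2, n3, nb1, nb2, naux1 = (int(t) for t in tokens[pos:pos + 8])
        pos += 8
        size = nb1 * nb2 * naux1
        values = np.array([float(t) for t in tokens[pos:pos + size]])
        if values.size != size:
            raise ValueError("file ended in the middle of a coefficient block")
        pos += size
        # order="F" makes the first index (the auxiliary function) vary fastest.
        blocks[(a1, a2, n1, n2, n3)] = values.reshape((naux1, nb2, nb1), order="F")
    return n_atoms, n_cells, blocks
```

For a block `C = blocks[key]`, the element `C[mu, j, i]` is then the coefficient for auxiliary function `mu` on the first atom, basis function `j` on the second atom and basis function `i` on the first atom, all zero-based.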
In binary format, the data is organized as in the plain text format, except that the header includes one extra integer: the number of atom-pair and lattice-vector combinations stored in the file. The coefficients are saved in double precision. The following Python snippet illustrates the layout of the binary file:
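import struct
import numpy as np

# Ensure that "Cs_data_0.txt" exists and was generated with binary output mode in the DFT code
cfile_path = "Cs_data_0.txt"
apcells = {}  # maps (atom 1, atom 2, n_1, n_2, n_3) to the coefficient array
with open(cfile_path, 'rb') as h:
    # File header: number of atoms, number of unit cells, number of blocks in this file
    n_atoms, n_cells, n_apcell_file = struct.unpack('iii', h.read(12))
    for _ in range(n_apcell_file):
        # Block header: the same 8 integers as in the plain text format
        a1, a2, r1, r2, r3, nb1, nb2, nbb1 = struct.unpack('i' * 8, h.read(4 * 8))
        apcell = (a1, a2, r1, r2, r3)
        array_size = nb1 * nb2 * nbb1
        # Coefficients are stored as double-precision floats
        array = np.array(struct.unpack('d' * array_size, h.read(8 * array_size)))
        array = np.reshape(array, (nb1, nb2, nbb1))
        apcells[apcell] = array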
band_out
This file contains band energies and occupation numbers from the mean-field starting-point calculation. It has a 5-line header
n_k_points
n_spins
n_states
n_basis
e_fermi
The first 4 lines each contain one integer. The 5th line is a float number, the Fermi energy in Hartree.
The remaining lines consist of n_k_points*n_spins blocks of n_states+1 lines each, in the format
i_k_point i_spin
1 f_1 e_1_ha e_1_ev
2 f_2 e_2_ha e_2_ev
3 f_3 e_3_ha e_3_ev
...
n f_n e_n_ha e_n_ev
...
Each block contains the energies and occupation numbers of the states \(\left|\psi_{n,k\sigma}\right\rangle\).
i_k_point is the index of the k-point \(k\) in the full k-point set.
i_spin specifies the spin channel \(\sigma\).
In each of the following lines, the first integer specifies the index of the state.
The 3 float numbers are the occupation number, the energy in Hartree and the energy in electronvolts, respectively.
For a spin-unpolarized calculation, f_n is a number from 0 to 2; otherwise it is from 0 to 1.
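The header and block structure described above can be read as in the following minimal sketch. The function name `parse_band_out` is hypothetical, and the sketch assumes that the k-point, spin and state indices are all 1-based. Only the listing of state indices starting from 1 is shown in the format itself.

```python
import numpy as np


def parse_band_out(text):
    """Parse the contents of a band_out file passed in as one string.

    Hypothetical helper. Returns (e_fermi, n_basis, occ, e_ha), where occ
    and e_ha have shape (n_spins, n_k_points, n_states).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_k, n_spins, n_states, n_basis = (int(lines[i].split()[0]) for i in range(4))
    e_fermi = float(lines[4].split()[0])  # Hartree

    occ = np.zeros((n_spins, n_k, n_states))
    e_ha = np.zeros((n_spins, n_k, n_states))
    cursor = 5
    for _ in range(n_k * n_spins):
        # Block header: k-point index and spin index (assumed 1-based).
        i_k, i_spin = (int(t) for t in lines[cursor].split()[:2])
        rows = lines[cursor + 1:cursor + 1 + n_states]
        if len(rows) != n_states:
            raise ValueError("band_out ended in the middle of a block")
        for row in rows:
            i_state, f, e_h, _e_ev = row.split()[:4]
            occ[i_spin - 1, i_k - 1, int(i_state) - 1] = float(f)
            e_ha[i_spin - 1, i_k - 1, int(i_state) - 1] = float(e_h)
        cursor += n_states + 1
    return e_fermi, n_basis, occ, e_ha
```

The electronvolt column is skipped here because it duplicates the Hartree value in another unit.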
KS_eigenvector_xxx.txt
These files contain the wave functions (eigenvectors) from the starting-point calculation, expanded in the orbital basis.
Each file can be divided into blocks of n_states*n_basis*n_spins+1 lines, where n_states, n_basis and n_spins are read from the band_out file.
Each block stores the data for a particular k-point, \(c^i_{n,k\sigma}\):
i_k_point
c(1,1,1)_real c(1,1,1)_imag
...
c(i,n,s)_real c(i,n,s)_imag
...
The first line contains a single integer, the index of the k-point that the following data belongs to. The remaining lines store the data with running indices \(i\), \(n\), \(\sigma\) in C-style row-major order, i.e. the spin index runs fastest, then the state index, and finally the basis index. Each line has two float numbers, the real and imaginary parts of \(c^i_{n,k\sigma}\).
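The row-major ordering translates directly into a NumPy reshape. The following is a minimal sketch under the assumption of whitespace-separated values. The function name `parse_ks_eigenvector` is hypothetical, and the dimensions are passed in by the caller, who would take them from the band_out header.

```python
import numpy as np


def parse_ks_eigenvector(text, n_states, n_basis, n_spins):
    """Parse a KS_eigenvector file passed in as one string.

    Hypothetical helper. Returns a dict mapping i_k_point to a complex
    array of shape (n_basis, n_states, n_spins).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    block_len = n_basis * n_states * n_spins + 1
    if len(lines) % block_len != 0:
        raise ValueError("file length is not a whole number of k-point blocks")

    coeffs = {}
    for start in range(0, len(lines), block_len):
        i_k = int(lines[start].split()[0])
        pairs = np.array(
            [[float(t) for t in ln.split()[:2]] for ln in lines[start + 1:start + block_len]]
        )
        c = pairs[:, 0] + 1j * pairs[:, 1]
        # Default C order: the last axis (spin) varies fastest, then the
        # state axis, and the basis axis varies slowest.
        coeffs[i_k] = c.reshape(n_basis, n_states, n_spins)
    return coeffs
```

With this layout, `coeffs[i_k][i, n, s]` is the coefficient of basis function `i` in state `n` and spin channel `s`, all zero-based.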
coulomb_mat_xxx.txt
These files contain the Coulomb matrices in the auxiliary basis. A single header line contains an integer, the number of irreducible k-points at which the Coulomb matrices are computed. The rest of the file is organized in blocks
n_aux_basis row_start row_end col_start col_end
i_k_point k_weight
v(row_start, col_start )_real v(row_start, col_start )_imag
v(row_start, col_start+1)_real v(row_start, col_start+1)_imag
...
v(row_end, col_end)_real v(row_end, col_end)_imag
where
integer n_aux_basis is the total number of auxiliary basis functions.
integers row_start, row_end, col_start and col_end mark the submatrix of the full Coulomb matrix that this block contains.
integer i_k_point is the index of the k-point of the current Coulomb matrix in the full k-point list.
float number k_weight is the weight of the irreducible k-point.
After the block header, there should be (row_end-row_start+1) times (col_end-col_start+1) lines for the actual matrix element data. Each line contains two float numbers, the real and imaginary parts of the element. The data is ordered in C-style row-major order.
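Because each block carries only a submatrix, a reader has to place it into the full matrix for its k-point. The following is a minimal sketch of that assembly. The function name `parse_coulomb_mat` is hypothetical, and the sketch assumes that the row and column bounds are 1-based and inclusive, which is consistent with the line count given above.

```python
import numpy as np


def parse_coulomb_mat(text):
    """Parse a coulomb_mat file passed in as one string.

    Hypothetical helper. Returns (n_irk, weights, mats), where weights and
    mats are dicts keyed by i_k_point and each matrix has shape
    (n_aux_basis, n_aux_basis).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_irk = int(lines[0].split()[0])
    weights, mats = {}, {}
    cursor = 1
    while cursor < len(lines):
        # Two-line block header.
        n_aux, r0, r1, c0, c1 = (int(t) for t in lines[cursor].split()[:5])
        fields = lines[cursor + 1].split()
        i_k, weight = int(fields[0]), float(fields[1])

        n_rows, n_cols = r1 - r0 + 1, c1 - c0 + 1
        data = lines[cursor + 2:cursor + 2 + n_rows * n_cols]
        if len(data) != n_rows * n_cols:
            raise ValueError("file ended in the middle of a matrix block")
        pairs = np.array([[float(t) for t in ln.split()[:2]] for ln in data]).reshape(-1, 2)
        # Row-major: the column index varies fastest within the block.
        sub = (pairs[:, 0] + 1j * pairs[:, 1]).reshape(n_rows, n_cols)

        if i_k not in mats:
            mats[i_k] = np.zeros((n_aux, n_aux), dtype=complex)
        # Convert the 1-based inclusive bounds to zero-based slices.
        mats[i_k][r0 - 1:r1, c0 - 1:c1] = sub
        weights[i_k] = weight
        cursor += 2 + n_rows * n_cols
    return n_irk, weights, mats
```

Keying both dictionaries by `i_k_point` lets several blocks for the same k-point, possibly spread over several files, accumulate into one matrix if the caller merges the results.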