load_snap_file

pai4sk.sml_io.load_snap_file(filename, num_chunks=None)

Loads data from a snap-formatted data file. Both local and distributed (MPI) loading are supported. Under MPI execution, the loaded data can be used for distributed SnapML training and inference.

Parameters
  • filename (str) – Path to the file that contains the data in snap format.

  • num_chunks (int, optional) – Number of chunks per partition. Defaults to None.

Returns

  • X (scipy.sparse matrix or ndarray of shape (n_samples, n_features))

  • y (ndarray of shape (n_samples,))
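A minimal usage sketch, assuming that load_snap_file returns X and y together as a two-element tuple, as the Returns list suggests. The wrapper name load_snap_dataset and its loader argument are hypothetical and are not part of pai4sk. They exist only so that the shape check can be exercised without a snap file on disk.

```python
def load_snap_dataset(filename, num_chunks=None, loader=None):
    """Load (X, y) from a snap-format file and check that their shapes agree.

    `loader` is a hypothetical injection point (not part of pai4sk); when it is
    omitted, the documented pai4sk.sml_io.load_snap_file is used.
    """
    if loader is None:
        # Imported lazily so this helper can be defined without pai4sk installed.
        from pai4sk.sml_io import load_snap_file as loader

    # Assumption: the two documented return values arrive as a tuple (X, y).
    X, y = loader(filename, num_chunks=num_chunks)

    # X is documented as (n_samples, n_features) and y as (n_samples,).
    n_samples, n_features = X.shape
    if y.shape != (n_samples,):
        raise ValueError(
            "y has shape %r, expected (%d,)" % (y.shape, n_samples)
        )
    return X, y
```

With pai4sk installed, a call such as load_snap_dataset("train.snap"), where "train.snap" is a placeholder path, would forward to load_snap_file and return the same X and y after the consistency check.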