Utils

snap_ml_spark.Utils.dump_to_snap_format(X, y, filename, transpose=False, implicit_vals=False)

Writes data to a file in snap format (non-distributed).

Parameters:
  • X (numpy array or sparse matrix) – The data used for training or inference.
  • y (numpy array) – The labels of the samples in X.
  • filename (str) – The file where X and y will be stored in snap format.
  • transpose (bool, default: False) – If True, X is stored in transposed format.

snap_ml_spark.Utils.read_from_snap_format(filename)

Loads data from a file in snap format (non-distributed).

Parameters:
  • filename (str) – The file where the data resides.

Returns: X, y – X is the data used for training or inference; y holds the labels of the samples in X.

Return type: numpy array or sparse matrix, numpy array
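
The two functions above can be combined into a write-then-read round trip. The following is a minimal sketch, not a reference implementation: the helper names snap_round_trip and demo, the file name example.snap, and the sample values are hypothetical, and the import form `from snap_ml_spark import Utils` is assumed from the documented path snap_ml_spark.Utils. Any dtype or shape requirements that the snap format places on X and y are not stated here, so the sample arrays may need adjusting.

```python
def snap_round_trip(X, y, filename):
    """Write X and y to `filename` in snap format, then load them back."""
    # Imported lazily so this helper can be defined even where
    # snap_ml_spark is not installed (assumed import form).
    from snap_ml_spark import Utils

    # Only the documented positional arguments are used; transpose and
    # implicit_vals keep their defaults (False).
    Utils.dump_to_snap_format(X, y, filename)

    # Returns X (numpy array or sparse matrix) and y (numpy array).
    return Utils.read_from_snap_format(filename)


def demo():
    """Hypothetical usage with a small dense dataset."""
    import numpy as np

    X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])  # 3 samples, 2 features
    y = np.array([1.0, -1.0, 1.0])                      # one label per sample

    X_loaded, y_loaded = snap_round_trip(X, y, "example.snap")
    print(X_loaded.shape, y_loaded.shape)
```

Calling demo() requires an environment with both numpy and snap_ml_spark installed. Whether the loaded X matches the input exactly (for example, dense versus sparse representation) is not specified above, so check this before relying on it.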