Utils

snap_ml_spark.Utils.dump_to_snap_format(X, y, filename, transpose=False, implicit_vals=False)

Writes data to a file in snap format (non-distributed).

Parameters
  • X (numpy array or sparse matrix) – The data used for training or inference.

  • y (numpy array) – The labels of the samples in X.

  • filename (str) – The file where X and y will be stored in snap format.

  • transpose (bool, default: False) – If True, X is stored in transposed format.

snap_ml_spark.Utils.read_from_snap_format(filename)

Loads data from a file in snap format (non-distributed).

Parameters
  • filename (str) – The file where the data resides.

Returns

X, y – Two datasets: X, the data used for training or inference, and y, the labels of the samples in X.

Return type

numpy array or sparse matrix, numpy array
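
The two functions above are naturally used as a pair: write a dataset once, then load it back later. The following is a minimal sketch of that round trip, not an official example. The helper name round_trip, the demo function, the sample values, and the file name "example.snap" are all hypothetical. The import path inside demo is an assumption inferred from the signatures above and may differ in your installation.

```python
def round_trip(utils, X, y, filename, transpose=False):
    """Write X and y to `filename` in snap format, then load them back.

    `utils` is any object exposing the two documented functions,
    for example snap_ml_spark.Utils (assumed to be importable as such).
    """
    utils.dump_to_snap_format(X, y, filename, transpose=transpose)
    X_loaded, y_loaded = utils.read_from_snap_format(filename)
    return X_loaded, y_loaded


def demo():
    # Imports live inside the function so the helper above can be
    # defined even when these packages are not installed.
    import numpy as np
    from snap_ml_spark import Utils  # assumed import path

    # Made-up data: 3 samples with 2 features each, plus one label per sample.
    X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
    y = np.array([1.0, -1.0, 1.0])

    # "example.snap" is a hypothetical file name.
    return round_trip(Utils, X, y, "example.snap")
```

Passing the utilities object as an argument keeps the helper independent of how the package is imported. Calling demo() requires NumPy and Snap ML to be installed.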