DatasetReader

class snap_ml_spark.DatasetReader.DatasetReader

Load a distributed dataset from a file.

load(file)

Load the training data from the given file into memory.

setFormat(format)

Specify the data format of the file. Accepted values: “snap”, “libsvm”, or “csv”.
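
For readers unfamiliar with the “libsvm” text format, the following is a minimal sketch of how one line of such a file maps to a label and a feature row. It is an illustration only and not the snap_ml_spark implementation; the helper name `parse_libsvm_line` and its `num_ft` argument are hypothetical. It assumes the usual libsvm convention of `label index:value ...` with 1-based feature indices, and it shows why the feature count has to be known in advance: a sparse line does not say how wide the row is.

```python
# Illustrative only: a tiny parser for libsvm-style text, not snap_ml_spark code.
from typing import List, Tuple


def parse_libsvm_line(line: str, num_ft: int) -> Tuple[float, List[float]]:
    """Parse one 'label index:value ...' line into a label and a dense row."""
    tokens = line.split()
    label = float(tokens[0])
    # Features that do not appear on the line are implicitly zero.
    row = [0.0] * num_ft
    for token in tokens[1:]:
        index, value = token.split(":")
        # Assumption: feature indices are 1-based, as is conventional for libsvm.
        row[int(index) - 1] = float(value)
    return label, row


sample = "1 1:0.5 3:2.0"
label, row = parse_libsvm_line(sample, num_ft=4)
print(label, row)
```

Here the sample line only mentions features 1 and 3, so the row width of 4 comes entirely from the caller.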

setNumFt(x)

Set the number of features in the dataset.

takeRange(idx_start, idx_end)

To load only part of the dataset, specify the start and end indices.

Parameters