LinearRegression

class snap_ml_spark.estimator.LinearRegression(featuresCol='features', labelCol='label', predictionCol='prediction', trainingHistory=0, maxIter=1000, regParam=0.0, elasticNetParam=0.0, tol=0.001, solver='auto', weightCol=None, useGpu=False, dual=True, balanced=False, nthreads=-1, gpuMemLimit=0, verbose=False)

Linear regression.

The learning objective is to minimize the loss function plus a regularization term. The regularization is configured through regParam and elasticNetParam.
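As an illustration of what such an objective can look like, the sketch below evaluates a weighted squared-error loss with an elastic-net penalty in plain Python. It assumes the convention documented for Spark MLlib, where regParam scales the penalty and elasticNetParam mixes the L1 and L2 terms. Whether snap_ml_spark uses the same scaling is an assumption, and the function name elastic_net_objective is hypothetical rather than part of this package.

```python
def elastic_net_objective(rows, coef, intercept, reg_param=0.0, elastic_net_param=0.0):
    """Evaluate a weighted squared-error loss plus an elastic-net penalty.

    rows is a list of (features, label, weight) tuples, where features is a
    list of floats. The scaling follows the Spark MLlib convention, which is
    an assumption here and not confirmed for snap_ml_spark.
    """
    total_weight = sum(w for _, _, w in rows)

    # Weighted mean of squared residuals, halved.
    loss = 0.0
    for x, y, w in rows:
        pred = intercept + sum(c * xi for c, xi in zip(coef, x))
        loss += w * (y - pred) ** 2
    loss /= 2.0 * total_weight

    # Elastic-net penalty: elastic_net_param = 1 is pure L1, 0 is pure L2.
    l1 = sum(abs(c) for c in coef)
    l2 = sum(c * c for c in coef)
    penalty = reg_param * (elastic_net_param * l1 + (1.0 - elastic_net_param) / 2.0 * l2)
    return loss + penalty


# Same two rows as the DataFrame in the class example: (features, label, weight).
rows = [([1.0], 1.0, 2.0), ([0.0], 0.0, 2.0)]
perfect_fit = elastic_net_objective(rows, [1.0], 0.0)
zero_model = elastic_net_objective(rows, [0.0], 0.0)
```

With regParam=0.0 the penalty vanishes, so a model that reproduces every label exactly has an objective value of zero.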

>>> from pyspark.ml.linalg import Vectors
>>> df = spark.createDataFrame([
...     (1.0, 2.0, Vectors.dense(1.0)),
...     (0.0, 2.0, Vectors.sparse(1, [], []))], ["label", "weight", "features"])
>>> lr = LinearRegression(maxIter=5, regParam=0.0, weightCol="weight")
>>> model = lr.fit(df)
>>> test0 = spark.createDataFrame([(Vectors.dense(-1.0),)], ["features"])
>>> abs(model.transform(test0).head().prediction - (-1.0)) < 0.001
True
>>> abs(model.coefficients[0] - 1.0) < 0.001
True
>>> abs(model.intercept - 0.0) < 0.001
True
>>> test1 = spark.createDataFrame([(Vectors.sparse(1, [0], [1.0]),)], ["features"])
>>> abs(model.transform(test1).head().prediction - 1.0) < 0.001
True
>>> lr.setParams("vector")
Traceback (most recent call last):
    ...
TypeError: Method setParams forces keyword arguments.
>>> lr_path = temp_path + "/lr"
>>> lr.save(lr_path)
>>> lr2 = LinearRegression.load(lr_path)
>>> lr2.getMaxIter()
5
>>> model_path = temp_path + "/lr_model"
>>> model.save(model_path)
>>> model2 = LinearRegressionModel.load(model_path)
>>> model.coefficients[0] == model2.coefficients[0]
True
>>> model.intercept == model2.intercept
True
>>> model.numFeatures
1
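The values checked in the example above (coefficient near 1.0, intercept near 0.0) can be reproduced by hand. The following is a minimal sketch that solves the same two-row weighted least-squares problem in closed form, without Spark or snap_ml_spark. The helper fit_weighted_simple is hypothetical and says nothing about how the library's solver works internally.

```python
def fit_weighted_simple(xs, ys, ws):
    """Closed-form weighted least squares for one feature plus an intercept."""
    total_weight = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total_weight
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total_weight

    # Weighted variance of x and weighted covariance of x and y.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Rows from the example: features 1.0 and 0.0 (the empty sparse vector),
# labels 1.0 and 0.0, both with weight 2.0.
slope, intercept = fit_weighted_simple([1.0, 0.0], [1.0, 0.0], [2.0, 2.0])

# Predictions for the two test inputs used in the example.
pred_neg = intercept + slope * -1.0
pred_pos = intercept + slope * 1.0
```

Because both rows carry the same weight and regParam is 0.0, the fit passes through both points exactly, which is why the example expects predictions of -1.0 and 1.0.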

getBalanced()

Gets the value of balanced or its default value.

getDual()

Gets the value of dual or its default value.

getGpuMemLimit()

Gets the value of gpuMemLimit or its default value.

getNthreads()

Gets the value of nthreads or its default value.

getTrainingHistory()

Gets the value of trainingHistory or its default value.

getUseGpu()

Gets the value of useGpu or its default value.

getVerbose()

Gets the value of verbose or its default value.

setBalanced(value)

Sets the value of balanced.

setDual(value)

Sets the value of dual.

setGpuMemLimit(value)

Sets the value of gpuMemLimit.

setNthreads(value)

Sets the value of nthreads.

setParams(featuresCol="features", labelCol="label", predictionCol="prediction", trainingHistory=0, maxIter=1000, regParam=0.0, elasticNetParam=0.0, tol=1e-6, solver="auto", weightCol=None, useGpu=False, dual=True, balanced=False, nthreads=-1, gpuMemLimit=0, verbose=False)

Sets params for linear regression.

setTrainingHistory(value)

Sets the value of trainingHistory.

setUseGpu(value)

Sets the value of useGpu.

setVerbose(value)

Sets the value of verbose.