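PySpark mllib logistic regression error: 'list' object has no attribute 'first'

I am trying to train a simple logistic regression model. My training data, the model call, and the error message are shown below. Why do I get the error "'list' object has no attribute 'first'"?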
train_data = numdata.collect()
train_data[:3]
[LabeledPoint(1.0, [2.0,36.0,0.0,100.0,100.0,38.0,0.0,100.0,95.0,100.0,100.0]),
LabeledPoint(1.0, [0.0,77.0,16.0,100.0,99.0,86.0,1.0,99.0,100.0,99.0,95.0]),
LabeledPoint(1.0, [0.0,22.0,0.0,100.0,95.0,21.0,1.0,95.0,100.0,100.0,100.0])]
lrm = LogisticRegressionWithSGD.train(train_data)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in()
----> 1 lrm = LogisticRegressionWithSGD.train(train_data)

C:\spark-2.0.1-bin-hadoop2.7\python\pyspark\mllib\classification.pyc in train(cls, data, iterations, step, miniBatchFraction, initialWeights, regParam, regType, intercept, validateData, convergenceTol)
    319                 bool(intercept), bool(validateData), float(convergenceTol))
    320
--> 321         return _regression_train_wrapper(train, LogisticRegressionModel, data, initialWeights)
    322
    323

C:\spark-2.0.1-bin-hadoop2.7\python\pyspark\mllib\regression.pyc in _regression_train_wrapper(train_func, modelClass, data, initial_weights)
    206 def _regression_train_wrapper(train_func, modelClass, data, initial_weights):
    207     from pyspark.mllib.classification import LogisticRegressionModel
--> 208     first = data.first()
    209     if not isinstance(first, LabeledPoint):
    210         raise TypeError("data should be an RDD of LabeledPoint, but got %s" % type(first))
AttributeError: 'list' object has no attribute 'first'
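
The traceback points at the likely cause: `_regression_train_wrapper` calls `data.first()`, which is a method of an RDD, while `numdata.collect()` returns an ordinary Python list. The sketch below shows one way to guard against this. It assumes `sc` is an existing SparkContext and that `numdata` is an RDD of `LabeledPoint`, as the output above suggests; `as_rdd` and `train_logistic` are hypothetical helper names introduced only for illustration and are not part of PySpark.

```python
def as_rdd(sc, data):
    """Return `data` unchanged if it already looks like an RDD,
    otherwise distribute it with sc.parallelize().

    Hypothetical helper: `sc` is assumed to be an existing SparkContext.
    """
    # RDDs expose first(); a plain list (e.g. the result of collect()) does not.
    if hasattr(data, "first"):
        return data
    return sc.parallelize(list(data))


def train_logistic(sc, data):
    """Train on an RDD of LabeledPoint, accepting either an RDD or a list.

    Hypothetical wrapper around the train() call from the question.
    """
    # Imported lazily so that as_rdd() can be used without Spark installed.
    from pyspark.mllib.classification import LogisticRegressionWithSGD

    return LogisticRegressionWithSGD.train(as_rdd(sc, data))
```

With the variables from the question, this could be called as `lrm = train_logistic(sc, train_data)`. The simpler route is probably to skip the `collect()` step and pass the RDD directly: `lrm = LogisticRegressionWithSGD.train(numdata)`. If a peek at the data is all that is needed, `numdata.take(3)` returns the first three elements without replacing the RDD.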