I am trying to run a Welch Two Sample t-test on two variables in a pandas DataFrame. Both variables are strings. I am using a Jupyter notebook.
Calling scipy.stats.ttest_ind() raises TypeError: unsupported operand type(s) for /: 'unicode' and 'int'
camis | cuisine_description | dba | boro | zipcode | record_date | inspection_date | score | grade | critical_flag | action | violation_code | violation_description | inspection_type
40372466 | American | MURALS ON 54/RANDOLPHS'S | MANHATTAN | 10019 | 2017-04-26T06:00:59.000 | 2016-03-10T00:00:00.000 | 10 | A | Critical | Violations were cited in the following area(s). | 02H | Food not cooled by an approved method whereby ... | Cycle Inspection/Re-inspection
50 | Jewish/Kosher | SUSHI FUSSION | QUEENS | 11375 | 2017-04-26T06:00:59.000 | 2015-12-08T00:00:00.000 | 20 | B | Not Critical | Violations were cited in the following area(s). | 10I | Single service item reused, improperly stored,... | Cycle Inspection/Re-inspection
41028194 | Chinese | SAI'S CAFE | BROOKLYN | 11219 | 2017-04-26T06:00:59.000 | 2015-01-02T00:00:00.000 | 13 | A | Not Critical | Violations were cited in the following area(s). | 10I | Single service item reused, improperly stored,... | Cycle Inspection/Re-inspection
TypeError Traceback (most recent call last)
<ipython-input-228-5ba9bcaf819c> in <module>()
1 from scipy import stats
----> 2 print(scipy.stats.ttest_ind(gradeRm['inspection_type'], gradeRm['grade']))
/Users/sharonmorris/anaconda/lib/python2.7/site-packages/scipy/stats/stats.pyc in ttest_ind(a, b, axis, equal_var, nan_policy)
4058 return Ttest_indResult(np.nan, np.nan)
4059
-> 4060 v1 = np.var(a, axis, ddof=1)
4061 v2 = np.var(b, axis, ddof=1)
4062 n1 = a.shape[axis]
/Users/sharonmorris/anaconda/lib/python2.7/site-packages/numpy/core/fromnumeric.pyc in var(a, axis, dtype, out, ddof, keepdims)
3124
3125 return _methods._var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
-> 3126 **kwargs)
/Users/sharonmorris/anaconda/lib/python2.7/site-packages/numpy/core/_methods.pyc in _var(a, axis, dtype, out, ddof, keepdims)
103 if isinstance(arrmean, mu.ndarray):
104 arrmean = um.true_divide(
--> 105 arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
106 else:
107 arrmean = arrmean.dtype.type(arrmean/rcount)
TypeError: unsupported operand type(s) for /: 'unicode' and 'int'
You are passing unicode strings to a math function. Depending on what the strings look like, you probably just need to change the dtype. Can you post a snippet of your dataframe? – pshep123
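To illustrate the dtype suggestion above, here is a minimal sketch, not a definitive fix. It assumes the intent is to compare a numeric column (`score`) between two groups defined by a categorical column (`grade`). The small `gradeRm` frame is a made-up stand-in, and its `score` values are deliberately stored as strings to mimic the problem. `pd.to_numeric` converts the column, and `equal_var=False` asks `scipy.stats.ttest_ind` for Welch's variant.

```python
import pandas as pd
from scipy import stats

# Hypothetical stand-in for gradeRm; the values are made up for illustration.
gradeRm = pd.DataFrame({
    "score": ["10", "20", "13", "12", "27", "9"],  # numbers stored as strings
    "grade": ["A", "B", "A", "A", "B", "A"],
})

# Convert the string column to a numeric dtype; unparseable values become NaN.
gradeRm["score"] = pd.to_numeric(gradeRm["score"], errors="coerce")

# Split the numeric variable by the categorical one (assumed grouping).
scores_a = gradeRm.loc[gradeRm["grade"] == "A", "score"].dropna()
scores_b = gradeRm.loc[gradeRm["grade"] == "B", "score"].dropna()

# equal_var=False requests Welch's t-test rather than Student's t-test.
result = stats.ttest_ind(scores_a, scores_b, equal_var=False)
print(result)
```

Columns such as `inspection_type` and `grade` hold category labels rather than measurements, so converting their dtype alone would not give the t-test anything numeric to work with. That is why this sketch groups by `grade` instead of passing it as a sample.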
Normalize the gradeRm['inspection_type'] and gradeRm['grade'] data. For example: import unicodedata, then unicodedata.normalize('NFKD', gradeRm['inspection_type']).encode('ascii', 'ignore') – vikasmcajnu
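A hedged sketch of the normalization idea from the comment above: `unicodedata.normalize` accepts a single string, so with a pandas Series it would have to be applied element by element. The sample values and the `to_ascii` helper are hypothetical.

```python
import unicodedata

import pandas as pd

# Hypothetical sample values standing in for gradeRm['inspection_type'].
inspection_type = pd.Series([
    u"Cycle Inspection/Re-inspection",
    u"Caf\u00e9 Inspection",
])

def to_ascii(text):
    """Decompose accented characters, then drop anything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

# Apply the helper to each element of the Series.
cleaned = inspection_type.map(to_ascii)
print(cleaned.tolist())
```

The cleaned values are still text, so this step alone does not produce the numeric input that `ttest_ind` needs.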