2017-11-13 11 views
0

であることを確認しましたが、同様の質問がありますが、単純なことを理解できません。私は異なる野球の位置の平均給与を計算しようとしています。NaNを使用した結果、計算手段が

library(Lahman) 
library(tidyverse) 

data("Fielding") 
data(Salaries) 

# First, I need to merge two datasets 
merged.df <- merge(Fielding, Salaries, by = "playerID", na.rm = TRUE) 
merged.df.2002 <- merged.df[merged.df$yearID.x == "2002",] 

# Let's try tapply 
mean.salary <- tapply(merged.df.2002$POS, merged.df.2002$salary, mean, na.rm = TRUE) 
# So it gives me an error 
# In mean.default(X[[i]], ...) : 
# argument is not numeric or logical: returning NA 

class(merged.df.2002$POS) 
class(merged.df.2002$salary) 

# Very likely POS column is factor for some reason. 
# Coerce them through 
merged.df.2002$POS <- as.numeric(as.character(merged.df.2002$POS)) 
# Warning message: 
# NAs introduced by coercion 
merged.df.2002$salary <- as.numeric(as.character(merged.df.2002$salary)) 
#as.numeric(merged.df.2002$salary) 
class(merged.df.2002$salary) 

# Let's try tapply again 
mean.salary <- tapply(merged.df.2002$POS, merged.df.2002$salary, mean, na.rm 
= TRUE) 
mean.salary 

60000 62500 63500 65000 67000 67500 68000 68750 70000 
71000 72500 77500 78000 80000 82000 82500 
NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  
NaN  NaN  NaN  NaN  NaN  NaN 

どうもありがとう!

+5

を台無しに。特にmean(NA、na.rm = TRUE)はNaNです。 – Hugh

+1

関連性が低い: 'na.rm'は' merge'関数の引数ではありません。 – lmo

+0

@Hugh、あなたの超高速応答のために多くのありがとう、だからこの流れで私の間違いはありますか? – Oleksiy

答えて

0

まあ、それは簡単だったが、私は空のセットの平均値がNaNである

merged.df <- merge(Fielding, Salaries) 
# So, my mistake was that I merged only by playerid. 

merged.df.2002 <- merged.df[merged.df$yearID == 2002, ] 
# we pick 2002 year from the merged dataset 

# use tapply for mean 
merged.df.mean <- tapply(merged.df.2002$salary, merged.df.2002$POS, mean, 
na.rm = TRUE) 

#  1B  2B  3B  C  OF  P  SS 
# 2543845 1400543 1547836 1787933 2659230 2150887 1979732 
関連する問題