2017-11-07

Find all levels of factor variables

I am using the famous "Adult" dataset (<a href="https://archive.ics.uci.edu/ml/datasets/adult" rel="nofollow noreferrer">here</a>) from the UCI Machine Learning Repository.

The dataset looks like this:

Observations: 32,561 
Variables: 15 
$ AGE   <int> 39, 50, 38, 53, 28, 37, 49, 52, 31, 42, 37, 30, 23, 32, 40, 34, 25, 32, 38, 43, 40, 54, 35, 43, 59, ... 
$ WORKCLASS  <chr> "State-gov", "Self-emp-not-inc", "Private", "Private", "Private", "Private", "Private", "Self-emp-no... 
$ FNLWGT  <int> 77516, 83311, 215646, 234721, 338409, 284582, 160187, 209642, 45781, 159449, 280464, 141297, 122272,... 
$ EDUCATION  <chr> "Bachelors", "Bachelors", "HS-grad", "11th", "Bachelors", "Masters", "9th", "HS-grad", "Masters", "B... 
$ EDUCATIONNUM <int> 13, 13, 9, 7, 13, 14, 5, 9, 14, 13, 10, 13, 13, 12, 11, 4, 9, 9, 7, 14, 16, 9, 5, 7, 9, 13, 9, 10, 9... 
$ MARITALSTATUS <chr> "Never-married", "Married-civ-spouse", "Divorced", "Married-civ-spouse", "Married-civ-spouse", "Marr... 
$ OCCUPATION <chr> "Adm-clerical", "Exec-managerial", "Handlers-cleaners", "Handlers-cleaners", "Prof-specialty", "Exec... 
$ RELATIONSHIP <chr> "Not-in-family", "Husband", "Not-in-family", "Husband", "Wife", "Wife", "Not-in-family", "Husband", ... 
$ RACE   <chr> "White", "White", "White", "Black", "Black", "White", "Black", "White", "White", "White", "Black", "... 
$ SEX   <chr> "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Male", "Female", "Male", "Male", "Mal... 
$ CAPITALGAIN <int> 2174, 0, 0, 0, 0, 0, 0, 0, 14084, 5178, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 
$ CAPITALLOSS <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2042, 0, 0, 0, 0, 0, 0, 0, 0, 1... 
$ HOURSPERWEEK <int> 40, 13, 40, 40, 40, 40, 16, 45, 50, 40, 80, 40, 30, 50, 40, 45, 35, 40, 50, 45, 60, 20, 40, 40, 40, ... 
$ NATIVECOUNTRY <chr> "United-States", "United-States", "United-States", "United-States", "Cuba", "United-States", "Jamaic... 
$ ABOVE50K  <int> 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0... 

I converted all variables of class character to factors using:

df <- df %>% mutate_if(is_character, as.factor) 

Now I want to check the levels of each factor variable. I can do this with:

levels(df$WORKCLASS)
levels(df$EDUCATION)
levels(df$MARITALSTATUS)

and so on for each factor variable. Surely there must be an easier way to do this with the purrr package. Something like:

df %>% map_if(is.factor, levels) 

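Unfortunately, this also returns the columns of class integer.

How do I write the map call so that it only considers the factor variables and returns their levels? Thanks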

I found that this works: df %>% select_if(is.factor) %>% map_if(is.factor, levels). However, I cannot understand why df2 %>% select_if(is.factor) %>% map(is.factor, levels) or df %>% map_if(is.factor, levels) do not. – Franky
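
A likely explanation for the behaviour described in the comment above, sketched on a hypothetical toy data frame (toy and its values are made up for illustration and are not the Adult data): map_if() applies the function only to the elements that satisfy the predicate and returns every other element unchanged, so the integer columns come through as they are. map() has no predicate argument at all; its second argument is the function to apply, and anything after it is passed on to that function as an extra argument.

```r
library(purrr)

# Hypothetical toy data: one integer column, one factor column
toy <- data.frame(
  AGE = c(39L, 50L, 38L),
  SEX = factor(c("Male", "Female", "Male"))
)

# map_if() transforms only the elements matching the predicate
# and passes every other element through unchanged
res <- map_if(toy, is.factor, levels)

names(res)  # both columns are still present
res$AGE     # the untouched integer vector
res$SEX     # the levels of the factor column
```

Under this reading, filtering the columns first (for example with select_if(is.factor)) is what removes the non-factor columns; map_if() on its own never drops anything.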

Answers

Hope this helps:

library(dplyr)
# keep only the factor columns, then list the levels of each
df %>% select_if(is.factor) %>%
  sapply(levels)
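
Since the question asks for a purrr approach, here is one possible variant, offered as a sketch rather than as the answerer's method. It assumes purrr is installed and uses a hypothetical toy data frame (toy, lvls and simplified are illustrative names). keep() retains only the elements that satisfy the predicate, and map() always returns a list, whereas sapply() may simplify the result to a matrix when every factor happens to have the same number of levels.

```r
library(purrr)

# Hypothetical toy data: one integer column and two factor columns
toy <- data.frame(
  AGE  = c(39L, 50L, 38L),
  SEX  = factor(c("Male", "Female", "Male")),
  RACE = factor(c("White", "Black", "White"))
)

# keep() drops the non-factor columns; map() then returns a named list
# with one character vector of levels per factor column
lvls <- map(keep(toy, is.factor), levels)

# For comparison: sapply() simplifies to a character matrix here,
# because both factors have exactly two levels
simplified <- sapply(Filter(is.factor, toy), levels)
```

With the pipe used elsewhere in this thread, the same call could be written as toy %>% keep(is.factor) %>% map(levels).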

Thank you. This works better than df %>% map_if(is.factor, levels), which returns every variable rather than only the levels of the factor variables: df %>% select_if(is.factor) %>% sapply(levels) – Franky

https://stackoverflow.com/questions/17907944/how-to-select-all-factor-variables-in-r – nachti