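2017-03-08

Uniting multiple columns with tidyr's unite by matching similar column names

Below is the code for a simple data frame. I have unwieldy exported data in which the categories of each variable are spread across separate columns.

library(tidyr) 
library(dplyr) 
library(tidyverse) 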

Client <- c("Client1", "Client2", "Client3", "Client4", "Client5")
Sex_M <- c("Male", "NA", "Male", "NA", "Male")
Sex_F <- c(" ", "Female", " ", "Female", " ")
Satisfaction_Satisfied <- c("Satisfied", " ", " ", "Satisfied", "Satisfied")
Satisfaction_VerySatisfied <- c(" ", "VerySatisfied", "VerySatisfied", " ", " ")
CommunicationType_Email <- c("Email", " ", " ", "Email", "Email")
CommunicationType_Phone <- c(" ", "Phone ", "Phone ", " ", " ")
# tibble() replaces the deprecated data_frame()
DF <- tibble(Client, Sex_M, Sex_F, Satisfaction_Satisfied, Satisfaction_VerySatisfied, CommunicationType_Email, CommunicationType_Phone)

I am using tidyr's unite to recombine the categories into a single column.

DF<-DF%>%unite(Sat,Satisfaction_Satisfied,Satisfaction_VerySatisfied,sep=" ")%>% 
unite(Sex,Sex_M,Sex_F,sep=" ") 

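However, I have to write multiple unite lines, and I feel this violates the three times rule, so there has to be a way to make this easier, especially since my real data contains dozens of columns that need to be combined. Is there a way to use unite once and somehow refer to matching column names, so that all columns with similar names (those containing "Sex" for "Sex_M" and "Sex_F", and "CommunicationType" for "CommunicationType_Email" and "CommunicationType_Phone") are united with the formula above?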

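I also thought about writing a function that lets me pass in column names, but that is too difficult for me because it involves complicated standard evaluation.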

+0

'DF %>% unite(Sat, contains("Sat"))'? – Nate

+0

'DF %>% unite(Sat, matches("^Sat"))' – akrun

Answers

1

We can use unite:

library(tidyverse) 
DF %>% unite(Sat, matches("^Sat")) 

For multiple cases, perhaps:

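gather(DF, Var, Val, -Client, na.rm = TRUE) %>% 
     # split e.g. "Sex_M" into Var1 = "Sex" and Var2 = "M"
     separate(Var, into = c("Var1", "Var2")) %>% 
     # the sample data uses " " and the string "NA" as blanks, so trim before filtering
     mutate(Val = trimws(Val)) %>% 
     group_by(Client, Var1) %>% 
     summarise(Val = paste(Val[!(is.na(Val) | Val %in% c("", "NA"))], collapse = "_")) %>% 
     spread(Var1, Val) 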
#   Client CommunicationType  Satisfaction    Sex
#*   <chr>             <chr>         <chr>  <chr>
#1 Client1             Email     Satisfied   Male
#2 Client2             Phone VerySatisfied Female
#3 Client3             Phone VerySatisfied   Male
#4 Client4             Email     Satisfied Female
#5 Client5             Email     Satisfied   Male
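
For readers on a newer tidyr, where gather and spread are superseded, the same reshape-and-collapse idea can be sketched with pivot_longer and pivot_wider. This is only an illustration, not the answerer's code: it assumes tidyr 1.0 or later and dplyr 1.0 or later, and it assumes blank strings and the literal string "NA" in the sample data both mean "no value". The name combined is hypothetical.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
DF <- tibble(
  Client = c("Client1", "Client2", "Client3", "Client4", "Client5"),
  Sex_M = c("Male", "NA", "Male", "NA", "Male"),
  Sex_F = c(" ", "Female", " ", "Female", " "),
  Satisfaction_Satisfied = c("Satisfied", " ", " ", "Satisfied", "Satisfied"),
  Satisfaction_VerySatisfied = c(" ", "VerySatisfied", "VerySatisfied", " ", " "),
  CommunicationType_Email = c("Email", " ", " ", "Email", "Email"),
  CommunicationType_Phone = c(" ", "Phone ", "Phone ", " ", " ")
)

combined <- DF %>%
  # "Sex_M" becomes Var1 = "Sex", Var2 = "M", with the cell value in Val
  pivot_longer(-Client, names_to = c("Var1", "Var2"), names_sep = "_",
               values_to = "Val") %>%
  mutate(Val = trimws(Val)) %>%
  # assumption: "" and the string "NA" both stand for a missing value
  filter(!is.na(Val), !Val %in% c("", "NA")) %>%
  group_by(Client, Var1) %>%
  summarise(Val = paste(Val, collapse = "_"), .groups = "drop") %>%
  # one column per category prefix
  pivot_wider(names_from = Var1, values_from = Val)

print(combined)
```

Splitting on names_sep = "_" relies on every column name containing exactly one underscore, which holds for the sample data but may not hold for real exports.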
+1

Thank you, the multiple-case answer works great! – Mike

0

Something like this? For when there are many columns.

result<-with(new.env(),{ 
    Client<-c("Client1","Client2","Client3","Client4","Client5") 
    Sex_M<-c("Male","NA","Male","NA","Male") 
    Sex_F<-c(" ","Female"," ","Female"," ") 
    Satisfaction_Satisfied<-c("Satisfied"," "," ","Satisfied","Satisfied") 
    Satisfaction_VerySatisfied<-c(" ","VerySatisfied","VerySatisfied"," "," ") 
    CommunicationType_Email<-c("Email"," "," ","Email","Email") 
    CommunicationType_Phone<-c(" ","Phone ","Phone "," "," ") 
    x<-ls() 
    categories<-unique(sub("(.*)_(.*)", "\\1", x)) 
    df<-setNames(data.frame(lapply(x, function(y) get(y))), x) 
    for(nm in categories){ 
    df<-unite_(df, nm, x[contains(vars = x, match = nm)]) 
    } 
    return(df) 
}) 

   Client CommunicationType   Satisfaction       Sex
1 Client1            Email_     Satisfied_     _Male
2 Client2            _Phone _VerySatisfied Female_NA
3 Client3            _Phone _VerySatisfied     _Male
4 Client4            Email_     Satisfied_ Female_NA
5 Client5            Email_     Satisfied_     _Male
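
The loop-over-prefixes idea can also be sketched against the data frame directly, without the deprecated unite_. This is a hedged variant, not the answerer's code: it assumes tidyr 1.0 or later (for the na.rm argument of unite) and dplyr 1.0 or later (for across), and it treats blank strings and the literal string "NA" as missing so that the stray underscores disappear. The names clean, prefixes and result are hypothetical.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
DF <- tibble(
  Client = c("Client1", "Client2", "Client3", "Client4", "Client5"),
  Sex_M = c("Male", "NA", "Male", "NA", "Male"),
  Sex_F = c(" ", "Female", " ", "Female", " "),
  Satisfaction_Satisfied = c("Satisfied", " ", " ", "Satisfied", "Satisfied"),
  Satisfaction_VerySatisfied = c(" ", "VerySatisfied", "VerySatisfied", " ", " "),
  CommunicationType_Email = c("Email", " ", " ", "Email", "Email"),
  CommunicationType_Phone = c(" ", "Phone ", "Phone ", " ", " ")
)

# Assumption: blanks and the string "NA" are converted to real missing values
clean <- DF %>%
  mutate(across(-Client, ~ na_if(na_if(trimws(.x), ""), "NA")))

# Category prefixes: everything before the first underscore, e.g. "Sex"
prefixes <- unique(sub("_.*$", "", setdiff(names(clean), "Client")))

result <- clean
for (p in prefixes) {
  # unite every column starting with "<prefix>_" into one column named <prefix>
  result <- unite(result, !!p, starts_with(paste0(p, "_")), sep = "", na.rm = TRUE)
}

print(result)
```

Because na.rm = TRUE drops missing pieces before pasting, each united cell holds only the category that was actually present.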