2017-01-11
0

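You can reproduce the problem by creating the following dataset. It contains duplicate module/file names. I want to remove the duplicated values in one column and bring back the latest value from another column.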

owaspSample <- data.frame(Module=c("AccessDetails.java","AccessDiverse.java","BgField.java","BgStatus.java","CmdDate.java","CmdGameDate.java","CommentDate.java","CostDate.java","EntranceDetails.java","GameDate.java","LdPopDate.java","LeaseCostDate.java","PastApprovalDate.java","ProvisioningDate.java","ReservationDate.java","RefDate.java","ServiceDate.java","StatusDate.java","ProfileDate.java","UpdateCmdDate.java","ViewDate.java","AccessDetails.java","AccessDiverse.java","AuthenticationDate.java","CmdDate.java","CmdSummaryDate.java","CmdViewDate.java","ChangeOrderDate.java","CommentDate.java","CostDate.java","GameDate.java","LdPopDate.java","LeaseCostDate.java","PastApprovalDate.java","ReservationDate.java","RefDate.java","UnderwaterCmdDate.java","WaveDate.java","XmlFormatter.java"), 
Category = c("SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","SQL Injection","XML External Entity Injection"), 
scanDate=c("2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-23","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24","2016-10-24"), 
VulnCount = c("13","15"," 1"," 3","15"," 2","11","30"," 2"," 2"," 2"," 2"," 4"," 2"," 3"," 9"," 1"," 1"," 1"," 8"," 6","25","28"," 3","30"," 1"," 6"," 5","20","23"," 3"," 3"," 4","10"," 3","17"," 1"," 3"," 2"), 
Owasp = c("A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A00-SQL Injection","A01-Injection")) 

Running the command below removes the duplicates and seems to work. However, I want to be able to return the duplicate with the latest date, and the date must be dynamic.

owaspSample <- owaspSample[!duplicated(owaspSample$Module),] 

For example, if you have this:

Module               Category      Date       VulnCount Owasp
CostDate.java        SQL Injection 2016-10-23 30        A00-SQL Injection
EntranceDetails.java SQL Injection 2016-10-23 2         A00-SQL Injection
GameDate.java        SQL Injection 2016-10-23 2         A00-SQL Injection
CostDate.java        SQL Injection 2016-10-24 23        A00-SQL Injection
GameDate.java        SQL Injection 2016-10-24 3         A00-SQL Injection

The expected output should be this:

Module               Category      Date       VulnCount Owasp
EntranceDetails.java SQL Injection 2016-10-23 2         A00-SQL Injection
CostDate.java        SQL Injection 2016-10-24 23        A00-SQL Injection
GameDate.java        SQL Injection 2016-10-24 3         A00-SQL Injection

Any ideas on how to do that?

+1

See the 'fromLast' argument of 'duplicated'. – nicola

+0

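Nicola, thank you. At least that returns the modules based on the latest date. However, it removes the files that are not duplicated. This dataset makes a good test. I figured out what I was doing wrong –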

Answers

0

I used nicola's suggestion and added another line of code using unique, and it does not lose the non-duplicated file names.

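# Drop fully identical rows. Indexing rows with unique(owaspSample$Module)
# would treat the module names as row labels, not as a filter.
owaspSample <- unique(owaspSample)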

owaspSample <- owaspSample[!duplicated(owaspSample$Module, fromLast = TRUE),] 

I thought they would do the same thing, but together they give me the expected result.
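Note that `fromLast = TRUE` keeps whichever row appears last for each module, so it only returns the latest scan if the rows are already in date order. Here is a minimal sketch that sorts by date first, so the result does not depend on the input order. The data frame `scans` and the name `latest` are made up for illustration and are not the poster's data.

```r
# Hypothetical toy data, deliberately NOT in date order
scans <- data.frame(
  Module    = c("CostDate.java", "GameDate.java", "CostDate.java",
                "EntranceDetails.java", "GameDate.java"),
  scanDate  = c("2016-10-24", "2016-10-23", "2016-10-23",
                "2016-10-23", "2016-10-24"),
  VulnCount = c(23, 2, 30, 2, 3),
  stringsAsFactors = FALSE
)

# Sort by date so the newest scan of each module comes last
scans <- scans[order(as.Date(scans$scanDate)), ]

# Keep the last (newest) row per module; modules that occur once are kept too
latest <- scans[!duplicated(scans$Module, fromLast = TRUE), ]
latest
```

Because nothing is hard-coded, the same two lines keep working when scans from a newer date are appended.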

0

You can do this with dplyr. After grouping by 'Module', take the last row of each group with slice(n()).

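library(dplyr)
# Keep the last row of each Module group; this is the latest scan
# only if the rows are already ordered by scanDate, as in the sample data
owaspSample %>%
  group_by(Module) %>%
  slice(n())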
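If the rows might not already be ordered by date, one option is to sort before grouping. This is a sketch under that assumption, using a made-up data frame `scans` (hypothetical, not the poster's data) and only dplyr verbs that appear above plus `arrange()` and `ungroup()`.

```r
library(dplyr)

# Hypothetical toy data, deliberately NOT in date order
scans <- data.frame(
  Module    = c("CostDate.java", "GameDate.java", "CostDate.java",
                "EntranceDetails.java", "GameDate.java"),
  scanDate  = c("2016-10-24", "2016-10-23", "2016-10-23",
                "2016-10-23", "2016-10-24"),
  VulnCount = c(23, 2, 30, 2, 3),
  stringsAsFactors = FALSE
)

latest <- scans %>%
  arrange(as.Date(scanDate)) %>%  # oldest first, newest last
  group_by(Module) %>%
  slice(n()) %>%                  # last row of each group = newest scan
  ungroup()

latest
```

Calling `ungroup()` at the end returns a plain tibble, so later steps are not accidentally applied per module.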