Quadratic assignment procedure (QAP) in R is producing different results

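I would like to thank in advance anyone who looks at my question and shares their thoughts and experience. I am trying to run a quadratic assignment procedure (QAP) on the correlations between behaviors in a community of five individuals. I have 10 matrices representing the frequency of behaviors between individuals, and I calculated the correlation (Pearson's r) between pairs of matrices: for example, between matrix 1 and matrix 2, matrix 2 and matrix 3, matrix 3 and matrix 4, and so on. I then wanted to assess the significance of these correlations using the qaptest function from the R package sna.

Following the R documentation for qaptest, I placed all of the matrices in an array and then calculated the QAP p-value between pairs of matrices (matrix 1 and matrix 2, matrix 2 and matrix 3, etc.). However, I noticed that if I changed the number of matrices in the array (for example, if I placed only the first five in the array), the QAP p-values for the first matrices changed dramatically. Based on my understanding of arrays and QAP, this should not happen, because the removed matrices have nothing to do with running the QAP test on matrix 1 and matrix 2. I have included my matrices and script below.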

Here are my matrices in list format. (In the code below, this is the stage at which filelist1 has been created; the second half of the code uses only matrices 1-5.)

[[1]] 
    1 2 3 4 5 
1 1 0 0 0 0 
2 5 0 3 5 0 
3 0 0 0 0 0 
4 0 0 0 0 0 
5 2 0 1 0 0 

[[2]] 
    1 2 3 4 5 
1 0 0 1 0 0 
2 3 6 10 1 2 
3 0 0 0 0 0 
4 0 5 0 0 0 
5 0 0 5 0 0 

[[3]] 
    1 2 3 4 5 
1 0 1 0 0 0 
2 2 0 5 7 0 
3 0 0 0 0 3 
4 1 0 0 0 0 
5 1 2 2 3 0 

[[4]] 
    1 2 3 4 5 
1 0 6 0 0 2 
2 2 0 8 5 0 
3 0 5 0 0 0 
4 1 0 0 0 0 
5 0 0 1 3 2 

[[5]] 
    1 2 3 4 5 
1 0 0 0 0 0 
2 1 0 2 5 1 
3 0 0 0 0 0 
4 1 2 3 0 1 
5 0 3 3 1 0 

[[6]] 
    1 2 3 4 5 
1 0 0 0 0 0 
2 2 0 3 0 3 
3 0 0 0 0 0 
4 1 0 4 0 0 
5 1 5 7 0 0 

[[7]] 
    1 2 3 4 5 
1 0 0 0 0 0 
2 2 0 6 0 3 
3 0 0 0 0 0 
4 6 0 4 0 0 
5 1 0 2 0 0 

[[8]] 
    1 2 3 4 5 
1 0 0 0 1 0 
2 2 0 1 6 0 
3 0 0 0 0 0 
4 0 0 0 0 0 
5 6 0 2 2 0 

[[9]] 
    1 2 3 4 5 
1 0 0 0 0 0 
2 0 0 2 3 2 
3 0 0 0 0 0 
4 0 0 0 0 0 
5 1 0 2 0 0 

[[10]] 
    1 2 3 4 5 
1 0 0 0 0 0 
2 1 0 1 1 0 
3 0 0 0 0 0 
4 0 0 0 0 0 
5 6 0 1 2 0

Here is my R script:

library(sna) # provides qaptest() and gcor()

# read in all ten of the matrices 
a<-read.csv("test1.csv") 
b<-read.csv("test2.csv") 
c<-read.csv("test3.csv") 
d<-read.csv("test4.csv") 
e<-read.csv("test5.csv") 
f<-read.csv("test6.csv") 
g<-read.csv("test7.csv") 
h<-read.csv("test8.csv") 
i<-read.csv("test9.csv") 
j<-read.csv("test10.csv") 

filelist<-list(a,b,c,d,e,f,g,h,i,j) #place files in a list 
filelist1<-lapply(filelist,function(x){ 
    x<-x[1:5, 2:6] #choose only columns in the matrix 
    colnames(x)<-1:5 #rename columns according to identity 
    x<-as.matrix(x) #make a matrix 
    return(x) 
}) 

ee<-array(dim=c(5,5,10)) #create an empty array 

array<-function(files) { 
    names(files) <- c("c1","c2","c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10") #name the matrices 
    invisible(lapply(names(files), function(x) assign(x,files[[x]],envir=.GlobalEnv))) #place the matrices in a global environment 
    ee[,,1]<-c(c1) #place each matrix in order into the array 
    ee[,,2]<-c(c2) 
    ee[,,3]<-c(c3) 
    ee[,,4]<-c(c4) 
    ee[,,5]<-c(c5) 
    ee[,,6]<-c(c6) 
    ee[,,7]<-c(c7) 
    ee[,,8]<-c(c8) 
    ee[,,9]<-c(c9) 
    ee[,,10]<-c(c10) 
    return(ee) #return the completely filled in array 
} 

a.array<-array(filelist1) # apply the function to the list of matrices 

q1.2<-qaptest(a.array,gcor,g1=1,g2=2) #run the qaptest function 
#a.array is the array with the matrices, gcor tells the function that we want a correlation 
#g1=1 and g2=2 indicate that the QAP analysis should be run between the first and second matrices in the array. 
summary.qaptest(q1.2) #provides a summary of the QAP results 
#in this case, the p-value is roughly: p(f(perm) >= f(d)): 0.176 

############ If I take out the last five matrices, the q1.2 p-value changes dramatically 
#first clear the memory or R will not create another blank array 
rm(list = ls()) 

a<-read.csv("test1.csv") #read in all five files 
b<-read.csv("test2.csv") 
c<-read.csv("test3.csv") 
d<-read.csv("test4.csv") 
e<-read.csv("test5.csv") 

filelist<-list(a,b,c,d,e) #create a list of the files 
filelist1<-lapply(filelist,function(x){ 
    x<-x[1:5, 2:6] #include only the matrix 
    colnames(x)<-1:5 #rename the columns 
    x<-as.matrix(x) #make it a matrix 
    return(x) 
}) 

ee<-array(dim=c(5,5,5)) #this time the array only has five slots 

array<-function(files) { 
    names(files) <- c("c1","c2","c3", "c4", "c5") 
    invisible(lapply(names(files), function(x) assign(x,files[[x]],envir=.GlobalEnv))) 
    ee[,,1]<-c(c1) 
    ee[,,2]<-c(c2) 
    ee[,,3]<-c(c3) 
    ee[,,4]<-c(c4) 
    ee[,,5]<-c(c5) 
    return(ee) 
} 

a.array<-array(filelist1) 

q1.2<-qaptest(a.array,gcor,g1=1,g2=2) 
summary.qaptest(q1.2) 
#in this case, the p-value is roughly: p(f(perm) >= f(d)): 0.804 

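I can't think of a reason why the p-values would be so different when I am analyzing exactly the same pair of matrices. The only difference is the number of additional matrices placed in the array. Has anyone else experienced this issue?

Thank you!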

Source

2016-10-25 G. DeO

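qaptest() reads graphs from the first dimension of the array. So ee[,,1]<-c(c1) (etc.) should read ee[1,,]<-c(c1) (etc.). If you place all of the graphs along the first dimension, the qaptests should return the same result. Personally, I would recommend using list() instead of array() with qaptest.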
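To make the indexing point concrete, here is a minimal sketch using matrices 1 and 2 from the question, typed in by hand. The variable names (`m1`, `m2`, `mats`, `stacked`, `graphs`, `wrong_g1`) are illustrative and not from the original script. The first part uses only base R. The last part assumes the sna package is installed and that qaptest() accepts a list of equal-sized matrices, as the answer suggests.

```r
# Matrices 1 and 2 from the question, entered row by row.
m1 <- matrix(c(1, 0,  0, 0, 0,
               5, 0,  3, 5, 0,
               0, 0,  0, 0, 0,
               0, 0,  0, 0, 0,
               2, 0,  1, 0, 0), nrow = 5, byrow = TRUE)
m2 <- matrix(c(0, 0,  1, 0, 0,
               3, 6, 10, 1, 2,
               0, 0,  0, 0, 0,
               0, 5,  0, 0, 0,
               0, 0,  5, 0, 0), nrow = 5, byrow = TRUE)
mats <- list(m1, m2)

# Stacking along the third dimension reproduces the layout used in the question.
stacked <- simplify2array(mats)        # dim: 5 x 5 x 2

# Read along the first dimension, "graph 1" is then stacked[1, , ]:
# the first ROW of every matrix, not matrix 1 itself. Its contents
# change whenever matrices are added to or removed from the array.
wrong_g1 <- stacked[1, , ]             # dim: 5 x 2

# Moving the graph index to the first dimension gives the layout
# described in the answer: graphs[k, , ] is the k-th matrix.
graphs <- aperm(stacked, c(3, 1, 2))   # dim: 2 x 5 x 5
stopifnot(all(graphs[1, , ] == m1), all(graphs[2, , ] == m2))

# Optional: run the QAP test both ways (assumes sna is installed).
if (requireNamespace("sna", quietly = TRUE)) {
  q_array <- sna::qaptest(graphs, sna::gcor, g1 = 1, g2 = 2)
  q_list  <- sna::qaptest(mats,   sna::gcor, g1 = 1, g2 = 2)
  print(q_array$testval)  # observed correlation, array input
  print(q_list$testval)   # observed correlation, list input
  print(summary(q_array))
}
```

Because QAP p-values come from random permutations, they will vary slightly between runs unless a seed is set, but with the graphs on the first dimension they should no longer depend on how many other matrices are in the array.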

Source

2016-10-26 00:15:23 paqmo

Thank you! Placing them in a list instead of an array seemed to do the trick.
