How to count the total number of 'HOH' molecules in each file using Python

#!/usr/bin/python 
import os 
path=os.getcwd() 
print(path)
list_of_filenames=os.listdir(path+'//newfiles')
print(list_of_filenames)
residue=[] 
for f in list_of_filenames: 
     f1=open(path+'//newfiles//'+f).readlines() 
     for line in f1: 
       if line.startswith('HETATM'): 
         res_number=line[22:26] 
         if res_number not in residue and line[17:20]=='HOH': 
           residue.append(res_number) 
         else: 
           continue 
       else: 
         continue 
print(len(residue))

Using the script above, I get the total number of 'HOH' molecules across all files as a single value. However, I need to know how many 'HOH' molecules there are in each file.

Please tell me how this script should be changed to do that.

Source

2017-03-12 IKW

A minimal change to print each occurrence.

residue=[] 
for f in list_of_filenames: 
     f1=open(path+'//newfiles//'+f).readlines() 
     for line in f1: 
       if line.startswith('HETATM'): 
         res_number=line[22:26] 
         if res_number not in residue and line[17:20]=='HOH': 
           residue.append(res_number) 
         else: 
           residue.append(0) # changed: store a 0 for every HETATM line that is not a new HOH residue
       else:
         continue

for r in residue: # print each occurrence
    print(r)

Source

2017-03-12 07:31:52

Thank you. That was a great help. – IKW

The following answer stores every file name in a dictionary and increments a count per file. Try it, and comment if you need any further input.

#!/usr/bin/python
import os

di = {}  # maps each file name to its 'HOH' count

path = os.getcwd()
print(path)
list_of_filenames = os.listdir(os.path.join(path, 'newfiles'))
print(list_of_filenames)
for f in list_of_filenames:
    # Reset per file, so a residue number reused in another file is still counted.
    residue = set()
    di[f] = 0
    with open(os.path.join(path, 'newfiles', f)) as f1:
        for line in f1:
            if line.startswith('HETATM'):
                res_number = line[22:26]
                if res_number not in residue and line[17:20] == 'HOH':
                    residue.add(res_number)
                    di[f] += 1
print(sum(di.values()))  # total over all files
print(di)                # count per file

Source

2017-03-12 07:32:23 Neil

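Thank you very much. You saved me. Thanks again. My salute to you! – IKW

I think my problem was that I wasn't using a dictionary. I'll get used to it now. What do di[f] = 0 and di[f] += 1 mean? – IKW

di[f] is the dictionary element whose key is f. Whatever follows the = sign sets its value. – Neil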
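
To make those two statements concrete, here is a small sketch. The file names `file1.pdb` and `file2.pdb` are placeholders made up for illustration, not files from the question.

```python
# Hypothetical file names, used only to illustrate dictionary assignment.
di = {}

di['file1.pdb'] = 0    # create the key 'file1.pdb' with the value 0
di['file1.pdb'] += 1   # shorthand for di['file1.pdb'] = di['file1.pdb'] + 1
di['file1.pdb'] += 1

di['file2.pdb'] = 0    # a second key with its own, independent counter

print(di['file1.pdb'])  # prints 2
print(di['file2.pdb'])  # prints 0
```

Each key keeps its own value, which is why a dictionary keyed by file name can hold one counter per file.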

Hint: defaultdict.

To track each file separately and cut some boilerplate, use defaultdict(set), which essentially stores one set per file.

#!/usr/bin/python
import collections
import os

path = os.getcwd()
list_of_filenames = os.listdir(path + '//newfiles')
residue = collections.defaultdict(set)  # one set of residue numbers per file
for f in list_of_filenames:
    with open(path + '//newfiles//' + f) as f1:  # close the file
        for line in f1.readlines():
            if line.startswith('HETATM'):
                res_number = line[22:26]
                # set.add() ignores duplicates, so no membership test is needed
                if line[17:20] == 'HOH':
                    residue[f].add(res_number)
print(residue)

Source

2017-03-12 07:35:08 Shuo

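Thank you. I learned something new from this answer. – IKW

@Iromi, would you accept the answer? :) – Shuo

Yes, I accept the answer. But I have one small doubt. The output does not look like what I need. For example, it does not show the total number of residues for each file, something like file1: 1000, file2: 1500, file3: 2000. – IKW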
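
One way to get output in that shape is to take the length of each set. This is a minimal sketch, assuming `residue` is the `defaultdict(set)` built in the answer above. The helper name `count_per_file` and the sample data are made up here so that the snippet runs on its own.

```python
import collections


def count_per_file(residue):
    """Return {file name: number of distinct residue numbers stored for it}."""
    return {name: len(numbers) for name, numbers in residue.items()}


# Made-up sample data standing in for the result of the loop in the answer.
residue = collections.defaultdict(set)
residue['file1.pdb'].update([' 301', ' 302', ' 303'])
residue['file2.pdb'].update([' 301', ' 302'])

counts = count_per_file(residue)
for name in sorted(counts):
    print('%s: %d' % (name, counts[name]))
```

With the real data, replacing the final `print(residue)` in the answer with the last three lines should give one `name: count` line per file.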
