2017-09-22 3 views
0

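I am developing a script that counts elements in a given sequence. I have already found a way to get the counting done, but I was wondering whether a dictionary could somehow be used, and how to print any letters in the string that are not among the ones actually being counted. How can I use a dictionary for the following string?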

sequence = input('Enter DNA sequence:')  # input() already returns a str
print('Your sequence contain:', len(sequence), 'bases', 'with the following structure:')
# Count each base case-insensitively by adding upper- and lowercase hits
adenine = sequence.count("A") + sequence.count("a")
thymine = sequence.count("T") + sequence.count("t")
cytosine = sequence.count("C") + sequence.count("c")
guanine = sequence.count("G") + sequence.count("g")

print("adenine =", adenine)
print("thymine =", thymine)
print("cytosine =", cytosine)
print("guanine =", guanine)

I was thinking of a dictionary like this, for example:

DICC = {"adenine": ["A", "a"], "thymine": ["T", "t"], "cytosine": ["C", "c"], "guanine": ["G", "g"]}
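A minimal sketch of how a dictionary of this shape could drive the counting. It assumes the keys are quoted strings and that each value lists the upper- and lowercase letter for that base; the helper name `count_bases` is hypothetical and not part of the original post.

```python
# Sketch only: `count_bases` is a hypothetical helper name.
DICC = {
    "adenine": ["A", "a"],
    "thymine": ["T", "t"],
    "cytosine": ["C", "c"],
    "guanine": ["G", "g"],
}


def count_bases(sequence):
    """Return {base name: occurrences}, summing over each listed letter."""
    return {
        name: sum(sequence.count(letter) for letter in letters)
        for name, letters in DICC.items()
    }


counts = count_bases("AacGTtxponwxs")
for name, total in counts.items():
    print(name, "=", total)
```

Because the letters live in the dictionary, supporting another symbol would only mean adding an entry rather than writing another pair of `count` calls.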

However, I don't know how to print the letters that are not nucleotides, if there are any. For example, for the following sequence the result should look like this:

sequence = AacGTtxponwxs: 
your sequence contain 13 bases with the following structure: 
adenine = 2 
thymine = 2 
cytosine = 1 
guanine = 1 
p is not a DNA value 
x is not a DNA value 
o is not a DNA value 
n is not a DNA value 
w is not a DNA value 
s is not a DNA value 
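The expected output above lists each unknown character once, even though `x` occurs twice in the sequence. A minimal sketch of one way to get that behaviour, assuming each distinct non-nucleotide character should be reported once in order of first appearance; `report_non_dna` and `DNA_LETTERS` are hypothetical names.

```python
# Sketch only: `report_non_dna` and `DNA_LETTERS` are hypothetical names.
DNA_LETTERS = set("ACGTacgt")


def report_non_dna(sequence):
    """Return one message per distinct non-nucleotide character, in first-seen order."""
    # dict.fromkeys keeps insertion order (Python 3.7+) and drops duplicates.
    unknown = dict.fromkeys(ch for ch in sequence if ch not in DNA_LETTERS)
    return ["{} is not a DNA value".format(ch) for ch in unknown]


for line in report_non_dna("AacGTtxponwxs"):
    print(line)
```

If every occurrence should be reported instead, a plain list comprehension over the sequence would do, at the cost of repeated lines.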

Answers

0

Try this:

sequence = 'AacGTtxponwxs'
adenine = 0
thymine = 0
cytosine = 0
guanine = 0
outputstring = []  # messages for characters that are not nucleotides
for elem in sequence:
    if elem in ('a', 'A'):
        adenine += 1
    elif elem in ('T', 't'):
        thymine += 1
    elif elem in ('C', 'c'):
        cytosine += 1
    elif elem in ('G', 'g'):
        guanine += 1
    else:
        outputstring.append('{} is not a DNA value'.format(elem))
print('your sequence contain {} bases with the following structure:'.format(len(sequence)))
print('adenine =', adenine)
print('thymine =', thymine)
print('cytosine =', cytosine)
print('guanine =', guanine)
print("\n".join(outputstring))

Output:

your sequence contain 13 bases with the following structure:
adenine = 2
thymine = 2
cytosine = 1
guanine = 1
x is not a DNA value
p is not a DNA value
o is not a DNA value
n is not a DNA value
w is not a DNA value
x is not a DNA value
s is not a DNA value
1

Using collections.Counter (which is a dict-like class), you can be more DRY:

from collections import Counter

sequence = 'AacGTtxponwxs'
s = sequence.lower()
bases = ['adenine', 'thymine', 'cytosine', 'guanine']
# The first letter of each base name is its one-letter code
non_bases = [x for x in s if x not in (b[0] for b in bases)]
c = Counter(s)
for base in bases:
    print('{} = {}'.format(base, c[base[0]]))
# adenine = 2
# thymine = 2
# cytosine = 1
# guanine = 1

for n in non_bases:
    print('{} is not a DNA value'.format(n))
# x is not a DNA value
# p is not a DNA value
# o is not a DNA value
# n is not a DNA value
# w is not a DNA value
# x is not a DNA value
# s is not a DNA value
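Since the `Counter` already holds a count for every character, the non-base characters can be read from it directly instead of from a second pass over the string. The following is a sketch under that assumption, not the answerer's code; `BASES` and `summarize` are hypothetical names, and the "seen N time(s)" wording is an illustrative choice.

```python
from collections import Counter

# Sketch only: `BASES` and `summarize` are hypothetical names.
BASES = {"a": "adenine", "t": "thymine", "c": "cytosine", "g": "guanine"}


def summarize(sequence):
    """Split case-insensitive character counts into bases and everything else."""
    counts = Counter(sequence.lower())
    # Counter returns 0 for missing keys, so absent bases are reported as 0.
    base_counts = {name: counts[letter] for letter, name in BASES.items()}
    other_counts = {ch: n for ch, n in counts.items() if ch not in BASES}
    return base_counts, other_counts


base_counts, other_counts = summarize("AacGTtxponwxs")
for name, n in base_counts.items():
    print("{} = {}".format(name, n))
for ch, n in other_counts.items():
    print("{} is not a DNA value (seen {} time(s))".format(ch, n))
```

This reports each unknown character once together with how often it occurred, which avoids printing the same message twice for `x`.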
0
# Are you studying bioinformatics at HAN? I remember this as my assignment lol
# 3 years ago
# str.lower() returns a new string, so the result has to be kept
sequence = input('Enter DNA sequence:').lower()
count_sequence = 0
countA = 0
countT = 0
countG = 0
countC = 0
countNotDNA = 0
for char in sequence:
    count_sequence += 1
    # elif makes the branches exclusive, so only non-ATGC characters reach else
    if char == 'a':
        countA += 1
    elif char == 't':
        countT += 1
    elif char == 'g':
        countG += 1
    elif char == 'c':
        countC += 1
    else:
        countNotDNA += 1

print("sequence is", count_sequence, "characters long containing:","\n", countA, "Adenine","\n", countT, "Thymine","\n", countG, "Guanine","\n", countC, "Cytosine","\n", countNotDNA, "junk bases")

Thank you :)
