Python 3: compressing text by stripping punctuation from an input string and recording word positions

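I am writing a program that takes an input string from the user and strips the punctuation from it using the ord function. It should then compute the position of each word (starting from 1), ignoring repeated words. The compressed sentence then needs to be written to a text file as positions.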

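The problem with my code is that the input string ends up broken into individual characters, so the position count counts characters rather than words. I am sure there is a simple fix, but after 84 versions I have run out of ideas.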
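The symptom can be reproduced in isolation. The sketch below applies the same `ord` filter as the loop in the code that follows (the helper name `strip_like_question` is hypothetical, not part of the program). It suggests that the filter removes spaces as well as punctuation, so no word boundaries survive, and that `range(97, 122)` stops at 121 and therefore also drops the letter z.

```python
def strip_like_question(sentence):
    """Apply the same filter as the question's loop (hypothetical helper)."""
    result = sentence
    for c in sentence:
        # range(97, 122) covers ord('a')..ord('y'); 122 (ord('z')) is excluded
        if ord(c.lower()) not in range(97, 122):
            result = result.replace(c, "")
    return result


cleaned = strip_like_question("WE ARE GOOD. OK")
print(cleaned)          # the spaces are gone, so there is nothing to split on
print(cleaned.split())  # a list holding one long "word"
print(strip_like_question("ZEBRA"))  # the Z is removed as well
```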

import string 

sentence=input("Please enter a sentence: ") 
sentence=sentence.upper() 
sentencelist = open("sentence_List.txt","w") 
sentencelist.write(str(sentence)) 
sentencelist.close() 


words=list(str.split(sentence)) 
wordlist=len(words) 
position=[] 
text=() 
uniquewords=[] 
texts="" 
nsentence=(sentence) 

for c in list(sentence): 
     if not ord(c.lower()) in range(97,122): 
       nsentence=nsentence.replace(c, "")#Ascii a-z 
print(nsentence) 


nsentencelist=len(nsentence) 
print(nsentencelist) 
nsentencelist2 = open("nsentence_List.txt","w") 
nsentencelist2.write(str(nsentence)) 
nsentencelist2.close()

Source

2017-03-06 Alan Walters

Please include sample input and the expected output. – Crispin

If these are words, why are you replacing non-alphabetic characters with "" (an empty string)? – EvanL00

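The problem is that you replace punctuation with "" (an empty string), so when you try to split the sentence "we are good. OK" into words, you are actually splitting "wearegoodOK". Try replacing punctuation with a space " " instead.

Alternatively, see Strip Punctuation From String in Python.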
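A minimal sketch of the space-replacement approach, keeping an `ord` check as in the question: every character that is neither an ASCII letter nor whitespace is replaced with a space, and the result is then split into words. The function names `strip_punctuation` and `first_positions` are illustrative and not from the original post, and the upper bound of 122 is included so that z is kept.

```python
def strip_punctuation(sentence):
    """Replace every character that is not an ASCII letter or whitespace with a space."""
    cleaned = []
    for c in sentence:
        code = ord(c)
        # 65..90 is A-Z and 97..122 is a-z (both bounds inclusive)
        is_letter = 65 <= code <= 90 or 97 <= code <= 122
        cleaned.append(c if is_letter or c.isspace() else " ")
    return "".join(cleaned)


def first_positions(sentence):
    """Map each distinct word to the 1-based position of its first occurrence."""
    positions = {}
    words = strip_punctuation(sentence).upper().split()
    for idx, word in enumerate(words, start=1):
        if word not in positions:  # repeated words are ignored
            positions[word] = idx
    return positions


print(first_positions("we are good. OK, we are"))
```

Replacing with a space rather than an empty string means that input such as "good.OK" still yields two words.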

Source

2017-03-06 13:06:57 EvanL00

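As has been suggested, you can use a regular expression to separate the words. Here is a function that returns the sentence with punctuation stripped and case normalized, along with an ordered dictionary of word: index_of_first_occurrence pairs. This data could be written to a file, but I have not done that here because I don't know the specific output requirements.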

import re
from collections import OrderedDict

def compress(sentence):

    # regular expression matches anything that is not a letter or whitespace
    PUNCTUATION_REGEX = re.compile(r'[^a-zA-Z\s]')

    # use an OrderedDict to keep words in order of first occurrence
    words = OrderedDict()
    # replace punctuation with an empty string and lowercase the sentence
    sentence = PUNCTUATION_REGEX.sub('', sentence).lower()
    # loop through the words in the sentence
    for idx, word in enumerate(sentence.split()):
        # check that we haven't encountered this word before
        if word not in words:
            # add new word to dict, with index as value (not 0-indexed)
            words[word] = idx + 1
    return sentence, words
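Since the output format is not specified, the following is only one possible reading of the requirement. It assumes the file should hold the distinct words on the first line and, on the second, the first-occurrence position for every word of the sentence. The function name `write_compressed` and the file name `compressed.txt` are hypothetical.

```python
import re


def write_compressed(sentence, path):
    """Write distinct words and per-word positions to `path` (assumed format)."""
    # drop anything that is not an ASCII letter or whitespace, then lowercase
    words = re.sub(r"[^a-zA-Z\s]", "", sentence).lower().split()
    first_seen = {}
    for idx, word in enumerate(words, start=1):
        # setdefault only stores the position the first time a word appears
        first_seen.setdefault(word, idx)
    positions = [first_seen[word] for word in words]
    with open(path, "w") as f:
        f.write(" ".join(first_seen) + "\n")
        f.write(" ".join(str(p) for p in positions) + "\n")
    return positions


print(write_compressed("We are good. OK, we are!", "compressed.txt"))
```

With this layout the sentence can be rebuilt from the file, because each position on the second line identifies a word by where it first appeared.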

Source

2017-03-06 13:45:54 Crispin
