C++: Finding the frequency of a word in a text file, line by line


After reading the file, I need to ask the user for a word and then report how often that word occurs on each line. I also have to do the check with char arrays. Here is an example of the output I expect:

Line 2: 1 occurrence(s) 
line 4: 2 occurrence(s) 
Line 7: 1 occurrence(s)

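I divide the line length by the length of the search string, which gives the maximum number of times searchString could possibly occur in that line. What I need to display is the actual number of occurrences, but my code reports this quotient as the occurrence count. Can you help me with this?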

#include <iostream>
#include <string>
#include <cstring>
#include <fstream>
#include <istream>

using namespace std; 
int number_of_lines = 1; 

void numberoflines(); 

unsigned int GetFileLength(std::string FileName) 
{ 
    std::ifstream InFile(FileName.c_str()); 
    unsigned int FileLength = 0; 
    while (InFile.get() != EOF) FileLength++; 
    InFile.close(); 
    cout<<"Numbers of character in your file : "<<FileLength<<endl; 
    return FileLength; 
} 


int main() 
{ 
    string searchString, fileName, line; 
    int a; 
    string *b; 
    char *c,*d; 
    int wordCount = 0, count = 0,count1=0; 
    cout << "Enter file name : " << endl; 
    cin >> fileName; 
    GetFileLength(fileName); 
    cout << "Enter a word for searching procces : " << endl; 
    cin >> searchString; 



    ifstream in (fileName.c_str(), ios::in); 
    d= new char[searchString.length()+1]; 

    strcpy(d,searchString.c_str()); 

    a=GetFileLength(fileName); 
    b= new string [a]; 


    if(in.is_open()){ 
     while(!in.eof()){ 
      getline(in,line); 
      c= new char[line.length()+1]; 
      count++; 


      strcpy(c,line.c_str()); 


      count1=0; 
      for (int i = 0; i < line.length()/searchString.length(); i++) 
      { 

       char *output = NULL; 
       output = strstr (c,d); 
       if(output) { 
        count1++; 
       } 
       else count1--; 
      } 
      if(count1>0){cout<<"Line "<<number_of_lines<<": "<<count1<<" occurrence(s) "<<endl;} 
      number_of_lines++; 
      if (count==10) 
      { 
       break; 
      } 
     } 

     numberoflines(); 
    } 


    return 0; 
} 

void numberoflines(){ 
    number_of_lines--; 
    cout<<"number of lines in text file: " << number_of_lines << endl; 
}

Output:


2016-05-03 A.Atakul

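*I almost did this, but my program shows the number of characters* I don't understand. What does that sentence mean? – NathanOliver

I'm trying to ignore the memory leaks in this code, but I'm failing horribly. – WhozCraig

@WhozCraig The memory leaks are nothing compared to the way the file length is obtained. – Overv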

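This loop:

for (int i = 0; i < line.length()/searchString.length(); i++)
{
    char *output = NULL;
    output = strstr (c,d);
    if(output) {
        count1++;
    }
    else count1--;
}

does not count all the matches of the string in the line, because c and d are the same every time you call strstr(), so every call finds the same first match. When you repeat the search, you need to start somewhere after the previous match.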

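There is also no reason to subtract from count1 when no match is found; you should simply exit the loop when that happens. And since you are not doing anything with i, a for loop makes little sense here. Use a while loop instead.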

char *start = c;
size_t searchlen = searchString.length();
while (true)
{
    char *output = strstr (start,d);
    if(output) {
        count1++;
        start = output + searchlen;
    } else {
        break;
    }
}
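
The same idea can be wrapped in a small function, which also makes it easy to check in isolation. This is only a sketch: the name count_occurrences is hypothetical and does not appear in the original code, and the guard against an empty search word is an added assumption (without it the loop would never advance).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of the original code): counts the
// non-overlapping occurrences of `word` in the C string `line`.
std::size_t count_occurrences(const char* line, const char* word)
{
    assert(line != nullptr && word != nullptr);

    const std::size_t word_len = std::strlen(word);
    if (word_len == 0) {
        return 0; // assumption: an empty word never counts as a match
    }

    std::size_t count = 0;
    const char* start = line;
    // strstr returns a pointer to the next match, or a null pointer if none.
    while (const char* match = std::strstr(start, word)) {
        ++count;
        start = match + word_len; // resume the search just past this match
    }
    return count;
}
```

Note that this counts substrings, not whole words, so searching for "the" would also match inside "other". Advancing by the word length means overlapping matches are not counted; advancing by one character instead would count them.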


2016-05-03 19:30:13 Barmar

Thank you very much. The subtraction and the for loop made no sense, but I didn't know what to do, so I tried everything. Thanks a lot. –

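You don't need to read the entire file into a std::string. I recommend keeping this program simple before you optimize it.

As stated in your question, you need to use character arrays and read the file line by line.

Take a look at the istream::getline function; it is very useful here.

Let's declare a maximum line length of 1024.

Here is the part that reads the file: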

#define MAX_LINE_LENGTH (1024) 
char text_buffer[MAX_LINE_LENGTH]; // Look, no "new" operator. :-) 
//... 
while (my_text_file.getline(text_buffer, MAX_LINE_LENGTH, '\n')) 
{ 
//... TBD 
}

The code above reads one line of text at a time into the variable text_buffer.
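
To show how the reading loop could be combined with a strstr search, here is a minimal sketch. The function name report_occurrences and the constant kMaxLineLength are hypothetical, and the function takes any std::istream so that it can be tried with an in-memory stream as well as an ifstream. It assumes no line is longer than the buffer: istream::getline sets failbit on an over-long line, which would end the loop early.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Assumed limit, matching the 1024 used in the answer.
const std::size_t kMaxLineLength = 1024;

// Hypothetical helper: reads `in` line by line into a char array and
// returns one "Line N: K occurrence(s)" row for each line containing `word`.
std::string report_occurrences(std::istream& in, const char* word)
{
    assert(word != nullptr);

    const std::size_t word_len = std::strlen(word);
    char text_buffer[kMaxLineLength];
    std::size_t line_number = 0;
    std::ostringstream report;

    // getline stops at '\n' (which it discards) or when the buffer is full.
    while (in.getline(text_buffer, kMaxLineLength, '\n')) {
        ++line_number;

        std::size_t count = 0;
        const char* start = text_buffer;
        while (word_len > 0) { // an empty word is never searched for
            const char* match = std::strstr(start, word);
            if (match == nullptr) {
                break;
            }
            ++count;
            start = match + word_len; // continue after this match
        }

        if (count > 0) {
            report << "Line " << line_number << ": " << count
                   << " occurrence(s)\n";
        }
    }
    return report.str();
}
```

With a file, the call would look like report_occurrences(my_text_file, "word"), where my_text_file is an opened std::ifstream; printing the returned string gives output in the format shown in the question.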

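Since you are using character arrays, read up on the "str" functions (for example, strstr) in your favorite reference. You may need to write your own.

The next step is to extract "words" from the line of text.

To extract a word, you need to know where it starts and where it ends, so you will have to search the line of text. See the isalpha function, as it will be useful.

Here is a loop for finding the start and the end of a word: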

unsigned int word_start_position = 0; // start at beginning of the line.
unsigned int word_end_position = 0;
const unsigned int length = strlen(text_buffer); // Calculate only once.
while (word_start_position < length)
{
    // Find the start of a word.
    while (!isalpha(text_buffer[word_start_position]))
    {
        ++word_start_position;
    }

    // Find end of the word.
    word_end_position = word_start_position;
    while (isalpha(text_buffer[word_end_position]))
    {
        ++word_end_position;
    }
}

There are some logic issues remaining in the code fragment above for the O.P. to resolve.
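
For reference, one possible way to resolve those issues is sketched below; it is not the answerer's own solution. The sketch bounds both inner loops by the line length, continues scanning from the end of each word so the outer loop terminates, and casts to unsigned char before calling isalpha. The function name extract_words and the use of std::vector to collect the results are assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: returns the alphabetic "words" found in a line.
std::vector<std::string> extract_words(const char* text_buffer)
{
    assert(text_buffer != nullptr);

    std::vector<std::string> words;
    const std::size_t length = std::strlen(text_buffer); // calculate only once
    std::size_t pos = 0;

    while (pos < length) {
        // Find the start of a word, without running past the end of the line.
        while (pos < length &&
               !std::isalpha(static_cast<unsigned char>(text_buffer[pos]))) {
            ++pos;
        }
        const std::size_t word_start = pos;

        // Find the end of the word.
        while (pos < length &&
               std::isalpha(static_cast<unsigned char>(text_buffer[pos]))) {
            ++pos;
        }

        // Copy the word out; the next iteration resumes after it.
        if (pos > word_start) {
            words.emplace_back(text_buffer + word_start, pos - word_start);
        }
    }
    return words;
}
```

Digits and punctuation act as separators here, so "Hello, world! 42 times" yields the three words Hello, world, and times.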

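The next part is to add code that uses the start and end positions to copy the characters of the word into another variable. That variable is then used with a map, associative array, or dictionary that holds the number of occurrences.

In other words, look the word up in the container. If the word exists, increment its associated occurrence count. If it does not exist, add the word to the container with an occurrence count of 1.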
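
The container step described above could look roughly like this with std::map. It is a sketch under assumptions: the function name count_words is hypothetical, words are taken to be runs of alphabetic characters, and the comparison is case-sensitive. Because std::map::operator[] inserts a missing key with a value of 0, a single increment covers both the "exists" and "does not exist" cases.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Hypothetical helper: adds every alphabetic word in `line` to `counts`.
void count_words(const char* line, std::map<std::string, int>& counts)
{
    assert(line != nullptr);

    const std::size_t length = std::strlen(line);
    std::size_t pos = 0;

    while (pos < length) {
        // Skip anything that is not a letter.
        while (pos < length &&
               !std::isalpha(static_cast<unsigned char>(line[pos]))) {
            ++pos;
        }
        const std::size_t word_start = pos;

        // Advance to the end of the word.
        while (pos < length &&
               std::isalpha(static_cast<unsigned char>(line[pos]))) {
            ++pos;
        }

        if (pos > word_start) {
            // operator[] creates the entry with 0 if the word is new,
            // so the increment yields 1 for a first occurrence.
            ++counts[std::string(line + word_start, pos - word_start)];
        }
    }
}
```

Calling count_words once per line with a fresh map gives per-line counts; reusing the same map across lines gives totals for the whole file.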

Good luck!


2016-05-03 20:13:34

Thank you! I'll keep your advice in mind. Thanks again! –

If my answer helped, please click the check mark. –
