TensorFlow: using LSTMs to generate text

I would like to use TensorFlow to generate text, and I have been modifying the code from the LSTM tutorial (https://www.tensorflow.org/versions/master/tutorials/recurrent/index.html#recurrent-neural-networks) to do so. However, my initial solution does not improve, and I fail to see why. The idea is to start with a zero matrix and then generate one word at a time.

The generator looks like this:

def generate_text(session, m, eval_op):
    state = m.initial_state.eval()
    x = np.zeros((m.batch_size, m.num_steps), dtype=np.int32)

    output = str()
    for i in xrange(m.batch_size):
        for step in xrange(m.num_steps):
            try:
                # Run the batch.
                # Targets have to be set, but m is the validation model,
                # so this should not train the neural network.
                cost, state, _, probabilities = session.run(
                    [m.cost, m.final_state, eval_op, m.probabilities],
                    {m.input_data: x, m.targets: x, m.initial_state: state})

                # Sample a word id and add it to the matrix and the output.
                word_id = sample(probabilities[0, :])
                output = output + " " + reader.word_from_id(word_id)
                x[i][step] = word_id

            except ValueError as e:
                print("ValueError")

    print(output)

This is code I added to https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/rnn/ptb/ptb_word_lm.py as two functions: the generator above and the sampler below. I also added a variable "probabilities" to ptb_model; it is simply a softmax over the logits:

self._probabilities = tf.nn.softmax(logits)

And the sampling function:

def sample(a, temperature=1.0): 
    # helper function to sample an index from a probability array 
    a = np.log(a)/temperature 
    a = np.exp(a)/np.sum(np.exp(a)) 
    return np.argmax(np.random.multinomial(1, a, 1))
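
To make the role of the temperature argument concrete, here is a small NumPy-only sketch. The helper name apply_temperature and the three-word distribution are made up for illustration; the rescaling is the same log / divide / exp / renormalize sequence used in sample() above, with the maximum subtracted before exponentiating for numerical stability.

```python
import numpy as np

def apply_temperature(probs, temperature=1.0):
    """Rescale a probability vector the way sample() does before drawing from it."""
    probs = np.asarray(probs, dtype=np.float64)
    scaled = np.log(probs) / temperature
    # Subtracting the max does not change the result, it only avoids overflow.
    scaled = np.exp(scaled - np.max(scaled))
    return scaled / np.sum(scaled)

probs = [0.5, 0.3, 0.2]  # hypothetical distribution over three word ids
sharp = apply_temperature(probs, temperature=0.5)  # peakier than the input
same = apply_temperature(probs, temperature=1.0)   # unchanged
flat = apply_temperature(probs, temperature=2.0)   # closer to uniform
print(sharp, same, flat)
```

A temperature below 1.0 concentrates mass on the most likely word, so the generated text is more conservative; a temperature above 1.0 flattens the distribution, so it is more varied.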

Source

2016-04-13 seberik

I used your code, but it does not seem right. So I modified it slightly, and it seems to work. Here is my code, though I am not sure it is correct:

def generate_text(session, m, eval_op, word_list):
    output = []
    for i in xrange(20):
        state = m.initial_state.eval()
        x = np.zeros((1, 1), dtype=np.int32)
        y = np.zeros((1, 1), dtype=np.int32)
        output_str = ""
        for step in xrange(100):
            # Run the batch.
            # Targets have to be set, but m is the validation model,
            # so this should not train the neural network.
            cost, state, _, probabilities = session.run(
                [m.cost, m.final_state, eval_op, m.probabilities],
                {m.input_data: x, m.targets: y, m.initial_state: state})
            # Sample a word id and feed it back in as the next input.
            word_id = sample(probabilities[0, :])
            # Skip ids outside the vocabulary (valid indices end at len - 1).
            if word_id < 0 or word_id >= len(word_list):
                continue
            output_str = output_str + " " + word_list[word_id]
            x[0][0] = word_id
        print(output_str)
        output.append(output_str)
    return output

Source

2016-05-27 08:26:34 macg

I have been working toward exactly the same goal and just got it to work. You have many of the right modifications here, but I think you missed a few steps.

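First, to generate text you need to create a different version of the model that represents only a single time step. The reason is that each output y has to be sampled before it can be fed into the next step of the model. I did this by making a new config that sets num_steps and batch_size both equal to 1: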

class SmallGenConfig(object): 
    """Small config. for generation""" 
    init_scale = 0.1 
    learning_rate = 1.0 
    max_grad_norm = 5 
    num_layers = 2 
    num_steps = 1 # this is the main difference 
    hidden_size = 200 
    max_epoch = 4 
    max_max_epoch = 13 
    keep_prob = 1.0 
    lr_decay = 0.5 
    batch_size = 1 
    vocab_size = 10000

I also added the output probabilities to the model with these lines:

self._output_probs = tf.nn.softmax(logits)

and

@property 
def output_probs(self): 
    return self._output_probs

My generate_text() function differs in a few ways. The first is that I load the saved model parameters from disk using a tf.train.Saver() object. Note that this is done after instantiating PTBModel with the new config from above.

def generate_text(train_path, model_path, num_sentences):
    gen_config = SmallGenConfig()

    with tf.Graph().as_default(), tf.Session() as session:
        initializer = tf.random_uniform_initializer(-gen_config.init_scale,
                                                    gen_config.init_scale)
        with tf.variable_scope("model", reuse=None, initializer=initializer):
            m = PTBModel(is_training=False, config=gen_config)

        # Restore variables from disk.
        saver = tf.train.Saver()
        saver.restore(session, model_path)
        print("Model restored from file " + model_path)

The second difference is that I get a lookup table from ids to word strings (I had to write this function; see the code below).

        words = reader.get_vocab(train_path)

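I set up the initial state the same way you do, but I set up the initial token differently. I want to use the "end of sentence" token so that my sentence starts with the right kinds of words. I looked through the word index and found that <eos> happens to have index 2 (deterministically), so I hard-coded that. Finally, I wrap it in a 1x1 NumPy matrix so that it has the right type for the model input.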

        state = m.initial_state.eval()
        x = 2  # the id for '<eos>' from the training set
        input = np.matrix([[x]])  # a 2D numpy matrix

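Finally, here is the part where we generate sentences. We tell session.run() to compute output_probs and final_state, and we give it the input and the state. On the first iteration the input is <eos> and the state is initial_state; on subsequent iterations we feed in the last sampled output as the input and pass along the state from the previous iteration. Note also that we use the words list to look up the word string for each output index.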

        text = ""
        count = 0
        while count < num_sentences:
            output_probs, state = session.run([m.output_probs, m.final_state],
                                              {m.input_data: input,
                                               m.initial_state: state})
            x = sample(output_probs[0], 0.9)
            if words[x] == "<eos>":
                text += ".\n\n"
                count += 1
            else:
                text += " " + words[x]
            # now feed this new word as input into the next iteration
            input = np.matrix([[x]])

After that, all we have to do is print the text we accumulated.

        print(text)
    return

That is pretty much it for the generate_text() function.
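
The feedback loop in generate_text() can be tried out without TensorFlow. The sketch below replaces the session.run() call with a made-up toy_step() function (purely hypothetical, not part of the PTB model) that returns a probability vector and a new state; the names generate and toy_step and the four-word vocabulary are assumptions for illustration. What it shows is only the control flow: start from <eos>, sample one word, feed that word and the returned state back in, and stop after a given number of sentences.

```python
import numpy as np

def toy_step(word_id, state, vocab_size):
    """Hypothetical stand-in for one session.run() call on a num_steps=1 model."""
    new_state = state + 1                      # pretend recurrent state: a step counter
    logits = np.zeros(vocab_size)
    logits[(word_id + 1) % vocab_size] = 5.0   # strongly favour the "next" word id
    probs = np.exp(logits) / np.sum(np.exp(logits))
    return probs, new_state

def generate(words, eos_id, num_sentences, rng):
    text, count = "", 0
    x, state = eos_id, 0                       # first input is <eos>, first state is "initial"
    while count < num_sentences:
        probs, state = toy_step(x, state, len(words))
        x = int(rng.choice(len(words), p=probs))   # sample, then feed back as the next input
        if words[x] == "<eos>":
            text += ".\n\n"
            count += 1
        else:
            text += " " + words[x]
    return text, state

words = ["the", "cat", "<eos>", "sat"]         # toy vocabulary
text, state = generate(words, eos_id=words.index("<eos>"),
                       num_sentences=2, rng=np.random.default_rng(0))
print(text)
```

The point of keeping x and state as loop variables is that both have to survive from one iteration to the next, which is why the model needs num_steps = 1: every single output must be sampled before the next input exists.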

Finally, here is the function definition for get_vocab(), which I put in reader.py.

def get_vocab(filename): 
    data = _read_words(filename) 

    counter = collections.Counter(data) 
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0])) 

    words, _ = list(zip(*count_pairs)) 

    return words
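
As a quick check of why the ids are deterministic, here is the same ordering rule applied to a tiny made-up token list instead of the PTB training file. The function name build_vocab and the sample sentence are assumptions for illustration. Words are sorted by descending count, with ties broken alphabetically, so the same corpus always yields the same id for <eos>.

```python
import collections

def build_vocab(tokens):
    """Same ordering rule as get_vocab(): most frequent first, ties broken alphabetically."""
    counter = collections.Counter(tokens)
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    words, _ = zip(*count_pairs)
    return words

# Hypothetical toy corpus; the real function reads tokens from the training file.
tokens = "the cat sat <eos> the dog sat <eos> the end <eos>".split()
words = build_vocab(tokens)
eos_id = words.index("<eos>")  # look the id up instead of hard-coding it
print(words, eos_id)
```

Looking the id up with words.index("<eos>") is one way to avoid hard-coding the value 2, in case a different training file changes the ordering.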

The last thing you need is to be able to save the model after training it, which looks like this:

save_path = saver.save(session, "/tmp/model.ckpt")

That is the model you will later load from disk when generating text.

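There was one more problem: I found that the probability distribution produced by the TensorFlow softmax function sometimes does not sum to exactly 1.0. When the sum is larger than 1.0, np.random.multinomial() throws an error. So I had to write my own sampling function, which looks like this: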

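import random
import numpy as np

def sample(a, temperature=1.0):
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    r = random.random()  # range: [0, 1)
    total = 0.0
    # Walk the cumulative distribution until it passes r.
    for i in range(len(a)):
        total += a[i]
        if total > r:
            return i
    # Fall back to the last index if rounding left the total below r.
    return len(a) - 1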
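
For larger vocabularies, the same cumulative-sum idea can be written without a Python loop. This is only a sketch of an alternative, not the function used above: the name sample_cumsum and the optional r argument (added so the result can be checked deterministically) are assumptions. It casts to float64, renormalizes, and uses np.searchsorted to find the first index whose cumulative probability exceeds r.

```python
import random
import numpy as np

def sample_cumsum(a, temperature=1.0, r=None):
    """Vectorized variant of the cumulative-sum sampler; r can be fixed for testing."""
    a = np.asarray(a, dtype=np.float64)        # float64 reduces rounding drift
    a = np.log(a) / temperature
    a = np.exp(a - np.max(a))                  # subtract the max for numerical stability
    a = a / np.sum(a)
    if r is None:
        r = random.random()                    # range: [0, 1)
    # side="right" returns the first index whose cumulative sum is strictly greater than r.
    idx = np.searchsorted(np.cumsum(a), r, side="right")
    return int(min(idx, len(a) - 1))           # guard against rounding at the top end

probs = [0.1, 0.2, 0.7]                        # hypothetical distribution
picks = [sample_cumsum(probs, r=r) for r in (0.05, 0.25, 0.95)]
print(picks)
```

Because the cumulative sum is compared against r directly, a total that is slightly above or below 1.0 does not raise an error; it only shifts the boundary of the last bin by a negligible amount.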

When you put all of this together, the small model was able to generate some cool sentences for me. Good luck.

Source

2016-08-02 19:53:15
