Writing numbers instead of strings to a file for data compression?

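I was encoding a simple text file with the LZW algorithm in Python. However, I realized that what I write to the .txt file with the write() function is strings, and those take up almost the same amount of space as the original text. So is there some way to write the actual integers to the file (perhaps in a different format) so that I get proper compression?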

# Read the whole input text up front.
with open("C:/Users/Dhruv/Desktop/read.txt", "r") as readfile:
    content = readfile.read()

# Initial dictionary: one single-character string per code 0..255.
codes = [chr(i) for i in range(256)]

current_string = ""
with open("C:/Users/Dhruv/Desktop/write.txt", "w") as writefile:
    for ch in content:
        temp = current_string + ch
        print(temp)  # debug output
        if temp in codes:
            # Known string: keep extending the current match.
            current_string = temp
        else:
            # New string: register it, then emit the code of the longest known match.
            codes.append(temp)
            writefile.write(str(codes.index(current_string)) + " ")
            current_string = ch
    # Emit the final match (nothing to emit when the input is empty).
    if current_string:
        writefile.write(str(codes.index(current_string)) + " ")

Source

2017-04-03 Dhruv Chadha

Perhaps you mean a *binary file*, opened for writing with mode 'wb'...

Agreeing with @AnttiHaapala: open the file with "wb" and write bytes() in a binary encoding. See http://stackoverflow.com/questions/20955543/python-writing-binary – RobertB

I also want to store integers above 255. How can I do that? I also want to read them back as plain integers.
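One way to handle codes above 255 is to pack each one as a fixed-width unsigned integer with the standard `struct` module and write the bytes to a file opened in "wb" mode. The following is only a minimal sketch, assuming every code fits in 16 bits (that is, the dictionary never grows past 65,536 entries). The helper names `write_codes` and `read_codes` are hypothetical, not part of the original post.

```python
import struct

def write_codes(path, codes):
    """Write each code as a 2-byte little-endian unsigned integer (assumes code <= 65535)."""
    with open(path, "wb") as f:
        # "<" = little-endian, "H" = unsigned 16-bit; the count prefix repeats "H" once per code.
        f.write(struct.pack("<%dH" % len(codes), *codes))

def read_codes(path):
    """Read the file back into a list of plain Python integers."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // 2  # two bytes per code
    return list(struct.unpack("<%dH" % count, data))
```

With this layout every code costs exactly 2 bytes, whereas the text form costs one byte per decimal digit plus the separating space. `struct.pack` raises `struct.error` if a code does not fit in 16 bits, so a larger dictionary would need a wider format such as "I" (32-bit), at the cost of 4 bytes per code.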

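If your data can be represented as a numpy array, the following function writes it to a text file as integers (note that the code as written appends a .csv extension to the name).

import numpy as np

def writer(_hd, _data):
    # Write _data to <_hd>.csv as integer text.
    out_file_name = str(_hd) + '.csv'
    np.savetxt(out_file_name, _data, fmt='%i')

Here _hd is the file name (without the extension) and _data is the numpy array. fmt='%i' saves the data as integers. Other fmt options are also available.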
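Since np.savetxt still writes decimal text, the file ends up about as large as the one produced with write(). If numpy is already in use, a binary alternative can be sketched with `ndarray.tofile` and `np.fromfile`. This sketch assumes all codes fit in 16 bits; the function names `write_codes_binary` and `read_codes_binary` are hypothetical.

```python
import numpy as np

def write_codes_binary(path, codes):
    # "<u2" = little-endian unsigned 16-bit; codes above 65535 do not fit in this dtype.
    np.asarray(codes, dtype="<u2").tofile(path)

def read_codes_binary(path):
    # The dtype must match the one used for writing; the raw file stores no metadata.
    return np.fromfile(path, dtype="<u2").tolist()
```

The raw file holds only the 2-byte values, so the reader has to know the dtype in advance. `np.save` and `np.load` store that information in a header if a self-describing file is preferred.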

Source

2017-04-03 20:15:13 salehinejad
