Extracting data from text files to an output file

I have a bunch of files whose names are just numbers (from 1 up to some maximum). The files all resemble each other in their "tags" (ObjectID =, X =, Y =, etc.), but the values after those tags are never the same.

I wanted to make my job easier than manually copying and pasting the data from one file to another, so I wrote a small script in Python.

This is what the output file looks like at the moment:

ObjectID = 1216 
X = -1480.500610 
Y = 2610.885742 
ObjectID = 970 
X = -1517.210693 
Y = 2522.842285 
ObjectID = 3802 
X = -1512.156616 
Y = 2521.116210 
etc.

But I don't want the output to look like that. This is the full script:

import os 

BASE_DIRECTORY = 'C:\Users\Tom\Desktop\TheServer\scriptfiles\Objects' 
output_file = open('output.txt', 'w') 
output = {} 
file_list = [] 

for (dirpath, dirnames, filenames) in os.walk(BASE_DIRECTORY):
    for f in filenames:
        if 'txt' in str(f):
            e = os.path.join(str(dirpath), str(f))
            file_list.append(e)

for f in file_list:
    print f
    txtfile = open(f, 'r')
    output[f] = []
    for line in txtfile:
        if 'ObjectID =' in line:
            output[f].append(line)
        elif 'X =' in line:
            output[f].append(line)
        elif 'Y =' in line:
            output[f].append(line)
tabs = []
for tab in output:
    tabs.append(tab)

tabs.sort()
for tab in tabs:
    for row in output[tab]:
        output_file.write(row + '')

Everything works fine so far, but in the output file each value is on its own line. I need to do the following for every file:

Read the file.
Remove the tags in front of the values.
Format a single line with these values for the output file. (Say I want it to look like this: "(1216,-1480.500610,2522.842285)")
Write that line to the output file.
Repeat for all files.

Please help.

Source

2016-04-06 M. Rox

Can you paste a few sample lines from the files you need to read? – Kruser

They are the same as the output lines. –

I added code that puts the values on one line. – Kruser

I hope this helps.

data = open('sam.txt', 'r').read() 

>>> print data 
ObjectID = 1216 
X = -1480.500610 
Y = 2610.885742 
ObjectID = 970 
X = -1517.210693 
Y = 2522.842285 
ObjectID = 3802 
X = -1512.156616 
Y = 2521.116210 
>>>

Now you can do some string replacement :)

>>> data = data.replace('ObjectID = ', '').replace('\nX = ', ',').replace('\nY = ', ',') 
>>> print data 
1216,-1480.500610,2610.885742 
970,-1517.210693,2522.842285 
3802,-1512.156616,2521.116210
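
The chained replace calls above only work when the spacing in the file matches the search strings exactly. As a rough sketch of a more tolerant variant (Python 3; the pattern RECORD and the helper records_to_lines are names made up for this illustration, and the record layout is assumed to be the one shown in the question), a regular expression can pick out each record regardless of stray spaces:

```python
import re

# One record: ObjectID, X and Y in that order, separated by any whitespace.
# Assumption: every record has all three tags, as in the question's sample.
RECORD = re.compile(
    r"ObjectID\s*=\s*(\S+)\s+X\s*=\s*(\S+)\s+Y\s*=\s*(\S+)"
)

def records_to_lines(text):
    """Return one "id,x,y" string per record found in text."""
    return [",".join(match.groups()) for match in RECORD.finditer(text)]

sample = (
    "ObjectID = 1216 \n"
    "X = -1480.500610 \n"
    "Y = 2610.885742 \n"
    "ObjectID = 970 \n"
    "X = -1517.210693 \n"
    "Y = 2522.842285 \n"
)

for line in records_to_lines(sample):
    print(line)
```

Records that are missing a tag are simply skipped by this pattern, which may or may not be what you want.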

Source

2016-04-06 13:13:00 Sampath

In your loop, keep track of whether you are "in" a record:

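records = []
in_record = False
object_id, x, y = 0, 0, 0
for line in txtfile:
    if not in_record:
        # Outside a record: wait for the ObjectID line that opens one
        if 'ObjectID =' in line:
            in_record = True
            # strip() drops the space after "=" and the trailing newline
            object_id = line[10:].strip()
    elif 'X =' in line:
        x = line[3:].strip()
    elif 'Y =' in line:
        # Y is the last tag of a record, so store the tuple and close the record
        y = line[3:].strip()
        records.append((object_id, x, y))
        in_record = False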

You then have a list of tuples, which you can easily write out with the csv module.
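
For what it's worth, writing such a list of tuples with the csv module could look roughly like this in Python 3. The records list below is hypothetical sample data standing in for the one built by the loop, write_records is a helper name invented here, and output.txt is the file name from the question:

```python
import csv

# Hypothetical sample data standing in for the list built by the loop above.
records = [
    ("1216", "-1480.500610", "2610.885742"),
    ("970", "-1517.210693", "2522.842285"),
]

def write_records(rows, out_path):
    """Write each (id, x, y) tuple as one comma-separated row."""
    # newline="" lets the csv module control the line endings itself.
    with open(out_path, "w", newline="") as out:
        csv.writer(out).writerows(rows)

write_records(records, "output.txt")
```

By default csv.writer ends each row with "\r\n"; pass lineterminator="\n" to csv.writer if you prefer plain newlines.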

Source

2016-04-06 10:31:31

This doesn't write anything to the file. –

@M.Rox You need to write 'records' to the file. –

Here is what you need. I didn't have enough time to write the code that appends the results to a new file, so it prints them instead, but you get the idea.

import os

path = "path"

# Count the files in the folder
num_files = len([f for f in os.listdir(path)
                 if os.path.isfile(os.path.join(path, f))])

# Return the desired output line for a given file
def file_head_ext(file_path, file_num):
    with open(os.path.join(file_path, file_num)) as myfile:
        # Read the first three lines and split each into tag and value
        head = [next(myfile).split("=") for x in range(3)]
        formatted_head = [elm[1].strip() for elm in head]
    return ",".join(formatted_head)


# The files are named 1..num_files, so the range has to include num_files
for filnum in range(1, num_files + 1):
    print(file_head_ext(path, str(filnum)))
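
The missing "write to a file" step could be sketched along these lines. This is only an illustration under a few assumptions: the file names are bare numbers without an extension, each file starts with its ObjectID, X and Y lines, and first_record and collect are helper names invented for this example.

```python
import os

def first_record(file_path):
    """Return "id,x,y" built from the first three "tag = value" lines."""
    with open(file_path) as src:
        values = [next(src).split("=", 1)[1].strip() for _ in range(3)]
    return ",".join(values)

def collect(folder, out_path):
    """Write one line per numerically named file in folder to out_path."""
    # Assumption: file names are bare numbers such as "1", "2", "10".
    names = sorted(
        (n for n in os.listdir(folder)
         if n.isdigit() and os.path.isfile(os.path.join(folder, n))),
        key=int,  # numeric order, so "10" comes after "2"
    )
    with open(out_path, "w") as out:
        for name in names:
            out.write(first_record(os.path.join(folder, name)) + "\n")
    return len(names)
```

Calling collect(path, "output.txt") would then produce one line per input file. Listing the names instead of counting files also avoids failing when a number is missing from the sequence.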

Source

2016-04-06 10:54:47

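Below is a version of the loop in which you generate the content.
I rewrote it so that ObjectID, X and Y end up on the same line.

I hope it does what you want. You need to know the character that separates the value from the name you are looking for (the delimiter in the code).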

for f in file_list:
    print f
    txtfile = open(f, 'r')
    output[f] = []
    myline = []  # values of the current record; a list, so append() works
    for line in txtfile:
        if 'ObjectID =' in line:
            pos = line.rfind("ObjectID =") + len("ObjectID =")
            rest = line[pos:]
            # Here you set the delimiter after the ObjectID value. Can be ","
            numbers = rest.strip().split(" ")
            if len(numbers) > 0:
                myline.append(numbers[0])

        elif 'X =' in line:
            pos = line.rfind("X =") + len("X =")
            rest = line[pos:]
            # Here you set the delimiter after the X value. Can be ","
            numbers = rest.strip().split(" ")
            if len(numbers) > 0:
                myline.append(numbers[0])
        elif 'Y =' in line:
            pos = line.rfind("Y =") + len("Y =")
            rest = line[pos:]
            # Here you set the delimiter after the Y value. Can be ","
            numbers = rest.strip().split(" ")
            if len(numbers) > 0:
                myline.append(numbers[0])
            # Y closes a record: store the three values as one line
            output[f].append(','.join(myline) + '\n')
            myline = []

Note: the actual value you want is taken from the part of the line after "ObjectID =".

Source

2016-04-06 10:58:53 Kruser
