Converting tabular CLI output to JSON in Python


I need to convert the following output to JSON in Python.

How can I do this?

switch# sh mod 
Mod Ports Module-Type       Model    Status 
--- ----- ----------------------------------- ------------------ ---------- 
1 48  1/2/4/8 Gbps FC/Supervisor-3  DS-C9148-K9-SUP active * 

Mod Sw    Hw  World-Wide-Name(s) (WWN) 
--- -------------- ------ -------------------------------------------------- 
1 6.2(17)   1.1  20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8 


Mod MAC-Address(es)       Serial-Num 
--- -------------------------------------- ---------- 
1 c0-8c-60-65-82-dc to c0-8c-60-65-82-df JAF1736ALLM

Input 1: https://i.stack.imgur.com/EGsY4.jpg

Input 2: https://i.stack.imgur.com/aDGcB.jpg

Source

2017-08-28 Aftab

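1. What should the output look like, and 2. what have you tried that did not work? –

I would say you will have to use either a complex regex or a stateful line parser. Unfortunately, both are going to land somewhere between challenging and ugly. –

You can use the '---' separator line to define a slice for each column, and then use those slices to pull out each key and value. (From your example, I am guessing there can be multiple "Mod"s, each with a unique Mod value, so I used this field as the key of the overall accumulator.)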

from collections import defaultdict
from itertools import groupby
import json
import re

sample = """\
Mod Ports Module-Type                         Model              Status
--- ----- ----------------------------------- ------------------ ----------
1   48    1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *
2   48    1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *

Mod Sw             Hw     World-Wide-Name(s) (WWN)
--- -------------- ------ --------------------------------------------------
1   6.2(17)        1.1    20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8
2   6.2(17)        1.1    20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8

Mod MAC-Address(es)                        Serial-Num
--- -------------------------------------- ----------
1   c0-8c-60-65-82-dc to c0-8c-60-65-82-df JAF1736ALLM
2   c0-8c-60-65-82-ec to c0-8c-60-65-82-ef JAF1736AXXX

Xbar Ports Module-Type Model Status
---- ----- ----------- ----- ------
1    0     Fabric 1    ABC   ok

Xbar Sw Hw
---- -- ---
1    NA 1.0

"""

all_input_lines = sample.splitlines()
mod_accum = defaultdict(dict)
xbar_accum = defaultdict(dict)

# split the input into alternating runs of blank and non-blank lines
for is_blank, input_lines_iter in groupby(all_input_lines,
                                          key=lambda s: not s.strip()):
    input_lines = list(input_lines_iter)
    if is_blank:
        continue

    # assume first two lines are field names and separator dashes
    names, dashes = input_lines[:2]

    # make sure dashes line is all '---' separators
    if not all(ss == set('-') for ss in map(set, dashes.split())):
        print("invalid line group found, skipping...")
        print('-' * 40)
        print('\n'.join(input_lines))
        print('-' * 40)
        continue

    # use regex to get start/end of each '---' divider, and make slices
    # (the +1 extends each slice over the gap that follows the column)
    spans = (match.span() for match in re.finditer('-+', dashes))
    slices = [slice(start, end + 1) for start, end in spans]

    names = [names[sl].strip() for sl in slices]

    # is this a module or an xbar?
    if 'Mod' in names:
        key = 'Mod'
        accum = mod_accum
    elif 'Xbar' in names:
        key = 'Xbar'
        accum = xbar_accum
    else:
        raise ValueError("no Mod or Xbar name in row names ({})".format(
            ",".join(names)))

    # skip the names and dashes lines; everything after them is data
    for line in input_lines[2:]:
        # use slices to extract data from values, make into a dict
        row_dict = dict(zip(names, (line[sl].strip() for sl in slices)))

        # accumulate these values into any previous ones collected for this Mod
        accum[row_dict[key]].update(row_dict)

# print out what we got
all_data = {"Modules": mod_accum, "Xbars": xbar_accum}
print(json.dumps(all_data, indent=2))

Prints:

{
  "Modules": {
    "2": {
      "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8",
      "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3",
      "Ports": "48",
      "Sw": "6.2(17)",
      "Hw": "1.1",
      "Model": "DS-C9148-K9-SUP",
      "Status": "active *",
      "Serial-Num": "JAF1736AXXX",
      "MAC-Address(es)": "c0-8c-60-65-82-ec to c0-8c-60-65-82-ef",
      "Mod": "2"
    },
    "1": {
      "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8",
      "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3",
      "Ports": "48",
      "Sw": "6.2(17)",
      "Hw": "1.1",
      "Model": "DS-C9148-K9-SUP",
      "Status": "active *",
      "Serial-Num": "JAF1736ALLM",
      "MAC-Address(es)": "c0-8c-60-65-82-dc to c0-8c-60-65-82-df",
      "Mod": "1"
    }
  },
  "Xbars": {
    "1": {
      "Module-Type": "Fabric 1",
      "Ports": "0",
      "Sw": "NA",
      "Hw": "1.0",
      "Model": "ABC",
      "Status": "ok",
      "Xbar": "1"
    }
  }
}
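
The column slicing at the heart of the code above may be easier to follow in isolation. The following is only a minimal sketch on a single table: the helper name parse_table and the three-line table are made up for illustration, and it assumes every header and value starts directly under its run of dashes.

```python
import re

def parse_table(lines):
    """Parse one group of lines (names, dashes, data rows) into a list of dicts."""
    header, dashes, *rows = lines
    # One slice per run of dashes; the +1 also covers the gap after the column.
    slices = [slice(m.start(), m.end() + 1) for m in re.finditer(r"-+", dashes)]
    names = [header[sl].strip() for sl in slices]
    return [dict(zip(names, (row[sl].strip() for sl in slices))) for row in rows]

# Hypothetical table, aligned under its dashes line.
table = [
    "Xbar Sw Hw",
    "---- -- ---",
    "1    NA 1.0",
]
print(parse_table(table))
```

For this table the result is a single dict mapping Xbar to "1", Sw to "NA" and Hw to "1.0". Values that contain single spaces, such as "Fabric 1", survive because the split is by column position rather than by whitespace.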

Source

2017-08-28 21:25:35 PaulMcG

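Thanks for the suggestion, Paul. The code above works perfectly for a single module. For Input 1, however, it throws a KeyError because the output contains a new row name, 'xbar'. Any idea how we can handle this? Also, for Input 2 it does not iterate over the next set of modules. – Aftab

After writing this, I had a feeling that multiple modules might be the case. I updated the code to use itertools.groupby to pull out the groups of lines, with a little error checking in case we get a group of lines that is not data. You may not have learned much Python from this, but perhaps it is a useful code example. – PaulMcG

I have a solution, but it is not pretty. Assume your entire output is in a variable named text.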
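
As a rough illustration of the itertools.groupby step mentioned in this comment, the sketch below splits text into groups of consecutive non-blank lines. The function name split_into_groups and the demo string are invented for the example; it assumes tables are separated by one or more blank lines.

```python
from itertools import groupby

def split_into_groups(text):
    """Return one list of lines per run of consecutive non-blank lines."""
    groups = []
    # The key is True for blank lines, so groupby alternates blank/non-blank runs.
    for is_blank, lines in groupby(text.splitlines(), key=lambda s: not s.strip()):
        if not is_blank:
            groups.append(list(lines))
    return groups

# Hypothetical input: two small tables separated by two blank lines.
demo = "a b\n--- ---\n1 2\n\n\nc d\n--- ---\n3 4\n"
for group in split_into_groups(demo):
    print(group)
```

Each group can then be checked (for example, that its second line consists only of dashes) before it is parsed, which is where the error checking described in the comment fits in.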

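import re
lines = text.split("\n")
# a key line is any line directly above a line of dashes
keylines = [line for i, line in enumerate(lines) if len(lines) > (i + 1) and "---" in lines[i + 1]]
# a value line is any line directly below a line of dashes
vallines = [line for i, line in enumerate(lines) if i != 0 and "---" in lines[i - 1]]
# fields are separated by two or more spaces, so join and split on that
keys = re.split(" {2,}", "  ".join(keylines))
vals = re.split(" {2,}", "  ".join(vallines))
result = dict(zip(keys, vals))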

Output:

{ 
    "Mod": "1", 
    "Ports": "48", 
    "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3", 
    "Model": "DS-C9148-K9-SUP", 
    "Status": "active *", 
    "Sw": "6.2(17)", 
    "Hw": "1.1", 
    "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8", 
    "MAC-Address(es)": "c0-8c-60-65-82-dc to c0-8c-60-65-82-df", 
    "Serial-Num": "JAF1736ALLM" 
}

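It makes the following assumptions, and will break if they are not true:

No value contains multiple consecutive spaces.
There are at least two spaces between "fields".
In each line of dashes, at least one segment is at least three dashes long.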
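
To make the first two assumptions concrete, here is a small sketch of what splitting on runs of two or more spaces does to a well-spaced line and to a line where some gaps have shrunk to one space. The helper name split_fields and the exact spacing of both sample lines are assumptions for illustration only.

```python
import re

def split_fields(line):
    """Split a line into fields on runs of two or more spaces."""
    return re.split(r" {2,}", line.strip())

# Every gap between fields is at least two spaces wide: five fields come back.
ok = split_fields("1    48     1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *")
print(ok)

# Some gaps are a single space: neighbouring fields get glued together.
bad = split_fields("1 48  1/2/4/8 Gbps FC/Supervisor-3  DS-C9148-K9-SUP active *")
print(bad)
```

In the second case only three fields are returned, so the keys and values no longer line up when they are zipped together.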

Source

2017-08-28 08:50:09 L3viathan
