Python csv truncates part of a column

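I am running into a strange problem: Python's csv module is truncating part of a column.

I should also mention that this worked in the past, so I suspect it may have something to do with the .csv file or with the specific line itself.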

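Quick breakdown: I have a script that pulls data from a .csv file of CVE (vulnerability) data. It then uses the cvss module to rescore the results, and we use the output to prioritize patching and gauge urgency.

(This script is a temporary fix until we implement a new tool.)

Here is where it goes wrong. This is what my ingest file looks like right now: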

Vulnerability Title,Plugin ID,Original CVSS Score,Default Vector,Original Severity,AWS Score,AWS Vector,AWS Severity,Hosts,Host Type,Percentage Impacted 
Cisco IOS IKEv1 Packet Handling Remote Information Disclosure (cisco-sa-20160916-ikev1) (BENIGNCERTAIN),NES-93736,4.6,CVSS2#AV:N/AC:L/Au:N/C:P/I:N/A:N,,,AV:N/AC:L/Au:N/C:P/I:N/A:N,,26,26, 
Cisco IOS Software TCP Memory Leak DoS (cisco-sa-20150325-tcpleak),NES-82568,4.9,CVSS2#AV:N/AC:L/Au:N/C:N/I:N/A:C,,,AV:N/AC:L/Au:N/C:N/I:N/A:C,,30,26, 
RHEL 5/6/7 : nss and nss-util (RHSA-2016:2779),NES-94912,4.2,CVSS2#AV:N/AC:M/Au:N/C:C/I:C/A:C/E:F/RL:OF/RC:ND,,,AV:N/AC:M/Au:N/C:C/I:C/A:C/E:F/RL:OF/RC:ND,,5112,23,

And here is the output after my script runs (the script is posted below):

Vulnerability Title,Plugin ID,Original CVSS Score,Default Vector,Original Severity,AWS Score,AWS Vector,AWS Severity,Hosts,Host Type,Percentage Impacted 
ium,4.6,AV:A/AC:H/Au:M/C:P/I:N/A:P/CDP:L/TD:H/CR:H/IR:H/AR:H,Medium,26,26,0.2524271844660194 
Cisco IOS Software TCP Memory Leak DoS (cisco-sa-20150325-tcpleak),NES-82568,4.9,CVSS2#AV:N/AC:L/Au:N/C:N/I:N/A:C,Medium,4.9,AV:A/AC:H/Au:M/C:N/I:N/A:C/CDP:L/TD:M/CR:H/IR:H/AR:H,Medium,30,26,0.2912621359223301 
RHEL 5/6/7 : nss and nss-util (RHSA-2016:2779),NES-94912,4.2,CVSS2#AV:N/AC:M/Au:N/C:C/I:C/A:C/E:F/RL:OF/RC:ND,Medium,4.2,AV:A/AC:H/Au:M/C:C/I:C/A:C/E:F/RL:OF/RC:ND/CDP:L/TD:M/CR:H/IR:H/AR:H,Medium,5112,23,0.615458704550927

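To explain a bit further: the first line of the output starts with "ium", which is a cut-off piece of the word "Medium". That word comes from the bottom part of the script, around line 128 (the part marked #ORIGINAL SCORE). It should say Medium. So, comparing the input with the output, the script cuts off this entire portion of the row:

Cisco IOS IKEv1 Packet Handling Remote Information Disclosure (cisco-sa-20160916-ikev1) (BENIGNCERTAIN),NES-93736,4.6,CVSS2#AV:N/AC:L/Au:N/C:P/I:N/A:N,

and then writes only half of the word it is trying to add. I thought it might be caused by all the brackets or something similar, but I am not sure.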

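Here is the script that does this. I know it is a bit ugly, so suggestions for improvement are welcome, but my priority is finding out why it is mangling my file. I have thought about switching to pandas, but I have never used it, so that would take some time, and I still do not know how I would do this with it.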

def rescore_function(): 
#headers 
    print 'Starting Rescore' 
    csv_in = open('/tmp/rescore_test.csv', 'rb') 
    csv_out = open('/tmp/rescored_vulnerabilities.csv', 'wb') 
    writer = csv.writer(csv_out) 
    reader = csv.reader(csv_in) 
    headers = next(reader, None) 
    if headers: 
     writer.writerow(headers) 

    print 'Creating Target Distrobution' 
    for row in csv.reader(csv_in): 
    #This is a terrible way of setting up the percentage of hosts impacted for target distrobution. Its ugly and horrible. Host count defines the host impacted, host_type identifies what kind of host it is. Such as Alinux, Rhel5, or Cisco IOS 
     host_count = float(row[8]) 
     host_type = float(row[9]) 
     alinux_impact = host_count/ALINUX_HOST 
     cisco_impact = host_count/CISCO_COUNT 
     juniper_impact = host_count/JUNIPER_COUNT 
     citrix_impact = host_count/CITRIX_COUNT   
     all_linux= host_count/LINUX_TOTAL 
     print 'math set' 

#The reason for vul_id is 3 lists combined is simple. alinux_impact NEEDS to be 24, cisco NEEDs to be 26, juniper NEEDS to match 27, because vul_id is the softwares 'vulnerability ID type 
#range falls into all_linux. So fillvalue=vul_os[-1] means if its not 24,26,27, it is "all_linux" which means it compares it to the All linux number.  
     vul_id = [24, 26, 27, 25] + range(24) + range(28,101) 
     vul_os = [alinux_impact, cisco_impact, juniper_impact, all_linux] 

     append_file = open('/tmp/rescored_vulnerabilities.csv', 'ab') 
     append_write = csv.writer(append_file) 

#Does the for loop with the fillvalue as mentioned above. Basically Y is the host type (linux, Cisco IOS, etc) and X is the vulnerability type. So it runs through and figures out the TD and rescore methods. 
#X equals the percetange of impacted, so the Metric will be based on amount/percentage of X impacted and does a regex search and replace based on that using the CVSS calculations. 
     print vul_id 
     print vul_os 
     for x,y in izip_longest(vul_os, vul_id, fillvalue=vul_os[-1]): 
      print x,y 
      print host_type 
    #VECTOR REGEXP, host_type is which OS/Device type. 23 = RHEL5, 24 = Alinux, 26 = Cisco, 27 = Juniper 
      if host_type == y: 
       row[10] = x 
       if x <= 0.25: 
        AC_Metric = 'A:C/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        AP_Metric = 'A:P/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        AN_Metric = 'A:N/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        RCUC_Metric = 'RC:UC/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        RCUR_Metric = 'RC:UR/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        RCC_Metric = 'RC:C/CDP:L/TD:L/CR:H/IR:H/AR:H' 
        RCND_Metric = 'RC:ND/CDP:L/TD:L/CR:H/IR:H/AR:H' 
       elif 0.26 <= x <= 0.75: 
        AC_Metric = 'A:C/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        AP_Metric = 'A:P/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        AN_Metric = 'A:N/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        RCUC_Metric = 'RC:UC/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        RCUR_Metric = 'RC:UR/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        RCC_Metric = 'RC:C/CDP:L/TD:M/CR:H/IR:H/AR:H' 
        RCND_Metric = 'RC:ND/CDP:L/TD:M/CR:H/IR:H/AR:H' 
       else: 
        AC_Metric = 'A:C/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        AP_Metric = 'A:P/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        AN_Metric = 'A:N/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        RCUC_Metric = 'RC:UC/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        RCUR_Metric = 'RC:UR/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        RCC_Metric = 'RC:C/CDP:L/TD:H/CR:H/IR:H/AR:H' 
        RCND_Metric = 'RC:ND/CDP:L/TD:H/CR:H/IR:H/AR:H' 


       text = row[6] 
       text = re.sub(r'AV:N','AV:A',text) 
       text = re.sub(r'AC:L','AC:H',text) 
       text = re.sub(r'AC:M','AC:H',text) 
       text = re.sub(r'Au:N','Au:M',text) 
       text = re.sub(r'Au:S','Au:M',text) 
       text = re.sub(r'A:C$',AC_Metric,text) 
       text = re.sub(r'A:P$',AP_Metric,text) 
       text = re.sub(r'A:N$',AP_Metric,text) 
       text = re.sub(r'RC:UC',RCUC_Metric,text) 
       text = re.sub(r'RC:UR',RCUR_Metric,text) 
       text = re.sub(r'RC:C',RCC_Metric,text) 
       text = re.sub(r'RC:ND',RCND_Metric,text) 
       row[6] = text 
    #NEW SCORE, uses CVSS module to take the previous vector and find out the the numbered score. It then uses that number to define the severity word. 
       try: 
        vector = row[6] 
        c = CVSS2(vector) 
        row[5] = c.scores()[2] 
        vul_score = row[5] 
        if 0 <= vul_score <= 3.9: 
         vuln_word = 'Low' 
        elif 4.0 <= vul_score <=6.9: 
         vuln_word = 'Medium' 
        elif 7.0 <= vul_score <= 9.9: 
         vuln_word = 'High' 
        else: 
         vuln_word = 'Critical' 
        row[7] = vuln_word 
       except CVSS2MalformedError: 
        rescored_success = False 
        pass 
    #ORIGINAL SCORE, does the same as above for the original vector since NESSUS does not provide the Severity "word". This only finds the word, not the number value. 
       default_score = float(row[2]) 
       if 0 <= default_score <= 3.9: 
        default_severity = 'Low' 
       elif 4.0 <= default_score <=6.9: 
        default_severity = 'Medium' 
       elif 7.0 <= default_score <= 9.9: 
        default_severity = 'High' 
       else: 
        default_severity = 'Critical' 
       row[4] = default_severity 
       append_write.writerow(row)

Source

2017-01-03 Mallachar

Why are you reading in 'rb' mode? It is not a binary file, is it? Try 'r' instead. – jbasko

@jbasko 'rb' is the recommended mode for csv.reader in python2 (https://docs.python.org/2/library/csv.html#module-contents) – snakecharmerb

Thanks @snakecharmerb, I did not know that. – jbasko
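
For readers on Python 3, the csv documentation recommends text mode with newline='' rather than 'rb'/'wb'. A small round-trip sketch of that, assuming Python 3; the file name and the rows are made up for illustration:

```python
import csv
import os
import tempfile

# Made-up rows; the second one contains an embedded newline inside a field.
rows = [["Vulnerability Title", "Notes"], ["Example (test)", "line one\nline two"]]
path = os.path.join(tempfile.mkdtemp(), "example.csv")  # hypothetical file

# Python 3: text mode plus newline="" so the csv module controls line endings.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back the same way; quoted fields keep their embedded newline.
with open(path, newline="") as f:
    read_back = list(csv.reader(f))
```

With newline='' on both sides, the rows read back should match what was written, including the field with the line break.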

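Your code is rather large, which makes it hard to reproduce, but I suspect something fishy is going on with all the buffering: you have several buffered write handles open on the same file at the same time. Pretty confusing.

First you open and truncate the file with

csv_out = open('/tmp/rescored_vulnerabilities.csv', 'wb')

and write the header. Then, on each iteration, you open the same file again in append mode while the first handle is still open:

append_file = open('/tmp/rescored_vulnerabilities.csv', 'ab')

You never close any of the append_file handles.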
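
A minimal sketch of how two handles on one file can produce this kind of truncation. It is written for Python 3 rather than the question's Python 2, the function name demo_overlapping_handles and the file contents are made up, and it only illustrates a plausible mechanism, not a trace of the actual script:

```python
import os
import tempfile


def demo_overlapping_handles(path):
    """Write a 'header' and a 'row' through two handles on the same file."""
    out = open(path, "w", newline="")   # truncating open, like csv_out
    out.write("HEADER\n")               # 7 characters, still in out's buffer
    app = open(path, "a", newline="")   # second handle, like append_file
    app.write("first-row-is-long\n")
    app.close()                         # the row reaches the still-empty file first
    out.close()                         # the header is flushed at offset 0, over the row
    with open(path, newline="") as f:
        return f.read()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
    print(repr(demo_overlapping_handles(demo_path)))
```

On a POSIX system I would expect this to return 'HEADER\now-is-long\n': the late header flush overwrites the first seven characters of the row, much like the "ium" left over from "Medium" in the question's output.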

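I would suggest this:

The first, truncating open is fine.
Remove append_file = open('/tmp/rescored_vulnerabilities.csv', 'ab') and replace append_write with writer, so that you write to the same file (that works, since it is still open).
Do not forget to close csv_out at the end (or wrap all of your code in a with open(...) as csv_out: block).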
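
The suggestions above could be sketched roughly like this. It assumes Python 3, and rescore_file and rescore_row are hypothetical names standing in for the question's rescoring logic, not part of the original script:

```python
import csv


def rescore_file(in_path, out_path, rescore_row):
    """Copy the header, then write every rescored row through ONE writer.

    rescore_row is a hypothetical callable that takes a row (list of str)
    and returns the row to write.
    """
    # newline="" is the Python 3 counterpart of the 'rb'/'wb' modes in the question.
    with open(in_path, newline="") as csv_in, \
            open(out_path, "w", newline="") as csv_out:
        reader = csv.reader(csv_in)
        writer = csv.writer(csv_out)
        headers = next(reader, None)
        if headers:
            writer.writerow(headers)
        for row in reader:  # same reader, same writer, no second open()
            writer.writerow(rescore_row(row))
    # Both files are flushed and closed when the with block exits.
```

Because there is only one handle on the output file, the header and the rows go through the same buffer in order, and the with block takes care of closing it.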

Note that this problem is Un*x only. On a Windows filesystem you cannot open a file twice in write mode, so you would get an exception right away.

Source

2017-01-03 20:12:49

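Ah, so it was the append file. I removed it and switched to using only the original writer, and that fixed everything. Thanks for the help! I also close the output file afterwards, since the script needs it a little later, but everything is fixed. – Mallachar

That's great! I was not sure what else it could be. –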
