2017-02-07 10 views
1

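python - Replace only the groups of a variable regex

I want to replace the regex match groups with '#' characters.

I have a variable number of regexes, each containing a variable number of groups.

Only the values captured by the regex groups should be replaced; the rest of the line must stay as it is.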

#! /usr/bin/python 

import re 

data = """Line1 '4658' 
Line2 data 'AAA'\tBBB\t55555 
Roach""".splitlines() 

# a variable number of Regex's containing a variable number of groups 
needles = [ r"Line1 '(\d+)'", 
     r"'(AAA)'\t\S+\t(\S+)", 
     r"(Roach)" ] 

pattern = re.compile('|'.join(needles)) 

for line in data:
    match = pattern.search(line)
    if match:
        print(re.sub(match.string[match.start():match.end()], '#' * len(match.string), line))

# current output 
""" 
############ 
Line2 data ########################## 
##### 
""" 

# desired output 
""" 
Line1 '####' 
Line2 data '###' BBB ##### 
##### 
""" 

Answers

0

Change the code like this:

#! /usr/bin/python 

import re 

data = """Line1 '4658' 
Line2 data 'AAA'\tBBB\t55555 
Roach""".splitlines() 

# a variable number of Regex's containing a variable number of groups
needles = [r"Line1 '(\d+)'",
           r"'(AAA)'\t\S+\t(\S+)",
           r"(Roach)"
]

pattern = re.compile('|'.join(needles)) 

for line in data:
    match = pattern.search(line)
    for matched_str in match.groups():
        # groups from alternatives that did not match are None
        if matched_str:
            line = re.sub(matched_str, '#' * len(matched_str), line)
    print(line)

And ran it:

$ python a.py 
Line1 '####' 
Line2 data '###' BBB ##### 
##### 
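
One caveat with the loop above: the captured text is passed back to re.sub() as a pattern, so a captured value that contains regex metacharacters, or that also occurs elsewhere in the line, could give a different result than intended. The following is only a sketch of one way around that, masking by group position instead of by text. The helper name mask_groups and the fill parameter are hypothetical and not part of the original answer; the needles are the ones from the question.

```python
import re

needles = [r"Line1 '(\d+)'",
           r"'(AAA)'\t\S+\t(\S+)",
           r"(Roach)"]
pattern = re.compile("|".join(needles))


def mask_groups(pattern, line, fill="#"):
    """Mask each capture group of the first match in `line` (hypothetical helper)."""
    match = pattern.search(line)
    if match is None:
        return line  # no needle matched: leave the line untouched
    chars = list(line)
    for i in range(1, pattern.groups + 1):
        start, end = match.span(i)
        if start == -1:
            continue  # this group belongs to an alternative that did not match
        # overwrite exactly the characters the group captured
        chars[start:end] = fill * (end - start)
    return "".join(chars)


for line in ["Line1 '4658'", "Line2 data 'AAA'\tBBB\t55555", "Roach"]:
    print(mask_groups(pattern, line))
```

Because only the spans reported by match.span() are overwritten, a line such as "AAA 'AAA'\tBBB\t55555" keeps its leading AAA and only the quoted one is masked.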
0

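You don't need an extra re.search() call to do the matching. Just change your regexes so that they match every part of the string, and use a suitable function to replace the parts you want.

Here is an example for one of the lines: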

In [51]: def replacer(x):
   ....:     matched = x.groups()
   ....:     if len(matched) == 4:
   ....:         return "{}{}{}{}".format(matched[0], len(matched[1]) * '*', matched[2], len(matched[3]) * '*')
   ....:

In [52]: pattern = re.compile(r"([^']*)'(AAA)'(\t\S+\t)(\S+)") 

In [53]: pattern.sub(replacer, "Line2 data 'AAA'\tBBB\t55555") 
Out[53]: 'Line2 data ***\tBBB\t*****' 

Here is the complete code:

import re 

data = """Line1 '4658' 
Line2 data 'AAA'\tBBB\t55555 
Roach""".splitlines() 

# a variable number of Regex's containing a variable number of groups 
needles = [ r"(Line1 )'(\d+)'",
            r"([^']*)'(AAA)'(\t\S+\t)(\S+)",
            r"(Roach)" ]


def replacer(x):
    matched = x.groups()
    if matched[2]:
        # in this case groups from 3rd index have been matched
        return "{}{}{}{}".format(matched[2], len(matched[3]) * '#', matched[4], len(matched[5]) * '#')
    elif matched[0]:
        # in this case groups from 1st index have been matched
        return "{}{}".format(matched[0], len(matched[1]) * '#')
    elif matched[-1]:
        # in this case last group has been matched
        return len(matched[-1]) * '#'


pattern = re.compile('|'.join(needles))


for line in data: 
    print(pattern.sub(replacer, line)) 

Output:

Line1 #### 
Line2 data ### BBB ##### 
#####
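
The index checks in replacer follow from how groups are numbered when the needles are joined with '|': groups are counted left to right across the whole combined pattern, and groups belonging to an alternative that did not match come back as None. The small sketch below prints the groups() tuple for each sample line to make that visible. It assumes the first needle includes the space after Line1 so that it matches the sample data; the variable name sample is introduced here for illustration only.

```python
import re

needles = [r"(Line1 )'(\d+)'",                # groups at index 0-1
           r"([^']*)'(AAA)'(\t\S+\t)(\S+)",   # groups at index 2-5
           r"(Roach)"]                        # group at index 6
pattern = re.compile("|".join(needles))

sample = ["Line1 '4658'", "Line2 data 'AAA'\tBBB\t55555", "Roach"]

for line in sample:
    # groups() always has one entry per group in the combined pattern
    print(pattern.search(line).groups())

# ('Line1 ', '4658', None, None, None, None, None)
# (None, None, 'Line2 data ', 'AAA', '\tBBB\t', '55555', None)
# (None, None, None, None, None, None, 'Roach')
```

This is why replacer tests matched[2], matched[0] and matched[-1]: each is the first (or only) group of one needle. Adding or reordering needles shifts these positions, so the indices in replacer would have to be updated to match.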