REGEX - (using Python 3.5) - Find strings in a file

I have a file that I am opening, and I need to extract specific data from it. I am still a bit new to regular expressions and am having a hard time finding what I need.

NEWS ID: 918273/1 
TITLE: News Platform Solution Overview (CNN) (US English Session) 
ACCOUNT: supernewsplatformacct (55712) 

Your request has been completed. 

Output Format MP4 

Please click on the "Download File" link below to access the download page. 

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>

I need:

918273 -from- NEWS ID: 918273/1

News Platform Solution Overview (CNN) (US English Session) -from- TITLE: News Platform Solution Overview (CNN) (US English Session)

supernewsplatformacct -from- ACCOUNT: supernewsplatformacct (55712)

http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4 -from- Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>

The data above is from the file; it seems to contain a few tabs, just FYI.

I am trying:

[\n\r][ \t]*NEWS ID:[ \t]*([^\n\r]*)

but with no luck. Any help is greatly appreciated!

Source

2016-12-09 Kenny

Possible duplicate of [Learning Regular Expressions](http://stackoverflow.com/questions/4736/learning-regular-expressions) – Biffen

Use '\s' (whitespace) instead of combinations of spaces, tabs, '\r/\n'. It makes things cleaner. Why does your regex start with '[\n\r]'? And can you show us some Python code? –
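One possible reason the attempted pattern finds nothing: its leading '[\n\r]' requires a newline character before "NEWS ID:", so it cannot match when that label is on the very first line of the file. A minimal sketch of an alternative, assuming the file content has been read into a string, is to anchor each pattern with '^' and pass re.MULTILINE. The names `text`, `patterns`, and `fields` are illustrative only and do not come from the question. The sketch keeps '[ \t]' rather than '\s' so that a match cannot spill across a newline.

```python
import re

# Sample text from the question (trailing whitespace kept on the TITLE line).
text = (
    "NEWS ID: 918273/1 \n"
    "TITLE: News Platform Solution Overview (CNN) (US English Session) \n"
    "ACCOUNT: supernewsplatformacct (55712) \n"
    "\n"
    "Your request has been completed. \n"
    "\n"
    "Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>"
)

# One pattern per wanted field; '^' matches at each line start under re.MULTILINE.
patterns = {
    "news_id": r"^[ \t]*NEWS ID:[ \t]*(\d+)",               # digits before the '/'
    "title": r"^[ \t]*TITLE:[ \t]*(.*\S)",                  # rest of line, no trailing spaces
    "account": r"^[ \t]*ACCOUNT:[ \t]*(\S+)",               # first word after the label
    "url": r"^[ \t]*Download File[ \t]*<([^>]+)>",          # text inside <...>
}

fields = {}
for name, pattern in patterns.items():
    m = re.search(pattern, text, re.MULTILINE)
    fields[name] = m.group(1) if m else None  # None when a label is missing

print(fields)
```

Keeping one small pattern per field makes it easy to see which label failed to match, at the cost of scanning the text once per field.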

(?:^|(?<=\n))[^:<\n]*[:<](.*)

You can use this with re.findall. See the demo:

https://regex101.com/r/d7RPNB/2
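As a rough sketch of how the pattern above behaves with re.findall in Python: it returns the remainder of every line that contains a ':' or '<' before any other delimiter, so the captured values still carry surrounding spaces and, for the link, a trailing '>'. The variable names and the `strip` cleanup step are assumptions added for illustration and are not part of the answer.

```python
import re

# Sample text from the question (trailing whitespace omitted here).
msg = (
    "NEWS ID: 918273/1\n"
    "TITLE: News Platform Solution Overview (CNN) (US English Session)\n"
    "ACCOUNT: supernewsplatformacct (55712)\n"
    "\n"
    "Your request has been completed.\n"
    "\n"
    "Output Format MP4\n"
    "\n"
    'Please click on the "Download File" link below to access the download page.\n'
    "\n"
    "Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>"
)

pattern = r'(?:^|(?<=\n))[^:<\n]*[:<](.*)'

# With a single capture group, findall returns a list of the captured strings.
raw = re.findall(pattern, msg)

# Assumed cleanup step: drop surrounding spaces/tabs and the closing '>'.
cleaned = [value.strip(" \t>") for value in raw]

print(cleaned)
```

Note that this yields "918273/1" and "supernewsplatformacct (55712)" rather than the shorter values requested in the question, so some further trimming would still be needed.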

Source

2016-12-09 20:19:11 vks

This is close, but the last item was 'http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4' and you only get '//news......'. – depperm

I see the demo; what I am saying is that it does not exactly match his request. Here is a slightly modified version that gets the http://....: (?:^|(?<=\n))[^:\n]*[^http]:\s*(?P .*)|(?P http: > – depperm

I changed it so that you do not get the extra space or the '>' at the end. Check https://regex101.com/r/WJkRK5/1 – depperm

msg = """NEWS ID: 918273/1 
TITLE: News Platform Solution Overview (CNN) (US English Session) 
ACCOUNT: supernewsplatformacct (55712) 

Your request has been completed. 

Output Format MP4 

Please click on the "Download File" link below to access the download page. 

Download File <http://news.downloadwebsitefake.com/newsid/file1294757493292848575.mp4>""" 
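import re
# The first alternative captures the text after "LABEL: "; the second captures a URL inside <...>.
regex = r'[^:]+:\s+(.*)$|[^<]+<([^>]+)>'
matches = []
for line in msg.split('\n'):
    m = re.match(regex, line)  # match each line once instead of three times
    if m:
        # Only one of the two groups is set; strip the trailing whitespace.
        matches.append((m.group(1) or m.group(2) or '').strip())
print(matches)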

Source

2016-12-09 21:02:51 freegnu
