How do I split out a specific paragraph in R?

I want to extract a specific paragraph from a long text, such as:

txt1 <- "What is claimed is: 
1. A hybridized CMP conditioner, comprising: a base; 
a first abrasive unit, provided on said base and comprising a first 
bonding layer fixed on said base, a substrate for abrasive unit provided  
on said first bonding layer and an abrasive layer provided on said 
substrate for abrasive unit, said abrasive layer being a diamond coating 
formed through a chemical vapor deposition process, and said diamond 
coating being provided on the surface thereof with a plurality of abrasive  
tips. 
2. The hybridized CMP conditioner according to claim 1, wherein said base 
is provided on the surface thereof with a central region and an annular  
outer region around the outside of said central region. 
3. The hybridized CMP conditioner according to claim 2, wherein said 
central region is provided with a recessed portion for said first abrasive 
unit to be provided therein, and said annular outer region is provided 
with a plurality of first accommodating portions spaced apart from each 
other for said second abrasive units to be provided therein. "

I only want to extract the first paragraph, like this:

1. A hybridized CMP conditioner, comprising: a base; 
a first abrasive unit, provided on said base and comprising a first 
bonding layer fixed on said base, a substrate for abrasive unit provided  
on said first bonding layer and an abrasive layer provided on said 
substrate for abrasive unit, said abrasive layer being a diamond coating 
formed through a chemical vapor deposition process, and said diamond 
coating being provided on the surface thereof with a plurality of abrasive  
tips.

I tried the strsplit function:

strsplit(txt1, "\n1.", perl = TRUE)

but the result is not what I want.

Result of strsplit:

[1] "What is claimed is:"

[2] " A hybridized CMP conditioner, comprising: a base; \na first abrasive 
unit, provided on said base and comprising a first bonding layer fixed on 
said base, a substrate for abrasive unit provided on said first bonding 
layer and an abrasive layer provided on said substrate for abrasive unit, 
said abrasive layer being a diamond coating formed through a chemical 
vapor deposition process, and said diamond coating being provided on the 
surface thereof with a plurality of abrasive tips; and \na plurality of 
second abrasive units, provided on said base and comprising a second 
bonding layer fixed on said base, a carrying post provided on said second 
bonding layer, an abrasive particle provided on said carrying post and an 
abrasive material-bonding layer provided between said carrying post and 
said abrasive particle. \n2. The hybridized CMP conditioner according to 
claim 1, wherein said base is provided on the surface thereof with a 
central region and an annular outer region around the outside of said 
central region. "

Source

2017-08-17 Eva

Try 'stringr::str_split(txt1, "\n[:digit:]+\\.")'; you will need a few nested 'gsub's. –

For example, 'gsub("\\s\\s|^\\s+|\\s+$", " ", gsub("\\s\\s", " ", gsub("\n", "", unlist(strsplit(txt1, "[0-9][.]")))))' – OdeToMyFiddle

One approach:

# split at newline followed by number and '.' 
paragraphs <- unlist(strsplit(txt1, "\\n(?=(\\d+\\.))", perl = TRUE)) 
# get rid of newlines and select 1st paragraph 
gsub(" *\\n", " ", paragraphs)[2]
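
The same idea can be wrapped in a small helper. This is only a sketch: extract_claim and the demo string are hypothetical names made up for illustration, and it assumes every claim starts on its own line with a number followed by a dot. The lookahead (?=...) is what keeps "1." attached to the paragraph, whereas a plain pattern such as "\n1." consumes the number (and, because the dot is unescaped, one more arbitrary character).

```r
# Hypothetical helper (not part of the original answer): return the n-th
# numbered claim of `txt` as a single line of text.
extract_claim <- function(txt, n = 1) {
  # Split at a newline only when it is followed by digits and a dot.
  # The lookahead is not consumed, so "1.", "2.", ... stay in the pieces.
  parts <- unlist(strsplit(txt, "\\n(?=\\d+\\.)", perl = TRUE))
  # Keep only the pieces that actually start with "<number>."
  claims <- parts[grepl("^\\d+\\.", parts, perl = TRUE)]
  # Collapse line breaks (and the spaces around them), then trim both ends.
  claims <- trimws(gsub("\\s*\\n\\s*", " ", claims, perl = TRUE))
  claims[n]
}

# Small made-up input with the same shape as txt1.
demo <- "Intro: \n1. First claim, \nsecond line. \n2. Second claim. "
extract_claim(demo, 1)
extract_claim(demo, 2)
```

With the question's data this would be called as extract_claim(txt1, 1). One caveat: a wrapped line inside a claim that happens to begin with a number and a dot would be treated as the start of a new claim.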

Source

2017-08-17 08:23:03 Ape
