Is the regex \\L supported in R 3.5.0?

I'm having difficulty with the Perl expression \\L\\1 in a very particular circumstance on R-dev (the 2017-06-06 and 2017-06-16 r72796 builds). To reproduce:

bib <- readLines("https://raw.githubusercontent.com/HughParsonage/TeXCheckR/master/tests/testthat/lint_bib_in.bib", encoding = "UTF-8") 

leading_spaces <- 2 

is_field <- grepl("=", bib, fixed = TRUE) 
field_width <- nchar(trimws(gsub("[=].*$", "", bib, perl = TRUE))) 

widest_field <- max(field_width[is_field]) 

out <- bib 

# Vectorized gsub:
for (line in seq_along(bib)) {
  # Replace every field line with
  # two spaces + field name + spaces required for widest field + space
  if (is_field[line]) {
    spaces_req <- widest_field - field_width[line]
    out[line] <-
      gsub("^\\s*(\\w+)\\s*[=]\\s*\\{",
           paste0(paste0(rep(" ", leading_spaces), collapse = ""),
                  "\\L\\1",
                  paste0(rep(" ", spaces_req), collapse = ""),
                  " = {"),
           bib[line],
           perl = TRUE)
  }
}

# Add commas: 
out[is_field] <- gsub("\\}$", "\\},", out[is_field], perl = TRUE) 

out[9] 
#> R-dev " author" 
#> R 3.4.0 " author  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"

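To reproduce it, it is necessary to:

- use readLines to read from a file, specifying the encoding;
- use \\L or \\U in a Perl regex;
- use a character vector with an element containing a character that requires UTF-8 (the é in Amélie above).

It does not reproduce when using dput.

Is this a change in R 3.5.0, or have I been misusing \\L in this case?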
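As a point of comparison, here is a minimal sketch of what the same replacement is documented to do on a single UTF-8 string. The string `x` is made up for illustration and is built in code rather than read with readLines, so, given the conditions listed above, it may well not trigger the problem on an affected build. It only shows the result one would expect when \\L behaves as documented.

```r
# Hypothetical one-line input; the \u escape marks the string as UTF-8.
x <- "AUTHOR = {Am\u00e9lie Hunter}"

# Same pattern as in the question: lower-case the field name via \\L\\1.
res <- gsub("^\\s*(\\w+)\\s*[=]\\s*\\{",
            "  \\L\\1 = {",
            x,
            perl = TRUE)

# When \\L works as documented, only the field name changes case and
# the rest of the line is kept.
print(res)
```

If `res` comes back truncated after the field name, the build is showing the behaviour described in the question.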

2017-06-16 Hugh

Look, you have been warned: [*This is likely to contain bugs, so be careful if you use it*](https://cran.r-project.org/bin/windows/base/rdevel.html).

I couldn't build the snippet - what is 'leading_spaces'?

This particular bug is causing errors in R CMD check for my package. Sorry about the non-reprex; I've edited. – Hugh

There is clearly some unexpected behaviour.

When the replacement refers to \1 alone, the output is:

[1] " author  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"

However, whenever \U or \L is used together with \1, the second backreference is dropped:

"\\U\\1": [1] " AUTHOR"
"\\U\\1\\E\\2": [1] " AUTHOR"

A gsubfn solution still works (here is an example with toupper()):
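For reference, this is how \U, \E and a second backreference are documented to combine in a Perl-style replacement. It is a small sketch on a made-up ASCII string, not on the .bib data, so it shows the intended behaviour rather than the failure.

```r
# \\U upper-cases what follows in the replacement; \\E ends the conversion,
# so the second group is inserted unchanged.
res <- gsub("(\\w+) (\\w+)", "\\U\\1\\E \\2", "hello world", perl = TRUE)
print(res)  # "HELLO world"
```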

library(gsubfn) 
bib <- readLines("https://raw.githubusercontent.com/HughParsonage/TeXCheckR/master/tests/testthat/lint_bib_in.bib", encoding = "UTF-8") 
leading_spaces <- 2 
is_field <- grepl("=", bib, fixed = TRUE) 
field_width <- nchar(trimws(gsub("[=].*$", "", bib, perl = TRUE))) 
widest_field <- max(field_width[is_field]) 
out <- bib 

# Vectorized gsub:
for (line in seq_along(bib)) {
  # Replace every field line with
  # two spaces + field name + spaces required for widest field + space
  if (is_field[line]) {
    spaces_req <- widest_field - field_width[line]
    out[line] <-
      gsubfn("^\\s*(\\w+)\\s*=\\s*\\{",
             function(y) paste0(
               paste0(rep(" ", leading_spaces), collapse = ""),
               toupper(y),
               paste0(rep(" ", spaces_req), collapse = ""),
               " = {"
             ),
             bib[line], engine = "R"
      )
  }
}
# Add commas: 
out[is_field] <- gsub("\\}$", "},", out[is_field], perl = TRUE) 

out[9]

Output:

[1] " AUTHOR  = {Tony Wood and Amélie Hunter and Michael O'Toole and Prasana Venkataraman and Lucy Carter},"
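If adding a dependency on gsubfn is not desirable, the same idea (do the case conversion in R code instead of with \L or \U) can be sketched in base R. The helper name `lower_field` and its arguments are hypothetical, and this has not been checked against the affected R-dev builds; it simply avoids the case-conversion escapes altogether.

```r
# Hypothetical helper: rewrite "  Field = {" as
# leading spaces + lower-cased field name + padding + " = {".
lower_field <- function(line, leading_spaces = 2, pad = 0) {
  pattern <- "^\\s*(\\w+)\\s*=\\s*\\{"
  m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]
  if (length(m) == 0L) return(line)  # not a field line: leave untouched
  prefix <- paste0(strrep(" ", leading_spaces),
                   tolower(m[2]),      # m[2] is the captured field name
                   strrep(" ", pad),
                   " = {")
  # m[1] is the whole matched prefix; keep everything after it as-is.
  paste0(prefix, substr(line, nchar(m[1]) + 1L, nchar(line)))
}

res <- lower_field("AUTHOR = {Am\u00e9lie Hunter}", leading_spaces = 2, pad = 2)
print(res)
```

In the loop above, `pad` would play the role of `spaces_req`.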

My sessionInfo() details:

> sessionInfo() 
R Under development (unstable) (2017-06-19 r72808) 
Platform: i386-w64-mingw32/i386 (32-bit) 
Running under: Windows 7 x64 (build 7601) Service Pack 1 

Matrix products: default 

locale: 
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252 
[3] LC_MONETARY=English_United States.1252 
[4] LC_NUMERIC=C       
[5] LC_TIME=English_United States.1252  

attached base packages: 
[1] stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] gsubfn_0.6-6 proto_1.0.0 

loaded via a namespace (and not attached): 
[1] compiler_3.5.0 tools_3.5.0 tcltk_3.5.0

2017-06-19 13:28:37
