fread - multiple separators inside a string

I am trying to read a table using fread. The txt file contains text fields and looks like this:

"No","Comment","Type" 
"0","he said:"wonderful|"","A" 
"1","Pr/ "d/s". "a", n) ","B"

The R code I am using, with the development version of the data.table R package, is dataset0 <- fread("data/test.txt", stringsAsFactors = F).

I expect to see a dataset with three columns. However:

Error in fread(input = "data/stackoverflow.txt", stringsAsFactors = FALSE) : 
Line 3 starting <<"1","Pr/ ">> has more than the expected 3 fields. 
Separator 3 occurs at position 26 which is character 6 of the last field: << n) ","B">>. 
Consider setting 'comment.char=' if there is a trailing comment to be ignored.

How can I solve this?

Source

2017-03-21 A.Yin

The development version of data.table handles files like this, where embedded quotes have not been escaped. See point 10 on the wiki page.

I have just tested it on your input.

$ more unescaped.txt 
"No","Comment","Type" 
"0","he said:"wonderful."","A" 
"1","The problem is: reading table, and also "a problem, yes." keep going on.","A" 

> DT = fread("unescaped.txt") 
> DT 
   No                                                                  Comment Type
1:  0                                                     he said:"wonderful."    A
2:  1 The problem is: reading table, and also "a problem, yes." keep going on.    A
> ncol(DT) 
[1] 3
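To check this on your own machine, the sketch below writes the same three lines to a temporary file and tries fread on it. The variable name path and the use of tempfile() are illustrative assumptions, and the result depends on which data.table version is installed, so no output is shown.

```r
# Write the sample from this answer to a temporary file (path is a hypothetical name).
path <- tempfile(fileext = ".txt")
writeLines(c(
  '"No","Comment","Type"',
  '"0","he said:"wonderful."","A"',
  '"1","The problem is: reading table, and also "a problem, yes." keep going on.","A"'
), path)

if (requireNamespace("data.table", quietly = TRUE)) {
  # Report the installed version so it can be compared with the one that failed.
  message("data.table version: ",
          as.character(utils::packageVersion("data.table")))
  # Catch the error instead of stopping, in case this version cannot parse the file.
  DT <- tryCatch(data.table::fread(path), error = function(e) e)
  if (inherits(DT, "error")) {
    message("fread failed: ", conditionMessage(DT))
  } else {
    print(ncol(DT))
  }
} else {
  message("data.table is not installed")
}
```

If fread still fails, the version printed is probably the one that produced the error in the question.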

Source

2017-03-22 02:14:43

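'fread' is a smart tool for answering questions, because askers often just display the contents of a data.frame or matrix instead of using 'dput()' or providing code to create the data. – Uwe

I copied the text into a txt file and used exactly the same code. It still reports the same error. I retried the 'data.table' R package, but that did not help. –

@A.Yin Note the words "development version" at the start. –

Read the file line by line with readLines, replace the separator, then parse the result with read.table: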

# read with no sep 
x <- readLines("test.txt") 

# introduce new sep - "|" 
x <- gsub("\",\"", "\"|\"", x) 

# read with new sep 
read.table(text = x, sep = "|", header = TRUE) 

#   No                                                                  Comment Type
# 1  0                                                     he said:"wonderful."    A
# 2  1 The problem is: reading table, and also "a problem, yes." keep going on.    A
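A variant of the same idea, shown here only as a sketch, skips quote processing altogether: strip the outer quote from each line and split on the literal three-character sequence "," with strsplit. This is not the method from the answer above. It assumes every field is quoted and that the sequence "," never occurs inside a field. The sample lines are the ones from the question, and the names inner, parts and df are illustrative.

```r
# Sample lines from the question, embedded so the example is self-contained.
lines <- c(
  '"No","Comment","Type"',
  '"0","he said:"wonderful|"","A"',
  '"1","Pr/ "d/s". "a", n) ","B"'
)

# Remove stray whitespace, then drop the opening and closing quote of each line.
lines <- trimws(lines)
inner <- substr(lines, 2, nchar(lines) - 1)

# Split on the literal sequence quote-comma-quote (assumed absent from field text).
parts <- strsplit(inner, '","', fixed = TRUE)
stopifnot(all(lengths(parts) == 3))

# First element is the header, the rest are data rows.
df <- as.data.frame(do.call(rbind, parts[-1]), stringsAsFactors = FALSE)
names(df) <- parts[[1]]
str(df)
```

All columns come back as character, so numeric columns such as No need an explicit as.integer() afterwards.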

Source

2017-03-21 23:57:44 zx8754

What if the original text already contains '|'? –

@A.Yin Then use a character that does not occur in the text. – zx8754
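One way to pick such a character automatically is sketched below: test a few candidates against the raw lines and keep the first one that never appears. The candidate list and the names candidates, is_used and new_sep are assumptions made for illustration. The sample lines are the ones from the question, which do contain '|'.

```r
# Sample lines from the question; note the "|" inside the second line.
lines <- c(
  '"No","Comment","Type"',
  '"0","he said:"wonderful|"","A"',
  '"1","Pr/ "d/s". "a", n) ","B"'
)

# Hypothetical candidate separators, tried in order.
candidates <- c("|", ";", "\t", "~")

# TRUE for every candidate that occurs somewhere in the text.
is_used <- vapply(candidates,
                  function(ch) any(grepl(ch, lines, fixed = TRUE)),
                  logical(1))
unused <- candidates[!is_used]
if (length(unused) == 0) stop("no unused candidate separator found")
new_sep <- unused[1]

# Replace the quote-comma-quote sequence with quote-separator-quote.
x <- gsub('","', paste0('"', new_sep, '"'), lines, fixed = TRUE)
print(new_sep)
print(x)
```

The resulting x and new_sep can then be passed to read.table(text = x, sep = new_sep, header = TRUE) as in the answer.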
