# coding: utf-8
import re
import nltk
from nltk.tokenize import regexp_tokenize
# Adjacent string literals inside parentheses are joined into one string,
# so the long sample text can span several lines without a syntax error.
text = (
    "\"Predictions suggesting that large changes in weight will "
    "accumulate indefinitely in response to small sustained lifestyle "
    "modifications rely on the half-century-old 3,500 calorie rule, which "
    "equates a weight alteration of 2.2 lb to a 3,500 calories cumulative "
    "deficit or increment,\" write the study authors Dr. Jampolis, Dr. "
    "Chaudry, and Prof. Harlen, from N.P.C Clinic in OH. The 3,500- calorie "
    "rule \"predicts that a person who increases daily energy expenditure by "
    "100 calories by walking 1 mile per day\" will lose 50 pounds over five "
    "years, the authors say. But the true weight loss is only about 10 "
    "pounds if calorie intake doesn't increase, \"because changes in mass "
    "... alter the energy requirements of the body’s make-up.\" \"This is a "
    "myth, strictly speaking, but the smaller amount of weight loss achieved "
    "with small changes is clinically significant and should not be "
    "discounted,\" says Dr. Melina Jampolis, CNN diet and fitness expert."
)
# Raw string so that \d, \w and \S reach the regex engine unchanged.
print(regexp_tokenize(text, pattern=r'(?:(?!\d)\w)+|\S+'))
It's not clear to me what your desired output is. – rahlf23
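The desired output is the tokenized text, but without splitting on punctuation such as the apostrophe ("doesn't" should remain one token) and without splitting abbreviations (they should also stay as one token). – user3432543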
So basically you just want to remove "/", "\", "," and quotation marks? – rahlf23
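
One possible way to get the behaviour asked for in the comments (abbreviations and words containing apostrophes kept as single tokens) is an ordered alternation, where the more specific patterns are tried first. The sketch below uses only the standard `re` module. The function name `tokenize`, the list of titles, and the exact sub-patterns are assumptions chosen for illustration, not something taken from the question, and they will not cover every abbreviation.

```python
import re

# Hypothetical pattern: alternatives are tried left to right, so the more
# specific ones (titles, dotted abbreviations, numbers) come first.
TOKEN_RE = re.compile(r"""
    (?:Dr|Prof|Mr|Mrs|Ms)\.         # a few titles, kept with their period
  | (?:[A-Z]\.)+(?:[A-Z]\b)?        # dotted abbreviations such as N.P.C
  | \d+(?:,\d{3})*(?:\.\d+)?        # numbers such as 3,500 or 2.2
  | \w+(?:['’-]\w+)*                # words with inner apostrophes or hyphens
  | \.\.\.                          # an ellipsis as one token
  | \S                              # any other single non-space character
""", re.VERBOSE)


def tokenize(text):
    """Return the list of tokens matched by TOKEN_RE, in order."""
    return TOKEN_RE.findall(text)


sample = "Dr. Jampolis, from N.P.C Clinic, says the 3,500 calorie rule doesn't hold."
print(tokenize(sample))
```

With this pattern `Dr.`, `N.P.C`, `3,500` and `doesn't` each come out as one token, while commas and the final period are split off. The same pattern string could presumably be passed to `regexp_tokenize` as well, but a single capital letter at the end of a sentence (for example "I.") would also be treated as an abbreviation, so the title list and abbreviation rule would need tuning for real text.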