First, see "How to iterate through all nodes of a tree?" and "How to navigate a nltk.tree.Tree?".
>>> from nltk.tree import Tree
>>> bracket_parse = "(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))"
>>> ptree = Tree.fromstring(bracket_parse)
>>> ptree
Tree('S', [Tree('VP', [Tree('VB', ['get']), Tree('NP', [Tree('PRP', ['me'])]), Tree('ADVP', [Tree('RB', ['now'])])])])
>>> for subtree in ptree.subtrees():
...     print(subtree)
...
(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))
(VP (VB get) (NP (PRP me)) (ADVP (RB now)))
(VB get)
(NP (PRP me))
(PRP me)
(ADVP (RB now))
(RB now)
What you are looking for is Tree.productions() (https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L341), which returns a list of Production objects:
>>> ptree.productions()
[S -> VP, VP -> VB NP ADVP, VB -> 'get', NP -> PRP, PRP -> 'me', ADVP -> RB, RB -> 'now']
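Because each item in that list is a Production, the two sides of a rule can be read off directly instead of being parsed back out of a string. A minimal sketch, assuming NLTK is installed; the names lexical and phrasal are only illustrative:

```python
from nltk.tree import Tree

ptree = Tree.fromstring("(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))")

# Split the rules into lexical ones (the right-hand side contains a word)
# and phrasal ones (the right-hand side is made of nonterminals only).
lexical = []
phrasal = []
for prod in ptree.productions():
    lhs = prod.lhs().symbol()  # label of the parent node as a plain string
    rhs = prod.rhs()           # tuple of Nonterminal objects and/or word strings
    if prod.is_lexical():
        lexical.append((lhs, rhs[0]))
    else:
        phrasal.append((lhs, [nt.symbol() for nt in rhs]))

print(lexical)
print(phrasal)
```

Here lexical collects the (tag, word) pairs and phrasal collects the parent label together with its child labels.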
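Note that the items returned by productions() are Production objects, not strings; see https://github.com/nltk/nltk/blob/develop/nltk/tree.py#L22 and https://github.com/nltk/nltk/blob/develop/nltk/grammar.py#L236. If you want the grammar rules as strings, you can do either of the following: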
>>> for rule in ptree.productions():
...     print(rule)
...
S -> VP
VP -> VB NP ADVP
VB -> 'get'
NP -> PRP
PRP -> 'me'
ADVP -> RB
RB -> 'now'
Or:
>>> rules = [str(p) for p in ptree.productions()]
>>> rules
['S -> VP', 'VP -> VB NP ADVP', "VB -> 'get'", 'NP -> PRP', "PRP -> 'me'", 'ADVP -> RB', "RB -> 'now'"]
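For intuition about where these rules come from: every non-leaf node contributes one rule, with its label on the left and its children's labels (or quoted words) on the right, visited parent first. The following is a rough pure-Python sketch of that idea, not NLTK's actual implementation; parse_brackets, productions and sketch_rules are hypothetical names made up for this illustration, and the parser assumes well-formed input without spaces inside tokens.

```python
def parse_brackets(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1  # skip ")"
        return (label, children)

    return node()


def productions(tree):
    """Yield one 'LHS -> RHS' rule string per non-leaf node, parent first."""
    label, children = tree
    # Subtrees contribute their label; leaf words are shown quoted.
    rhs = [c[0] if isinstance(c, tuple) else repr(c) for c in children]
    yield f"{label} -> {' '.join(rhs)}"
    for child in children:
        if isinstance(child, tuple):
            yield from productions(child)


tree = parse_brackets("(S (VP (VB get) (NP (PRP me)) (ADVP (RB now))))")
sketch_rules = list(productions(tree))
print(sketch_rules)
```

For this example tree the sketch yields the same rule strings as the list above; in real code, prefer Tree.productions().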