NLP: Stanford Parser using NLTK

Original

毛球饲养员

2020-02-22 12:21

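The official Stanford Parser distribution is inconvenient to use directly: its parameters are not documented in detail, and I could not find good material elsewhere. So I decided to use Python with NLTK to get the constituency parser and the dependency parser.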

I. Installing Python

Operating system: Windows 10
JDK (version 1.8.0_151)
Anaconda (version 4.4.0), Python (version 3.6.1)
(Installation steps omitted.)
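The Stanford Parser is a Java program, so NLTK needs a working `java` command. Here is a minimal sketch for checking that before going further. The helper names `find_java` and `describe_java` are my own and are not part of NLTK.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


def describe_java(java_path):
    """Return the first line of `java -version`, or a hint if java is missing."""
    if java_path is None:
        return "java not found on PATH"
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else "java produced no version output"
```

Calling `print(describe_java(find_java()))` should print the Java version banner if the JDK is installed correctly.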

II. Installing NLTK

pip install nltk

After the installation finishes, start the Python interpreter and enter:

import nltk
nltk.download()


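A window then pops up. I do not fully understand the details yet, but it is roughly a list of resource packages that NLTK offers, so I simply downloaded all of them.

That completes the installation.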
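If the download window is not available (for example on a machine without a display), `nltk.download` also accepts a package name. Here is a minimal sketch, assuming you know which packages you want. The helper `download_resources` is hypothetical, and `"punkt"` in the usage note is only an example name, not a requirement of the Stanford Parser.

```python
def download_resources(names, download=None):
    """Download each named NLTK resource and return {name: succeeded}."""
    if download is None:
        # Imported lazily so the helper itself does not require NLTK to load.
        import nltk
        download = nltk.download
    # nltk.download(name, quiet=True) returns True on success, False otherwise.
    return {name: bool(download(name, quiet=True)) for name in names}
```

For example, `download_resources(["punkt"])` fetches a single package without opening the window.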

III. Stanford Parser with NLTK

A few simple demos of using the Stanford Parser without setting the CLASSPATH.
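Instead of a CLASSPATH, the demos point NLTK at the two jar files through the `STANFORD_PARSER` and `STANFORD_MODELS` environment variables. A wrong path only shows up later as a lookup error, so a small check up front can help. This is a sketch only: `configure_stanford_parser` is a hypothetical helper of mine, not an NLTK function.

```python
import os


def configure_stanford_parser(parser_jar, models_jar, environ=os.environ):
    """Check that both jar files exist, then export them for NLTK."""
    missing = [p for p in (parser_jar, models_jar) if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("missing jar file(s): " + ", ".join(missing))
    environ["STANFORD_PARSER"] = parser_jar
    environ["STANFORD_MODELS"] = models_jar
```

With the paths used in this post, the call would be `configure_stanford_parser('./model/stanford-parser.jar', './model/stanford-parser-3.8.0-models.jar')`.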

1. Constituency Parser

# -*- coding: utf-8 -*-
import os
from nltk.parse.stanford import StanfordParser

os.environ['STANFORD_PARSER'] = './model/stanford-parser.jar'
os.environ['STANFORD_MODELS'] = './model/stanford-parser-3.8.0-models.jar'

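# Load the English PCFG model from inside the models jar.
parser = StanfordParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
# raw_parse() takes an untokenized sentence and returns an iterator of parse trees.
trees = parser.raw_parse("the quick brown fox jumps over the \" lazy \" dog .")
# Text output instead of the GUI:
# for tree in trees:
#     print(tree)

# GUI: open each parse tree in its own window (requires Tkinter).
for tree in trees:
    tree.draw()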
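The objects returned by the constituency parser are `nltk.tree.Tree` instances, so they can be inspected without the GUI. The sketch below builds a tree from a hand-written bracketed string (my own illustration, not real parser output), so it runs without Java or the jar files.

```python
from nltk.tree import Tree

# Hand-written example in the bracketed format; NOT actual Stanford Parser output.
bracketed = "(ROOT (S (NP (DT the) (JJ quick) (NN fox)) (VP (VBZ jumps))))"
tree = Tree.fromstring(bracketed)

tokens = tree.leaves()   # the words, left to right
tagged = tree.pos()      # (word, POS tag) pairs
# Collect the text of every noun phrase (subtrees labelled NP).
noun_phrases = [
    " ".join(sub.leaves())
    for sub in tree.subtrees(lambda t: t.label() == "NP")
]

tree.pretty_print()      # ASCII drawing in the console, no window needed
```

The same calls should work on each tree yielded by `raw_parse`, which is handy when `draw()` cannot open a window.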

2. Dependency Parser

# -*- coding: utf-8 -*-
import os
from nltk.parse.stanford import StanfordDependencyParser

os.environ['STANFORD_PARSER'] = './model/stanford-parser.jar'
os.environ['STANFORD_MODELS'] = './model/stanford-parser-3.8.0-models.jar'

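parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")
# raw_parse() takes an untokenized sentence and returns an iterator of DependencyGraph objects.
graphs = parser.raw_parse("the quick brown fox jumps over the lazy dog")
# for graph in graphs:
#     print(graph)

# parse() takes a list of tokens instead of a raw string.
res = list(parser.parse("the quick brown fox jumps over the lazy dog .".split()))
# Each triple is ((head word, head tag), relation, (dependent word, dependent tag)).
for row in res[0].triples():
    print(row)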
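Each item from `triples()` is a nested tuple of the form `((head word, head tag), relation, (dependent word, dependent tag))`, which is hard to read when printed raw. Here is a small sketch that rewrites a triple in the familiar `relation(head, dependent)` style. The `format_triple` helper and the sample triples are hand-written for illustration and are not real parser output.

```python
def format_triple(triple):
    """Render ((head, head_tag), rel, (dep, dep_tag)) as 'rel(head/tag, dep/tag)'."""
    (head_word, head_tag), relation, (dep_word, dep_tag) = triple
    return "{}({}/{}, {}/{})".format(relation, head_word, head_tag, dep_word, dep_tag)


# Illustrative triples in the same shape that triples() yields.
sample_triples = [
    (("jumps", "VBZ"), "nsubj", ("fox", "NN")),
    (("fox", "NN"), "det", ("the", "DT")),
]
formatted = [format_triple(t) for t in sample_triples]
for line in formatted:
    print(line)
```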

The final version, which parses an input file line by line and writes the results in CoNLL format:

# -*- coding: utf-8 -*-

import os
from nltk.parse.stanford import StanfordDependencyParser

os.environ['STANFORD_PARSER'] = './model/stanford-parser.jar'
os.environ['STANFORD_MODELS'] = './model/stanford-parser-3.8.0-models.jar'

parser = StanfordDependencyParser(model_path="edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz")

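# The ./result directory must already exist.
with open("./data/raw.clean.test", encoding="utf-8") as fin, \
        open("./result/test.txt", "w", encoding="utf-8") as fout:
    for i, line in enumerate(fin, 1):
        # Keep only the text before the first "|||" and split it on whitespace.
        tokens = line.split("|||")[0].split()
        if tokens:  # skip blank lines
            # The single-element unpacking assumes one parse per sentence.
            graph, = parser.parse(tokens)
            # to_conll(4) emits one "word, tag, head, relation" row per token.
            fout.write(graph.to_conll(4))
            fout.write('\n')  # blank line between sentences
            fout.flush()
        print(i)  # progress: number of input lines processed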

The final output matches my needs very well.
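For reference, `to_conll(4)` writes one tab-separated row per token (word, tag, head index, relation), and the script puts a blank line between sentences. Here is a minimal sketch for reading such a file back, using only the standard library. The `read_conll4` function and the sample text are my own illustration, not real parser output.

```python
def read_conll4(text):
    """Split 4-column CoNLL text into sentences of (word, tag, head, rel) rows."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        word, tag, head, rel = line.split("\t")
        current.append((word, tag, int(head), rel))
    if current:
        sentences.append(current)
    return sentences


# Hand-written sample in the same layout as ./result/test.txt.
sample = (
    "the\tDT\t2\tdet\n"
    "fox\tNN\t3\tnsubj\n"
    "jumps\tVBZ\t0\troot\n"
    "\n"
    "it\tPRP\t2\tnsubj\n"
    "works\tVBZ\t0\troot\n"
)
parsed = read_conll4(sample)
# Head index 0 marks the root of each sentence.
roots = [row[0] for sent in parsed for row in sent if row[2] == 0]
print(roots)
```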

over
