Python programming examples: [Python] Text processing: splitting text into sentences with NLTK

2023-11-09

[Code]

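import nltk.data


def stripTags(s):
    # Drop everything between '<' and '>' (a simple HTML tag stripper).
    # intag is a one-element list so the nested function can update it.
    intag = [False]

    def chk(c):
        if intag[0]:
            # Inside a tag: stay inside until the closing '>' is seen.
            intag[0] = (c != '>')
            return False
        elif c == '<':
            intag[0] = True
            return False
        return True

    return ''.join(c for c in s if chk(c))


# Load the punkt sentence tokenizer once, not once per input line.
sent_detector = nltk.data.load('tokenizers/punkt/english.pickle')

with open("e:\\inputs.txt") as infile, open("e:\\outputs.txt", "a") as outfile:
    for line in infile:
        result = stripTags(line).strip()
        if not result:
            continue  # skip lines that are empty after tag removal
        # One sentence per output line; the trailing newline keeps the last
        # sentence of this line from running into the next line's output.
        outfile.write('\n'.join(sent_detector.tokenize(result)) + '\n')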
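The tag-stripping helper is a small two-state machine: it is either inside a tag or outside one, and only characters seen outside a tag are kept. As a minimal sketch of the same idea, the hypothetical variant below (named `strip_tags` here, which is not part of the original post) uses Python 3's `nonlocal` instead of the one-element list, and runs on a short sample string. It assumes that every `<` opens a tag, so a literal `<` in ordinary text would also be swallowed.

```python
def strip_tags(s):
    """Return s with everything between '<' and '>' removed.

    Hypothetical Python 3 rewrite of the post's stripTags helper.
    Assumes every '<' starts a tag; a bare '<' in text is treated as one too.
    """
    in_tag = False  # True while we are between '<' and the matching '>'

    def keep(c):
        nonlocal in_tag
        if in_tag:
            # Stay inside the tag until its closing '>' has been consumed.
            in_tag = (c != '>')
            return False
        if c == '<':
            in_tag = True
            return False
        return True

    return ''.join(c for c in s if keep(c))


sample = "<p>Hello <b>world</b>.</p>"
cleaned = strip_tags(sample)
print(cleaned)
```

For real HTML with comments, scripts, or entities, a proper parser is likely a safer choice than this character filter.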

[Caution]

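The code above requires NLTK to be installed and the punkt sentence tokenizer model to be downloaded.

Download and install NLTK:

See the link below for detailed instructions.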

http://blog.csdn.net/joey_su/article/details/17287559

Download punkt:

     import nltk
     nltk.download()

In the downloader window that opens, select the Models tab and download punkt.
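On a machine without a graphical display, the interactive downloader may be inconvenient. The sketch below shows one possible non-interactive route: it checks for the resource with `nltk.data.find` and, if that raises `LookupError`, fetches it by name with `nltk.download`. The helper name `ensure_punkt` is hypothetical and not from the original post. It also assumes the resource is called `punkt`; newer NLTK releases may expect a differently named resource (for example `punkt_tab`), so check this against your installed version.

```python
def ensure_punkt(resource="punkt"):
    """Download the punkt tokenizer data if it is not already installed.

    Hypothetical helper. The default resource name "punkt" is an
    assumption; some newer NLTK versions may need a different name.
    """
    import nltk  # imported lazily so this module loads even without NLTK

    try:
        # Raises LookupError when the resource is not in any NLTK data path.
        nltk.data.find("tokenizers/" + resource)
        return True
    except LookupError:
        # Fetches the named package without opening the downloader window.
        return nltk.download(resource)
```

Calling `ensure_punkt()` once before loading the tokenizer should make the script usable on a fresh machine, provided network access is available.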

