NLG for Fun: A Quick Automatic Headline Generator in Python

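Module: Markovify

The Python module used here is markovify.

Description of Markovify:

Markovify is a simple, extensible Markov chain generator. Right now, its primary use is building Markov models of large text corpora and generating random sentences from them. In theory, however, it could be used for other applications.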
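To make the idea of a Markov chain text generator concrete, here is a minimal pure-Python sketch. It is not markovify's implementation: the names `build_chain`, `generate`, `BEGIN`, `END` and the toy headlines are hypothetical and exist only for illustration. Each state is a tuple of `state_size` consecutive words, and the next word is drawn at random from the words that followed that state in the corpus.

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking the start and end of a line.
BEGIN = "___BEGIN__"
END = "___END__"


def build_chain(lines, state_size=2):
    """Map each state (a tuple of words) to the list of words that followed it."""
    chain = defaultdict(list)
    for line in lines:
        words = line.split()
        if not words:
            continue
        # Pad with BEGIN markers so the first real word has a full-length state.
        padded = [BEGIN] * state_size + words + [END]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i:i + state_size])
            chain[state].append(padded[i + state_size])
    return chain


def generate(chain, state_size=2, rng=random, max_words=30):
    """Walk the chain from the BEGIN state until END (or max_words) is reached."""
    state = (BEGIN,) * state_size
    output = []
    while len(output) < max_words:
        next_word = rng.choice(chain[state])
        if next_word == END:
            break
        output.append(next_word)
        # Slide the window: drop the oldest word, append the new one.
        state = state[1:] + (next_word,)
    return " ".join(output)


# Toy corpus, invented for this sketch (not taken from the dataset).
toy_headlines = [
    "council approves new water plan",
    "council rejects new water charges",
    "farmers welcome new water plan",
]
chain = build_chain(toy_headlines, state_size=1)
print(generate(chain, state_size=1))
```

With `state_size=1` the sketch can splice headlines together at any shared word (here, "new" and "water"). A larger `state_size` requires longer shared word sequences before switching, so the output tends to stay closer to the original text.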

About the dataset:

The dataset can be downloaded from Kaggle Datasets.

Load the required packages

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import markovify #Markov Chain Generator

Read the input text file

inp = pd.read_csv('../input/abcnews-date-text.csv')

inp.head(3)

   publish_date  headline_text
0  20030219      aba decides against community broadcasting lic…
1  20030219      act fire witnesses must be aware of defamation
2  20030219      a g calls for infrastructure protection summit

Build a text model with a Markov chain

# Treat each headline as one sentence; each state is two consecutive words
text_model = markovify.NewlineText(inp.headline_text, state_size=2)

Automatically generated headlines

# Print five randomly generated headlines
for i in range(5):
    print(text_model.make_sentence())

iron magnate poised to storm cleanup

meet the png government defends stockdale appointment

the twitter exec charged with animal cruelty trial

pm denies role in pregnancy

shoalhaven business boosts hunter
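One practical detail: markovify's `make_sentence()` returns `None` when it cannot produce a sentence that differs enough from the original text within its allowed number of tries, so a loop like the one above may occasionally print `None`. The sketch below shows one possible way to keep only successful results. The helper name `collect_headlines` and its `max_attempts` parameter are hypothetical, and a canned stand-in replaces the real model so the example runs without the dataset.

```python
def collect_headlines(make_sentence, count, max_attempts=100):
    """Call make_sentence until `count` non-None results are collected.

    Gives up after `max_attempts` calls, so the result may be shorter
    than `count`.
    """
    headlines = []
    for _ in range(max_attempts):
        if len(headlines) >= count:
            break
        sentence = make_sentence()
        if sentence is not None:
            headlines.append(sentence)
    return headlines


# Stand-in for text_model.make_sentence (an assumption for this sketch):
# it replays canned values, including None for failed attempts.
canned = iter([
    None,
    "pm denies role in pregnancy",
    None,
    "shoalhaven business boosts hunter",
])
headlines = collect_headlines(lambda: next(canned), count=2)
print(headlines)
```

With the real model, the call would be `collect_headlines(text_model.make_sentence, 5)`.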