Counting word frequency in a text string with Python

Author: 依山带水  Date: 2021-11-10 17:38:48

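This example shows how to count how often each word occurs in a text string using Python. It splits the string on whitespace, strips trailing punctuation, lower-cases every word, tallies the words in a dictionary, and prints the counts in alphabetical order. The implementation is as follows: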


# word frequency in a text
# original by vegaseat, 25aug2005 (tested with Python24); shown here in Python 3 syntax
# Chinese wisdom ...
str1 = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""
print("Original string:")
print(str1)
print()
# create a list of words separated at whitespace
wordList1 = str1.split()
# strip trailing punctuation marks and build a modified word list
# start with an empty list
wordList2 = []
for word1 in wordList1:
    # last character of each word
    lastchar = word1[-1:]
    # compare it against a list of punctuation marks
    if lastchar in [",", ".", "!", "?", ";"]:
        # rstrip removes every trailing copy of that one character
        word2 = word1.rstrip(lastchar)
    else:
        word2 = word1
    # build a word list of lower-case modified words
    wordList2.append(word2.lower())
print("Word list created from modified string:")
print(wordList2)
print()
# create a word-frequency dictionary
# start with an empty dictionary
freqD2 = {}
for word2 in wordList2:
    # get() returns 0 the first time a word is seen
    freqD2[word2] = freqD2.get(word2, 0) + 1
# create a sorted list of the keys
# all words are lower case already
keyList = sorted(freqD2)
print("Frequency of each word in the word list (sorted):")
for key2 in keyList:
    print("%-10s %d" % (key2, freqD2[key2]))
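
The counting loop above can also be written with the standard library's collections.Counter. The following is a minimal sketch rather than part of the original example: the function name word_frequencies is made up for illustration, and it strips every character in string.punctuation from both ends of each word, which is broader than the five punctuation marks checked in the listing above.

```python
from collections import Counter
import string


def word_frequencies(text):
    """Return a Counter mapping each lower-cased word to its count.

    Hypothetical helper for illustration; not part of the original example.
    """
    # strip punctuation from both ends of every whitespace-separated token
    words = (token.strip(string.punctuation).lower() for token in text.split())
    # skip tokens that consisted of punctuation only
    return Counter(word for word in words if word)


sample = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""
freq = word_frequencies(sample)
for word in sorted(freq):
    print("%-10s %d" % (word, freq[word]))
```

Counter is a dict subclass, so the sorted-key printing loop from the listing works on it unchanged.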

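The listing prints the words in alphabetical order. If the most frequent words should come first instead, the dictionary items can be sorted on the count. This is a sketch under that assumption: the helper name sort_by_frequency and the small sample dictionary are illustrative only and do not come from the original example.

```python
def sort_by_frequency(freq):
    """Return (word, count) pairs, highest count first.

    Words with equal counts are ordered alphabetically.
    Hypothetical helper for illustration.
    """
    return sorted(freq.items(), key=lambda item: (-item[1], item[0]))


# a small hand-written frequency dictionary, used only for this demonstration
sample_freq = {"man": 2, "who": 2, "tired": 1, "car": 2, "behind": 1}
ranked = sort_by_frequency(sample_freq)
for word, count in ranked:
    print("%-10s %d" % (word, count))
```

Negating the count in the sort key keeps the sort ascending overall, so ties still fall back to alphabetical order.
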
I hope this article is helpful for your Python programming.

Tags: python, statistics, strings