Mysun

Counting Word Occurrences with a Regular Expression

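Given a block of text, use a regular expression to count how many times each distinct word occurs in it. The example below restricts the count to words that begin with a lowercase letter.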

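import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

	/*
	 * Uses a regular expression to count the occurrences of every word
	 * in a block of text that begins with a lowercase letter.
	 */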
	public void testRegex(){
		String POEM =
			"Twas brillig, and the slithy toves\n" +
			"Did gyre and Gimble in the wabe.\n" +
			"All mimsy were the borogoves,\n" +
			"And the mome raths outgrabe.\n\n" +
			"Beware the Jabberwock, my son,\n" +
			"The jaws that bite, the claws that catch.\n" +
			"Beware the Jubjub bird, and shun\n" +
			"The frumious Bandersnatch.";
		Map<String, Integer> wordCount = new HashMap<String, Integer>();
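		// \\b marks a word boundary. It is used at both the start and the end of
		// the pattern so that each match is a whole word. [a-z] requires the first
		// character to be a lowercase letter, and \\w* matches zero or more further
		// word characters, so one- and two-letter words such as "my" are counted too.
		Matcher m = Pattern.compile("\\b[a-z]\\w*\\b").matcher(POEM);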
			
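		while (m.find()) {
			String word = m.group();
			if (wordCount.containsKey(word)) {
				// Word seen before: increment its count
				wordCount.put(word, wordCount.get(word) + 1);
			} else {
				// First occurrence of this word
				wordCount.put(word, 1);
			}
		}
		// HashMap does not guarantee iteration order, so the printed order may vary
		System.out.println(wordCount);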
	}
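
For reference, the same idea can be sketched as a small self-contained class. This is only an illustration under a few assumptions: the class name `WordCountSketch` and the method `countLowercaseWords` are hypothetical, the pattern `\\b[a-z]\\w*\\b` is assumed (it matches any word starting with a lowercase ASCII letter, including one- and two-letter words), and a `TreeMap` is used so that the printed order is alphabetical rather than arbitrary. `Map.merge` requires Java 8 or later.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are illustrative only.
class WordCountSketch {
    // A word that starts with a lowercase ASCII letter, followed by any word characters.
    private static final Pattern LOWERCASE_WORD = Pattern.compile("\\b[a-z]\\w*\\b");

    // Returns each matching word mapped to its number of occurrences, sorted by word.
    static Map<String, Integer> countLowercaseWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = LOWERCASE_WORD.matcher(text);
        while (m.find()) {
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "Beware the Jabberwock, my son,\n"
                    + "The jaws that bite, the claws that catch.";
        // prints {bite=1, catch=1, claws=1, jaws=1, my=1, son=1, that=2, the=2}
        System.out.println(countLowercaseWords(text));
    }
}
```

Capitalized words such as "Beware", "Jabberwock" and "The" are skipped because `\\b` only matches at the start of a word, where the pattern then demands a lowercase letter.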