一个小题目（单词统计）

evasiu

浏览: 165459 次
性别:
来自: 广州

最近访客更多访客>>

redky

toilet22

ilotuo

金武飞扬

博主相关

博客

微博

相册

留言

关于我

博客专栏

: TCP/IP详解卷一>阅读...
浏览量：12263

文章分类

社区版块

存档分类

博客分类：

实践
编程

今天休息的时候看到一个关于单词统计的小题目：

统计一段由字符和和空格组成的字符串中有多少个单词

题目一看觉得很简单，无非是遍历字符串，然后根据字母是不是空格之类的来统计单词的个数。博主用了一个状态机来做这件事，我觉得颇有新意，所以就记下心来了。后面的留言有人觉得博主这是把简单问题复杂化，其实我觉得不然。最近在看《设计模式精解》，开篇作者提出的观点我就觉得非常好：需求总是在发生变化的，我们必须让我们写出的代码能够适应变化，而不是为预防变化而费尽必力。一语中的，想起大三的时候刚参与实习的时候，那个时候认为需求一旦定下来了就没有理由变，从而把责任怪罪到定义需求的人上面去，其实是多么狭隘。同样，我觉得两种做法的区别就是，后者更能适应需求的变化。想想，如果字符串除了空格和字符，还加上了各种符号啊之类的，前者的代码写起来就不那么好看了，从而也不好维护了。

下面是我写的两种实现方法：

#include<iostream>
using namespace std;

/** description:
 ** given a string of alpha and spaces, 
 ** return how many words are there in the string
 ** Words are seperated by space(s)
 **/

int wordsCount( const char* str )
{
	if( str == NULL )
		return 0;
	int count = 0;
	char cur;
	while( cur = *str )
	{
		if( cur == ' ' )
			while( *(++str)!='\0' && *str==' ' );
		else
		{
			count++;
			while( *(++str)!='\0' && *str!=' ' );
		}
	}
	return count;
}

/** now let's try another solution with state machine */
int wordsCountSM( const char* str )
{
	enum STATE { InitState, SpaceState, AlphaState };
	int count = 0;
	char cur;
	STATE state = InitState;
	while( cur = *str++ )
	{
		switch(state)
		{
			case InitState:
				if( cur == ' ' )
					state = SpaceState;
				else
					state = AlphaState;
				break;
			case SpaceState:
				if( cur != ' ' )
					state = AlphaState;
				break;
			case AlphaState:
				if( cur == ' ' )
				{
					state = SpaceState;
					count ++;
				}
				break;
			default:
				break;
		}
	}
	if ( state == AlphaState )
		count++;
	return count;
}


int main()
{
	char* test1 = "this is a normal string";
	char* test2 = "  this is a abnormal string";
	char* test3 = "this is an abnormal string  ";
	char* test4 = "  this is    an abnormal string   ";
	char* tests[4] = { test1, test2, test3, test4 };
	for( int i=0; i<4; i++ )
	{
		cout<<wordsCount(tests[i])<<' '<<wordsCountSM(tests[i])<<endl;
	}
	return 0;
}

相比之下，其实写第一种实现方法的时候花了我比较多的时间，因为我很害怕出错，而第二种实现方法很规则，只要照着状态机的转移图写就行了，不容易出错，下面是我画的状态图：

单词数增加的途径只有两个：从AlphaState->SpaceState，或者是从AlphaState到字符串结束。当需求发生变化时，我们或者添加新的状态，或者对状态的转移条件做一定的修改，很方便，而且不容易出错。

查看图片附件

分享到：

我对KM算法的理解 | 数独人工解法的一些技巧及其python实现

2012-08-14 23:12
浏览 1261
评论(0)
分类:编程语言
查看更多

发表评论

您还没有登录,请您登录后再发表评论