POJ 3294 Life Forms: finding the longest substring shared by n (n > 1) strings with a suffix array

ykk81ykk


Description

You may have wondered why most extraterrestrial life forms resemble humans, differing by superficial traits such as height, colour, wrinkles, ears, eyebrows and the like. A few bear no human resemblance; these typically have geometric or amorphous shapes like cubes, oil slicks or clouds of dust. The answer is given in the 146th episode of Star Trek - The Next Generation, titled The Chase. It turns out that the vast majority of the quadrant's life forms ended up with a large fragment of common DNA. Given the DNA sequences of several life forms represented as strings of letters, you are to find the longest substring that is shared by more than half of them.

Input

Standard input contains several test cases. Each test case begins with 1 ≤ n ≤ 100, the number of life forms. n lines follow; each contains a string of lower case letters representing the DNA sequence of a life form. Each DNA sequence contains at least one and not more than 1000 letters. A line containing 0 follows the last test case.
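Output

For each test case, output the longest substring that occurs in more than half of the strings. If several substrings qualify, output all of them in lexicographic order. If there is no such substring, output "?". Print a blank line after each test case.

Sample Output

bcdefg
cdefgh

?

Algorithm: binary search on the answer length plus a suffix array. Concatenate the n strings, separating them with pairwise distinct characters that do not occur in any string, and build the suffix array. Then binary search on the length: for a candidate length, split the sorted suffixes into groups of consecutive suffixes whose height is at least that length, and check whether some group contains suffixes from at least k of the original strings, where k = m/2 + 1 (more than half). The approach runs in O(n log n), where n is the total length of the concatenated string.

    #include <cstdio>
    #include <cstring>
    using namespace std;

    /// Suffix array, prefix-doubling algorithm
    const int maxn=500000;
    char str[maxn];
    int wa[maxn],wb[maxn],wv[maxn],wn[maxn],a[maxn],sa[maxn];

    int cmp(int* r,int a,int b,int l)
    {return r[a]==r[b]&&r[a+l]==r[b+l];}

    /** n is the string length, m is the size of the character range, r is the
        string. j below is the length of the substrings sorted in each round. */
    void DA(int* r,int* sa,int n,int m)
    {
        int i,j,p,*x=wa,*y=wb,*t;
        /// radix sort the substrings of length 1 in r
        for(i=0;i<m;i++)wn[i]=0;
        for(i=0;i<n;i++)wn[x[i]=r[i]]++;
        for(i=1;i<m;i++)wn[i]+=wn[i-1];
        for(i=n-1;i>=0;i--)sa[--wn[x[i]]]=i;
        for(j=1,p=1;p<n;j*=2,m=p)
        {
            /// order by the second key, reusing the previous round's sa
            for(p=0,i=n-j;i<n;i++)y[p++]=i;
            for(i=0;i<n;i++)if(sa[i]>=j)y[p++]=sa[i]-j;
            /// radix sort by the first key
            for(i=0;i<n;i++)wv[i]=x[y[i]];
            for(i=0;i<m;i++)wn[i]=0;
            for(i=0;i<n;i++)wn[wv[i]]++;
            for(i=1;i<m;i++)wn[i]+=wn[i-1];
            for(i=n-1;i>=0;i--)sa[--wn[wv[i]]]=y[i];
            /// when p==n every suffix already has a distinct rank
            /// after the first round the largest rank is below p, so set m=p
            for(t=x,x=y,y=t,p=1,x[sa[0]]=0,i=1;i<n;i++)
                x[sa[i]]=cmp(y,sa[i-1],sa[i],j)?p-1:p++;
        }
    }

    /** A character 0 is appended to the string, so it is necessarily the
        smallest suffix. All other characters must be greater than 0 (the
        doubling algorithm requires this), so the common prefix of the
        second-ranked suffix and the 0 character, i.e. height[1], is 0.
        The height array covers [1..n], so the call is calheight(r,sa,n),
        not calheight(r,sa,n+1). */
    /// rank array; named rnk to avoid a clash with std::rank
    int rnk[maxn],height[maxn];
    void calheight(int* r,int* sa,int n)
    {
        int i,j,k=0;
        for(i=1;i<=n;i++)rnk[sa[i]]=i;
        for(i=0;i<n;height[rnk[i++]]=k)
            for(k?k--:0,j=sa[rnk[i]-1];r[i+k]==r[j+k];k++);
    }

    // Longest substring shared by more than half of the strings
    int n=0;            // total length of the concatenated string
    int m;              // number of strings
    int l,r;
    int belong[maxn];   // index of the string a position belongs to, 0 for separators
    int cnt[200];

    // true if the mid characters starting at p lie inside a single input string
    int inside(int p,int mid)
    {return p+mid-1<n&&belong[p]&&belong[p+mid-1]==belong[p];}

    int _check(int mid)
    {
        memset(cnt,0,sizeof(cnt));
        int ans=0;
        for(int i=1;i<=n;++i)
        {
            // height below mid: a new group starts at sa[i]
            if(height[i]<mid&&ans){memset(cnt,0,sizeof(cnt));ans=0;}
            int p=sa[i];
            if(inside(p,mid)&&!cnt[belong[p]]){cnt[belong[p]]=1;++ans;}
            // more than half of the strings
            if(ans>=m/2+1) return 1;
        }
        return 0;
    }

    void print(int mid)
    {
        memset(cnt,0,sizeof(cnt));
        int ans=0,isp=0;
        for(int i=1;i<=n;++i)
        {
            if(height[i]<mid)
            {
                if(ans){memset(cnt,0,sizeof(cnt));ans=0;}
                isp=0;
            }
            int p=sa[i];
            if(inside(p,mid)&&!cnt[belong[p]]){cnt[belong[p]]=1;++ans;}
            // print each qualifying group once; groups come in lexicographic order
            if(!isp&&ans>=m/2+1)
            {
                isp=1;
                for(int j=0;j<mid;++j) printf("%c",a[p+j]-101+'a');
                printf("\n");
            }
        }
    }

    int main()
    {
        while(scanf("%d",&m)==1&&m)
        {
            n=0;
            for(int i=1;i<=m;i++)
            {
                scanf("%s",str);
                for(int j=0;str[j];j++)
                {
                    a[n]=str[j]-'a'+101;   // letters map to 101..126
                    belong[n++]=i;
                }
                a[n]=i;                    // distinct separator in 1..100
                belong[n++]=0;
            }
            a[n]=0;                        // terminating 0 character
            DA(a,sa,n+1,128);
            calheight(a,sa,n);
            l=1; r=1001;                   // a sequence has at most 1000 letters
            while(l<r)
            {
                int mid=(l+r)>>1;
                if(_check(mid)) l=mid+1;
                else r=mid;
            }
            int ans=r-1;
            if(!ans) printf("?\n");
            else print(ans);
            printf("\n");   // the problem requires an extra blank line
        }
        return 0;
    }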
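For small inputs, the binary-search-plus-suffix-array solution can be cross-checked against a direct enumeration. The following is only a minimal sketch, not part of the original solution: the function name bruteForceLifeForms is hypothetical, and it assumes the inputs are short enough that storing every substring is affordable (far too slow for the full limits of 100 strings of 1000 letters).

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical reference helper (an assumption, not from the original post).
// Returns, in lexicographic order, every longest substring that occurs in
// more than half of the inputs. An empty result corresponds to printing "?".
std::vector<std::string> bruteForceLifeForms(const std::vector<std::string>& dna) {
    // For every distinct substring, the set of input indices containing it.
    std::map<std::string, std::set<int>> owners;
    for (int id = 0; id < static_cast<int>(dna.size()); ++id)
        for (std::size_t start = 0; start < dna[id].size(); ++start)
            for (std::size_t len = 1; start + len <= dna[id].size(); ++len)
                owners[dna[id].substr(start, len)].insert(id);

    const std::size_t need = dna.size() / 2 + 1;  // strictly more than half
    std::size_t best = 0;
    std::vector<std::string> result;
    for (const auto& entry : owners) {            // std::map iterates in sorted order
        if (entry.second.size() < need) continue;
        if (entry.first.size() > best) {          // longer match found: restart the list
            best = entry.first.size();
            result.clear();
        }
        if (entry.first.size() == best) result.push_back(entry.first);
    }
    return result;
}
```

Calling bruteForceLifeForms({"abcdefg", "bcdefgh", "cdefghi"}) yields bcdefg and cdefgh, which matches the first block of the sample output. Feeding random short strings to both programs and comparing the results is a cheap way to catch off-by-one mistakes in the grouping by height.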


2012-07-06 09:52