[Code Jam 2009 Problem Analysis] Qualification Round - Alien Language

fanzy618


Blog category: Algorithms

Python, F#, Google

The problem statement is as follows:
Problem
After years of study, scientists at Google Labs have discovered an alien language transmitted from a faraway planet. The alien language is very unique in that every word consists of exactly L lowercase letters. Also, there are exactly D words in this language.
Once the dictionary of all the words in the alien language was built, the next breakthrough was to discover that the aliens have been transmitting messages to Earth for the past decade. Unfortunately, these signals are weakened due to the distance between our two planets and some of the words may be misinterpreted. In order to help them decipher these messages, the scientists have asked you to devise an algorithm that will determine the number of possible interpretations for a given pattern.
A pattern consists of exactly L tokens. Each token is either a single lowercase letter (the scientists are very sure that this is the letter) or a group of unique lowercase letters surrounded by parenthesis ( and ). For example: (ab)d(dc) means the first letter is either a or b, the second letter is definitely d and the last letter is either d or c. Therefore, the pattern (ab)d(dc) can stand for either one of these 4 possibilities: add, adc, bdd, bdc.
Input
The first line of input contains 3 integers, L, D and N separated by a space. D lines follow, each containing one word of length L. These are the words that are known to exist in the alien language. N test cases then follow, each on its own line and each consisting of a pattern as described above. You may assume that all known words provided are unique.
Output
For each test case, output
Case #X: K
where X is the test case number, starting from 1, and K indicates how many words in the alien language match the pattern.

Limits
Small dataset
1 ≤ L ≤ 10
1 ≤ D ≤ 25
1 ≤ N ≤ 10
Large dataset
1 ≤ L ≤ 15
1 ≤ D ≤ 5000
1 ≤ N ≤ 500
Sample

Input
3 5 4
abc
bca
dac
dbc
cba
(ab)(bc)(ca)
abc
(abc)(abc)(abc)
(zyx)bc

Output
Case #1: 2
Case #2: 1
Case #3: 3
Case #4: 0
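
To make the pattern notation from the problem statement concrete, here is a minimal sketch that splits a pattern into one set of letters per token and lists every string it can stand for. The helper names `parse_pattern` and `expand` are illustrative only and are not part of the solution discussed in this post. Full expansion grows exponentially with the number of groups, so it is meant purely as an illustration, not as a way to solve the large dataset.

```python
from itertools import product


def parse_pattern(pattern):
    """Split a pattern such as '(ab)d(dc)' into one set of letters per token."""
    tokens = []
    group = None  # letters collected inside the current parentheses, if any
    for ch in pattern:
        if ch == '(':
            group = set()
        elif ch == ')':
            tokens.append(group)
            group = None
        elif group is not None:
            group.add(ch)       # letter inside a group
        else:
            tokens.append({ch})  # a single, certain letter
    return tokens


def expand(pattern):
    """Return every string the pattern can stand for, in sorted order."""
    return sorted(''.join(letters) for letters in product(*parse_pattern(pattern)))


print(expand('(ab)d(dc)'))  # ['adc', 'add', 'bdc', 'bdd']
```

These are the same 4 possibilities the problem statement lists for (ab)d(dc).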

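The problem itself is not difficult; the only thing that needs attention is speed.
My approach is to find, for each position (token) of the pattern, the set of words that match it, and then intersect all of these sets. The number of elements in the resulting set is the answer.
To look up the word set for each position quickly, first traverse the word list and build, for every position, an index from letters to sets of words.
index is a sequence of L map containers, one per position. Each map takes a letter x to the set of words whose letter at that position is x.
With this index, matching a pattern against the words reduces to simple set operations.
For example, the pattern (ab)b(ac) corresponds to
wordSet = (index[0][a] | index[0][b]) & index[1][b] & (index[2][a] | index[2][c])
where "|" is set union and "&" is set intersection.
len(wordSet) is the required result.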
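
The set expression above can be tried out on the sample dictionary. The following is a small sketch under two assumptions: the word list is hard-coded instead of being read from the input file, and `at` is a hypothetical helper added here so that a letter missing from the index yields an empty set instead of a KeyError.

```python
words = ['abc', 'bca', 'dac', 'dbc', 'cba']  # sample dictionary, L = 3

# index[p][c] = set of words whose letter at position p is c
index = [{} for _ in range(3)]
for word in words:
    for pos, ch in enumerate(word):
        index[pos].setdefault(ch, set()).add(word)


def at(pos, ch):
    """Words with letter ch at position pos (empty set if there are none)."""
    return index[pos].get(ch, set())


# Sample pattern (ab)(bc)(ca): union inside each token, intersection across tokens
word_set = ((at(0, 'a') | at(0, 'b'))
            & (at(1, 'b') | at(1, 'c'))
            & (at(2, 'c') | at(2, 'a')))
print(sorted(word_set), len(word_set))  # ['abc', 'bca'] 2
```

The count of 2 agrees with Case #1 of the sample output.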
The full code is as follows:

import sys

f = open(sys.argv[1])
L, D, N = [int(i) for i in f.readline().split()]

# index[offset][letter] is the set of words whose letter at position offset is letter
index = [{} for i in range(L)]
for i in range(D):
    word = f.readline().strip()
    for offset, l in enumerate(word):
        index[offset].setdefault(l, set()).add(word)

for i in range(1, N+1):
    testcase = f.readline().strip()
    state = False
    offset = 0
    result = None
    for l in testcase:
        if l == '(':
            state = True
            s = set()
        elif l == ')':
            state = False
            offset += 1
            if result is None:
                result = s
            else:
                result &= s
            del s
        else:
            wordset = index[offset].get(l, set())
            if state:
                s |= wordset
            else:
                if not wordset:  # wordset is an empty set
                    print("Case #%d: 0" % i)
                    break
                if result is None:
                    result = wordset.copy()
                else:
                    result &= wordset
                if not result:
                    print("Case #%d: 0" % i)
                    break
                offset += 1
    else:
        print("Case #%d: %d" % (i, len(result)))
f.close()
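
As a way to check the index-based program on small inputs, one can compare it against a direct test of every word against every token. This is a different and simpler technique than the one used above, shown only as a cross-check. The names `parse_pattern` and `count_matches` are hypothetical, and the sketch assumes well-formed patterns made of lowercase letters and parentheses.

```python
import re


def parse_pattern(pattern):
    """Turn '(ab)d(dc)' into [{'a', 'b'}, {'d'}, {'d', 'c'}]."""
    # Each match is either a parenthesised group or a single letter;
    # exactly one of the two captured strings is non-empty.
    return [set(group or single)
            for group, single in re.findall(r'\(([a-z]*)\)|([a-z])', pattern)]


def count_matches(words, pattern):
    """Count the words whose every letter is allowed by the matching token."""
    tokens = parse_pattern(pattern)
    return sum(all(ch in allowed for ch, allowed in zip(word, tokens))
               for word in words)


words = ['abc', 'bca', 'dac', 'dbc', 'cba']
patterns = ['(ab)(bc)(ca)', 'abc', '(abc)(abc)(abc)', '(zyx)bc']
for i, pattern in enumerate(patterns, 1):
    print("Case #%d: %d" % (i, count_matches(words, pattern)))
```

On the sample input this prints the same four lines as the sample output. It does roughly D * L membership tests per pattern, which should be enough to validate the set-based version on the sample and on small hand-made cases.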

2009-09-16 16:14