DP31 取大小游戏中的最优策略（附：如何对DP[0][n]进行对角线递推） Optimal Strategy for a Game @geeksforgeeks

xitonga

浏览: 596466 次

最近访客更多访客>>

morelily

jccz_zys

haining128

u012363178

博主相关

博客

微博

相册

留言

关于我

文章分类

全部博客 (1278)

社区版块

存档分类

Problem statement: Consider a row of n coins of values v1 . . . vn, where n is even. We play a game against an opponent by alternating turns. In each turn, a player selects either the first or last coin from the row,removes it from the row permanently, and receives the value of the coin. Determine the maximum possible amount of money we can definitely win if we move first.

Note: The opponent is as clever as the user.

Let us understand the problem with few examples:

1.5, 3, 7, 10 : The user collects maximum value as 15(10 + 5)

2.8, 15, 3, 7 : The user collects maximum value as 22(7 + 15)

Does choosing the best at each move give an optimal solution?

No. In the second example, this is how the game can finish:

1.
…….User chooses 8.
…….Opponent chooses 15.
…….User chooses 7.
…….Opponent chooses 3.
Total value collected by user is 15(8 + 7)

2.
…….User chooses 7.
…….Opponent chooses 8.
…….User chooses 15.
…….Opponent chooses 3.
Total value collected by user is 22(7 + 15)

So if the user follows the second game state, maximum value can be collected although the first move is not the best.

There are two choices:
1.The user chooses the ith coin with value Vi: The opponent either chooses (i+1)th coin or jth coin. The opponent intends to choose the coin which leaves the user with minimum value.
i.e. The user can collect the value Vi + min(F(i+2, j), F(i+1, j-1) )

2.The user chooses the jth coin with value Vj: The opponent either chooses ith coin or (j-1)th coin. The opponent intends to choose the coin which leaves the user with minimum value.
i.e. The user can collect the value Vj + min(F(i+1, j-1), F(i, j-2) )

Following is recursive solution that is based on above two choices. We take the maximum of two choices.

F(i, j)  represents the maximum value the user can collect from 
         i'th coin to j'th coin.

    F(i, j)  = Max(Vi + min(F(i+2, j), F(i+1, j-1) ), 
                   Vj + min(F(i+1, j-1), F(i, j-2) )) 
Base Cases
    F(i, j)  = Vi           If j == i
    F(i, j)  = max(Vi, Vj)  If j == i+1

Why Dynamic Programming?
The above relation exhibits overlapping sub-problems. In the above relation, F(i+1, j-1) is calculated twice.

很有意思的博弈问题，重要的一点是贪心法在这里是不行的，前面已经举了反证。

另外很重要的一点是对求dp[0][n]而不是dp[m][m]的问题，如何写for循环。文章里说要从对角线开始向右上角递推。

比如：

The below table represents the stored values for the string abcde.

a b c d e
----------
0 1 2 3 4
0 0 1 2 3 
0 0 0 1 2 
0 0 0 0 1 
0 0 0 0 0

How to fill the table?
The table should be filled in diagonal fashion. For the string abcde, 0….4, the following should be order in which the table is filled:

Gap = 1:
(0, 1) (1, 2) (2, 3) (3, 4)

Gap = 2:
(0, 2) (1, 3) (2, 4)

Gap = 3:
(0, 3) (1, 4)

Gap = 4:
(0, 4)

可以用以下代码得到：

用一个gap变量作为l和h之间的距离，然后l和h一起前进，直到h碰到边界n，因为l在左上方，h在右下方，所以肯定h先碰到右边界！

// Fill the table
    for (gap = 1; gap < n; ++gap)
        for (l = 0, h = gap; h < n; ++l, ++h)
            table[l][h] = (str[l] == str[h])? table[l+1][h-1] :
                          (min(table[l][h-1], table[l+1][h]) + 1);

package DP;

public class OptimalGameStrategy {

	public static void main(String[] args) {
		int[] A = {8, 15, 3, 7};
		int[] B = {2, 2, 2, 2};
		int[] C = {20, 30, 2, 2, 2, 10};
		System.out.println(optimalStrategyOfGameRec(A, 0, A.length-1));
		System.out.println(optimalStrategyOfGame(A, A.length));
		System.out.println(optimalStrategyOfGameRec(B, 0, B.length-1));
		System.out.println(optimalStrategyOfGame(B, B.length));
		System.out.println(optimalStrategyOfGameRec(C, 0, C.length-1));
		System.out.println(optimalStrategyOfGame(C, C.length));
		
	}
	
	
	public static int optimalStrategyOfGameRec(int[] A, int begin, int end){
		if(begin == end){		// 只剩下一枚硬币时，只能取那枚
			return A[begin];
		}
		if(begin+1 == end){	// 只剩下两枚时，取大的那枚
			return Math.max(A[begin], A[end]);
		}
		
		// 如果我去开头那枚，因为对手会拿较大那枚，所以留下的是较小的选择，看图
		int a = A[begin] + Math.min(optimalStrategyOfGameRec(A, begin+2, end), 
											   optimalStrategyOfGameRec(A, begin+1, end-1));
		// 如果我去拿结尾那枚，同理
		int b = A[end] + Math.min(optimalStrategyOfGameRec(A, begin+1, end-1), 
											 optimalStrategyOfGameRec(A, begin, end-2));
		return Math.max(a, b);		// 做最合算的选择
	}
	
	
	// Returns optimal value possible that a player can collect from
	// an array of coins of size n. Note than n must be even
	public static int optimalStrategyOfGame(int[] A, int n){
		int[][] dp = new int[n][n];	//dp[begin][end]保存从序号为begin到end的coin序列能取得的最大值
		
		// Fill table using above recursive formula. Note that the table
	    // is filled in diagonal fashion (similar to http://goo.gl/PQqoS),
	    // from diagonal elements to table[0][n-1] which is the result.
		for(int gap=0; gap<n; gap++){
			int begin=0, end=gap;
			while(end < n){
				// 检测条件为右上角
				int x = begin+2<=end ? dp[begin+2][end] : 0;
				int y = begin+1<=end-1 ? dp[begin+1][end-1] : 0;
				int z = begin<end-2 ? dp[begin][end-2] : 0;
				
				dp[begin][end] = Math.max(A[begin]+Math.min(x, y), 
										   A[end]+Math.min(y, z));
				begin++;
				end++;
			}
		}
		
		return dp[0][n-1];
	}

}

http://www.geeksforgeeks.org/dynamic-programming-set-31-optimal-strategy-for-a-game/

http://www.geeksforgeeks.org/dynamic-programming-set-28-minimum-insertions-to-form-a-palindrome/

分享到：