`

visit 淘宝面试题:如何充分利用多核CPU,计算很大的List中所有整数的和

    博客分类:
  • JSE
阅读更多

I've seen these two:

http://www.iteye.com/topic/711162?page=13

http://www.iteye.com/topic/713259

 

But I am not sure whether I missed something or the majority missed something. Though the chancee of later is slim, I am going to take the chance.

 

Say we have 100 numbers and we are break them into batches of size 3, 

{1, 2, 3}, {4, 5, 6} .... {97, 98, 99}, {100}

now compute the sums,

6, 15, ..., 294, 100

There are 34 numbers. Now we should break this array again into batches of size 3, right? Why does nobody do this? I am scratching my hair!

 

So here is my code. First, let's create a simple version

 

 

package my.test.brainteaser.sum;

public class SingleThreadCalculator
{
    public long calc(long[] values, int start, int end)
    {
        if (values == null || values.length == 0) return 0;

        long sum = 0;
        for (int i=start; i<end; i++) sum += values[i];
        return sum;
    }
}

 

 

Here we are using long to avoid overflow (java won't raise a flag like fortran for overflow). The extra variables start and end for avoiding array copying later on. These are counterintuitive, we shall talk about them later on.

 

Now let's create a Runnable wrapper for this:

 

 

package my.test.brainteaser.sum;

import java.util.concurrent.CountDownLatch;

public class RunnableCalculator implements Runnable
{
    private SingleThreadCalculator calc;
    private long[] values;
    private int start;
    private int end;
    private long result;

    private CountDownLatch doneSignal;

    public RunnableCalculator(SingleThreadCalculator calc, long[] values, int start, int end, CountDownLatch doneSignal)
    {
        this.calc = calc;
        this.values = values; // for fast performance, no copy
        this.start = start;
        this.end = end;
        this.doneSignal = doneSignal;
    }

    public void run()
    {
        System.out.println("Thread: " + Thread.currentThread() + " start=" + start + " end=" + end);
        result = calc.calc(values, start, end);
        this.doneSignal.countDown();
    }

    public long getResult() { return result; }
}

 

 

The countdown latcher is to wait for all jobs finishes.

 

Now the core code:

 

 

package my.test.brainteaser.sum;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MultiThreadedCalculator
{
    private int numThreads = 0; // 0 means no threading
    private int partSize = 100;
    private SingleThreadCalculator singleThreadCalculator = new SingleThreadCalculator();

    public long calc(long[] values)
    {
        if (numThreads == 0)
            return singleThreadCalculator.calc(values, 0, values.length);

        if (values == null || values.length == 0) return 0;

        // compute how many parts we can divide
        int len = values.length / partSize;
        if (values.length % partSize != 0) len++;
        long[] sums = new long[len]; // partial results

        CountDownLatch doneSignal = new CountDownLatch(len);

        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        List<RunnableCalculator> calcs = new ArrayList<RunnableCalculator>(len);
        for (int i=0; i<len; i++)
        {
            int end;
            if (i < len - 1) // not the last one
                end = (i+1) * partSize;
            else // last one
            {
                end = values.length;
            }

            RunnableCalculator rc = new RunnableCalculator(singleThreadCalculator,
                            values, i*partSize, end, doneSignal);
            calcs.add(rc);
            executor.execute(rc);
        }

        try
        {
            doneSignal.await();
            for (int i=0; i<len; i++)
                sums[i] = calcs.get(i).getResult();

            if (sums.length <= partSize)
                return singleThreadCalculator.calc(sums, 0, sums.length);
            else return calc(sums);
        }
        catch (InterruptedException ie)
        {
            throw new RuntimeException("got interrupted", ie);
        }
    }

    public void setNumThreads(int numThreads)
    {
        if (numThreads < 0) this.numThreads = 0;
        this.numThreads = numThreads;
    }

    public void setPartSize(int partSize)
    {
        this.partSize = partSize;
    }
}

 

 

Here we have two variables, one for number of threads, another is for the part size in each thread. They are different, and very important when we want to optimize the performance in the real world.

 

There is a try block, inside, there is a recursive call, this is where we apply the same logic to the intermediate results.

 

The test case is:

 

package my.test.brainteaser.sum;

import junit.framework.TestCase;

public class SummationTest extends TestCase
{
    public void testMultithreading()
    {
        long[] a = new long[10000];
        for (int i=0; i<a.length; i++)
            a[i] = i+1;

        MultiThreadedCalculator c = new MultiThreadedCalculator();
        c.setNumThreads(4);
        c.setPartSize(100);
        long sum = c.calc(a);
        System.out.println("sum=" + sum);
        assertTrue(sum == 10001 * 5000);
    }

    public void testMore()
    {
        MultiThreadedCalculator c = new MultiThreadedCalculator();

        int len = 100000;
        c.setNumThreads(7);
        c.setPartSize(321);
        long sum = c.calc(generateCase(len, true));
        System.out.println("sum=" + sum);
        long res = 1L * len * (len + 1) / 2;
        System.out.println("res=" + res);
        assertTrue(sum == res);
    }

    public void testMore1()
    {
        MultiThreadedCalculator c = new MultiThreadedCalculator();

        int len = 54321;
        c.setNumThreads(7);
        c.setPartSize(123);
        long sum = c.calc(generateCase(len, false));
        System.out.println("sum=" + sum);
        long res = 1L * len * (len + 1) / 2;
        System.out.println("res=" + res);
        assertTrue(sum == res);
    }

    private long[] generateCase(int len, boolean inc)
    {
        long[] ret = new long[len];
        if (inc)
        {
            for (int i=0; i<ret.length; i++)
                ret[i] = i+1;
        }
        else
        {
            for (int i=0; i<ret.length; i++)
                ret[i] = len - i;
        }

        return ret;
    }
}
 

To go further beyond this puzzle, there are two practical concerns:

 

1. We should extract a general interface from SingleThreadCalculator and MultiThreadedCalculator to shield out the runtime environment(single threaded or multi threaded env) so that users can choose. There are quite a few runtime environments I've experienced, such as message based, parallel computing based, in addition to multi-threaded.

 

2. The calculator interface can be abstracted to a general computing job interface which will be passed into a computing runtime environment so that we could run different computings, such as sum, average, or others.

分享到:
评论
15 楼 wensonzhou86 2010-11-18  
思路明确,代码清晰,通俗易懂
14 楼 carydeepbreathing 2010-11-17  
代码比较干净
13 楼 flyingpig4 2010-08-12  
这个面试题是出自淘宝实践的

12 楼 merphine 2010-07-19  
oh my lady gaga
11 楼 nickevin 2010-07-19  
代码和阐述都很漂亮 多了一种思路 自己测试了一下 效率没有使用CylicBarrier高 待研究中...
10 楼 hardPass 2010-07-19  
if (!描述语言.equals(Lang.BIRD))
   print("better");
9 楼 yangyi 2010-07-18  
非常不错,如果之前的帖子体现的是基础知识,那么这一个体现的是lz的智慧
8 楼 david.org 2010-07-18  
> There are 34 numbers. Now we should break this array again into batches of size 3, right? Why does nobody do this? I am scratching my hair!

很不错的主意,赞呀.

1. 中国人的英语, 楼上别怀疑了
2. 代码看上去很舒服, 解释的也很清楚
3. 利用partSize 参数调优, 再赞.
7 楼 seagod.wong 2010-07-18  
turf...........................
6 楼 lz12366 2010-07-18  
oh  my  god  hahaha...
5 楼 zhufeng1981 2010-07-18  
good, support you.
4 楼 xgj1988 2010-07-17  
Oh my  god  ,Bird language
3 楼 fehly 2010-07-17  
szcjlssx 写道
aa87963014 写道
这难道是一位美国人发的贴么


同问,我想是楼主在练英语吧

继续问...估计是外企的.....nr
2 楼 szcjlssx 2010-07-17  
aa87963014 写道
这难道是一位美国人发的贴么


同问,我想是楼主在练英语吧
1 楼 aa87963014 2010-07-17  
这难道是一位美国人发的贴么

相关推荐

Global site tag (gtag.js) - Google Analytics