遍历大容量map的正确方法

dengminhui

浏览: 163707 次
来自: ...

最近访客更多访客>>

webeasymail

jstl1point0

yuyujulin

leafry

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Java学习

Linux 多线程 IDE thread JDK

首先，遍历map有以下方法：

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class MapTest {

	public static void main(String[] args) {
		Map<String, String> map = new HashMap<String, String>();
		map.put("1", "1");
		map.put("2", "2");
		map.put("3", "3");


		// 第一种：通过Map.keySet遍历key和value
		System.out.println("通过Map.keySet遍历key和value:");
		for (String key : map.keySet()) {
			System.out.println("key= " + key + "  and  value= " + map.get(key));
		}
		
		// 第二种：通过Map.entrySet使用iterator遍历key和value
		System.out.println("通过Map.entrySet使用iterator遍历key和value:");
		Iterator<Map.Entry<String, String>> it = map.entrySet().iterator();
		while (it.hasNext()) {
			Map.Entry<String, String> entry = it.next();

			System.out.println("key= " + entry.getKey() + "  and  value= "
					+ entry.getValue());
		}

		// 第三种：通过Map.entrySet遍历key和value
		System.out.println("通过Map.entrySet遍历key和value:");
		for (Map.Entry<String, String> entry : map.entrySet()) {
			System.out.println("key= " + entry.getKey() + "  and  value= "
					+ entry.getValue());
		}

		// 第四种：通过Map.values()遍历所有的value，但是不能遍历键key
		System.out.println("通过Map.values()遍历所有的value:");
		for (String v : map.values()) {
			System.out.println("value= " + v);
		}
	}

}

结果如下：

通过Map.keySet遍历key和value:
key= 3  and  value= 3
key= 2  and  value= 2
key= 1  and  value= 1
通过Map.entrySet使用iterator遍历key和value:
key= 3  and  value= 3
key= 2  and  value= 2
key= 1  and  value= 1
通过Map.entrySet遍历key和value:
key= 3  and  value= 3
key= 2  and  value= 2
key= 1  and  value= 1
通过Map.values()遍历所有的value:
value= 3
value= 2
value= 1

这四种方法都可以遍历map：

第一种是目前许多人最喜欢的一种方式，因为代码最少，看起来最简单，通过遍历keySet，再将key所对应的value查询出来，这里有一个二次取值的过程，所以并不推荐；

第二种和第三种原理是相同的，都是通过遍历Map.Entry的方式，将Entry中的key和value打印出来，第三种是比较推荐写法，因为采用jdk1.5后的遍历形式，代码看起来比较整洁；

第四种比较少用，因为我们大多数时候都是同时需要key和value的

综上所述，如果map里面内容比较少，其实采用哪种方式都可以，第一种和第三种相对简洁一些；但是一旦容量非常大时，更推荐采用第三种方式，相比于第一种将极大地节省性能。

修改一下代码，对执行时间执行一下测试

import java.util.HashMap;
import java.util.Map;

public class MapTest {
	static long MAX_LONG = 1000000L;
	static int TIMES = 10;
	static Map<String, String> map1 = new HashMap<String, String>();
	static Map<String, String> map2 = new HashMap<String, String>();
	static {
		for (long i = 0; i < MAX_LONG; i++) {
			map1.put("1" + i, "abc" + i);
			map2.put("2" + i, "def" + i);
		}
	}

	public static void main(String[] args) {

		String valueStr = ""; 
		String keyStr = "";
		long start, end;
		double totalMs;



		totalMs = 0;
		for (int i = 0; i < TIMES; i++) {

			// 通过Map.keySet遍历key和value
			start = System.currentTimeMillis();
			for (String key : map1.keySet()) {
				keyStr = key;
				valueStr = map1.get(key);
			}
			end = System.currentTimeMillis();
			System.out.println("通过Map.keySet遍历key和value耗时 " + (end - start)
					+ " ms ");
			totalMs += (end - start);
		}
		System.out.println("Times : " + TIMES + ", totalMs : " + totalMs
				+ "ms, average :" + totalMs / TIMES + "ms");
		
		
		totalMs = 0;
		for (int i = 0; i < TIMES; i++) {
			// 通过Map.entrySet遍历key和value
			start = System.currentTimeMillis();
			for (Map.Entry<String, String> entry : map2.entrySet()) {
				keyStr = entry.getKey();
				valueStr = entry.getValue();
			}
			end = System.currentTimeMillis();
			System.out.println("通过Map.entrySet遍历key和value耗时 " + (end - start)
					+ " ms ");
			totalMs += (end - start);
		}
		System.out.println("Times : " + TIMES + ", totalMs : " + totalMs
				+ "ms, average :" + totalMs / TIMES + "ms");

	}

}

以下是测试结果：

通过Map.keySet遍历key和value耗时 186 ms 
通过Map.keySet遍历key和value耗时 189 ms 
通过Map.keySet遍历key和value耗时 87 ms 
通过Map.keySet遍历key和value耗时 89 ms 
通过Map.keySet遍历key和value耗时 84 ms 
通过Map.keySet遍历key和value耗时 92 ms 
通过Map.keySet遍历key和value耗时 85 ms 
通过Map.keySet遍历key和value耗时 94 ms 
通过Map.keySet遍历key和value耗时 89 ms 
通过Map.keySet遍历key和value耗时 91 ms 
Times : 10, totalMs : 1086.0ms, average :108.6ms
通过Map.entrySet遍历key和value耗时 112 ms 
通过Map.entrySet遍历key和value耗时 98 ms 
通过Map.entrySet遍历key和value耗时 71 ms 
通过Map.entrySet遍历key和value耗时 65 ms 
通过Map.entrySet遍历key和value耗时 65 ms 
通过Map.entrySet遍历key和value耗时 64 ms 
通过Map.entrySet遍历key和value耗时 64 ms 
通过Map.entrySet遍历key和value耗时 65 ms 
通过Map.entrySet遍历key和value耗时 65 ms 
通过Map.entrySet遍历key和value耗时 65 ms 
Times : 10, totalMs : 734.0ms, average :73.4ms

可以看出，百万级别的量时，使用keySet和entrySet遍历，执行时间大概为1.5：1，这并不是最主要的差异。真正的差异还是必须看代码

map.get(key))

时执行的hash操作

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

计算hashCode是CPU密集运算，非常耗费CPU资源，如果对一个比较大的map进行遍历，会出现CPU迅速飚高的现象，直接影响机器的响应速度，在多线程的情况下，简直就是一场灾难，而采用entrySet来进行遍历，则无此问题，对这个有兴趣的同学，可以使用自己的机器试一试。

36
顶

2
踩

分享到：

如何检测线上代码的运行情况---BTrace使用 ... | 多线程书写及code review规范

2010-10-08 13:20
浏览 11071
评论(29)
分类:编程语言
查看更多

29 楼 chenhongwei0924 2010-12-20

不错。。。值得学习..

28 楼 fm_974 2010-10-26

有研究精神！赞一个

27 楼 zhang_xzhi_xjtu 2010-10-25

加上静态代码检查工具就会发现第一种方法会有warn的。

26 楼 Java_Travler 2010-10-25

很好，这样的帖子才应该鼓励和支持！这样大家才有讨论的热点和兴趣！才能学到或更好的理解更多的东西！

25 楼風一樣的男子 2010-10-24

看过源码的人都知道应该用entrySet遍历

24 楼 ldbjakyo 2010-10-14

再次回来瞧过比第一次来看补充了好多

23 楼 lwz777 2010-10-13

这样的文章才有点味道不是!

22 楼 parabellum_sky 2010-10-13

parabellum_sky 写道

怎么执行第二段代码出错了呢？
Exception in thread "main"
并且啥信息都没有～～

估计是OOM了，MAX_LONG减少一个数量级以后就可以了～但是看不出来问题～～

21 楼 parabellum_sky 2010-10-13

怎么执行第二段代码出错了呢？
Exception in thread "main"
并且啥信息都没有～～

20 楼 elan1986 2010-10-12

总结的不错！

19 楼 ansjsun 2010-10-12

su1216 写道

实践证明，不一定哪个快。。。
不过第二个基本上每次都要慢一点
测试map

public static Map<String,String> getMap(long k){
Map map=new HashMap();
for (int i = 0; i <k; i++) {
map.put(""+i,""+i);
}
return map;
}

k=10w

ps:jdk1.5

你用的是linux吧....用windows速度应该比较稳定的...其实大容量map不是考虑..速度..而更应该考虑内存的开销..集合啦..处理海量数据最容易导致..内存当掉了...
如果很追求速度..那也不要用map去做..jdk的自动装箱..本来就是很耗费时间的

18 楼 dengminhui 2010-10-11

呵呵，针对大家的疑问，我写在文章后面了，执行时间只是差异的一方面，最主要的还是CPU。

17 楼 lkj107 2010-10-11

大家都用的就是好方法

代码的执行只是一小部分功能，最大的是可读性，以及由此延伸的可维护性等

16 楼 newvirus 2010-10-11

不错的文章学习了私下地下尝试一下

15 楼 helloguoxing 2010-10-11

这样减少了map扩容的次数。不过数据量小了，就不必这样写，直接new HashMap()。

14 楼 helloguoxing 2010-10-11

map的大容量优化方法：
1、提前要知道容量要放多大，只要确保map集合的size<1024*0.75就可以了。
这个1024是自己设置的。可以在new HashMap(1024,0.75);
1024是偶数，在方法hash()-哈希算法中：h & (length-1);(length-1)要确认得出的值是奇数，这样才能解决循环遍历map产生的冲突。
0.75是3/4，因为数组满了不利于数组扩容。所以设计者建议为3/4.

13 楼 su1216 2010-10-11

12 楼 gh_aiyz 2010-10-11

我打赌第一种和第二三中的性能差别不超过1%，无论你多大容量。
LZ纯粹是臆测罢了，有点想当然了。

11 楼吃猫的鱼 2010-10-10

没说服力啊，，要是把测试结果的性能比较贴出来，就很带劲了……

10 楼 yuantong 2010-10-10

还大容量，最普通不过了

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论