
A first look at Hessian's serialization implementation

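Posted: 2008-09-22
Hessian, the well-known open-source remoting framework, has a reputation for being very fast. Someone ran a test with a UserData class holding one String property, one date property and one double property, serializing it one million times with Java's built-in serialization and with Hessian. The result was surprising: Hessian was reported to serialize about twice as fast as Java, and its output was about half the size in bytes. I kept wondering why Hessian is so fast; the credit presumably goes to how its serialization is implemented. So I got curious and decided to read how it actually does it.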
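To get a feel for where a size difference can come from, here is a minimal sketch that uses only the JDK. It does not use Hessian, and compactSerialize is a hand-rolled encoding, not Hessian's wire format; the UserData field names and both helper methods are my own assumptions. All it shows is that java.io serialization writes class metadata on top of the three values.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

class SizeComparison {
    // Hypothetical stand-in for the UserData class mentioned in the test.
    static class UserData implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Date created;
        final double score;

        UserData(String name, Date created, double score) {
            this.name = name;
            this.created = created;
            this.score = score;
        }
    }

    // Standard Java serialization: the stream carries class descriptors as well as values.
    static byte[] javaSerialize(UserData u) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(u);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hand-rolled encoding of the three values only (NOT Hessian's format).
    static byte[] compactSerialize(UserData u) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(u.name);               // 2-byte length + UTF bytes
                out.writeLong(u.created.getTime()); // 8 bytes
                out.writeDouble(u.score);           // 8 bytes
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        UserData u = new UserData("alice", new Date(0L), 1.5);
        System.out.println("java.io: " + javaSerialize(u).length + " bytes");
        System.out.println("compact: " + compactSerialize(u).length + " bytes");
    }
}
```

This says nothing about speed; a real comparison would need the Hessian jar and a proper benchmark.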

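Opening the Hessian source, we can see that the com.caucho.hessian.io package is the core package for serialization and deserialization. From the code layout it is easy to see that AbstractSerializerFactory, AbstractHessianOutput, AbstractSerializer, AbstractHessianInput and AbstractDeserializer form the skeleton of the serialization and deserialization code.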

Let's look at AbstractSerializerFactory first. It has 2 abstract methods.

This one decides which serializer to use for a class:
  abstract public Serializer getSerializer(Class cl)
    throws HessianProtocolException;


This one decides which deserializer to use for a class:
  abstract public Deserializer getDeserializer(Class cl)
    throws HessianProtocolException;


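SerializerFactory extends AbstractSerializerFactory. It keeps static maps from classes to their serializers and deserializers, plus the cache maps (_cachedSerializerMap, _cachedDeserializerMap) that appear in the methods further down, so a serializer that has already been looked up can be reused instead of being instantiated again.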

private static HashMap _staticSerializerMap;
private static HashMap _staticDeserializerMap;
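The lookup-then-cache idea can be sketched roughly like this. It is a simplified model, not Hessian's code: the Serializer interface, the describe method and the created counter are hypothetical names I made up to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class SerializerCache {
    // Hypothetical minimal serializer interface, only for this sketch.
    interface Serializer {
        String describe(Object value);
    }

    // Fixed mappings for well-known types, shared by every factory instance.
    private static final Map<Class<?>, Serializer> STATIC_MAP = new HashMap<>();
    static {
        STATIC_MAP.put(String.class, v -> "string:" + v);
        STATIC_MAP.put(Integer.class, v -> "int:" + v);
    }

    // Serializers created on demand and remembered per class.
    private final Map<Class<?>, Serializer> cached = new HashMap<>();

    // Counts how many serializers were actually instantiated.
    int created = 0;

    synchronized Serializer getSerializer(Class<?> cl) {
        Serializer s = STATIC_MAP.get(cl);
        if (s != null) {
            return s;
        }
        s = cached.get(cl);
        if (s == null) {
            created++;
            s = v -> cl.getSimpleName() + ":" + v;
            cached.put(cl, s);
        }
        return s;
    }
}
```

Asking twice for the same class returns the same instance, which is the whole point of the maps.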



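SerializerFactory implements the abstract getSerializer method and returns a different serializer depending on the class to be serialized. There are 17 serializers in total; Hessian implements a dedicated serializer for each kind of Java object, and the default one is JavaSerializer.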
public Serializer getSerializer(Class cl)
    throws HessianProtocolException
  {
    Serializer serializer;

    serializer = (Serializer) _staticSerializerMap.get(cl);
    if (serializer != null)
      return serializer;

    if (_cachedSerializerMap != null) {
      synchronized (_cachedSerializerMap) {
	serializer = (Serializer) _cachedSerializerMap.get(cl);
      }
      
      if (serializer != null)
	return serializer;
    }

    for (int i = 0;
	 serializer == null && _factories != null && i < _factories.size();
	 i++) {
      AbstractSerializerFactory factory;

      factory = (AbstractSerializerFactory) _factories.get(i);

      serializer = factory.getSerializer(cl);
    }

    if (serializer != null) {
    }

    else if (JavaSerializer.getWriteReplace(cl) != null)
      serializer = new JavaSerializer(cl);

    else if (HessianRemoteObject.class.isAssignableFrom(cl))
      serializer = new RemoteSerializer();

    else if (BurlapRemoteObject.class.isAssignableFrom(cl))
      serializer = new RemoteSerializer();

    else if (Map.class.isAssignableFrom(cl)) {
      if (_mapSerializer == null)
	_mapSerializer = new MapSerializer();
      
      serializer = _mapSerializer;
    }
    else if (Collection.class.isAssignableFrom(cl)) {
      if (_collectionSerializer == null) {
	_collectionSerializer = new CollectionSerializer();
      }

      serializer = _collectionSerializer;
    }

    else if (cl.isArray())
      serializer = new ArraySerializer();

    else if (Throwable.class.isAssignableFrom(cl))
      serializer = new ThrowableSerializer(cl);

    else if (InputStream.class.isAssignableFrom(cl))
      serializer = new InputStreamSerializer();

    else if (Iterator.class.isAssignableFrom(cl))
      serializer = IteratorSerializer.create();

    else if (Enumeration.class.isAssignableFrom(cl))
      serializer = EnumerationSerializer.create();
    
    else if (Calendar.class.isAssignableFrom(cl))
      serializer = CalendarSerializer.create();
    
    else if (Locale.class.isAssignableFrom(cl))
      serializer = LocaleSerializer.create();
    
    else if (Enum.class.isAssignableFrom(cl))
      serializer = new EnumSerializer(cl);

    if (serializer == null)
      serializer = getDefaultSerializer(cl);

    if (_cachedSerializerMap == null)
      _cachedSerializerMap = new HashMap(8);

    synchronized (_cachedSerializerMap) {
      _cachedSerializerMap.put(cl, serializer);
    }

    return serializer;
  }
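The order of the isAssignableFrom checks matters, because the first match wins. Here is a cut-down sketch of that dispatch which returns the serializer's name as a String instead of an instance; it covers only some of the branches above and leaves out the static map, the caches and the custom factories.

```java
import java.util.Collection;
import java.util.Map;

class SerializerDispatch {
    // Returns the name of the serializer the chain would pick (sketch only).
    static String chooseSerializer(Class<?> cl) {
        if (Map.class.isAssignableFrom(cl)) {
            return "MapSerializer";
        }
        if (Collection.class.isAssignableFrom(cl)) {
            return "CollectionSerializer";
        }
        if (cl.isArray()) {
            return "ArraySerializer";
        }
        if (Throwable.class.isAssignableFrom(cl)) {
            return "ThrowableSerializer";
        }
        if (Enum.class.isAssignableFrom(cl)) {
            return "EnumSerializer";
        }
        // Nothing more specific matched: fall back to the reflective default.
        return "JavaSerializer";
    }

    public static void main(String[] args) {
        System.out.println(chooseSerializer(java.util.ArrayList.class));
        System.out.println(chooseSerializer(int[].class));
        System.out.println(chooseSerializer(StringBuilder.class));
    }
}
```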


SerializerFactory also implements the abstract getDeserializer method and returns a different deserializer depending on the class to be deserialized. The default deserializer is JavaDeserializer.

  public Deserializer getDeserializer(Class cl)
    throws HessianProtocolException
  {
    Deserializer deserializer;

    deserializer = (Deserializer) _staticDeserializerMap.get(cl);
    if (deserializer != null)
      return deserializer;

    if (_cachedDeserializerMap != null) {
      synchronized (_cachedDeserializerMap) {
	deserializer = (Deserializer) _cachedDeserializerMap.get(cl);
      }
      
      if (deserializer != null)
	return deserializer;
    }


    for (int i = 0;
	 deserializer == null && _factories != null && i < _factories.size();
	 i++) {
      AbstractSerializerFactory factory;
      factory = (AbstractSerializerFactory) _factories.get(i);

      deserializer = factory.getDeserializer(cl);
    }

    if (deserializer != null) {
    }

    else if (Collection.class.isAssignableFrom(cl))
      deserializer = new CollectionDeserializer(cl);

    else if (Map.class.isAssignableFrom(cl))
      deserializer = new MapDeserializer(cl);
    
    else if (cl.isInterface())
      deserializer = new ObjectDeserializer(cl);

    else if (cl.isArray())
      deserializer = new ArrayDeserializer(cl.getComponentType());

    else if (Enumeration.class.isAssignableFrom(cl))
      deserializer = EnumerationDeserializer.create();

    else if (Enum.class.isAssignableFrom(cl))
      deserializer = new EnumDeserializer(cl);
    
    else
      deserializer = getDefaultDeserializer(cl);

    if (_cachedDeserializerMap == null)
      _cachedDeserializerMap = new HashMap(8);

    synchronized (_cachedDeserializerMap) {
      _cachedDeserializerMap.put(cl, deserializer);
    }

    return deserializer;
  }



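Next, HessianOutput. It extends AbstractHessianOutput and is one implementation of the serialization output stream.
It implements many methods that write markers into the stream to delimit parts with a specific meaning. I won't go through them in detail; here is one example: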
  public void startCall()
    throws IOException
  {
    os.write('c');
    os.write(0);
    os.write(1);
  }
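To see what that marker looks like on the wire, the same three writes can be replayed against a ByteArrayOutputStream. This is just a sketch with a made-up class name (CallHeader); I am not interpreting what the two bytes after 'c' mean.

```java
import java.io.ByteArrayOutputStream;

class CallHeader {
    // Replays the three writes from startCall() into an in-memory buffer.
    static byte[] startCall() {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        os.write('c'); // tag byte, 0x63
        os.write(0);
        os.write(1);
        return os.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : startCall()) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }
}
```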


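The method to pay attention to is writeObject. It first asks the serializerFactory for the Serializer that matches the object's class, then lets that serializer write the object: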
public void writeObject(Object object)
    throws IOException
  {
    if (object == null) {
      writeNull();
      return;
    }

    Serializer serializer;

    serializer = _serializerFactory.getSerializer(object.getClass());

    serializer.writeObject(object, this);
  }



Now AbstractSerializer. writeObject is the method every subclass must implement. AbstractSerializer has 17 subclass implementations, one for each kind of Java object, and the default is JavaSerializer.

abstract public class AbstractSerializer implements Serializer {
  protected static final Logger log = Logger.getLogger(AbstractSerializer.class.getName());
  
  abstract public void writeObject(Object obj, AbstractHessianOutput out)
    throws IOException;
}



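Below is JavaSerializer's writeObject implementation. It iterates over the object's fields and serializes each one with the FieldSerializer chosen for that field's type (held in _fieldSerializers); there are 6 default kinds of FieldSerializer.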
  public void writeObject(Object obj, AbstractHessianOutput out)
    throws IOException
  {
    if (out.addRef(obj)) {
      return;
    }
    
    Class cl = obj.getClass();

    try {
      if (_writeReplace != null) {
	Object repl;

	if (_writeReplaceFactory != null)
	  repl = _writeReplace.invoke(_writeReplaceFactory, obj);
	else
	  repl = _writeReplace.invoke(obj);

	out.removeRef(obj);

	out.writeObject(repl);

	out.replaceRef(repl, obj);

	return;
      }
    } catch (RuntimeException e) {
      throw e;
    } catch (Exception e) {
      // log.log(Level.FINE, e.toString(), e);
      throw new RuntimeException(e);
    }

    int ref = out.writeObjectBegin(cl.getName());

    if (ref < -1) {
      writeObject10(obj, out);
    }
    else {
      if (ref == -1) {
	writeDefinition20(out);
	out.writeObjectBegin(cl.getName());
      }

      writeInstance(obj, out);
    }
  }

  private void writeObject10(Object obj, AbstractHessianOutput out)
    throws IOException
  {
    for (int i = 0; i < _fields.length; i++) {
      Field field = _fields[i];

      out.writeString(field.getName());
	
      _fieldSerializers[i].serialize(out, obj, field);
    }
      
    out.writeMapEnd();
  }
  
  private void writeDefinition20(AbstractHessianOutput out)
    throws IOException
  {
    out.writeClassFieldLength(_fields.length);
	
    for (int i = 0; i < _fields.length; i++) {
      Field field = _fields[i];
      
      out.writeString(field.getName());
    }
  }
  
  public void writeInstance(Object obj, AbstractHessianOutput out)
    throws IOException
  {
    for (int i = 0; i < _fields.length; i++) {
      Field field = _fields[i];

      _fieldSerializers[i].serialize(out, obj, field);
    }
  }
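The split between writeDefinition20 and writeInstance means the field names go out once per class, and later objects of the same class only carry values. A rough model of that idea follows. It emits readable tokens instead of bytes, it only looks at fields declared directly on the class, and Point, tokens and the token strings are all my own placeholders.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DefinitionOnceWriter {
    // Hypothetical sample type used only for this sketch.
    static class Point {
        int x;
        int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Classes whose definition has already been written, with their reference number.
    private final Map<Class<?>, Integer> definitions = new HashMap<>();

    // Readable tokens standing in for the byte stream.
    final List<String> tokens = new ArrayList<>();

    void writeObject(Object obj) {
        Class<?> cl = obj.getClass();
        List<Field> fields = instanceFields(cl);

        Integer ref = definitions.get(cl);
        if (ref == null) {
            // First object of this class: write the field names once.
            ref = definitions.size();
            definitions.put(cl, ref);
            tokens.add("def#" + ref + " " + cl.getSimpleName());
            for (Field f : fields) {
                tokens.add("name " + f.getName());
            }
        }

        // Every object: reference the definition, then write only the values.
        tokens.add("obj def#" + ref);
        for (Field f : fields) {
            try {
                tokens.add("value " + f.get(obj));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Collects non-static, non-transient fields declared on the class itself.
    private static List<Field> instanceFields(Class<?> cl) {
        List<Field> result = new ArrayList<>();
        for (Field f : cl.getDeclaredFields()) {
            int mod = f.getModifiers();
            if (Modifier.isStatic(mod) || Modifier.isTransient(mod)) {
                continue;
            }
            f.setAccessible(true);
            result.add(f);
        }
        return result;
    }
}
```

Writing two Point objects produces the names once and the values twice, so the second object costs fewer tokens than the first.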



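Take the default FieldSerializer as an example: it still calls writeObject on the AbstractHessianOutput subclass, and at that point a matching Serializer can always be found for the field's value.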
  static class FieldSerializer {
    static final FieldSerializer SER = new FieldSerializer();
    
    void serialize(AbstractHessianOutput out, Object obj, Field field)
      throws IOException
    {
      Object value = null;
	
      try {
	value = field.get(obj);
      } catch (IllegalAccessException e) {
	log.log(Level.FINE, e.toString(), e);
      }

      try {
	out.writeObject(value);
      } catch (RuntimeException e) {
	throw new RuntimeException(e.getMessage() + "\n Java field: " + field,
				   e);
      } catch (IOException e) {
	throw new IOExceptionWrapper(e.getMessage() + "\n Java field: " + field,
			      e);
      }
    }
  }
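Because the field value goes back through writeObject, nested objects are handled by plain recursion. A small sketch of that shape, producing a readable String rather than Hessian bytes. Address, User and the output syntax are hypothetical; it is only meant for your own simple classes, and it does no reference tracking (the real code calls out.addRef first), so a cyclic object graph would recurse forever here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class RecursiveFieldWriter {
    // Hypothetical sample types used only for this sketch.
    static class Address {
        String city = "Paris";
    }

    static class User {
        Address address = new Address();
    }

    static String write(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            return "\"" + value + "\"";
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }

        // Anything else: walk its declared fields and recurse on each value.
        StringBuilder sb = new StringBuilder(value.getClass().getSimpleName()).append('{');
        boolean first = true;
        for (Field field : value.getClass().getDeclaredFields()) {
            int mod = field.getModifiers();
            if (Modifier.isStatic(mod) || Modifier.isTransient(mod)) {
                continue;
            }
            field.setAccessible(true);

            Object fieldValue;
            try {
                fieldValue = field.get(value);
            } catch (IllegalAccessException e) {
                // Like the FieldSerializer above, fall back to writing null.
                fieldValue = null;
            }

            if (!first) {
                sb.append(", ");
            }
            first = false;
            sb.append(field.getName()).append('=').append(write(fieldValue));
        }
        return sb.append('}').toString();
    }
}
```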



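That covers Hessian's serialization architecture, and the deserialization mechanism can be worked out the same way in reverse. SerializerFactory returns the deserializer for the class being deserialized, and that deserializer does the work. Note that the readObject shown here always throws after reading a value, so it looks like a base-class fallback that concrete deserializers are expected to override: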

  public Object readObject(AbstractHessianInput in)
    throws IOException
  {
    Object obj = in.readObject();

    String className = getClass().getName();

    if (obj != null)
      throw error(className + ": unexpected object " + obj.getClass().getName() + " (" + obj + ")");
    else
      throw error(className + ": unexpected null value");
  }



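HessianInput extends AbstractHessianInput and implements readObject. In that method Hessian uses the markers written during serialization to work out the type of the transmitted data, gets the deserializer for the class that was passed in, and finally returns the object instance.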
public Object readObject(Class cl)
    throws IOException
  {
    if (cl == null || cl == Object.class)
      return readObject();
    
    int tag = read();
    
    switch (tag) {
    case 'N':
      return null;

    case 'M':
    {
      String type = readType();

      // hessian/3386
      if ("".equals(type)) {
	Deserializer reader;
	reader = _serializerFactory.getDeserializer(cl);

	return reader.readMap(this);
      }
      else {
	Deserializer reader;
	reader = _serializerFactory.getObjectDeserializer(type);

        return reader.readMap(this);
      }
    }

    case 'V':
    {
      String type = readType();
      int length = readLength();
      
      Deserializer reader;
      reader = _serializerFactory.getObjectDeserializer(type);
      
      if (cl != reader.getType() && cl.isAssignableFrom(reader.getType()))
        return reader.readList(this, length);

      reader = _serializerFactory.getDeserializer(cl);

      Object v = reader.readList(this, length);

      return v;
    }

    case 'R':
    {
      int ref = parseInt();

      return _refs.get(ref);
    }

    case 'r':
    {
      String type = readType();
      String url = readString();

      return resolveRemote(type, url);
    }
    }

    _peek = tag;

    // hessian/332i vs hessian/3406
    //return readObject();
    
    Object value = _serializerFactory.getDeserializer(cl).readObject(this);

    return value;
  }
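The read side is basically a switch on a one-byte tag. A toy reader in the same spirit follows; the tag letters and layouts ('N', 'T', 'F', 'I' plus a 4-byte int, 'S' plus a 2-byte length and UTF-8 bytes) are assumptions for this sketch and should not be taken as the exact Hessian encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class TagReader {
    // Reads one value by dispatching on the leading tag byte.
    static Object read(DataInputStream in) {
        try {
            int tag = in.read();
            switch (tag) {
            case 'N':
                return null;
            case 'T':
                return Boolean.TRUE;
            case 'F':
                return Boolean.FALSE;
            case 'I':
                return in.readInt(); // 4 bytes, big-endian
            case 'S': {
                int length = in.readUnsignedShort(); // byte length in this toy format
                byte[] buffer = new byte[length];
                in.readFully(buffer);
                return new String(buffer, StandardCharsets.UTF_8);
            }
            default:
                throw new IllegalStateException("unknown tag: " + tag);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper for reading a single value from a byte array.
    static Object readFrom(byte[] data) {
        return read(new DataInputStream(new ByteArrayInputStream(data)));
    }
}
```

For example, readFrom(new byte[] {'S', 0, 2, 'h', 'i'}) gives back the String hi.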


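I have also seen people online replace memcached's serialization with Hessian's. I haven't looked into that yet; anyone interested could give it a try.

That's quite a lot already, so this is where the overview of Hessian's serialization and deserialization mechanism ends. The code pasted here is incomplete; the version is hessian-3_2-snap-src. If you're interested, download the source from the official Hessian site and read it.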