
Java Serialization Algorithm: Implementation and Explanation

Serialization is the process of saving an object's state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You'll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you're done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.

Why is serialization required?

In today's world, a typical enterprise application will have multiple components and will be distributed across various systems and networks. In Java, everything is represented as objects; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.

Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.

 



 

Figure 1. A high-level view of serialization in action

How to serialize an object

In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializable interface, as shown in Listing 1.

Listing 1. Implementing Serializable

import java.io.Serializable;
class TestSerial implements Serializable {
	public byte version = 100;
	public byte count = 0;
}

 

 

In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializable interface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.
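The effect of the marker interface can be seen with a small sketch. The class names below (MarkerInterfaceDemo, Marked, Unmarked) and the helper canSerialize() are hypothetical and exist only for this illustration; the two nested classes are identical except that one implements Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerInterfaceDemo {
    // Hypothetical classes for illustration: same field, only the marker differs.
    static class Marked implements Serializable {
        byte version = 100;
    }

    static class Unmarked {
        byte version = 100;
    }

    // Returns true if writeObject() accepts the object and false if it is
    // rejected with NotSerializableException.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Marked:   " + canSerialize(new Marked()));   // true
        System.out.println("Unmarked: " + canSerialize(new Unmarked())); // false
    }
}
```

The check happens at run time inside writeObject(), not at compile time, which is why a missing marker typically shows up only when the object is first written.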

Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.

Listing 2. Calling writeObject()

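import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// The wrapper class name is illustrative; it must be in the same package as TestSerial.
public class WriteTestSerial {
	public static void main(String[] args) throws IOException {
		// try-with-resources flushes and closes the stream even if writeObject() fails
		try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("temp.out"))) {
			TestSerial ts = new TestSerial();
			oos.writeObject(ts);
		}
	}
}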

 

 

Listing 2 stores the state of the TestSerial object in a file called temp.out. oos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.

To re-create the object from the persistent file, you would employ the code in Listing 3.

Listing 3. Recreating a serialized object

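import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class ReadTestSerial {
	public static void main(String[] args) throws IOException, ClassNotFoundException {
		try (ObjectInputStream oin = new ObjectInputStream(new FileInputStream("temp.out"))) {
			TestSerial ts = (TestSerial) oin.readObject();
			System.out.println("version=" + ts.version);
		}
	}
}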

 

 

In Listing 3, the object's restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. Because readObject() can read any serializable object, a cast to the correct type is required.

Executing this code will print version=100 on the standard output.
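The write and read steps can also be tried together without touching the file system. The following is a minimal sketch, assuming an in-memory byte array in place of temp.out; RoundTripDemo and its serialize() and deserialize() helpers are hypothetical names, and TestSerial is redeclared as a nested class so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class RoundTripDemo {
    // Same fields as Listing 1, nested here only to keep the sketch in one file.
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes an object into a byte array instead of a file.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from bytes produced by serialize().
    static Object deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TestSerial original = new TestSerial();
        original.count = 7;
        TestSerial copy = (TestSerial) deserialize(serialize(original));
        // The copy carries the same state but is a different instance.
        System.out.println("version=" + copy.version + ", count=" + copy.count);
        System.out.println("same instance? " + (copy == original));
    }
}
```

Running it should print version=100, count=7 followed by same instance? false, which shows that deserialization builds a new object rather than returning the original.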

The serialized format of an object

What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)

Listing 4. Hexadecimal form of TestSerial

 

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64

 

If you look again at the actual TestSerial object, you'll see that it has only two byte members, as shown in Listing 5.

Listing 5. TestSerial's byte members

	public byte version = 100;	
	public byte count = 0;

 

The size of a byte variable is one byte, and hence the total size of the object (without the header) is two bytes. But if you look at the size of the serialized object in Listing 4, you'll see 51 bytes. Surprise! Where did the extra bytes come from, and what is their significance? They are introduced by the serialization algorithm, and are required in order to re-create the object. In the next section, you'll explore this algorithm in detail.
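If no hex editor is at hand, a dump like Listing 4 can be produced in code. This is a sketch under stated assumptions: HexDumpDemo, serialize() and toHex() are hypothetical names, and TestSerial is nested so the example fits in one file. Because the class name is part of the stream, the nested class (named HexDumpDemo$TestSerial at run time) produces more than 51 bytes, and its serialVersionUID bytes differ from the listing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class HexDumpDemo {
    static class TestSerial implements Serializable {
        public byte version = 100;
        public byte count = 0;
    }

    // Serializes an object into a byte array.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Formats bytes as two-digit uppercase hex, 16 per line, like Listing 4.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
            sb.append((i % 16 == 15 || i == data.length - 1) ? "\n" : " ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = serialize(new TestSerial());
        System.out.print(toHex(data));
        // Only the last two bytes are field values; the rest is stream metadata.
        System.out.println("total bytes: " + data.length + ", field data: 2, overhead: " + (data.length - 2));
    }
}
```

The dump starts with AC ED 00 05 and ends with 00 64, the values of count and version, just as in Listing 4.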

Java's serialization algorithm

By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general the serialization algorithm does the following:

  • It writes out the metadata of the class associated with an instance.
  • It recursively writes out the description of the superclass until it finds java.lang.Object.
  • Once it finishes writing the metadata information, it then starts with the actual data associated with the instance. But this time, it starts from the topmost superclass.
  • It recursively writes the data associated with the instance, starting from the topmost superclass and ending with the most-derived class.
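The two traversal orders in the list above can be sketched with java.io.ObjectStreamClass, which exposes the class descriptors the serialization mechanism works with. This is an illustration, not the JDK's internal implementation; DescriptorWalkDemo, Parent, Child and the two helper methods are hypothetical names. The sketch relies on ObjectStreamClass.lookup() returning null for classes that are not serializable, such as java.lang.Object.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DescriptorWalkDemo {
    static class Parent implements Serializable {
        int parentVersion = 10;
    }

    static class Child extends Parent {
        int version = 66;
    }

    // Order in which class descriptions are written: most-derived class first,
    // then each serializable superclass.
    static List<String> descriptorOrder(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = cls; c != null && ObjectStreamClass.lookup(c) != null; c = c.getSuperclass()) {
            names.add(c.getSimpleName());
        }
        return names;
    }

    // Order in which instance data is written: topmost serializable superclass first.
    static List<String> dataOrder(Class<?> cls) {
        List<String> names = descriptorOrder(cls);
        Collections.reverse(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("descriptors: " + descriptorOrder(Child.class)); // [Child, Parent]
        System.out.println("data:        " + dataOrder(Child.class));       // [Parent, Child]
        // Show the metadata recorded for each descriptor.
        for (Class<?> c = Child.class; c != null; c = c.getSuperclass()) {
            ObjectStreamClass desc = ObjectStreamClass.lookup(c);
            if (desc == null) {
                break; // reached a non-serializable class
            }
            System.out.println(desc.getName() + " serialVersionUID=" + desc.getSerialVersionUID());
            for (ObjectStreamField f : desc.getFields()) {
                System.out.println("  " + f.getTypeCode() + " " + f.getName());
            }
        }
    }
}
```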

I've written a different example object for this section that covers the main cases: a superclass and a field that refers to another object. The new sample object to be serialized is shown in Listing 6.

Listing 6. Sample serialized object

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class parent implements Serializable {
	int parentVersion = 10;
}

class contain implements Serializable {
	int containVersion = 11;
}

public class SerialTest extends parent implements Serializable {
	int version = 66;
	contain con = new contain();

	public int getVersion() {
		return version;
	}

	public static void main(String args[]) throws IOException {
		FileOutputStream fos = new FileOutputStream("temp.out");
		ObjectOutputStream oos = new ObjectOutputStream(fos);
		SerialTest st = new SerialTest();
		oos.writeObject(st);
		oos.flush();
		oos.close();
	}
}

 

 

This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and holds a reference to an object of type contain. The serialized format of this object is shown in Listing 7.

Listing 7. Serialized form of sample object

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B

 

 

Figure 2 offers a high-level look at the serialization algorithm for this scenario.

 

 



 

Figure 2. An outline of the serialization algorithm

Let's go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:

  • AC ED: STREAM_MAGIC. Specifies that this is a serialization protocol.
  • 00 05: STREAM_VERSION. The serialization version.
  • 0x73: TC_OBJECT. Specifies that this is a new Object.
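These three values are available as constants in java.io.ObjectStreamConstants, so the header can be checked in code. The sketch below reads the first five bytes of Listing 7 with a DataInputStream; StreamHeaderDemo and isValidHeader() are hypothetical names used only here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;

class StreamHeaderDemo {
    // The first five bytes of Listing 7: AC ED 00 05 73.
    static final byte[] HEADER = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05, 0x73};

    // Returns true if the data starts with STREAM_MAGIC, STREAM_VERSION and TC_OBJECT.
    static boolean isValidHeader(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readShort() == ObjectStreamConstants.STREAM_MAGIC
                    && in.readShort() == ObjectStreamConstants.STREAM_VERSION
                    && in.readByte() == ObjectStreamConstants.TC_OBJECT;
        } catch (IOException e) {
            // Too few bytes (EOFException) also counts as an invalid header.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidHeader(HEADER));                     // true
        System.out.println(isValidHeader(new byte[] {1, 2, 3, 4, 5})); // false
    }
}
```

Note that a stream whose top-level value is not a plain object (for example a string or an array) starts with a different type code after the version, so the TC_OBJECT check is specific to this example.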

The first step of the serialization algorithm is to write the description of the class associated with an instance. The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 0A: Length of the class name.
  • 53 65 72 69 61 6C 54 65 73 74: SerialTest, the name of the class.
  • 05 52 81 5A AC 66 02 F6: serialVersionUID, the serial version identifier of this class.
  • 0x02: Class descriptor flags. The value 0x02 is SC_SERIALIZABLE, which says that the class supports serialization.
  • 00 02: Number of fields in this class.

Next, the algorithm writes the field int version = 66;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 07: Length of the field name.
  • 76 65 72 73 69 6F 6E: version, the name of the field.

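And then the algorithm writes the next field, contain con = new contain();. This is an object reference, so after the type code and field name it also writes the canonical JVM signature of the field's type.

  • 0x4C: Field type code. 4C represents "L", which stands for an object reference.
  • 00 03: Length of the field name.
  • 63 6F 6E: con, the name of the field.
  • 0x74: TC_STRING. Represents a new string.
  • 00 09: Length of the string.
  • 4C 63 6F 6E 74 61 69 6E 3B: Lcontain;, the canonical JVM signature.
  • 0x78: TC_ENDBLOCKDATA, the end of the optional block data for an object.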

The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest.

  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 06: Length of the class name.
  • 70 61 72 65 6E 74: parent, the name of the class.
  • 0E DB D2 BD 85 EE 63 7A: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag notes that the object supports serialization.
  • 00 01: Number of fields in this class.

Now the algorithm will write the field description for the parent class. parent has one field, int parentVersion = 10;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0D: Length of the field name.
  • 70 61 72 65 6E 74 56 65 72 73 69 6F 6E: parentVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA, the end of block data for this object.
  • 0x70: TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.
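The walkthrough of the two class descriptions can be reproduced by parsing the first 96 bytes of Listing 7 by hand. The sketch below is a simplified reader written for this stream only, not a general parser: it assumes every object field's type is written as a new TC_STRING (real streams may use a back-reference instead) and that no class annotations precede TC_ENDBLOCKDATA. ClassDescParserDemo, fromHex(), expect() and parse() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ClassDescParserDemo {
    // Bytes 0..95 of Listing 7: stream header plus the descriptions of SerialTest and parent.
    static final String HEX =
            "AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65 "
          + "73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07 "
          + "76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09 "
          + "4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72 "
          + "65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00 "
          + "0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70";

    // Converts whitespace-separated hex pairs into bytes.
    static byte[] fromHex(String hex) {
        String[] tokens = hex.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }

    // Reads one byte and fails if it is not the expected type code.
    static void expect(DataInputStream in, byte expected) throws IOException {
        byte actual = in.readByte();
        if (actual != expected) {
            throw new IOException(String.format("expected 0x%02X but found 0x%02X", expected, actual));
        }
    }

    // Returns one summary line per class description, in stream order.
    static List<String> parse(byte[] data) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readShort(); // STREAM_MAGIC
            in.readShort(); // STREAM_VERSION
            expect(in, ObjectStreamConstants.TC_OBJECT);
            byte tag = in.readByte();
            while (tag == ObjectStreamConstants.TC_CLASSDESC) {
                String className = in.readUTF(); // 2-byte length, then the name
                in.readLong();                   // serialVersionUID (8 bytes)
                in.readByte();                   // flags, 0x02 = SC_SERIALIZABLE here
                int fieldCount = in.readShort();
                List<String> fields = new ArrayList<>();
                for (int i = 0; i < fieldCount; i++) {
                    char type = (char) in.readByte();
                    String field = type + " " + in.readUTF();
                    if (type == 'L' || type == '[') {
                        // Object and array fields carry their JVM signature as a string.
                        expect(in, ObjectStreamConstants.TC_STRING);
                        field += " " + in.readUTF();
                    }
                    fields.add(field);
                }
                expect(in, ObjectStreamConstants.TC_ENDBLOCKDATA);
                result.add(className + ": " + String.join(", ", fields));
                tag = in.readByte(); // next superclass description, or TC_NULL
            }
            if (tag != ObjectStreamConstants.TC_NULL) {
                throw new IOException("descriptor chain did not end with TC_NULL");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints "SerialTest: I version, L con Lcontain;" and then "parent: I parentVersion".
        parse(fromHex(HEX)).forEach(System.out::println);
    }
}
```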

So far, the serialization algorithm has written the description of the class associated with the instance and all its superclasses. Next, it will write the actual data associated with the instance. It writes the parent class members first:

  • 00 00 00 0A: 10, the value of parentVersion.

Then it moves on to SerialTest.

  • 00 00 00 42: 66, the value of version.

The next few bytes are interesting. The algorithm needs to write the information about the contain object, shown in Listing 8.

Listing 8. The contain object

contain con = new contain();

 

 

Remember, the serialization algorithm hasn't written the class description for the contain class yet. This is the opportunity to write this description.

  • 0x73: TC_OBJECT, designating a new object.
  • 0x72: TC_CLASSDESC. Specifies that this is a new class.
  • 00 07: Length of the class name.
  • 63 6F 6E 74 61 69 6E: contain, the name of the class.
  • FC BB E6 0E FB CB 60 C7: SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag indicates that this class supports serialization.
  • 00 01: Number of fields in this class.

Next, the algorithm must write the description for contain's only field, int containVersion = 11;.

  • 0x49: Field type code. 49 represents "I", which stands for int.
  • 00 0E: Length of the field name.
  • 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E: containVersion, the name of the field.
  • 0x78: TC_ENDBLOCKDATA.

Next, the serialization algorithm checks to see if contain has a serializable superclass. If it did, the algorithm would start writing that class; but in this case there is none, so the algorithm writes TC_NULL.

  • 0x70: TC_NULL.

Finally, the algorithm writes the actual data associated with contain.

  • 00 00 00 0B: 11, the value of containVersion.
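The data portion of Listing 7 (everything after the first 96 bytes) can be decoded the same way. This is a sketch tailored to this particular stream: it assumes two int values followed by one nested object whose class has exactly one int field, and the labels parentVersion and version are hard-coded because their names appear only in the earlier class descriptions. InstanceDataParserDemo and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

class InstanceDataParserDemo {
    // Bytes 96 onward of Listing 7: instance data, including the nested contain object.
    static final String TAIL_HEX =
            "00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74 "
          + "61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00 "
          + "0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78 "
          + "70 00 00 00 0B";

    // Converts whitespace-separated hex pairs into bytes.
    static byte[] fromHex(String hex) {
        String[] tokens = hex.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }

    // Reads one byte and fails if it is not the expected value.
    static void expect(DataInputStream in, byte expected) throws IOException {
        byte actual = in.readByte();
        if (actual != expected) {
            throw new IOException(String.format("expected 0x%02X but found 0x%02X", expected, actual));
        }
    }

    static String parse(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int parentVersion = in.readInt(); // superclass data comes first
            int version = in.readInt();       // then SerialTest's own int field
            // The value of the con field is a new object with its own class description.
            expect(in, ObjectStreamConstants.TC_OBJECT);
            expect(in, ObjectStreamConstants.TC_CLASSDESC);
            String className = in.readUTF();
            in.readLong();                    // serialVersionUID
            in.readByte();                    // flags
            if (in.readShort() != 1) {
                throw new IOException("this sketch handles exactly one field");
            }
            expect(in, (byte) 'I');
            String fieldName = in.readUTF();
            expect(in, ObjectStreamConstants.TC_ENDBLOCKDATA);
            expect(in, ObjectStreamConstants.TC_NULL); // no serializable superclass
            int value = in.readInt();
            return "parentVersion=" + parentVersion + ", version=" + version
                    + ", " + className + "." + fieldName + "=" + value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints: parentVersion=10, version=66, contain.containVersion=11
        System.out.println(parse(fromHex(TAIL_HEX)));
    }
}
```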

Conclusion

In this tip, you have seen how to serialize an object, and learned how the serialization algorithm works in detail. I hope this article gives you more detail on what happens when you actually serialize an object.

About the author

Sathiskumar Palaniappan has more than four years of experience in the IT industry, and has been working with Java-related technologies for more than three years. Currently, he is working as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecom industry.

Resources

 

SOURCE URL : http://www.javaworld.com/community/node/2915
