sonyfe25cp

Converting a UTF-8 encoded String to a byte array in Java

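Description:
1. Enter Chinese characters on the page and submit them in UTF-8 encoding.
2. The backend receives the characters, splits them apart, and obtains the UTF-8 encoding of each one.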
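For reference, when a String already holds the correct characters, step 2 can be done with the standard `String.getBytes` call. The following is only a minimal sketch of that idea; the class `Utf8PerChar` and the method `hexPerChar` are hypothetical names that do not appear in this post.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original post.
final class Utf8PerChar {

    /** Returns one uppercase hex string per character of s, holding that character's UTF-8 bytes. */
    static List<String> hexPerChar(String s) {
        List<String> result = new ArrayList<>();
        // Walk by code point so that characters outside the BMP are not cut in half.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            String ch = new String(Character.toChars(cp));
            StringBuilder hex = new StringBuilder();
            for (byte b : ch.getBytes(StandardCharsets.UTF_8)) {
                hex.append(String.format("%02X", b & 0xFF));
            }
            result.add(hex.toString());
            i += Character.charCount(cp);
        }
        return result;
    }
}
```

Calling `Utf8PerChar.hexPerChar(...)` on a correctly decoded string returns one entry per character, each typically six hex digits long for common Chinese characters, since those take three bytes in UTF-8.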


Page file:
<%@ page contentType="text/html;charset=utf-8"%>
<%request.setCharacterEncoding("utf-8");%>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
     <meta http-equiv="content-type" content="text/html; charset=UTF-8"> 
    <title>My JSP 'index.jsp' starting page</title>
  </head>
  
  <body>
    <form action="servlet/InputTest" method="get">
    	<input type="text" name="inputtext" />
    	<input type="submit" />
    </form>
  </body>
</html>



Backend file:
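import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// The original listing begins at the throws clause; the class name (taken from
// the form action) and the doGet signature are reconstructed from context.
public class InputTest extends HttpServlet {

	public void doGet(HttpServletRequest request, HttpServletResponse response)
			throws ServletException, IOException {

		response.setContentType("text/html");
		PrintWriter out = response.getWriter();
		out.println("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">");
		out.println("<HTML>");
		out.println("  <HEAD><TITLE>A Servlet</TITLE></HEAD>");
		out.println("  <BODY>");
		out.print("    This is ");
		out.print(this.getClass());
		out.print(" [ ");
		String t = request.getParameter("inputtext");
		if (t != null) {
			out.println(t);

			System.out.println(t + "   :t.value");
			System.out.println(t.length() + "   :t.length");

			// Print the low 8 bits of each char in hex. These are the UTF-8 bytes
			// only because the parameter arrives garbled, one char per byte.
			for (int i = 0; i < t.length(); i++) {
				System.out.println(Integer.toHexString((byte) t.charAt(i) & 0xFF).toUpperCase() + "   :t.charAt");
			}
		}
		out.print(" ] ");
		out.println(", using the GET method");
		out.println("  </BODY>");
		out.println("</HTML>");
		out.flush();
		out.close();
	}
}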


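The text still comes out garbled, but I did get the UTF-8 bytes of the Chinese characters entered on the page. I have not yet worked out how to get rid of the garbling; I will finish that later.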
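One commonly used way to recover the text, assuming the container decoded the GET query string as ISO-8859-1 (which would explain why every char of the received string carries exactly one UTF-8 byte), is to turn the string back into bytes and decode those bytes as UTF-8. This is a sketch under that assumption, not a confirmed diagnosis of this setup; `MojibakeFix` and `redecode` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original post.
final class MojibakeFix {

    /**
     * Assumes garbled was produced by decoding UTF-8 bytes as ISO-8859-1,
     * so each char holds exactly one of the original bytes.
     */
    static String redecode(String garbled) {
        // ISO-8859-1 maps chars 0x00..0xFF back to the same byte values.
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        // Interpret those bytes with the encoding the browser actually used.
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

In the servlet this would be used as `String text = MojibakeFix.redecode(request.getParameter("inputtext"));`. Two related points may also matter here: `request.setCharacterEncoding` applies to the request body rather than to a GET query string, and writing Chinese text back to the browser generally needs a charset on the response, for example `response.setContentType("text/html;charset=UTF-8")` before calling `getWriter()`.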

P.S.
I wrapped the conversion from the String returned by request.getParameter() directly into a byte[] in a method, as follows:
	public byte[] transFromUTF8(String s) {
		// Relies on s being the garbled parameter, where each char holds one UTF-8 byte.
		byte[] b = new byte[s.length()];
		for (int i = 0; i < s.length(); i++) {
			System.out.println(Integer.toHexString((byte) s.charAt(i) & 0xFF).toUpperCase() + "   :hex utf-8");
			b[i] = (byte) s.charAt(i);
		}
		return b;
	}
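The method above only returns UTF-8 bytes when every char of the input fits in a single byte. If the parameter ever arrives correctly decoded, the `(byte)` cast silently keeps just the low 8 bits of each char. A guarded variant could look like the sketch below; `Latin1Bytes` and `toBytesChecked` are hypothetical names, and throwing an exception is just one possible way to handle the mismatch.

```java
// Hypothetical helper, not part of the original post.
final class Latin1Bytes {

    /** Copies each char into one byte, rejecting any char that does not fit in 8 bits. */
    static byte[] toBytesChecked(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                // The input was not a one-char-per-byte string after all.
                throw new IllegalArgumentException(
                        "char at index " + i + " does not fit in one byte: U+"
                                + Integer.toHexString(c).toUpperCase());
            }
            out[i] = (byte) c;
        }
        return out;
    }
}
```

With this guard, a garbled nine-char parameter still yields its nine bytes, while a correctly decoded Chinese string fails loudly instead of producing meaningless bytes.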


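Console output:
Input on the page: "士大夫"
Shown by the backend:
????¤§?¤?   //  "士大夫" in its garbled state
9   :t.length  // length of the garbled string, which equals the UTF-8 byte count of the three characters 士大夫
E5   :t.charAt // these are the UTF-8 bytes of 士大夫; each character consists of 3 bytes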
A3   :t.charAt
AB   :t.charAt
E5   :t.charAt
A4   :t.charAt
A7   :t.charAt
E5   :t.charAt
A4   :t.charAt
AB   :t.charAt


---------------------------------------
Keeping a record of the things I learn along the way...
