`
waiting
  • 浏览: 232674 次
  • 性别: Icon_minigender_1
  • 来自: cq
社区版块
存档分类
最新评论

MySQL COMPACT栏格式导致输出乱码

阅读更多
CentOS 5.2, PHP 5.1.6, MYSQL 5.0.77. 数据库UTF8,表MYISAM引擎,采用PEAR::MDB2访问。
读取出的中文会有乱码尾巴。使用EMS/navicat查看数据都正常。折腾半天最后发现应该和表的Row format格式有关:如果使用COMPACT格式则会出现记录输出时中文文字后面跟着乱码。即使把格式改成Dynamic已有的记录在输出时还是有乱码,估计是这条记录的row结构并未更新,手动更新这条记录的中文后(这条记录)在输出时就不会出现乱码尾巴了。

查询资料,关于compact row format:
引用
#

InnoDB tables use a compact storage format. In versions of MySQL earlier than 5.0.3, InnoDB rows contain some redundant information, such as the number of columns and the length of each column, even for fixed-size columns. By default, tables are created in the compact format (ROW_FORMAT=COMPACT). If you wish to downgrade to older versions of MySQL, you can request the old format with ROW_FORMAT=REDUNDANT.

The presence of the compact row format decreases row storage space by about 20% at the cost of increasing CPU use for some operations. If your workload is a typical one that is limited by cache hit rates and disk speed it is likely to be faster. If it is a rare case that is limited by CPU speed, it might be slower.

The compact InnoDB format also changes how CHAR columns containing UTF-8 data are stored. With ROW_FORMAT=REDUNDANT, a UTF-8 CHAR(N) occupies 3 × N bytes, given that the maximum length of a UTF-8 encoded character is three bytes. Many languages can be written primarily using single-byte UTF-8 characters, so a fixed storage length often wastes space. With ROW_FORMAT=COMPACT format, InnoDB allocates a variable amount of storage in the range from N to 3 × N bytes for these columns by stripping trailing spaces if necessary. The minimum storage length is kept as N bytes to facilitate in-place updates in typical cases.


引用
# 在MySQL/InnoDB中,InnoDB表使用更紧凑的存储格式。在以前版本的MySQL中,InnoDB记录包含一些冗余信息,例如列数目和每个列的长度,即使对于固定大小的列。默认情况,创建的表为紧凑格式(ROW_FORMAT=COMPACT)。如果想要降级旧版本的MySQL/InnoDB,可以用ROW_FORMAT=REDUNDANT要求旧的格式。
# 紧凑的InnoDB格式也改变了包含UTF-8数据的CHAR列的保存方式。在ROW_FORMAT=REDUNDANT格式中,UTF-8 CHAR(n)占用3*n字节,假定UTF-8编码的字符的最大长度是3字节。许多语言可以主要用单字节UTF-8字符来编写,固定的存储长度通常会浪费空间。通过根据需要剥离尾部的空格,ROW_FORMAT=COMPACT格式为这些列分配可变数量的n..3*n字节。最小存储长度按顺序保存为n字节,以在典型情况下帮助更新。

http://www.guduo.net/2009/04/000189.html

估计是compact紧凑格式使得中文字符在保存时有问题,或者是导致在输出时不能正常判断字符切分位置。
不过我发现InnoDB格式似乎并不能指定compact row format.
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics