fantaxy025025

Java: detecting Chinese characters, Chinese symbols, and Chinese punctuation

 
 

1. Detecting Chinese characters (Han ideographs)

 

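import java.util.regex.Pattern;
// matches(): true only if the whole string consists of characters in U+4E00..U+9FCC
boolean allHan = str.matches("[\u4e00-\u9fcc]+");
// find(): true if the string contains at least one character in that range
Pattern pattern = Pattern.compile("[\u4e00-\u9fcc]+");
System.out.println(pattern.matcher(str).find());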

  

 

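Drawback: this only detects Han characters in the basic CJK Unified Ideographs range; it cannot detect Chinese punctuation.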
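As an alternative to hard-coding the range, the regex engine can match by Unicode script. The sketch below assumes Java 7 or later, where `\p{IsHan}` is supported; the class and method names (`HanRegexSketch`, `isAllHan`, `containsHan`) are made up for illustration and are not from the original post. It shares the same drawback: punctuation is not matched.

```java
import java.util.regex.Pattern;

class HanRegexSketch {
    // \p{IsHan} matches any code point whose Unicode script is Han (Java 7+).
    private static final Pattern HAN_ONLY = Pattern.compile("\\p{IsHan}+");
    private static final Pattern HAN_ANY = Pattern.compile("\\p{IsHan}");

    /** True if s is non-empty and every character in it is a Han character. */
    static boolean isAllHan(String s) {
        return HAN_ONLY.matcher(s).matches();
    }

    /** True if s contains at least one Han character. */
    static boolean containsHan(String s) {
        return HAN_ANY.matcher(s).find();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only: U+4E2D U+6587 is "Chinese" in Chinese.
        System.out.println(isAllHan("\u4e2d\u6587"));
        // U+FF0C is the fullwidth comma, which is not in the Han script.
        System.out.println(isAllHan("\u4e2d\uff0c\u6587"));
        System.out.println(containsHan("abc\u4e2d"));
    }
}
```

Because the match is script-based, it also covers the extension blocks that the fixed range U+4E00..U+9FCC leaves out.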

 

2. Detecting Chinese characters and punctuation

 

// The method header was missing from the original post; the name isChinese is an assumption.
private static boolean isChinese(char c) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        if (ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                // Extension B (added here later) starts at U+20000, outside the BMP, so a single
                // char can never fall in it; this branch only matters with an int code point.
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS) {
            return true;
        }
        return false;
}

private static boolean isChinesePunctuation(char c) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        if (ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS
                || ub == Character.UnicodeBlock.VERTICAL_FORMS) { // VERTICAL_FORMS requires JDK 1.7+
            return true;
        }
        return false;
}

 

private static boolean isChineseByScript(char c) {
        // Character.UnicodeScript requires JDK 1.7+
        Character.UnicodeScript sc = Character.UnicodeScript.of(c);
        if (sc == Character.UnicodeScript.HAN) {
            return true;
        }
        return false;
}

 

Drawback: the combined block check treats Han characters and punctuation alike, so it cannot tell the two apart.
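One point the `char`-based methods cannot handle: characters in CJK Unified Ideographs Extension B live above U+FFFF and are stored as a surrogate pair, so a single `char` is only half of one. The following is a minimal sketch, assuming Java 8+ for `String.codePoints()`, that works on `int` code points instead; `HanCodePointSketch`, `isHan`, and `countHan` are hypothetical names, not part of the original post.

```java
class HanCodePointSketch {
    /** True if the code point belongs to the Han script (Character.UnicodeScript, JDK 1.7+). */
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    /** Counts Han characters in s, treating each surrogate pair as a single character. */
    static long countHan(String s) {
        return s.codePoints().filter(HanCodePointSketch::isHan).count();
    }

    public static void main(String[] args) {
        // "\uD840\uDC00" is the surrogate pair for U+20000, the first Extension B ideograph.
        String s = "a\u4e2d\uD840\uDC00";
        System.out.println(countHan(s));
        // Looking at one half of the pair as a char reports a surrogate block, not a CJK block.
        System.out.println(Character.UnicodeBlock.of(s.charAt(2)));
        // Looking at the full code point reports the Extension B block.
        System.out.println(Character.UnicodeBlock.of(s.codePointAt(2)));
    }
}
```

Iterating by code point is the design choice that makes the `CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B` comparison meaningful at all.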

 

 

3. Detecting Chinese punctuation on its own

Look closely at the Character.UnicodeBlock.XXX constants used in the previous methods.

Read the documentation to learn what each block covers, and how to do this follows naturally.
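As one possible reading of that advice, the sketch below keeps only the punctuation-related blocks and then narrows the result by Unicode general category, since blocks such as HALFWIDTH_AND_FULLWIDTH_FORMS also contain letters and digits. The category filter is an assumption of this sketch rather than something the original post specifies, and the names `ChinesePunctuationSketch` and `isPunctuationCategory` are illustrative. Symbols such as the fullwidth yen sign are deliberately excluded by this choice.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ChinesePunctuationSketch {
    // Blocks that hold the punctuation commonly used in Chinese text.
    private static final Set<Character.UnicodeBlock> PUNCTUATION_BLOCKS = new HashSet<>(Arrays.asList(
            Character.UnicodeBlock.GENERAL_PUNCTUATION,
            Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION,
            Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS,
            Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS,
            Character.UnicodeBlock.VERTICAL_FORMS)); // VERTICAL_FORMS requires JDK 1.7+

    /** True if the code point has one of the Unicode punctuation general categories. */
    static boolean isPunctuationCategory(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    /** True only for punctuation inside the blocks above; Han characters and ASCII are rejected. */
    static boolean isChinesePunctuation(int codePoint) {
        return PUNCTUATION_BLOCKS.contains(Character.UnicodeBlock.of(codePoint))
                && isPunctuationCategory(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(isChinesePunctuation(0xFF0C)); // fullwidth comma
        System.out.println(isChinesePunctuation(0x4E2D)); // a Han ideograph
        System.out.println(isChinesePunctuation(','));    // ASCII comma (Basic Latin block)
    }
}
```

Combined with a Han script check, this lets a caller sort each character into Han, Chinese punctuation, or other.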

 
