
Chapter 13: String Operations

 
1.
public class Concatenation {
  public static void main(String[] args) {
    String mango = "mango";
    String s = "abc" + mango + "def" + 47;
    System.out.println(s);
  }
}
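For the code above, the compiler automatically introduces a StringBuilder to construct s, so in this simple case it optimizes the concatenation for you. (Since Java 9, javac emits an invokedynamic-based concatenation instead, but a single expression like this is still handled efficiently.)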

However, do not trust the compiler blindly. Consider the following code:

public class WhitherStringBuilder {

  // Building the string this way creates a new StringBuilder on every
  // loop iteration, which hurts performance.
  public String implicit(String[] fields) {
    String result = "";
    for(int i = 0; i < fields.length; i++)
      result += fields[i];
    return result;
  }

  // Creating the StringBuilder explicitly is better: no new builder
  // is created on each iteration.
  public String explicit(String[] fields) {
    StringBuilder result = new StringBuilder();
    for(int i = 0; i < fields.length; i++)
      result.append(fields[i]);
    return result.toString();
  }
}

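2. When calling StringBuilder's append, do not pass a string concatenation as the argument, for example:
append(a + ":" + b). The compiler falls into the trap and creates another StringBuilder to handle the concatenation inside the parentheses. Chain the calls instead: append(a).append(":").append(b).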
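
A minimal sketch of the point in item 2. The class name AppendChaining and the method join are hypothetical, chosen only for illustration; the arrays are assumed to have the same length. Each piece is appended on its own, so no concatenation is evaluated inside the loop.

```java
class AppendChaining {
  // Builds "k1:v1,k2:v2,..." with a single StringBuilder.
  // Assumes keys and values have the same length.
  static String join(String[] keys, String[] values) {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < keys.length; i++) {
      if (i > 0)
        result.append(',');
      // Avoid result.append(keys[i] + ":" + values[i]): that concatenation
      // would be evaluated first, building a temporary string on each pass.
      result.append(keys[i]).append(':').append(values[i]);
    }
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(join(new String[] {"a", "b"}, new String[] {"1", "2"}));
  }
}
```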

3. Unintended recursion

import java.util.ArrayList;
import java.util.List;

public class InfiniteRecursion {
  public String toString() {
    return " InfiniteRecursion address: " + this + "\n";
  }
  public static void main(String[] args) {
    List<InfiniteRecursion> v =
      new ArrayList<InfiniteRecursion>();
    for(int i = 0; i < 10; i++)
      v.add(new InfiniteRecursion());
    System.out.println(v);
  }
}
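When a String is followed by + and the operand after the + is not a String, the compiler tries to convert that operand to a String by calling its toString method. Here the operand is this, so toString calls itself recursively and the program fails at run time with a StackOverflowError. The correct approach is to call super.toString() instead of using this.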
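
A minimal sketch of the fix, assuming a hypothetical class name SafeToString. It differs from the example above only in calling super.toString(), which resolves to Object.toString() and does not re-enter the overridden method.

```java
import java.util.ArrayList;
import java.util.List;

class SafeToString {
  @Override
  public String toString() {
    // Object.toString() returns the class name, "@", and the hex hash code.
    return " SafeToString address: " + super.toString() + "\n";
  }

  public static void main(String[] args) {
    List<SafeToString> v = new ArrayList<>();
    for (int i = 0; i < 3; i++)
      v.add(new SafeToString());
    System.out.println(v);
  }
}
```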

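4. Every String method that appears to modify the string returns a new String object. If nothing actually changes, the method simply returns a reference to the original object, which saves storage and avoids extra overhead.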
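
The behavior in item 4 can be observed with reference comparison. This is a small sketch with hypothetical names (SameReference and its two helper methods); it relies only on trim and concat, whose Javadoc states that the original string is returned when there is nothing to change.

```java
class SameReference {
  // True when trim() handed back the very same object.
  static boolean trimReturnsSame(String s) {
    return s.trim() == s;
  }

  // True when concatenating the empty string handed back the same object.
  static boolean concatEmptyReturnsSame(String s) {
    return s.concat("") == s;
  }

  public static void main(String[] args) {
    System.out.println(trimReturnsSame("hello"));        // nothing to trim
    System.out.println(trimReturnsSame(" hello "));      // a new String is built
    System.out.println(concatEmptyReturnsSame("hello")); // empty argument
  }
}
```

The == comparisons here are deliberate: they test object identity, not content. Ordinary code should keep comparing strings with equals.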

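5. A regular expression is a way of describing strings. In a Java string literal, "\\" stands for a single regular-expression backslash, so the regex \d is written "\\d", and matching a literal backslash takes "\\\\".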
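
A short sketch of the escaping rule in item 5. The class and method names are hypothetical. Pattern.quote is a standard library method that escapes a whole string so that it is matched literally; it is shown here as one convenient alternative to escaping by hand.

```java
import java.util.regex.Pattern;

class BackslashEscapes {
  // The Java literal "\\\\" is the regex \\ and matches exactly one backslash.
  static boolean isSingleBackslash(String s) {
    return s.matches("\\\\");
  }

  // Matches the input against the literal text, with no regex metacharacters.
  static boolean matchesLiterally(String input, String literal) {
    return input.matches(Pattern.quote(literal));
  }

  public static void main(String[] args) {
    System.out.println("7".matches("\\d"));          // regex \d
    System.out.println(isSingleBackslash("\\"));     // input is one backslash
    System.out.println(matchesLiterally("a.b", "a.b"));
    System.out.println(matchesLiterally("axb", "a.b"));
  }
}
```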
6.
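\d means [0-9]
\D means [^0-9], a non-digit
\w means [a-zA-Z_0-9] (letters, digits, and underscore)
\W means [^\w]
.  matches any character (by default, except line terminators), so to match a literal dot in Java it must be escaped as "\\."
x? zero or one time
x+ one or more times
x* zero or more times
x{n}  exactly n times
x{n,}  at least n times
x{n,m}  at least n times, at most m times
For example: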
public static void main(String[] args) {
    System.out.println("--4".matches("-?\\d+"));
    System.out.println("5678".matches("-?\\d+"));
    System.out.println("+911".matches("-?\\d+"));
  
   // An optional - or +; | means "or". Because + has a special meaning in
   // regular expressions, it must be escaped with \\ to make it a literal plus sign.
    System.out.println("+911".matches("(-|\\+)?\\d+"));
  }

7. abc+ does not match one or more occurrences of abc; it matches ab followed by one or more c. To repeat the whole sequence, write (abc)+.
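
The difference in item 7 can be checked directly. A minimal sketch, with hypothetical class and method names:

```java
class GroupedQuantifier {
  // + applies only to the preceding c.
  static boolean lastCharRepeated(String s) {
    return s.matches("abc+");
  }

  // The parentheses make + apply to the whole sequence abc.
  static boolean wholeGroupRepeated(String s) {
    return s.matches("(abc)+");
  }

  public static void main(String[] args) {
    System.out.println(lastCharRepeated("abccc"));    // true
    System.out.println(lastCharRepeated("abcabc"));   // false
    System.out.println(wholeGroupRepeated("abcabc")); // true
    System.out.println(wholeGroupRepeated("abccc"));  // false
  }
}
```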


Part 2: Notes on the related classes:
1.java.util.regex
Class Pattern

public final class Pattern
extends Object
implements Serializable

A compiled representation of a regular expression.

A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

A typical invocation sequence is thus

Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();
A matches method is defined by this class as a convenience for when a regular expression is used just once. This method compiles an expression and matches an input sequence against it in a single invocation. The statement

boolean b = Pattern.matches("a*b", "aaaaab");
is equivalent to the three statements above, though for repeated matches it is less efficient since it does not allow the compiled pattern to be reused.


Some regular expression constructs:
\\ The backslash character
\s A whitespace character: [ \t\n\x0B\f\r]
\S A non-whitespace character: [^\s]
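// p is assumed to be a small helper that prints its argument, e.g. via System.out.println.
p("192".matches("[0-2][0-9][0-9]"));// true: [] denotes a set or range; [0-2] is one digit from 0 to 2, [0-9] is one digit from 0 to 9
p("a".matches("[abc]"));// true: matches one character that is a, b, or c
p("a".matches("[^abc]"));// false: matches any one character except a, b, or c
p("a".matches("[a-zA-Z]"));// true: one character in a-z or A-Z
p("a".matches("[a-z | A-Z]"));// true, but inside [] the spaces and | are literal, so this class also matches ' ' and '|'; prefer [a-zA-Z]
p("a".matches("[a-z[A-Z]]"));// true: union, one character in a-z or A-Z
p("R".matches("[A-Z && [RGB]]"));// true: intersection, a character in A-Z that is also one of R, G, B; the spaces inside [] are literal, so [A-Z&&[RGB]] is the cleaner form
A bracket expression denotes a set of characters; one pair of brackets matches exactly one character.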

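p(" \n\r\t\f".matches("\\s{5}"));// true: five whitespace characters
p(" ".matches("\\s"));// true: one whitespace character
p(" ".matches("\\S"));// false: \S requires a non-whitespace character
p("a_9".matches("\\w{3}"));// true: three word characters (letters, digits, or underscore)
p("$_%".matches("\\w{3}"));// false: $ and % are not word characters
p("abc866666%&^#".matches("[a-z]{1,3}\\d+[&%^#]+"));// true: a-z one to three times, then one or more digits, then one or more of & % ^ #
p("\\abc".matches("\\\\[a-z A-Z]{1,3}"));// true: the regex \\ matches one backslash, written "\\\\" in Java source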

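Boundary matchers
^ beginning of a line      $ end of a line  \b word boundary  \z end of input
\B non-word boundary  \A beginning of input  \G end of the previous match  \Z end of input, except for the final line terminator (if any)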
 
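// Starts with h, then a-z one to three times, then o followed by a word boundary, then zero or more of any character
p("hello world".matches("^h[a-z]{1,3}o\\b.*"));// true
p("hello world".matches("^h.*d$"));// true: starts with h, then any characters, ends with d
p("helloWorld".matches("^h[a-z]{1,3}o\\b.*"));// false: there is no word boundary between o and W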
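
Because matches() always tests the whole input, boundary matchers are often more useful with Matcher.find(), which searches inside the input. The following is a sketch under that assumption; WordBoundaryFind and countWholeWord are hypothetical names, and Pattern.quote is used so the word is treated as literal text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryFind {
  // Counts occurrences of word that stand alone, i.e. have a word
  // boundary (\b) on both sides.
  static int countWholeWord(String text, String word) {
    Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
    int count = 0;
    while (m.find())
      count++;
    return count;
  }

  public static void main(String[] args) {
    // "concat" contains "cat" but not as a whole word.
    System.out.println(countWholeWord("cat concat cat.", "cat")); // 2
  }
}
```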

Capturing groups: a capturing group, written with parentheses (), is a way of treating multiple characters as a single unit.
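
Besides grouping, the parentheses also let the matched text be read back by group number. A minimal sketch, assuming a hypothetical "key=number" input format and hypothetical names CaptureGroups and parsePair:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroups {
  // Splits "key=digits" into its two parts; returns null for any other shape.
  static String[] parsePair(String input) {
    Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(input);
    if (!m.matches())
      return null;
    // group(0) is the whole match; group(1) and group(2) are the
    // first and second parenthesized parts.
    return new String[] {m.group(1), m.group(2)};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(parsePair("port=8080")));
  }
}
```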

Once a regular expression has been compiled, the resulting Pattern can be reused.
Related methods:
(1)compile

public static Pattern compile(String regex)
Compiles the given regular expression into a pattern.
Parameters:
regex - The expression to be compiled
(2)matcher

public Matcher matcher(CharSequence input)
Creates a matcher that will match the given input against this pattern.
Parameters:
input - The character sequence to be matched
Returns:
A new matcher for this pattern
(3)matches

public static boolean matches(String regex,
                              CharSequence input)
Compiles the given regular expression and attempts to match the given input against it.
An invocation of this convenience method of the form

Pattern.matches(regex, input);
behaves in exactly the same way as the expression
Pattern.compile(regex).matcher(input).matches()
If a pattern is to be used multiple times, compiling it once and reusing it will be more efficient than invoking this method each time.

Parameters:
regex - The expression to be compiled
input - The character sequence to be matched
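
As a sketch of the advice above about compiling once, the pattern from the earlier signed-integer example can be kept in a constant and applied to many inputs. The names ReusedPattern, SIGNED_INT, and countIntegers are hypothetical.

```java
import java.util.regex.Pattern;

class ReusedPattern {
  // Compiled once; a Pattern is immutable and can be shared by many matchers.
  private static final Pattern SIGNED_INT = Pattern.compile("(-|\\+)?\\d+");

  // Counts the inputs that are integers with an optional sign.
  static int countIntegers(String[] inputs) {
    int count = 0;
    for (String input : inputs)
      if (SIGNED_INT.matcher(input).matches())
        count++;
    return count;
  }

  public static void main(String[] args) {
    System.out.println(countIntegers(new String[] {"--4", "5678", "+911", "-12"})); // 3
  }
}
```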

2.java.lang.Object
  java.util.regex.Matcher
An engine that performs match operations on a character sequence by interpreting a Pattern.
Related methods:
(1)matches

public boolean matches()
Attempts to match the entire region against the pattern.
If the match succeeds then more information can be obtained via the start, end, and group methods.

Returns:
true if, and only if, the entire region sequence matches this matcher's pattern
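
A short sketch of reading the extra information after a successful matches() call, using the start, end, and group methods named above. MatcherDetails and describe are hypothetical names, and the output format is only an illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDetails {
  // Returns the matched text with its start (inclusive) and end (exclusive)
  // offsets, or "no match" when the whole input does not match.
  static String describe(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    if (!m.matches())
      return "no match";
    return m.group() + " [" + m.start() + "," + m.end() + ")";
  }

  public static void main(String[] args) {
    System.out.println(describe("a*b", "aaaaab"));  // aaaaab [0,6)
    System.out.println(describe("a*b", "aaaaabc")); // no match
  }
}
```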