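Implementing "uniq -d" in Java (keeping every duplicate line)

Anyone who has written Linux shell scripts probably knows "uniq -d": it prints only the lines that are repeated, and only one line for each group of duplicates.

My requirement was to print all of the duplicate lines rather than just one per group. Here two lines count as duplicates when their third space-separated field (the item ID) matches. Since I am not yet fluent in shell scripting, below is an implementation in Java. It turned out more complicated than I expected, so I am noting it down for future reference: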

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ReadCardCode {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the reader and both writers, even on error
        try (BufferedReader reader = new BufferedReader(new FileReader("sort.log"));
             BufferedWriter writer1 = new BufferedWriter(new FileWriter("result.log"));
             BufferedWriter writer2 = new BufferedWriter(new FileWriter("result-2.log"))) {
            String current;
            String curItemId = null;
            // Collect consecutive records for the same item. A group with more than
            // one record contains duplicates and is written out; otherwise it is dropped.
            // This only works if equal item IDs are adjacent, i.e. the input is sorted.
            List<String> lineList = new ArrayList<String>(10);
            while ((current = reader.readLine()) != null) {
                // Assumes every line has at least three space-separated fields
                String itemId = current.split(" ")[2];
                if (curItemId != null && !itemId.equals(curItemId)) {
                    // Item changed: write out the previous group and start a new one
                    writeLineList(writer1, writer2, lineList);
                    lineList.clear();
                }
                lineList.add(current);
                curItemId = itemId;
            }
            // Write out the final group
            writeLineList(writer1, writer2, lineList);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void writeLineList(BufferedWriter writer, BufferedWriter writer2, List<String> lineList)
            throws IOException {
        if (lineList.size() > 1) { // more than one line in the group: write them all
            for (String line : lineList) {
                write(writer, writer2, line);
            }
        }
    }

    private static void write(BufferedWriter writer, BufferedWriter writer2, String str) throws IOException {
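        // Everything currently goes to the first writer. The disabled code below
        // would instead split the output across the two writers by item ID parity.
        BufferedWriter w = writer;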
        // String[] curArray = str.split(" ");
        // String itemId = curArray[2].replace("itemId=", "");
        // long route = Long.valueOf(itemId) % 2;
        // if (route == 1) {
        // w = writer;
        // } else {
        // w = writer2;
        // }
        w.write(str);
        w.newLine();
        w.flush();
    }
}
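
The loop above relies on equal item IDs sitting on adjacent lines, which is presumably why the input file is pre-sorted. As a rough sketch of an alternative, assuming the whole file fits in memory, the lines can first be grouped by key in a map, after which only groups with more than one line are kept. The class name DuplicateLines, the method keepDuplicates, the choice to skip lines that lack the key field, and the sample records are all made up for illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateLines {
    /**
     * Returns every line whose key field (0-based index keyIndex, fields split
     * on a single space) appears on more than one line. Groups are emitted in
     * order of first appearance, so the input does not need to be sorted.
     */
    static List<String> keepDuplicates(List<String> lines, int keyIndex) {
        // LinkedHashMap preserves the order in which keys were first seen
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(" ");
            if (fields.length <= keyIndex) {
                continue; // assumption: lines without the key field are skipped
            }
            groups.computeIfAbsent(fields[keyIndex], k -> new ArrayList<>()).add(line);
        }
        List<String> result = new ArrayList<>();
        for (List<String> group : groups.values()) {
            if (group.size() > 1) { // keep only groups that contain duplicates
                result.addAll(group);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical records; the third field plays the role of the item ID
        List<String> sample = Arrays.asList(
                "2010-06-07 10:00 itemId=1",
                "2010-06-07 10:01 itemId=2",
                "2010-06-07 10:02 itemId=1");
        for (String line : keepDuplicates(sample, 2)) {
            System.out.println(line);
        }
        // prints:
        // 2010-06-07 10:00 itemId=1
        // 2010-06-07 10:02 itemId=1
    }
}
```

The trade-off is memory: the streaming version above holds only one group at a time, while this sketch holds every line until the end.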
Comments
#2 night_stalker 2010-06-08
Using stdio is more general:
gets("\0").lines.group_by{|l|l[/:.+/]}.each{|_,v|puts v if v[1]}


Usage:
ruby p.rb < sort.log > result.log
#1 RednaxelaFX 2010-06-07
In other words, as a Ruby script:
open('result.log', 'w') {|f| File.readlines('sort.log').group_by {|l| l.split[2]}.each {|_, v| f.puts v if v.size > 1} }
