`

在oracle中如何删除表中的重复数据

 
阅读更多

1.利用rowid

oracle中,每一条记录都有一个rowidrowid在整个数据库中是唯一的(rowid确定了每条记录是oracle中的哪一个数据文件、块、行上)。在重复的记录中,可能所有列的内容都相同,但rowid不会相同。SQL语句如下:

DELETE FROM EMPLOYEE

 WHERE ROWID IN (SELECT A.ROWID

                   FROM EMPLOYEE A, EMPLOYEE B

                  WHERE A.ROWID < B.ROWID

                    AND A.NAME = B.NAME);

如果已经知道每条记录只有一条重复的,这个sql语句适用。但是如果每条记录的重复记录有N条,这个N是未知的,就要考虑适用下面这种方法了。

2.结合rowid利用max/min函数

DELETE FROM EMPLOYEE A

 WHERE ROWID NOT IN

       (SELECT MAX(B.ROWID) FROM EMPLOYEE B WHERE A.NAME = B.NAME);

这里也可以使用min或者用下面的语句

DELETE FROM EMPLOYEE A

 WHERE ROWID < (SELECT MAX(B.ROWID) FROM EMPLOYEE B WHERE A.NAME = B.NAME);

这里如果把max换成min的话,前面的where子句中需要把<改为>

3.利用group by

跟上面的方法思路基本是一样的,不过使用group by减少显性的比较条件,提高效率。

DELETE FROM EMPLOYEE

 WHERE ROWID NOT IN (SELECT MAX(ROWID) FROM EMPLOYEE T GROUP BY T.NAME);

或者

DELETE FROM EMPLOYEE

 WHERE NAME IN

       (SELECT NAME FROM EMPLOYEE GROUP BY NAME HAVING COUNT(*) > 1)

   AND ROWID NOT IN

       (SELECT MIN(ROWID) FROM EMPLOYEE GROUP BY NAME HAVING COUNT(*) > 1);

4.利用分析函数

SELECT *

  FROM EMPLOYEE

 WHERE ROWID IN (SELECT RID

                   FROM (SELECT ROWID RID,

                                ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY ROWID) N

                           FROM EMPLOYEE)

                  WHERE N <> 1);

注:如果重复的是多个字段(利用group by)

DELETE FROM EMPLOYEE

 WHERE (NAME, SEX) IN (SELECT NAME, SEX

                         FROM EMPLOYEE

                        GROUP BY NAME, SEX

                       HAVING COUNT(*) > 1)

   AND ROWID NOT IN (SELECT MIN(ROWID)

                       FROM EMPLOYEE

                      GROUP BY NAME, SEX

                     HAVING COUNT(*) > 1);

随便说一下,以上语句的执行效率是很低的,可以考虑建立临时表,将需要判断重复的字段、rowid插入临时表中,然后删除的时候在进行比较。比如:

CREATE TABLE 临时表 AS

  SELECT A.字段1, A.字段2, MAX(A.ROWID) DATAID

    FROM 正式表 A

   GROUP BY A.字段1, A.字段2;

DELETE FROM 表名 A

 WHERE A.ROWID <> (SELECT B.DATAID

                     FROM 临时表 B

                    WHERE A.字段1 = B.字段1

                      AND A.字段2 = B.字段2);

最后一点,对于完全重复记录的删除可以将查询的记录放到临时表中,然后再将原来的表记录删除,最后将临时表的数据导回原来的表中。如下:

CREATE TABLE 临时表 AS(SELECT DISTINCT * FROM 表名);

TRUNCATE TABLE 正式表;

INSERT INTO 正式表 (SELECT * FROM 临时表);

DROP TABLE 临时表;

分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics