
POI reads an .xlsx file and inserts the rows into MySQL: OutOfMemoryError

Two Excel files are uploaded to the server via jspsmart. On the server, POI reads the sheet contents into their corresponding bean objects, the beans are stored in a List, and then each List entry is inserted into the database row by row through a connection pool. Because the data volume is large (240,000 rows), an exception is thrown every time execution reaches the step of filling the List. Tomcat's heap is already set to 1424 MB (set JAVA_OPTS=-server -Xms1424m -Xmx1424m -XX:PermSize=128m -XX:MaxPermSize=256m -XX:-UseGCOverheadLimit). The Excel file with fewer columns inserts successfully, but the other one does not.
Below are the methods related to the failing table; any pointers would be appreciated.
Fields of the failing bean class:
public class OaToDoEntity {
	private String pendingCode;
	private String pendingDate;
	private String pendingUserID;
	private String pendingURL;
	private String pendingStatus;
	private String pendingLevel;
	private String pendingCityCode;
	private String pendingSourceUserID;
	private String pendingSource;
	private String Status;
	private String gsname;
	private String pendingTitle;
        ......
}


Method that reads the Excel sheet, fills the beans, and stores them in a List:
public static ArrayList<OaToDoEntity> getOAInsertBean(InputStream in) {
		ArrayList<OaToDoEntity> al = new ArrayList<OaToDoEntity>();
		if (null != in) {
			try {
				Workbook workbook = FileUtil.getWorkbook(in);
				if (null != workbook) {
					Sheet sheet = workbook.getSheetAt(0);
					System.out.println("Rows in the Excel sheet: " + sheet.getLastRowNum());
					// read the Excel contents
					for (int j = 0; j <= sheet.getLastRowNum(); j++) {
						Row rowS = sheet.getRow(j);
						OaToDoEntity oa = new OaToDoEntity();
						if (rowS != null) {
							oa.setPendingCode(FileUtil.getCellValue(rowS.getCell(0)).trim());
							oa.setPendingTitle(FileUtil.getCellValue(rowS.getCell(1)).trim());
							oa.setPendingDate(FileUtil.getCellValue(rowS.getCell(2)).trim());
							oa.setPendingUserID(FileUtil.getCellValue(rowS.getCell(3)).trim());
							oa.setPendingURL(FileUtil.getCellValue(rowS.getCell(4)).trim());
							oa.setPendingStatus(FileUtil.getCellValue(rowS.getCell(5)).trim());
							oa.setPendingLevel(FileUtil.getCellValue(rowS.getCell(6)).trim());
							oa.setPendingCityCode(FileUtil.getCellValue(rowS.getCell(7)).trim());
							oa.setPendingSourceUserID(FileUtil.getCellValue(rowS.getCell(8)).trim());
							oa.setPendingSource(FileUtil.getCellValue(rowS.getCell(9)).trim());
							oa.setStatus(FileUtil.getCellValue(rowS.getCell(10)).trim());
							oa.setGsname(FileUtil.getCellValue(rowS.getCell(11)).trim());

							al.add(oa);
						}
					}
				}
			} catch (Exception e) {
				e.printStackTrace();
				log.error(e);
			}
		}
		return al;

	}



Method that takes the beans from the List and inserts the corresponding fields into the database:
public static void insertOAToDo(ConnectionPoolManager cpmp,
			ArrayList<OaToDoEntity> all) {

		ConnectionPoolManager cpm = cpmp;
		ArrayList<OaToDoEntity> al = all;
		try {
			if (null != al) {
				String clearTb = "TRUNCATE TABLE oa_todo";
				System.out.println("About to truncate table oa_todo");
				cpm.execute(clearTb);
				System.out.println("Finished truncating table oa_todo");
				System.out.println("Number of elements in the list: " + al.size());
				StringBuffer sql = new StringBuffer("");
				// build and execute one INSERT per list element
				for (int j = 0; j < al.size(); j++) {
					//OaToDoEntity oa = al.get(j);
					if (null != al.get(j)) {
						sql.delete(0, sql.length());
						sql.append("insert into audit.oa_todo(pendingCode,pendingTitle,pendingDate,");
						sql.append("pendingUserID,pendingURL,pendingStatus,pendingLevel,");
						sql.append("pendingCityCode,pendingSourceUserID,pendingSource,Status,gsname) ");
						sql.append(" values('" + al.get(j).getPendingCode() + "','"
								+ al.get(j).getPendingTitle().replaceAll("[']", "”")
								+ "','" + al.get(j).getPendingDate() + "','"
								+ al.get(j).getPendingUserID() + "','"
								+ al.get(j).getPendingURL() + "','"
								+ al.get(j).getPendingStatus() + "','"
								+ al.get(j).getPendingLevel() + "','"
								+ al.get(j).getPendingCityCode() + "','"
								+ al.get(j).getPendingSourceUserID() + "','"
								+ al.get(j).getPendingSource() + "','"
								+ al.get(j).getStatus() + "','" + al.get(j).getGsname()
								+ "')");
						System.out.println(sql.toString());
						cpm.execute(sql.toString());
						//oa = null;
					}
				}
			}
		} catch (Exception e) {
			e.printStackTrace();
			log.error(e);
		}

	}
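A side note on the concatenated SQL above: only pendingTitle has its single quotes replaced, so a ' in any other column will break the statement. The idea of escaping every value could be sketched with a small helper (hypothetical name SqlEscape; switching to PreparedStatement batching would be the more robust fix):

```java
public class SqlEscape {
    // Escape backslashes and single quotes for use inside a MySQL
    // string literal. A PreparedStatement would make this unnecessary.
    public static String escape(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("\\", "\\\\").replace("'", "''");
    }
}
```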



Key statements that call the methods above:
InputStream fis = new FileInputStream(oafile);
			ArrayList<OaToDoEntity> al = FileUtil.getOAInsertBean(fis);
			FileUtil.insertOAToDo(cpm,al);

To keep memory usage as low as possible, many intermediate reference objects were avoided, but it still fails. Any advice?
December 30, 2013, 09:39

14 answers

A few suggestions:
1. Process the data in batches.
2. When reading the Excel file, use a FileChannel instead of plain stream I/O.
3. For a more elaborate setup, consider MapReduce; at heart that is also batching the work and merging the results at the end.
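Point 2, reading through a FileChannel with a fixed-size buffer, might look roughly like this (a sketch; the 8 KB buffer size is arbitrary, and the chunk-by-chunk decode is only safe for single-byte content such as ASCII):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

public class ChannelRead {
    // Read a file through a FileChannel; only one buffer's worth of
    // bytes is resident at a time, regardless of file size.
    public static String readAll(File f) throws IOException {
        FileChannel ch = new FileInputStream(f).getChannel();
        ByteBuffer buf = ByteBuffer.allocate(8 * 1024);
        StringBuilder sb = new StringBuilder();
        try {
            while (ch.read(buf) != -1) {
                buf.flip();
                // note: decoding chunk-by-chunk can split multi-byte
                // characters across buffers; fine for this ASCII demo
                sb.append(Charset.forName("UTF-8").decode(buf));
                buf.clear();
            }
        } finally {
            ch.close();
        }
        return sb.toString();
    }
}
```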

January 1, 2014, 00:05

See the code below:

CSVWriter:

package com.util;

import java.io.Closeable;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import au.com.bytecode.opencsv.ResultSetHelper;
import au.com.bytecode.opencsv.ResultSetHelperService;

public class CSVWriter implements Closeable{

	 public static final int INITIAL_STRING_SIZE = 128;
	  private Writer rawWriter;
	  private PrintWriter pw;
	  private char separator;
	  private char quotechar;
	  private char escapechar;
	  private String lineEnd;
	  public static final char DEFAULT_ESCAPE_CHARACTER = '"';
	  public static final char DEFAULT_SEPARATOR = ',';
	  public static final char DEFAULT_QUOTE_CHARACTER = '"';
	  public static final char NO_QUOTE_CHARACTER = '\u0000';
	  public static final char NO_ESCAPE_CHARACTER = '\u0000';
	  public static final String DEFAULT_LINE_END = "\n";
	  private ResultSetHelper resultService = new ResultSetHelperService();
	  
	  private static final String LINE_SEPERATOR=System.getProperty("line.separator"); 

	  public CSVWriter(Writer writer)
	  {
	    this(writer, ',');
	  }

	  public CSVWriter(Writer writer, char separator)
	  {
	    this(writer, separator, '"');
	  }

	  public CSVWriter(Writer writer, char separator, char quotechar)
	  {
	    this(writer, separator, quotechar, '"');
	  }

	  public CSVWriter(Writer writer, char separator, char quotechar, char escapechar)
	  {
	    this(writer, separator, quotechar, escapechar, LINE_SEPERATOR);
	  }

	  public CSVWriter(Writer writer, char separator, char quotechar, String lineEnd)
	  {
	    this(writer, separator, quotechar, '"', lineEnd);
	  }

	  public CSVWriter(Writer writer, char separator, char quotechar, char escapechar, String lineEnd)
	  {
	    this.rawWriter = writer;
	    this.pw = new PrintWriter(writer);
	    this.separator = separator;
	    this.quotechar = quotechar;
	    this.escapechar = escapechar;
	    this.lineEnd = lineEnd;
	  }
	  
	  public static CSVWriter newDefaultSeparatorNoQuotechar(Writer writer) {
		  return new CSVWriter(writer, CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER);
	  }
	  
	  public static CSVWriter newDefaultSeparatorNoQuotecharNoEscape(Writer writer) {
		  return new CSVWriter(writer, CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER, CSVWriter.NO_ESCAPE_CHARACTER);
	  }

	  public void writeAll(List<String[]> allLines)
	  {
	    for (String[] line : allLines)
	      writeNext(line);
	  }

	  protected void writeColumnNames(ResultSet rs)
	    throws SQLException
	  {
	    writeNext(this.resultService.getColumnNames(rs));
	  }

	  public void writeAll(ResultSet rs, boolean includeColumnNames)
	    throws SQLException, IOException
	  {
	    if (includeColumnNames) {
	      writeColumnNames(rs);
	    }

	    while (rs.next())
	    {
	      writeNext(this.resultService.getColumnValues(rs));
	    }
	  }

	  public void writeNext(List<String []> nextLines) {
		 if (nextLines != null && nextLines.size() > 0) {
			 for (String [] nextLine:nextLines) {
				 writeNext(nextLine);
			 }
			 this.pw.flush();
		 }
	  }
	  
	  public void writeNext(String[] nextLine)
	  {
	    if (nextLine == null) {
	      return;
	    }
	    StringBuilder sb = new StringBuilder(128);
	    for (int i = 0; i < nextLine.length; i++)
	    {
	      if (i != 0) {
	        sb.append(this.separator);
	      }

	      String nextElement = nextLine[i];
	      if (nextElement == null)
	        continue;
	      if (this.quotechar != 0) {
	        sb.append(this.quotechar);
	      }
	      sb.append(stringContainsSpecialCharacters(nextElement) ? processLine(nextElement) : nextElement);

	      if (this.quotechar != 0) {
	        sb.append(this.quotechar);
	      }
	    }
	    sb.append(this.lineEnd);
	    this.pw.write(sb.toString());
	  }

	  private boolean stringContainsSpecialCharacters(String line)
	  {
	    return (line.indexOf(this.quotechar) != -1) || (line.indexOf(this.escapechar) != -1);
	  }

	  protected StringBuilder processLine(String nextElement)
	  {
	    StringBuilder sb = new StringBuilder(128);
	    for (int j = 0; j < nextElement.length(); j++) {
	      char nextChar = nextElement.charAt(j);
	      if ((this.escapechar != 0) && (nextChar == this.quotechar))
	        sb.append(this.escapechar).append(nextChar);
	      else if ((this.escapechar != 0) && (nextChar == this.escapechar))
	        sb.append(this.escapechar).append(nextChar);
	      else {
	        sb.append(nextChar);
	      }
	    }

	    return sb;
	  }

	  public void flush()
	    throws IOException
	  {
	    this.pw.flush();
	  }

	  public void close()
	    throws IOException
	  {
	    flush();
	    this.pw.close();
	    this.rawWriter.close();
	  }

	  public boolean checkError()
	  {
	    return this.pw.checkError();
	  }

	  public void setResultService(ResultSetHelper resultService) {
	    this.resultService = resultService;
	  }

}


CSVUtil:
package com.util;

import java.io.BufferedOutputStream;
import java.io.BufferedWriter;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;



public class CSVUtil {

	public static CSVWriter initDataLoadWriter(String dataloadFilePath)
			throws FileNotFoundException, UnsupportedEncodingException {
		BufferedOutputStream fosDataload;
		BufferedWriter writerDataload;
		CSVWriter dataloadWriter;
		fosDataload = new BufferedOutputStream(new FileOutputStream(dataloadFilePath));
		writerDataload = new BufferedWriter(new OutputStreamWriter(fosDataload,"utf-8"), 5 * 1024 * 1024);
		dataloadWriter = CSVWriter.newDefaultSeparatorNoQuotecharNoEscape(writerDataload);
		return dataloadWriter;
	}
}


Portal:
package com.core;

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import com.pojo.OaToDoEntity;
import com.util.CSVUtil;
import com.util.CSVWriter;

public class Portal {

	public static void main(String[] args) {
		try {
			InputStream in = new FileInputStream("D://test.xlsx");
			String csvFile = getOAInsertBean(in);
			ConnectionPoolManager cpmp = new ConnectionPoolManager(); // wire this up according to your own setup
			insertOAToDo(cpmp, csvFile);
		} catch (Exception e) {
			e.printStackTrace();
		}
		
	}

	public static String getOAInsertBean(InputStream in) {
		String newCSVFileName = "D://temp.csv";

		CSVWriter writer = null;
		try {
			writer = CSVUtil.initDataLoadWriter(newCSVFileName);

			List<String[]> commitRows = new ArrayList<String[]>();
			if (null != in) {
				Workbook workbook = FileUtil.getWorkbook(in);
				if (null != workbook) {
					Sheet sheet = workbook.getSheetAt(0);
					System.out.println("Rows in the Excel sheet: " + sheet.getLastRowNum());
					// read the Excel contents
					for (int j = 0; j <= sheet.getLastRowNum(); j++) {
						Row rowS = sheet.getRow(j);
						if (rowS != null) {
							String[] row =new String[12];
							for (int i=0; i<12; i++) {
								row[i] = FileUtil.getCellValue(rowS.getCell(i)).trim();
							}
							commitRows.add(row);
							if (commitRows.size() == 1000) {
								writer.writeNext(commitRows);
								commitRows.clear();
							}
						}
					}
					// flush any rows left over after the loop (fewer than 1000)
					if (!commitRows.isEmpty()) {
						writer.writeNext(commitRows);
						commitRows.clear();
					}
				}

			}
		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			if (writer != null) {
				try {
					writer.close();
				} catch (Exception e) {
					e.printStackTrace();
				}
			}
		}

		return newCSVFileName;

	}
	
	protected static String getDataLoadSQL(String tableName, String path, String columns) {  
        String sql = "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s CHARACTER SET %s FIELDS TERMINATED BY ',' ENCLOSED BY '\u0000' LINES TERMINATED BY '\n' (%s)";  
        sql = String.format(sql, path,tableName, "utf8", columns);  
        return sql;  
}  
	
	public static void insertOAToDo(ConnectionPoolManager cpmp,  
           String filePath) {  
  
        ConnectionPoolManager cpm = cpmp;  
        try {  
            if (null != filePath) {  
                String clearTb = "TRUNCATE TABLE oa_todo";  
                System.out.println("About to truncate table oa_todo");  
                cpm.execute(clearTb);  
                System.out.println("Finished truncating table oa_todo");  

                System.out.println("Starting data load into oa_todo");  
                cpm.execute(getDataLoadSQL("oa_todo",filePath, "pendingCode,pendingTitle,pendingDate,pendingUserID,pendingURL,pendingStatus,pendingLevel,pendingCityCode,pendingSourceUserID,pendingSource,Status,gsname")); 
                System.out.println("Finished loading data into oa_todo");  
            }  
        } catch (Exception e) {  
            e.printStackTrace();  
            log.error(e);  
        }  
  
    }  
}


This requires opencsv-2.3.jar on the classpath.

December 31, 2013, 15:35

public static void main(String[] args) {
		 List<JSONObject>  list = new ArrayList<JSONObject>();  
		    int i = 0;  
		    JSONObject jObject=null;
		    try{  
		        for(;;i++){  
		        	jObject = new JSONObject();
		        	jObject.put("aa", "aaaaa" + i);
		        	for (char c = 'b'; c <= 'z'; c++) {
		        		jObject.put("" + c + c, "bbbbb" + i); // 25 more string fields per object
		        	}
		            list.add(jObject);  
		        }  
		    }catch(java.lang.OutOfMemoryError e) {  
		    	e.printStackTrace();
		        System.err.println("eror:i =" + i );
		    }  
		   
	}

Just adding objects to a list:
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at net.sf.json.JsonConfig.<init>(JsonConfig.java:80)
	at net.sf.json.JSONObject.element(JSONObject.java:1764)
	at net.sf.json.JSONObject.put(JSONObject.java:2380)
	at ListLen.main(ListLen.java:36)
eror:i =251247


-Xmx1124m

it can only hold about 250,000 entries, so I still suggest processing the list in batches.
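How many entries fit before the error depends on the heap the JVM was actually granted, which you can check at runtime (a trivial sketch):

```java
public class HeapInfo {
    // Maximum heap the JVM will attempt to use (the -Xmx value), in MB.
    public static long maxHeapMB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }
}
```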

December 31, 2013, 14:31

Your approach is the problem. For bulk inserts into MySQL, just use LOAD DATA INFILE; from the look of it, your Excel data could be loaded into the database directly this way.

protected String getDataLoadSQL(String tableName, String path, String columns) {
		String sql = "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s CHARACTER SET %s FIELDS TERMINATED BY ',' ENCLOSED BY '\u0000' LINES TERMINATED BY '\n' (%s)";
		sql = String.format(sql, path,tableName, "utf8", columns);
		return sql;
}



this.simpleDAO.execute(getDataLoadSQL("oa_todo",oafile, "pendingCode,pendingTitle,pendingDate,pendingUserID,pendingURL,pendingStatus,pendingLevel,pendingCityCode,pendingSourceUserID,pendingSource,Status,gsname"));


No need to make it so complicated.
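To sanity-check the generated statement before pointing it at MySQL, you can format it and inspect the result. This sketch uses the same template as above, minus the ENCLOSED BY clause for readability; the path and column list are placeholders:

```java
public class LoadSqlDemo {
    // Build a LOAD DATA statement from a template.
    public static String getDataLoadSQL(String tableName, String path, String columns) {
        String sql = "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s CHARACTER SET %s "
                + "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (%s)";
        return String.format(sql, path, tableName, "utf8", columns);
    }
}
```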

December 31, 2013, 14:00

public static void main(String[] args) {
    List<Integer>  list = new ArrayList<Integer>();
    int i = 0;
    try{
        System.out.println("i =" + i );
        for(;;i++){
            list.add(i);
        }
    }catch(java.lang.OutOfMemoryError e) {
        System.out.println("i =" + i );
    }
 
    }

First check, in your environment, how many entries a list can actually hold.

December 31, 2013, 11:33

POI offers two ways to read Excel. One is DOM-style, like XML DOM parsing: it loads the whole document into memory and then parses it. It lives in the org.apache.poi.xssf.usermodel package and is convenient to use, but memory-hungry. The other is SAX-style: it streams one record at a time, uses very little memory, and is designed specifically for reading large files, though the API is less convenient. That one lives in the org.apache.poi.xssf.eventusermodel package.

December 31, 2013, 10:02

Your approach is too brute-force. With all 240,000 rows put into one list, of course the heap blows up. I suggest setting a threshold, say 1,000 rows: once the list holds 1,000 entries, write them to the database, clear the list, and start filling the next 1,000.
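The batching idea can be sketched with a small helper (hypothetical name BatchUtil; the actual insert call per batch is up to your DAO):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUtil {
    // Split a list into consecutive batches of at most batchSize
    // elements, so each batch can be inserted and then discarded.
    public static <T> List<List<T>> partition(List<T> list, int batchSize) {
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int i = 0; i < list.size(); i += batchSize) {
            int end = Math.min(i + batchSize, list.size());
            batches.add(new ArrayList<T>(list.subList(i, end)));
        }
        return batches;
    }
}
```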

December 30, 2013, 22:03

Adding memory is certainly not the best way to handle a large data volume.
POI loads the entire Excel file into memory, so a large file will inevitably overflow the heap.
It is better to use CSV or a plain-text format, which can be read line by line with stream I/O and processed incrementally.
Alternatively, split the file into chunks and process each chunk.
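Reading a CSV line by line, as suggested, keeps only the current row in memory no matter how large the file is. A minimal sketch (the per-row insert is left as a comment):

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class CsvLineReader {
    // Stream a CSV file one line at a time; returns the row count.
    public static int process(File csv) throws IOException {
        BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(csv), "UTF-8"));
        int rows = 0;
        try {
            String line;
            while ((line = br.readLine()) != null) {
                String[] cols = line.split(",", -1); // keep trailing empty columns
                // insert cols into the database here, ideally batched
                rows++;
            }
        } finally {
            br.close();
        }
        return rows;
    }
}
```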

December 30, 2013, 13:19

Sample code:

public static void getOAInsertBean(InputStream in, ConnectionPoolManager cpm) {  
  ArrayList<OaToDoEntity> al = new ArrayList<OaToDoEntity>();  
  if (null != in) {  
  try {  
    Workbook workbook = FileUtil.getWorkbook(in);  
    if (null != workbook) {  
      Sheet sheet = workbook.getSheetAt(0);  
      System.out.println("Rows in the Excel sheet: " + sheet.getLastRowNum());  
      // read the Excel contents  
      for (int j = 0; j <= sheet.getLastRowNum(); j++) {  
        Row rowS = sheet.getRow(j);  
        OaToDoEntity oa = new OaToDoEntity();  
        if (rowS != null) {  
          oa.setPendingCode(FileUtil.getCellValue(rowS.getCell(0)).trim());  
          oa.setPendingTitle(FileUtil.getCellValue(rowS.getCell(1)).trim());  
          oa.setPendingDate(FileUtil.getCellValue(rowS.getCell(2)).trim());  
          oa.setPendingUserID(FileUtil.getCellValue(rowS.getCell(3)).trim());  
          oa.setPendingURL(FileUtil.getCellValue(rowS.getCell(4)).trim());  
          oa.setPendingStatus(FileUtil.getCellValue(rowS.getCell(5)).trim());  
          oa.setPendingLevel(FileUtil.getCellValue(rowS.getCell(6)).trim());  
          oa.setPendingCityCode(FileUtil.getCellValue(rowS.getCell(7)).trim());  
          oa.setPendingSourceUserID(FileUtil.getCellValue(rowS.getCell(8)).trim());  
          oa.setPendingSource(FileUtil.getCellValue(rowS.getCell(9)).trim());  
          oa.setStatus(FileUtil.getCellValue(rowS.getCell(10)).trim());  
          oa.setGsname(FileUtil.getCellValue(rowS.getCell(11)).trim());  

          al.add(oa);  
        } 
        if (al.size() == 100) { // insert every 100 rows
          // insertBatch is a hypothetical batch-insert helper; note it must
          // NOT truncate the table the way insertOAToDo above does
          insertBatch(cpm, al);
          al.clear();
        } 
      } 
      if (al.size() != 0) { // insert whatever is left over
        insertBatch(cpm, al);
        al.clear();
      }  
    }  
  } catch (Exception e) {  
    e.printStackTrace();  
    log.error(e);  
  }  
  }  
}  

December 30, 2013, 10:42

Is it .xls or .xlsx? If the latter, you can read it with the event-listener model. If the former, then Workbook workbook = FileUtil.getWorkbook(in); may already overflow the heap. Swapping in for a BufferedInputStream only raises the read capacity somewhat; a sufficiently large file will still run out of memory.

December 30, 2013, 10:29

Skip the list entirely: insert each row into the database as soon as it is read from the Excel file.

December 30, 2013, 10:23

POI really is memory-hungry. If the Excel file is large, it can overflow while creating the Workbook itself; use the event model to read it instead.

December 30, 2013, 10:17

I ran into much the same problem: painfully slow as well. My data volume is not at your level, so no memory overflow, but my business logic has a few more steps than yours. Quite frustrating.

December 30, 2013, 10:09

Insert into the database in batches, e.g. after every 100 rows read; don't read everything first and insert afterwards.

December 30, 2013, 09:49
