Using Java 7
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Test {
    public static void main(String[] args) throws IOException {
        Path source = Paths.get("c:/temp/0multipage.tif");
        System.out.println(Files.probeContentType(source));
        // output : image/tiff
    }
}
The default implementation is OS-specific and not very complete. It is possible to register a better detector, for example one based on Apache Tika; see "Transparently improve Java 7 mime-type recognition with Apache Tika".
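As a rough sketch of what registering a better detector involves: Java 7 lets you plug in a subclass of java.nio.file.spi.FileTypeDetector, which Files.probeContentType consults before falling back to the default implementation. The class name ExtensionFileTypeDetector and its tiny extension table are made up for illustration; a real detector (a Tika-based one, for instance) would look at the file content rather than the name.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileTypeDetector;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical detector: the class name and the extension table are illustrative only.
public class ExtensionFileTypeDetector extends FileTypeDetector {

    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("tif", "image/tiff");
        TYPES.put("tiff", "image/tiff");
        TYPES.put("docx",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
    }

    @Override
    public String probeContentType(Path path) throws IOException {
        Path fileName = path.getFileName();
        if (fileName == null) {
            return null;
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null; // no extension to look at
        }
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Returning null means "unknown", so the next detector gets a chance.
        return TYPES.get(ext);
    }

    public static void main(String[] args) throws IOException {
        // Called directly here; no file needs to exist because only the name is used.
        FileTypeDetector detector = new ExtensionFileTypeDetector();
        System.out.println(detector.probeContentType(Paths.get("0multipage.tif")));
        // output : image/tiff
    }
}
```

For Files.probeContentType to pick the detector up, its fully qualified class name would be listed in a file named META-INF/services/java.nio.file.spi.FileTypeDetector on the classpath.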
Using javax.activation.MimetypesFileTypeMap
activation.jar is required; it can be downloaded from http://java.sun.com/products/javabeans/glasgow/jaf.html.
The MimetypesFileTypeMap class is used to map a File to a Mime Type. The supported Mime types are defined in a resource file inside activation.jar.
import javax.activation.MimetypesFileTypeMap;
import java.io.File;

class GetMimeType {
    public static void main(String args[]) {
        File f = new File("gumby.gif");
        System.out.println("Mime Type of " + f.getName() + " is " +
                           new MimetypesFileTypeMap().getContentType(f));
        // expected output :
        // "Mime Type of gumby.gif is image/gif"
    }
}
The built-in mime-type list is very limited, but there is a simple mechanism for adding more Mime types and extensions.
The MimetypesFileTypeMap looks in various places in the user's system for MIME types file entries. When requests are made to search for MIME types in the MimetypesFileTypeMap, it searches MIME types files in the following order:
- Programmatically added entries to the MimetypesFileTypeMap instance.
- The file .mime.types in the user's home directory.
- The file <java.home>/lib/mime.types.
- The file or resources named META-INF/mime.types.
- The file or resource named META-INF/mimetypes.default (usually found only in the activation.jar file).
This method is useful when the incoming files have normalized filenames. It is very fast because only the extension is used to guess the type of a given file.
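The search order listed in this section can be pictured as a stack of extension tables that are consulted from top to bottom, with the first match winning. The sketch below only illustrates that idea with plain maps; it is not how MimetypesFileTypeMap is implemented, and the class LayeredMimeLookup, its methods and the sample entries are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustration only: layers are searched in the order they were added.
public class LayeredMimeLookup {

    private final List<Map<String, String>> layers = new ArrayList<>();

    // Add a table mapping extension -> mime type; earlier layers take priority.
    public void addLayer(Map<String, String> extensionToType) {
        layers.add(extensionToType);
    }

    public String getContentType(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot >= 0 && dot < fileName.length() - 1) {
            // Lower-casing is a choice made by this sketch.
            String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            for (Map<String, String> layer : layers) {
                String type = layer.get(ext);
                if (type != null) {
                    return type; // first layer that knows the extension wins
                }
            }
        }
        return "application/octet-stream"; // fallback for unknown extensions
    }

    public static void main(String[] args) {
        Map<String, String> programmatic = new HashMap<>();
        programmatic.put("gif", "image/x-custom-gif");

        Map<String, String> defaults = new HashMap<>();
        defaults.put("gif", "image/gif");
        defaults.put("txt", "text/plain");

        LayeredMimeLookup lookup = new LayeredMimeLookup();
        lookup.addLayer(programmatic); // highest priority
        lookup.addLayer(defaults);     // lowest priority

        System.out.println(lookup.getContentType("gumby.gif")); // image/x-custom-gif
        System.out.println(lookup.getContentType("notes.txt")); // text/plain
        System.out.println(lookup.getContentType("data.xyz"));  // application/octet-stream
    }
}
```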
Using java.net.URL
Warning: this method is very slow!
As with the previous method, the match is done on the extension. The mapping between the extension and the mime-type is defined in the file [jre_home]\lib\content-types.properties
import java.net.*;

public class FileUtils {
    public static String getMimeType(String fileUrl)
        throws java.io.IOException, MalformedURLException
    {
        String type = null;
        URL u = new URL(fileUrl);
        URLConnection uc = null;
        uc = u.openConnection();
        type = uc.getContentType();
        return type;
    }

    public static void main(String args[]) throws Exception {
        System.out.println(FileUtils.getMimeType("file://c:/temp/test.TXT"));
        // output : text/plain
    }
}
A note from R. Lovelock :
I was trying to find the best way of getting the mime type of a file and found your site very useful. However, I have now found a way of getting the mime type using URLConnection that isn't as slow as the way you describe.
import java.net.FileNameMap;
import java.net.URLConnection;

public class FileUtils {
    public static String getMimeType(String fileUrl)
        throws java.io.IOException
    {
        FileNameMap fileNameMap = URLConnection.getFileNameMap();
        String type = fileNameMap.getContentTypeFor(fileUrl);
        return type;
    }

    public static void main(String args[]) throws Exception {
        System.out.println(FileUtils.getMimeType("file://c:/temp/test.TXT"));
        // output : text/plain
    }
}
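For comparison, URLConnection also exposes two static helpers that avoid opening a connection: guessContentTypeFromName, which works from the name alone, and guessContentTypeFromStream, which peeks at the first bytes of a stream that supports mark/reset. The sketch below is one possible way to call them; the class name UrlConnectionGuess and the helper guessFromBytes are made up for the example, and the in-memory GIF header stands in for a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class UrlConnectionGuess {

    // Hypothetical helper: guesses a type from the leading bytes of some content.
    public static String guessFromBytes(byte[] head) throws IOException {
        // ByteArrayInputStream supports mark/reset, which the JDK method requires.
        try (InputStream in = new ByteArrayInputStream(head)) {
            return URLConnection.guessContentTypeFromStream(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Name-based guess: only the extension is considered.
        System.out.println(URLConnection.guessContentTypeFromName("test.txt"));

        // Content-based guess: the first bytes of a GIF file.
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guessFromBytes(gifHeader)); // image/gif
    }
}
```

Both helpers return null when they cannot make a guess, so callers should be prepared for that case.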
Using Apache Tika
Tika started as a subproject of Lucene, a search engine, and is now a top-level Apache project. It is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
This package is very up-to-date regarding the file types supported; Office 2007 formats are supported (docx/pptx/xlsx/etc.).
Tika has a lot of dependencies, almost 20 jars! But it can do a lot more than detect the file type. For example, you can parse a PDF or DOC to extract the text and the metadata very easily.
import java.io.File;
import java.io.FileInputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;

public class Main {

    public static void main(String args[]) throws Exception {

        FileInputStream is = null;
        try {
            File f = new File("C:/Temp/mime/test.docx");
            is = new FileInputStream(f);

            ContentHandler contenthandler = new BodyContentHandler();
            Metadata metadata = new Metadata();
            metadata.set(Metadata.RESOURCE_NAME_KEY, f.getName());
            Parser parser = new AutoDetectParser();
            // OOXMLParser parser = new OOXMLParser();
            parser.parse(is, contenthandler, metadata);
            System.out.println("Mime: " + metadata.get(Metadata.CONTENT_TYPE));
            System.out.println("Title: " + metadata.get(Metadata.TITLE));
            System.out.println("Author: " + metadata.get(Metadata.AUTHOR));
            System.out.println("content: " + contenthandler.toString());
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (is != null) is.close();
        }
    }
}
A ZIP containing the required jars is available for download if you want to try it out.
Using JMimeMagic
Checking the file extension is not a very reliable way to determine the file type. A more robust solution is possible with the JMimeMagic library. JMimeMagic is a Java library (LGPL licence) that retrieves file and stream mime types by checking magic headers.
// snippet for JMimeMagic lib
// http://sourceforge.net/projects/jmimemagic/

Magic parser = new Magic();
// getMagicMatch accepts Files or byte[],
// which is nice if you want to test streams
MagicMatch match = parser.getMagicMatch(new File("gumby.gif"));
System.out.println(match.getMimeType());
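To make the "magic header" idea concrete, here is a minimal hand-rolled sketch that compares the first bytes of some content against a few well-known signatures (PNG, GIF, PDF, ZIP). It does not use JMimeMagic and covers only a handful of formats; the class MagicSniffer and its methods are hypothetical names chosen for the example.

```java
import java.nio.charset.StandardCharsets;

// Illustration of magic-header matching; not a replacement for a real library.
public class MagicSniffer {

    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF = {'G', 'I', 'F', '8'};
    private static final byte[] PDF = {'%', 'P', 'D', 'F'};
    private static final byte[] ZIP = {'P', 'K', 0x03, 0x04};

    // True if data begins with the given signature bytes.
    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns a mime type for the leading bytes, or null if no signature matches.
    public static String sniff(byte[] head) {
        if (startsWith(head, PNG)) return "image/png";
        if (startsWith(head, GIF)) return "image/gif";
        if (startsWith(head, PDF)) return "application/pdf";
        if (startsWith(head, ZIP)) return "application/zip";
        return null;
    }

    public static void main(String[] args) {
        byte[] pdfStart = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sniff(pdfStart));                 // application/pdf
        System.out.println(sniff(new byte[] {'h', 'i'}));    // null
    }
}
```

Note that container formats share signatures: a .docx or .xlsx file is a ZIP archive, so a header check alone reports it as application/zip. Telling such formats apart requires looking inside the container, which dedicated libraries handle.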
Thanks to Jean-Marc Autexier and sygsix for the tip!
Using mime-util
Another tool is mime-util. This tool can detect using the file extension or the magic header technique.
import java.io.File;
import java.util.Collection;

import eu.medsea.mimeutil.MimeUtil;

public class Main {
    public static void main(String[] args) {
        // register the detector that inspects magic headers
        MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
        File f = new File("c:/temp/mime/test.doc");
        Collection<?> mimeTypes = MimeUtil.getMimeTypes(f);
        System.out.println(mimeTypes);
        // output : application/msword
    }
}
The nice thing about mime-util is that it is very lightweight: its only dependency is slf4j.
Using Droid
DROID (Digital Record Object Identification) is a software tool to perform automated batch identification of file formats.
DROID uses internal and external signatures to identify and report the specific file format versions of digital files. These signatures are stored in an XML signature file, generated from information recorded in the PRONOM technical registry. New and updated signatures are regularly added to PRONOM, and DROID can be configured to automatically download updated signature files from the PRONOM website via web services.
It can be invoked from two interfaces, a Java Swing GUI or a command line interface.
http://droid.sourceforge.net/wiki/index.php/Introduction
Aperture framework
Aperture is an open source library and framework for crawling and indexing information sources such as file systems, websites and mail boxes.
The Aperture code consists of a number of related but independently usable parts:
- Crawling of information sources: file systems, websites, mail boxes
- MIME type identification
- Full-text and metadata extraction of various file formats
- Opening of crawled resources
For each of these parts, a set of APIs has been developed and a number of implementations is provided.
Other methods
ONE:
FileDataSource fds = new FileDataSource(new File("c:/some/path/file.xlsx"));
System.out.println("Content-Type is: "+fds.getContentType());
TWO:
import java.net.FileNameMap;
import java.net.URLConnection;

public class FileUtils {
    public static String getMimeType(String fileUrl)
        throws java.io.IOException
    {
        FileNameMap fileNameMap = URLConnection.getFileNameMap();
        String type = fileNameMap.getContentTypeFor(fileUrl);
        return type;
    }

    public static void main(String args[]) throws Exception {
        System.out.println(FileUtils.getMimeType("file://c:/temp/test.TXT"));
        // output : text/plain
    }
}
THREE:
Apache Tika 1.3 offers tika-core (http://search.maven.org/#artif...|org.apache.tika|tika-core|1.3|bundle), which does NOT load any more dependencies.
Minimal code example (with theInputStream and theFileName being the "input"):
try (InputStream is = theInputStream;
     BufferedInputStream bis = new BufferedInputStream(is)) {
    AutoDetectParser parser = new AutoDetectParser();
    Detector detector = parser.getDetector();
    Metadata md = new Metadata();
    md.add(Metadata.RESOURCE_NAME_KEY, theFileName);
    MediaType mediaType = detector.detect(bis, md);
    return mediaType.toString();
}
http://blog.csdn.net/qiuhan/article/details/12586943