Information Retrieval Resources
Information on Information Retrieval (IR) books, courses, conferences and other resources.
Books on Information Retrieval (General)
Introduction to Information Retrieval.
C.D. Manning, P. Raghavan, H. Schütze. Cambridge UP, 2008. Classical and web information retrieval systems: algorithms, mathematical foundations and practical issues.
Modern Information Retrieval.
R. Baeza-Yates, B. Ribeiro-Neto. Addison-Wesley, 1999. Widely used and cited.
Information Retrieval: Algorithms and Heuristics.
D.A. Grossman, O. Frieder. Springer, 2004. Excellent textbook.
Managing Gigabytes.
I.H. Witten, A. Moffat, T.C. Bell. Morgan Kaufmann, 1999. The authority on index construction and compression.
Finding Out About.
R. Belew. Cambridge UP, 2001. More suitable for undergraduate classes than other books listed here.
Information Retrieval: A Health and Biomedical Perspective.
W.R. Hersh. Springer, 2002. As the title says: a health/biomedical perspective.
TREC: Experiment and Evaluation in Information Retrieval.
E.M. Voorhees, D.K. Harman. MIT Press, 2005. A survey of recent research results.
Language Modeling for Information Retrieval.
W.B. Croft, J. Lafferty. Springer, 2003. Language models are of increasing importance in IR.
Readings in Information Retrieval.
K. Sparck Jones, P. Willett. Morgan Kaufmann, 1997.
A collection of classical IR papers.
Recommended Reading for IR Research Students.
A. Moffat, J. Zobel, D. Hawking. SIGIR Forum, 39(2), 2005. Not a book, but a collection of seminal papers, more up-to-date than Sparck Jones et al.
Information Storage and Retrieval Systems.
G. Kowalski, M.T. Maybury. Springer, 2005. "... takes a system approach, discussing all aspects of an Information Retrieval System."
The Geometry of Information Retrieval.
C.J. van Rijsbergen. Cambridge UP, 2004. An ambitious attempt to develop quantum mechanics as a new foundation for IR.
Introduction to Modern Information Retrieval.
G.G. Chowdhury. Neal-Schuman, 2003.
Intended for students of library and information studies.
Text Information Retrieval Systems.
C.T. Meadow, B.R. Boyce, D.H. Kraft, C.L. Barry. Academic Press, 2007.
Also takes a library/information science perspective.
More Books
Books on Web Information Retrieval
Information Retrieval in Practice.
B. Croft, D. Metzler, T. Strohman. Pearson Education, 2009.
Mining the Web: Analysis of Hypertext and Semi Structured Data.
S. Chakrabarti. Morgan Kaufmann, 2002. The best introduction for web-centric IR.
Google's PageRank and Beyond: The Science of Search Engine Rankings.
Amy N. Langville, Carl D. Meyer. Princeton University Press, 2006. More focused on the algorithms of PageRank, but also covers general web IR.
Modeling the Internet and the Web: Probabilistic Methods and Algorithms.
P. Baldi, P. Frasconi, P. Smyth. Wiley, 2003. A bit terse. Recommended for those who have a good foundation in probability theory but are new to IR.
Good books for implementing a search engine
Managing Gigabytes (see above)
Building Search Applications: Lucene, Lingpipe, and Gate.
M. Konchady. Mustru Publishing, 2008.
Lucene in Action.
O. Gospodnetic, E. Hatcher. Manning Publications, 2004.
Spidering Hacks.
K. Hemenway, T. Calishain. O'Reilly, 2003.
Online Books - Browsable
Introduction to Information Retrieval (see above)
Finding Out About (see above)
Information Retrieval.
C. J. van Rijsbergen. Butterworths, 1979. The classic. Almost 40 years old, but still worth reading.
Information Retrieval.
T. van der Weide. 2004. Introduction to IR and hypertext.
Online Books - PDF
Introduction to Information Retrieval (see above)
Information Retrieval in Practice.
B. Croft, D. Metzler, T. Strohman. Pearson Education, 2009. (two chapters)
Information Retrieval.
C. J. van Rijsbergen. Butterworths, 1979.
Information Retrieval Interaction.
P. Ingwersen. Taylor Graham, 1992. Focuses on user interaction in IR.
Information Retrieval: A Survey.
Ed Greengrass. 2000. Good survey of "classical" IR, but little or no coverage of recent work (e.g., language models, PageRank, SVMs).
Various tutorials at Mi Islita
Research Centers
CMU (LTI)
Dublin CU
Geneva (Viper)
Glasgow
Helsinki Institute for Information Technology
IBM
Illinois Institute of Technology
Information Retrieval Facility (IRF)
Microsoft Research
NIST
Peking
Pittsburgh
Queen Mary
Sheffield
UIUC
UMASS
Courses
Berkeley (SIMS)
CMU
Cornell
DePaul
IIT
Johns Hopkins I
Johns Hopkins II
Maryland
MPI
Otago
Pittsburgh
Princeton
Stanford
Stuttgart
Texas
UMASS
Problem Sets / Assignments
IIR exercises
Bilkent
DePaul
Minas Gerais
North Texas
Stuttgart
Tennessee
Web Information Retrieval
webir.org
Search Engine Watch
Users' Guide to Web Searching
PageRank
Subareas, Applications, Methods
Graphical interfaces to support information search
Information Retrieval & Extraction
Information Retrieval & Machine Learning
Text Mining & Web Mining
INEX: XML retrieval
Geographic Information Retrieval
Music Information Retrieval
CLIR & Multilingual Information Retrieval
Cross-Language Information Retrieval (CLIR) Resources
N-Grams in Information Retrieval
Agent-based Information Retrieval
Audio Information Retrieval
Adversarial Information Retrieval
Conferences
TREC
Cross Language Evaluation Forum (CLEF)
SIGIR 2008 (last), SIGIR 2009 (next)
CIKM 2008, CIKM 2009
WWW 2008, WWW 2009
JCDL 2008, JCDL 2009
RIAO 2007, RIAO 2010
ECIR 2009, ECIR 2010
SPIRE 2008, SPIRE 2009
Norbert Fuhr's IR conference calendar
Journals
ACM Transactions on Information Systems (TOIS): dblp, home
Information Processing and Management (IP&M): dblp, home
Information Retrieval: dblp, home
International Journal on Digital Libraries: dblp, home
Journal of the American Society for Information Science and Technology (JASIST): dblp, home
SIGIR Forum: dblp, home
Journal of Documentation
D-Lib Magazine
Data & Knowledge Engineering: dblp, home
Information Processing Letters: dblp, home
Information Research
Information Systems: dblp, home
Journal of Intelligent Information Systems: dblp, home
Knowledge and Information Systems: dblp, home
Foundations and Trends in Information Retrieval: home
Popular Articles
Wikipedia: Information Retrieval
A. Singhal: Modern Information Retrieval: A Brief Overview
D. Austin: How Google Finds Your Needle in the Web's Haystack
S.E. Robertson, K. Sparck Jones: Simple, proven approaches to text retrieval
Bruce Croft: What Do People Want From IR
Information Retrieval on the World Wide Web
Michael Lesk: The Seven Ages of Information Retrieval
Software
C. Middleton, R. Baeza-Yates: A Comparison of Open Source Search Engines (contains an up-to-date list of available search engine software)
Doug Oard's list of available text retrieval systems
Avi Rappoport: open source search engines
MySQL full text search
Text to Matrix Generator, a MATLAB toolbox for indexing, retrieval and other text processing tasks
Collections
U. of Glasgow list of available text retrieval collections
NLP/IR corpus list at NUS
NLP/IR corpus list at Edinburgh
SMART at Cornell (downloads of a number of collections, stop lists, SMART retrieval system etc.)
Internet Archive (limited availability)
Linguistic Data Consortium
Professional Organizations
ACM SIGIR
BCS IRSG
Other Collections of Information Retrieval Links
ACM SIGIR
David Karger
Other Resources
Glossary (Modern Information Retrieval)
Information retrieval research links @ Search Tools
BUBL: Information Retrieval Links
LSU: Information Retrieval Systems
Open Directory: Information Retrieval Links
UBC: Indexing Resources
IR & Neural Networks, Symbolic Learning, Genetic Algorithms
A stop list (a list of stop words)
Chris Manning's NLP resources
Weiguo Patrick Fan's text mining links