Data Mining: Information Extraction
Methods and challenges in information extraction for data mining
Keywords:
data mining, information extraction, natural-language, named entities, relationships, text corpus, documents, data-mining techniques, patterns, biomedical abstracts
Abstract
An important approach to text mining involves the use of natural-language information extraction. Information extraction (IE) distills structured data or knowledge from unstructured text by identifying references to named entities as well as stated relationships between such entities. IE systems can be used to directly extricate abstract knowledge from a text corpus, or to extract concrete data from a set of documents which can then be further analyzed with traditional data-mining techniques to discover more general patterns. We discuss methods and implemented systems for both of these approaches and summarize results on mining real text corpora of biomedical abstracts, job announcements, and product descriptions. We also discuss challenges that arise when employing current information extraction technology to discover knowledge in text.
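The second approach described in the abstract (extract concrete data from documents, then mine it for more general patterns) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the three job-announcement sentences, the `SKILLS` dictionary, and the helper names `extract_skills` and `mine_pairs` are hypothetical, and simple dictionary matching plus pair counting stands in for the far richer extraction and mining methods the article discusses.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus of job announcements (not from the article).
DOCS = [
    "Seeking a developer with Java and SQL experience in Austin.",
    "Austin startup needs Python and SQL skills.",
    "Remote role: Java, Python, and SQL required.",
]

# Hypothetical dictionary of skill names used to recognize entities.
SKILLS = ["Java", "Python", "SQL"]
SKILL_RE = re.compile(r"\b(" + "|".join(map(re.escape, SKILLS)) + r")\b")


def extract_skills(text):
    """Extraction step: return the sorted set of skills mentioned in one document."""
    return sorted(set(SKILL_RE.findall(text)))


def mine_pairs(records, min_support=2):
    """Mining step: keep skill pairs that co-occur in at least min_support records."""
    counts = Counter()
    for skills in records:
        counts.update(combinations(skills, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Unstructured text -> structured records -> frequent co-occurrence patterns.
records = [extract_skills(doc) for doc in DOCS]
frequent_pairs = mine_pairs(records)
```

With this toy input, `records` holds one list of skills per announcement and `frequent_pairs` keeps only the pairs seen in at least two announcements. A real system would replace the dictionary lookup with a trained extractor and the pair counting with a rule-mining algorithm.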
Published
2012-02-01
Section
Articles