Extracting Various Classes of Data from Biological Text

19 Nov
Author : NS2 Projects Category : NS2 PROJECTS 2015
Tags : Ns2 Projects, Ns2 Projects for students, Ns2 Projects with source code

One of the key goals of biological Natural Language Processing (NLP) is the automatic extraction of information from biomedical publications. Most current constituency and dependency parsers overlook the semantic relationships between the constituents that make up a sentence, and may not be well suited to capturing complex long-distance dependencies. In this paper, we propose EDC_EDC, a hybrid constituency-dependency parser for information extraction in biological NLP.

EDC_EDC aims to advance the state of the art in biological text mining by applying novel computational linguistic techniques that overcome the limitations of current constituency and dependency parsers outlined above. It does so in two ways: (1) it determines the semantic relationship between each pair of constituents in a sentence using novel semantic rules, and (2) it applies a semantic relationship extraction model that extracts information from different structural forms of constituents in sentences. EDC_EDC can be used to extract different types of data from biological texts for purposes such as protein function prediction, genetic network construction, and protein-protein interaction detection. We evaluated the quality of EDC_EDC by comparing it experimentally with six systems. The results showed marked improvement.
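To make the idea of pairwise, rule-based relation extraction more concrete, here is a minimal Python sketch. It is not EDC_EDC and does not reproduce the paper's semantic rules, which the abstract does not specify. The trigger-verb list, the `extract_relations` function, the chunk labels, and the example sentence are all assumptions chosen for illustration. The sketch takes a sentence that has already been split into labeled constituents, examines each pair of noun phrases, and emits a relation when exactly one trigger verb and no other noun phrase lies between them.

```python
from itertools import combinations

# Hypothetical trigger verbs for illustration; not taken from the paper.
INTERACTION_VERBS = {"binds", "activates", "inhibits", "phosphorylates"}


def extract_relations(constituents):
    """Return (left, verb, right) triples from pre-chunked constituents.

    `constituents` is a list of (text, label) tuples in sentence order,
    where label is a coarse phrase type such as "NP", "VP", or "PP".
    """
    noun_phrases = [
        (index, text)
        for index, (text, label) in enumerate(constituents)
        if label == "NP"
    ]
    relations = []
    # Examine every pair of noun phrases, in sentence order.
    for (i, left), (j, right) in combinations(noun_phrases, 2):
        between = constituents[i + 1:j]
        # Toy rule: skip pairs separated by another noun phrase.
        if any(label == "NP" for _, label in between):
            continue
        verbs = [
            text
            for text, label in between
            if label == "VP" and text.lower() in INTERACTION_VERBS
        ]
        # Toy rule: exactly one trigger verb links the two noun phrases.
        if len(verbs) == 1:
            relations.append((left, verbs[0], right))
    return relations


# Illustrative input, chunked by hand.
sentence = [
    ("RAD51", "NP"),
    ("binds", "VP"),
    ("BRCA2", "NP"),
    ("in", "PP"),
    ("the nucleus", "NP"),
]
print(extract_relations(sentence))
```

Because this toy rule only links noun phrases with nothing but a verb between them, it cannot capture the long-distance dependencies that the abstract identifies as the hard case; handling those is what the hybrid parser described above is meant to address.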

Extracting Various Classes of Data from Biological Text using the Concept of Existence Dependency
