Presentation Title
Understanding Biological Research Documents using a Neural Network
Degree Name
Master of Computer Science and Systems (MCSS)
Department
Institute of Technology
Location
Carwein Auditorium (KEY 102), UW Tacoma
Event Website
http://guides.lib.uw.edu/tactalks
Start Date
19-5-2016 6:55 PM
End Date
19-5-2016 7:00 PM
Abstract
As human beings, we work with patterns to learn and comprehend complex information, especially in text. Related concepts help us find the closest meaning of an unknown concept. For this reason, encyclopedias, as a form of ontology, are frequently used to research and explore concepts and to find forward references. The brain learns patterns with a neural network, which motivates us to build an artificial neural network that learns similar patterns.
Ontologies are a simple way of expressing concepts hierarchically. When analyzing text, ontologies help define the surrounding context and can greatly reduce research time. However, creating ontologies poses several challenges, such as understanding words from specialized domains that are not in the general vocabulary and defining relations between entities.
WordNet, Wikipedia (DBpedia), and ConceptNet are examples of human-curated or semi-curated databases from which relations can be derived. Relations extracted in this manner are typically of high quality because they are collected through peer-reviewed processes, as in WordNet and Wikipedia, or crowdsourced processes, as in ConceptNet.
Our goal for this project is to generalize the relations learned from semi-curated databases in order to extract similar relations from a domain-specific corpus, such as biological research publications. Our system will convert these relations into a graph with numerical attributes in order to understand domain-specific patterns. Next, we generalize these patterns and apply them to a new corpus to generate annotations, which we will use to build ontologies. We then validate the new ontologies against domain-specific dictionaries to measure accuracy.
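As a rough illustration of the graph step described above, the following minimal sketch turns relation triples into a graph whose edges carry a numerical attribute. Everything in it is an assumption made for illustration: the example triples, the relation names `is_a` and `part_of`, the use of an occurrence count as the numerical attribute, and the helper names `build_relation_graph` and `ancestors`. The abstract does not specify the project's actual data model or attributes.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples, standing in for relations
# drawn from a semi-curated source. These are illustrative, not project data.
TRIPLES = [
    ("enzyme", "is_a", "protein"),
    ("kinase", "is_a", "enzyme"),
    ("protein", "part_of", "cell"),
    ("kinase", "is_a", "enzyme"),  # repeated observation of the same relation
]


def build_relation_graph(triples):
    """Map each head concept to its outgoing edges.

    Each edge is keyed by (relation, tail) and stores how many times the
    triple was seen. The count is an assumed stand-in for the "numerical
    attributes" mentioned in the abstract.
    """
    graph = defaultdict(dict)
    for head, relation, tail in triples:
        key = (relation, tail)
        graph[head][key] = graph[head].get(key, 0) + 1
    return graph


def ancestors(graph, concept, relation="is_a"):
    """Follow one relation type upward to list a concept's ancestors."""
    found = []
    frontier = [concept]
    seen = {concept}
    while frontier:
        node = frontier.pop()
        # graph.get avoids creating empty entries for leaf concepts.
        for rel, tail in graph.get(node, {}):
            if rel == relation and tail not in seen:
                seen.add(tail)
                found.append(tail)
                frontier.append(tail)
    return found


graph = build_relation_graph(TRIPLES)
kinase_ancestors = ancestors(graph, "kinase")
```

Following only `is_a` edges yields a simple hierarchy for a concept, which is one plausible way to arrange extracted concepts before validating them against a domain dictionary.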
The project will result in a service that, given a research article, highlights important concepts and arranges them in hierarchical order with cross-references from the entire corpus. This will greatly reduce research time, as the system will proactively suggest meanings for the concepts in any article.
https://digitalcommons.tacoma.uw.edu/tactalks/2016/spring/3