Presenter Information

Varun Mittal

Degree Name

Master of Computer Science and Systems (MCSS)

Department

Institute of Technology

Location

Carwein Auditorium (KEY 102), UW Tacoma

Event Website

http://guides.lib.uw.edu/tactalks

Start Date

19-5-2016 6:55 PM

End Date

19-5-2016 7:00 PM

Abstract

As human beings, we learn and comprehend complex information, especially text, by working with patterns. Related concepts help us find the closest meaning of an unknown concept. Encyclopedias, as a form of ontology, are therefore frequently used to research and explore concepts and to find forward references. Because the brain learns patterns with a neural network, we are motivated to create an artificial neural network that learns similar patterns.

Ontologies are a simple way of expressing concepts hierarchically. When analyzing text, ontologies help define the surrounding context and can greatly reduce research time. However, creating ontologies poses several challenges, such as understanding words from specialized domains that are not in the general vocabulary and defining relations between entities.

WordNet, Wikipedia (DBpedia), and ConceptNet are examples of human-curated or semi-curated databases from which relations can be derived. Relations extracted in this manner are typically of high quality because they are collected through peer-reviewed processes, as in WordNet and Wikipedia, or crowdsourced processes, as in ConceptNet.

Our goal for this project is to generalize the relations learned from semi-curated databases in order to extract similar relations from a domain-specific corpus, such as biological research publications. Our system will convert these relations into a graph with numerical attributes in order to understand domain-specific patterns. Next, we generalize these patterns and apply them to a new corpus to generate annotations, which we will use to build ontologies. We then validate these new ontologies against domain-specific dictionaries to measure accuracy.
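As a rough illustration of the step that turns relations into a graph with numerical attributes, the sketch below builds a small graph from relation triples and computes a few numeric features per concept. The triples, the relation names (`IsA`, `RelatedTo`), and the chosen features (in-degree, out-degree, depth of the `IsA` chain) are all assumptions made for this example. The abstract does not specify which attributes the system uses.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triples of the kind a curated source
# might provide; these values are illustrative, not taken from the project.
TRIPLES = [
    ("enzyme", "IsA", "protein"),
    ("protein", "IsA", "molecule"),
    ("kinase", "IsA", "enzyme"),
    ("enzyme", "RelatedTo", "catalysis"),
]


def build_graph(triples):
    """Return an adjacency mapping: head -> list of (relation, tail)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def isa_depth(graph, node, seen=None):
    """Length of the longest IsA chain going up from node (cycle-safe)."""
    seen = seen or set()
    if node in seen:
        return 0
    seen = seen | {node}
    parents = [tail for rel, tail in graph.get(node, []) if rel == "IsA"]
    if not parents:
        return 0
    return 1 + max(isa_depth(graph, parent, seen) for parent in parents)


def node_features(triples):
    """Attach simple numeric attributes (assumed for this sketch) to each node."""
    graph = build_graph(triples)
    nodes = {n for head, _, tail in triples for n in (head, tail)}
    in_degree = defaultdict(int)
    for _, _, tail in triples:
        in_degree[tail] += 1
    return {
        node: {
            "out_degree": len(graph.get(node, [])),
            "in_degree": in_degree[node],
            "isa_depth": isa_depth(graph, node),
        }
        for node in nodes
    }


if __name__ == "__main__":
    for concept, attrs in sorted(node_features(TRIPLES).items()):
        print(concept, attrs)
```

Feature vectors of this kind could serve as input to a learned model. Which features are informative for a biological corpus is a question the sketch leaves open.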

The project will result in a service that, given a research article, highlights important concepts and arranges them in a hierarchical order with cross-references from the entire corpus. This will greatly reduce research time, as the system will proactively suggest meanings for the concepts in any article.


Understanding Biological Research Documents using a Neural Network


https://digitalcommons.tacoma.uw.edu/tactalks/2016/spring/3