Text Mining and Ontology Applications in Bioinformatics and GIS

Shamkant B. Navathe
College of Computing
Georgia Institute of Technology

Abstract: This talk will present some general problem areas and solutions in two application areas of machine learning: bioinformatics and Geographic Information Systems (GIS). The bioinformatics arena is very broad and encompasses many problems, such as gene finding in sequences, molecular pathway construction, and protein structure prediction. We will outline our research on extracting important keywords from the biomedical literature using statistical analysis and some natural language analysis. We have also incorporated ontologies such as UMLS (Unified Medical Language System) to determine relationships among biological and medical concepts. The primary goal of this work has been to interpret the long lists of genes derived from microarray experiments that are used to understand and treat diseases. We are able to cluster genes based on their functional similarity. We have also used lists of keywords as feature vectors to drive SVM models for literature classification. In particular, we have dealt with the classification of literature relevant to public health at the CDC (Centers for Disease Control and Prevention). We will briefly explain the discovery of biomarkers for cancer using a technique that combines SVM and gene ontology.

In the latter part, we address the integration of geographic information to allow collaboration and data exchange between private and public organizations, as well as among the different levels of government. Research has shown that ontologies enhance the conceptual modeling of geographic data and allow multiple sources of information to be integrated more effectively and efficiently. Issues such as fuzziness of features, differing levels of accuracy, precision, and scale, heterogeneity of data models, and generalization of concepts may be resolved using ontologies.
It remains a challenge to use ontologies to resolve the diverse geo-integration issues automatically. One area that we are investigating is the integration of data on shelters and hospitals with appropriate, diverse geo-information sources so as to improve emergency management during floods, hurricanes, and other natural disasters. Time permitting, the speaker will also mention some recent work on modeling circulation within buildings, using better conceptual models of space and users and developing a constraint-based approach to validate building designs.