| Who are we |
| |
The Service
d'Informatique Médicale (SIM) is part of the Radiology
and Medical Informatics Department of the University Hospitals
of Geneva. It is in charge of developing medical
applications such as the patient record, medical orders and other
knowledge-based applications.
One group within the SIM has long specialized in Natural
Language Processing. Under the leadership of Robert Baud,
several scientists have spent months or years as active participants
in this group: Anne-Marie Rassinoux, Christian Lovis,
Judith Wagner, Laurence Alpay, Patrick Ruch and Paul Fabry. See the
list of publications for more details.
|
| The Patient Process Ontology |
| |
The Patient
Record is the main source of information about patients and
for knowledge extraction. Because the Patient Record consists mainly
of free text, one has to concentrate on the main axes governing
its content. It appears that two axes account for up to 70% of
the whole content: body part and process. In other words, the
story of a patient is composed of a set of statements like "Process
has_location BodyPart". Fortunately, a satisfactory model of anatomy
is available in the form of the Foundational Model of Anatomy
(FMA). The situation is more difficult for a model
of Patient Process: numerous terminologies may help, but
none has reached the level of a well-formed ontology,
and the specific aspects of a process as found in a narrative
about the patient have not been sufficiently considered. There
is clearly a need for a Patient Process Ontology (PPO).
Unlike the many ontologies that deal principally with
endurant objects, processes are occurrent objects: they occur at
some point in time, which means they have a start time and a stop
time. The Patient Record can be seen as a set of co-occurring
processes concerning the patient. Simultaneously, the patient is
recovering from pneumonia, is following a care process complicated
by diabetes, is taking a prescribed drug, is
subject to an allergic reaction, and is growing older. Altogether,
a single-sentence description already contains six different
parallel processes, not necessarily connected by causal links.
This is typically the essence of the Patient Record.
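
As a minimal sketch of this view, the following Python fragment represents each process as an occurrent with a start time, an optional stop time and an optional "has_location" body part, and lists the processes that co-occur on a given day. The class name, field names, dates and body part labels are illustrative assumptions, not part of the PPO itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PatientProcess:
    """An occurrent: something that unfolds between a start and a stop time."""
    label: str
    start: date
    stop: Optional[date] = None          # None means the process is still ongoing
    has_location: Optional[str] = None   # body part label (illustrative only)

    def is_active_on(self, day: date) -> bool:
        """True if the process has started and not yet stopped on the given day."""
        return self.start <= day and (self.stop is None or day <= self.stop)


def co_occurring(processes, day):
    """Return the labels of all processes active on the given day."""
    return [p.label for p in processes if p.is_active_on(day)]


# Hypothetical record mirroring the six parallel processes described above.
record = [
    PatientProcess("aging", date(1950, 3, 1)),
    PatientProcess("diabetes", date(1995, 6, 1)),
    PatientProcess("pneumonia recovery", date(2004, 1, 10), date(2004, 2, 5), "Lung"),
    PatientProcess("care process", date(2004, 1, 10), date(2004, 2, 5)),
    PatientProcess("drug prescription", date(2004, 1, 11), date(2004, 1, 25)),
    PatientProcess("allergic reaction", date(2004, 1, 12), date(2004, 1, 14), "Skin"),
]

print(co_occurring(record, date(2004, 1, 13)))
```

Note that no causal link is stored between the processes: they are related only by overlapping in time, which matches the description above.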
When an adverse event is reported in the patient story, the important
point is generally not the event itself but the recovery from the newly
created situation. From the point of view of the Patient Record,
a patient with a broken leg is experiencing a process of recovery,
started by an accident and ending when the patient is healthy
again. The same is true when a drug is prescribed: as long as
the medical order is active, the patient is in a process influenced
or guided by this prescription; what matters is not the trigger
event but the follow-up. Even the age of the patient
is considered as an aging process starting at birth. On the basis
of this argument, several aspects of the Patient Record may be
considered as processes.
The top objects of this new ontology are presented and documented.
There are a number of intrinsic difficulties to be solved, and
the presentation will emphasize some possible solutions. In order
to match reality, a set of real patient letters has been
manually analysed to extract actual processes and compare them
with the ontology. The objective is to annotate the available
medical lexicons for all the entries pointing to objects of this
ontology, preparing for a sound representation of the Patient
Record and opening the way for new intelligent applications.
|
|
From Terminologia Anatomica to the Foundational Model of Anatomy |
| |
The Terminologia Anatomica (TA) is the result of a consensus of
anatomists working under the umbrella of the Federative Committee
on Anatomical Terminology. In 1998, they published a reference
terminology on gross anatomy. More recently, another effort has
produced the Foundational Model of Anatomy (FMA), which is
compatible with the TA but has the formal qualities necessary
for handling such a terminology adequately in computer processing.
The SIM has been active since 2003 on Natural Language
Processing (NLP) in relation to the TA and in conjunction with the
FMA. The goal is to develop a database representation of the
TA, especially tailored to the needs of NLP. First, a relational
implementation has been developed in order to accommodate the
structure of the TA, the links with the FMA, and the numerous
synonym terms, past, present and future. Second, since the TA was
originally available in Latin and English, a translation into
French has been completed. Third, a bridge to the relevant MeSH
terms for any TA entry is being prepared.
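
A relational layout of this kind can be sketched as follows, using Python's standard sqlite3 module. This is only an illustration under stated assumptions: the table names, column names and the identifiers "TA-0001" and "FMA-0001" are placeholders invented for the example and do not reproduce the actual SIM database or real TA or FMA codes.

```python
import sqlite3

SCHEMA = """
CREATE TABLE ta_entry (
    ta_code TEXT PRIMARY KEY,   -- placeholder TA identifier
    fma_id  TEXT                -- link to the corresponding FMA concept, if any
);
CREATE TABLE ta_term (
    ta_code  TEXT NOT NULL REFERENCES ta_entry(ta_code),
    language TEXT NOT NULL,     -- 'la', 'en', 'fr', ...
    term     TEXT NOT NULL,
    status   TEXT NOT NULL      -- 'preferred', 'synonym' or 'obsolete'
);
CREATE TABLE ta_mesh (
    ta_code   TEXT NOT NULL REFERENCES ta_entry(ta_code),
    mesh_term TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder entry: one concept with terms in three languages and a synonym.
conn.execute("INSERT INTO ta_entry VALUES (?, ?)", ("TA-0001", "FMA-0001"))
conn.executemany(
    "INSERT INTO ta_term VALUES (?, ?, ?, ?)",
    [
        ("TA-0001", "la", "Femur", "preferred"),
        ("TA-0001", "en", "Femur", "preferred"),
        ("TA-0001", "en", "Thigh bone", "synonym"),
        ("TA-0001", "fr", "Fémur", "preferred"),
    ],
)
conn.execute("INSERT INTO ta_mesh VALUES (?, ?)", ("TA-0001", "Femur"))


def find_entries(conn, term, language):
    """Return the TA codes whose terms match the given string in one language."""
    rows = conn.execute(
        "SELECT DISTINCT ta_code FROM ta_term "
        "WHERE language = ? AND lower(term) = lower(?)",
        (language, term),
    ).fetchall()
    return [row[0] for row in rows]


print(find_entries(conn, "thigh bone", "en"))
```

Keeping terms in a separate table from entries lets any number of synonyms and languages point to the same TA entry, which is the property an NLP lookup needs.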
The TA is a universal consensus, but its success is strongly
dependent on its usage. In order to favour the dissemination of
the TA, a number of accompanying measures are necessary. The main one
is translation into several languages. Such translations should
reach a good level of quality and should be validated by the
relevant committees. Such initiatives are at least underway for
French and Spanish.
Another important accompanying measure is the release of different
services and tools. The most basic ones should be available in
the public domain. A dedicated TA website is certainly
needed.
|
| The Lexical Suite |
| |
The SIM has been involved for two decades now with Natural Language
Processing of medical texts. In the eighties, during a sabbatical
year, Naomi Sager - once named the mother of medical NLP - was
the trigger of new developments. Since then, the SIM has been involved
in numerous research projects such as Helios and Galen.
The cumulative development of several tools has resulted in a package
of NLP utilities, known as the Lexical Suite, tailored for French,
English and German. Data resources have been set up, resulting
in a French lexicon with more than 46'000 entries that covers
the French medical domain. This lexicon is intended
to be made available in the public domain in an effort known as
UMLF (Unified Medical Language for French).
|
| Retrieval and Categorization Tools |
| |
The SIM is also active in information retrieval, and we
participate in the main competitions related to the biomedical domain
(TREC Genomics, BioCreative). Our approaches combine general-purpose
retrieval tools implementing advanced retrieval models, such
as Divergence from Randomness, with knowledge-driven modules
based on the UMLS and tailored to improve navigation in biomedical
text repositories. Because application areas range from literature
articles in medicine and bioinformatics to clinical contents,
our tools are largely language- and genre-independent. Recently,
in cooperation with the Swiss-Prot team of the SIB and the EBI,
we have started to investigate tools that help the
annotation of proteins in Swiss-Prot using automatic categorization
based on the Gene Ontology.
|
| Gene Ontology |
| |
Formally, the Gene Ontology is a controlled vocabulary organized
as a directed acyclic graph (DAG). It merges three structured
vocabularies that describe gene products in terms of their associated
biological process, cellular component (about 1400 terms) and molecular
function in a species-independent manner. The molecular function
terms describe activities at the molecular level. A biological
process is accomplished by one or more ordered assemblies of molecular
functions. A cellular component is a component of the cell
that is part of some larger object, for example an anatomical
structure or a gene product group.
Because the Gene Ontology contains more than 15000 concepts,
annotating proteins with the full ontology is a rather difficult
task for humans, hence the importance of categorization
tools that help maintain the consistency of the curation process.
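
To illustrate what organization as a directed acyclic graph implies, here is a small Python sketch in which a term may have several parents and all ancestors of a term are collected by graph traversal. The handful of terms and links below form a toy example chosen for readability; they are an assumption of this sketch, not an extract of the actual Gene Ontology.

```python
def ancestors(term, parents):
    """Collect every term reachable by following parent links upwards."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:   # a DAG node can be reached by several paths
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy graph: child term -> list of parent terms (illustrative only).
parents = {
    "nuclear membrane": ["nucleus", "membrane"],   # two parents: a DAG, not a tree
    "nucleus": ["organelle"],
    "organelle": ["cellular_component"],
    "membrane": ["cellular_component"],
}

print(sorted(ancestors("nuclear membrane", parents)))
```

Because a term can have more than one parent, an annotation made with a specific term implicitly involves every ancestor along all paths, which is one reason why consistent manual annotation is demanding.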
|
| Publications |
| |
| 
|
by Robert Baud |
| (as first author only, limited
to the period 2000 to 2004)
- Baud RH, Ruch P, Gaudinat A, Fabry P, Lovis C, Geissbuhler A.
Coping with the variability of medical terms, IOS Press, Medinfo
2004;2004:322-6.
- Baud RH. A natural language based search engine for ICD10 diagnosis
encoding. Med Arh. 2004;58(1 Suppl 2):79-80.
- Baud RH, Ruch P, Lovis C, Rassinoux A-M, Geissbuhler A. De la
composition des mots français du domaine médical
par des entités signifiantes. JFIM Journées Francophones
d'Informatique Médicale, September 2003, Tunis.
- Baud RH, Ruch P. The future of Natural Language Processing for
Biomedical Applications. IJMI 67 (2002) p1-5.
- Baud RH, Lovis C, Rassinoux A-M, Ruch P, Geissbuhler A. Controlling
the Vocabulary for Anatomy. Proc AMIA Symp, 2002, p26-30.
- Baud RH, Lovis C, Weber P, Geissbuhler A. Multilingual approach
to ICD 10: On the need for a source reference database. Medical
Informatics Europe, Budapest 2002, MIE'2002, IOS Press, Stud Health
Technol Inform. 2002;90:406-10.
- Baud RH, Lovis C, Ruch P, Rassinoux AM. Conceptual Search in
Electronic Patient Record. Proc MEDINFO 2001.
- Baud RH, Weber P, Lovis C. Coding in context. PCS/E-EFMI-WG1
Special Topic Conference, Bruges, 10-13 October, 2001.
- Baud RH, Lovis C, Ruch P, Rassinoux A-M. A Light Knowledge Model
for Linguistic Applications. J Am Med Inform Assoc 2001; (Symposium
Suppl).
- Baud RH, Ruch P, Lovis C, Rassinoux AM. Recherche conceptuelle
dans les textes médicaux. 8ème Journées Francophones
d'Informatique médicale, Informatique et santé,
Springer-Verlag France, Volume 9.
- Baud RH, Lovis C, Ruch P, Rassinoux AM. A Toolset for Medical
Text Processing. Medical Informatics Europe, Hannover 2000, MIE'2000,
IOS Press.
- (A search on http://www.ncbi.nlm.nih.gov/entrez/query.fcgi with
"Baud R" will display a complete list of 72 publications
by the author).
|
| 
|
by Patrick Ruch |
| (Related to Ontology)
- P Ruch. Query Translation by Text Categorization, COLING 2004,
2004.
- P Ruch, R Baud, and A Geissbühler. Learning-free Text Categorization,
AIME 2003, LNCS/LNAI 2780, Dojat M; Keravnou E; Barahona P (Eds.).
- P Ruch, R Baud, and A Geissbühler. Using Lexical Disambiguation
and Named-Entity Recognition to Improve Spelling Correction in
the Electronic Patient Record Art Intell Med, Volume 29, Issues
1-2, September-October, Pages 169-184, 2003.
- P Ruch, R Baud, A Geissbuhler, and AM Rassinoux. Comparing general
and medical texts for information retrieval based on natural language
processing: an inquiry into lexical disambiguation. Proceedings
of Medinfo'2001, pages 261-5, 1999.
- P Ruch, J Wagner, P Bouillon, and R Baud. Tag-like Semantics
for Medical Document Indexing. J Am Med Inform Assoc (Symposium
Suppl), pages 137-141, 1999.
(more than 20 papers in MEDLINE) |
| Curriculum Vitae |
| |
| |
|