[VoCamp Discuss] Introduction, topics of interest
Matthias Samwald
samwald at gmx.at
Wed Sep 17 07:46:48 PDT 2008
Dear VoCamp participants,
My name is Matthias Samwald. I am currently working for the Semantic Web
Company [1] in Vienna, Austria, and for DERI Galway [2] in Ireland. I have a
background in biomedical research (my master's thesis concerned brain
research and synapse formation) and in the field of applied Semantic Web
technology (my doctoral thesis focused on using RDF/OWL for biomedical
research). I am a participant in the W3C Interest Group for Semantic Web in
Health Care and Life Sciences [3][4][5]. While my background is biomedical
research, I am interested in applying Semantic Web technology in all areas
of technology, business and science.
Here are some topics that I would like to address at VoCamp Oxford.
Some of them are oriented toward use cases from the "KiWi - Knowledge in a
Wiki" project [6], which also includes industrial partners such as Sun
Microsystems. Others are driven more by the needs of biomedical research. My
topics of main interest are: 1) Associative Tags; 2) Agreement,
Disagreement, discourse; 3) Corporate Semantic Web; 4) "Are upper level
ontologies/vocabularies not so bad after all?"; 5) "Cleaner schemas and
ontologies". Details below.
__Associative Tags__
Tagging is one of the key components of the 'Web 2.0', and Semantic Web
technologies will help to make tagging even more powerful. Schemas such as
SCOT or MOAT have already been established, and make it possible to 'tag'
not only with simple strings, but with entities. These entities (such as
concepts described in SKOS) can be associated with clear semantics and can
be further described with RDF statements, to describe hierarchies of
entities, or to link entities to rich data sources such as DBpedia. This
enables sophisticated data integration and queries across data sources that
would not have been possible with simple, string-based tags.
On the other hand, Semantic Web developers can learn from the simplicity
that has made tagging so successful. Creating useful tags is very simple,
and good user interfaces can make it simpler still with features such as
autocompletion and tag recommendation. This simplicity should serve as a
role model for many Semantic Web applications.
Specifically, I am interested in what I call 'associative tags' (aTags):
bundles of tags/entities/concepts that can be used for the simple
representation of facts. The primary intention of creating aTags is not the
categorization of a document, but the representation of the key facts
inside the document.
Key facts in the biomedical domain might be, for example,
"Protein A interacts with protein B" (which can be represented with an aTag
comprising the three entities "Protein A", "Molecular interaction" and
"Protein B") or
"Overexpression of protein A in tissue B is the cause of disease C" (an aTag
comprising the four entities "Overexpression", "Protein A", "Tissue B"
and "Disease C").
Once aTags from different sources are aggregated, it is possible
to pose a query such as "show me molecules that are associated with
molecules that are associated with disease C", yielding "protein B" as an
answer. Hierarchies (in the form of rdfs:subClassOf and skos:narrower) can be
used to expand queries based on background knowledge (e.g., that "disease D"
is a subclass of "disease C").
In many cases (especially with some ontologies in the biomedical domain),
creating such associative tags can be much simpler than the creation of
'real' statements, i.e., relations between individuals and property
restrictions of classes.
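The aggregation and querying of aTags described above could be sketched
roughly as follows. This is only an illustration under stated assumptions:
entities are plain strings instead of URIs, the set of molecules and the
"Disease D" hierarchy entry stand in for background knowledge from an
ontology, and all function names are hypothetical.

```python
# Each aTag is an unordered bundle of entities (plain strings here; a real
# system would use URIs). The two aTags are the examples from the text.
ATAGS = [
    frozenset({"Protein A", "Molecular interaction", "Protein B"}),
    frozenset({"Overexpression", "Protein A", "Tissue B", "Disease C"}),
]

# Assumption: entity types come from some background ontology.
MOLECULES = {"Protein A", "Protein B"}

# Assumption: background hierarchy, narrower term -> broader term
# (the direction of rdfs:subClassOf).
BROADER = {"Disease D": "Disease C"}


def with_narrower(entity, broader=BROADER):
    """Return the entity together with all transitively narrower terms."""
    terms = {entity}
    changed = True
    while changed:
        changed = False
        for child, parent in broader.items():
            if parent in terms and child not in terms:
                terms.add(child)
                changed = True
    return terms


def associated(entity, atags=ATAGS):
    """Entities sharing at least one aTag with the entity or a narrower term."""
    targets = with_narrower(entity)
    found = set()
    for tag in atags:
        if tag & targets:
            found |= tag - targets
    return found


def associated_molecules(entity, atags=ATAGS):
    """Restrict the associated entities to those typed as molecules."""
    return associated(entity, atags) & MOLECULES


def two_hop_molecules(entity, atags=ATAGS):
    """Molecules associated with molecules associated with the entity."""
    first_hop = associated_molecules(entity, atags)
    result = set()
    for molecule in first_hop:
        result |= associated_molecules(molecule, atags)
    return result - first_hop
```

Calling associated_molecules("Disease C") answers the one-step question, and
two_hop_molecules("Disease C") the chained one. Because with_narrower expands
the query term, an aTag that mentions only "Disease D" would also be matched
by a query for "Disease C".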
I would like to present these ideas in more detail, get feedback, and
discuss possible alignment of these ideas with ontologies such as SCOT.
__Agreement, Disagreement, discourse__
Many people in the Semantic Web community are interested in the
representation of argumentation structures on the web. For example: stating
that one snippet of text contains statements that are in disagreement with
another snippet of text, which is in agreement with yet another snippet of
text. This can be of use for many knowledge domains, such as news articles
or biomedical publications. Of special interest in this context are
extensions of established schemas, especially SIOC. There is also another
ontology called SWAN that is specifically tailored to the biomedical domain,
and efforts to align SWAN with SIOC have started recently [7].
I am interested in applying such technologies both for biomedical research
and for corporate Semantic Wikis.
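As a rough illustration of the kind of structure meant here, agreement and
disagreement between snippets could be held as simple typed links. The
relation names and snippet identifiers below are placeholders chosen for
this sketch; they are not actual SIOC or SWAN terms.

```python
from collections import defaultdict

# Assumption: placeholder relation names, not actual SIOC or SWAN terms.
AGREES = "agreesWith"
DISAGREES = "disagreesWith"

# (source snippet, relation, target snippet); snippet IDs are hypothetical.
LINKS = [
    ("snippet-1", DISAGREES, "snippet-2"),
    ("snippet-2", AGREES, "snippet-3"),
]


def build_index(links):
    """Map each snippet to the snippets it agrees and disagrees with."""
    index = defaultdict(lambda: {AGREES: set(), DISAGREES: set()})
    for source, relation, target in links:
        # Assumption: both relations are treated as symmetric.
        index[source][relation].add(target)
        index[target][relation].add(source)
    return index


def related(index, snippet, relation):
    """Snippets directly linked to the given snippet by the given relation."""
    if snippet not in index:
        return set()
    return set(index[snippet][relation])
```

With such an index, related(build_index(LINKS), "snippet-2", DISAGREES)
lists the snippets that directly contradict "snippet-2". Whether agreement
and disagreement should really be symmetric is a modelling decision that a
shared vocabulary would have to settle.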
__Corporate Semantic Web__
As Semantic Web technologies are finally getting mature enough to allow
industrial uptake, it is becoming clear that ontologies for describing
organization structures and business processes are still lacking maturity.
FOAF allows us to represent basic information about persons, organizations
and their relationships, but lacks vocabulary for stating that one person is
the boss of another person, that a project consists of several subtasks, et
cetera. While there are some small projects that try to create such
schemas/ontologies, a solution of widespread acceptance does not seem to be
in sight at the moment.
It would be great if we could collect and review ongoing efforts and try to
identify possible steps towards creating schemas/ontologies that can be used
in corporations.
__Are upper level ontologies/vocabularies not so bad after all?__
FOAF seemingly tried it a long time ago: foaf:Person is a subclass of
"http://xmlns.com/wordnet/1.6/Person", foaf:Document of
"http://xmlns.com/wordnet/1.6/Document", and so on. Linking to external
schemas/ontologies (or making use of their classes and properties directly)
can definitely help in facilitating semantic interoperability. For a long
time, many web developers were very skeptical about such 'top-down'
approaches to data integration, but recently the recognition of the
potential value of such resources seems to be increasing. In parallel, the
last one to two years have brought us some very large upper ontologies that
are available as linked data, such as:
WordNet 2.0, hosted by the W3C
Yago/DBpedia
OpenCyc (now with new URIs)
UMBEL (derived from OpenCyc and others).
I think the practice of re-using and linking to such upper ontologies should
become popular (again). It helps in creating a highly interlinked Semantic
Web, and helps us to avoid re-inventing the wheel for each new
schema/ontology.
We could discuss the pros and cons of current upper ontologies available as
linked data, and also discuss whether they could serve a more important role
for schema/ontology developers in the future.
__Cleaner schemas and ontologies__
Working with established ontologies and schemas in ontology editors can be a
chore. Most have dependencies on other ontologies, but don't use
owl:imports. Most use an awkward mix of OWL statements and RDF(S), resulting
in ontologies that are OWL Full. Many require some OWL reasoning to make use
of sameAs statements and inverse properties, but at the same time reasoning
is complicated because the ontologies are OWL Full or even contain logical
inconsistencies. Often enough, there seems to be no practical reason for the
design choices that caused the trouble: some minor changes can turn a messy
OWL Full ontology into an OWL Lite or OWL DL ontology. To work around the
problem, many different working groups have created their own local versions
of schemas such as FOAF or Dublin Core that are valid OWL DL.
It doesn't have to be this way.
Trying to adhere to OWL Lite/DL and adding owl:imports statements can help
build cleaner, more modular and more sustainable ontologies, and does not
require significant additional effort during the creation of ontologies.
Maybe we can find a consensus that this would be a worthwhile goal, and
develop plans towards reaching that goal.
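To make the owl:imports point concrete, here is a minimal sketch of a check
that flags external namespaces an ontology uses without importing them. It
assumes the ontology is already available as plain (subject, predicate,
object) string triples, treats anything not starting with "http" as a
literal, and uses a hypothetical ontology URI in the usage note. It is a
heuristic for illustration, not a replacement for a proper OWL species
validator.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"

# Namespaces of the core languages themselves never need importing.
BUILTIN = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns",
    "http://www.w3.org/2000/01/rdf-schema",
    "http://www.w3.org/2002/07/owl",
    "http://www.w3.org/2001/XMLSchema",
}


def normalize(uri):
    """Drop a trailing '#' or '/' so namespace and ontology URIs compare equal."""
    return uri.rstrip("#/")


def namespace_of(uri):
    """Cut off the local name after the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return normalize(uri[:cut])


def missing_imports(ontology_uri, triples):
    """Return external namespaces that are used but not declared via owl:imports."""
    own = normalize(ontology_uri)
    declared = {
        normalize(o)
        for s, p, o in triples
        if normalize(s) == own and p == OWL_IMPORTS
    }
    known = BUILTIN | declared | {own}
    missing = set()
    for triple in triples:
        for term in triple:
            # Assumption: anything not starting with "http" is a literal.
            if not term.startswith("http"):
                continue
            if normalize(term) in known:
                continue  # the term is itself an ontology or namespace URI
            namespace = namespace_of(term)
            if namespace not in known:
                missing.add(namespace)
    return missing
```

For a hypothetical ontology "http://example.org/org" whose only external
reference is a class declared as a subclass of foaf:Person,
missing_imports reports the FOAF namespace until a matching owl:imports
triple is added.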
[1] http://www.semantic-web.at/
[2] http://deri.ie
[3] http://www.w3.org/2001/sw/hcls/
[4] http://www.w3.org/TR/hcls-kb/
[5] http://www.w3.org/TR/hcls-senselab/
[6] http://wiki.kiwi-project.eu/
Cheers,
Matthias Samwald
Semantic Web Company, Austria // DERI Galway, Ireland
P.S.: This mailing list has been quite silent so far. Are all participants
of VoCamp subscribed by now? I think that exchanging some ideas before the
actual meeting starts is very important! Please introduce yourself and your
plans!