[VoCamp Discuss] Introduction, topics of interest

Wed Sep 17 07:46:48 PDT 2008

Dear VoCamp participants,

My name is Matthias Samwald. I am currently working for the Semantic Web 
Company [1] in Vienna, Austria, and for DERI Galway [2] in Ireland. I have a 
background in biomedical research (my master's thesis concerned brain 
research and synapse formation) and in the field of applied Semantic Web 
technology (my doctoral thesis was focused on using RDF/OWL for biomedical 
research). I am a participant in the W3C Interest Group for Semantic Web in 
Health Care and Life Science [3][4][5]. While my background is biomedical 
research, I am interested in applying Semantic Web technology in all areas 
of technology, business and science.

Here are some topics that I would like to address at VoCamp Oxford. 
Some of them are based on use cases from the "KiWi - Knowledge in a Wiki" 
project [6], which also includes industrial partners such as Sun 
Microsystems. Others are more driven by needs of biomedical research. My 
topics of main interest are: 1) Associative Tags; 2) Agreement, 
Disagreement, discourse; 3) Corporate Semantic Web; 4) "Are upper level 
ontologies/vocabularies not so bad after all?"; 5) "Cleaner schemas and 
ontologies". Details below.

__Associative Tags__
Tagging is one of the key components of the 'Web 2.0', and Semantic Web 
technologies will help to make tagging even more powerful. Schemas such as 
SCOT or MOAT have already been established, and make it possible to 'tag' 
not only with simple strings, but with entities. These entities (such as 
concepts described in SKOS) can be associated with clear semantics and can 
be further described with RDF statements, to describe hierarchies of 
entities, or to link entities to rich data sources such as DBpedia. This 
enables sophisticated data-integration and cross-data source queries that 
would not have been possible with simple, string-based tags.
On the other hand, Semantic Web developers can learn from the simplicity 
that has made tagging so successful. Creating useful tags is very simple, 
and good user interfaces can further improve the simplicity of creating 
useful tags with features such as autocompletion and tag recommendation. This 
simplicity should serve as a role model for many Semantic Web applications.
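
To illustrate the difference between string-based and entity-based tags, here is a minimal sketch in plain Python. It does not use SCOT or MOAT terms; the document names, the label table and the helper functions are assumptions made up for this example, and only the DBpedia URI is a real identifier.

```python
# String tags: two spellings of the same place never match each other.
string_tags = {
    "doc1": {"NYC"},
    "doc2": {"New York"},
}

# Entity tags: both documents point at the same identifier.
NYC = "http://dbpedia.org/resource/New_York_City"
entity_tags = {
    "doc1": {NYC},
    "doc2": {NYC},
}

# Hypothetical label table, standing in for labels attached to the entity.
LABELS = {NYC: {"NYC", "New York", "New York City"}}


def docs_tagged(tags, value):
    """Documents whose tag set contains exactly this value."""
    return sorted(doc for doc, values in tags.items() if value in values)


def docs_for_label(label):
    """Resolve a label to entities first, then look up tagged documents."""
    entities = {e for e, labels in LABELS.items() if label in labels}
    return sorted(doc for doc, values in entity_tags.items() if values & entities)
```

With string tags, `docs_tagged(string_tags, "NYC")` only finds `doc1`, while `docs_for_label("NYC")` finds both documents because the lookup goes through the shared entity.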

Specifically, I am interested in what I call 'associative tags', bundles of 
tags/entities/concepts that can be used for the simple representation of 
facts. The primary intention of creating aTags is not the categorization of 
the document, but the representation of the key facts inside the document. 
Key facts in the biomedical domain might be, for example,
"Protein A interacts with protein B" (which can be represented with an aTag 
comprising the three entities "Protein A", "Molecular interaction" and 
"Protein B") or
"Overexpression of protein A in tissue B is the cause of disease C" (an aTag 
comprising the four entities "Overexpression", "Protein A", "Tissue B" 
and "Disease C").
Once the aTags from these different sources are aggregated, it is possible 
to pose a query such as "show me molecules that are associated with 
molecules that are associated with disease C", yielding "protein B" as an 
answer. Hierarchies (in the form of rdfs:subClassOf and skos:narrower) can be 
used to expand queries based on background knowledge (e.g., that "disease D" 
is a subclass of "disease C").
In many cases (especially with some ontologies in the biomedical domain), 
creating such associative tags can be much simpler than the creation of 
'real' statements, i.e., relations between individuals and property 
restrictions of classes.
I would like to present these ideas in more detail, get feedback, and 
discuss possible alignment of these ideas with ontologies such as SCOT.
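
The aTag idea can be sketched as sets of entities plus a small subclass table. This is only an illustration under stated assumptions: the third aTag ("Protein E", "Disease D"), the `MOLECULES` typing set and all function names are invented for the example, and "associated" is taken to mean "appears in the same aTag".

```python
# Each aTag is a bundle (set) of entities. The first two come from the
# examples above; the third is made up to show hierarchy expansion.
ATAGS = [
    frozenset({"Protein A", "Molecular interaction", "Protein B"}),
    frozenset({"Overexpression", "Protein A", "Tissue B", "Disease C"}),
    frozenset({"Protein E", "Disease D"}),
]

# Background knowledge: child -> parent ("Disease D" is a kind of "Disease C").
SUBCLASS_OF = {"Disease D": "Disease C"}

# Assumed typing information: which entities count as molecules.
MOLECULES = {"Protein A", "Protein B", "Protein E"}


def with_subclasses(entity):
    """Return the entity plus everything that is transitively a subclass of it."""
    result = {entity}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def associated(entity, expand=False):
    """Entities sharing at least one aTag with `entity` (or with its subclasses)."""
    targets = with_subclasses(entity) if expand else {entity}
    found = set()
    for tag in ATAGS:
        if tag & targets:
            found |= tag - targets
    return found


def molecules_associated_with(entity, expand=False):
    """Restrict the associated entities to those typed as molecules."""
    return associated(entity, expand) & MOLECULES
```

Without expansion, `molecules_associated_with("Disease C")` only sees the aTag that mentions "Disease C" directly. With `expand=True`, the aTag about "Disease D" is also matched because of the subclass entry.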

__Agreement, Disagreement, discourse__
Many people in the Semantic Web community are interested in the 
representation of argumentation structures on the web. For example: stating 
that one snippet of text contains statements that are in disagreement with 
another snippet of text, which is in agreement with yet another snippet of 
text. This can be of use for many knowledge domains, such as news articles 
or biomedical publications. Of special interest in this context are 
extensions of established schemas, especially SIOC. There is also another 
ontology called SWAN that is specifically tailored to the biomedical domain, 
and efforts to align SWAN with SIOC have started recently [7].
I am interested in applying such technologies both for biomedical research 
and for corporate Semantic Wikis.
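
A minimal sketch of such an argumentation structure, using plain Python tuples. The relation names `agreesWith` and `disagreesWith` and the snippet identifiers are placeholders chosen for this example; they are not taken from SIOC or SWAN.

```python
# Each entry: (source snippet, relation, target snippet).
RELATIONS = [
    ("snippet1", "disagreesWith", "snippet2"),
    ("snippet2", "agreesWith", "snippet3"),
]


def sources(relation, target):
    """Snippets that stand in `relation` to `target`."""
    return sorted(s for s, r, t in RELATIONS if r == relation and t == target)


def targets(relation, source):
    """Snippets that `source` stands in `relation` to."""
    return sorted(t for s, r, t in RELATIONS if r == relation and s == source)
```

The sketch only stores and retrieves direct statements. It deliberately does not infer that `snippet1` disagrees with `snippet3`, since such a conclusion would need an explicit rule that the community would first have to agree on.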

__Corporate Semantic Web__
As Semantic Web technologies are finally getting mature enough to allow 
industrial uptake, it is becoming clear that ontologies for describing 
organization structures and business processes are still lacking maturity. 
FOAF allows us to represent basic information about persons, organizations 
and their relationships, but lacks vocabulary for stating that one person is 
the boss of another person, that a project consists of several subtasks, et 
cetera. While there are some small projects that try to create such 
schemas/ontologies, a solution of widespread acceptance does not seem to be 
in sight at the moment.
It would be great if we could collect and review ongoing efforts and try to 
identify possible steps towards creating schemas/ontologies that can be used 
in corporations.
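
To make the gap concrete, here is a small sketch of the kind of statements such a vocabulary would need to support. `foaf:Person` is a real FOAF class; `ex:reportsTo`, `ex:hasSubtask` and all `ex:` resources are hypothetical terms invented for this example, not part of any existing schema.

```python
# Triples as (subject, predicate, object) tuples with prefixed names.
TRIPLES = {
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "ex:reportsTo", "ex:alice"),            # hypothetical property
    ("ex:wikiProject", "ex:hasSubtask", "ex:designSchema"),
    ("ex:designSchema", "ex:hasSubtask", "ex:reviewFoaf"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def all_subtasks(task):
    """Direct and indirect subtasks, following ex:hasSubtask transitively."""
    found = set()
    stack = [task]
    while stack:
        current = stack.pop()
        for sub in objects(current, "ex:hasSubtask"):
            if sub not in found:
                found.add(sub)
                stack.append(sub)
    return found
```

For example, `objects("ex:bob", "ex:reportsTo")` returns the manager, and `all_subtasks("ex:wikiProject")` returns both the direct subtask and the nested one.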

__Are upper level ontologies/vocabularies not so bad after all?__
FOAF seemingly tried it a long time ago: foaf:Person is a subclass of 
"http://xmlns.com/wordnet/1.6/Person", foaf:Document is a subclass of 
"http://xmlns.com/wordnet/1.6/Document", and so on. Linking to external 
schemas/ontologies (or making use of their classes and properties directly) 
can definitely help in facilitating semantic interoperability. For a long 
time, many web developers were very skeptical about such 'top-down' 
approaches of data integration, but recently the recognition of the 
potential value of such resources seems to be increasing. In parallel, the 
past one to two years have brought us some very large upper ontologies that are 
available as linked data, such as:

Wordnet 2.0, hosted by the W3C
Yago/DBpedia
OpenCyc (now with new URIs)
UMBEL (derived from OpenCyc and others).

I think the practice of re-using and linking to such upper ontologies should 
become popular (again). It helps in creating a highly interlinked Semantic 
Web, and helps us to avoid re-inventing the wheel for each new 
schema/ontology.
We could discuss the pros and cons of current upper ontologies available as 
linked data, and also discuss if they could serve a more important role for 
schema/ontology developers in the future.
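
The interoperability benefit of linking to a shared upper class can be sketched as follows. The foaf:Person link to the WordNet class is the one mentioned above; `ex:Researcher` and all instance names are hypothetical and only serve to show a second schema linking to the same upper class.

```python
WN_PERSON = "http://xmlns.com/wordnet/1.6/Person"

# class -> direct superclasses
SUBCLASS_OF = {
    "foaf:Person": {WN_PERSON},
    "ex:Researcher": {WN_PERSON},  # hypothetical class from another schema
}

# instance -> asserted class (made-up data)
TYPES = {
    "ex:alice": "foaf:Person",
    "ex:bob": "ex:Researcher",
    "ex:report": "foaf:Document",
}


def superclasses(cls):
    """All direct and indirect superclasses of `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(cls):
    """Instances typed with `cls` directly or with one of its subclasses."""
    return sorted(
        inst for inst, typ in TYPES.items()
        if typ == cls or cls in superclasses(typ)
    )
```

A query for `instances_of(WN_PERSON)` returns instances from both schemas, even though neither schema refers to the other.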

__Cleaner schemas and ontologies__
Working with established ontologies and schemas in ontology editors can be a 
chore. Most have dependencies on other ontologies, but don't use 
owl:imports. Most use an awkward mix of OWL statements and RDF(S), resulting 
in ontologies that are OWL Full. Many require some OWL reasoning to make use 
of sameAs statements and inverse properties, but at the same time reasoning 
is complicated because the ontologies are OWL Full or even contain logical 
inconsistencies. Often enough, there seems to be no practical reason for the 
design choices that caused the trouble: some minor changes can turn a messy 
OWL Full ontology into an OWL Lite or OWL DL ontology. At the moment, many 
different working groups have created local versions of schemas such as FOAF 
or Dublin Core that are valid OWL DL to fix that problem.
It doesn't have to be this way.
Trying to adhere to OWL Lite/DL and adding owl:imports statements can help 
build cleaner, more modular and more sustainable ontologies, and does not 
require significant additional effort during the creation of ontologies. 
Maybe we can find a consensus that this would be a worthwhile goal, and 
develop plans towards reaching that goal.
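
As a rough illustration of the owl:imports problem, the following sketch lists namespaces whose terms an ontology uses without importing them. It assumes, for simplicity, that an import is identified by the namespace URI of the imported vocabulary, which is not always the case in practice; the function names and the `http://example.org/onto#` ontology are invented for the example.

```python
# Namespaces of the RDF, RDFS and OWL vocabularies themselves need no import.
BUILTIN = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://www.w3.org/2000/01/rdf-schema#",
    "http://www.w3.org/2002/07/owl#",
}


def namespace_of(uri):
    """Cut a term URI after its last '#' or '/' to get the namespace."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def missing_imports(own_namespace, imports, used_terms):
    """Namespaces that are used but neither built in, local, nor imported."""
    needed = {namespace_of(term) for term in used_terms}
    return sorted(needed - BUILTIN - {own_namespace} - set(imports))


# Hypothetical ontology that imports FOAF but also uses a Dublin Core term.
USED_TERMS = [
    "http://example.org/onto#Employee",
    "http://xmlns.com/foaf/0.1/Person",
    "http://purl.org/dc/elements/1.1/title",
    "http://www.w3.org/2002/07/owl#Class",
]
MISSING = missing_imports(
    "http://example.org/onto#",
    ["http://xmlns.com/foaf/0.1/"],
    USED_TERMS,
)
```

Here `MISSING` contains only the Dublin Core namespace, which is the dependency that would need an owl:imports statement (or a local OWL DL version) in this simplified model.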

[1] http://www.semantic-web.at/
[2] http://deri.ie
[3] http://www.w3.org/2001/sw/hcls/
[4] http://www.w3.org/TR/hcls-kb/
[5] http://www.w3.org/TR/hcls-senselab/
[6] http://wiki.kiwi-project.eu/

Cheers,
Matthias Samwald
Semantic Web Company, Austria // DERI Galway, Ireland

P.S.: This mailing list has been quite silent so far. Are all participants 
of VoCamp subscribed by now? I think that exchanging some ideas before the 
actual meeting starts is very important! Please introduce yourself and your 
plans!