Semantic Web Technologies: A Tutorial


Li Ding, University of Maryland Baltimore County
Joint work with Deborah McGuinness, Tim Finin and Anupam Joshi
Presented at Kodak Research Laboratories, Rochester, New York, 18 July 2006

The Web has made people smarter
[Image: example websites, including craigslist]

But what about machines?
  • Machines still have a very minimal understanding of text and images.

Motivation: machine-friendly data
  • Natural language: "Li Ding is a person" (shown as seen by a person and as seen by a machine).
  • XML: represents structures (the same statement shown as seen by a person and as seen by a machine).
  • Semantic Web: represents more semantics. It represents structures, enables a common vocabulary, and associates symbols with a logical interpretation for inference.

Semantic Web Technologies

Semantic Web Layers
[Figure: layer diagram with a semantic aspect and a web aspect (e.g., HTTP)]
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." (Berners-Lee, Hendler & Lassila, Scientific American, 2001)

The Semantic Web is simple
  • Each URI denotes a concept: don't say "colour", say its URI.
  • URIs are connected by triples. [Figure labels: relational database; RDF (Resource Description Framework)]
  • Machines read data as a directed RDF graph.
Source: Tim Berners-Lee, "Putting the Web back into Semantic Web", ISWC 2005 keynote

Example: RDF graph and syntax
[Figure: an RDF graph with two triples, t1 and t2; node kinds are URI, literal, and BNode; the literal is "Li Ding"]
The entire graph means: there exists a person whose name is Li Ding.

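The two-triple example graph ("there exists a person whose name is Li Ding") can be written down directly as data. The sketch below is only an illustration: the FOAF and RDF URIs are assumed (the slide's own URIs did not survive extraction), the blank node label `_:b1` is arbitrary, and the helper names are hypothetical.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement; each term is stored in N-Triples notation."""
    subject: str
    predicate: str
    obj: str


# Vocabulary terms (assumed: FOAF and RDF URIs, not taken from the slide).
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
FOAF_PERSON = "<http://xmlns.com/foaf/0.1/Person>"
FOAF_NAME = "<http://xmlns.com/foaf/0.1/name>"

# "_:b1" is a blank node (BNode): it asserts that something exists
# without giving it a URI. '"Li Ding"' is a literal.
GRAPH: List[Triple] = [
    Triple("_:b1", RDF_TYPE, FOAF_PERSON),
    Triple("_:b1", FOAF_NAME, '"Li Ding"'),
]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize the graph, one 'subject predicate object .' line per triple."""
    return "\n".join(f"{t.subject} {t.predicate} {t.obj} ." for t in triples)


def person_names(triples: List[Triple]) -> List[str]:
    """Return the names of all nodes typed as foaf:Person."""
    persons = {t.subject for t in triples
               if t.predicate == RDF_TYPE and t.obj == FOAF_PERSON}
    return [t.obj.strip('"') for t in triples
            if t.predicate == FOAF_NAME and t.subject in persons]
```

Calling `to_ntriples(GRAPH)` yields one line per triple, and `person_names(GRAPH)` walks the graph the way a machine would: first find the nodes typed as a person, then read their names.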
Notes
  • SWDs: Semantic Web documents; SWOs: Semantic Web ontologies; pure SWD: an SWD that is not embedded.
  • Statistics of top-level domains are also used in characterizing the Web (Henzinger and Lawrence 2004).

Source websites of SWD
[Figure: two log-log plots of y (# of websites hosting >= m SWDs) against m (# of SWDs), for the periods Jan 2005 - Aug 2005 and Jan 2005 - Mar 2006. Fitted trends, in order of appearance: y = 6236.7x^-0.6629 (R^2 = 0.9622) and y = 6598.8x^-0.7305 (R^2 = 0.9649). Labeled points: (1, 125911), (2, 17474), (3, 5200), (80401, 2), (100517, 1).]
  • Invariant found: the number of websites hosting more than m SWDs follows a power-law distribution, similar to the Web.
  • Head: virtual hosting. Tail: crawling strategy.

Size of SWD
[Figure: number of SWDs and number of SWOs against # of triples]
  • Embedded SWDs are small: 69% have 3 triples; 96% have <10 triples.
  • Pure SWDs: 60% have 5 to 1000 triples.
  • Special size of RSS: 130; 17 triples for the channel and 7 triples for each of the 15 items.
  • SWOs: biased by PML; the small ones come from RDF tests; the largest is 1M.

Age of SWD
  • Measured by the last-modified time of the SWD.
  • PSWD: exponential distribution.
  • SWO: flat tail. Is interest in ontology development decreasing?
[Figure: counts of pswd and swo (PML filtered) by last-modified date on a log scale, date axis from 7/20/1995 to 7/2/2006; exponential trendline for pswd: y = 2E-48 e^(0.0032x)]
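The power-law trend reported for the source websites of SWDs (y = 6236.7x^-0.6629) is the kind of line a least-squares fit in log-log space produces. The sketch below shows that fitting step only; the points are synthetic, generated from the reported trend rather than taken from the crawl data, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def fit_power_law(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit y = a * x**b by linear least squares on (log x, log y); return (a, b)."""
    logs = [(math.log(x), math.log(y)) for x, y in points]
    n = len(logs)
    mean_x = sum(lx for lx, _ in logs) / n
    mean_y = sum(ly for _, ly in logs) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = log(a)
    return a, b


def reported_trend(m: float, a: float = 6236.7, b: float = -0.6629) -> float:
    """Trend value for 'websites hosting >= m SWDs', using the reported fit."""
    return a * m ** b


# Synthetic points on the reported trend line (not the original measurements).
SYNTHETIC = [(m, reported_trend(m)) for m in (1, 10, 100, 1000, 10000, 100000)]
A_FIT, B_FIT = fit_power_law(SYNTHETIC)
```

Because the synthetic points lie exactly on the trend, the fit recovers the reported coefficient and exponent; with real crawl counts the fit would be approximate, as the reported R^2 values indicate.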

How are Semantic Web Terms used?
  • All usage distributions follow a power-law distribution.
  • Few SWTs are well populated: 371 have >100 class instances; 1208 have >100 property instances.

Swoogle Rank (citation based)
[Figure: a graph of ten numbered nodes, each annotated with its indegree and mean(inflow), connected by weighted edges]
Computed using Swoogle metadata by May 2006.
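The slide does not say how the citation-based rank is computed. As a stand-in, the sketch below implements plain PageRank, a standard citation-based ranking, on a tiny invented link graph; it is not Swoogle's actual ranking model, and the document names are hypothetical.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Plain PageRank by power iteration; links maps each node to the nodes it cites."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same "random jump" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for source in nodes:
            targets = links.get(source, [])
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that cites nothing spreads its rank evenly over all nodes.
                share = damping * rank[source] / n
                for node in nodes:
                    new_rank[node] += share
        rank = new_rank
    return rank


# Invented example: two documents cite one ontology.
EXAMPLE_LINKS = {"doc_a": ["ontology"], "doc_b": ["ontology"], "ontology": []}
EXAMPLE_RANK = pagerank(EXAMPLE_LINKS)
```

In this toy graph the cited ontology ends up with a higher rank than either citing document, which is the basic behaviour any citation-based rank is expected to show.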

Semantic Web Applications

TAGA: Travel Agent Game in Agentcities
Motivation
  • Market dynamics
  • Auction theory (TAC)
  • Semantic Web
  • Agent collaboration (FIPA & Agentcities)
Features
  • OWL as a content language
  • OWL for protocol description
  • OWL for representation and reasoning
  • OWL for service description
Technologies
  • FIPA (JADE, April Agent Platform)
  • Semantic Web (RDF, OWL)
  • Web (SOAP, WSDL, DAML-S)
  • Internet (Java Web Start)
Ontologies
  • travel.owl: travel concepts
  • fipaowl.owl: FIPA content language
  • auction.owl: auction services
  • tagaql.owl: query language
[Figure: open market framework. Agents: Customer Agent, Travel Agents, Bulletin Board Agent, Auction Service Agent, Market Oversight Agent, Web Service Agents. Interaction labels: Request, CFP, Bid, Proposal, Direct Buy, Report Contract, Report Direct Buy Transactions, Report Auction Transactions, Report Travel Package. Other labels: Auction Services, OWL message content, OWL Ontologies, Global Agent Community.]
  • FIPA platform infrastructure services, including directory facilitators enhanced to use OWL-S for service discovery (offline now)

Semantic Content Publishing
  • Data is stored in a database.
  • PHP generates both HTML and OWL.
  • HTML pages link to the corresponding OWL.
  • No more web scraping.
[Figure: MySQL database feeding PHP, which produces HTML and OWL/FOAF; example: the ebiquity group website]

Rei Policy Language
  • Rei is a declarative policy language for describing policies over actions.
  • Reasons over domain-dependent information.
  • Currently represented in OWL + logical variables.
  • Based on deontic concepts: Permission, Prohibition, Obligation, Dispensation.
  • Models speech acts: Delegation, Revocation, Request, Cancel.
  • Meta policies: priority, modality preference.
  • Policy engineering tools: reasoner, IDE for Rei policies in Eclipse.
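To make the role of the meta policies concrete, the sketch below resolves a conflict between a permission and a prohibition, first by priority and then by modality preference. It is a toy model under stated assumptions: the `Rule` class, the `decide` function, and the example actors are hypothetical and do not reproduce Rei's OWL syntax or its reasoner.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """A deontic rule: a modality applied to an actor/action pair (hypothetical model)."""
    modality: str      # "permission" or "prohibition"
    actor: str
    action: str
    priority: int = 0  # higher number wins


def decide(rules: List[Rule], actor: str, action: str,
           prefer: str = "prohibition") -> str:
    """Return the modality that applies, or "undecided" if no rule matches.

    Conflicts are resolved by priority first; if the top-priority rules still
    disagree, the preferred modality (a meta policy) breaks the tie.
    """
    matching = [r for r in rules if r.actor == actor and r.action == action]
    if not matching:
        return "undecided"
    top = max(r.priority for r in matching)
    winners = {r.modality for r in matching if r.priority == top}
    if len(winners) == 1:
        return winners.pop()
    return prefer


# Invented example: two conflicting rules with equal priority.
CONFLICT = [
    Rule("permission", "student", "print"),
    Rule("prohibition", "student", "print"),
]
```

With equal priorities, `decide(CONFLICT, "student", "print")` falls back to the modality preference; raising the priority of one rule settles the conflict before the preference is consulted.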

Example: enforcing privacy
  • The speaker doesn't want others to know the specific room that he's in, but is willing for others to know he's on campus.
  • He defines the following privacy policy: share my location with a granularity >= State.
  • The broker answers:
      isLocated(US) => Yes!
      isLocated(Maryland) => Yes!
      isLocated(UMBC) => Uncertain..
      isLocated(ITE-RM210) => Uncertain..

Cobra: Context Broker Architecture
[Figure: architecture diagram; component labels include Ontology, Agents, Service, Inference, Policy]
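The broker's answers in the privacy example follow from comparing the granularity of the queried place with the threshold in the policy. A minimal sketch, assuming a four-level hierarchy (only "State" appears on the slide; "Country", "Campus", and "Room" are assumed names) and assuming every queried place really does contain the speaker:

```python
# Granularity levels, coarse to fine. Only "State" is named on the slide;
# the other level names are assumptions made for this sketch.
GRANULARITY = {"Country": 3, "State": 2, "Campus": 1, "Room": 0}

# Granularity of each place used in the example (assumed classification).
PLACE_LEVEL = {
    "US": "Country",
    "Maryland": "State",
    "UMBC": "Campus",
    "ITE-RM210": "Room",
}


def is_located(place: str, min_granularity: str = "State") -> str:
    """Answer a location query under the policy 'share with granularity >= State'.

    Assumes the speaker is actually inside `place`; the broker only decides
    how much it is allowed to reveal.
    """
    level = GRANULARITY[PLACE_LEVEL[place]]
    if level >= GRANULARITY[min_granularity]:
        return "Yes"
    return "Uncertain"
```

Places at or above the State level are confirmed, while anything finer is answered with "Uncertain", so the broker neither reveals nor denies the room.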

Web-scale semantic web data access
[Figure: interaction among an agent, a data access service, and the Web; the service indexes RDF data from the Web]
  1. Search vocabulary: the agent sends ask (person); the service searches URIrefs in the SW vocabulary and replies inform (foaf:Person).
  2. Compose query: the agent sends ask (?x rdf:type foaf:Person); the service searches URLs in the SWD index and replies inform (doc URLs).
  3. Populate RDF database: the agent fetches the docs.
  4. Query the local RDF database.

Swoogle Semantic Web Search Engine
  • Harvests Semantic Web data from the Web.
  • Provides search/navigation services for machines (via REST + RDF/XML): digest (doc, term, namespace) and links.
  • Also serves human users.
Status
  • Running since summer 2004.
  • 1.6M RDF documents, 300M RDF triples, 10K ontologies.
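The data access workflow (search vocabulary, compose query, fetch docs, populate and query a local RDF database) can be traced end to end with in-memory stand-ins. Everything below is invented for illustration: the dictionaries play the role of the service's vocabulary index, its SWD index, and the Web, and none of the function names correspond to a real Swoogle API.

```python
from typing import Dict, List, Tuple

TripleT = Tuple[str, str, str]

# Toy stand-ins for the data access service and the Web (all data invented).
VOCABULARY: Dict[str, List[str]] = {"person": ["foaf:Person"]}
SWD_INDEX: Dict[str, List[str]] = {
    "foaf:Person": ["http://example.org/a.rdf", "http://example.org/b.rdf"],
}
DOCUMENTS: Dict[str, List[TripleT]] = {
    "http://example.org/a.rdf": [("_:a1", "rdf:type", "foaf:Person"),
                                 ("_:a1", "foaf:name", "Alice")],
    "http://example.org/b.rdf": [("_:b1", "rdf:type", "foaf:Person"),
                                 ("_:b1", "foaf:name", "Bob")],
}


def search_vocabulary(keyword: str) -> List[str]:
    """Step 1: ask (person) -> inform (foaf:Person)."""
    return VOCABULARY.get(keyword, [])


def search_documents(term: str) -> List[str]:
    """Step 2: ask (?x rdf:type term) -> inform (doc URLs)."""
    return SWD_INDEX.get(term, [])


def fetch(url: str) -> List[TripleT]:
    """Step 3: fetch one document (here: a dictionary lookup, not an HTTP request)."""
    return DOCUMENTS[url]


def find_instances(keyword: str) -> List[str]:
    """Run the whole workflow and query the local database for matching instances."""
    terms = search_vocabulary(keyword)
    local_db: List[TripleT] = []
    for term in terms:
        for url in search_documents(term):
            local_db.extend(fetch(url))          # populate the local RDF database
    return sorted(s for s, p, o in local_db      # step 4: query it locally
                  if p == "rdf:type" and o in terms)
```

The point of the split is that the service only has to answer two kinds of lookups (terms and document URLs); the actual query runs against the agent's own local copy of the data.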

Ontology Dictionary
  • From web of documents to web of data
  • Aggregate from multiple sources
  • Inductively learned definitions
[Figure: definitions of foaf:Person and related terms drawn from Onto 1, Onto 2, and SWD3, using rdfs:domain, rdfs:subClassOf, rdf:type, owl:Class, foaf:Agent, and the inductively learned wob:hasInstanceDomain; the sample instance has foaf:name "Tim Finin" and dc:title "Dr."]

Semantic Web Challenge Winners
  • 2003: CS AKTive Space (CAS) is an integrated Semantic Web application which provides a way to explore the UK Computer Science Research domain across multiple dimensions for multiple stakeholders, from funding agencies to individual researchers.
  • 2004: Flink itself is also likely to be unique as a crossover between a social experiment and a semantic application.
  • 2005: CONFOTO is a browsing and annotation service for conference photos.

Triple Shop: SPARQL dataset finder
Example question: Who knows Anupam Joshi? Show me their names and email addresses.
  1. Compose a SPARQL query without a FROM clause.
  2. Parse the SPARQL query, search Swoogle for related URLs, and compose a dataset.
  3. Run the SPARQL query on the dataset.

Integrating Social Networks data
[Figure: social network data sources and their integration. Labels include: FOAF network (RDF, RDF/XML); DBLP coauthor database (HTML); reputation systems (Citeseer Rank, Google PageRank); Golbeck's trust network; entity mapping (sameName), tie strength, trust aggregation, and reputation and trust network computation; node roles source, sink, hub, and island; "knows" and "co-author" edges among people including J. Golbeck, H. Chen, T. Finin, A. Joshi, J. Hendler, F. Perich, L. Kagal, L. Ding, P. Kolari, Y. Peng, A. Sheth, and M. P. Singh; a second panel titled DBLP Coauthor Network.]
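The three Triple Shop steps can be sketched for the "Who knows Anupam Joshi?" question. The query text, the FOAF properties it uses, and the dataset URLs are assumptions made for illustration, and `find_related_urls` is a hypothetical stand-in for the Swoogle search, not a real API call.

```python
import re
from typing import List

# Step 1: a SPARQL query composed without a FROM clause (assumed wording).
QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE {
  ?person foaf:knows ?friend .
  ?friend foaf:name "Anupam Joshi" .
  ?person foaf:name ?name .
  ?person foaf:mbox ?mbox .
}"""


def terms_in_query(query: str) -> List[str]:
    """Step 2a: collect the prefixed terms (e.g. foaf:knows) used in the query."""
    return sorted(set(re.findall(r"\b[A-Za-z]\w*:[A-Za-z]\w*", query)))


def find_related_urls(terms: List[str]) -> List[str]:
    """Step 2b: hypothetical stand-in for the Swoogle search; returns made-up URLs."""
    if not terms:
        return []
    return ["http://example.org/foaf1.rdf", "http://example.org/foaf2.rdf"]


def add_dataset(query: str, urls: List[str]) -> str:
    """Step 2c: insert one FROM clause per URL before the first WHERE keyword."""
    from_clauses = "".join(f"FROM <{url}>\n" for url in urls)
    return re.sub(r"\bWHERE\b", lambda m: from_clauses + m.group(0), query,
                  count=1, flags=re.IGNORECASE)


# Step 3 would hand this string to a SPARQL engine; that part is omitted here.
QUERY_WITH_DATASET = add_dataset(QUERY, find_related_urls(terms_in_query(QUERY)))
```

Keeping the FROM clauses out of the user's query is what lets the dataset be chosen by search: the same question can be rerun later against whatever documents the index knows about at that time.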

Inference Web Infrastructure
[Figure: Inference Web architecture. Labels include: WWW; SDS; OWL-S/BPEL (DAML/SNRC); CWM (TAMI); JTP (DAML/NIMD); SPARK (CALO); UIMA Text Analytics (NIMD/Exp Agg); N3, KIF, and SPARK-L; Proof Markup Language (PML) covering justification, provenance, and trust.]
Toolkit
  • IWTrust: trust computation
  • IW Explainer/Abstractor: end-user friendly visualization
  • IWBrowser: expert friendly visualization
  • IWSearch: search engine based publishing
  • IWBase: provenance registration
Inference Web: a framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers.

PML: Proof Markup Language
[Figure: PML object model.
 Justification trace: Question foo:question1 (what is Tony's Specialty); Query foo:query1 (type TonysSpecialty ?x); NodeSet foo:ns1 and NodeSet foo:ns2, each with a hasConclusion; InferenceStep; Mapping; SourceUsage. Properties: isQueryFor, hasAnswer, fromQuery, fromAnswer, isConsequentOf, hasAntecedent, hasRule, hasInferenceEngine, hasLanguage, hasVariableMapping, hasSourceUsage, hasSource, usageTime.
 IWBase: Language, InferenceEngine, InferenceRule, Source.]

IWBrowser: Justification and Provenance

Tracking Provenance via RDF Molecule
[Figure: an RDF graph G with four triples, t1 to t4 (foaf:name "Li Ding", foaf:knows, foaf:name "Tim Finin", foaf:mbox mailto:[email protected]), is decomposed into RDF molecules]
  • Match sub-graphs: Web pages containing one or more molecules are discovered by Swoogle.
Ding, L.; Finin, T.; Peng, Y.; Pinheiro da Silva, P.; McGuinness, D.L. Tracking RDF Graph Provenance using RDF Molecules. Proceedings of the Fourth International Semantic Web Conference (poster), November 2005.

Conclusion
  • The Semantic Web is simple but powerful.
  • Standardized by W3C: RDF, RDFS, OWL
  • Current focuses:
      Query: SPARQL
      Rules: SWRL, RIF
      Web services: OWL-S, WSDL-S, SAWSDL
      Best practice and deployment
  • But it cannot do everything.
  • Open questions: business model, industry adoption? Privacy?

Recommended Readings
Tutorials and starting points
  • Semantic Web Road map (since 1998), Tim Berners-Lee
  • The Semantic Web, Scientific American, May 2001, Tim Berners-Lee, James Hendler and Ora Lassila
  • Ontology Development 101: A Guide to Creating Your First Ontology, 2001, Natalya F. Noy and Deborah L. McGuinness
  • Semantic Web Tutorials
  • W3C Semantic Web activity
  • W3C Semantic Web Interest Group
  • W3C Semantic Web News
  • Planet RDF: aggregated blogs
  • Dave Beckett's Resource Description Framework (RDF) Resource Guide
  • Swoogle Semantic Web Search Engine
  • Semantic Web reference card
Conferences and Journals
  • International Semantic Web Conference (ISWC)
  • European Semantic Web Conference (ESWC)
  • Semantic Technology Conference (SemTech)
  • Journal of Web Semantics

Ongoing W3C Semantic Web Activity
  • RDF Data Access Working Group: RDQL => SPARQL
  • Rules Interchange Working Group: RuleML => SWRL => RIF
  • Best Practices Working Group: vocabulary management (e.g., WordNet); thesauri: SKOS (Simple Knowledge Organization System); image annotation; DOAP (Description of a Project); many tutorials and demos
  • Semantic Annotations for Web Services Description Language Working Group: OWL-S and WSDL-S; WSDL 2.0
