Date: Wed, May 21, 2008 10:08:50 PM
From: Robin Cover
Subject: XML Daily Newslink. Wednesday, 21 May 2008
XML Daily Newslink. Wednesday, 21 May 2008
A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS http://www.oasis-open.org
Edited by Robin Cover
====================================================
This issue of XML Daily Newslink is sponsored by
Sun Microsystems, Inc. http://sun.com
====================================================
HEADLINES:
* Microsoft Office 2007 SP2 to Support XPS, PDF v1.5, PDF/A, and ODF v1.1
* Processing Linked Web Data with XSLT
* State of the Semantic Web
* DITA, DocBook, and the Art of the Document
* W3C Call for Implementations: XQuery and XPath Full Text 1.0
* Web-based Spreadsheets with OpenOffice.org and Dojo
* OASIS Open Standards Forum 2008
* A Uniform Resource Identifier for Geographic Locations ('geo' URI)
----------------------------------------------------------------------
Microsoft Office 2007 SP2 to Support XPS, PDF v1.5, PDF/A, and ODF v1.1
Staff, Microsoft Announcement
Microsoft announced that with the release of Microsoft Office 2007 Service
Pack 2 (SP2) scheduled for the first half of 2009, the list of supported
document formats will grow to include support for XML Paper Specification
(XPS), Portable Document Format (PDF) 1.5, PDF/A, and Open Document Format
(ODF) v1.1. "When using SP2, customers will be able to open, edit and save
documents using ODF and save documents into the XPS and PDF fixed formats
from directly within the application without having to install any other
code. It will also allow customers to set ODF as the default file format
for Office 2007. To also provide ODF support for users of earlier versions
of Microsoft Office (Office XP and Office 2003), Microsoft will continue
to collaborate with the open source community in the ongoing development
of the Open XML-ODF translator project on SourceForge.net. In addition,
Microsoft has defined a road map for its implementation of the newly
ratified International Standard ISO/IEC 29500 (Office Open XML). IS29500,
which was approved by the International Organization for Standardization
(ISO) and International Electrotechnical Commission (IEC) in March, is
already substantially supported in Office 2007, and the company plans
to update that support in the next major version release of the Microsoft
Office system, code-named 'Office 14'. Consistent with its interoperability
principles, in which the company committed to work with others toward
robust, consistent and interoperable implementations across a broad range
of widely deployed products, the company has also announced it will be an
active participant in the future evolution of ODF, Open XML, XPS, and PDF
standards. Microsoft will join the OASIS technical committee working on
the next version of ODF and will take part in the ISO/IEC working group
being formed to work on ODF maintenance. Microsoft employees will also
take part in the ISO/IEC working group that is being formed to maintain
Open XML and the ISO/IEC working group that is being formed to improve
interoperability between these and other ISO/IEC-recognized document
formats. The company will also be an active participant in the ongoing
standardization and maintenance activities for XPS and PDF. It will also
continue to work with the IT community to promote interoperability between
document file formats, including Open XML and ODF, as well as Digital
Accessible Information System (DAISY XML), the foundation of the globally
accepted DAISY standard for reading and publishing navigable multimedia
content. Microsoft is also committed to providing Office customers with
the ability to open, edit and save documents in the Chinese national
document file format standard, Uniform Office Format (UOF)."
http://xml.coverpages.org/Office2007-Formats.html
See also the Interop Vendor Alliance Web site: http://interopvendoralliance.org/default.aspx
----------------------------------------------------------------------
Processing Linked Web Data with XSLT
Uche Ogbuji, DevX.com
The Semantic Web is a grand vision for increasing the power of the web
through better expression and management of context. Semantic Web
developers are building a framework to open up and connect organized
information, which takes advantage of many popular developments on the
web, such as the success of Wikipedia, Creative Commons-licensed
publishing on sites like Flickr, and various blogs. A portion of this
framework is the Linking Open Data (LOD) community initiative (seeded
by the W3C Semantic Web Education and Outreach group). A goal of LOD is
to weave together separate collections of open data using deep linking
and RDF (Resource Description Framework) representations. The hallmark
of LOD is to make it easy for web developers to create and process
compatible data. Utilizing LOD calls for a broad war chest of tools and
techniques that cover the diverse expertise of Web developers. One
popular tool for processing data on the web is XSLT (Extensible
Stylesheet Language Transformations), building on the growth of XML
as a data format on the web. XSLT is not a general-purpose programming
language, so its uses, including LOD processing, are limited.
However, XSLT is very useful for auxiliary roles in such processing
that involve transforming XML. This article explores specialized areas
for the use of XSLT 1.0 in LOD processing. The focus is on XSLT 1.0
(XSLT 2.0 does offer more for LOD processing, but it is far more complex
and much less used by the community). XSLT 1.0 has more processors than
2.0, and the EXSLT set of community extensions, which has strong support
in Firefox 3.0, provides facilities that bring it close to the power of
XSLT 2.0... As for any web development, use whatever tools you prefer
for Linking Open Data (LOD), but there are a few things that make XSLT
attractive. For one, XSLT processing is much faster than Javascript/DOM
in almost all browsers. Also, some web developers prefer to learn XSLT
rather than other more general programming languages. By using Semantic
Web technologies now, you strengthen your position as a web developer
for the future. Ideally, you should feel empowered to use a combination
of languages for processing, and to target each language to its greatest
strength.
http://www.devx.com/semantic/Article/37820
----------------------------------------------------------------------
State of the Semantic Web
Ivan Herman, Conference Presentation
This presentation was delivered by Ivan Herman, W3C Semantic Web
Activity Lead, at the 2008 Semantic Technology Conference held in
San Jose, California, on May 18, 2008. The history of the Semantic
Web goes back several years now. The 55-slide presentation summarizes
what has been achieved, where we are, and where we are going. Ivan
Herman joined the Centre for Mathematics and Computer Sciences (CWI)
in Amsterdam in 1988 where he holds a tenured position. He joined
the W3C Team as Head of W3C Offices in 2001 while maintaining his
position at CWI. Ivan served as Head of Offices until 2006, when he
was asked to take the Semantic Web Activity Lead position, which is
now his principal work at W3C. As summarized in Bruno Pinheiro's
blog: "[Herman] made a broad presentation of what they're focusing
at the W3C, which are the discussions that are burning at the
community and talked about some technologies that they are putting
their bets on. As far as I saw, Dublin Core and FOAF are a common
sense at the vocabulary level, as they appeared as good examples in
both presentations and in every book about semantic. SPARQL is the
Query Language that with RDF and OWL seems to be under the spotlight
now. Ivan talked a little about an interesting project called the
'Linking Open Data Project', whose goal is to 'expose open databases
in RDF', setting RDF links among data items from different databases
and setting up SPARQL endpoints to query the data. One of the
projects of this initiative is DBpedia: by extracting data from the
'infobox' on Wikipedia pages (right column) for a city, for example,
and integrating it with the city information in the US Census
database, they can build stronger and richer knowledge of that city.
At this elaboration stage there are still
lots of issues, but these were the ones Ivan talked about: security,
trust, provenance; ontology merging, alignment, term equivalences;
Uncertainty. The most important for me were the ontology merging and
uncertainty. The web as we know it was built on sharing and linking
documents. Now, on the Semantic wave, the same concept must be applied.
There's no need to build a completely new ontology on geonames, for
example. Just link to an existing one and build one just for your own
knowledge domain..."
http://www.w3.org/2008/Talks/0518-SanJose-IH/HTML/
See also Bruno Pinheiro's blog: http://www.brunopinheiro.com.br/blog/2008/05/19/semtech-2008-1st-day/
----------------------------------------------------------------------
DITA, DocBook, and the Art of the Document
Kurt Cagle, O'Reilly Reviews
Structured documentation provides a level of uniformity that can then
serve for reusing content from a single document source. Today that is
important because such structured source documents can in turn be
transformed into HTML, into PDFs, PostScript files, RTF and Microsoft
Word. Such source documents can also serve to power binary help files,
to provide first-level semantics for text-to-speech and VoiceXML
applications and so forth - all at the same time. A consistent document
language makes it possible to build transformations to import partial
content into output for labels on cans or boxes, and provides a single
point of authority for translation into foreign languages... DocBook
and DITA both provide XML Markup for describing different facets of
technical documentation. DocBook actually has its origins, ironically
enough, with O'Reilly & Associates as a language used to lay out narrative
technical books, based primarily upon the works of Norman Walsh and Robert
Stayton. DocBook was originally an SGML specification, and was one of the
first non-W3C specifications to be converted to XML, with the formal
specification for DocBook being then assigned to OASIS-Open as part of
their documentation activity. It is used primarily for describing books,
articles, research papers and (with some additions) slides, but its
structured layout also makes it attractive for storing technical articles
with small to moderate sized organizations. Indeed, even today, many of
the books that O'Reilly produces are laid out first in DocBook... DITA,
on the other hand, evolved from the Darwin Information Typing Architecture
developed by IBM in order to create individual 'topics' of content --
such as those that might be used for an online documentation system. The
topics in turn are organized by topic maps that establish a hierarchical
structure for the topics. Topics in turn use a basic layout language
which borrows somewhat from HTML, but extends it to include figures,
examples, notes, screen displays and so forth. DITA works especially well
in those cases where narrative content is limited to the domain of a
single topic (such as the individual entry within a help application),
although efforts are underway to try to extend this to formal business
documents, with mixed success. As a technology, DITA seems to work best
in those situations where you're dealing with content that can be parsed
into distinct chunks that have to be updated by a wide number of authors.
http://www.oreillynet.com/xml/blog/2008/05/dita_docbook_and_the_art_of_th.html
----------------------------------------------------------------------
W3C Call for Implementations: XQuery and XPath Full Text 1.0
S. Amer-Yahia, C. Botev, S. Buxton (et al., eds), W3C Technical Report
W3C has issued a call for implementations in connection with the
publication of "XQuery and XPath Full Text 1.0" as a Candidate
Recommendation. This document has been jointly developed by the W3C
XML Query Working Group and the W3C XSL Working Group, each of which
is part of the XML Activity. It will remain a Candidate Recommendation
until at least 15-September-2008, and will not be submitted for
consideration as a W3C Proposed Recommendation until its four key exit
criteria are met. A Test Suite for this document is under development,
and implementors are encouraged to run this test suite and report their
results. The editorial teams have also released Working Drafts for
"XQuery and XPath Full Text 1.0 Requirements" and "XQuery and XPath
Full Text 1.0 Use Cases." The CR document defines the language and the
formal semantics of XQuery and XPath Full Text 1.0. Additionally, the
document defines an XML syntax for XQuery and XPath Full Text 1.0.
XQuery and XPath Full Text 1.0 extends the syntax and semantics of XQuery
1.0 and XPath 2.0... As XML becomes mainstream, users expect to be able
to search their XML documents. This requires a standard way to do
full-text search, as well as structured searches, against XML documents.
A similar requirement for full-text search led ISO to define the
SQL/MM-FT standard. SQL/MM-FT defines extensions to SQL to express
full-text searches providing functionality similar to that defined in
this full-text language extension to XQuery 1.0 and XPath 2.0. XML
documents may contain highly structured data (fixed schemas, known types
such as numbers, dates), semi-structured data (flexible schemas and
types), markup data (text with embedded tags), and unstructured data
(untagged free-flowing text). Where a document contains unstructured
or semi-structured data, it is important to be able to search using
Information Retrieval techniques such as scoring and weighting... As
XQuery and XPath evolve, they may apply the notion of score to querying
structured data. For example, when making travel plans or shopping for
cameras, it is sometimes useful to get an ordered list of near matches
in addition to exact matches. If XQuery and XPath define a generalized
inexact match, we expect XQuery and XPath to utilize the scoring
framework provided by XQuery and XPath Full Text.
http://www.w3.org/TR/2008/CR-xpath-full-text-10-20080516/
See also the Requirements document: http://www.w3.org/TR/2008/WD-xpath-full-text-10-requirements-20080516/
----------------------------------------------------------------------
Web-based Spreadsheets with OpenOffice.org and Dojo
Oleg Mikheev and Doan Nguyen Van, Java World Magazine
As functionality traditionally associated with desktop applications
moves to the Web, developers are looking for new ways to handle that
computational heavy lifting on the server side. Many Web applications
these days aim to replace a corresponding desktop application in one
way or another. For instance, most Web grids and tables, such as those
in Google Spreadsheets, essentially mimic desktop office spreadsheets.
But if you need to create a Web-based application that behaves like an
office suite, there's no need to reinvent the wheel: the open source
OpenOffice.org suite can actually serve as the powerhouse behind a Web
application. In this article, you'll learn how to combine OpenOffice.org
and Dojo to create a simple Ajax-based spreadsheet application much
like Google Spreadsheets. OpenOffice.org is a cross-platform office
suite. It is based on a component model called Universal Network Objects,
or UNO, which allows components to communicate across the network
oblivious to the platform they run on and the language they were written
in. Though it's usually thought of as a desktop application,
OpenOffice.org can be also run in server mode. In this mode,
OpenOffice.org listens to a network port for connections. You can
connect to an OpenOffice.org server running either on a local or remote
computer and use the UNO environment to work with documents. UNO
libraries for both client and server modes are part of the standard
OpenOffice.org distribution... The example application in the article
used Dojo and its grid component as a front end. Dojo is a powerful
JavaScript framework with lots of components ready to be used with
almost no development effort. It also provides an AOP-like mechanism
to add custom behavior to its components. The combination of
OpenOffice.org and Dojo resulted in a working application resembling
Google Spreadsheets and capable of displaying and editing cell values --
and all that with minimal development effort spent.
http://www.javaworld.com/javaworld/jw-05-2008/jw-05-spreadsheets.html
----------------------------------------------------------------------
OASIS Open Standards Forum 2008
Staff, OASIS Announcement
OASIS announced that the annual OASIS European Forum will be held
October 1-3, 2008 near London. The theme will focus on "Security
Challenges for the Information Society." OASIS invites proposals for
presentations, panel sessions, and interoperability demonstrations
related to this theme. Funding for the Forum is provided by OASIS
Foundational Sponsor members, BEA, IBM, Primeton, and Sun Microsystems,
and by IDtrust. "Open exchange of information and access to online
services also pose challenges and threats. Service providers want to
authenticate the identity of individuals requesting access, and
determine the resources and services they are entitled to access.
Users want their identity and personal data and privacy to be protected
adequately, and the confidentiality of sensitive data they are submitting
to be respected. In today's Internet and in many large private network
infrastructures, heterogeneity and diversity are the rule rather than
the exception. Security infrastructures need open standards and
interoperability to scale to the huge deployments that are being rolled
out today. Some of these security standards from OASIS and other
organizations support a model where identity authentication, access
control, digital signature processing, encryption and key management
are provided as services that can be distributed and shared. The Open
Standards Forum 2008 will provide users who are evaluating or looking
to deploy such security infrastructures with an opportunity to explore
the state of the art in security services, standards and products. It
will also provide users with an opportunity to present and share their
use cases, requirements and (initial) experience with other users and
with some of the leading experts in this field."
http://events.oasis-open.org/home/forum/2008
----------------------------------------------------------------------
A Uniform Resource Identifier for Geographic Locations ('geo' URI)
Alexander Mayrhofer and Christian Spanring (eds). IETF Internet Draft
Members of the IETF Geographic Location/Privacy (GEOPRIV) Working Group
have published an initial -00 version of the draft "Uniform Resource
Identifier for Geographic Locations ('geo' URI)." The document specifies
a Uniform Resource Identifier (URI) for geographic locations using the
'geo' scheme name. A 'geo' URI provides latitude, longitude and
optionally altitude of a physical location in a compact, simple,
human-readable, and protocol independent way... An increasing number
of Internet protocols and data formats are being enriched by
specifications on how to add information about geographic location to
them. In most cases, latitude as well as longitude are added as
attributes to existing data structures. However, all those methods
are specific to a certain data format or protocol, and don't provide
a generic way to identify locations independent of protocol. The
'geo' URI scheme is another step in that direction and aims to
facilitate, support and standardize location identification
in geospatial services and applications. 'Geo' URIs identify a geographic
location using a textual representation of the location's spatial
coordinates in either two or three dimensions (latitude, longitude, and
optionally altitude). Such URIs are independent from a specific protocol,
application, or data format in which they might be contained... Because
the 'geo' URI is not tied to any specific protocol, and identifies a
physical location rather than a network resource, most of the general
security considerations on URIs do not apply. The URI syntax does make it
possible to construct valid 'geo' URIs which don't identify a valid
location on earth. Applications must not use URIs with such invalid
values, and should warn the user when such URIs are encountered... The
IETF Geographic Location/Privacy (GEOPRIV) Working Group, part of the
Real-time Applications and Infrastructure Area activity, was chartered
to assess the authorization, integrity and privacy requirements that
must be met in order to transfer location information, or authorize
the release or representation of such information through an agent. A
goal of this working group is to deliver a specification that has
broad applicability and will become mandatory to implement for IETF
protocols that are location-aware. The group has produced several final
RFCs.
http://xml.coverpages.org/draft-mayrhofer-geopriv-geo-uri-00.txt
See also the IETF Geographic Location/Privacy (GEOPRIV) Working Group: http://www.ietf.org/html.charters/geopriv-charter.html
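
As an illustration of the validity concern described above, here is a
minimal sketch in Python of parsing and range-checking a 'geo' URI. It
assumes the form geo:<latitude>,<longitude>[,<altitude>] with decimal
numbers, and the conventional ranges of -90..90 for latitude and
-180..180 for longitude. The draft itself is the authority on the exact
grammar; the names GeoPoint and parse_geo_uri are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class GeoPoint(NamedTuple):
    latitude: float
    longitude: float
    altitude: Optional[float] = None


# Assumed syntax: "geo:" followed by two or three comma-separated decimals.
_NUMBER = r"-?\d+(?:\.\d+)?"
_GEO_RE = re.compile(
    rf"geo:({_NUMBER}),({_NUMBER})(?:,({_NUMBER}))?",
    re.IGNORECASE,  # URI scheme names are case-insensitive
)


def parse_geo_uri(uri: str) -> GeoPoint:
    """Parse a 'geo' URI and reject coordinates that are not on earth."""
    match = _GEO_RE.fullmatch(uri.strip())
    if match is None:
        raise ValueError(f"not a well-formed 'geo' URI: {uri!r}")
    lat = float(match.group(1))
    lon = float(match.group(2))
    alt = float(match.group(3)) if match.group(3) is not None else None
    # Syntactically valid URIs can still carry out-of-range values.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return GeoPoint(lat, lon, alt)


if __name__ == "__main__":
    print(parse_geo_uri("geo:48.2010,16.3695,183"))
```

An application following the draft's advice could catch the ValueError
and warn the user instead of using the invalid location.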
----------------------------------------------------------------------
XML Daily Newslink and Cover Pages are sponsored by:
BEA Systems, Inc. http://www.bea.com
IBM Corporation http://www.ibm.com
Primeton http://www.primeton.com
Sun Microsystems, Inc. http://sun.com
----------------------------------------------------------------------
XML Daily Newslink: http://xml.coverpages.org/newsletter.html
Newsletter archive: http://xml.coverpages.org/newsletterArchive.html
Newsletter subscribe: newsletter-subscribe@xml.coverpages.org
Newsletter ***: newsletter-***@xml.coverpages.org
Newsletter help: newsletter-help@xml.coverpages.org
Cover Pages: http://xml.coverpages.org/
----------------------------------------------------------------------

