XML Daily Newslink. Thursday, 15 May 2008
A Cover Pages Publication http://xml.coverpages.org/
Provided by OASIS http://www.oasis-open.org
Edited by Robin Cover

====================================================
This issue of XML Daily Newslink is sponsored by
Primeton http://www.primeton.com
====================================================

HEADLINES:

* SEC Proposes Requirement for Use of XBRL in Financial Reporting
* YANG: A Data Modeling Language for NETCONF
* Yahoo SearchMonkey Is Out of Its Cage
* The State of the Service Component Architecture (SCA): An Update
* Revision of the PREMIS Data Dictionary for Preservation Metadata
* A Survey of Trust and Reputation Systems for Online Service Provision
* Defending the XML Angle Bracket Tax

----------------------------------------------------------------------

SEC Proposes Requirement for Use of XBRL in Financial Reporting
Staff, U.S. SEC Announcement

The U.S. Securities and Exchange Commission has voted to formally
propose using new technology to get important information to investors
faster, more reliably, and at a lower cost. At the center of the SEC
proposal is "interactive data": computer tags similar in function to
bar codes used to identify groceries and shipped packages. The
interactive data tags uniquely identify individual items in a company's
financial statement so they can be easily searched on the Internet,
downloaded into spreadsheets, reorganized in databases, and put to any
number of other comparative and analytical uses by investors, analysts,
and journalists. The proposed rule would require all U.S. companies to
provide financial information using interactive data beginning next
year for the largest companies, and within three years for all public
companies. According to the SEC's John White, all of the technology is
coming together to make electronic filing a true analytical tool. The
staff has gathered valuable experience during the almost three years
that public companies have been submitting interactive data in our
voluntary filer program... The SEC's proposed schedule would require
companies using U.S. Generally Accepted Accounting Principles with a
worldwide public float over $5 billion (approximately the 500 largest
companies) to make financial disclosures using interactive data
formatted in eXtensible Business Reporting Language (XBRL) for fiscal
periods ending in late 2008. If adopted, the first interactive data
provided under the new rules would be made public in early 2009. The
remaining companies using U.S. GAAP would provide this disclosure
over the following two years. Companies using International Financial
Reporting Standards as issued by the International Accounting Standards
Board would provide this disclosure for fiscal periods ending in late
2010. The disclosure would be provided as additional exhibits to annual
and quarterly reports and registration statements. Companies also would
be required to post this information on their websites. The required
tagged disclosures would include companies' primary financial statements,
notes, and financial statement schedules. Initially, companies would
tag notes and schedules as blocks of text, and a year later, they
would provide tags for the details within the notes and schedules. XBRL
is an XML-based schema that focuses specifically on the requirements of
business reporting. XBRL builds upon XML, allowing accountants and
regulatory bodies to identify items that are unique to the business
reporting environment. The XBRL schema defines how to create XBRL
documents and XBRL taxonomies, providing users with a set of business
information tags that allows users to identify business information in
a consistent way. XBRL is also extensible in that users are able to
create their own XBRL taxonomies that define and describe tags unique
to a given environment.

http://sec.gov/news/press/2008/2008-85.htm
See also the XBRL FAQ document: http://www.xbrl.org/faq.aspx
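
As a rough illustration of how tagged items can be searched and
reorganized, the sketch below reads a heavily simplified, XBRL-like
document in Python. The "ex" namespace, the element names, and the
figures are invented for this example and do not come from any real
XBRL taxonomy or filing.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified instance document. The "ex" namespace,
# element names, and numbers are made up for illustration only.
SAMPLE = """
<report xmlns:ex="http://example.com/taxonomy">
  <ex:Revenue contextRef="FY2007" unit="USD">1200</ex:Revenue>
  <ex:NetIncome contextRef="FY2007" unit="USD">150</ex:NetIncome>
  <ex:Revenue contextRef="FY2006" unit="USD">1000</ex:Revenue>
</report>
"""

def extract_facts(xml_text):
    """Return (tag, context, unit, value) tuples for every tagged item."""
    root = ET.fromstring(xml_text)
    facts = []
    for elem in root:
        # ElementTree renders qualified names as "{namespace}local";
        # keep only the local part.
        local_name = elem.tag.split("}", 1)[-1]
        facts.append((local_name, elem.get("contextRef"),
                      elem.get("unit"), float(elem.text)))
    return facts

def values_by_tag(facts, tag):
    """Collect the values for one tag, keyed by reporting context."""
    return {context: value
            for name, context, _unit, value in facts if name == tag}

facts = extract_facts(SAMPLE)
revenue = values_by_tag(facts, "Revenue")
```

Because every item carries its own tag and context, a comparison such
as revenue across fiscal years reduces to a dictionary lookup rather
than a scrape of formatted text.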

----------------------------------------------------------------------

YANG: A Data Modeling Language for NETCONF
Martin Bjorklund (ed), IETF Internet Draft

A version -00 IETF Internet Draft has been published for "YANG: A Data
Modeling Language for NETCONF." Today, the NETCONF protocol (IETF RFC
4741) lacks a standardized way to create data models. Instead, vendors
are forced to use proprietary solutions. In order for NETCONF to be an
interoperable protocol, models must be defined in a vendor-neutral way.
YANG provides the language and rules for defining such models for use
with NETCONF. YANG is a data modeling language used to model configuration
and state data manipulated by the NETCONF protocol, NETCONF remote
procedure calls, and NETCONF notifications. This document describes the
syntax and semantics of the YANG language, how the data model defined
in a YANG module is represented in XML, and how NETCONF operations are
being used to manipulate the data. YANG models the hierarchical
organization of data as a tree in which each node has a name, and either
a value or a set of child nodes. YANG provides clear and concise
descriptions of the nodes, as well as the interaction between those
nodes. YANG structures data models into modules and submodules. A module
can import data from other external modules, and include data from
submodules. The hierarchy can be extended, allowing one module to add
data nodes to the hierarchy defined in another module. This augmentation
can be conditional, with new nodes appearing only if certain conditions
are met. YANG models can describe constraints to be enforced on the data,
restricting the appearance or value of nodes based on the presence or value
of other nodes in the hierarchy. These constraints are enforceable by
either the client or the server, and valid content must abide by them.
YANG defines a set of built-in types, and has a type mechanism through
which additional types may be defined. Derived types can restrict their
base type's set of valid values using mechanisms like range or pattern
restrictions that can be enforced by clients or servers... YANG strikes
a balance between high-level object-oriented modeling and low-level
bits-on-the-wire encoding. The reader of a YANG module can easily see
the high-level view of the data model while seeing how the object will
be encoded in NETCONF operations. YANG is an extensible language,
allowing extension statements to be defined by standards bodies, vendors,
and individuals. The statement syntax allows these extensions to coexist
with standard YANG statements in a natural way, while making extensions
stand out sufficiently for the reader to notice them. YANG modules can
be translated into an XML format called YIN, provided in Appendix B,
allowing applications using XML parsers and XSLT scripts to operate on
the models. XML Schema files can be generated from YANG modules, giving
a precise description of the XML representation of the data modeled in
YANG modules.

http://xml.coverpages.org/draft-ietf-netmod-yang-00.txt
See also the IETF Network Configuration (NETCONF) Working Group: http://www.ietf.org/html.charters/netconf-charter.html
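
The tree structure and range restrictions described above can be
pictured with a small Python sketch. This is not YANG syntax and not
part of the draft; the Node class, the validate function, and the
interface/mtu example with its 68..9000 range are all assumptions made
for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    """A data node: it carries either a value (leaf) or child nodes."""
    name: str
    value: Optional[int] = None
    children: List["Node"] = field(default_factory=list)
    # Inclusive range restriction on the value, if any (hypothetical).
    value_range: Optional[Tuple[int, int]] = None

def validate(node, path=""):
    """Return a list of constraint violations in the tree rooted at node."""
    here = f"{path}/{node.name}"
    errors = []
    if node.value is not None and node.children:
        errors.append(f"{here}: node has both a value and children")
    if node.value is not None and node.value_range is not None:
        low, high = node.value_range
        if not low <= node.value <= high:
            errors.append(
                f"{here}: value {node.value} outside range {low}..{high}")
    for child in node.children:
        errors.extend(validate(child, here))
    return errors

# Hypothetical model: an "interface" container holding an "mtu" leaf
# whose value is restricted to 68..9000.
interface = Node("interface", children=[
    Node("mtu", value=1500, value_range=(68, 9000)),
])
ok_errors = validate(interface)

# Violate the range restriction and validate again.
interface.children[0].value = 20
bad_errors = validate(interface)
```

The same check could run on either side of the connection, which
mirrors the point that such restrictions can be enforced by clients or
by servers.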

----------------------------------------------------------------------

Yahoo SearchMonkey Is Out of Its Cage
Clint Boulton, eWEEK

Playing off the belief that Web users turn to search engines to get
information to complete tasks, Yahoo has released its new open
developer platform to let programmers write applications that boost
the relevance of search results. SearchMonkey comprises three layers:
First, Yahoo partner publishers, such as The New York Times, Yelp,
eBay and StumbleUpon, share structured data with Yahoo. Third-party
developers then access this content through semantic markup languages,
such as microformats and RDF, standardized XML feeds, Web services
APIs, and page extraction, to create widgets. These widgets will
include navigational links, reviews, contact information and locations
to provide enhanced search listings. Finally, developers make these
apps available in a gallery on Yahoo, from which consumers can grab
them to customize their searches. According to the online SearchMonkey
Guide: Site Owners have web sites containing the data retrieved by
SearchMonkey applications. To be used by SearchMonkey applications,
this data must be structured. Site owners can make structured data
available to Yahoo! in any of the following ways: (1) Atom Feeds: Site
owners push data to Yahoo! by submitting Atom feeds. (2) Markup: Site
owners mark up their web pages with microformats or RDFa/eRDF,
extracted by Yahoo! when crawling these URLs. Microformats are the
leading established standard for web page markup. RDFa and eRDF are
also widely accepted standards. Since these are all open standards,
marking up your pages using any of these formats makes your content
more easily reusable. (3) Web Services: Site owners create custom Web
Services that provide access to their structured data... The Adjunct
Syntax Specification (with Relax NG Compact Syntax specification)
describes a method called DataRSS for embedding arbitrary metadata
within feed vocabularies, including RSS, Atom, IDIF, and others. A
"searchmonkey-profile Vocabulary Specification" defines a set of terms
(classes, properties and data types) recommended for use in DataRSS
feeds and in pages with embedded RDFa and eRDF. It builds on
well-established vocabularies such as Dublin Core and the FOAF vocabulary,
as well as common RDF vocabularies for microformats such as hCard,
hCalendar and hReview. Its set of common terms can help developers to
get started... The following microformats are supported: hCard, hCalendar,
hReview, hFeed and XFN. You may use any RDF or OWL vocabulary. However,
if you are publishing data using a custom-made vocabulary, make sure
you make the schema definition easy to discover so that others can
understand your data and build applications on it. The best approach
is to follow the recommendations regarding "cool URIs" and serve both
textual and machine processable vocabulary definitions at the locations
where your URIs are pointing to... Microformats, eRDF and RDFa are
different ways of embedding metadata inside Web pages. They represent
different trade-offs in terms of ease of authoring versus expressibility.
Microformats are the easiest to write and understand, but may not fill
all your metadata needs. In particular, you may not find an appropriate
vocabulary to represent your information. eRDF and RDFa allow you to
work with any RDF or OWL vocabulary, and create your own vocabulary or
reuse existing ones. eRDF is a subset of the full RDF model; for example,
you can only make statements about the current page. RDFa offers all
the features of RDF, making it the most complex of the three formalisms
but also the most powerful one.

http://www.eweek.com/c/a/Search/Yahoo-SearchMonkey-is-Out-of-its-Cage/
See also the developer documentation: http://developer.yahoo.com/searchmonkey/
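
To make the markup option concrete, here is a minimal Python sketch
that pulls a few hCard properties out of a page using only the
standard library. It is not how Yahoo's crawler works; the HCardParser
class, the sample markup, and the choice of three properties (fn, org,
tel) are assumptions for illustration, and nested property elements
are not handled.

```python
from html.parser import HTMLParser

class HCardParser(HTMLParser):
    """Collect the text of elements whose class matches an hCard property."""

    # Small illustrative subset of hCard property names.
    PROPERTIES = {"fn", "org", "tel"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # property whose text is being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        matches = self.PROPERTIES.intersection(classes)
        if matches:
            self._current = sorted(matches)[0]

    def handle_data(self, data):
        # Store the first non-blank text found inside a property element.
        if self._current and data.strip():
            self.card[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None

# Hypothetical page fragment marked up with the hCard microformat.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Corp</span>
  <span class="tel">+1-555-0100</span>
</div>
"""

parser = HCardParser()
parser.feed(SAMPLE)
card = parser.card
```

The class names double as the vocabulary, which is why microformats
are easy to author but limited to the properties someone has already
defined.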

----------------------------------------------------------------------

The State of the Service Component Architecture (SCA): An Update
David Chappell, Blog

"I moderated a panel on Service Component Architecture (SCA) at JavaOne
last week. I was also the moderator for last year's SCA panel, and
several of the same people were on the panel with me this time. While
the things we talked about were broadly similar, two things stand out
about what's changed in a year. The first is that SCA is real, or at
least part of it is. One of the things the SCA specs define is an
XML-based language called the Service Component Definition Language
(SCDL). SCDL is meant to provide a vendor-neutral way to describe how
components created in various technologies, such as Java, BPEL, and
Spring, are configured and wired together to create applications. Vendors
were showing SCDL in real products on the JavaOne floor -- Oracle had
an especially nice demo -- and so it's clear that this part of SCA is
seeing some success. Whether SCDL will in fact provide much cross-vendor
portability remains to be seen. As usual, this depends on how many
proprietary extensions vendors add. Still, a standard language for
describing the components and assembly of an application is a useful
idea, and the signs so far are promising. The second thing that stands
out after a year is less promising: It's the confusion around how to
write SCA components. Along with SCDL, the SCA specs define how to
create components using several different technologies. Yet the various
SCA vendors and open source projects can't agree on which of these to
implement. SCA support for Spring components, for example, is hit or
miss: some SCA offerings support it, some don't. BPEL is much the same:
Oracle is a big fan, while the open source Fabric3 currently has no
BPEL support. And just as it was a year ago, support for SCA's new
programming model for creating Java components is uneven. As I've
written before, I believe that this aspect of the spec is really
important -- it unifies the diverse approaches of Java EE much as
Microsoft's Windows Communication Foundation (WCF) unified the diverse
programming models in the original .NET Framework... The stated goal
of SCA is to provide application portability. Widespread support for
SCDL is an essential part of this, but so is agreeing on how to create
SCA components. For SCA to really improve portability, the vendors and
open source projects that support it need to agree on how their
customers should create components.

http://www.davidchappell.com/blog/2008/05/state-of-sca-update.html
See also the OASIS Open Composite Services Architecture (CSA) Member Section: http://www.oasis-opencsa.org/

----------------------------------------------------------------------

Revision of the PREMIS Data Dictionary for Preservation Metadata
Brian F. Lavoie, D-Lib Magazine

Released in May 2005, the PREMIS Data Dictionary for Preservation
Metadata was the first comprehensive specification for preservation
metadata produced from an international, cross-domain consensus-building
process. The PREMIS working group, jointly sponsored by OCLC and RLG,
consisted of more than 30 experts from 5 countries, representing
libraries, archives, museums, government agencies, and the private
sector. After about two years, the Maintenance Activity felt that
enough feedback had accumulated to warrant undertaking the first
revision of the Data Dictionary and its XML schema. The revision process
began in October 2006, and ended with the release of the PREMIS Data
Dictionary 2.0 in April 2008. This article briefly describes the revision
process and its outcomes, including a summary of the major changes
appearing in the new version of the Dictionary. The Maintenance Activity
is establishing a registry (standard vocabularies for particular semantic
units) in the near future, populated initially by lists of suggested
values for semantic units supplied in PREMIS 2.0. Implementers will be
encouraged to contribute other vocabularies in use to the registry. A
mechanism is under development to enable the identification of the
source of these controlled vocabularies and to validate appropriate
values using an XML schema. A registry of controlled vocabularies should
be of considerable value to the community, both as a reference to inform
implementation decisions, and as a means of encouraging convergence and
standardization... The PREMIS schema has been endorsed by the Metadata
Encoding and Transmission Standard (METS) Editorial Board as an approved
extension schema for METS. The METS schema is widely used by digital
repositories as a packaging mechanism for objects and their associated
metadata. A number of questions have emerged as to how the PREMIS Data
Dictionary and schema should be used in conjunction with METS. The
Maintenance Activity has convened a group of experts to develop a set
of guidelines and recommendations for using PREMIS and METS, and a
working draft of their findings is now available online.

http://www.dlib.org/dlib/may08/lavoie/05lavoie.html
See also the U.S. Library of Congress PREMIS site: http://www.loc.gov/standards/premis/

----------------------------------------------------------------------

A Survey of Trust and Reputation Systems for Online Service Provision
A. Josang, R. Ismail, C. Boyd (eds), Decision Support Systems

This article preprint was contributed to the OASIS Open Reputation
Management Systems (ORMS) Technical Committee document repository by
TC Co-chair Nat Sakimura (Nomura Research Institute, Ltd). The OASIS
TC was recently chartered to develop an Open Reputation Management System
(ORMS) that provides the ability to use common data formats for
representing reputation data, and standard definitions of reputation
scores. Document abstract: "Trust and reputation systems represent a
significant trend in decision support for Internet mediated service
provision. The basic idea is to let parties rate each other, for example
after the completion of a transaction, and use the aggregated ratings
about a given party to derive a trust or reputation score, which can
assist other parties in deciding whether or not to transact with that
party in the future. A natural side effect is that it also provides an
incentive for good behaviour, and therefore tends to have a positive
effect on market quality. Reputation systems can be called collaborative
sanctioning systems to reflect their collaborative nature, and are
related to collaborative filtering systems. Reputation systems are
already being used in successful commercial online applications. There
is also a rapidly growing literature around trust and reputation systems,
but unfortunately this activity is not very coherent. The purpose of
this article is to give an overview of existing and proposed systems
that can be used to derive measures of trust and reputation for Internet
transactions, to analyse the current trends and developments in this
area, and to propose a research agenda for trust and reputation systems."

http://www.oasis-open.org/committees/download.php/28303/JIB2007-DSS-Survey.pdf
See also the OASIS ORMS TC: http://www.oasis-open.org/committees/orms/
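
The basic idea in the abstract, aggregating ratings about a party into
a single score, can be sketched in a few lines of Python. The formula
used here, (positive + 1) / (positive + negative + 2), is one simple
choice (the expected value of a Beta distribution under a uniform
prior) picked for illustration; the abstract does not prescribe any
particular formula, and the party names and ratings are invented.

```python
from collections import defaultdict

def aggregate(ratings):
    """Turn (party, is_positive) ratings into a score per party in (0, 1)."""
    counts = defaultdict(lambda: [0, 0])  # party -> [positive, negative]
    for party, is_positive in ratings:
        counts[party][0 if is_positive else 1] += 1
    # Illustrative scoring rule: a party with no ratings would score 0.5,
    # and the score moves toward 0 or 1 as evidence accumulates.
    return {party: (pos + 1) / (pos + neg + 2)
            for party, (pos, neg) in counts.items()}

# Hypothetical ratings left after completed transactions.
ratings = [
    ("alice", True),
    ("alice", True),
    ("alice", False),
    ("bob", False),
]
scores = aggregate(ratings)
```

A party deciding whether to transact could compare these scores
against a threshold of its own choosing; the threshold, like the
formula, is a policy decision outside the scope of this sketch.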

----------------------------------------------------------------------

Defending the XML Angle Bracket Tax
Norm Walsh, Blog

"I've spent a couple of days trying to decide if I want to respond to
Jeff Atwood's swipe at XML... Jeff tries to show how much better RFC
822 is for email. There's no question that it's more compact; I could
learn to author email in XML, but I'm not anxious to do it. On the other
hand, it's pretty obvious that XML is actually better. Jeff summarizes
with a perfectly reasonable statement: 'I don't necessarily think XML
sucks, but the mindless, blanket application of XML as a dessert topping
and a floor wax certainly does. Like all tools, it's a question of how
you use it.' I can't really disagree with that. XML may be my hammer
of choice, but I don't hang picture hooks with a sledge hammer. If your
data is really simple, maybe just a set of key/value pairs, and if both
the key and the value are strings, and if the consequences of bad data
are negligible, and if there's no possibility that there will ever be
any additional complexity, then sure, maybe a flat text file is all you
need... RELAX NG has both an XML syntax and a compact (non-XML) syntax.
It's possible to author in both of them, and you can translate from one
to the other without any loss of data, and with minimal loss of formatting.
I author mostly in the compact syntax. Nevertheless, I absolutely rely
on the XML syntax because having the XML syntax makes the entire schema
amenable to processing with an enormous range of XML tools. General
purpose tools that work equally well with RELAX NG and other XML languages.
Tools that I did not have to write, test, debug, or document. The lesson,
if there's a lesson, is that even if you think a non-XML syntax is better
for one purpose or another, the ability to translate into (and back out of)
an XML syntax is a good thing. Of course, devising two syntaxes, and
making them isomorphic, and making it possible to translate back and forth
without destroying one format or the other, is a huge amount of work.
It's usually easier to just use XML... I don't necessarily think all the
alternatives to XML suck, but the mindless, knee-jerk rejection of XML
because it contains a small amount of additional syntax certainly does.
Like all tools, it's a question of how you use it. Please think twice
before subjecting yourself, your fellow programmers, and your users to
more fragile, ASCII-only, ad hoc syntaxes.

http://norman.walsh.name/2008/05/13/thetax
See also James Clark on JSON: http://blog.jclark.com/2007/04/xml-and-json.html

----------------------------------------------------------------------

XML Daily Newslink and Cover Pages are sponsored by:

BEA Systems, Inc. http://www.bea.com
IBM Corporation http://www.ibm.com
Primeton http://www.primeton.com
Sun Microsystems, Inc. http://sun.com

----------------------------------------------------------------------

XML Daily Newslink: http://xml.coverpages.org/newsletter.html
Newsletter archive: http://xml.coverpages.org/newsletterArchive.html
Newsletter subscribe: newsletter-subscribe@xml.coverpages.org
Newsletter ***: newsletter-***@xml.coverpages.org
Newsletter help: newsletter-help@xml.coverpages.org
Cover Pages: http://xml.coverpages.org/

----------------------------------------------------------------------