Sunday, June 14, 2009

Usage and Purpose of XML Namespaces

The Problem:
According to the XML data model, an XML document is a hierarchy of nested elements, each consisting of a name and a set of attributes, and each attribute in turn has a name and a value. All of these names are chosen by the developers. This freedom, however, comes with an inherent problem. Different people work in different domains, yet the vocabulary they use often overlaps. Applications rely on element and attribute names to decide how to process an element, and in a distributed environment like the Internet this becomes problematic, because different people may use the same element name to mean different things. One XML document may use a table element to describe an HTML table, while another may use a table element to describe a piece of furniture, but applications are not smart enough to tell apart elements from different markup languages that happen to share the same name. Due to this name collision and ambiguity, an application has no way of knowing how to process the table element.

Code Sample: HTML table element

<table>
  <tr>
    <td>Product</td>
    <td>Price</td>
  </tr>
  <tr>
    <td>Coffee Table</td>
    <td>199.99</td>
  </tr>
</table>


Code Sample: Furniture table element

<table sku="12222221">
  <type>Coffee Table</type>
  <price>199.99</price>
  <inStock>yes</inStock>
  <material>maple</material>
</table>
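
To make the collision concrete, here is a small made-up snippet that mixes both vocabularies in a single document without any extra information; nothing tells an application whether the inner table element is HTML markup or a furniture description.

Code Sample: Name collision without namespaces (illustrative)

<table>
  <tr>
    <td>
      <!-- ambiguous: is this an HTML table or a piece of furniture? -->
      <table sku="12222221">
        <type>Coffee Table</type>
        <price>199.99</price>
      </table>
    </td>
  </tr>
</table>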

The Solution:
The XML Namespaces recommendation defines a way to distinguish between duplicate element and attribute names. It resolves the ambiguity and avoids "collisions", so that vocabularies created by one organization will not conflict with those created by another. Just as two Java classes can have the same name as long as they are defined in separate packages, two XML elements can have the same name as long as they belong to different namespaces. When you place a set of tags into a namespace, the tags are given a context and a unique identity based on that context. In other words, both table elements mentioned earlier can be used in the same document, even though they are named identically, while retaining their different meanings.

A namespace is declared using the reserved XML attribute xmlns, the value of which must be a URI (Uniform Resource Identifier) reference, usually a URL. The URI has no semantic meaning and is never dereferenced; an XML parser simply treats it as a string. Using a URI to identify a namespace, rather than a simple string (such as "xhtml"), reduces the chance of two namespaces ending up with the same identifier. The declaration can include a short prefix with which elements and attributes are then qualified, e.g. xmlns:xhtml="http://www.w3.org/1999/xhtml". After such a declaration, each element belonging to that namespace has to be qualified with the prefix. Doing this repeatedly for every element can be painful, so you can declare a default namespace instead. However, at any point there can be only one default namespace in effect. Declaring a default namespace means that any element within the scope of the declaration is qualified implicitly, unless it is already qualified explicitly with a prefix. As with prefixed namespaces, a default namespace can be overridden. The scope of an XML namespace declaration is the part of the XML document to which the declaration applies: it covers the element on which it is declared and all of its descendants, unless it is overridden or undeclared on one of those descendants.
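
As a small illustration of that scoping rule (the URIs below are made up for this example), the default namespace declared on the root element applies to it and its descendants until a descendant overrides it with a new default:

Code Sample: Default namespace scope and overriding (illustrative)

<catalog xmlns="http://example.com/catalog">
  <item>
    <!-- catalog and item are in http://example.com/catalog (the default namespace) -->
    <note xmlns="http://example.com/notes">
      <!-- note and text are in http://example.com/notes; the outer default is overridden here -->
      <text>Back in stock next week</text>
    </note>
  </item>
</catalog>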

Code Sample: HTML table and Furniture table with namespaces

<table xmlns="http://www.w3.org/1999/xhtml"
       xmlns:furn="http://www.furniture.org/tables">
  <tr>
    <td>Product</td>
    <td>Price</td>
  </tr>
  <tr>
    <td>
      <furn:table sku="1222221">
        <furn:type>Coffee Table</furn:type>
      </furn:table>
    </td>
    <td>
      <furn:table sku="1222221">
        <furn:price>199.99</furn:price>
      </furn:table>
    </td>
  </tr>
</table>

Saturday, June 6, 2009

Thoughts on COCOMO II for software development projects.

COCOMO (COnstructive COst MOdel) was developed by Barry Boehm in the late seventies by collecting data from many projects to build an empirical database of development efforts for the tasks included in those projects. COCOMO thus provided some of the first solid data on the productivity of engineers in the workplace. In the nineties, Boehm launched the COCOMO II project, attempting to gather similar data on a much broader scale and to address the changes in software development processes and methodologies that had occurred in the roughly two decades since COCOMO was first introduced (e.g. prototyping, incremental development, component reuse, CASE tool support, etc.). The COCOMO II equation embeds many project parameters and is defined as follows:

Effort = A × Size^B × M,

where Effort refers to the person-months needed to complete the project; A represents the type of project and has three possible values; Size is given as a SLOC estimate or a function point count; B is a derived exponent that incorporates the sum of five cost driver (scale factor) metrics; and M is an effort multiplier. The COCOMO II equation defines seven effort multipliers for early life cycle estimating. One of the main difficulties in applying the COCOMO II technique is coping with the very broad solution space. Performing an effort estimation with the COCOMO II method at an early project stage means a product manager has 3 options for the project type, 5^5 rating combinations for the cost drivers, 5^7 rating combinations for the effort multipliers, and an effort value for the Size parameter, or altogether roughly 3 × 5^5 × 5^7 ≈ 730,000,000 different settings combinations, which is too much to review even for the most eager manager. Another problem with the COCOMO II technique is that it requires a project size estimate already at a very early project stage, which is somewhat paradoxical: if such an estimate existed, it would be fairly easy to formulate a reasonable effort estimation.
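
For illustration only, with purely hypothetical values that are not calibrated COCOMO II constants: suppose A = 3.0, the size estimate is 10 KSLOC, the scale factors give B = 1.1 and the effort multipliers combine to M = 1.2. Then

Effort = 3.0 × 10^1.1 × 1.2 ≈ 3.0 × 12.6 × 1.2 ≈ 45 person-months.

Changing just one of those ratings so that M rises from 1.2 to 1.5 already pushes the estimate to roughly 57 person-months, which shows how sensitive the result is to each of the many parameter choices described above.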