Friday, August 14, 2009

On comparing algorithmic cost estimates prepared in different organisations

Roughly summarized, algorithmic cost estimation computes the cost with a mathematical function that links project metrics to an estimated effort. This function arises from the analysis of historical cost information and relates some commonly used cost attribute (usually product size, function points, object points, etc.) to the project cost:

$$\text{Effort} = A \times \text{Size}^{B} \times M$$

where A is an organization-dependent constant, B reflects the disproportionate effort for large projects, and M is a multiplier reflecting product, process and people attributes. Because A is organization-dependent, the algorithmic cost model differs from one organization to another. Another great disadvantage of these models is the inconsistency of their estimates: studies show deviations of 85% to 610% between predicted and actual values. By adjusting the weightings of the attributes (in the formula, the multiplier M), also called calibrating the model to the specific environment, the accuracy of the model can be greatly improved. This calibration, however, is in a sense a customization of the generic model for a specific environment (situation, organization, etc.) and results in a model that is not useful outside the particular environment (e.g. organization) it was calibrated for.
Conclusion: The estimates of the factors contributing to B and M are subjective and thus organization- and situation-dependent. Because of this, algorithmic cost models are not directly comparable from one organization to another.
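
As a rough illustration of why such estimates are not comparable across organizations, the formula can be sketched in Python. The project histories, the exponent 1.1 and the function names below are invented for illustration. The sketch fits a single combined constant (A times M) per organization, which is a simplification of calibrating the individual attribute weights.

```python
import math


def estimate_effort(size, a, b, m=1.0):
    """Effort = A * Size^B * M."""
    return a * size ** b * m


def calibrate_constant(history, b):
    """Fit the combined constant A*M for a fixed exponent B.

    history holds (size, actual_effort) pairs; the fit is the geometric
    mean of the constant implied by each past project.
    """
    logs = [math.log(effort / size ** b) for size, effort in history]
    return math.exp(sum(logs) / len(logs))


# Hypothetical histories: (size in KLOC, actual effort in person-months).
org_one = [(10, 30.0), (50, 180.0), (100, 400.0)]
org_two = [(10, 55.0), (50, 330.0), (100, 720.0)]

B = 1.1  # assumed exponent, not taken from any real model

constant_one = calibrate_constant(org_one, B)
constant_two = calibrate_constant(org_two, B)

# The same 40 KLOC project yields two different estimates.
print(round(estimate_effort(40, constant_one, B), 1))
print(round(estimate_effort(40, constant_two, B), 1))
```

Each constant only makes sense together with the history it was fitted on, which is the reason the two estimates cannot be compared directly.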

Video searching on the web

Google, Yahoo Video, AltaVista, Singingfish, Dogpile, Blinkx.tv, YouTube, Truveo, AOL Video, Live Search and other search engines have been indexing videos for a while now.

Traditional search engines on the World Wide Web index web pages by treating them as plain text documents and indexing the content of each page so that users can search for it. This, however, is not practicable for images, video and audio content, so searching for multimedia data introduces challenging problems in many areas. Both the question of what should be used to index multimedia data and the question of how this index has to be queried to retrieve the information again are still grand challenges in this area.

Different search engines use slightly different "techniques" to index video files. Among those techniques are:
  • Using text from the filename.
  • Using alternate text.
  • Using the text in the hyperlink or other relevant text from the web page that links to the particular video.
  • The video header information, which usually includes the title, the author and, depending on the video format, copyright information.
  • Textual metadata.
  • User tags for the video file.
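
A minimal sketch of how such textual signals could feed one search index is shown below. The URLs, field names and function names are hypothetical, and the inverted index is a generic technique, not the implementation of any of the engines named above.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(videos):
    """Map every term found in any text field to the videos it describes."""
    index = defaultdict(set)
    for url, fields in videos.items():
        for text in fields.values():
            for term in tokenize(text):
                index[term].add(url)
    return index


def search(index, query):
    """Return the videos whose text fields contain all query terms."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical crawl results: one entry per video file.
videos = {
    "http://example.org/media/leeds_lecture.avi": {
        "filename": "leeds_lecture.avi",
        "link_text": "Lecture on face recognition",
        "tags": "computer vision lecture",
    },
    "http://example.org/media/cat.mpg": {
        "filename": "cat.mpg",
        "alt_text": "A cat playing piano",
        "tags": "funny cat",
    },
}

index = build_index(videos)
print(search(index, "cat piano"))
```

The index never looks at the video itself: a file with a misleading name or misleading tags would be indexed under the wrong terms.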
Relying entirely on this handful of approaches has many drawbacks. Metadata, for instance, often does not contain enough information to identify a video, and the weakness of user tags is that they can be misused. In recent years a couple of more innovative approaches for indexing videos have therefore arisen. I find the following two particularly interesting:
  • The search engine Blinkx, for example, uses speech-recognition technology in addition to standard metadata and surrounding-text searches. It converts speech into searchable text by extracting the audio track that accompanies most video files and using it to create a searchable text index of "words".
  • Researchers at the University of Leeds (formerly at Oxford) are working on another innovative solution, which aims to make the content of a video searchable instead of only the text description and metadata. To do this, they have developed a system that uses a combination of face recognition, closed-captioning information and original television scripts to automatically name the faces in the videos. There is still a long way to go, but this innovation is seen as a first step towards automated descriptions of what happens in a video.
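
To make the speech-based idea more concrete, here is a small sketch of an index over spoken words. It assumes a speech recognizer has already produced a transcript of (seconds, word) pairs; the recognizer itself, the sample transcripts and the function names are hypothetical and say nothing about how Blinkx works internally.

```python
from collections import defaultdict


def index_transcript(index, video_id, transcript):
    """Record for every recognized word where it is spoken.

    transcript is a list of (seconds, word) pairs, as a speech
    recognizer might emit them.
    """
    for seconds, word in transcript:
        index[word.lower()].append((video_id, seconds))


def find_spoken(index, word):
    """Return (video_id, seconds) pairs for every occurrence of word."""
    return index.get(word.lower(), [])


spoken_index = defaultdict(list)
index_transcript(spoken_index, "news_clip",
                 [(0.0, "Good"), (0.4, "evening"), (12.5, "Leeds")])
index_transcript(spoken_index, "lecture",
                 [(3.2, "Leeds"), (3.9, "university")])

print(find_spoken(spoken_index, "leeds"))
# [('news_clip', 12.5), ('lecture', 3.2)]
```

Keeping the time offset next to each word means that a hit can point to the moment in the video where the word is spoken, not just to the file.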

Google Web APIs

Google Web APIs are aimed at developers and researchers who want to use Google as a resource in their applications. They make it easy to find and manipulate information on the web and to query more than 8 billion web documents directly from one's own programs. Google uses the SOAP and WSDL standards as the interface between the user's program and the Google API and officially supports Java, .NET, Ruby and Perl. Using the API, developers can issue search requests to Google's index of web pages and receive the results as structured data (number of results, URIs, snippets, query time, etc.). Additionally, developers can access information in the Google cache and check the spelling of words. To start using the API, one needs to download and install the API package from http://www.google.com/apis/ and create an account to get a license key (however, the Google FAQs state that Google is no longer issuing new API keys, so this step can be problematic). The key I received last year is limited to 1,000 queries/day. Last but not least, one needs a SOAP implementation such as Apache Axis, SOAP::Lite for Perl, or SOAP4R if Ruby is the language of choice. The installed API package contains:

  • googleapi.jar - Java library for accessing the Google Web APIs service.

  • GoogleAPIDemo.java - Example program that uses googleapi.jar.

  • dotnet/ - Example .NET programs that use Google Web APIs.

  • APIs_Reference.html - Reference doc for the API. Describes semantics of all calls and fields.

  • Javadoc - Documentation for the example Java libraries.

  • Licenses - Licenses for Java code that is redistributed in this package.

  • GoogleSearch.wsdl - WSDL description for Google SOAP API.

  • soap-samples/ - Different examples

The following small example, which I found somewhere on the web, uses the SOAP::Lite implementation to make a Google query:

Example: query.pl

#!/usr/local/bin/perl -w
use strict;
use SOAP::Lite;

# Configuration
my $key = "The Google API Key Goes Here";

# Initialize the service from the local WSDL file
my $service = SOAP::Lite->service('file:GoogleSearch.wsdl');

my $query = "Viadrina";
my $result = $service->doGoogleSearch(
    $key,     # key
    $query,   # search query
    0,        # start results
    10,       # max results
    "false",  # filter: boolean to turn on/off automatic filtering
    "",       # restrict (string), e.g. "linux"
    "false",  # safeSearch: boolean
    "",       # language restrict, e.g. lang_de
    "",       # input encoding
    ""        # output encoding
);

# Print the first hit, if the query returned any results
if (defined $result->{resultElements} && @{ $result->{resultElements} }) {
    my $first = $result->{resultElements}->[0];
    print join("\n", "Found:", $first->{title}, $first->{URL}, $first->{snippet}), "\n";
}

print "\n The search took ";
print $result->{searchTime};
print "\n\n";
print "The estimated number of results for your query is: ";
print $result->{estimatedTotalResultsCount};
print "\n\n";
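
Since the query above asks for 10 results starting at offset 0 and the key is limited to 1,000 queries/day, fetching more results means planning several calls. The following Python sketch only does this bookkeeping and makes no network calls. The page size of 10 is taken from the "max results" argument above and is assumed here to be the most one call returns; the function and class names are hypothetical.

```python
def page_offsets(wanted, page_size=10):
    """Start offsets needed to fetch `wanted` results, page_size per query."""
    return list(range(0, wanted, page_size))


class QueryBudget:
    """Tracks how many queries of the daily allowance have been used."""

    def __init__(self, daily_limit=1000):
        self.daily_limit = daily_limit
        self.used = 0

    def spend(self, count=1):
        """Reserve `count` queries; return False if the limit would be exceeded."""
        if self.used + count > self.daily_limit:
            return False
        self.used += count
        return True


budget = QueryBudget()
offsets = page_offsets(35)
print(offsets)  # [0, 10, 20, 30]

if budget.spend(len(offsets)):
    # Each offset would be passed as the "start results" argument of one query.
    print("queries left today:", budget.daily_limit - budget.used)
```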