Self-updating map
As a consequence, the opportunity to develop versioned road maps, programmatically interoperable via SPARQL, is now at hand.
Simultaneously with the expansion of the TCGA, the tooling required for enabling computational ecosystems for data-driven medical genomics (Almeida, 2010) is maturing rapidly, to the point that tools operating within and providing such ecosystems are beginning to appear (Almeida et al., 2012b).
According to the company, each Tele Atlas map update, of which there is at least one per quarter, includes “thousands of miles of new roads and Points of Interest”.
It also takes into account changes and corrections made through TomTom’s Map Share service, where users can notify the company of any incorrect mapping they have come across.
More importantly, in 2011, there was a momentous change in the level of data interoperability of the TCGA data repository: data files are now available directly through HTTP calls to a central directory, located at Jac.
This opens entirely new opportunities for interactive reproducible data analysis and visualization.
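To illustrate what access "through HTTP calls to a central directory" might look like in practice, the following is a minimal sketch that separates the subdirectories and files listed on a web directory index page. It assumes the directory is served as an ordinary HTML index of links; the base URL, the sample listing and the function names are hypothetical and are not taken from the TCGA repository itself. The page is supplied as a string so the sketch runs without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> tag in a directory index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_directory(base_url, index_html):
    """Split the entries of an index page into subdirectory and file URLs."""
    parser = LinkCollector()
    parser.feed(index_html)
    dirs, files = [], []
    for href in parser.hrefs:
        # Skip column-sort links, in-page anchors, and links that leave
        # the current directory (absolute paths or "..").
        if href.startswith(("?", "#", "/", "..")):
            continue
        target = urljoin(base_url, href)
        (dirs if href.endswith("/") else files).append(target)
    return dirs, files


# Hypothetical base URL and listing, for illustration only.
BASE_URL = "https://example.org/tcga/brca/"
SAMPLE_INDEX = """
<html><body>
<a href="?C=N;O=D">Name</a>
<a href="/tcga/">Parent Directory</a>
<a href="cgcc/">cgcc/</a>
<a href="manifest.txt">manifest.txt</a>
</body></html>
"""

dirs, files = list_directory(BASE_URL, SAMPLE_INDEX)
print(dirs)
print(files)
```

Applying `list_directory` recursively to each returned subdirectory would enumerate every file reachable from the root, which is the kind of traversal a road map of the repository depends on.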
These specific design elements align with the concept of knowledge reengineering and represent a sharp departure from top-down approaches in grid initiatives such as caBIG.
As a consequence, an attempt to use the 2010 RDF road map linked above to traverse the current contents of the TCGA initiative is likely to produce a significant number of unresolvable links to data files.
The concern that the web browser is computationally inefficient for advanced numerical procedures has also been amply overcome, as we found when identifying sequence analysis procedures making use of the MapReduce (Dean and Ghemawat, 2008) distributed computing template (Almeida et al., 2012).
A core requirement of applications operating within such a computational ecosystem is the ability to discover, access and analyze subsets of large data services.
While findings such as the recent doubling of the number of recognized breast cancer subtypes (Curtis et al., 2012) themselves make use of broad datasets, their results are often the starting point for further study of the numerous biomolecular bases for tumorigenesis.
Creation of such a road map represents a significant data modeling challenge, due to the size and fluidity of this resource: each of the 33 cancer types is instantiated in only partially overlapping sets of analytical platforms, while the number of data files available doubles approximately every 7 months.
Results: We developed an engine to index and annotate the TCGA files, relying exclusively on third-generation web technologies (Web 3.0).
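The idea of indexing and annotating files so that subsets can be discovered by query can be sketched with a small in-memory collection of subject-predicate-object triples. This is only an illustration of the general RDF-style approach, not the engine described here: the predicate names (`ex:cancerType`, `ex:platform`), the file URLs and the helper functions are all assumptions introduced for the example.

```python
def file_triples(file_url, cancer_type, platform):
    """Describe one data file as (subject, predicate, object) triples.

    The predicates are hypothetical, chosen only for this sketch.
    """
    return [
        (file_url, "rdf:type", "ex:File"),
        (file_url, "ex:cancerType", cancer_type),
        (file_url, "ex:platform", platform),
    ]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard,
    much like a variable in a SPARQL triple pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def files_with(triples, cancer_type, platform):
    """Find files annotated with both a cancer type and a platform."""
    by_type = {s for s, _, _ in match(triples, p="ex:cancerType", o=cancer_type)}
    by_platform = {s for s, _, _ in match(triples, p="ex:platform", o=platform)}
    return sorted(by_type & by_platform)


# Hypothetical index entries, for illustration only.
index = []
index += file_triples("https://example.org/f1.txt", "brca", "platformA")
index += file_triples("https://example.org/f2.txt", "brca", "platformB")
index += file_triples("https://example.org/f3.txt", "gbm", "platformA")

print(files_with(index, "brca", "platformA"))
```

Because each cancer type appears on only a partially overlapping set of platforms, a flat set of triples avoids fixing a schema in advance: a new platform or cancer type simply adds more triples, and existing queries continue to work.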