I find it pretty fascinating how far we have come: data itself is now meaningful to other data with little outside intervention. However, it seems to me that there may actually be two trains of thought on this problem (much of the work is highly developed and effective, from the search technologies of the early '90s to today). Although the holy grail of the internet may very well be the semantic web, its very nature seems to lend itself to information hiding.
The whole purpose of a 'search engine' is to crawl and discover information that is deeply hidden at the level of human interaction. This requires some explanation. Think about Yahoo.com (circa 2000): its whole purpose was to provide a catalog of web interests that was both navigational and intuitive for the user (a system that seems very similar to a library card catalog). This was effective, but it required an understanding of the material being cataloged into the portal.
Search (a good modern search like Google or the present Yahoo) effectively changes this perspective by assuming that a set of websites (the internet) cannot be cataloged into a portal but must instead be ranked, with the users and designers of websites actually influencing the popularity and contextual properties of the result set.
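To make the catalog-versus-ranking contrast concrete, here is a minimal sketch in which pages are ordered by how many other pages link to them. Everything in it is an assumption for illustration: the `PAGES` data, the `.example` hosts, and the use of a raw inbound-link count as "popularity". It is not how any real search engine scores results, only a toy showing that site designers (through their links) shape the result set without anyone cataloging the pages by hand.

```python
from collections import Counter

# Hypothetical toy web: page -> (page text, pages it links to).
PAGES = {
    "a.example": ("library catalog cards", ["b.example"]),
    "b.example": ("search engine ranking", ["c.example"]),
    "c.example": ("search engine history", ["b.example"]),
    "d.example": ("search tips", ["b.example", "c.example"]),
}


def inbound_counts(pages):
    """Count how many links point at each page (assumed popularity signal)."""
    counts = Counter()
    for _text, links in pages.values():
        for target in links:
            counts[target] += 1
    return counts


def ranked_search(term, pages):
    """Return pages whose text contains the term, most linked-to first."""
    popularity = inbound_counts(pages)
    hits = [url for url, (text, _links) in pages.items() if term in text.split()]
    # Sort by descending popularity, then by URL to keep ties stable.
    return sorted(hits, key=lambda url: (-popularity[url], url))


results = ranked_search("search", PAGES)
print(results)
```

Nobody had to understand the material on these pages to order them; the ordering falls out of what the page authors chose to link to.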
A 'semantic web' seems to be a web of result sets inferred from meaning. These results would be even more meaningful, given that they would both answer a given question and offer related contextual content. But therein may be the rub: by that very definition, there seems to be an opportunity for information hiding.
What gets hidden is the gap between the meaning and the reality of what is being said. An example: a semantic search might be "I am looking for a question." and the result set might be "This is a statement?" (sic: it should not include the question mark). A plain search result may not know what a question is or what it means, but it does differentiate between literals. The semantic approach therefore tends almost toward confusion rather than clarity (importantly, the riffraff (spam) still seems able to work to control these result sets).
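The example above can be sketched in a few lines. This is only a toy under stated assumptions: `DOCUMENTS` is made-up data, and `naive_semantic_search` stands in for "inferring meaning" with a single hypothetical rule (a question is anything ending in a question mark). Real semantic systems are far more involved; the point is just that an inferred rule can be satisfied by text that does not mean what the rule assumes.

```python
# Hypothetical documents for illustration only.
DOCUMENTS = [
    "This is a statement?",
    "What is a question",
    "A question is a sentence that asks for information.",
]


def literal_search(word, documents):
    """Plain search: match the literal word, with no idea what it means."""
    needle = word.lower()
    return [doc for doc in documents if needle in doc.lower()]


def naive_semantic_search(query, documents):
    """Toy 'semantic' search: infer that the user wants things that ARE questions.

    Assumption: 'is a question' is approximated by 'ends with a question mark'.
    """
    if "question" in query.lower():
        return [doc for doc in documents if doc.rstrip().endswith("?")]
    return []


literal_hits = literal_search("question", DOCUMENTS)
semantic_hits = naive_semantic_search("I am looking for a question.", DOCUMENTS)
print(literal_hits)
print(semantic_hits)
```

The literal search returns only the documents that contain the word, while the inferred rule returns the mislabeled statement, and anyone who learns the rule (a spammer, say) can write text to satisfy it.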
Now the reality of these technologies seems to be a compromise (which really is our current state of the art), where search and semantic technologies function in harmony. (I also realize this is a somewhat myopic view; it is offered more as food for thought.)
Submitted for approval by no one and crawled by no one, but maybe something.