Doretta: a new way to search online?

Doretta is a Live Messenger contact. You may add her to your contacts and chat with her. Doretta is a very nice girl, has a blog and speaks Italian. Actually, she is a promotional tool: Doretta is a bot designed to answer your questions.

When you ask her something, she answers by opening a Live Search window:

[Image: Doretta]

The idea is very nice, but it is not widely deployed. I found a contest, but it is already closed :-(

http://www.robotinvaders.com/main/Default.aspx

[tags]doretta82, live messenger bot, robotinvaders[/tags]

Papers: Why do you search?

This should be the first question of my PhD research, where I’d like to identify search tasks. One paper partially answers this question:

Rose, D. E., Levinson, D. 2004. Understanding User Goals in Web Search, WWW2004.

The authors give a hierarchy of search goals, derived from a study of AltaVista queries. The three main areas of their proposed framework are:

  • Navigational: the goal is to get to a website
  • Informational: the goal is to learn something
  • Resource: the goal is to obtain a resource (download, view, interact, obtain)

Their work is based on Broder, A. 2002. A taxonomy of web search. SIGIR Forum.

It seems that Rose is going further:

Rose, D. E. 2006. Reconciling Information-Seeking Behavior with Search User Interfaces for the Web. Journal of the American Society for Information Science and Technology, 57(6):797-799.

In this recent short paper, Rose identifies the principles that should guide whoever is creating the next search engine:

  • Different interfaces should be available to match different search goals
  • The interface should facilitate the selection of appropriate contexts for the search
  • The interface should support the iterative nature of the search task. In particular, it should invite refinement and exploration.

[tags]web search, search goals, search tasks, rose, broder, levinson[/tags]

Some new trends in Search

According to Emre Sokullu of Read/WriteWeb, the three areas where search engines will improve are:

  • UI Enhancements
  • Technology Enhancements
  • Approach Enhancements (Vertical Engines)

I’m quite interested in all three areas, because my research focuses on UI issues related to search tasks.

Here (and here) Sokullu defines Search 2.0 as third-generation search, but Danny Sullivan does not agree. Alex Iskold previously discussed vertical search.

[tags]search engines, search 2.0, social search, vertical search[/tags]

Paper: a taxonomy for tagging systems

A very interesting paper:

Marlow C., Naaman M., Boyd D., and Davis M., 2006. Position Paper, Tagging, Taxonomy, Flickr, Article, ToRead, Citeulike.

A good analysis of existing tagging systems on the web, and a very good way to describe them through a taxonomy.

The first dimension is System design and attributes, summarized in this table:

The second dimension is User incentives: future retrieval, contribution and sharing, attract attention, play and competition, self presentation, opinion expression.

[tags]tag, folksonomy, research[/tags]

Changes in Search

Via Slashdot, I read this post: 7 Search Evolutions for ‘07 by Dr. Riza C. Berkan. Dr. Berkan is a nuclear scientist specializing in artificial intelligence and fuzzy logic, and the CEO of hakia, “the Web’s new “meaning-based” search engine with the sole purpose of improving search relevancy and interactivity”.

According to Dr. Berkan, the 7 search evolutions of the coming year will be:

    1. The first time a search engine will have an alternative to indexing; new technology like QDEXing will be developed.
    2. The first time ontological semantics will be used that will enable a search engine to perceive concepts beyond words and retrieve results with meaningful equivalents.
    3. The first time that search results will include highlighted best sentences as a result of semantic analysis rather than bolded keywords as a result of finding incidences.
    4. The first time that a single query will bring a gallery of results equivalent to running multiple queries about the meaningful variations of the same topic.
    5. The first time a search engine will let users evaluate answers on the spot by displaying uninterrupted and coherent text snippets, often letting searchers forgo having to click through to links and saving time.
    6. The first time a search engine will have a dialogue utility that will help point out best answers or suggest a Gallery for a more engaging human-like search experience.
    7. The first time a search engine’s data will grow by detection of new knowledge rather than by detection of new pages. Search engine growth by knowledge will be the new direction for the industry for 2007.

[tags]search engine, hakia, semantic web[/tags]

Article: a theory on groups

I read Clay Shirky’s A Group Is Its Own Worst Enemy. The article was first published in 2003, but it’s still very interesting.

Some notes.

Shirky defines social software as software that supports group interaction. This is something new: the last technology that supported group interaction (and still does) was the table.

Some behaviors come from individuals and only appear to be coordinated. Others are group-related. So you cannot study a group only as multiple individuals, nor only as a single entity. You have to deal with both individual behaviors and group effects.

Group interaction tends to evolve along three patterns:

  1. members will start talking about sex (e.g. most chats)
  2. a common external enemy is identified and the group will coalesce against it (e.g. open source people against Bill Gates)
  3. the religious thing: people identify something that’s beyond any critique (e.g. a football team, a movie director, a book)

To avoid disintegration, you should give the group a structure. Note that this should be a technical AND social solution. A technology-only solution won’t distinguish between normal behaviors and abuse.

The second part of the article tries to explain why the web 2.0 thing is happening right now. Although the article was written in 2003, this part is still valid. Shirky points out that the technology needed for web 2.0 has been there since the first Mosaic was launched (~1994). We got Geocities instead of Blogger because we did not know what we were doing.

Now we see that this thing is going mainstream. This thing is different from everything we saw before: it’s web-native. It’s not something adapted to the web; it was really born on the web.

The last part is on social software design. Shirky thinks you have to accept three things and design for four others.

The things you have to accept if you’re designing social software are:

  1. you cannot completely separate technical and social issues
  2. members are different from users, i.e. some users are more users than others. The core group, the everyday users, should have the means to protect themselves against one-time users that may corrupt the group
  3. the core group has rights that trump individual rights in some situations. This is not something you get to decide. It will happen, because it’s the group’s survival instinct. Citizenship should be different from the ability to log on. The core group will find a way to protect itself against the tyranny of the majority, or the group will die.

The four things to design for:

  1. the identity thing: identity should grant reputation. This does not mean that everyone has to use their real name. It means that users cannot freely change their nickname
  2. you have to design a way for there to be members in good standing, such as identity (reputation), a sort of karma, a “member since” thing
  3. you need barriers to participation. E.g. in Slashdot: everyone can read, anonymous cowards can post, non-anonymous cowards can post with higher ranking. And it’s difficult to earn the ability to moderate
  4. you have to find a way to spare the group from scale

Shirky ends by saying that “the act of writing social software is more like the work of an economist or a political scientist”.

PS: since my WordPress categories are a mess, I’m testing Technorati tags

[tag]article, social web, groups, shirky[/tag]

The web 2.0 began in 1940

Social software isn’t new. It started in 1940, according to a post by Christopher Allen: Tracking the evolution of social software.

The review goes from the memex (1940) to Augmentation, Groupware and CSCW (Computer-Supported Collaborative – or Cooperative – Work). These are successive stages of software tools that allow or help groups to work together.

The scientists involved are usually computer scientists or psychologists. Communication scientists should also be well suited to this kind of work, in particular communication technologists.

The last step we saw is what’s called web 2.0 or social software. Tim O’Reilly defined it one year ago, and his What Is Web 2.0 is still up to date.

To me, the most interesting sentence is:

bloggers [...] have a disproportionate role in shaping search engine results

My life as a map

I’m beginning to use MindManager, software for visualizing and managing information, allowing individuals and teams to more effectively think, plan, and collaborate. I hope this will help me in organizing my business, my research and my life. To try it out, jff (just for fun), I’m going to prepare a visual map for cat management.

Social search is not new

Chris Sherman has some good thoughts on social search. First of all, he defines social search tools as “Internet way-finding services informed by human judgment.”

Actually, social search is older than search engines: the first rudimentary examples are the link pages that everyone, in the early days of the web, had on her site, starting from Tim Berners-Lee. These pages were organized lists of preferred links, chosen and commented on by someone. I remember that every Internet Service Provider had such a page.

Yahoo! was an evolution of this idea: an organized directory of selected websites.

Now Google, with the PageRank algorithm, is a good example of partially automated social search: web pages are collected by robots, but their rank is evaluated starting from what webmasters decided (that is, which pages they chose to link to).
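To make the idea a bit more concrete, here is a minimal sketch of a PageRank-style computation in Python. It is a simplified textbook version (plain power iteration), not Google’s actual implementation; the function name `pagerank`, the damping factor of 0.85, the number of iterations and the toy link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from the link graph.

    links: dict mapping a page to the list of pages it links to
           (a hypothetical toy representation of "what webmasters decided").
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same rank for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small base rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph: "a" and "b" both link to "c", and "c" links back to "a".
toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page that receives the most links ends up with the highest rank, and the page nobody links to ends up with the lowest: the "human judgment" enters only through the links.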

Now, after the coming of Web 2.0, popular social search tools are bookmarking, tagging, and voting services.

Chris Sherman focuses on some problems of social search: the web grows “too quickly for humans to keep up with it.” Algorithmic search is needed to be comprehensive. Furthermore, tags are not the solution to categorizing the web: language is ambiguous and, even with a controlled vocabulary, people are too lazy to use it.

Although I think there is a lot of work to do to solve this last problem (see my posts’ categorization, it’s a mess), I’m still positive about tagging and social search: old ways of categorizing links have more problems. See Ontology is Overrated: Categories, Links, and Tags by Clay Shirky.