Delicious: Finding high-quality information is neither easy nor fast


To tackle this problem, we created the SPEAR algorithm. SPEAR (Spamming-resistant Expertise Analysis and Ranking) is a new technique for measuring the expertise of users by analyzing their public activities on platforms like Delicious. On Delicious, this means analyzing the timeline of users' bookmarking and tagging activities. SPEAR focuses on the ability of users to find new, high-quality information on the Internet. A great benefit of SPEAR is that it returns two very useful sets of results: first, a list of users ranked by their expertise; and second, a list of websites ranked by their quality. So, whether you are looking for experts on Delicious for the programming language JavaScript or want to find the best websites on photography, SPEAR can help.

On top of that, the algorithm has been shown to be very resistant to spamming attacks. We tested SPEAR on data from Delicious: over 71,000 Web documents, 0.5 million users, and 2 million shared bookmarks. When we set the algorithm to find JavaScript experts, for example, the top two users on the resulting list were professional software developers, and not a single spammer was ranked in the top 200.

Technically, SPEAR is based on the well-known information retrieval algorithm HITS, a technique presented in 1999 that is used by search engines to rank Web pages. We came up with SPEAR by modifying HITS to fit the characteristics of open and shared systems like Delicious and by extending it with a new component that integrates the timeline of user activities into its analysis. This resulted in further performance improvements of the algorithm (see Figure 1).

The two main elements of the new SPEAR algorithm are:

1. Mutual reinforcement of user expertise and document quality: A user’s expertise in a particular topic depends on the quality of the documents she or he has found, and the quality of documents in turn depends on the expertise of the users who have found them.

2. Discoverers vs. followers: Expert users should be discoverers – they tend to identify new, high-quality documents sooner than others. In other words, “the early bird catches the worm” (see also Figure 1). SPEAR gives users more credit the earlier they find high-quality documents.

Combined, these two elements mean that SPEAR favors the quality of user actions over their quantity, and that the algorithm is quite resistant to today’s spamming attacks.
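
To make the two elements more concrete, here is a minimal Python sketch of a HITS-style mutual reinforcement loop with time-based credit. It is only an illustration under stated assumptions, not the actual SPEAR implementation: the function name `spear_sketch`, the `(user, document, timestamp)` input format, the square-root credit for earlier bookmarkers, the sum-to-one normalization, and the fixed iteration count are all hypothetical choices that this post does not specify.

```python
from collections import defaultdict
from math import sqrt


def normalize(scores):
    """Scale scores so they sum to 1 (an assumed normalization)."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def spear_sketch(bookmarks, iterations=50):
    """Rank users and documents from (user, document, timestamp) triples.

    Assumes each user bookmarks a given document at most once.
    Returns two dicts: (expertise per user, quality per document).
    """
    # Collect the bookmarking timeline of every document.
    by_doc = defaultdict(list)
    for user, doc, timestamp in bookmarks:
        by_doc[doc].append((timestamp, user))

    # Discoverers vs. followers: earlier bookmarkers get more credit.
    # Assumed credit: sqrt(number of later bookmarkers + 1).
    credit = {}
    for doc, events in by_doc.items():
        events.sort()  # earliest first
        for position, (_, user) in enumerate(events):
            credit[(user, doc)] = sqrt(len(events) - position)

    expertise = {user: 1.0 for user, _ in credit}
    quality = {doc: 1.0 for doc in by_doc}

    # Mutual reinforcement: expertise follows document quality,
    # and document quality follows the expertise of its finders.
    for _ in range(iterations):
        new_expertise = dict.fromkeys(expertise, 0.0)
        for (user, doc), c in credit.items():
            new_expertise[user] += c * quality[doc]
        expertise = normalize(new_expertise)

        new_quality = dict.fromkeys(quality, 0.0)
        for (user, doc), c in credit.items():
            new_quality[doc] += c * expertise[user]
        quality = normalize(new_quality)

    return expertise, quality


# Toy data: "mallory" bookmarks a popular page late and a page nobody else saves.
bookmarks = [
    ("alice", "d1", 1), ("bob", "d1", 2), ("carol", "d1", 3), ("mallory", "d1", 10),
    ("alice", "d2", 1), ("bob", "d2", 3),
    ("mallory", "d3", 1),
]
expertise, quality = spear_sketch(bookmarks)
print(sorted(expertise, key=expertise.get, reverse=True))
```

In this toy data set, the user who finds the shared documents first ends up ahead of the users who follow, and a late bookmark on a popular page earns little credit, which mirrors the "quality over quantity" effect described above.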

We believe SPEAR is very useful in the context of open systems, particularly social networks. That said, we are already researching the next version of the algorithm: the popularity of online services like Delicious is rising, and so is the spam threat. Whether we want to improve the user experience on Delicious or win the arms race against spammers, there’s still a lot of work left to do! [Delicious]

Post information:
This entry was posted on Tuesday, September 1st, 2009 at 10:02 pm and is filed under Internet Trends