Synthèse

"An operation that proceeds from the simple to the compound, from the element to the whole."


PageRank Effect on Collaborative Filtering
July 30

I have done some experiments on the impact of PageRank on a collaborative filtering recommender for journal articles. The results are counterintuitive - to me anyway - but I think they might have a plausible explanation (I’m working on one anyway.)

I followed in the footsteps of TechLens+ and used article references as a proxy for “ratings” - in other words, assume that one article citing another means a (boolean) “positive vote” for the cited article. It’s a poor approximation, but it addresses the cold-start problem for a digital library recommender.
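A minimal sketch of the citation-as-rating idea, assuming a plain user-based neighbourhood recommender with Jaccard similarity. The function names, the similarity measure, and the toy citation graph are illustrative assumptions, not the TechLens+ implementation or the one used in these experiments. Each citing article plays the role of a "user", and each cited article is an "item" with a boolean positive vote.

```python
from collections import defaultdict


def citation_ratings(citations):
    """Map each citing article ("user") to the set of articles it cites ("items").

    Every (citing, cited) pair counts as one boolean positive vote.
    """
    ratings = defaultdict(set)
    for citing, cited in citations:
        ratings[citing].add(cited)
    return dict(ratings)


def jaccard(a, b):
    """Similarity of two reference lists: size of overlap over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, ratings, n=3):
    """Return up to n articles the target does not cite, ranked by neighbour similarity."""
    seen = ratings.get(target, set())
    scores = defaultdict(float)
    for other, cited in ratings.items():
        if other == target:
            continue
        sim = jaccard(seen, cited)
        if sim == 0.0:
            continue
        # Each similar article "votes" for the references the target lacks.
        for item in cited - seen:
            if item != target:
                scores[item] += sim
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical toy citation graph: (citing article, cited article)
citations = [
    ("A", "X"), ("A", "Y"),
    ("B", "X"), ("B", "Y"), ("B", "Z"),
    ("C", "Y"), ("C", "W"),
]
ratings = citation_ratings(citations)
print(recommend("A", ratings))
```

Because the votes come from reference lists that exist at publication time, a brand-new article already has "ratings" before any reader has interacted with it, which is how this proxy sidesteps the cold-start problem.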

The idea behind using PageRank was to refine these boolean ratings and rank them on a scale. Replacing the boolean rating with a numeric PageRank value has a surprising effect: Top-N prediction quality goes down! Furthermore, random values used in place of PageRank perform about the same as boolean (constant) values.
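One way to picture the refinement is sketched below, under stated assumptions: PageRank is computed by plain power iteration over the citation graph, and each boolean vote is replaced by the PageRank of the cited article. The damping factor of 0.85, the iteration count, the function names, and the toy graph are all assumptions for illustration; the post does not say how the PageRank values were computed or attached to the ratings.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {article: set of articles it cites}."""
    nodes = set(links)
    for targets in links.values():
        nodes |= targets
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by articles that cite nothing is spread evenly over all articles.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def weighted_ratings(links, rank):
    """Replace each boolean vote with the PageRank of the cited article."""
    return {
        citing: {cited: rank[cited] for cited in targets}
        for citing, targets in links.items()
    }


# Hypothetical toy citation graph: citing article -> cited articles
links = {
    "A": {"X", "Y"},
    "B": {"X", "Y", "Z"},
    "C": {"Y", "W"},
}
rank = pagerank(links)
weighted = weighted_ratings(links, rank)
for cited, value in sorted(weighted["A"].items()):
    print(cited, round(value, 3))
```

In this toy graph the article cited by all three others ends up with the highest rank, so its vote carries the most weight. A boolean baseline would give every vote the same constant value instead.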

I trust that Daniel Lemire is right about the value of negative results.


Digg Recommender
July 29

I’m a little behind on my summer blogging (reading and writing) and seem to have missed Digg’s announcement for their recommendation feature.

I don’t Digg myself, but from the video demo (on the blog link above), it looks like they have struck a nice balance between pure collaborative filtering, topic classification and serendipity of recommendations (from different samples of users within a given similarity neighbourhood.)


Canada #1 in Computer Science
July 10

Glen Newton alerts us to a recent article published in Scientometrics, from which he deduced that Canada is the #1 producer of Computer Science research papers per capita. This doesn’t come as a complete surprise, given the #6 ranking that I had previously noted for Canada in overall scientific publication output.


Distracted
July 3

I heard journalist Maggie Jackson speaking on the radio this morning about her new book, Distracted: The Erosion of Attention and the Coming Dark Age.

Despite our wondrous technologies and scientific advances, we are nurturing a culture of diffusion, fragmentation, and detachment. In this new world, something crucial is missing–attention. Attention is the key to recapturing our ability to reconnect, reflect, and relax; the secret to coping with a mobile, multitasking, virtual world that isn’t going to slow down or get simpler. Attention can keep us grounded and focused–not diffused and fragmented.

The Wall Street Journal review of the book relates that:

In the end, Ms. Jackson makes her way to a Buddhist monastery, where people are learning to practice samatha – that is, to exercise voluntary control over their attention. Mountain retreats may not be for everyone, but the spirit of such an effort makes obvious sense in an era of information glut and tech-driven interruptions. Of course, if samatha – or something like it – turns out to be a good idea, it will be blogged about, praised in group emails, discussed online and debated in instant messages. Work will just have to wait.

So the answer to informati


Google Makes us Stupid
June 23

So far, I’ve liked everything I’ve seen and read by Nicholas Carr (author of “The Big Switch: Rewiring the World, from Edison to Google”). I was interested and challenged by his recent article in The Atlantic, “Is Google Making Us Stupid?”. His basic thesis is that the information overload that results from the availability of huge amounts of data from search engines is making us unable to read closely and think deeply.

As part of the five-year research program, [scholars from the University of London] examined computer logs documenting the behavior of visitors to two popular research sites, one operated by the British Library and one by a U.K. educational consortium, that provide access to journal articles, e-books, and other sources of written information. They found that people using the sites exhibited “a form of skimming activity,” hopping from one source to another and rarely returning to any source they’d already visited. They typically read no more than one or two pages of an article or book before they would “bounce” out to another site.

[See this report for more