Washington DC Message Boards

Understanding the concept of DCpages



Bernie forwarded me this...


If you think you're reading the news, be warned that this story -- and any other on the web -- will be barely read by anyone 36 hours after it was first posted. That's the message from a team of statistical physicists who have analysed how people access information online. Albert-László Barabási of the University of Notre Dame in the US and colleagues in Hungary have calculated that the number of people who read news stories on the web decays with time as a power law, and not exponentially as commonly thought. Most news becomes old hat within a day and a half of being posted -- a finding that could help website designers or people trying to understand how information gets transferred in biological cells and social networks (Phys. Rev. E 73 066132).



[Figure: The web portal]



Physicists like Barabási are interested in studying the World Wide Web because it is an example of a "complex network", with a topology that changes as new documents and links are continually added. His team pictures a typical news web site as a series of circular blobs, or "nodes", each of which corresponds to an individual news story, with a line joining each node if the two stories are connected by a hyperlink (see figure). The area of each blob is proportional to the logarithm of the number of visits to each document.


Their model reveals that a typical news site has a relatively stable "skeleton" -- corresponding to the overall organization of the site -- along with nodes (that is, actual stories) that are only temporarily linked to the main structure before being deleted from the site or not linked any more. In this sense, the network resembles a biological cell's regulatory network, whose "wiring" can change rapidly during a cell cycle. It is also a bit like social networks: we each have a relatively stable core network of friends and acquaintances but the number of people we interact with can vary drastically from one day to the next.


To get a fuller understanding of such networks, Barabási and colleagues decided to study the visiting patterns on a popular Hungarian news and entertainment portal (origo.hu). Thanks to automatically assigned "cookies", the scientists were able to reconstruct the browsing history of about 250,000 visitors to the site over the course of a month.


The researchers found that the documents belonging to the skeleton of the website receive an approximately constant stream of visitors, which means that the cumulative number of visitors accessing these documents increases linearly in time. In contrast, news documents receive the most hits directly after their release, and the number of hits then decreases with time. Thus, the cumulative number of visits to a news document reaches saturation after just a few days.
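To make the contrast concrete, here is a minimal Python sketch of the two cumulative curves. The visit counts are invented for illustration: the constant rate of 10 visits per hour, the scale of 1000 and the decay exponent of 1.5 are assumptions, not values from the study.

```python
from itertools import accumulate


def cumulative_visits(visits_per_hour):
    """Return the running total of visits after each hour."""
    return list(accumulate(visits_per_hour))


# Hypothetical "skeleton" page: the same number of visits every hour.
skeleton = [10] * 6

# Hypothetical news story: visits fall off as a power of the time since
# release. The scale (1000) and exponent (1.5) are illustrative only.
news = [round(1000 * t ** -1.5) for t in range(1, 7)]

print(cumulative_visits(skeleton))  # grows by the same amount each hour
print(cumulative_visits(news))      # grows by less and less each hour
```

The skeleton total climbs in equal steps, while each extra hour adds fewer visits to the news story's total, which is the flattening the article describes.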


Barabási's team calculated the "half-life" of a news document, defined as the period in which half of all visitors that eventually access it have visited. The researchers found that the overall half-life distribution follows a power law, which indicates that most news items have a very short lifetime, although a few continue to be accessed well beyond this period. The average half-life of a news item is just 36 hours, or one and a half days after its release. While this is short, it is longer than predicted by simple exponential models, which assume that web page browsing is less random than it actually is.
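The half-life definition above can be sketched in a few lines of Python. This is only an illustration of the definition as the article states it, not the researchers' actual procedure; the function name and the sample visit times are made up.

```python
import math


def half_life(visit_times):
    """Time by which half of all eventual visits have occurred.

    visit_times: hours after release at which each visit happened.
    """
    if not visit_times:
        raise ValueError("need at least one visit")
    ordered = sorted(visit_times)
    # Index of the visit that brings the running count to half the total.
    halfway = math.ceil(len(ordered) / 2) - 1
    return ordered[halfway]


# Hypothetical story: most visits arrive early, a few arrive much later.
print(half_life([1, 2, 3, 4, 10, 50]))  # prints 3
```

In this made-up example the half-life is 3 hours even though the last visit comes at 50 hours, which shows how a few late readers can coexist with a short half-life.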


The short life of a news item -- combined with random visiting patterns of readers -- implies that people could miss a significant fraction of news by not visiting the portal when a new document is first displayed, which is why publishers like to provide e-mail news alerts. The results also show that people read a particular web page not just because it looks interesting but because it can be accessed easily.


Although the average half-life varies for different types of sites, the decay laws identified are likely to be generic because they do not depend on content, but are mainly determined by a user's visiting and browsing patterns.


"Such quantitative approaches to online media not only offer a better understanding of information access, but could have important commercial applications as well – from better portal design to understanding information diffusion, flow, and marketing in the online world," say the researchers.

Edited by wiley