Eventually consistent world-view
Posted by: Manas in Uncategorized on December 19th, 2009
Until some time ago, we built systems that were consistent at all times. When the user took an action, all the data was updated in all places synchronously. If any step failed, the entire transaction was rolled back so that there was no inconsistency in the data.
That’s how it was until some time ago.
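To make the all-or-nothing behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names and the amounts are all made up for illustration; the point is only that when one step of the transaction fails, the earlier step is undone as well.

```python
import sqlite3

# Hypothetical schema: balances are never allowed to go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; return True if it committed."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # everything in the block if any statement raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balance(conn, name):
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)).fetchone()
    return row[0]


ok_small = transfer(conn, "alice", "bob", 30)   # both steps succeed
ok_large = transfer(conn, "alice", "bob", 500)  # second step fails, first is undone
```

After the failed transfer, bob's balance is back to what it was before the attempt, so no reader ever sees money that was credited but never debited.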
Then came a time when the data became geographically distributed, more data had to be manipulated per request, the data resided in multiple repositories and, most importantly, the response time had to be reduced. This led to a new class of design where the data wasn’t always consistent. It was eventually consistent.
Eventual consistency of data meant that you could update part of the data now, part of it later and some of it much later. The reason for doing this was simple: people demanded more real-time responsiveness than was possible while keeping the data always consistent. In the new scheme of things, the application design tries to keep the data consistent for a specific user, while at the aggregate level it may not be. And more importantly, even for a single user, a glitch once in a while has become OK.
Users today have become more tolerant of once-in-a-while glitches, while responsiveness rules.
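The "update part of the data now and the rest later" idea can be sketched with a toy in-memory store. Everything here (the class name, the replica list, the propagate step) is a hypothetical illustration and not how any particular database does it.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: one primary copy, several replicas updated later."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # writes not yet copied to the replicas

    def write(self, key, value):
        # Update one place now and return immediately.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, index, key):
        # May return stale data (or None) until propagate() has run.
        return self.replicas[index].get(key)

    def propagate(self):
        # Copy the queued writes to every replica, in arrival order.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value


store = EventuallyConsistentStore(replica_count=2)
store.write("status", "shipped")
stale = store.read_replica(0, "status")  # replica has not caught up yet
store.propagate()
fresh = store.read_replica(0, "status")  # replica now agrees with the primary
```

The write returns before the replicas are touched, which is where the responsiveness comes from. The price is the window in which a reader of a replica sees the old value.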
Now, let’s look at media. Once upon a time, journalists used to work very hard to come up with a consistent world-view that they could present to their readers. They’d verify the information from multiple sources to ensure that there were no holes in the story. Of course, it was a time-consuming process. It required work that had to be done before the news was broken.
But the very nature of news required speed in breaking it. This speed requirement conflicted with the operating model of corroborating the information from multiple sources, or verifying things to ensure that whatever was reported was always consistent with what actually happened. Competition forced the media to take a route where the news would be broken as soon as it arrived and further verification would be done a little later.
That’s where things became interesting. Now, there was a shift in the model. The responsiveness of a media outlet became more important than its accuracy. You could be wrong once in a while but you had to be first. Earlier, a witness would tell a journalist and the journalist would in turn verify and publish. But now, since the journalist would just publish and verification might happen later, the witness could short-circuit the whole process and publish directly.
The first wave of this transition happened with blogs, and Twitter now rules the scene. It’s all because of responsiveness. It’s true that people can post false information too. But then the right information also trickles along. People may be misled for a while but they are fine with that. They eventually get to know the truth. The world-view with Twitter is eventually consistent. And that’s all that is required.
The media may be in denial for now, but they should understand this very well. They were the ones who invented eventual consistency of world-view to increase responsiveness. Or rather, the world-view of people has always been eventually consistent, and the current media setup was the first to leverage it to increase its responsiveness.
Why Twitter Search can rival Google Search
Posted by: Manas in Uncategorized on June 17th, 2009
There is something about Twitter search that makes people go gaga over it. It’s called real-timeliness. While Google search shows what’s been around for a longish time (in the internet sense of the word), Twitter can show you the latest, right-now kind of stuff.
However, one point is being missed completely in the debate of aged content vs. fresh content. The point is that the PageRank algorithm, when applied to Twitter content, will give better results than when it is applied to the whole of the web.
Let me explain.
The PageRank algorithm essentially computes the popularity of a piece of content in the context of some keywords. So, there are two aspects: popularity and keyword context. Let me take them one by one:
- Popularity: It is measured by finding out how many other pages link to a particular page. Not all links are considered equal. If a link comes from a domain that itself has a lot of incoming links, it is considered far more valuable than a link from a domain that doesn’t have many incoming links. This is in effect a mimicry of the reputation system that’s prevalent in society. If a highly reputed person praises you, that praise is worth more than praise from a less reputed person. And reputation is a function of how much praise you eventually get.
- Keyword Context: When a link is made to a webpage, it is made within a context. That’s the keyword context. Technically, it is called the topical relevance of a link. What it means is that if I praise you and in my praise I say that you are very helpful, the keyword "helpful" gets attached to you. If I say you are very artistic, the word "artistic" gets attached to you.
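The popularity half of this can be sketched as the usual power-iteration form of PageRank. This is a minimal illustration over a tiny made-up link graph, with a damping factor of 0.85 chosen as a common default. It is not how any search engine actually implements it, and it leaves the keyword context out entirely.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally reputed.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own reputation among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" each have exactly one incoming link,
# but "a" is linked from the well-linked "c" while "b" is linked from
# "d", which nobody links to.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["b"]}
ranks = pagerank(links)
```

In this graph "a" ends up ranked well above "b" even though each has a single incoming link, because the praise for "a" comes from a more reputed page.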
Now, let’s see how this PageRank algorithm applies to Twitter.
The popularity of a link can be measured by how many times it has been published on Twitter. Moreover, the more followers the person who tweets the link has, the more brownie points that link gets.
The keyword context for the link is the entire tweet. A tweet is so short that the message enveloping the link has high topical relevance.
All in all, the PageRank algorithm when applied to Twitter will give very good results.
Let’s see who is the first one to exploit this.
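Here is a rough sketch of how those two signals could be combined. The tweet format, the log-based follower weighting and the function names are all my own assumptions for illustration; this does not use any real Twitter API, and the weighting is just one plausible choice.

```python
import math
from collections import defaultdict


def score_links(tweets):
    """Return (scores, keywords) for every link found in the tweets."""
    scores = defaultdict(float)
    keywords = defaultdict(set)
    for tweet in tweets:
        words = tweet["text"].split()
        links = [w for w in words if w.startswith("http")]
        # The rest of the tweet is the keyword context for its links.
        context = {w.lower().strip(".,!?") for w in words
                   if not w.startswith("http")}
        # Assumed weighting: every mention counts once, plus a bonus that
        # grows slowly with the follower count of the person tweeting.
        weight = 1.0 + math.log10(1 + tweet["followers"])
        for link in links:
            scores[link] += weight
            keywords[link] |= context
    return scores, keywords


def search(tweets, keyword):
    """Links whose tweets mention the keyword, most popular first."""
    scores, keywords = score_links(tweets)
    hits = [link for link in scores if keyword.lower() in keywords[link]]
    return sorted(hits, key=lambda link: scores[link], reverse=True)


# Made-up sample data.
tweets = [
    {"followers": 50000, "text": "Great Python tutorial http://example.com/a"},
    {"followers": 10, "text": "python tips http://example.com/b"},
    {"followers": 20, "text": "more python http://example.com/b"},
    {"followers": 300, "text": "cooking recipes http://example.com/c"},
]
results = search(tweets, "python")
```

With this sample data, the link tweeted once by the account with 50000 followers outranks the link tweeted twice by small accounts, and the cooking link does not show up for "python" at all.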
By the way, for a more in-depth understanding of this subject, check out the trail on Search Engine Optimization. Though the primary focus is SEO, it still has good content on understanding how the search engines work.
Computer Devata
Posted by: Manas in Uncategorized on May 12th, 2009
You go to an office, you want a report from them, you ask them for the report. You can’t get the report yet, you must wait. What could be the reasons?
Not-so-long-ago
- The person who creates the report has gone for tea.
- The file from which data needs to be copied is missing. Someone is trying to locate it.
- The person who needs to sign the report missed his bus and is still on the way to office.
And many more like these. But what’s the single reason that I get today when my work can’t be done by the concerned officials?
- The server is down.
That’s all. I couldn’t collect a medical report from Manipal hospital because their server was down and they couldn’t print the report. My Bangalore Airtel numbers are not getting provisioned because their server is down. Once I stood at the Deccan Airways window for 3 hours waiting for a ticket because their server was down.
Life is moving online very fast. However, we are not doing enough to make the online world reliable. The common practice is to just move all the operations and the data online and assume that it will always work.
But the reality is that it doesn’t always work! Wake up guys, build reliability and speed into your systems. Don’t assume they come for free. On the contrary, they cost a lot of money.