Blown To Bits

Archive for the ‘What is information?’ Category

The Internet Could Not Have Been Invented Today

Sunday, September 7th, 2008 by Harry Lewis

If you want to know why not, read “When Academia Puts Profit Ahead of Wonder,” an opinion piece in today’s New York Times. It’s about the unforeseen consequences of the Bayh-Dole Act, which was meant to provide a profit motive to universities, to encourage them to transfer their scientific and technological discoveries to private enterprise as quickly as possible. As a result, the spirit of science and applied science has changed. One of the first things that happens to students today is that they are informed that the university has rights to inventions and discoveries that come about as part of sponsored research. When I wrote some math software in 1968 that enabled users to write equations in ordinary 2-D notation and to see the graphs of those equations on a screen, I don’t think I had even heard the word “patent.” It was just not part of the vocabulary, and the idea that the university might have an interest in my work certainly was not.

If the Internet protocols were developed in a university setting today, the university would almost have to patent them and then give a single private company a long-term exclusive license to use them. The Internet would not be common property, and research at other universities would be restricted by the legal requirement that they negotiate use of the patent rights.

It’s a new world, and not a better one. Jennifer Washburn’s book, University, Inc., which is mentioned in the article, is also excellent, even though it’s a few years old now.

Entwistle’s alias

Monday, June 2nd, 2008 by Harry Lewis

An alias is literally just ‘another’ — another name someone uses, or another identity. An alibi (alias ubi) is ‘another place’ where a suspect in a criminal case claims he was at the time the crime was committed.

The term ‘alias’ has been adopted into tech talk to describe what happens when information is lost in the course of capturing it as bits. When you see the pixellation of a low-resolution image, or the staircase effect on what is supposed to be a straight, smooth line, you are seeing an aliasing phenomenon. The staircase is as close to a straight line as can be drawn using only a few pixels, but if what you were depicting really was a staircase, you’d get exactly the same representation. Different realities, when reduced to bits, wind up as the same representation, and there is no way to know from those bits alone which reality they came from.
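The staircase example can be made concrete with a few lines of Python. This is only an illustrative sketch: the functions `line`, `staircase`, and `rasterize` and the eight-pixel width are made up for the purpose. It samples each shape at the center of every pixel column and keeps only the pixel row the sample falls in.

```python
import math


def line(x):
    """A smooth straight line rising half a pixel per column."""
    return 0.5 * x


def staircase(x):
    """A true staircase: flat treads two columns wide, one pixel high."""
    return float(math.floor(x / 2))


def rasterize(shape, width):
    """Reduce a continuous shape to one pixel row per column.

    Each column is sampled at its center, and the fractional part of the
    height is thrown away. That discarded fraction is the lost information.
    """
    return [math.floor(shape(column + 0.5)) for column in range(width)]


line_pixels = rasterize(line, 8)
staircase_pixels = rasterize(staircase, 8)

# The two shapes differ almost everywhere in the continuous world...
print(line(1.0), staircase(1.0))

# ...but their pixel representations are identical.
print(line_pixels)
print(staircase_pixels)
print(line_pixels == staircase_pixels)
```

Given only the list of pixel rows, nothing tells you whether the original was the line or the staircase.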

Information is always discarded when anything continuous is represented as bits. The question is not whether such data loss happens, but whether it matters. And whether it matters depends on how the representation is going to be used. The author photo on this site is a good representation of us, but not if you wanted to recognize us from behind. In a digital audio file, it may not matter if very high frequencies are discarded, since most people over the age of 20 couldn’t hear them anyway.

What does this have to do with Mr. Entwistle, who is standing trial on charges of murdering his wife and child? We noted earlier that his computer gave up some bits that the prosecution planned to use against him: the URLs of some adult-oriented web sites he had visited. Apparently the prosecution will argue that these bits are relevant because the URLs gave a glimpse of Mr. Entwistle’s sexual dissatisfaction, thus helping establish a motive for the murder. Not so fast: the defense doesn’t deny that those sites were visited, but offers another interpretation of the same bits. As the Boston Herald explains,

Attorney Elliot Weinstein argued turning to steamy online porn sites is not necessarily an indication of a joyless sex life; it could also mean a couple was looking to spice up their marriage.

“It might improve sexual activity . . . it might be a curiosity,” Weinstein said during the final pretrial arguments in Middlesex Superior Court in Woburn.

Searching for porn may just be for “interest,” or “excitement” or to “expand knowledge,” Weinstein added in his appeal to strike any online sex surfing as evidence of prior “bad acts.”

The judge will decide whether these bits are relevant, and if they are, the jury will get to decide whose interpretation of them is more plausible. But the defense’s basic point is sound: decontextualized bits can represent more than one reality, and our digital fingerprints, while revealing, are an imperfect representation of who we really are.

Fighting World Hunger with BITS

Saturday, May 17th, 2008 by Ken Ledeen

As we wrote Blown to Bits, we came to recognize that many of the stories in the news were “bits stories.” Sometimes it’s a bit of a stretch, other times far less so. Consider world hunger.

The price of rice has been rising. A story last month in the New York Times reported that rice-producing countries were cutting back on exports, civil unrest was rising, and a crisis loomed. For populations that spend a large portion of their income on food, populations where rice is often a staple of their diet, these increases can be devastating. It is a complex problem with potentially dire consequences.

But how is it a bits story?

The University of Washington’s “Nutritious Rice for the World” project is seeking to mitigate world hunger by analyzing rice proteins. The goal is to make it possible for farmers to grow rice strains with higher yields, greater resistance to disease, and even improved nutrition. It’s a noble cause, and a difficult scientific and technical problem. The computational needs of the project are enormous. Using conventional computing approaches the time to complete the analysis could well be measured in centuries.

We could just wait for computing power to increase. With computers doubling their performance approximately every year (close enough for this calculation), in a decade they will be 1000X faster. A task that would take 200 years using today’s computers should take 73 days then. Wait another decade, and they will be 1,000,000 times faster and our protein analysis will take less than 2 hours. But the rice crisis won’t wait that long.
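For readers who want to check the arithmetic, here is a small Python sketch. The function names and the 365-day year are our own choices, and the one-year doubling period is the same rough assumption made in the text.

```python
DAYS_PER_YEAR = 365
HOURS_PER_DAY = 24


def speedup_after(years, doubling_period_years=1.0):
    """Speedup factor if performance doubles once per doubling period."""
    return 2 ** (years / doubling_period_years)


def remaining_days(task_years, speedup):
    """Days a task takes once computers are `speedup` times faster."""
    return task_years * DAYS_PER_YEAR / speedup


# Ten doublings give a factor of 1024, which the text rounds to 1000.
print(speedup_after(10))

# With the rounded figures used in the text:
days_in_one_decade = remaining_days(200, 1_000)
hours_in_two_decades = remaining_days(200, 1_000_000) * HOURS_PER_DAY
print(days_in_one_decade)    # 73 days
print(hours_in_two_decades)  # a little under 2 hours

# With the exact powers of two the waits come out slightly shorter.
print(remaining_days(200, speedup_after(10)))
print(remaining_days(200, speedup_after(20)) * HOURS_PER_DAY)
```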

The team at the University of Washington had a better solution: harness the aggregate computing capacity of thousands and thousands of otherwise idle computers, your computer and mine (if you choose to participate). They joined the World Community Grid project.

A computing grid is a loose collection of computers that work cooperatively, each doing a small portion of a large computing task. It’s similar to the way Google works – dividing the processing of your query across lots and lots of computers so that the response is fast. This particular grid joins technology and social involvement, allowing individuals to “contribute” unused computer time. In addition to analyzing rice proteins, WCG now has active programs for cancer research, AIDS, protein folding, dengue fever, and more. The WCG harnesses the computing power of over 1,000,000 computers from more than 380,000 participants.
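The divide-and-combine idea behind a grid can be sketched in a few lines of Python. This is a toy under stated assumptions, not how the WCG software actually works: threads on one machine stand in for volunteer computers, and the hypothetical `analyze` function (a sum of squares) stands in for the real protein computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(items, unit_size):
    """Cut one big job into small, independent work units."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]


def analyze(unit):
    """Placeholder for the expensive computation done on one work unit."""
    return sum(x * x for x in unit)


def run_on_grid(items, unit_size, volunteers):
    """Hand work units to a pool of 'volunteers' and combine their answers."""
    units = split_into_work_units(items, unit_size)
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        partial_results = list(pool.map(analyze, units))
    return sum(partial_results)


job = list(range(100))
grid_answer = run_on_grid(job, unit_size=10, volunteers=4)
single_machine_answer = analyze(job)
print(grid_answer == single_machine_answer)
```

The approach works because each work unit can be processed without knowing anything about the others, so it does not matter which volunteer gets which unit or in what order they finish.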

This is one more example of the transformative power of the digital revolution. Not only is it possible to do complex protein structure analysis, but we can also share the task across thousands, even millions, of computers linked through the Internet. These computers belong to ordinary citizens of the world with a shared purpose, part of a community made possible only by the social connectivity that the Web engenders and supports.

Even world hunger is a bits story.

Electronic Medical Records Dangerous to Your Health?

Tuesday, May 13th, 2008 by Harry Lewis

Writing in the April 17 New England Journal of Medicine, Pamela Hartzband and Jerome Groopman throw cold water on the way the electronic medical record is being “touted as a panacea for nearly all the ills of modern medicine.” Among the touters they mention are George Bush, Michael Bloomberg, the presidential candidates, insurance companies, Google, and Microsoft. Electronic records are now in wide use, especially in leading teaching hospitals. But in actual practice, they observe, the record is sometimes simply “clinical plagiarism,” in which “physicians have clearly cut and pasted large blocks of text, or even complete notes, from other physicians.” It is now so easy to drop in a patient’s lab results in their entirety, that finding the wheat in the bushels of chaff is “like ‘Where’s Waldo?’,” according to one of the authors’ colleagues. Cutting and pasting for completeness replicates garbage and gold indifferently, making the electronic record “a powerful vehicle for perpetuating erroneous information.” “The worst kind of electronic medical record,” they write, “requires filling in boxes with little room for free text.” In a brief clinic visit, physicians may spend most of their time pointing and clicking rather than talking to the patient. I am reminded of the state automobile inspection process, at the end of which the driver is handed a printout of the dozens of details the inspector “checked” on the screen but not on the vehicle.

In the summer of 1971, KSL needed a programmer for his pioneering startup company, Computer Systems for Medicine, Inc. The company’s systems would take medical histories from patients. The system was a DEC PDP-8 computer and a teletype machine. HRL was starting graduate school and needed to earn a few bucks over the summer. He did the coding. He managed to wedge the entire program into 4K of 12-bit PDP-8 words. The working system was operationally a marvel; the branching logic was much more efficient at homing in on problems than the old fill-it-all-out paper history forms, and as Weizenbaum had discovered five years earlier with ELIZA, people would tell the computer things they did not feel comfortable telling a living, breathing human being.

That was 37 years ago. The engineers have been much more successful at increasing the storage capacity of computing devices than society has been at figuring out how to make good use of all those bits that are now captured, reproduced blindly, and, very often, never examined critically.

The Underground Bits Economy

Thursday, April 10th, 2008 by Hal Abelson

One sign of a maturing industry is the development of aftermarkets. First there were cars, then there were used car dealers. And first there were bits, and then there were … used bits dealers? Some used bits transactions are legit, if possibly annoying. You give Sam’s Health Foods your email address so Sam can confirm your order for organic bean sprouts, and the next thing you know, you are receiving emails from Mary’s Gardening Tools. Sam decided to share his email address files with Mary, and Mary thinks that bean-sprout-eaters are more likely than other people to be gardeners. Of course, this is the kind of “sharing” that puts a few bucks in Sam’s pocket.

Other used bits dealers are like the people who steal catalytic converters and fancy headlamps from late-model cars and then sell them on the black market. There is a robust underground economy in bank account numbers, credit card numbers, eBay accounts, and even full identities. According to the Symantec Global Internet Security Threat Report (available as a free download), the going rate for bank account numbers is $10-$1000, while credit card numbers are $0.40-$20.00 each (but are usually sold in bulk). Bank account numbers cost more, because getting money from a bank account is quicker and, if properly done, leaves fewer fingerprints than converting a credit card number to cash. Identities go for $1-$15, but EU identities cost more than US identities, perhaps because of rising demand.

It’s a fascinating report. Symantec is in the security business, but many of the trends and recommendations are of general interest, unrelated to Symantec’s products. For example, the robust market in bank account and credit card numbers has made services like PayPal increasingly popular. Such electronic payment systems are guaranteed against misuse and they do not require revealing any financial information to the online store.