Archive for the 'IT/Comm' Category

Can you spot the spam source?

Saturday, August 19th, 2006

McAfee SiteAdvisor offers quizzes to test users’ skills at spotting sites that might lead to spam and spyware. I found them interesting. It’s not always possible to tell what site may lead to spam simply based on the site’s looks. And in some cases you have to do a reasonably careful reading of the site’s privacy policy to figure out whether use of the service may result in hundreds of spam messages within a few days of signing up.

This is an interesting idea, a potentially neat way to educate users about spam and spyware problems. The tool is lacking significantly in one domain though. I think it would be MUCH more useful if the results page included an analysis of the privacy policies to point out to users what exactly should serve as a red flag in the various policy statements.

The survey I administered last Winter to a sample of 1,300+ college students about their Internet uses included a question about how often, if ever, students read a site’s privacy policy. It turns out that 37% of respondents never do so and an additional 41% only do so rarely. No wonder people are still struggling with spam problems.

Unfortunately, at some level it doesn’t matter what you do if your friends are not careful with your address. I have a very private address I had only given out to a few dozen people emphasizing several times that they should never enter it on any Web sites (e.g. ecards or whatnot) and should only use it for one-on-one communication (so also requesting that they avoid its inclusion on cc lines). Some of my friends couldn’t follow these requests and now the address receives about 40 spam messages/day. I realize that’s not a lot in the grand scheme of things, but the point is that none of that was due to anything I had done with the address given that I had never entered it on any Web sites and had only ever used it to send one-on-one emails to a few dozen people.

The AOL data mess

Monday, August 7th, 2006

Not surprisingly this is the kind of topic that spreads like wildfire across blogland.
AOL search data snippet

AOL Research released (link to Google cache page) the search queries of hundreds of thousands of its users over a three month period. While user IDs are not included in the data set, all the search terms have been left untouched. Needless to say, lots of searches could include all sorts of private information that could identify a user.

The problems in the realm of privacy are obvious and have been discussed by many others so I won’t bother with that part. (See the blog posts linked above.) By not focusing on that aspect I do not mean to diminish its importance. I think it’s very grave. But many others are talking about it so I’ll focus on another aspect of this fiasco.

As someone who has research interests in this area and has been trying to get search companies to release some data for purely academic purposes, I find an incident like this extremely unfortunate. Not that search companies have been particularly cooperative so far (not surprisingly, given this case), but chances for future cooperation in this realm have just taken a nosedive.

To some extent I understand. No company wants to end up with this kind of a mess on their hands. And it would take way too much work on their part to remove all identifying information from a data set of this sort. I still wonder if there are possible work-arounds though, such as allowing access on the premises or some such solution. But again, that’s a lot of trouble, and why would they want to bother? Researchers like me would like to think we can bring something new to the table, but that may not be worth the risk.

Note, however, that dealing with sensitive data is nothing new in academic research. People are given access to very detailed Census data, for example, and confidentiality is preserved. From what I can tell the problem here did not stem from researchers, it was someone at AOL who was careless with the information. But the outcome will likely be less access to data for all sorts of researchers.

Another question of interest: Now that these data have been made public what are the chances for approval from a university’s institutional review board for work on this data set? (Alex raises related questions as well.) Would an approval be granted? These users did not consent to their data being used for such purposes. But the data have been made public and theoretically do not contain any identifying information. Even if they do, the researcher could promise that results would only be reported in the aggregate leaving out any potentially identifying information. Hmm…
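To make the "aggregate only" idea a bit more concrete, here is a minimal Python sketch. It assumes the data can be read as (user ID, query) pairs and that a query is reported only if at least `min_users` distinct users issued it. Both the record format and the threshold are assumptions for illustration, not anything AOL or a review board has specified, and a threshold like this reduces the risk of exposing individuals without guaranteeing anonymity.

```python
from collections import defaultdict


def aggregate_queries(records, min_users=5):
    """Count distinct users per query and keep only widely shared queries.

    records   -- iterable of (user_id, query) pairs (assumed format)
    min_users -- hypothetical reporting threshold; rarer queries are dropped
    """
    users_per_query = defaultdict(set)
    for user_id, query in records:
        # Normalize lightly so "Maps " and "maps" count as the same query.
        users_per_query[query.strip().lower()].add(user_id)

    # Report only the number of distinct users, never the user IDs themselves.
    return {
        query: len(users)
        for query, users in users_per_query.items()
        if len(users) >= min_users
    }


sample = [
    (1, "maps"),
    (2, "Maps "),
    (3, "maps"),
    (1, "my home address"),  # issued by one user only, so it is dropped
]
print(aggregate_queries(sample, min_users=3))
```

The one-off query, which is the kind most likely to identify someone, never appears in the output.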

For sure, this will be a great example in class when I teach about the privacy implications of online behavior.

Not surprisingly, people are already crunching the data set; here are some tidbits from it.

A propos the little snippet I grabbed from the data (see image above), see this paper of mine for an exploration of spelling mistakes made while using search engines and browsing the Web. About a third of that sample was AOL users.

The image above is from data in the xxx-01.txt file.

Scrollable ads

Monday, August 7th, 2006

GMail does something very smart with the Sponsored Links it displays in the Webclips area just above the message view area: it lets the user scroll back and forth among the ads.

Maybe I’m an odd one for actually looking at ads on occasion, but sometimes they do tell you about helpful or interesting information and services. So I like to click on them sometimes. However, more often than not, I just glance at them in the corner of my eye as I am about to move to another page. What then happens is that the ad changes. In GMail, I can just click on the back button in Webclips and get the ad (or whatever RSS feed I may have missed).

GMail Webclip

On most sites this is not possible (e.g. Yahoo! Mail). If you click the back button of your browser, chances are that some other ad is dynamically generated on the page you were just viewing by the time you return to it. It’s a bummer as some of those ads could be of interest to users a split second later.

Data ain’t just for geeks anymore

Wednesday, June 28th, 2006

Via Jim Gibbon I’ve discovered Gapminder. Wow! It’s a wonderful visualization tool for data. The focus is on world development statistics from the UN. The tool is incredibly user-friendly and lets you play around with what variables you want to see, what you want highlighted in color, whether you want to log the data, what year you want to display, and whether you want to animate the time progression (oh, and how quickly).

I’ve made an example available on YouTube. (I used Gapminder to create the visualization and Hypercam to capture it.)

Here is some context for that particular graph. My first interests in research on Internet and social inequality concerned the unequal global diffusion of the medium. I wrote my senior thesis in college on this topic and then pursued it further – and thankfully in a more sophisticated manner – in graduate school. So this is a topic that has been of interest to me for a while and it’s great to be able to play with some visual representations of the data.

So what you have on the video graph is a look at Internet diffusion by income (logged) from 1990-2004. I picked color coding by income category, which is somewhat superfluous given that the horizontal axis already has that information, but I thought it added a little something. (For example, the puzzle of my 1999 paper, the first to run more than bivariate analyses on these data, was why all the red dots are so widely dispersed on the graph despite all representing rich long-term democratic countries.)

Thanks to the tool’s flexibility, you can change it so that the color coding signifies geographical region and could then tell immediately that what continent you are on – an argument some people in the literature tried to make – has little to do with the level of Internet diffusion.

Gapminder example

Imagine the possibilities of all this in, say, classroom presentations. Jim links to a great presentation using this tool. (Although I disagree with the presenter’s conclusion at the end about the leveling of differences regarding Internet diffusion.)

I recommend checking out the tool on your own for maximum appreciation of its capabilities.

Annotated maps

Tuesday, June 27th, 2006

As you may have noticed by now, I like maps. In fact, geography was the only elective I took in high school, two optional years in addition to the two required (no, I didn’t go to high school in the U.S. as you are likely able to guess from that info). Those classes included lots of material of less interest to me (e.g. leading mineral producers in the world and what shrubs grow in the tundra), but we also got to look at maps a lot, which was the main reason I was hooked.

Image Hosted by Free image hosting*

Given these interests, I was excited to find Quikmaps this morning, a service that lets you annotate Google Maps, save them, go back and edit them, and in the meantime post them on your Web site. There have been other related services (GMapTrack comes to mind), but none have managed to do this as well as Quikmaps. I have been using Wikimapia for some map annotation purposes, but it’s not so good when the locations you are specifying have limited appeal. The one problem with such independent little upstarts is you never know how long they’ll be around (e.g. GMapTrack is nowhere to be found) so it’s not clear how much time and effort one should spend creating maps.

Nonetheless, if you want to explain to someone how to find you or want to annotate your favorite locations (or just restaurants) in town, this seems like a very helpful service.

[thanks]

[*] I have purposefully avoided embedding a map here. I don’t want CT page loads to be too taxing on the Quikmaps site. It should be busy enough dealing with the digg effect.

Bloggers on survey findings

Friday, June 16th, 2006

Rob Capriccioso of Inside Higher Ed reports on what Glenn Reynolds of InstaPundit, Markos Moulitsas Zúniga of Daily Kos and Jessica Coen of Gawker think about college students’ lack of interest in political blogs and Beltway gossip.

While I appreciate that they are happy with students spending their time on things other than politics, their responses ignore the fact that students do follow news, they just don’t do so on political blogs. All of the responses present time spent on these blogs as competition for time spent having fun with friends. However, findings from the survey suggest that students do follow current events (59% look up local or national news daily or weekly; 44% look up international news that frequently) so it’s not as though students only care about sex and beer. Granted, the survey doesn’t ask about the specific type of news they follow, but chances are that some of the material overlaps with topics covered on these blogs.

Additional info in the article includes my response to the inevitable question: “What about porn?”.

What do college students do online?

Wednesday, June 14th, 2006

How does the popularity of Facebook compare to MySpace among a diverse group of college students? What types of blogs are students most likely to read? How many have ever visited Instapundit or Daily Kos?

As mentioned earlier, last month I gave a talk at the Beyond Broadcast conference hosted at Harvard Law School. The conference folks have now made the presentations available in both audio and video format. You can listen to or watch my talk misleadingly titled “Just a Pretty Face(book)? What College Students Actually Do Online”. (The title is misleading, because the talk is not about Facebook or even social-networking sites more generally speaking. Rather, it’s about what young people do online and how it differs by type of background.) I have put the presentation slides online in case you are curious to see the specifics (those are hard to follow on the video and there wasn’t enough time for me to mention stats in the presentation).

I should note that these are all still preliminary findings as I need to do more data cleaning and there’s tons more to do on the analysis front. But I don’t anticipate major changes in the findings presented given the size of the sample.

If you prefer text over these various other options I will be writing up the findings this summer and will post a link once it’s done. But if you can’t wait to find out the answers to the above questions then I recommend clicking on one of the above links. (All this information is toward the end of the presentation.)

Okay, fine, I won’t make it that difficult. The quick answers to the above questions are (again, for this group of college students):
1. Facebook is more popular (Facebook 78%, MySpace 51%)
2. Political blogs are the least popular type of blogs (from among the ones asked, which included personal journals, arts/culture/music, technology, sports)
3. 1% have ever visited each

There’s lots more info in the presentation.

Recall that many of you took a survey back in January here on CT about your use of various sites and services. I haven’t forgotten that I still owe you a summary of the responses and that is forthcoming as I analyze the college student Internet use data. I thought reporting the former may be more interesting in the context of the latter thus the delay.

Seminar on “The Wealth of Networks”

Tuesday, May 30th, 2006

Crooked Timber is running a seminar on Yochai Benkler’s The Wealth of Networks. The book discusses several important and interesting issues and we’re hoping that these comments will only be a start of conversations about them. The introductory post has links to all of the contributions (by Henry Farrell, Dan Hunter, John Quiggin, Eszter Hargittai, Jack Balkin and Siva Vaidhyanathan) including a response from Benkler.

Weekend map projects

Friday, May 26th, 2006

WikiMapia

There are some exciting developments in the online map space these days. WikiMapia is a wiki approach to Google Maps that lets you add notes and tags to maps all over. MapCruncher is a program that lets you draw maps on top of other maps (or something like that). I haven’t been able to try out the latter yet due to some of the requirements, but I’m hoping it’ll come together soon as it sounds very promising.

[thanks and thanks]

UPDATE: I finally got MapCruncher to work. It requires Windows XP and the .NET 2.0 runtime, which is not as obvious as the Web site makes it sound. Also, rendering the map (overlaying the north-side map of the Chicago El on Virtual Earth) took about 18 minutes, not the 5-10 the site suggests.

Beyond Broadcast

Friday, May 12th, 2006

Berkman in Second Life
Today (Friday), the Berkman Center for Internet & Society at the Harvard Law School is hosting a conference on Reinventing Public Media in a Participatory Culture. In addition to the face-to-face discussions, the conference is also integrating digital media in neat ways for participation by those who can’t be at the meeting physically. For example, there is a Berkman Island (including a 3D replica of the Ames Courtroom at the Harvard Law School) in Second Life. If you get a chance, come join us, it looks like there will be some very interesting presentations and discussions.

A flickr of new spam

Thursday, April 6th, 2006

Recently, I have received a few requests from Web sites asking permission to use my photos posted on Flickr. Of course, there is a flattering element to all this. Wow, someone thinks some of my photographs are worthy of being reproduced. Perhaps not surprisingly, however, these requests are rarely for photos I consider particularly good or interesting.

The last such email I received had a curious subject line: “Re: Your jennifer Aniston Photographs”. I don’t have any “jennifer Aniston” photographs, not any I can recall. That was clue #1 as to the possibly fishy nature of the message. Clue #2: the link provided in the email that I should click if I was interested in sharing my photos with the site’s members seemed to be an individualized link (a sequence of numbers after a generic URL) suggesting that my response was being tracked. The URL had “flickr” in it, a convenient way to confuse people and have them think that they’re simply clicking on a Flickr photo link. No, it was a link to the site being advertised by the message.
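The trick of putting “flickr” somewhere in the URL works because a quick glance doesn’t separate the host from the rest of the address. As a rough sketch of the check I was doing by eye, the Python below parses a link and tests whether its host really is flickr.com, and whether the last path segment is all digits, the way an individualized tracking ID might be. The function names and example URLs are made up for illustration, and a numeric segment is only a hint, since plenty of legitimate links end in numbers too.

```python
from urllib.parse import urlparse


def is_flickr_host(url: str) -> bool:
    """True only if the hostname is flickr.com or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == "flickr.com" or host.endswith(".flickr.com")


def has_numeric_token(url: str) -> bool:
    """True if the last path segment is all digits (a possible tracking ID)."""
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    return last_segment.isdigit()


# Hypothetical link of the kind described above: "flickr" appears in the
# path, but the host is some other site entirely.
suspicious = "http://photos-site.example/flickr/48151623"

print(is_flickr_host(suspicious))     # host is not flickr.com
print(has_numeric_token(suspicious))  # ends in a run of digits
```

Checking the parsed hostname rather than searching the whole string is the point here: a plain substring search for “flickr” would be fooled by exactly this kind of link.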

Yes folks, I think these supposedly flattering messages are all about advertisements for the sites in question. They don’t really care to use our photos, they are mostly just interested in getting the word out about their sites and services. Some of them at least put in some effort by looking up a relevant photo to suggest for inclusion. But others don’t even bother to pretend that they have any connection to you other than including you in a new type of spam scheme.

I know there are several Flickr users who read my blog. I have heard from one of you about a similar experience. Anyone else? I’m purposefully not listing the sites that have contacted me, I’m not going to play along. However, I’m curious if anyone else received a message from “Calder” with the cryptic link.

A matching problem

Saturday, April 1st, 2006

This year’s Google April Fool’s joke is Google Romance, a service that will help you find your romantic match. It’s sort of cute, although I think some of their past jokes have been better.

The site does bring up something I have been meaning to blog about so I’ll take this opportunity. It concerns the paradox of matching services such as dating Web sites or job search sites. I haven’t thought about this issue too much, but enough to blog about it. (What’s the threshold for blogability, by the way?:)

Services such as dating and job search sites promise the user to find a perfect match, whether in the realm of romance or the labor market. But deep down, is it really in the interest of these sites to work well? After all, if they do a good job then the seekers are no longer relevant customers and the sites lose their subscribers.

One way to deal with this is to offer additional services that go beyond the matching process. For example, the match-making site eHarmony now has a service for married couples. It is an interesting idea. It seems like a reasonable way to expand their user (subscription!) base so they are not dependent on keeping matchless those whom they promise to connect. Moreover, I can see that they may have quite a loyal user base in those whom they helped find their matches. Job sites can also offer services that go beyond the initial match. Nonetheless, I think there is an interesting tension in all this.

On a not completely unrelated note: Happy Birthday to GMail! Fortunately, that was not an April Fool’s two years ago. I came across the Google Romance notice on Google’s homepage, because I saw the GMail birthday icon and wanted to see if they had it in bigger on the Google homepage (a page I never visit otherwise, because why would I in the age of search toolbars). The birthday image is not reproduced there, but I did see the Romance link. (Yes, I’m obsessed with knowing how people end up on various sites and I’m projecting here by assuming that anyone else cares.)

The Electronic Frontier Foundation

Monday, March 20th, 2006

Two blogs with which I am affiliated – Crooked Timber and Lifehacker – made the top 10 list of referral blogs to EFF’s fundraising campaign. I was so glad to hear that!


Support Bloggers’ Rights!

Geekier than geeky

Thursday, March 16th, 2006

You may have to be a pretty particular breed to appreciate the following, but I can’t be the only one around here.:) I found this Web 2.0 or Star Wars Character quiz quite entertaining. I scored 33 and while it is probably a sign of something positive that I didn’t score higher, I was still a bit disappointed. My point range gets the following recommendation: “As your doctor, I recommend moving out of your parents’ basement.” The whole thing is quite amusing, try it. Don’t look at the score chart until you’ve taken the quiz, you don’t want to spoil that part of the fun.

Favorite tech writing?

Thursday, March 16th, 2006

The University of Michigan Press is putting together a volume called The Best of Technology Writing 2006. The editorial team is soliciting suggestions for pieces, including blog posts.

[W]e’re asking readers to nominate their favorite tech-oriented articles, essays, and blog posts from the previous year. The competition is open to any and every technology topic–biotech, information technology, gadgetry, tech policy, Silicon Valley, and software engineering are all fair game. But the pieces that have the best chances of inclusion in the anthology will conform to these three simple guidelines:

    1. They’ll be engagingly written for a mass audience; if the article requires a doctorate to appreciate, it’s probably not up our alley. Preference will be given to narrative features and profiles, “Big Think” op-eds that make sense, investigative journalism, sharp art and design criticism, intelligent policy analysis, and heartfelt personal essays.

    2. They’ll be no longer than 5,000 words.

    3. They’ll explore how technological progress is reshaping our world.

The resulting publication will be available both in book form and online.

Hop on over to digitalculture.org for more information and to submit your nominations.

Unique photo gift ideas

Sunday, February 5th, 2006

Here is another Lifehacker feature for you by yours truly: Unique photo gift ideas. Note the kid illustration. That’s me.:) I decided to live out the 15 minutes of baby fame I never got back when. Once back at my machine, I’ll post the image on Flickr in full size and will add a link here, in case you’re curious.

UPDATE: I’ve posted the images: greeting card, movie poster.

Quick survey: sites and services

Tuesday, January 24th, 2006

It’s been a while since we’ve had a survey around here. This one is on what sites and services you know about and use. It should take no more than 2.5 minutes. I’ll report back with results and why I am interested in this in a few days.

Take the survey. Thanks!

A twist on online communities

Friday, January 20th, 2006

Judging from my posts around here – not to mention my daily browsing habits – I’m obsessed with Flickr. I wanted to take a step back and give a bit of basic info about the site to those who are not that familiar with it. It is my way of trying to spread all that Flickr goodness to more people.

Flickr may seem like no more than a photo-sharing Web site, but it’s actually much more than that. It is a large community of people sharing images, yes, but also learning about a myriad of topics, exploring nearby and distant lands, and communicating with people from all over the world. In some ways it resembles corners of blogworld. One important difference is that a good chunk of the communicating is done through images rather than text.

Flickr can help you get to know people in all sorts of ways through their photos (and I don’t just mean by looking at what they had for dinner, although frankly, if the cook or restaurant is a good one, that can be interesting as well), you can also get to know cities (e.g. the Guess Where Chicago and Guess Where NYC groups are both fun and informative), learn about healthy foods, read thought-provoking (or not) quotes, and much more.

In case you don’t need these basics, perhaps you’ll find some helpful tips in my guide to finding great photos on Flickr published yesterday on Lifehacker. Consider that the second installment to this post.

Here are some of the basic features of the site. Some of the links below will only work if you are logged in to the system. If you have a Yahoo! account then you are all set. If not, sign up for a free account now, you won’t regret it.*

  • At the most basic level, Flickr is for uploading and sharing your photos. There are several tools available for this from uploading in the browser to stand-alone applications (and even widgets). Or you can forward your cameraphone photos directly to your account.
  • Once you have uploaded your pictures, you can make them completely public, only accessible to contacts designated as family, only accessible to contacts designated as friends, accessible to both family and friends, or completely private.
  • You can post photos under a Creative Commons license allowing others to use your images depending on the specifics. You can set a default license for all your uploads.
  • You can mark other people’s photos as your Favorites if you want to have easy access to them later. You do this by clicking on the Add to Faves button above the photo.
  • You can organize your photos into Sets. You can create new Sets under Organize. Also, once you have a Set, you can add a picture to it by clicking the Add to Set icon above the image.
  • You can join Groups based on various themes and topics. Click on Groups and then do a search on a topic of interest. Choose the group and join it as a member. Once you are a member of a group, you can add photos to it. To add one of your photos to a Group, click on the Send to Group icon above the photo you are viewing. (You can only add your own photos to Groups.)
  • You can create Groups (private, invitation-only or completely public) organized around themes. If public then others can contribute their own photos to your group. Groups can also have ongoing discussions.
  • You can comment on others’ photos. You can also easily follow whether people have commented on or favorited any of your photos. The system also lets you see all the comments you have made on others’ photos and whether photos you have commented on have received additional comments.
  • You can add notes to your photos (or others’ photos if they allow it) by clicking on the Add Note tab above the image. Drag the box to the area on the photo that you want to annotate and add your comment.

As you can tell by this list of features, much of Flickr goodness comes from sharing photos with others in various systematic ways. There is also a lot of communicating that gets done in the comments and on the notes to photos.

Now that you know some of the basics of the site, you may be interested in this guide to finding great photos on the system.

* I am not affiliated with either Flickr or Yahoo!, I just think Flickr is a super service and want to help people understand it better so they become members of the community.

GMail Delete button!

Friday, January 20th, 2006

It looks like the makers of GMail have finally succumbed to pressure and have added a Delete button to all messages and folder views. I think it would be interesting to see all the internal discussions that surrounded the evolution of how one gets rid of messages in GMail: “Move to Trash” -> “Trash this Message” -> “Delete”. Finally.

Celebrating ten years of First Monday

Tuesday, January 17th, 2006

The journal First Monday started publishing IT-related articles on the Web in May 1996. The entire archives of the journal have remained freely accessible to the public over the years. First Monday will be celebrating its 10th anniversary this coming May in Chicago with a conference appropriately focusing on issues concerning open collaboration on the Internet. In line with the journal’s history and the meeting’s topic, the program and related materials will be available online for all to see. Submissions are due February 6, 2006.