"So who can trust all those ruthless spying stealing slimeball Internet companies anyway?" Because of hype, media, and frustration about so many problems that I saw as easy to fix that weren't getting fixed, this is the question that weighed on my mind for many years.
Who is leaking data and how fast? Who can be trusted and who can't?
In the fall of 2004, with the help of Dr. Soule, I focused the question into the premise of an experiment. I would register with several web services and see how fast and how far the e-mail address spread.
Step 1. 100 e-mail accounts
We asked our school system administrator for help in creating 100 new unique e-mail addresses. I wanted to use a different e-mail address for every site and each sign up on that site.
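The one-address-per-sign-up bookkeeping can be sketched in a few lines of Python. This is only an illustration: the `test` prefix, the `example.edu` domain, and the names `make_addresses` and `AddressLedger` are hypothetical, and the real accounts were created by hand by the system administrator.

```python
def make_addresses(n, prefix="test", domain="example.edu"):
    """Return n unique addresses such as test001@example.edu (hypothetical names)."""
    return [f"{prefix}{i:03d}@{domain}" for i in range(1, n + 1)]


class AddressLedger:
    """Hands out each address exactly once and remembers which sign-up got it."""

    def __init__(self, addresses):
        self._unused = list(addresses)
        self.assigned = {}  # (site, signup) -> address

    def assign(self, site, signup):
        """Return the address for this sign-up, allocating a fresh one if needed."""
        key = (site, signup)
        if key in self.assigned:
            return self.assigned[key]
        if not self._unused:
            raise RuntimeError("no unused addresses left")
        address = self._unused.pop(0)
        self.assigned[key] = address
        return address

    def who_leaked(self, address):
        """Reverse lookup: which sign-up was this address given to?"""
        for key, assigned in self.assigned.items():
            if assigned == address:
                return key
        return None
```

Because no address is ever reused, any message arriving at an address points back to exactly one sign-up.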
Step 2. Configure Microsoft Outlook
The only ways to access the e-mail accounts were a web tool (ick!) or a mail reader. Outlook may not have been the best choice, but it did the job. After setting up all 100 accounts (manually, one at a time), I was able to check the progress of each quite simply.
In a later part of the experiment, I used Outlook to export the results to a Microsoft Excel spreadsheet for faster and more efficient analysis. Note that for some reason, Outlook exported all fields that I needed EXCEPT for the "Received Date". Come on Microsoft! How dumb is it to export all fields from an e-mail box except for the date! Ok, calming down now… breathe…. breathe…
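One way around a missing date column, assuming the messages can be saved as raw text (which is not how the experiment was actually run), is to read the `Date` header directly with Python's standard `email` package. Note that `Date` records when the sender says the message was sent, which is close to the received date but not identical. The function names below are made up for this sketch.

```python
import csv
import io
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def summarize(raw_message):
    """Extract the fields needed for analysis from one raw message."""
    msg = message_from_string(raw_message)
    sent = parsedate_to_datetime(msg["Date"])
    return {
        "to": parseaddr(msg["To"])[1],
        "from": parseaddr(msg["From"])[1],
        "subject": msg["Subject"],
        "date": sent.date().isoformat(),
    }


def to_csv(raw_messages):
    """Return CSV text with one row per message, date column included."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["to", "from", "subject", "date"])
    writer.writeheader()
    for raw in raw_messages:
        writer.writerow(summarize(raw))
    return out.getvalue()
```

The resulting CSV text opens in Excel like any other spreadsheet export.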
Step 3. Selecting the Companies
- They must offer a web-based service
- The service must be free
- They must collect an e-mail address as part of the sign up process
- They must be popular or well known (there were some exceptions) with particular focus on companies that advertise heavily through Spam, banners, or pop-ups
- The company behind the service must be based in, or operate legally in, the USA
Note that we did not use services we knew would generate Spam such as pornography, gambling, and warez sites. The point of the experiment was to test sites that people generally trust.
Here is the final list of companies (after botched sign ups were removed):
PC World Magazine
University of Phoenix
Big 5 Sporting Goods
Medical Hair Restoration
Some of the lesser-known companies were added as a result of my personal use of the Internet. Any company I came into contact with during the sign-up phase, I added.
Step 4. Signing Up
We set a time limit of about 3 weeks for the sign-up period, figuring we'd account for the time differential later. The sign ups were done under the following three profiles:
- Normal Account
This user doesn't really know much about computers or is too impatient to watch what they're doing. They will go through the sign up process as quickly as possible without bothering to read fine print or make changes to default options.
Desired result – For this user, we should get only the newsletters or notices that we didn't opt out of during set up.
Note that for all Normal accounts, I entered my actual mailing address along with a code (customized for each company) so I could also track any junk mail that resulted from the sign up.
- Savvy Account
This is the more knowledgeable user who knows to look for the check boxes that opt out of lists and sharing when possible. These users will often put only the data that's necessary for the sign up process and no more (and many will put fake data for everything but the e-mail).
Desired result – We shouldn't get anything except for companies that hide options in the account pages.
- Maxx Account
This user is far more rare. They will do the sign up process, then log into their account and scan all the profile options, looking for and opting out of any mailing lists or data sharing options that are set by default. There aren't too many of these types of sign up since most companies give you all relevant options during the sign up process.
Desired result – There should be no e-mails other than system messages.
To manage these three profiles, I first did a Normal account. If the service presented any privacy or data-sharing options, I returned and did a Savvy sign up with a different e-mail address. If, after signing up and logging in, I was able to find any options that weren't present during sign-up, I would create a Maxx account.
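The rule for deciding which accounts to open on a given service is simple enough to write down. A minimal sketch, with a hypothetical function name and two yes/no inputs that I had to judge by eye for each site:

```python
def profiles_to_create(options_at_signup, extra_options_after_login):
    """Return the account profiles to open for one service.

    options_at_signup: the sign-up form showed privacy or data-sharing options.
    extra_options_after_login: the account pages held options the form did not.
    """
    profiles = ["Normal"]  # every service gets a Normal account first
    if options_at_signup:
        profiles.append("Savvy")
    if extra_options_after_login:
        profiles.append("Maxx")
    return profiles
```

A service with no visible options at all therefore gets a single Normal account, and no service gets more than three.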
During the sign-up period, some additional profile types were clearly needed. Each of these I consider "casual contact" meaning that I provide my e-mail address for a service, but expect it to be one-time use.
Sites that say "send this site to a friend" or "send a card". In these cases you enter the e-mail address of some hapless victim and then what? Do they get Spam too?
Desired result – The introduction message should be sent to my friend, and I should receive at most two messages stating that the message has been sent or received.
In these cases, I just ordered a service or product from a site and don't want any further interaction with them. Will they hold on to my data and use it for their benefit?
Desired result – Messages relating to billing and shipping only.
What happens when I just send site feedback? Or ask customer service a question?
Desired result – Messages directly related to the feedback only.
Will they actually leave us alone if we check all those boxes to opt-out?
The purpose of opening up to three accounts per service was to measure to what degree companies honored our choices. We theorized that at least some of these companies would share the data or send us e-mail despite our requests to be left alone.
Step 5. Sit Back and Wait
Design, set up, and sign-up periods were completed October 1st, 2004. Now that everything was in place, all we had to do was watch the Spam roll in… or so we thought.
Spammed to Oblivion!
Well, not really. The original plan was to not open any e-mails or click on any links in them, for two reasons:
- We knew that the companies can tell when e-mails with graphics are opened. This could have caused them to change their behavior by realizing that we were at least looking at their messages
- Clicking on links to other companies within the e-mail will tell the destination company where we came from. This could also affect the results, either by encouraging the destination company to start spamming us or by alerting the sending company that we're suckers who open links and that they should send us more mail with third party links
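Both concerns come down to the same mechanism: an HTML e-mail can reference images and links on a remote server, and fetching either one tells that server something. The sketch below, which uses Python's standard `html.parser` and made-up URLs, only lists the remote resources in a message body. It is an illustration of what could report back, not a tool we used.

```python
from html.parser import HTMLParser


class RemoteResourceFinder(HTMLParser):
    """Collects remote images and links found in an HTML e-mail body."""

    def __init__(self):
        super().__init__()
        self.images = []  # fetched when the message is displayed
        self.links = []   # fetched only when the reader clicks

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            src = attrs.get("src") or ""
            if src.startswith(("http://", "https://")):
                self.images.append(src)
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.startswith(("http://", "https://")):
                self.links.append(href)


def remote_resources(html_body):
    """Return (images, links) that would contact a server if loaded or clicked."""
    finder = RemoteResourceFinder()
    finder.feed(html_body)
    return finder.images, finder.links
```

Any identifier embedded in those URLs (an account number, a campaign code) is what lets the sender connect the request to a particular address.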
The problem with this approach is that after three months, we weren't getting results. Oh sure, the Normal accounts were getting all kinds of e-mails, but they were all newsletters and such that we had agreed to take during the sign-up process. To fix the problem, we introduced three phases to the project.
The first phase covered the first three months (October to December) that we had already measured and the rest of January (which we were in the middle of by the time we implemented the phases). This was the period in which we didn't preview or open any e-mails.
In the second phase, we would not only open all e-mails (sent Jan 31st or later), but also click on links to try to attract as much attention to ourselves as possible.
In the third phase, we would continue to open and click on links in e-mails, but now try to unsubscribe from all e-mail lists either by changing our settings on the company website or clicking the "unsubscribe" links in e-mails (if any). The purpose of this phase was to continue trying to entice Spam, but also to see if companies respected unsubscribe attempts.
While we would have liked to run the experiment for longer, March was our cut-off so we could analyze and present the data during April (graduation was in May).
In all, I created 75 accounts distributed among 51 different services. That included 53 Normal accounts, 11 Savvy accounts, 2 Maxx accounts, 2 Order accounts, 1 Referral account, and 6 that I couldn't include due to sign up problems.
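As a quick sanity check on those numbers (the dictionary keys are just labels chosen for this sketch):

```python
accounts = {
    "Normal": 53,
    "Savvy": 11,
    "Maxx": 2,
    "Order": 2,
    "Referral": 1,
    "Excluded": 6,  # dropped because of sign up problems
}

total = sum(accounts.values())
usable = total - accounts["Excluded"]

print(f"accounts created: {total}")
print(f"accounts used in the results: {usable}")
```

The categories add up to the 75 accounts created, of which 69 fed into the results.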
Email by Type
Spam Experiment: Total number of e-mails received by type
This chart shows the total number of e-mails sent to all accounts during the 5 month period of the experiment broken down by type.
First party e-mails and system messages are messages sent to Normal accounts due to failing to opt out of mailing lists.
Second party e-mails are advertisements for third party services, but they are sent by the original site we signed up with which prevents the third party site from getting our personal information.
This too consists of messages sent to Normal accounts as a result of failing to opt out during sign up.
Third party e-mails are the ones we are most interested in. These are Unsolicited Commercial E-mails (UCE) that either came to a Normal account, but had nothing to do with any lists or options presented during sign up, or were sent to any non-Normal account.
Of these, we received only five total messages!
The apparent answer to the question of how fast and how far our information spreads is: not that fast, and not that far.
Conclusion 1: Most major web services and companies will respect your request to not receive e-mails or have your information sold.
In truth, I don't know whether any of my test information got sold; if it did, I didn't receive any Spam as a result.
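The sorting rules behind the chart can be written as a small function. This is one reading of the definitions above, not a record of how the tallying was done: the yes/no inputs are judgment calls made by a person looking at each message, and the function and parameter names are invented for the sketch.

```python
from collections import Counter


def classify(profile, is_system_message, covered_by_signup_list, advertises_other_company):
    """Sort one e-mail into system, first party, second party, or third party."""
    if is_system_message:
        return "system"
    # Commercial mail to a non-Normal account, or mail unrelated to any list
    # offered during sign up, counts as unsolicited (third party).
    if profile != "Normal" or not covered_by_signup_list:
        return "third party"
    if advertises_other_company:
        return "second party"
    return "first party"


def tally(messages):
    """Count messages per type; each message is a tuple of classify() arguments."""
    return Counter(classify(*message) for message in messages)
```

Under this reading, a Savvy or Maxx account should only ever produce "system" entries, so anything else from those accounts stands out immediately.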
Email by Company
Spam Experiment: Total number of e-mails received by type and by company where the total was more than 5 e-mails
60% of the accounts received fewer than 5 e-mails each, even including the Normal accounts. Of the remaining 40%, only 3 accounts had more than 1.5 e-mails per week.
Conclusion 2: Very few companies send you more than 1 or 2 messages a week even if you do a click-through sign up.
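For anyone wanting to repeat the per-week figures with their own data, a minimal sketch follows. The function names are hypothetical, and the start and end dates are whatever measurement window applies to the account in question.

```python
from datetime import date


def weekly_rate(total_emails, start, end):
    """Average e-mails per week between two dates."""
    weeks = (end - start).days / 7
    if weeks <= 0:
        raise ValueError("end must be after start")
    return total_emails / weeks


def share_below(totals, limit):
    """Fraction of accounts whose total e-mail count is below limit."""
    if not totals:
        raise ValueError("no accounts given")
    return sum(1 for total in totals if total < limit) / len(totals)


# Example: 4 e-mails over a two-week window is 2 per week.
example_rate = weekly_rate(4, date(2004, 10, 1), date(2004, 10, 15))
```

`share_below(totals, 5)` gives the kind of "fewer than 5 e-mails" percentage quoted above when `totals` holds one count per account.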
Physical junk mail
Besides the number and type of e-mails, we tracked physical junk mail by using a different first name and a two- to three-letter code in the address for each sign up. Therefore, if any mail came to my address with one of the fake names or with the address code, we would know immediately who was responsible.
However, during the entire experiment I only received two pieces of physical mail and neither was unexpected. One was a brochure from the Medical Hair Restoration group and the other was for the University of Phoenix (both of which I was told during the sign-up process that I'd receive).
Conclusion 3: Most online services don't bother sending physical mail if they've already got your e-mail address.
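The name-and-code tracking for physical mail amounts to a lookup table. A rough sketch, in which the fake names, codes, and function names are all invented for illustration (they are not the ones used in the experiment):

```python
import re


def build_registry(signups):
    """Index (company, fake_first_name, address_code) tuples for lookup."""
    registry = {}
    for company, first_name, code in signups:
        for marker in (first_name.lower(), code.lower()):
            if marker in registry:
                raise ValueError(f"marker {marker!r} is not unique")
            registry[marker] = company
    return registry


def identify_source(envelope_text, registry):
    """Return the companies whose fake name or code appears on a piece of mail."""
    words = re.findall(r"[a-z]+", envelope_text.lower())
    return sorted({registry[word] for word in words if word in registry})
```

The codes have to be chosen so they don't collide with ordinary address words such as "St" or "Apt", otherwise every piece of mail would match.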
There are two ways to unsubscribe from e-mail lists. The first, which has nearly become an industry standard, is to have a section at the bottom of an e-mail that provides a link for canceling further mail.
Unsubscribe section of a Ubid newsletter
Unsubscribe section of a Wal-Mart newsletter
Typically clicking one of these links immediately removes you from their lists with no hassles. However, some make it harder than others.
- PC Magazine forced me to click the unsubscribe link on each type of newsletter they were sending me (about 4 different newsletters).
- PC World took me to a page that unsubscribed me from marketing, but not their newsletters. I had to perform some extra steps to fully unsubscribe.
- Security Space made me enter my own e-mail address on their page instead of entering it for me and I had to respond to an "Unsubscribe" e-mail they sent me to confirm the "un-subscription".
- Freeipods redirected my browser to an advertisement after I successfully unsubscribed.
The second method is to force the user to log into their account, find all the options that generate mail, and turn them off. This can be far more time consuming and challenging. For example, it took me almost a full month to unsubscribe from one of the Lycos services due to shutting off options, getting more mail, and logging back into my account to track down another option I missed the first time.
Despite various difficulties, every company in this experiment ceased all e-mail activity after I (successfully) unsubscribed.
Conclusion 4: Despite rumors to the contrary, the unsubscribe links DO work.
The conclusion only counts e-mails from legitimate companies (see "The Conclusion" section below).
Secondary Results – Dishonesty and Deception
I also counted the number and type of ads, the wording used during sign up, whether policies were opt-in or opt-out, and more. This really got off topic and Dr. Soule suggested that I focus on the parts that related to this experiment. That data is listed below, but the rest of the data and the work that began is still something I want to implement soon (please see my DSS page for details).
There are two major forms of subject-field dishonesty. The first, which I simply labeled as Deceptive, is when the subject is clearly a lie.
- "No cost laptop #66052" (Freeipods.com)
- "Noone Inparticular, Claim your Complimentary $250 Shopping Gift Card" (Ebaum's World)
- "Participants Needed! Receive a 500 USD Gift-Card" (Freeipods)
- "Larry, you're invited for a resume makeover" (Monster.com)
The ones I found the most offensive were the e-mails that referred to me by my (fake) names or in some other way made the e-mail appear to be for me as an individual when it clearly wasn't (such as the laptop e-mail with a "reference number" in it). Fortunately, these types of e-mails are easy to spot and delete.
The second (and worse) major form of subject dishonesty, which I labeled Cryptic, covers e-mails whose subjects are so ambiguous that you can't tell whether they're legitimate or not.
- "Happy New Year" (ubid)
- "Exclusive Subscription Opportunity" (PC Magazine)
- "Introducing My Blog Site" (Fortunecity)
Spam Experiment: Total number of e-mails received with dishonest or deceptive subjects
From-address field dishonesty
Again, I identified two ways that companies could use dishonest "From Address" fields.
Deceptive – By shifting the "From Address" value for each e-mail, a company can make it harder for Spam filters to block their e-mails.
Cryptic – A little more rare and a lot worse, these are the e-mails whose from addresses appear to be legitimate.
The only offender in this experiment was Lycos who occasionally sent out an e-mail labeled "Printing Services" or "Business Cards", but a hard-core spammer will use common names like "Bob" or "Susan" hoping that you know someone with that name and will open the e-mail without checking the "from address" field (which Microsoft Outlook STILL doesn't let you display in your inbox).
Spam Experiment: Total number of e-mails received with dishonest or deceptive 'From' fields
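Shifting "From Address" values are easy to spot when every account belongs to exactly one sign-up: just count the distinct sender addresses each account has seen. A minimal sketch using Python's standard `email.utils.parseaddr`; the function names and the threshold of 3 are arbitrary choices for illustration, not figures from the experiment.

```python
from collections import defaultdict
from email.utils import parseaddr


def sender_addresses_by_account(messages):
    """messages: iterable of (recipient_account, from_header) pairs."""
    seen = defaultdict(set)
    for account, from_header in messages:
        _display_name, address = parseaddr(from_header)
        seen[account].add(address.lower())
    return seen


def shifting_senders(messages, threshold=3):
    """Accounts that got mail from at least `threshold` distinct From addresses."""
    seen = sender_addresses_by_account(messages)
    return {
        account: sorted(addresses)
        for account, addresses in seen.items()
        if len(addresses) >= threshold
    }
```

A filter that blocks one exact sender address would miss every later message from a company flagged this way.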
Legitimate online providers are not generally interested in sending unsolicited e-mails or selling your name to people who do. Even if they are, by opting out where available, you will avoid most Spam.
I would consider some of the companies in this experiment less than honest: those that used deceptive and cryptic e-mails, those that use Spam as an advertising tool, those that offer "free" items that clearly are not, etc. In this experiment, even these companies respected opt-out options (if they had any) and unsubscribe requests.