A Scientific Approach to Website Registration Design
Whether we like to develop it or not, the registration process is a key element of any website. The marketing, sales, and development teams always seem to be at odds over what the perfect registration form looks like.
At what point in the engagement / sales process should a customer be prompted to register? Should the form be split in parts or put all on one page? What questions should be asked? What order should they be asked in? Which ones should be required/optional?
In many cases, an online business's key strategy is to get as much user information as possible, as this is generally considered to add value to a corporation. However (as you can probably attest to personally), customers are loath to give out any personal details, and may even be prepared to provide bogus information. Not only must companies worry about maximizing customer registrations, they also have to worry about the depth and accuracy of the information gathered.
I recently read an article that highlights how a badly-designed registration process can hurt your business. Reading it made me wonder: has anyone considered taking a scientific approach to designing the optimum registration process? This is what I'll try to propose here. A simple scientific method can be constructed from the following steps:
1. Assumptions
2. The Question
3. The Hypothesis
4. The Experiment
5. Analysis / Conclusion
We will discuss steps 1 to 4 here and omit 5, as implementing the experiment is beyond the scope of this post.
Assumptions
Value of registration data
Let's assume that every successful customer registration carries with it a certain dollar value to the company. A customer registration involves an individual accessing a website, filling in a registration form, committing to a terms of use agreement, and finally responding to an automatically-generated e-mail confirming their registration. In this process, the company has a unit of data that contains (1) information about a person, (2) acknowledgment that the information can be used by the company in a certain way, and (3) verification that the information has some degree of authenticity. Therefore, it's reasonable to assume that a real dollar amount can be tied to each customer registration.
Let's further assume that this dollar amount increases as the amount of information collected about a person increases. More specifically, each registration data unit becomes more valuable as the number of questions asked on the registration form increases. Two points support this assumption. First, the more information we have on the user, the easier it is to verify the authenticity of that information. Second, it allows a marketing process to be applied more effectively towards that user, because we have more ways of contacting the individual and more information with which to target the marketing.
We can also assume that the value of the information will increase at a decreasing rate as we add more questions to the registration form. This is because we would ask the most critical and valuable questions first (such as e-mail and first/last name). Subsequent questions such as demographics or personal preferences would be included only on the larger registration forms.
Visitor / Registrant behavior
The majority of website visitors will not become registered users. If you're interested, James Grohol explores the phenomenon of anonymity (and its adverse effect on online communities) in this article. We accept this as a reasonable assumption.
We can further assume that longer registration forms will produce fewer registered users. This is because website visitors are generally impatient, and not willing to devote a lot of time to filling out a registration form. They are also hesitant to give out a lot of personal information online. While many users would grudgingly provide their name and e-mail, few would eagerly follow up with their address, age, salary level, and the name of their dog.
Finally, we will assume that the number of registered users will fall at a decreasing rate as we increase the number of questions on the registration form. I don't really have much of an argument for this one. Just think of it this way: which change will turn away more users from registering, adding two questions to a 5-question registration form or adding two questions to a 20-question one? The former seems to me the more likely answer.
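To put rough numbers on that intuition, here is a minimal Python sketch. It assumes, purely for illustration, that every extra question keeps a fixed fraction of the would-be registrants. The 1,000 visitors and the 0.9 retention factor are made-up values, not measurements, and the function names are hypothetical.

```python
def registrants(questions, visitors=1000, retention=0.9):
    """Expected registrants, assuming each extra question keeps a fixed
    fraction (`retention`) of the people who would otherwise register."""
    return visitors * retention ** questions


def loss_from_adding(questions, extra=2):
    """Registrants lost when `extra` questions are added to an existing form."""
    return registrants(questions) - registrants(questions + extra)


# Compare the two changes described in the paragraph above.
short_form_loss = loss_from_adding(5)   # 5-question form grows to 7
long_form_loss = loss_from_adding(20)   # 20-question form grows to 22

print(f"5 -> 7 questions:   about {short_form_loss:.0f} fewer registrants")
print(f"20 -> 22 questions: about {long_form_loss:.0f} fewer registrants")
```

Under this assumed behavior the short form loses more registrants than the long one, which is what "falling at a decreasing rate" means. A different retention pattern could of course behave differently.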
The Question
Now that we have set up our assumptions, we pose the scientific question: how much information do we collect for each registration to yield the maximum value for the total collected registration data? In other words, what is the best number of questions to ask, if we want to get the most money out of the collected user data?
The Hypothesis
Based on the assumptions made, we can sketch a graph of "Value per registrant" versus the "Amount of information collected per registration form." The shape of the graph was determined from our assumptions.
Figure 1
In a similar fashion, we sketch a graph of "Number of registrants" versus the "Amount of information collected per registration form." Note that the point labeled "r" represents the total number of visitors to the site (including the ones who didn't register).
Figure 2
If we multiply these graphs together, we get "Total value of registrants" on the Y-axis, which is what we need. The falling curve in Figure 2 can be modeled as a decaying exponential ($Y_2 = a b^x$, $a > 0$, $0 < b < 1$) and the rising curve in Figure 1 as a logarithmic function ($Y_1 = c[\log_d(x + d) - 1]$, $c > 0$, $d > 1$). The rough shape of their product ($Y_3 = Y_1 \times Y_2$) is shown below.
Figure 3
As you can see, there is a clear maximum, labeled "i", that occurs relatively near the origin.
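The shape of that product curve can be checked numerically. The following is only a sketch: it assumes the decaying exponential stands for the number of registrants and the logarithm for the value per registrant, as the assumptions suggest. The constants a, b, c and d are arbitrary illustrative choices, not fitted values.

```python
import math


def registrant_count(x, a=1000.0, b=0.85):
    """Decaying exponential a * b**x: registrants for a form with x questions."""
    return a * b ** x


def value_per_registrant(x, c=5.0, d=2.0):
    """Logarithmic curve c * (log_d(x + d) - 1): zero at x = 0, then rising
    more and more slowly as questions are added."""
    return c * (math.log(x + d, d) - 1)


def total_value(x):
    """Product of the two curves: total value of all registrants."""
    return registrant_count(x) * value_per_registrant(x)


# Only whole numbers of questions make sense, so search the integers 0..50.
best_x = max(range(0, 51), key=total_value)

print(f"Best form length under these made-up constants: {best_x} questions")
print(f"Total value at that point: {total_value(best_x):.2f}")
```

With these constants the best form length is a small number of questions, well short of the 50-question upper bound, which agrees with the maximum sitting near the origin. Changing b or d moves the peak, so the position of "i" depends entirely on how quickly visitors drop off and how quickly extra questions stop adding value.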
The Experiment
We can set up several different registration forms, one of which is assigned at random to each potential registrant. The random assignment reduces environmental effects, as there isn't an easy way to implement a scientific control in this case. Each user registration would be associated with the registration form that produced it. The pool of registration forms should be kept fairly small, so that each form collects a good number of registrations.
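A rough Python sketch of the bookkeeping this would involve is shown below. Everything in it is an assumption made for illustration: the three form lengths, the function names, and especially the simulated visitor behavior, which merely stands in for real site logs.

```python
import random
from collections import Counter

# Hypothetical pool of form variants, identified by their number of questions.
FORM_LENGTHS = [3, 6, 12]


def assign_form(rng):
    """Pick one form variant uniformly at random for a new visitor."""
    return rng.choice(FORM_LENGTHS)


def tally(events):
    """Turn (form_length, registered) pairs into a registration rate per form."""
    shown = Counter()
    completed = Counter()
    for form_length, registered in events:
        shown[form_length] += 1
        if registered:
            completed[form_length] += 1
    return {form: completed[form] / shown[form] for form in shown}


rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Simulated visitors only; real outcomes would come from the site itself.
events = []
for _ in range(3000):
    form = assign_form(rng)
    registered = rng.random() < 0.9 ** form  # made-up behavior, not measured data
    events.append((form, registered))

rates = tally(events)
for form in sorted(rates):
    print(f"{form:>2} questions: {rates[form]:.1%} of visitors registered")
```

Keeping the pool to a handful of variants means each one is shown to roughly a third of the visitors here, so every form gathers enough registrations to compare.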
The only part that I haven't really figured out is how to attach a monetary value to each "group" of registrations. Coming up with a way of determining actual dollar values seems like it'd make the experiment far more involved than something that can be summed up in a few paragraphs. It'd be interesting to hear some ideas!
Shortcomings of this model
Unfortunately, the number of questions on a registration form is a discrete variable, while I have treated it as a continuous one (i.e. you cannot ask 1.5 questions). Also, we didn't consider other factors, such as the time the user spends on each question. This is important, as the time it takes to answer a question can strongly determine whether or not a user bothers answering it.
Furthermore, just because a question takes a long time to answer doesn't necessarily mean that it adds a lot of value to the registration. Although it may be possible to convert the heterogeneous and discrete variable of "registration questions" to a homogeneous and continuous one, that is beyond the scope of this post. Nevertheless, this issue should be addressed if this experiment is carried out.
Clearly, the scientific method presented here is overly simplified. The point of this blog post is to show that it can be useful to apply the scientific method to the web, in some cases.
User Registration and Web 2.0
One of the aspects of Web 2.0 is the recognition that it's not all about registration data. Companies and investors are beginning to value active, vibrant online communities over stagnant data pools of names, e-mails and addresses.
However, this doesn't mean the end of online registration forms. It means that the registration approach will inevitably change as the valuation of registered users by an online business becomes more complex. The sweet spot between quality and quantity will still exist in the world of Web 2.0, and you can count on web companies continuing in their efforts to find it.
-- Paul Sobocinski
