Leskovec et al., ACM 2006
From ScribbleWiki: Analysis of Social Media
The Dynamics of Viral Marketing
- Authors: Leskovec J., Adamic L. A. and Huberman B. A.
- Conference: 7th ACM Conference on Electronic Commerce, 2006.
- Link: http://portal.acm.org/citation.cfm?id=1134732
- Maintainer: Sameer Badaskar
Overview
This paper presents a data-driven analysis of a customer recommendation network. The dataset covers approximately 16 million recommendations made by 4 million users between 2001 and 2003. Ideally, a seller would like recommendations to spread like an epidemic through the customer network, producing exponential growth of the customer base. It is tempting to draw parallels to epidemiological models of viral propagation, but there are key differences. First, infecting the most connected nodes of the network does not help produce an epidemic response to a product, because such well-connected people tend to be perceived as spammers. Second, a large share of products have very few buyers (the long tail phenomenon), which prevents viral growth of the customer network. The paper specifically attempts to answer the following questions in the context of viral marketing:
- Is the overall growth of the recommendation network viral?
- Does giving an incentive for recommendations (e.g. a 10% discount for a successful recommendation) help or hurt?
- What is the depth of recommendation chains in a typical network?
- How does the number of incoming recommendations affect the probability of buying?
Finally, a regression analysis is performed to correlate parameters such as product price and number of recommendations with recommendation success.
Dataset Used
The dataset used for the recommendation network analysis consists of approximately 16 million recommendations, each with the following parameters:
- Sender Customer ID (anonymized)
- Receiver Customer ID (anonymized)
- Product ID
- Time of recommendation
- Price of product
- Indication whether the recommendation resulted in a purchase.
The data points are grouped into 4 product categories (DVDs, Books, Music and Videos), and the 4 categories are analyzed separately. DVDs have the highest average number of recommendations per node, while Books have the largest total number of recommendations. Under the incentive scheme, a customer received a 10% incentive for a successful recommendation, and a 10% discount was given on a recommended purchase.
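As a rough illustration of the record layout described above, the following sketch models one recommendation and computes the average number of recommendations per node for each category. The class name, field names, the `category` field and the sample values are all hypothetical; the actual dataset format is not given here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    sender_id: int      # anonymized sender customer ID
    receiver_id: int    # anonymized receiver customer ID
    product_id: str
    category: str       # assumed label: "DVD", "Book", "Music" or "Video"
    timestamp: int      # time of recommendation (assumed integer time unit)
    price: float
    purchased: bool     # True if the recommendation resulted in a purchase


def recommendations_per_node(records):
    """Return {category: recommendations divided by distinct customers}."""
    rec_count = defaultdict(int)
    nodes = defaultdict(set)
    for r in records:
        rec_count[r.category] += 1
        nodes[r.category].update((r.sender_id, r.receiver_id))
    return {c: rec_count[c] / len(nodes[c]) for c in rec_count}


# Made-up sample records, for illustration only.
sample = [
    Recommendation(1, 2, "d1", "DVD", 100, 19.99, False),
    Recommendation(1, 3, "d1", "DVD", 101, 19.99, True),
    Recommendation(4, 5, "b1", "Book", 102, 9.50, False),
]
averages = recommendations_per_node(sample)
```

Here a node is counted once per category whether it appears as sender or receiver, which is one plausible reading of "recommendations per node".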
Network Growth
The number of nodes in the recommendation network is shown to grow linearly with time. Viral growth would mean exponential growth in the number of nodes. Since this is not observed, it is the first indication that the recommendation scheme does not show viral trends.
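One simple way to tell linear from exponential growth is to compare how well a straight line fits the raw node counts versus their logarithms. The sketch below is only a heuristic under that assumption, not the method used in the paper; the function names and the synthetic counts are hypothetical.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def growth_shape(times, node_counts):
    """Label growth by whichever fit is better: counts or log(counts) vs time."""
    linear_fit = r_squared(times, node_counts)
    exponential_fit = r_squared(times, [math.log(c) for c in node_counts])
    return "linear" if linear_fit >= exponential_fit else "exponential"


# Synthetic series, for illustration only.
months = list(range(1, 13))
linear_counts = [1000 * m for m in months]       # steady growth
viral_counts = [1000 * 2 ** m for m in months]   # doubling every month

linear_label = growth_shape(months, linear_counts)
viral_label = growth_shape(months, viral_counts)
```

Comparing fit quality across raw and log scales is crude, but it captures the distinction drawn above: exponential growth is a straight line only on a log scale.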
Depth of Recommendation Chains
If a customer in the network makes a successful recommendation to another customer, we say that the depth of the recommendation chain (or cascade) is 1. In general, a recommendation chain (or cascade) is characterized by its size, depth, topology and so on. Using a subgraph mining algorithm described in (Leskovec et al. 2006), a histogram of cascade sizes is constructed for each of the 4 product categories. The histograms show that cascade sizes in all product categories follow power laws, which means that small, simple cascades are frequent while large, deep cascades are very rare. This implies that the recommendation scheme has a limited effect and does not result in deep recommendation chains. This is further evidence that the recommendation trend is non-viral.
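The subgraph mining algorithm itself is not reproduced here. As a much simpler stand-in, the sketch below treats each connected group of customers linked by successful recommendations as one cascade and builds a histogram of cascade sizes with a union-find structure. The function name and the edge list are hypothetical.

```python
from collections import Counter


def cascade_size_histogram(edges):
    """edges: iterable of (sender, receiver) pairs for successful recommendations.

    Returns a Counter mapping cascade size (in nodes) to number of cascades,
    where a cascade is approximated by a connected component.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for sender, receiver in edges:
        root_s, root_r = find(sender), find(receiver)
        if root_s != root_r:
            parent[root_r] = root_s

    component_sizes = Counter(find(node) for node in list(parent))
    return Counter(component_sizes.values())


# Made-up edges: two cascades of 2 nodes, one of 3 nodes, one of 4 nodes.
edges = [(1, 2), (3, 4), (5, 6), (6, 7), (8, 9), (9, 10), (9, 11)]
histogram = cascade_size_histogram(edges)
```

On real data, plotting this histogram on log-log axes is the usual first check for the power-law shape described above.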
Effect of Incoming Recommendations
One might assume that as the number of incoming recommendations to a node in the network increases, the probability of a purchase also increases. However, after ignoring a certain number of recommendations, the user tends to become immune to those that come thereafter. Indeed, a plot of probability of buying versus number of incoming recommendations shows that the probability of buying increases initially but saturates quickly for all product categories. In fact, in the case of books, the probability of buying decreases with incoming recommendations. This may be because the books network consists of smaller communities with specific interests and is more inert to arbitrary recommendations.
Another common assumption is that as more recommendations are exchanged between two customers, the probability of buying increases because there is a high level of trust. However, a large number of exchanged recommendations can mean that the customers come to regard each other as spammers and hence tend to ignore future recommendations. Experiments in the paper show that purchase probability decreases as more recommendations are exchanged between customers.
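The probability-of-buying curve discussed above can be approximated from raw recommendation events. The sketch below groups events by (receiver, product) pair and, for each number of incoming recommendations k, reports the fraction of pairs that ended in a purchase. This is a simplification: it does not check whether recommendations arrived before the purchase, and the function name and sample events are hypothetical.

```python
from collections import defaultdict


def purchase_probability_by_incoming(events):
    """events: iterable of (receiver_id, product_id, purchased) tuples,
    one per incoming recommendation.

    Returns {k: fraction of (receiver, product) pairs with k incoming
    recommendations in which at least one recommendation led to a purchase}.
    """
    incoming = defaultdict(int)
    bought = defaultdict(bool)
    for receiver, product, purchased in events:
        key = (receiver, product)
        incoming[key] += 1
        bought[key] = bought[key] or purchased

    pairs = defaultdict(int)
    buyers = defaultdict(int)
    for key, k in incoming.items():
        pairs[k] += 1
        buyers[k] += bought[key]  # True counts as 1
    return {k: buyers[k] / pairs[k] for k in sorted(pairs)}


# Made-up events, for illustration only.
events = [
    ("a", "p", False),
    ("b", "p", True),
    ("c", "p", False), ("c", "p", True),
    ("d", "p", False), ("d", "p", False),
]
curve = purchase_probability_by_incoming(events)
```

A saturating or decreasing `curve` as k grows would correspond to the behaviour reported for the product categories above.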
Effect of Incentives
A customer can recommend a product to a select few friends or to many. In the presence of an incentive, a customer tends to send out many recommendations in order to maximize the probability of success. This lowers the customer's trust value, since he or she tends to be perceived as a spammer, and the recommendations therefore have limited effect. Indeed, the paper shows that the average number of purchases saturates as customers send out more and more recommendations. Thus, giving an incentive for recommendations has a limited effect and can even hurt the purchase probability. This is counter-intuitive, since one would normally assume that incentivizing recommendations boosts recommendation success.
Correlation Analysis
Finally, a regression analysis is performed to determine which product parameters are well correlated with recommendation success. For each product, the following parameters are considered:
- number of recommendations
- number of senders of recommendations
- number of recommendation recipients
- price of the product
- number of reviews
- average product rating
The regression function is <math> y = \exp\left(\sum_i \alpha_i \log(x_i) + C\right) </math>, where <math> y </math> is the recommendation success, given by the fraction of successful recommendations, <math> \alpha_i </math> are the regression coefficients and <math> x_i </math> are the product parameters listed above. Regression analysis on approximately 48,000 products shows that:
- Product price and number of recommendations are positively correlated with recommendation success.
- Numbers of senders and receivers of recommendations are negatively correlated with recommendation success.
- Number of reviews and average product rating are uncorrelated with recommendation success.
The first two findings agree with observations made in previous sections. Reviews might be uncorrelated because of the "long tail effect": most product types have very few buyers, and such products have few or no reviews. Buyers of these products are therefore much less likely to consult reviews, since they tend to have prior knowledge of the specific product they want to buy.
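Taking logarithms of both sides turns the regression function into an ordinary linear model, log(y) = sum_i alpha_i log(x_i) + C, which can be fitted by least squares. The sketch below does this with the normal equations in plain Python. It assumes all parameter values and success rates are strictly positive, and the function names and synthetic data are hypothetical; it is not the paper's fitting procedure.

```python
import math


def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_log_log(features, success):
    """Least-squares fit of log(y) = sum_i alpha_i * log(x_i) + C.

    features: list of rows of positive product parameters x_i.
    success: list of positive success rates y.
    Returns (alphas, C).
    """
    rows = [[math.log(x) for x in row] + [1.0] for row in features]
    targets = [math.log(y) for y in success]
    dim = len(rows[0])
    # Normal equations: (X^T X) beta = X^T log(y)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(dim)]
    beta = solve_linear_system(xtx, xty)
    return beta[:-1], beta[-1]


# Synthetic products: (price, number of senders), with success generated
# from known coefficients so the fit can be checked.
features = [[10, 5], [20, 50], [40, 8], [80, 200], [15, 30]]
success = [0.01 * price ** 0.5 * senders ** -0.3 for price, senders in features]
alphas, constant = fit_log_log(features, success)
```

In this synthetic setup the positive coefficient on price and the negative coefficient on number of senders mirror the signs reported above. Parameters that can be zero, such as number of reviews, would need special handling before taking logarithms.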
Conclusions
Here are a few key takeaways from the paper:
- Expensive products have a higher chance of recommendation success. This does not mean that raising a product's price helps, only that expensive products tend to appeal to small, close-knit communities where the chances of recommendation success are high.
- Viral marketing should take a community-focused approach. For example, in the case of books, the recommendation network consists of many smaller communities with specific interests.
- The mode of recommendation is important, and email can be quite impersonal. A fixed-template recommendation email might get ignored, whereas a personalized one is less likely to be.
- Incentivizing recommendations can have a limited or even negative impact on recommendation success.
- Innovation in the product affects viral adoption (for example, free email or a large amount of inbox space).