If you want a real-life example of how data mining can give the wrong impression of someone, take a look at my Audioscrobbler data.
It shows Damien Rice as my second most played artist. I own TWO songs by Damien Rice. What happened is that I got the songs from iTunes, put those two on repeat, and then let them run while I was gone for the day and then overnight.
Same with Liz Phair, Larry the Cable Guy, Sister Hazel and Outkast. The comedy stuff is really unrepresentative because the tracks are so short, whereas some Philip Glass tracks are an hour long.
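To put rough numbers on the track-length problem, here is a minimal sketch. The listening log is entirely made up for illustration (it is not my actual Audioscrobbler data), and the artist labels are placeholders. It just shows how ranking by play count and ranking by time listened can disagree.

```python
from collections import Counter

# Hypothetical listening log: (artist label, track length in minutes).
# All numbers are invented for illustration.
plays = (
    [("short comedy tracks", 2)] * 30   # thirty plays of 2-minute tracks
    + [("hour-long pieces", 60)] * 3    # three plays of 60-minute tracks
)

play_counts = Counter()
minutes = Counter()
for artist, length in plays:
    play_counts[artist] += 1      # what a play-count chart sees
    minutes[artist] += length     # actual time spent listening

top_by_plays = play_counts.most_common(1)[0][0]
top_by_minutes = minutes.most_common(1)[0][0]

print("Top by play count:", top_by_plays)
print("Top by minutes:   ", top_by_minutes)
```

With these invented numbers, the comedy tracks win on play count (30 plays against 3) even though the hour-long pieces account for three times as much listening time (180 minutes against 60).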
Instead of music, imagine it was credit card data. It isn’t too hard to come up with a scenario in which your credit card shows unrepresentative purchases for a short period of time.