Return of the Facebook Snatchers

Filed under: Hacking, Passwords

First and foremost: if you want to cut to the chase, just download the torrent. If you want the full story, please read on....

Background

Way back when I worked at Symantec, my friend Nick wrote a blog post that caused a little bit of trouble for us: Attack of the Facebook Snatchers. I was blog editor at the time, and I went through the usual sign-off process and, eventually, published it. Facebook was none too happy, but we fought for it and, in the end, we got to leave the post up in its original form.

Why do I bring this up? Well, last week @FSLabsAdvisor posted an interesting tweet: it turns out that, by heading to https://www.facebook.com/directory, you can get a list of every searchable user on all of Facebook!

My first idea was simple: spider the lists, generate first-initial-last-name (and similar) lists, then hand them over to @Ithilgore to use in Nmap's awesome new bruteforce tool he's working on, Ncrack.

But as I thought more about it, and talked to other people, I realized that this is a scary privacy issue. I can find the name of pretty much every person on Facebook. Facebook helpfully informs you that "[a]nyone can opt out of appearing here by changing their Search privacy settings" -- but that doesn't help much anymore considering I already have them all (and you will too, when you download the torrent). Suckers!

Once I have the name and URL of a user, I can view, by default, their picture, friends, information about them, and some other details. If the user has set their privacy higher, at the very least I can view their name and picture. So, if any searchable user has friends that are non-searchable, those friends just opted into being searched, like it or not! Oops :)

The lists

Which brings me to the next topic: the list! I wrote a quick Ruby script (which has since become a more involved Nmap script that I haven't used for harvesting yet) and used it to download the full directory. I should warn you that it isn't exactly the most user-friendly interface -- I wrote it primarily for myself, and I'm only linking to it for reference. I don't really suggest you try to recreate my spidering; it's a waste of several hundred gigs of bandwidth.

The results were spectacular. 171 million names (100 million unique). My original plan was to use this list to generate a list of the top usernames (based on first initial last name):

 129369 jsmith
  79365 ssmith
  77713 skhan
  75561 msmith
  74575 skumar
  72467 csmith
  71791 asmith
  67786 jjohnson
  66693 dsmith
  66431 akhan

Or first name last initial:

 100225 johns
  97676 johnm
  97310 michaelm
  93386 michaels
  88978 davids
  85481 michaelb
  84824 davidm
  82677 davidb
  81500 johnb
  77800 michaelc

Or even the top usernames based on first name dot last name (sorry, I can't link this one due to bandwidth concerns; but it's included in the torrent):

  17204 john.smith
   7440 david.smith
   7200 michael.smith
   6784 chris.smith
   6371 mike.smith
   6149 arun.kumar
   5980 james.smith
   5939 amit.kumar
   5926 imran.khan
   5861 jason.smith

Or even the most common first or last names:

 977014 michael
 963693 john
 924816 david
 819879 chris
 640957 mike
 602088 james
 584438 mark
 515686 jason
 503658 robert
 484403 jessica

 913465 smith
 571819 johnson
 512312 jones
 503266 williams
 471390 brown
 386764 lee
 360010 khan
 355639 singh
 343220 kumar
 324972 miller

So, those are the top 10 lists. But I'll bet you want everything!
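The username lists above boil down to simple string processing over the name list. Here's a minimal sketch in Python (not the actual programs I used, which are in the torrent), assuming each name is a "first last" string and that only the first and last tokens matter. The function name username_counts and the sample names are made up for illustration:

```python
from collections import Counter


def username_counts(names):
    """Count candidate usernames derived from full names.

    Returns three Counters: first-initial + last name (jsmith),
    first name + last initial (johns), and first.last (john.smith).
    Names with fewer than two tokens are skipped.
    """
    f_last, first_l, first_dot_last = Counter(), Counter(), Counter()
    for name in names:
        parts = name.lower().split()
        if len(parts) < 2:
            continue  # a single token cannot be split into first/last
        first, last = parts[0], parts[-1]
        f_last[first[0] + last] += 1
        first_l[first + last[0]] += 1
        first_dot_last[first + "." + last] += 1
    return f_last, first_l, first_dot_last


# Hypothetical sample input, not taken from the real data.
sample = ["John Smith", "Jane Smith", "John Stone", "Cher"]
f_last, first_l, first_dot_last = username_counts(sample)
for username, count in f_last.most_common(2):
    # Same "count username" layout as the lists above.
    print(f"{count:7d} {username}")
```

Counter.most_common does the sorting, so the same three counters can produce any of the top-10 lists. Middle names are simply ignored here, which may or may not match what you want.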

The Torrent

But it occurred to me that this is public information that Facebook puts out, I'm assuming for search engines or whatever, and that it wouldn't be right for me to keep it private. Why waste Facebook's bandwidth and make everybody scrape it, right?

So, I present you with: a torrent! If you haven't downloaded it, download it now! And seed it for as long as you can.

This torrent contains:

  • The URL of every searchable Facebook user's profile
  • The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc)
  • Processed lists, including first names with count, last names with count, potential usernames with count, etc
  • The programs I used to generate everything
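The full name lists are far too big for a typical text editor, so it's easier to stream them. A rough sketch in Python, assuming the lists are bzip2-compressed text files (names like facebook-lastnames-withcount.txt.bz2), that the "withcount" files use the same "count name" layout as the excerpts above, and that they are sorted by descending count. top_entries is a hypothetical helper and the UTF-8 encoding is a guess:

```python
import bz2


def top_entries(path, limit=10):
    """Return the first `limit` (count, name) pairs from a count-prefixed list.

    Assumes each line looks like '  913465 smith' and that the file is
    already sorted by descending count.
    """
    entries = []
    # "rt" streams decompressed text, so the whole file never sits in memory.
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            fields = line.split(None, 1)
            if len(fields) != 2 or not fields[0].isdigit():
                continue  # skip blank or malformed lines
            entries.append((int(fields[0]), fields[1].strip()))
            if len(entries) >= limit:
                break
    return entries
```

Something like top_entries("facebook-lastnames-withcount.txt.bz2") would then give you the top of one list without unpacking the whole archive to disk first.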

So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-)

Limitations

So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don't have those capabilities right now. I'd like to tackle that in the future, though, so if anybody has any bandwidth they'd like to donate, all I need is an ssh account and Nmap installed.

An additional limitation is that these are only users whose names start with a character from the Latin charset. I plan to add non-Latin names in future releases.

Ron Bowes, Jul 26, 2010

112 Responses to “Return of the Facebook Snatchers”

  1. junofeeng Says:

    With Facebook's growth, more and more hackers focus on it. Security becomes increasingly important.

  2. Paul Evans Says:

    Looks great as a dictionary for driving brute-force SSH/website attacks or similar. What's the betting that there's at least 10,000 users in that list whose password is some variation on their date of birth which, of course, they'll publish too?

  3. floyd Says:

    Nice. It's available via Google as well (even if Google doesn't like this query --> Captcha):

    inurl:"/directory/people" site:facebook.com

  4. Karel Blinker Says:

    Now what is left is generating passwords from the names (like your jsmith and smith) and trying these out to access the accounts. With that amount of data you will have many hits.

  5. kyle Says:

    Keep up the good work!

  6. Richard Ferbe Says:

    Good find! thanks for the share

  7. Sean Sullivan Says:

    Nice post. Thanks for the mention.

    171 million names (from the latin charset)... is that the A-Z but not the 1-26?

    I'm curious how often the index is updated. One of our researchers (who has had an account for a long time) isn't listed even though his options would allow it. And our CRO, Mikko Hypponen is in the index, but another Mikko Hyppönen, that can be found via Google, isn't. (And there are five M.H. overall when searching inside of Facebook.)

    In any case, if there are actually 500 million accounts, and only (only!) 171 plus million names in your torrent. Does this mean that more than half of Facebook accounts have taken the time to opt-out?

    Seems like a lot. Conventional wisdom holds that most people don't adjust their privacy settings. (But I never cared for that bit of CW.)

    Here's another fun index for you: http://www.facebook.com/family/

  8. Matt Gardenghi Says:

    Sean Sullivan: Iago probably borked his research.... ;-)

  9. MrMiGu Says:

    Have you checked the phone book lately?

  10. Ron Bowes Says:

    Hey Sean,

    Yes, I did A-Z but not 1-26. 1-26 offered some unique challenges, and I've started working on phase 2 where those will be harvested. I'm in Vegas for Blackhat/Defcon right now, though, when I get back I'm going to start working on 1-26 and everything else.

    It's really hard to answer your other questions. Facebook's 500 million claim may not be true, and their directory seems somewhat sketchy as to what they do/don't include. Additionally, I'm suspicious whether or not the directory updates as I'm going, because I might skip some/hit duplicates if it does.

    in any case, it's super cool. :D

    Now, I'm on a bus at the Hoover Dam right now, so I'd better sign off and enjoy my bus tour. See y'all soon!

  11. kats Says:

    Seems like a very roundabout way to do this:

    for ((i = 0;; i++)); do curl "http://www.facebook.com/profile.php?id=$i" > $i.html; done

    You'll get everything that Facebook has visible publicly, friends and all. You can find Zuckerberg at i=4.

  12. Liz Says:

    The difference in user numbers could be because you used Romance/European language alphabet. Since most users of Facebook aren't in the U.S., you might need to try alternative alphabets (Cyrillic, Japanese, Greek, etc.) if Facebook allows for these alternatives.

  13. wuntee Says:

    Possibly another easier/faster way - enumerate all numbers:

    http://graph.facebook.com/4

  14. winikeh Says:

    I would love to see the ncrack script to go along with this.

  15. mark Says:

    Facebook bumped up their IDs to 100 trillion and their IDs include all objects in the "graph" now, so iterating by IDs could take a very long time.

  16. YAY Says:

    Great work.

    I don't know why people are talking about Ncrack though, as you can't log in with just the username/profile id.

  17. Ulrich Says:

    They claim 500 M users, but how come I can only count to some 340 M?

    http://graph.facebook.com/340100101

    Can anybody find a higher userID?

  18. Sean Sullivan Says:

    Neil Rubenking at PCMag/Security Watch played around with the graph method. The small program that he wrote would have taken 18 years to collect all the info: http://blogs.pcmag.com/securitywatch/2010/05/facebook_id_hack_-_no_real_pro.php

  19. Ben Further Says:

    @Sean Sullivan

    Solution:
    -Define ID-Ranges (for 24 hours)
    -Find some "helper kittys"
    -And get crackin

    18Y -> ~6570D

    So u only need about 6570 ppl to get it done in a day :)

  20. Gongo Bazook Says:

    are pics also in the torrent?

  21. Matt Gardenghi Says:

    I haven't looked at the data myself, but from what Ron was saying, it is unlikely. It appears to just be data grepped from the results.

  22. Ron Bowes Says:

    Hi Gongo,

    For Phase I, I haven't downloaded the pics.

  23. Helios Says:

    Send me an email, we'll talk bandwidth for you to use.

  24. I'm on the list Says:

    I'm downloading just to see if I am on the list. I had previously set my privacy settings on Facebook to be open to anyone that looked. This is a perfect way for me to see what is truly available to the public. Even though I am not listing my info in this post (I don't need the EXTRA attention) this will help me tremendously in my experiment. I must say thank you to SkullSecurity for putting this together.
    Cheers!
    (I plan on seeding this for several weeks)

  25. Joe Crow Says:

    Unfortunately for me, my position as the number 2 blogger, which dropped to 3 back then, has fallen off completely since those good ol' days!

    Great post.
    Joe Crow

  26. Iam Furthest Says:

    uhhmm, you'd need 6570 systems you mean.. not 6570 ppl?
    try 555-rentabotnet, create one yourself, or start a folding@home like screensaver :D
    this is just to superevilmindedwhoohootakeovertheworldanddontdoanythyhingwithitwhenyouredonecauseitwasonlyforthefun kinda stuff

  27. quinametin Says:

    Is it easy to adjust the script to collect pics also?

  28. Anon Says:

    Apparently the OP doesn't have any friends on Facebook..So he has to hack his way into finding people, what a sad little man.

  29. g04t Says:

    get a life dud. srsly.

  30. Ron Bowes Says:

    Heh, I love the flamebait. Keep it coming!

  31. Anonymous Says:

    500 million user count claim is probably because of Google's top 1000 most visited sites...
    http://www.google.com/adplanner/static/top1000/

    Still, I think you should not have released this. There are other ways to get people's attention, and this will only play into the hands of those who would misuse it.

  32. Demetry Gutierrez Says:

    You should hack all your haters accounts. Good work by the way, brings to light the lack of security we call security today.

  33. Simo Says:

    for pics, you just do graph.facebook.com/zuck/picture?type=large ..and whoomp there it is!

  34. anon Says:

    Why didn't you include information other than the URL and name? Would be really useful to have the other details included. As is this is kind of worthless.

  35. Nick Fisher Says:

    I thought about doing a similar thing for Ebay accounts (using the feedback ratings), but then I said "What's the point ?"....

  36. Phil Says:

    So Demetry, people should have their accounts hacked if they disagree with this childish act, and they say people on the net aren't mature...

    Highlighting a flaw which really isn't a flaw, accessible data on a social networking site, hardly Watergate is it?

    Tune in next week when our intrepid reporter discovers that some firms sell your email address on to third parties after telling you they do...

  37. yeeeah Says:

    If only more people were seeding.

  38. Anonymous Says:

    FB may be poor with their security settings and it does need to be addressed. However, some of the details that you have made even more high profile in your release of them and the subsequent media hype may just have taken the work out of it for a non IT literate paedophile. I am sure you know that there are misguided children on FB too. Congratulations you got traffic to your site and kudos at the expense of the very audience you claim to be protecting, well done!!!!! A little more personal responsibility would have been wise perhaps.

  39. Matt Gardenghi Says:

    Anonymous: Seriously? Did you read the post? Look at the data? The fact is that this is a compilation of names. Not pictures, not email, not birthdays. These are just black and white text. You can go on FB and search for "jane doe" and come up with the same results + profile pics. That's more valuable to a pedophile than this list. Please try to read a "little" bit before posting. OK? Thanks.

  40. Hoover Says:

    But why did you do this then make it public?

    Whose side are you on?

  41. Matt Gardenghi Says:

    Hoover: That's a black/white fallacy. You are assuming that posting information makes one bad. This is no different than yellowpages.com posting your phone number. The information is public and in one place. Ron simply grabbed a chunk of it for data analysis. How is the analysis of public data bad?

    You frame this like it's a part of the vulnerability disclosure debate. It's not. This data was deliberately made public for the use of Google, Bing et al. (At least as I understand it.) So that being the case, why would Ron be taking sides (good or bad) by publishing the data that is published on search engines now?

  42. me Says:

    it wont download

  43. Ponty.net Says:

    Fascinating work. Proof above all proof that FB is nowhere near as secure as it needs to be. I disagree completely with their attempts to make information more 'searchable'.
    Personal data should not be harvestable.

  44. Anon Says:

    Bring on the Cyber War, the most deadly war yet to begin in modern age.

    They are not prepared.

  45. fak3r Says:

    I'm very interested in learning more about this, I'll try to grab the torrent tonight and bring it to DEFCON - will you have any spare time over the weekend? If you're going to work/talk about this at DEFCON, I'd like to be involved. I'm following you on twitter (@fak3r) now too, so pipe up if anything is going on. See you by the pool.

  46. David Curry Says:

    Matt already addressed this but...

    Hoover: He's obviously on the information's side.

  47. Tracksomebody Says:

    http://tracksomebody.com/?cat=118

    paste any url of a facebook image
    and it'll return their name and url to their myspace

  48. Tracksomebody Says:

    their facebook*

    sorry was working on something else at the time

  49. Matt Gardenghi Says:

    fak3r: Ron will be at Fyodor's nmap talk on Friday in the front of the room. He said to look for him there. Or drop into #skullsecurity and chat directly.

  50. Hannah Says:

    ... and?

    I could have told you there were lots of people called "John Smith" on facebook without any effort at all. I don't know what's more worrying; the time you've spent on this or the lunatics commenting on it who have completely failed to grasp the point. Matt Gardenghi and Phil excepted.

  51. Matt Gardenghi Says:

    Thanks Hannah. Nice to see another person grasp this situation for what it really is. To quote a comment I saw recently regarding the growing media circus: "It's sad that this is some of your least interesting work, and it's getting such attention." That commenter was correct. Ron's other work is far more interesting and useful: nmap scripts; conficker detection; energizer backdoor detection....

    Stick around Hannah, it's not usually this nuts here.

  52. Seth Stahlman Says:

    An observation about the ease of building names from Facebook public data seems rather timely, considering this NYT article:
    http://www.nytimes.com/2010/07/25/magazine/25privacy-t2.html?_r=3&pagewanted=1

    Would be interesting to run Ron's program in a year and do a comparison on the new data to the handy lists in the torrent, just to see what's stale and what's changed.

  53. dude Says:

    I uploaded the torrent file on rapidshare: http://rapidshare.com/files/409692690/fbdata.torrent.html

  54. loveecho Says:

    Great work!
    Thx for share!!!!
    God bless you.

  55. Concerned Says:

    The following comment has me worried, in regard to the pedophilia discussion here, privacy matters, right vs wrong etc.

    "Once I have the name and URL of a user, I can view, by default, their picture, friends, information about them, and some other details. If the user has set their privacy higher, at the very least I can view their name and picture. So, if any searchable user has friends that are non-searchable, those friends just opted into being searched, like it or not! Oops :) "

    Oops indeed. Thus there is access to pictures and details. A jackpot for many groups out there. Did you put your religion in the details or your political stance? You might be a target.

    This is why you don't put any personal info on the web and why keeping max security is always smart.

    Ron's intentions might be good, but he can serve as an inspiration to those who would want to do something similar with harmful intentions.

    Just some food for thought.

  56. Muni Says:

    Hey Ron,

    Awesome work! Really interesting. My question is how often you index the data? Every day facebook user registrations are increasing exponentially. How are you going to keep your data updated?

    Muni.

  57. Bon Says:

    Can any one upload that torrent data to hotfile.com?

  58. clyang Says:

    Hi Ron,
    Thanks for your work! I've already uploaded all the bz2 files to hotfile. If people aren't able to download with BT, please try the following links:
    http://hotfile.com/dl/58180613/b424710/facebook-f.last-withcount.txt.bz2.html
    http://hotfile.com/dl/58180632/e160598/facebook-first.l-withcount.txt.bz2.html
    http://hotfile.com/dl/58180642/9d6f4cc/facebook-firstnames-withcount.txt.bz2.html
    http://hotfile.com/dl/58180649/9b4144d/facebook-lastnames-withcount.txt.bz2.html
    http://hotfile.com/dl/58180822/5260779/facebook-names-original.txt.bz2.html
    http://hotfile.com/dl/58180985/41862da/facebook-names-unique.txt.bz2.html
    http://hotfile.com/dl/58181148/d2967b2/facebook-names-withcount.txt.bz2.html
    http://hotfile.com/dl/58181816/179517c/facebook-urls.txt.bz2.html

    Regards,

  59. kho chi Says:

    Cool Dude

    Good job, I would like to see this on the Apps side too.

    After downloading user email and personal information, I was able to re-create and populate the Open Graph data structure that relates a person to their friends.

  60. Ron Bowes Says:

    @Muni: I have no idea, really. I'm hoping to do Phase 2, with even more users, in August. We'll see how much overlap there is!

  61. Ron Bowes Says:

    @Concerned:

    People with harmful intentions are no better off after releasing the data than they were before. The data has always been there, and for all we know they've been collecting it. Raising awareness like I did can only serve to help the problem.

  62. Vid Says:

    Hi Ron,

    I am a young entrepreneur. I have an easier & legal way for you to get the name, first name, last name, gender & picture of all users. I would like to discuss this with you. Let me know if you are interested.

    Thanks

  63. what we need Says:

    We need people to do things like this. People do not realize what they are getting into when they open up a social networking site. I work in IT and fully support this exercise. I would say that you should try to get as much data as you can without breaking the law. I am even willing to lend you my bandwidth; I have 25/15 and would consider upgrading to 50/20.

  64. Jeff Says:

    I'm feeling a little dense here... What does this prove that we didn't already know before? If I want to find someone on facebook, I can do a search for them, find them, and view their public profile info. This is just a big list of people's public profile names is it not? Or am I missing something...

  65. Thema3x Says:

    @Ulrich, it's simple: the id is a consecutive number. I have an ex-girlfriend who has an id greater than 690,000,000. That's why we need to count the ids of ex-members like me. :D
    @Ron I hope you can provide us with a data dictionary for the files; it's hard to open these files without knowing how to read them.

  66. Frank Says:

    It is worth remembering that many governments already trawl social network sites, or fund such services, and thence provide data collection services for a fee:

    Google, CIA Invest in ‘Future’ of Web Monitoring
    http://www.wired.com/dangerroom/2010/07/exclusive-google-cia/

    ‘Project Indect’: An A.I. to police all of Europe
    http://rawstory.com/08/news/2009/09/20/project-indect-an-ai-to-police-all-of-europe/

  67. aiya Says:

    I don't get it. What are you all on about? It's already public information. The only point of interest is **if someone changes their privacy settings to full you still know "John Smith" has an account**, so what? I bet there are thousands of "John Smith"s, and how about George Bush, loads of them, and Heidi Woodwind, I bet she exists too. In fact, I bet all name variations exist. So what you have is ID Name lookup. And that gives you what? You can search Facebook for Heidi Woodwind now, and it shows 59 results with pictures. It even gives you variations (heidi woodland, heidi woodward). Your text doesn't. It's also available in Google as shown... I still don't understand what a 1.4GB text file gives you that this doesn't. This IS NOT a security breach, you have not got information that people didn't allow / set in their profile. My settings are highest, I am NOT in the list. So that proves this data is 100% useless. Now scan in your phone book, and torrent that for everyone to wow at, and don't forget the yellow pages. Offer it to wikileaks, they will just reject it as already public information. USELESS

  68. Matt Gardenghi Says:

    aiya: The primary purpose was to collect combinations of "real first names and last names." The fact that they are from FB is incidental. This is all about determining the frequency of real names for purposes other than FB hacking.

    Frankly, FB hacking isn't that interesting unless you are A) trying to target a company through an employee on FB or B) you are trying to exploit FB users for profit/political points/etc.

    Point A is only useful for Penetration Testing or Espionage. Point B is just illegal.

    Having a list of actual names makes brute force tools more useful. (There is the assumption that brute force tools will be used correctly within the bounds of the law by legitimate security researchers.)

  69. Andrew Says:

    But I shouldnt HAVE to protect my self, and secure everything. Only dull, boring, psychotic people with crap to hide secure every single little thing. Why can't I just enjoy my profile being out in the open. Why does a hacker (is this a black or a white hat site? I'd assume black but I can't tell) have to give away my name and url to people? It's all public information I agree, but giving it away especially since apple, at&t and TONS of other companies are downloading this, doesnt seem morally right. Keep this info for yourself, sharing it just so companies can bug the shit out of us is a bunch of crap. Good for you you are all about security, but everyones differnt, so please dont use your commie views as an excuse to post this torrent. I have mine unsecured so I can be me, and express myself. I should have to worry about a fellow hacker trying to make everyone in the world be exactly like him, probably fat as fuck and no vagina to finger. I don't want to be like you, I want to be like me, and only me.

  70. Andrew Says:

    meh and im not trying to bash you op personally, i just dont like it when people want everything to be the same about everybody. People are different for a reason.

  71. bodmin Says:

    Hi, Ron.
    Nice work, I appreciate your programming skills very much, I like programming crawlers and spiders myself. But unfortunately your script has no practical value. Do you have any idea how to cut out mails and interests of users? :-) Regards, Sergey.

  72. Andy Says:

    Want to know if you were included in these files? This web page will tell you...
    http://nohasslesites.com/FacebookNames

  73. yonose Says:

    Hello there

    You did a really good job here!!!

    I hope Ncrack also serves you well too.

    Regards.

  74. BuddyGusto.com Says:

    The real open Facebook starts with BuddyGusto.com here people share there FB likes out of their own free will with people they do not know to get new FBfriends with the same likes....

  75. Ron Bowes Says:

    To the couple of people who are complaining about the data being useless -- I know, and I completely agree. The media definitely took it the wrong way. Oh well, it's been a fun ride :)

  76. Madeline Says:

    I have been trying to download via torrent.
    Seems people are not seeding. If somebody can mail me directly would be gr8.

    For a moment last week, when this site was down, I thought the big boys (read: Facebook and all) were arm-twisting to prevent the sharing of information.

  77. Jason Says:

    I have a VPS that has plenty of bandwidth I am not using. I could probably donate some of that bandwidth towards this. I would first have to check with my VPS provider to make sure using my VPS for "research" purposes is allowed. I only use about 3 to 5 gigs of my bandwidth per month on my VPS, and it has 450 Gb/mo. bandwidth. Swing by my website and let me know if interested (let me know via commenting on any of the posts, or in the forums).

  78. Alice Says:

    I did read somewhere once that it was against Facebook's TOS to use a parser to collect data from their site. I'm surprised they didn't actually do anything to prevent it. I mean they could easily have prevented by blocking the IP when it's just using too much bandwidth in too little time. I guess they don't really care.

    I was wondering how long it took you to parse the whole site. I guess a few months with a 100mbit line? Great work.

  79. Anonymous Says:

    Much ado about nothing. I certainly don't mind though. I've found a highly interesting blog to watch because of it.

  80. Lina Says:

    wow nice work!
    I'm wondering, how can I access the different events on facebook. most of them are public and I could really use their data.

    any help?

  81. Ronald Says:

    Yes, this information is public and it's our own fault, but YOU are the one who collated it to be misused by all the immoral people out there.

    Any consequences that come of this are YOUR fault, not facebook's and not even the users'.

    Someone else probably would have done it, but they didn't - you did. Congratulations, you are a dick.

  82. Matt Gardenghi Says:

    Ronald,

    Take a deep breath. This is public info. I'm sorry that you don't like people's names being collated; I assume you've never seen the phone book before. If some fool needs this collation of data before they can do bad, then they are dumb enough that they will get caught. Intelligent crooks (OK, most dumb crooks) won't need the leg up that this data doesn't actually provide. Those that need this data were also voted "most likely recipient of the Darwin Award" in their high school yearbooks.

  83. Rick Smith Says:

    @Ronald: I think you are making the assumption that Ron is the only person/group to have collected the FB information. The real problem is that the information is available to the world. How many others have gathered the same data (and more) and keep it to themselves?

  84. David Curry Says:

    Andrew: So... Why do you get to be yourself, but we can't be ourselves? Hypocritical much? Also, ad hominem, wonderful.

  85. u1106 Says:

    Hmm, should I be glad or disappointed? I'm not on the list. (I have highly customized privacy settings, most fields in my profile are just empty, but I allow search engines)

    Actually none of the 10+ of my friends I tried now are on the list either. And I'm sure many of them have never touched their privacy settings.

    Even users that Google finds are not on the list.

    So for some reason the list is very much incomplete (these are all users with [a-z] only). So maybe the 500 million users isn't that incorrect after all.

  86. violated Says:

    well...you convinced me to delete my facebook account...

  87. create free blogs Says:

    Hey Ron,

    How was your security conference ? Your work is quite impressive. I am researching about Network forensics. what about user emails with the name list ? If not, can you let me know what additions can we make in the script to fetch emails as well.

    I am ready to contribute my bandwidth - got 50 meg per sec connection ;)..

    Regards,
    Nick

  88. Julia Says:

    I think this is a pretty sad reflection on what our world has become. Exactly why do you think it's ok to STEAL people's information? If we wanted something from you - we'd ask. You put up a "Spam protection" spot in the leave a reply section - and yet you are promoting spam . . . Hypocrite!

  89. neofutur Says:

    @julia : define "STEAL" ?
    how can you steal public information ?

    if I say "Julia posted a comment here" I stole you something too ?

  90. muj Says:

    Good that I don't have facebook

    Visit

    http://www.gbay.co.cc

  91. 4ud1t0r Says:

    This sh1t cracks me up...

    1st - Ron :
    Interesting work. Keep it up. I say "interesting" because those of us that are in the game and who read this blog know that your other work is far more "useful"...

    2nd - Everyone else:
    Go to http://www.google.com
    Type in your name.
    Anything come up?

    THIS IS THE SAME THING!!!!

    Read my lips: PUBLIC INFO IS PUBLIC!!!! Get a life!

    If nothing comes up (which I suspect will be the case with most of the posters here)... then really GET A LIFE!

  92. harrel Says:

    What... the, nobody is safe on facebook

  93. Jancis Says:

    wow this is so interesting. a list of names you got. nice. .. yawn. can you get more? what are you all so happy about this, this is boring. man you should get a life.

  94. carlo r Says:

    hey
    do you have anything that will work for Linkedin in the same manner???

  95. Ron Bowes Says:

    Not yet, but if somebody else wants to do it I'll happily post it + give them free credit. I don't really have the time right now.

  96. r3dfish Says:

    Hey guys,
    We gave a speech at DefCon 17 where we analyzed facebook scrapes using Hadoop. In this video we talk about scraping time stamps from peoples walls to map out the micro marketing implications of facebook usage. Check it out here:
    http://www.hackedexistence.com/project-facebook.html
    The complete speech can be seen here:
    http://www.hackedexistence.com/project-hadoop.html

  97. EnglishStan Says:

    What database will these files open with?

    Notepad crashes!!

  98. PES Says:

    Excuse me, can anyone help me open the files? I don't know how to open the database. Can someone help me open it, or tell me how? Thanks.

  99. Jonathan Sieling Says:

    I have all the bandwidth you need at 100mbs. I would $love$ to have the email address, employer, title, location, filled in.

    As for your next task: put a user of your choice into SQL with their entire FB history. I bet you would be able to tell how often they go to specific places, or what their regular routines are. I really want this for myself, to see what times I usually update my status and how many bars I frequent each weekend.

    Bottom line, capture ANY AND ALL data you can, and somebody will find a use for it.

  100. black shadow Says:

    What about finding a name if you just have their picture, like "tineye"? Is it possible?

  101. Ron Bowes Says:

    Interesting idea, but I would have to harvest the pictures first.

  102. Magestik Says:

    black shadow> I'm already working on this. I'm working with a guy who knows a lot about imaging (including facial recognition). I'm going to build a big database with a little script which crawls graph.facebook.com (JSON and images). I'm going to need some help ...

    Wikipedia says : "U.S. Department of State operates one of the largest face recognition systems in the world with over 75 million photographs that is actively used for visa processing."
    We can beat them ^^

  103. totzpalanz Says:

    Were you able to download the emails for these 1 million FB users?

  104. chandan Says:

    wow this is ridiculous.

    FB should be facing problems now

  105. quinametin Says:

    Magestik> I did the same :) I've tried to use OpenCV but the results are not very good... the error rate is very high. I've tried with 900,000 pics.

  106. tomas sanchez Says:

    People are starting to find uses for it. Some guy made a username dictionary out of it.

    http://www.4shared.com/file/X-gu_-UQ/facebookusernamestxt.html
    http://www.mediafire.com/?38936aa9d3jkeva
    http://rapidshare.com/files/412403215/facebook.usernames.txt.zip
    http://www.megaupload.com/?d=7LNBFEDE
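
For readers wondering how a username dictionary like this can be derived from a list of names, here is a minimal sketch. It is not the script used to build the files linked above; the function name `username_candidates` and the four patterns are assumptions chosen for illustration, starting from the first-initial-last-name idea mentioned in the post.

```python
import re

def username_candidates(full_name):
    """Return a few common username guesses for a person's name.

    Hypothetical helper for illustration only. It keeps ASCII letters
    only, so accented or non-Latin names lose characters.
    """
    # Split the lowercased name into runs of ASCII letters.
    parts = re.findall(r"[a-z]+", full_name.lower())
    if len(parts) < 2:
        # Single-word (or empty) names: nothing to combine.
        return parts[:]
    first, last = parts[0], parts[-1]
    return [
        first[0] + last,      # first initial + last name
        first + last,         # first name + last name
        first + "." + last,   # first.last
        last + first[0],      # last name + first initial
    ]

if __name__ == "__main__":
    for candidate in username_candidates("John Smith"):
        print(candidate)
```

Running each name in the list through a function like this and removing duplicates would give a wordlist suitable for a tool such as Ncrack.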

  107. Magestik Says:

    quinametin> I'm training with a few pictures (40) and the eigenfaces algorithm... With my recent improvements the software can check 1 picture in 0.01s ... And the results are always OK !

    The problem is that 0.01s per image is too slow, because it would take 22 days for 100,000,000 images ... So I have to keep increasing the speed, then I'll test it on more images.

  108. quinametin Says:

    @Magestik Maybe we can exchange some experiences :)

  109. SundayDriver Says:

    Well, I have been using the URL table to point my own crawler at profiles to capture emails, interests, etc. However, for the account I used with the crawler, email addresses are now being displayed as an image. If I use my personal account, emails are back to being text. I need help!

  110. Asim Zeeshan Says:

    I downloaded this torrent and uploaded it to my VPS. Anyone who wishes to download this data via http can do so from here.

    http://ash.li-node.com/fb_torrent.tar

    more locations on my blog
    http://www.asim.pk/2010/08/20/download-personal-details-of-100m-facebook-users/

  111. apodfuid Says:

    Wow. Anyone who doesn't know how this is useful is missing the point lol. One thing though: you should really consider putting up a software suite that applies common password variations, such as inner caps, reversing the password, adding ! at the front and end, etc. Your work in the password cracking field is very impressive, and tools such as these would be very helpful, especially if put in a suite.

  112. Ron Bowes Says:

    @apodfuid - John the Ripper already does a lot of that, so when I need to crack passwords (or generate bigger lists) that's what I do.
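
As a rough illustration of the variations apodfuid lists, here is a minimal sketch. It is not how John the Ripper implements its rules; the function name `mangle` is hypothetical, and reading "inner caps" as "uppercase everything after the first letter" is an assumption.

```python
def mangle(word):
    """Return simple variants of a base password, duplicates removed.

    Hypothetical helper for illustration only; real crackers use far
    richer rule sets.
    """
    variants = [
        word,                         # unchanged
        word.capitalize(),            # first letter upper, rest lower
        word[::-1],                   # reversed
        word[:1] + word[1:].upper(),  # assumed meaning of "inner caps"
        "!" + word,                   # ! at the front
        word + "!",                   # ! at the end
        "!" + word + "!",             # ! at both ends
    ]
    # Drop duplicates while keeping the original order.
    seen = set()
    unique = []
    for variant in variants:
        if variant not in seen:
            seen.add(variant)
            unique.append(variant)
    return unique

if __name__ == "__main__":
    for variant in mangle("password"):
        print(variant)
```

For real cracking work, the wordlist rules in John the Ripper that Ron mentions cover these cases and many more.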
