adrianszymczak / fb-crawl
Automatically exported from code.google.com/p/fb-crawl
fb-crawl.pl is a script that crawls/scrapes Facebook friends and adds their information to a database.
It can be used for social graph analysis and refined Facebook searching.

FEATURES
- Multithreaded
- Aggregates information from multiple accounts

REQUIREMENTS
- Perl 5 or greater
- MySQL

INSTALLATION
$ cd fb-crawl
$ chmod +x fb-crawl.pl
$ ./fb-crawl.pl

fb-crawl.pl will set up all the required database tables. All you have to do is provide it with the MySQL connection information and a Facebook account.

$ ./fb-crawl.pl -u email@address -host mysql.host -user fb-crawl -pass mysqlPassword

OPTIONS
-u        Facebook email address.
-p        Facebook password.
-host     MySQL server IP address or host name (default: localhost)
-port     MySQL port (default: 3306)
-user     MySQL user (default: root)
-pass     MySQL password.
-db       MySQL database (default: facebook)
-tables   MySQL table names for info, wall, and friends, in that order,
          formatted as a colon-separated list. (default: info:wall:friends)
-info     User info save method (default: append)
          append:  Appends new comma-separated information to the row and keeps the old information.
                   Useful when you want to save all user changes and don't care when they were made.
          insert:  Inserts new user information in a new row (degrades searchability).
                   Useful when you want to save all user changes and when that info was updated.
          replace: Replaces all the current user information in the database with the new information.
                   Useful when you only care about the most recent user information.
-i        Crawl user's information and add to info table.
-w        Crawl user's wall posts and add to wall table.
-f        Crawl user's friends and add to friends table.
-self     Crawl your profile too.
-t        Threads (default: 16)
-https    Use SSL encryption.
-proxy    Use an HTTP proxy, given as host[:port].
-timeout  Timeout in seconds (default: 30)
-depth    Crawl depth (default: 0)
          0 - only your friends
          1 - friends of friends
          2 - friends of friends of friends
          3 - friendception
-url      Crawl these url(s) and also crawl the user's friends if -depth > 1
          example: -url http://fb.com/profile.php?id=12345,profile.php?id=54321,john.smith.3
-name     Search for and crawl these name(s) and also crawl the user's friends if -depth > 1.
          This works by running Facebook's search and taking the first result.
          For more precision use -url.
          example: -name "John Smith, Jane Smith"
-new      Only crawl users that aren't in the database.
-old      Only crawl users that are in the database.
-plugins  Plug-ins to include.
          example: birthday2date.pl,location2LatLon.pl
-h        Help

EXAMPLES
Crawl your friends' Facebook information, wall, and friends:
$ ./fb-crawl.pl -u email@address -i -w -f

Crawl John Smith's Facebook information, wall, and friends:
$ ./fb-crawl.pl -u email@address -i -w -f -name 'John Smith'

Crawl Facebook information for friends of friends:
$ ./fb-crawl.pl -u email@address -depth 1 -i

Crawl Facebook information of John Smith's friends of friends:
$ ./fb-crawl.pl -u email@address -depth 1 -i -name 'John Smith'

Extreme: Crawl friends of friends of friends of friends with 200 threads:
$ ./fb-crawl.pl -u email@address -depth 4 -t 200 -i -w -f

MYSQL EXAMPLES
Find local singles:
SELECT `user_name`, `profile` FROM `info` WHERE `current_city` = 'My Current City, State' AND `sex` = 'Female' AND `relationship` = 'Single'

Find some Harvard singles:
SELECT `user_name`, `profile` FROM `info` WHERE `college` = 'Harvard University' AND `sex` = 'Female' AND `relationship` = 'Single'

How many Facebook employees have you crawled?
SELECT count(*) FROM `info` WHERE `company` = 'Facebook'

Find John Smith's friends:
SELECT `friends` FROM `friends` WHERE `name` = 'John Smith'

PLUG-INS
fb-crawl.pl can load a Perl script that analyzes and modifies user information before it goes into the database.
The script should contain a function with the same name as the file.
The function is passed a hash reference with the current user's information in it.

To load a plug-in use the -plugins option:
$ ./fb-crawl.pl -u email@address -i -plugins location2latlon.pl,birthday2date.pl

location2latlon.pl: This plug-in adds the user's coordinates to the database using the Google Geocoding API.
birthday2date.pl: This plug-in converts the user's birthday to MySQL date (YYYY-MM-DD) format.

See plugin files for implementation details.

FAQ
It's logging in but won't load my friends?
You probably have SSL enabled on your account. You need to use the -https option.

Can't locate object method "ssl_opts" via package "LWP::UserAgent"
You need to install LWP::Protocol::https.
$ sudo perl -MCPAN -e 'install LWP::Protocol::https'
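The plug-in convention described in the PLUG-INS section (a function named after its file, receiving a hash reference with the current user's information) might look roughly like the sketch below. The file name trim_fields.pl, the choice to edit the hash in place, and the returned reference are all assumptions made for illustration; the shipped plug-in files remain the authority on how plug-ins are actually called.

```perl
#!/usr/bin/perl
# trim_fields.pl - hypothetical plug-in sketch, not shipped with fb-crawl.
use strict;
use warnings;

# Named after the file (without the .pl extension), following the
# plug-in convention. $info is the hash reference holding the
# current user's fields.
sub trim_fields {
    my ($info) = @_;
    for my $key (keys %$info) {
        my $value = $info->{$key};
        # Skip undefined fields and anything that is not a plain string.
        next unless defined $value && !ref $value;
        $value =~ s/^\s+|\s+$//g;   # strip leading and trailing whitespace
        $value =~ s/\s+/ /g;        # collapse internal runs of whitespace
        $info->{$key} = $value;     # assumption: edits are made in place
    }
    # Assumption: returning the reference is harmless if the caller ignores it.
    return $info;
}

1;   # true value, in case the file is loaded with require
```

If the assumptions hold, it would be loaded like the bundled plug-ins:
$ ./fb-crawl.pl -u email@address -i -plugins trim_fields.pl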
I appended -i but the info table is empty. What am I doing wrong?
Original issue reported on code.google.com by [email protected]
on 10 Apr 2013 at 8:40
Is there any working link?
Original issue reported on code.google.com by [email protected]
on 15 Apr 2015 at 11:29
What steps will reproduce the problem?
1. Run: ./fb-crawl.pl -u xxx@xxx -p xxx -https -i -w -f
2. The following errors are displayed:
Use of uninitialized value $fb_user_id in numeric eq (==) at ./fb-crawl.pl line 510.
Argument "\x{6d}\x{6c}..." isn't numeric in numeric eq (==) at ./fb-crawl.pl line 510.
What is the expected output? What do you see instead?
The database tables contain no data.
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 1 Dec 2014 at 3:26
What steps will reproduce the problem?
1. Execute ./fb-crawl.pl -u [email protected] -p xxxxx -https -i -w -f
Result is:
+ Connecting to [email protected] on port 3306
+ Checking Tables
| Table "info" exists
| Table "wall" exists
| Table "friends" exists
+ Logging in...done
+ Entering depth level: 0 (your friends)
+ Loading My Name's friends. User ID: 123456789
+ Entering depth level: 1 (friends of friends)
+ Entering depth level: 2 (friends of friends of friends)
+ Entering depth level: 3 (friends of friends of friends of friends)
+ 0 profiles crawled in 8 seconds
The DB tables are created, but no data is saved into them.
Original issue reported on code.google.com by [email protected]
on 27 Jun 2013 at 3:25
! Request Failed: https://www.facebook.com/ajax/browser/list/allfriends/?uid=[markup of the Facebook mobile login page, titled "Welcome to Facebook", containing the notice "You must log in first."]&infinitescroll=1&location=friends_tab_tl&start=0&__user=0&__a=1 - Sorry, something went wrong.
Error: 0
So when I run the line "perl fb-crawl/fb-crawl.pl -u *******@hotmail.com -p password -https -f -i -w -self" I get the error above, no matter which parameters I choose.
Original issue reported on code.google.com by [email protected]
on 4 Dec 2013 at 12:28
Running the program appears to work but no data is extracted
Original issue reported on code.google.com by [email protected]
on 19 Sep 2014 at 10:01