postaddictme / instagram-java-scraper

Instagram Java Scraper. Get account information, photos, videos and comments.

Java 100.00%
instagram instagram-client instagram-java-scraper instagram-sdk instagram-java-sdk instagram-java-client instagram-api

instagram-java-scraper's Introduction

Instagram Java scraper

Instagram Java Scraper. Get account information, photos and videos without any authorization.

Get account by username

Instagram instagram = new Instagram(httpClient);
Account account = instagram.getAccountByUsername("kevin");
System.out.println(account.getMedia().getCount());

Get account by account id

Instagram instagram = new Instagram(httpClient);
Account account = instagram.getAccountById(3);
System.out.println(account.getFullName());

Get account medias

PageObject<Media> medias = instagram.getMedias("durov", 1);
System.out.println(medias.getNodes().get(0).getDisplayUrl());

Get media by code

// A bare shortcode is not a URL; this assumes the library's getMediaByCode lookup
Media media = instagram.getMediaByCode("BGY0zB4r7X2");
System.out.println(media.getOwner().getUsername());

Get media by url

Media media = instagram.getMediaByUrl("https://www.instagram.com/p/BGY0zB4r7X2");
System.out.println(media.getOwner().getUsername());
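
The two lookups above address the same post, once by shortcode and once by full URL. As a rough illustration of how the two relate, here is a minimal sketch that pulls the shortcode out of a post link. The class and method names (ShortcodeFromUrlSketch, shortcodeFromUrl) are hypothetical and not part of instagram-java-scraper, and the sketch assumes post links have the form /p/<shortcode>, as in the example above.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of instagram-java-scraper.
final class ShortcodeFromUrlSketch {

    // Returns the path segment that follows "p" in a post URL,
    // e.g. ".../p/BGY0zB4r7X2" or ".../p/BGY0zB4r7X2/".
    static String shortcodeFromUrl(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            throw new IllegalArgumentException("Not a post URL: " + url);
        }
        // Splitting "/p/CODE/" yields ["", "p", "CODE"]; trailing empty parts are dropped.
        String[] segments = path.split("/");
        for (int i = 0; i + 1 < segments.length; i++) {
            if (segments[i].equals("p") && !segments[i + 1].isEmpty()) {
                return segments[i + 1];
            }
        }
        throw new IllegalArgumentException("Not a post URL: " + url);
    }

    public static void main(String[] args) {
        // prints BGY0zB4r7X2
        System.out.println(shortcodeFromUrl("https://www.instagram.com/p/BGY0zB4r7X2"));
    }
}
```

Anything that does not contain a /p/ segment is rejected with an IllegalArgumentException rather than guessed at.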

Convert media id to shortcode

MediaUtil.getCodeFromId("1270593720437182847_3");
// OR
MediaUtil.getCodeFromId("1270593720437182847");
// Output: BGiDkHAgBF_
// So you can do like this: instagram.com/p/BGiDkHAgBF_

Convert shortcode to media id

MediaUtil.getIdFromCode("BGiDkHAgBF_");
// Output: 1270593720437182847
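
The sample values above are consistent with reading the shortcode as the numeric media id written in base 64 with a URL-safe alphabet (A-Z, a-z, 0-9, "-", "_"). The following is a minimal sketch under that assumption. It is not MediaUtil's actual implementation, and the class and method names (ShortcodeSketch, idToCode, codeToId) are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical illustration; not the library's MediaUtil.
final class ShortcodeSketch {

    // Assumed alphabet: the digit value of a character is its index in this string.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    private static final BigInteger BASE = BigInteger.valueOf(64);

    // Accepts "1270593720437182847" or "1270593720437182847_3";
    // anything after the underscore is ignored.
    static String idToCode(String mediaId) {
        int underscore = mediaId.indexOf('_');
        String numeric = underscore >= 0 ? mediaId.substring(0, underscore) : mediaId;
        BigInteger value = new BigInteger(numeric);
        if (value.signum() < 0) {
            throw new IllegalArgumentException("Media id must not be negative: " + mediaId);
        }
        if (value.signum() == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        StringBuilder code = new StringBuilder();
        while (value.signum() > 0) {
            // Peel off the least significant base-64 digit.
            BigInteger[] quotientAndRemainder = value.divideAndRemainder(BASE);
            code.append(ALPHABET.charAt(quotientAndRemainder[1].intValue()));
            value = quotientAndRemainder[0];
        }
        // Digits were collected least significant first.
        return code.reverse().toString();
    }

    static String codeToId(String code) {
        BigInteger value = BigInteger.ZERO;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("Unexpected character in shortcode: " + c);
            }
            value = value.multiply(BASE).add(BigInteger.valueOf(digit));
        }
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(idToCode("1270593720437182847_3")); // prints BGiDkHAgBF_
        System.out.println(codeToId("BGiDkHAgBF_"));           // prints 1270593720437182847
    }
}
```

BigInteger is used instead of long so that the sketch does not depend on media ids fitting into 63 bits.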

If you use this library in your project and want to help us

  • Star the project repository on GitHub
  • Open a pull request with a bug fix
  • Follow the project contributors

How to use the release version of Instagram Java scraper

Released to Maven Central as com.github.igor-suhorukov:instagramscraper:2.2

How to use the development version of Instagram Java scraper

Read more on the project's JitPack page. Open the "Commit" tab and select a revision by commit hash. Then open the Gradle or Maven tab, copy the artifact info, and add it, together with the dependency management repository, to your project's build configuration.

IDE lombok plugin

Project Lombok is a java library that automatically plugs into your editor and build tools, spicing up your java. Never write another getter or equals method again.

If IDE compilation of instagram-java-scraper fails because of missing getters/setters, set up the Lombok plugin for IntelliJ IDEA, Eclipse or NetBeans.

Set up the HTTP client to handle errors, log responses and store cookies

HttpLoggingInterceptor loggingInterceptor = new HttpLoggingInterceptor();
loggingInterceptor.setLevel(HttpLoggingInterceptor.Level.BODY);

OkHttpClient httpClient = new OkHttpClient.Builder()
        .addNetworkInterceptor(loggingInterceptor)
        .addInterceptor(new ErrorInterceptor())
        .cookieJar(new DefaultCookieJar(new CookieHashSet()))
        .build();

Other

PHP library: https://github.com/postaddictme/instagram-php-scraper

instagram-java-scraper's People

Contributors

artur-barsegyan, asahi7, bibarsov, cyrus07424, dependabot[bot], h31, igor-suhorukov, karatemaccie, kingingo, luborliu, pavelsakharchuk, posadskiy, raiym, shiawasenahoshi, tan4ek, vorkytaka, xrevxp

instagram-java-scraper's Issues

NullPointerException when using getMediasByTag

Jan 19 14:21:43 app/web.1:  java.lang.NullPointerException: null 
Jan 19 14:21:43 app/web.1:  	at java.util.ArrayList.addAll(ArrayList.java:581) 
Jan 19 14:21:43 app/web.1:  	at me.postaddict.instagram.scraper.request.GetMediaByTagRequest.updateResult(GetMediaByTagRequest.java:29) 
Jan 19 14:21:43 app/web.1:  	at me.postaddict.instagram.scraper.request.GetMediaByTagRequest.updateResult(GetMediaByTagRequest.java:13) 
Jan 19 14:21:43 app/web.1:  	at me.postaddict.instagram.scraper.request.PaginatedRequest.requestInstagramResult(PaginatedRequest.java:48) 
Jan 19 14:21:43 app/web.1:  	at me.postaddict.instagram.scraper.Instagram.getMediasByTag(Instagram.java:136) 

Please update recent source for mvn or jitpack

I pulled in instagram-java-scraper with Gradle.

I'm using 'com.github.postaddictme:instagram-java-scraper:0.3.0'.

Could you publish the latest source to Maven or JitPack?

I want to use 'public PageObject getMedias(String username, int pageCount)'.

Please!

400 Error

I am running 8 accounts with pauses of 240 seconds between every like and follow, based on hashtags. I am getting a 400 error for every like attempt. Any ideas how to correct that?

Not An Issue - (Request) - Nice To have

First of all, thank you for creating this; it has helped me develop an app on top of yours. I think it would be nice to provide a browser login using SWT, so we can see prompts like "Was Me" or "Not Me", or verification emails. This could be useful for handling multiple accounts using proxies.

Is there some way to feed the current class the login authentication obtained from the SWT browser GUI?

Get 403 when I try to login

Hey,
I am trying to log in, and I get:
<-- 403 Forbidden https://www.instagram.com/accounts/login/ajax/

The username and password are OK; I can log in with them manually.

This is my code:

 val loggingInterceptor = new HttpLoggingInterceptor
 loggingInterceptor.setLevel(HttpLoggingInterceptor.Level.BODY)
 val httpClient: OkHttpClient =
    new OkHttpClient.Builder()
      .addNetworkInterceptor(loggingInterceptor)
      .addInterceptor(new UserAgentInterceptor(UserAgents.OSX_CHROME))
      .addInterceptor(new ErrorInterceptor)
      .cookieJar(new DefaultCookieJar(new CookieHashSet))
      .build

  val client = new Instagram(httpClient)
  client.login(username, password)
  client.basePage()

What am I missing here?

imageHighResolutionUrl returns null

In my tests, imageHighResolutionUrl never works. It just returns null even though a high-resolution image is there.

account = instagram.getAccountByUsername("eunjung.hahm");
List<Media> medias = instagram.getMedias("eunjung.hahm", 5);
for (Media media : medias) {
	System.out.println(media.imageHighResolutionUrl);
	System.out.println(media.imageLowResolutionUrl);
}

Error: Account with given username does not exist.

Below is the error I get.

me.postaddict.instagramscraper.exception.InstagramNotFoundException: Account with given username does not exist.
at me.postaddict.instagramscraper.Instagram.throwExceptionIfError(Instagram.java:326)
at me.postaddict.instagramscraper.Instagram.getAccountById(Instagram.java:131)

Understand the limits of the lib api.

Hey!
I'm not sure if this is the right place for this, but I wanted to understand the limits of using this API.
Since requests don't need to be authenticated, how do the limits apply? By IP?
If so, what are the limits?
Also, what are the limits for an authenticated user (using the login method)?
In the Instagram documentation, the limits are described only for requests that use the access_token.

CarouselResource always null

I'm assuming that a CarouselResource is a collection of images related to a parent Media (example https://www.instagram.com/p/BcaiZ8UFYyY/). When I get media by

PageObject<Media> medias = instagram.getMedias(accountName, currentPage);

the array is always null. Is my assumption about what this should be correct?

Get an error when calling instagram.getMediasByTag

When I call instagram.getMediasByTag, I get the following error:

me.postaddict.instagramscraper.exception.InstagramException: Response code is not equal 200. Something went wrong. Please report issue.
at me.postaddict.instagramscraper.Instagram.getMediasByTag(Instagram.java:196)
at Driver.main(Driver.java:101)

This used to work, but all of a sudden it stopped working.

getMediaByUrl

I tried to execute this code

Instagram instagram = new Instagram();
media = instagram.getMediaByUrl("https://www.instagram.com/p/BGY0zB4r7X2/");

and got InstagramNotFoundException: Account with given username does not exist.

Can't login. Get http status 400

Hey,

I am not sure this is a problem in the lib, but for the past few days my login attempts have failed.

  val httpClient: OkHttpClient =
  new OkHttpClient.Builder()
    .addInterceptor(new UserAgentInterceptor(UserAgents.OSX_CHROME))
    .addInterceptor(new ErrorInterceptor)
    .cookieJar(new DefaultCookieJar(new CookieHashSet))
    .build

  val client = new Instagram(httpClient)

  def login(): Unit = {
    client.basePage()
    client.login(credentials.username, credentials.password)
    client.basePage()
  }

I use the same code locally and manage to log in, but once I upload it to my server, which is in a different location from me, it fails.
Do you have any suggestions on how to handle this?

Follow & Unfollow

Does the program currently handle that? I haven't been able to find it or make it work.

Follower parser

Greetings! Someone wrote in the comments on Habr that there are ideas on how to export followers; please share them. I know how to export, as JSON, the accounts a user follows, but I don't know how to export the followers. My only idea is to use Selenium WebDriver, but it won't export more than 1000 followers.

XML Unmarshall Document

Hey, when I try to do this:

HttpLoggingInterceptor loggingInterceptor = new HttpLoggingInterceptor();
loggingInterceptor.setLevel(HttpLoggingInterceptor.Level.BODY);

OkHttpClient httpClient = new OkHttpClient.Builder()
        .addNetworkInterceptor(loggingInterceptor)
        .addInterceptor(new ErrorInterceptor())
        .cookieJar(new DefaultCookieJar(new CookieHashSet()))
        .build();

Instagram instagram = new Instagram(httpClient);

PageObject<Media> media = instagram.getMedias("internetgangster_", 1);

I receive the following exception:

javax.xml.bind.UnmarshalException
 - with linked exception:
[Exception [EclipseLink-25004] (Eclipse Persistence Services - 2.7.0.v20170811-d680af5): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document
Internal Exception: javax.json.stream.JsonParsingException: Unexpected char 60 at (line no=1, column no=1, offset=0)]

This has worked before, about 30 times, but now I am receiving this error.

I thought these were just limits from Instagram, but I have tried running this both on my server and locally, and it crashes in both cases...

Instagram.Login method

I'm getting the following error when using the Login(username,password) method on Instagram Class.

Exception in thread "main" com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: (okio.RealBufferedSource$1); line: 1, column: 2]
	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1798)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:561)
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2625)
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:826)
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:723)
	at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4129)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3988)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3065)
	at me.postaddict.instagram.scraper.mapper.ModelMapper.isAuthenticated(ModelMapper.java:136)
	at me.postaddict.instagram.scraper.Instagram.login(Instagram.java:74)

Cannot find account by user id

The getAccountById function returns a null value in the current version. Before that there was a response code issue, so I was using an earlier version. getAccountByUsername works fine.
In the parse() method of the HttpUrl class, the builder threw a NullPointerException.

Error 403 at login

I'm trying to log in using this useful program, but I get a 403 error. How can I solve the issue?

Can I get my account's all media with paging?

My Instagram media count is over 4000.

Executing getMedias(account,4000) fails with an error,

but executing getMedias(account,2000) works.

Can I get all of my account's media (4000)?

Or can I get all of my account's media with paging?

Issue with "ModelMapper.java"

In this code section:
return getUnmarshallerCache().computeIfAbsent(mappingFile, mapping -> ThreadLocal.withInitial(() -> getUnmarshaller(mapping)));
the compiler shows the error "Cannot find symbol: getUnmarshallerCache()".

What is the getUnmarshallerCache() method? Where is it defined?
Thanks!

New collaborator

Since @igor-suhorukov has been actively contributing to the Java library and made the latest major refactoring, I decided to give him write permission to this repo.
(I reserve the right to withdraw the permission in the future.)

@igor-suhorukov, thank you for your active work.

Getting absolute image

Hi,
When I try to get media from an account, the images I get are cropped. I want to get the images uncropped.

How is that possible?
Thank you

NullPointerException when using getMediasByTag

Using the new 2.0 version I get:

2018-01-17T20:59:39.519034+00:00 app[web.1]: java.lang.NullPointerException: null
2018-01-17T20:59:39.519035+00:00 app[web.1]: 	at me.postaddict.instagram.scraper.mapper.ModelMapper.mapTag(ModelMapper.java:81)
2018-01-17T20:59:39.519037+00:00 app[web.1]: 	at me.postaddict.instagram.scraper.request.GetMediaByTagRequest.mapResponse(GetMediaByTagRequest.java:40)
2018-01-17T20:59:39.519038+00:00 app[web.1]: 	at me.postaddict.instagram.scraper.request.GetMediaByTagRequest.mapResponse(GetMediaByTagRequest.java:13)
2018-01-17T20:59:39.519039+00:00 app[web.1]: 	at me.postaddict.instagram.scraper.request.PaginatedRequest.requestInstagramResult(PaginatedRequest.java:39)
2018-01-17T20:59:39.519040+00:00 app[web.1]: 	at me.postaddict.instagram.scraper.Instagram.getMediasByTag(Instagram.java:136)

testGetAccountById is broken

Exception message:
me.postaddict.instagramscraper.exception.InstagramException: Response code is not equal 200. Something went wrong. Please report issue.
Actual response code is 405 (Method Not Allowed)

videoUrl is null...

Sorry for opening another issue, but I did search the repository first. After I received the reply, I tried version 885979b instead of SNAPSHOT, but the result is the same.

The log is:

PageObject(nodes=[Media(height=300, width=480, displayUrl=https://scontent-icn1-1.cdninstagram.com/vp/xxxxxx.jpg, videoUrl=null, displayResources=[DisplayResource(src=https://scontent-icn1-1.cdninstagram.com/vp/xxxxxx.jpg, width=150, height=150), isVideo=true, shouldLogClientEvent=null, trackingToken=null, mediaType=GraphVideo, id=16896257940972xxxxx, shortcode=BdywbUxxxxx, gatingInfo=null, caption=testxxxxxxxxxxx
, commentCount=896, commentPreview=null, firstComments=null, commentsDisabled=false, captionIsEdited=null, takenAtTimestamp=1515639206000, likeCount=63845, videoViewCount=null, firstLikes=null, location=null, owner=Account(...), viewerHasLiked=null, viewerHasSaved=null, viewerHasSavedToCollection=null, isAdvertising=null, carouselMedia=null, taggedUser=null, lastUpdated=Fri Jan 12 10:26:50 KST 2018)...

The log above is for a video, but the fields I need (videoUrl, videoViewCount, carouselMedia...) are null.
Also, how do I get the hit count for a non-video item such as a photo?

getMediaByUrl Issue

Hello!

Instagram instagram = new Instagram(new OkHttpClient()); 
Media media = instagram.getMediaByUrl("https://www.instagram.com/p/BWizrpZgg-w/");

This code throws an exception. It seems that the video mapping is wrong.

Exception in thread "main" java.lang.IllegalArgumentException: javax.xml.bind.UnmarshalException
 - with linked exception:
[Exception [EclipseLink-25004] (Eclipse Persistence Services - 2.7.0.v20170811-d680af5): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document
Internal Exception: javax.json.stream.JsonParsingException: Unexpected char 60 at (line no=1, column no=1, offset=0)]
	at me.postaddict.instagram.scraper.mapper.ModelMapper.mapObject(ModelMapper.java:134)
	at me.postaddict.instagram.scraper.mapper.ModelMapper.mapMedia(ModelMapper.java:49)
	at me.postaddict.instagram.scraper.Instagram.getMediaByUrl(Instagram.java:101)
	at me.postaddict.instagram.scraper.Main.main(Main.java:11)
Caused by: javax.xml.bind.UnmarshalException
 - with linked exception:
[Exception [EclipseLink-25004] (Eclipse Persistence Services - 2.7.0.v20170811-d680af5): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document
Internal Exception: javax.json.stream.JsonParsingException: Unexpected char 60 at (line no=1, column no=1, offset=0)]
	at org.eclipse.persistence.jaxb.JAXBUnmarshaller.handleXMLMarshalException(JAXBUnmarshaller.java:1110)
	at org.eclipse.persistence.jaxb.JAXBUnmarshaller.unmarshal(JAXBUnmarshaller.java:172)
	at me.postaddict.instagram.scraper.mapper.ModelMapper.mapObject(ModelMapper.java:132)
	... 3 more
Caused by: Exception [EclipseLink-25004] (Eclipse Persistence Services - 2.7.0.v20170811-d680af5): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document
Internal Exception: javax.json.stream.JsonParsingException: Unexpected char 60 at (line no=1, column no=1, offset=0)
	at org.eclipse.persistence.exceptions.XMLMarshalException.unmarshalException(XMLMarshalException.java:120)
	at org.eclipse.persistence.internal.oxm.record.json.JsonStructureReader.parse(JsonStructureReader.java:146)
	at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:938)
	at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:414)
	at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:390)
	at org.eclipse.persistence.internal.oxm.XMLUnmarshaller.unmarshal(XMLUnmarshaller.java:394)
	at org.eclipse.persistence.jaxb.JAXBUnmarshaller.unmarshal(JAXBUnmarshaller.java:156)
	... 4 more
Caused by: javax.json.stream.JsonParsingException: Unexpected char 60 at (line no=1, column no=1, offset=0)
	at org.glassfish.json.JsonTokenizer.unexpectedChar(JsonTokenizer.java:532)
	at org.glassfish.json.JsonTokenizer.nextToken(JsonTokenizer.java:415)
	at org.glassfish.json.JsonParserImpl$NoneContext.getNextEvent(JsonParserImpl.java:222)
	at org.glassfish.json.JsonParserImpl$StateIterator.next(JsonParserImpl.java:172)
	at org.glassfish.json.JsonParserImpl.next(JsonParserImpl.java:149)
	at org.glassfish.json.JsonReaderImpl.read(JsonReaderImpl.java:84)
	at org.eclipse.persistence.internal.oxm.record.json.JsonStructureReader.parse(JsonStructureReader.java:138)
	... 9 more

Checkpoint required

{"message": "checkpoint_required", "checkpoint_url": "/challenge/3472751680/JNLuOCTsMd/", "lock": false, "status": "fail"}

I've been getting this error message for some time
Does anyone know how to solve it?

Thanks
Alexandre

getLikes

Is it possible to get all media likes?

Login Verification

Every time I perform a login, it returns the message ok, even if the user has an incorrect password. How do I actually confirm that the user is logged in?
