
Comments (10)

iSarabjitDhiman commented on August 22, 2024

If you are using the async-await branch, use the following (now also available on the master branch; check the code below):

from tweeterpy import TweeterPy
from tweeterpy import util

twitter = TweeterPy()
# get tweets or other data
data = twitter.get_user_tweets("elonmusk",total=50)

# Get data by key from the nested python dict: just pass in the dataset and the key name you want to extract.

# NOTE: there might be multiple keys with the same name, e.g. "id" (it could be the id of a tweet, a user, a conversation thread, etc.). Try to pass a unique key, or pass a dataset with unique keys.
usernames = util.find_nested_key(data,"screen_name")
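The duplicate-key caveat above is easy to see on a toy dataset. The helper and sample dict below are made up purely for illustration and do not call tweeterpy; they just mimic a recursive key search:

```python
def collect_key(data, key, out=None):
    # Recursively collect every value stored under `key`, at any
    # depth, inside nested dicts and lists.
    if out is None:
        out = []
    if isinstance(data, dict):
        if key in data:
            out.append(data[key])
        for value in data.values():
            collect_key(value, key, out)
    elif isinstance(data, list):
        for item in data:
            collect_key(item, key, out)
    return out

# "id" appears for both the tweet and its author, so a bare "id"
# lookup matches twice -- prefer a unique key like "screen_name".
sample = {"tweet": {"id": 111, "user": {"id": 222, "screen_name": "demo"}}}

print(collect_key(sample, "id"))           # -> [111, 222]
print(collect_key(sample, "screen_name"))  # -> ['demo']
```

This is why passing a unique key (or a dataset that only contains the records you care about) gives cleaner results.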

If you are using the master branch, just define the following function in your project and use it as a normal function.

#> It is now available on the master branch as well; import it from the tweeterpy.util module. Check the code below for more details.
from functools import reduce

def find_nested_key(dataset=None, nested_key=None):
    def get_nested_data(dataset, nested_key, placeholder):
        # Parenthesize the isinstance check so the `and dataset` guard
        # applies to both lists and dicts.
        if isinstance(dataset, (list, dict)) and dataset:
            if isinstance(dataset, list):
                for item in dataset:
                    get_nested_data(item, nested_key, placeholder)
            if isinstance(dataset, dict):
                if isinstance(nested_key, tuple) and nested_key[0] in dataset:
                    value = reduce(lambda data, key: data.get(key, {})
                                   if isinstance(data, dict) else {}, nested_key, dataset)
                    if value:
                        placeholder.append(value)
                if isinstance(nested_key, str) and nested_key in dataset:
                    placeholder.append(dataset.get(nested_key))
                for item in dataset.values():
                    get_nested_data(item, nested_key, placeholder)
        return placeholder
    return get_nested_data(dataset, nested_key, [])

tweets_text = find_nested_key(data,"full_text")

Edit: You don't have to do this manually anymore; it has been implemented in the master branch as well.

from tweeterpy import TweeterPy
from tweeterpy import util

twitter = TweeterPy()
data = twitter.get_user_tweets("elonmusk", total=50)
usernames = util.find_nested_key(data, "screen_name")
tweets_text = util.find_nested_key(data, "full_text")

# Just updated find_nested_key function to accept nested_key as a tuple as well.
tweets_creation = util.find_nested_key(data,("tweet_results","result","legacy","created_at"))
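The tuple form can be sketched on a made-up dict with the same nesting as a tweet payload. `get_path` below is a simplified stand-in for what a tuple lookup does, not the tweeterpy implementation:

```python
from functools import reduce

def get_path(data, path):
    # Walk an explicit key path, falling back to {} as soon as a
    # key is missing or a non-dict node is reached.
    return reduce(lambda node, key: node.get(key, {})
                  if isinstance(node, dict) else {}, path, data)

# Invented sample data that only mimics the shape of Twitter's response.
sample = {"tweet_results": {"result": {"legacy": {
    "created_at": "Thu Jul 27 09:21:51 +0000 2023"}}}}

print(get_path(sample, ("tweet_results", "result", "legacy", "created_at")))
# -> Thu Jul 27 09:21:51 +0000 2023
```

An explicit path disambiguates between, say, a user's `created_at` and a tweet's `created_at`.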

ihabpalamino commented on August 22, 2024

Thank you for your hard work. If I may ask, how many tweets can I scrape per month?
And a second question: when I use pip install, I install from the master branch. How can I switch to the async-await branch?

iSarabjitDhiman commented on August 22, 2024

Thank you for your hard work. If I may ask, how many tweets can I scrape per month?
And a second question: when I use pip install, I install from the master branch. How can I switch to the async-await branch?

To check the rate limits, use the async-await branch; pretty much all of the functions have a "return_rate_limit" argument. Just set it to True. Take a look at #8. It will return the hourly rate limits. You can also google the Twitter API rate limits to get an idea of how many requests you can make per day.

NOTE: If the rate limit is, say, 2000 per day, that doesn't mean you can get only 2000 tweets a day. It means you can make 2000 requests a day. How much data you get depends on the type of data you are requesting. If you are requesting user data, each request returns one user's data, so that would be 2000 users in this case. But for tweets, sometimes each request returns 30-50 tweets, and other times around 100. So it's better to keep an eye on those rate limits. The best way is to make a request and then check the API limit stats to see how many requests it cost.
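The "requests, not tweets" arithmetic above can be sketched with a small client-side budget tracker. Everything below (class name, numbers) is invented for illustration and is not part of the tweeterpy API:

```python
class RateBudget:
    """Track how many of your daily API calls are left."""

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        self.used = 0

    def spend(self, calls):
        # Record the number of requests a fetch actually cost.
        self.used += calls

    @property
    def remaining(self):
        return self.daily_limit - self.used

budget = RateBudget(daily_limit=2000)
budget.spend(3)          # e.g. one paginated tweet fetch cost 3 requests
print(budget.remaining)  # -> 1997
```

The point is that a 2000/day limit is consumed per request, so a paginated fetch of 50 tweets may cost several units of that budget.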

# to check rate limits for user friends.

twitter.get_friends('',follower=True,return_rate_limit=True)

# to check limits for user tweets.

twitter.get_user_tweets('',return_rate_limit=True)

# it will return the total number of api calls allowed and the remaining api calls. You can get an idea from there.

Check this guide to switch to the async-await branch.

Feel free to close the issue if you got what you were looking for.

ihabpalamino commented on August 22, 2024

Using created_at = util.find_nested_key(user_tweets, "created_at") I got
['Wed Sep 30 19:02:27 +0000 2009', 'Thu Jul 27 09:21:51 +0000 2023', ... — both the creation date of the post and the creation date of the account. How can I get only the creation date of the post?

iSarabjitDhiman commented on August 22, 2024

As I mentioned earlier, there might be multiple keys with the same name in a single dataset; the "created_at" key is used for users as well as for tweets. You can just use a for loop. The nested location of created_at for tweets is ['content']['itemContent']['tweet_results']['result']['legacy'].

So a quick fix in your case is:

user_tweets = twitter.get_user_tweets("elonmusk", total=20)

[util.find_nested_key(tweet['content']['itemContent']['tweet_results']['result']['legacy'],"created_at") for tweet in user_tweets[0]['data']]

Edit: Just updated the find_nested_key function to accept nested_key as a tuple as well.
So the updated/better solution is:

user_tweets = twitter.get_user_tweets("elonmusk", total=20)

tweets_creation = util.find_nested_key(user_tweets,("tweet_results","result","legacy","created_at"))

ihabpalamino commented on August 22, 2024

As I mentioned earlier, there might be multiple keys with the same name in a single dataset; the "created_at" key is used for users as well as for tweets. You can just use a for loop. The nested location of created_at for tweets is ['content']['itemContent']['tweet_results']['result']['legacy'].

So a quick fix in your case is:

user_tweets = twitter.get_user_tweets("elonmusk", total=20)

[util.find_nested_key(tweet['content']['itemContent']['tweet_results']['result']['legacy'],"created_at") for tweet in user_tweets[0]['data']]

Is it possible to know where the data list or dictionary containing all the keys is, just to know, for example, which keys to use to get only the URL of the post or the id of the post? And thanks for being up to date.

iSarabjitDhiman commented on August 22, 2024

As I mentioned earlier, there might be multiple keys with the same name in a single dataset; the "created_at" key is used for users as well as for tweets. You can just use a for loop. The nested location of created_at for tweets is ['content']['itemContent']['tweet_results']['result']['legacy'].
So a quick fix in your case is:
user_tweets = twitter.get_user_tweets("elonmusk", total=20)
[util.find_nested_key(tweet['content']['itemContent']['tweet_results']['result']['legacy'],"created_at") for tweet in user_tweets[0]['data']]

Is it possible to know where the data list or dictionary containing all the keys is, just to know, for example, which keys to use to get only the URL of the post or the id of the post? And thanks for being up to date.

Hey @ihabpalamino

You can take a look at the official Twitter API website to see if they have posted some sample responses. Otherwise you are going to have to navigate through the response yourself to understand those key-value pairs. Just grab one of the results from the list; the other results are quite similar most of the time.

I just updated the find_nested_key function; it now takes nested_key as a tuple as well. It's easier this way to deal with multiple similar keys. Check the usage here.
Let me know if this is what you wanted.
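One generic way to navigate an unfamiliar response yourself is to walk every key path in the nested structure and print where each value lives. The helper and the sample dict below are hypothetical illustrations, not part of tweeterpy:

```python
def key_paths(data, prefix=()):
    # Yield (path, value) for every leaf in a nested dict/list
    # structure, so you can discover where fields like a tweet id
    # or screen name are located.
    if isinstance(data, dict):
        for key, value in data.items():
            yield from key_paths(value, prefix + (key,))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from key_paths(item, prefix + (index,))
    else:
        yield prefix, data

# Invented sample that only mimics the response shape.
sample = {"result": {"rest_id": "111", "legacy": {"full_text": "hi"}}}

for path, value in key_paths(sample):
    print(" -> ".join(map(str, path)), "=", value)
```

Running this on one real result from the response prints a flat map of paths, which makes it easy to pick the exact tuple to pass to find_nested_key.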

ihabpalamino commented on August 22, 2024

Thanks, but I still can't get the URL of each post that directs me to the tweet.

iSarabjitDhiman commented on August 22, 2024

Thanks, but I still can't get the URL of each post that directs me to the tweet.

Twitter doesn't send a direct URL to the tweets in this dataset. You will have to build it on your own.

The tweet URL structure is:
https://www.twitter.com/username/status/tweet_id

user_tweets = twitter.get_user_tweets("elonmusk",total=10)

for user in user_tweets:
    for tweet in user["data"]:
        # skip promoted tweets
        if tweet.get("entryId","").startswith("promote"):
            continue
        tweet_id = util.find_nested_key(tweet,("tweet_results","result","rest_id"))
        username = util.find_nested_key(tweet,("user_results","result","legacy","screen_name"))
        if tweet_id and username:
            print(f"https://www.twitter.com/{username[0]}/status/{tweet_id[0]}")

iSarabjitDhiman commented on August 22, 2024

Assuming this is what you were looking for, I am closing the issue.
