Using Twitter With Python, Part 2 – Connecting Python and Twitter

Introduction

There are a great many things you might want to do with Twitter that the web and mobile clients don’t support. For example, you might want to run a script that automatically thanks anyone who follows you. Or you might want to run a script that Likes any comment that someone adds to your post.
It is worth noting that Twitter is extremely strict when it comes to automated actions around followers. For example, it’s entirely possible to scrape the follower list of a large account and write a script to automatically follow all of those people. That would conceivably get you a large number of followers, and when you were done you could just write another script to unfollow anyone who didn’t follow you back. I promise that Twitter will pick up on what you’re doing and put you in Twitter Jail. Don’t do that.
In this post we’ll examine how to make a basic connection to the Twitter API, using Python and Tweepy.    We’ll investigate one of the errors you might encounter, and discuss the pagination of the results that the API returns. The example we’re using will return a list of the followers of a Twitter account.
You’ll need some basic Python knowledge to follow along with this tutorial, or at least a strong comfort level with general programming practices.

Using Tweepy

Tweepy is a Python library that acts as a wrapper around the Twitter API. The easiest way to install it is with pip:

pip install tweepy

More information about the installation of Tweepy can be found here.

Using the get_followers() method

We’re going to use the get_followers() method to learn about Tweepy. When learning how to use a new method, the first thing we want to do is check the documentation. The documentation for the get_followers() method is here.

Under Resource Information, we have a table of information:

Response formats: JSON
Requires authentication: Yes
Rate limited: Yes
Requests per 15-minute window (user auth): 15
Requests per 15-minute window (app auth): 15

This tells us a number of things.  One important detail about the method is that the maximum number of requests we can make is 15, every 15 minutes.  Once that threshold is reached, the API will make us wait before it will accept any more requests. It might not sound like it at first, but this is quite a severe limitation.   Not only can you only make a certain number of requests in a given time frame, but each request will only return a certain number of results.

Under the Parameters section on that same page, you’ll find more information about this method. The parameter that we’re interested in is called count:

“The number of users to return per page, up to a maximum of 200. Defaults to 20.”

So, if you simply request a list of followers, the API will by default only give you 20 results.   If you specify a maximum count of 200 (we’ll explore this more in a bit), you still only get a list of 200 followers.   That leaves us with two questions:

  1. How do you get multiple pages of followers?
  2. What happens if you have more than 3,000 followers (200 followers per request * 15 requests)?
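To put numbers on that second question, here is a rough sketch of the arithmetic. It assumes the limits quoted above (15 requests per 15-minute window, at most 200 followers per request); the function names are made up for illustration and are not part of Tweepy.

```python
import math

# Limits taken from the documentation table and the count parameter above.
REQUESTS_PER_WINDOW = 15
WINDOW_MINUTES = 15
MAX_PER_PAGE = 200


def followers_per_window(per_page=MAX_PER_PAGE):
    """How many followers one 15-minute window can return."""
    return REQUESTS_PER_WINDOW * per_page


def minutes_of_waiting(total_followers, per_page=MAX_PER_PAGE):
    """Minutes spent waiting on the rate limit to list every follower."""
    pages = math.ceil(total_followers / per_page)
    windows = math.ceil(pages / REQUESTS_PER_WINDOW)
    # The first window can be used right away; each extra one means a wait.
    return max(windows - 1, 0) * WINDOW_MINUTES


print(followers_per_window())        # with count=200
print(followers_per_window(20))      # with the default count of 20
print(minutes_of_waiting(10_000))    # an account with 10,000 followers
```

With the default page size of 20, a single window only covers 300 followers, which is why setting count to 200 matters so much.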

We’ll explore the answer to both of these questions. First, let’s take a look at a very basic example of working with Python and the Twitter API.

Connecting Python and Twitter

All of the code examples we use in this series of blog posts will make use of the same basic setup.   We need to import Tweepy, we need to set up the OAuth connection, and we need to define the API as something we can work with.  If you copy this code, ensure you replace the placeholders with your own OAuth credentials.

(You’re welcome to use whichever Python IDE works best for you.  I use Spyder.)

import tweepy

consumerKey = "<your API key goes here>"
consumerSecret = "<your API secret goes here>"
accessToken = "<your access token goes here>"
accessTokenSecret = "<your access token secret goes here>"
#Set the access credentials

auth = tweepy.OAuthHandler(consumerKey, consumerSecret)
auth.set_access_token(accessToken, accessTokenSecret)
#define the oauth parameters

api = tweepy.API(auth)
#Define the Twitter API and tell it to use the OAuth settings
#-----This is the end of the "set-up" portion of the script-----

followers = api.get_followers()
#Request one page of followers for the authenticated user
for user in followers:
    #Loop through the followers that were returned
    print(user.name)
    #Print the name

The “followers” variable is where it gets interesting. We assign it the result of calling the get_followers() method on the already-defined “api”. Because we simply ask it to retrieve a list of followers, it assumes we want a list of followers associated with the authenticated user. If we wanted to specify which user’s followers to return, we would add that user as a parameter:

followers = api.get_followers(screen_name="ken_mcclean")

This also works with other identifiers, such as the numeric account ID (the user_id parameter).

Having done that, “followers” should now contain a list of the followers of the authenticated account. Remember that we haven’t addressed either of the above questions yet. We’ll get to those shortly.

Now we have a list of followers.  If we loop through the list and simply print each “user”, we are presented with a massive list of attributes for each user.  Assuming we only want to know the name of the user, we can specify that information using dot notation.  The example above uses dot notation to specify that we only want the “name” of each “user”.
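If you want to experiment with dot notation before connecting to Twitter, the sketch below uses stand-in objects instead of real API results. The two followers and their values are invented for illustration; name, screen_name and followers_count are attributes that Tweepy user objects are assumed to expose.

```python
from types import SimpleNamespace

# Stand-ins for the user objects the API returns (made-up data).
followers = [
    SimpleNamespace(name="Ada Lovelace", screen_name="ada", followers_count=1200),
    SimpleNamespace(name="Alan Turing", screen_name="alan", followers_count=3400),
]

# Dot notation picks a single attribute out of each user object.
names = [user.name for user in followers]
handles = ["@" + user.screen_name for user in followers]

for user in followers:
    print(user.name, "has", user.followers_count, "followers")
```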

If we run the script, we should be presented with a list of twenty users. Remember, that’s the default number of results per page, and we haven’t asked the API for more than one page.

Your First Tweepy Error – Positional Arguments

While working with the API, you may encounter an error that says something like this:

TypeError: get_followers() takes 1 positional argument but 2 were given

This generally means that you have passed a parameter to a method without specifying what that parameter means. In other words, you may have done something like this:

followers = api.get_followers("ken_mcclean")

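Notice that we haven’t told the method what sort of value “ken_mcclean” actually is. The method used to accept positional arguments like this, but it now accepts keyword arguments only. This still leaves us with a question: if we only passed one argument, what is the second positional argument referenced in the error? The “invisible” argument is the api object itself. When you call a method on an object, Python automatically passes that object as the first positional argument (conventionally named self), so “ken_mcclean” is counted as the second.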
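You can reproduce this error without touching Twitter at all. The sketch below uses a stand-in class, FakeAPI, whose get_followers() accepts keyword arguments only; it is an assumption made for illustration, not Tweepy’s actual code. It shows that the object the method is called on counts as the first positional argument.

```python
class FakeAPI:
    """Stand-in for the real API object (hypothetical, for illustration)."""

    def get_followers(self, **kwargs):
        # Only keyword arguments are accepted; self is the one positional.
        return kwargs


api = FakeAPI()

# Passing the screen name positionally: api + "ken_mcclean" = 2 positionals.
try:
    api.get_followers("ken_mcclean")
    message = "no error"
except TypeError as error:
    message = str(error)
print(message)

# Naming the parameter works as expected.
result = api.get_followers(screen_name="ken_mcclean")
print(result)
```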

Increasing the Number of Results With the Cursor

The Twitter API uses rate limiting. In other words, you can only make so many calls or requests to the service in a given period of time.

Let’s examine the rate limiting you’d encounter if you wanted to return a list of everyone who follows your account.  Consider the following code:

import tweepy

consumerKey = "<your API key goes here>"
consumerSecret = "<your API secret goes here>"
accessToken = "<your access token goes here>"
accessTokenSecret = "<your access token secret goes here>"
#Set the access credentials

auth = tweepy.OAuthHandler(consumerKey, consumerSecret)
auth.set_access_token(accessToken, accessTokenSecret)
#define the oauth parameters

api = tweepy.API(auth)
#Define the Twitter API and tell it to use the OAuth settings
#-----This is the end of the "set-up" portion of the script-----

for user in tweepy.Cursor(api.get_followers,count=200).items():
    #Use the cursor to paginate
    print(user.name)
    #Print the name

Things have gotten slightly more complicated, but let’s unpack it.

We’re no longer declaring “followers” as a list.  Instead we’re using a for-loop.  The loop examines each “user” in the items that the Cursor returns.  The Cursor does all the work of figuring out how many pages of results exist, and calls each page of results for you.  Notice that we’re now specifying the maximum number of results per page (200). 

So each time the Cursor returns a page of 200 results, we call each item that is returned the “user”, and use dot notation to return only the name of that user.
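To get a feel for what the Cursor is doing on your behalf, here is a simplified sketch of cursor-based pagination. It is not Tweepy’s implementation. It assumes the convention Twitter uses for cursored endpoints (start with a cursor of -1, stop when the next cursor is 0), and fetch_page is a made-up stand-in for a real API request.

```python
def make_fake_endpoint(all_followers):
    """Build a stand-in for a cursored API endpoint (hypothetical helper)."""
    calls = []

    def fetch_page(cursor, count):
        calls.append(cursor)  # record each simulated request
        start = 0 if cursor == -1 else cursor
        end = start + count
        page = all_followers[start:end]
        # A next cursor of 0 signals that there are no more pages.
        next_cursor = end if end < len(all_followers) else 0
        return page, next_cursor

    return fetch_page, calls


def iterate_items(fetch_page, count=200):
    """Yield items one at a time, requesting new pages as needed."""
    cursor = -1
    while cursor != 0:
        page, cursor = fetch_page(cursor, count)
        yield from page


all_followers = ["follower_" + str(n) for n in range(450)]
fetch_page, calls = make_fake_endpoint(all_followers)
names = list(iterate_items(fetch_page, count=200))
print(len(names), "followers fetched in", len(calls), "requests")
```

The loop body never sees page boundaries, which is the same convenience the for-loop over .items() gives you.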

This basic setup is what we’ll use for the majority of the scripts that we explore in the series. Being able to use the Cursor, and pass methods to it, will allow you to get most of the information you desire out of Twitter. If you were interested in sourcing the list of followers of a different user, you’d simply pass it to the Cursor as a keyword argument, right after the get_followers method:

for user in tweepy.Cursor(api.get_followers,screen_name="ken_mcclean",count=200).items():

Rate Limiting

We haven’t yet addressed the second question.  What if you have more than 15 * 200 followers? The API limits you to that many results in a given time frame.

The answer is the wait_on_rate_limit setting. We need only change one line of code:

api = tweepy.API(auth,wait_on_rate_limit=True)

If the API tells the script that the rate limit has been exceeded, the script will now wait until the API gives the all-clear and continue running. In effect, it pauses until the API says “hey, you’re good to make another 15 requests.” Without this functionality, the script simply stops when rate limiting kicks in.
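Conceptually, waiting on the rate limit amounts to a retry loop. The sketch below is an illustration of that idea rather than Tweepy’s internals: RateLimitExceeded and call_with_wait are hypothetical names, and the sleep function is passed in so the example runs instantly instead of pausing for 15 minutes.

```python
import time


class RateLimitExceeded(Exception):
    """Hypothetical error carrying the seconds left until the limit resets."""

    def __init__(self, seconds_until_reset):
        super().__init__("rate limit exceeded")
        self.seconds_until_reset = seconds_until_reset


def call_with_wait(request, sleep=time.sleep):
    """Keep retrying the request, sleeping whenever the limit is hit."""
    while True:
        try:
            return request()
        except RateLimitExceeded as error:
            sleep(error.seconds_until_reset)


attempts = []


def flaky_request():
    """Fails on the first call, succeeds on the second (simulated)."""
    attempts.append(1)
    if len(attempts) == 1:
        raise RateLimitExceeded(seconds_until_reset=900)
    return ["follower_a", "follower_b"]


waits = []  # record requested sleeps instead of actually sleeping
result = call_with_wait(flaky_request, sleep=waits.append)
print(result, "after waiting", waits, "seconds")
```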

Conclusion

Hopefully this post has helped you connect Python to your Twitter account. In the next post we’ll look at some more involved operations that can be accomplished using Tweepy.

One response

  1. Rasmus Jensen

    Hi Kenneth

    How do we export this to a csv.file?

    And is it possible to get an overview over dates for followers?

    Kind regards,
    Rasmus