Extracting tweets from Twitter using API with Python

Tweets From Twitter Using Tweepy

Hello readers, in this article I will introduce you to Tweepy, a Python library for the Twitter API, and show how to use it to retrieve tweets. I hope you will enjoy reading this article.

Requirements for Extracting Tweets from Twitter using Python

Let’s go over what we need to get started here.

1. Twitter Developer Account

To access the Twitter API through Tweepy, you need to create a developer account, and this account must be approved by Twitter. So make sure that you provide the right details and a proper reason for using the API.

Here is how you can create a developer account.

  • Visit the Twitter developer site at dev.twitter.com.
  • Create an account on the developer site by clicking the ‘Sign In’ button at the top-right corner.
Figure: Twitter Developer site
  • After signing in, click on the developer link in the navigation bar.
  • Click on your account and choose “Apps” from the drop-down menu that appears.
Figure: Drop-down menu
  • Click on the “create app” button and fill in the details for your application.
  • Create your access token for the application. Copy this access token into a file and keep it safe.
  • Once you’ve done this, make a note of your OAuth settings: Consumer Key, Consumer Secret, OAuth Access Token, and OAuth Access Token Secret.
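One way to keep these four values safe is to store them outside your script. The snippet below is a minimal sketch, assuming you have saved them in environment variables; the variable names (TWITTER_CONSUMER_KEY and so on) and the load_credentials helper are made up for this illustration and are not part of Tweepy or Twitter.

```python
import os

# Hypothetical environment variable names; choose any names you like.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_key": "TWITTER_ACCESS_TOKEN",
    "access_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(env=os.environ):
    """Return the four OAuth values as a dict, or raise if any is missing."""
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: env[name] for key, name in ENV_NAMES.items()}


# Example usage once the variables are set in your shell:
# creds = load_credentials()
# consumer_key = creds["consumer_key"]
```

This way the keys never appear in the Python file itself, so the script can be shared without exposing them.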

2. Spreadsheet Software

You will need software that can open spreadsheet files, such as Microsoft Excel or LibreOffice Calc.

Code for Extracting Tweets from Twitter

In this coding example, we will extract data from twitter.com using Tweepy.

1. Import Required Libraries and Set up OAuth Tokens

To begin with, import the necessary libraries, tweepy and csv, and declare the OAuth tokens that you obtained when creating your app on the Twitter developer dashboard.

import csv

import tweepy

# Replace the placeholders with the keys from your developer dashboard.
consumer_key = "<enter your consumer key>"
consumer_secret = "<enter key>"
access_key = "<enter key>"
access_secret = "<enter key>"

2. Authorize with Tweepy’s OAuthHandler

Now that we have defined the keys, we will proceed to authorize ourselves with tweepy’s OAuthHandler. We will pass the keys as shown below.

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)

We will now pass these authorization details to tweepy as shown below.

api = tweepy.API(auth,wait_on_rate_limit=True)
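Before searching, it can help to confirm that the keys were accepted. The sketch below assumes the api object created above and uses Tweepy's verify_credentials call; the helper name authenticated_screen_name is made up for this illustration. Because Tweepy's error classes differ between versions, the sketch simply catches any exception.

```python
def authenticated_screen_name(api):
    """Return the screen name of the authenticated user, or None on failure.

    `api` is expected to be a tweepy.API instance (or any object that
    offers a verify_credentials() method).
    """
    try:
        user = api.verify_credentials()
    except Exception as exc:  # the exact error class depends on the Tweepy version
        print(f"Credential check failed: {exc}")
        return None
    if not user:  # older Tweepy versions return False for rejected keys
        return None
    return user.screen_name


# Example usage with the api object from this article:
# name = authenticated_screen_name(api)
# print("Logged in as", name)
```

If the function returns None, re-check the four keys on the developer dashboard before going further.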

3. Extracting Specific Tweets from Twitter

Define a variable named search_words and set it to the term you would like to retrieve tweets about.

Twitter's search matches tweets against that term, which can be a hashtag, an @mention, or an ordinary word.

The results can also include retweets. To avoid that, we append the -filter:retweets operator to the query.

search_words = "#"      #enter your words
new_search = search_words + " -filter:retweets"
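The same string concatenation can be wrapped in a small function so that the retweet filter becomes optional. This is only a sketch; build_query is a hypothetical helper name, not something Tweepy provides.

```python
def build_query(search_words, exclude_retweets=True):
    """Build a search query string, optionally filtering out retweets."""
    query = search_words.strip()
    if not query:
        raise ValueError("search_words must not be empty")
    if exclude_retweets:
        # -filter:retweets is the search operator used in this article.
        query += " -filter:retweets"
    return query


print(build_query("#python"))         # prints: #python -filter:retweets
print(build_query("#python", False))  # prints: #python
```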

4. Pulling Tweets Metadata

Before looping over the results, we open a csv file in append mode so that each tweet can be written to it as a row. Opening the file with utf-8 encoding lets us write the tweet text directly, without encoding each field by hand.

csvFile = open('file-name', 'a', newline='', encoding='utf-8')
csvWriter = csv.writer(csvFile)

In the code snippet below, I wish to only retrieve the time of the creation of the tweet, the text of the tweet, username, and the location. We pass the search query to the Tweepy Cursor and write one row for each tweet it returns.

# On Tweepy 4.0 and later, api.search is named api.search_tweets.
for tweet in tweepy.Cursor(api.search, q=new_search, count=100,
                           lang="en",
                           since_id=0).items():
    csvWriter.writerow([tweet.created_at, tweet.text,
                        tweet.user.screen_name, tweet.user.location])

# Close the file so that all rows are flushed to disk.
csvFile.close()
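The row-writing step can be tried without network access. The sketch below uses a stand-in object in place of a real tweet, with made-up values, and writes to an in-memory buffer instead of a file; tweet_to_row is a hypothetical helper name. It assumes only the four attributes that this article reads from each tweet.

```python
import csv
import io
from types import SimpleNamespace


def tweet_to_row(tweet):
    """Pick the four fields used in this article from a tweet-like object."""
    return [tweet.created_at, tweet.text,
            tweet.user.screen_name, tweet.user.location]


# A made-up stand-in for a tweet object, for illustration only.
fake_tweet = SimpleNamespace(
    created_at="2021-01-01 12:00:00",
    text="Hello, world",
    user=SimpleNamespace(screen_name="example_user", location="Nowhere"),
)

buffer = io.StringIO()
csv.writer(buffer).writerow(tweet_to_row(fake_tweet))
print(buffer.getvalue())
# prints: 2021-01-01 12:00:00,"Hello, world",example_user,Nowhere
```

Note that the csv module quotes the tweet text because it contains a comma, so commas inside tweets do not break the columns.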

5. Complete Code to Extract Tweets from Twitter using Python and Tweepy

The entire code is shown below. You can execute it and find a csv file with all the data you want in your current working directory.

import csv

import tweepy

# Replace the placeholders with the keys from your developer dashboard.
consumer_key = "<enter your consumer key>"
consumer_secret = "<enter key>"
access_key = "<enter key>"
access_secret = "<enter key>"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)

# Wait automatically whenever the rate limit is reached.
api = tweepy.API(auth, wait_on_rate_limit=True)

search_words = "#"      # enter your words
new_search = search_words + " -filter:retweets"

# Append to the csv file; open() takes care of the utf-8 encoding.
with open('file-name', 'a', newline='', encoding='utf-8') as csvFile:
    csvWriter = csv.writer(csvFile)
    # On Tweepy 4.0 and later, api.search is named api.search_tweets.
    for tweet in tweepy.Cursor(api.search, q=new_search, count=100,
                               lang="en",
                               since_id=0).items():
        csvWriter.writerow([tweet.created_at, tweet.text,
                            tweet.user.screen_name, tweet.user.location])

The output of the above code is a csv file which looks as follows:

Figure: Output CSV file

Kindly note, the output will vary based on the search keywords.
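Besides opening the file in a spreadsheet program, you can read it back with Python's csv module. The sketch below counts tweets per username; it assumes the column order used in this article (time, text, username, location), and the sample rows are made up purely for illustration. The helper name tweets_per_user is hypothetical.

```python
import csv
import io
from collections import Counter


def tweets_per_user(csv_file):
    """Count rows per username (third column) in an open csv file."""
    reader = csv.reader(csv_file)
    # Skip rows that do not have all four columns.
    return Counter(row[2] for row in reader if len(row) >= 4)


# Made-up sample rows standing in for the real output file.
sample = io.StringIO(
    '2021-01-01 12:00:00,"First tweet",user_a,Paris\r\n'
    '2021-01-01 12:05:00,"Second, with a comma",user_b,\r\n'
    '2021-01-01 12:10:00,"Third tweet",user_a,Paris\r\n'
)

counts = tweets_per_user(sample)
print(counts["user_a"])  # prints: 2
```

To run this on the real output, replace the sample with open('file-name', newline='', encoding='utf-8').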

Conclusion

Thus, we have come to the end of this article, in which we retrieved some information from Twitter using Tweepy. I hope you enjoyed trying this out! Do let us know your feedback in the comments section below.