Python GeoPy to Find the Geocode of an Address

Geocoding With GeoPy

Every point on the surface of the Earth can be represented by its latitude and longitude values.

According to Wikipedia, “Geocoding is the computational process of transforming a postal address description to a location on the Earth’s surface (spatial representation in numerical coordinates).”

Put simply, geocoding is the process of converting a text address into its corresponding latitude and longitude on the Earth’s surface.
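As a small illustration of the coordinate pair itself, here is a minimal sketch that checks whether a pair falls in the valid ranges (latitude from -90 to 90 degrees, longitude from -180 to 180 degrees). The helper name is_valid_coordinate is hypothetical and not part of any library.

```python
def is_valid_coordinate(latitude, longitude):
    """Return True if the pair lies within the valid latitude/longitude ranges."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


# Coordinates are conventionally written as (latitude, longitude) in decimal degrees
taj_mahal = (27.1750, 78.0421)
print(is_valid_coordinate(*taj_mahal))
```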

In this article, we will retrieve the geocode of an address using Python’s GeoPy library.

GeoPy

GeoPy is not a geocoding service itself but a Python client for several popular geocoding web services. It uses third-party geocoders and other data sources to find the geocode of an address.

The figure below gives an idea of how GeoPy works.

Figure: GeoPy API

As seen in the figure above, geocoding is provided by a number of different services. These services expose APIs, and the GeoPy library wraps them in a single package. For a complete list of the geocoding service providers implemented by geopy, refer to the geopy documentation.

Some important points to consider:

  • Geocoding services are either paid or free, so before selecting one, go through its Terms of Use, quotas, pricing, geodatabase, and so on.
  • geopy cannot be responsible for any networking issues between your computer and the geocoding service.

With a high-level idea of what GeoPy does, let’s now see how to use it to retrieve the geocode of an address.

Geocoding Services

There are many Geocoding services available, but I really liked GeocodeAPI. They have multiple endpoints to get lat-long from address as well as reverse geocoding. One of their advanced features is the address auto-complete API.

They can even return a complete address from a partial address. Also, they provide 10,000 free requests per day, which is great if you are just starting to build your application. You can get more details from their pricing page.

Geocoding using GeoPy

Each geolocation service, e.g. Nominatim, has its own class in geopy.geocoders linking to the service’s API. Geocoders have at least a geocode method for looking up coordinates from a provided string (the address we want to geocode).

This class also implements a reverse method, which does the opposite of geocode: we provide the coordinates of a point on the Earth’s surface, and the method returns the address associated with that latitude and longitude.
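The reverse lookup described above can be sketched as follows. This is a minimal example, not a definitive implementation: reverse_lookup and demo are hypothetical helper names, and the helper accepts any geocoder object that exposes a geopy-style reverse method.

```python
def reverse_lookup(geolocator, latitude, longitude):
    """Return the address for a coordinate pair, or None if nothing is found.

    `geolocator` is assumed to be a geopy geocoder instance (or any object
    with a compatible `reverse` method).
    """
    # geopy's reverse() accepts a (latitude, longitude) tuple as the query
    location = geolocator.reverse((latitude, longitude))
    return location.address if location is not None else None


def demo():
    """Live example; needs network access and geopy installed. Not called automatically."""
    # Imported lazily so the helper above has no hard dependency on geopy
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent="my_request")
    # Coordinates of the Taj Mahal
    print(reverse_lookup(geolocator, 27.1750123, 78.04209683661315))
```

Calling demo() sends one request to the Nominatim service and prints the address it reports for that point.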

1. Finding Geocode of an address

We’ll be using Nominatim geocoding services in this tutorial.

#Importing the Nominatim geocoder class 
from geopy.geocoders import Nominatim

#address we need to geocode
loc = 'Taj Mahal, Agra, Uttar Pradesh 282001'

#making an instance of Nominatim class
geolocator = Nominatim(user_agent="my_request")

#applying geocode method to get the location
location = geolocator.geocode(loc)

#printing address and coordinates
print(location.address)
print((location.latitude, location.longitude))
Output:
Taj Mahal, Taj Mahal Internal Path, Taj Ganj, Agra, Uttar Pradesh, 282001, India
(27.1750123, 78.04209683661315)

Using the code above, we found the coordinates of the Taj Mahal, Agra, India.

The Nominatim class has a geocode method that accepts an address string and returns the matching location from the service provider’s database. The object returned by geocode has an address attribute holding the complete address, and latitude and longitude attributes holding the coordinates of that address.

The Nominatim geocoder class accepts user_agent as an input argument. It is sent as the User-Agent HTTP header with each request to the geocoder API and identifies your application to the service.
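One detail worth guarding against: geocode returns None when the service finds no match, so reading location.address directly would raise an AttributeError for an unknown address. Below is a minimal sketch of a safer lookup, assuming a hypothetical helper named geocode_to_coords that works with any geocoder object exposing a geocode method.

```python
def geocode_to_coords(geolocator, address):
    """Return (latitude, longitude) for `address`, or None when there is no match."""
    location = geolocator.geocode(address)
    if location is None:  # geopy returns None when the service finds nothing
        return None
    return (location.latitude, location.longitude)
```

With the geolocator created earlier, geocode_to_coords(geolocator, loc) would return the same coordinate pair that the tutorial code prints, and None for an address the service cannot resolve.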

2. Using GeoPy with Pandas Dataframe

The RateLimiter class wraps a geocoding function and lets us add a delay between requests to the server when we have many addresses to process.

When making multiple requests, the number of requests sent to a geocoding service provider must stay within its limits, or the service may respond with an error.

Let’s now apply this to a pandas DataFrame containing the addresses of some beautiful nature spots in India.

#Importing the required modules
import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

#Creating a dataframe with the addresses of the locations we want to retrieve
locat = ['Coorg, Karnataka', 'Khajjiar, Himachal Pradesh',
         'Chail, Himachal Pradesh', 'Pithoragarh, Uttarakhand', 'Munnar, Kerala']
df = pd.DataFrame({'add': locat})

#Creating an instance of Nominatim Class
geolocator = Nominatim(user_agent="my_request")

#applying the rate limiter wrapper
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

#Applying the method to pandas DataFrame
df['location'] = df['add'].apply(geocode)
df['Lat'] = df['location'].apply(lambda x: x.latitude if x else None)
df['Lon'] = df['location'].apply(lambda x: x.longitude if x else None)

df
DataFrame With Coordinates

The RateLimiter class takes a geocoding function (here geolocator.geocode) and min_delay_seconds as input arguments. The wrapped function sends requests to the geocoding service with at least the specified delay between them. If no location is found for an address string, it returns None.
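To make the delay behaviour concrete, here is a simplified sketch of the idea behind such a wrapper. It is not geopy's implementation: MinDelayLimiter is a hypothetical class that only enforces a minimum gap between calls, whereas geopy's RateLimiter also retries failed requests (see its max_retries and error_wait_seconds parameters). The clock and sleep arguments are assumptions added so the timing can be replaced in tests.

```python
import time


class MinDelayLimiter:
    """Simplified, hypothetical stand-in for a rate-limiting wrapper.

    It only enforces a minimum delay between calls and does no retrying
    or error handling.
    """

    def __init__(self, func, min_delay_seconds=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.func = func
        self.min_delay_seconds = min_delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def __call__(self, *args, **kwargs):
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            remaining = self.min_delay_seconds - elapsed
            if remaining > 0:
                self._sleep(remaining)  # wait out the rest of the delay window
        self._last_call = self._clock()
        return self.func(*args, **kwargs)
```

Wrapping a function as MinDelayLimiter(geolocator.geocode, min_delay_seconds=1) would space calls at least one second apart, which is the same effect min_delay_seconds has in the tutorial code.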

With the pandas .apply method, we can apply the wrapper to the specified column of our DataFrame.

Conclusion

In this article, we learned what geocoding is and how Python’s GeoPy library provides a simple interface to the APIs of geocoding services. We also geocoded a text address to get its latitude and longitude, and applied the same method to a pandas DataFrame with a column of addresses.

Happy Learning!