Using Python to Access Data

CanWIN uses the CKAN DataStore, which offers an API for reading, searching and filtering data without downloading the entire file first. The DataStore is an ad hoc database, meaning it is a collection of tables with unknown relationships. You can search within a single DataStore resource (a table in the database) or run queries across several DataStore resources.

Read more about the Data API here.

Creating an API endpoint to search the DataStore

The datastore_search action 

The datastore_search action allows you to search for data in a resource. By default, 100 rows are returned; see the limit parameter for more information here.

Using datastore_search in an API URL endpoint

Example

URL= 'https://canwin-datahub.ad.umanitoba.ca/data/api/3/action/datastore_search?resource_id=c5c16064-e2b3-4618-9b27-0dbf5c1388c2', where:

  • https://canwin-datahub.ad.umanitoba.ca/data is the root URL
  • api/3/action/datastore_search calls the API action 'datastore_search'
  • ?resource_id=c5c16064-e2b3-4618-9b27-0dbf5c1388c2 is the query, and includes the resource id for that particular resource.
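As a minimal sketch of how the three parts above combine, the URL can be assembled in Python with the standard library. The names ROOT_URL, ACTION_PATH and build_search_url are illustrative choices, not part of CKAN or CanWIN.

```python
from urllib.parse import urlencode

# Root URL and action path as listed in the bullets above.
ROOT_URL = "https://canwin-datahub.ad.umanitoba.ca/data"
ACTION_PATH = "api/3/action/datastore_search"


def build_search_url(resource_id):
    """Join the root URL, the API action and the query string."""
    query = urlencode({"resource_id": resource_id})
    return f"{ROOT_URL}/{ACTION_PATH}?{query}"


url = build_search_url("c5c16064-e2b3-4618-9b27-0dbf5c1388c2")
print(url)
```

Printing url reproduces the example URL shown above.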

Finding the resource ID

Click on any resource on CanWIN's CKAN site. For example, the 'Greenedge Nutrient data 2016' data resource can be found here. Scroll down the page to 'Additional Information' and look for the field label 'Resource id'. This will be the ID for this specific resource.

Making requests to the API endpoint via Python

Here are two ways to search and retrieve data from the DataStore using the API endpoint discussed above and a Python HTTP module.

1. Requests

To install: pip install requests

Example using the requests module
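The following is a sketch of one way to call the endpoint with requests, not necessarily the exact code CanWIN intended here. It assumes the standard CKAN response layout (a JSON object with "success" and "result" keys, where "result" holds "records"). The function name fetch_records, the optional session argument and the 30-second timeout are illustrative assumptions.

```python
import requests

# Endpoint and resource id taken from the example above.
BASE_URL = "https://canwin-datahub.ad.umanitoba.ca/data/api/3/action/datastore_search"
RESOURCE_ID = "c5c16064-e2b3-4618-9b27-0dbf5c1388c2"


def fetch_records(resource_id, session=None, timeout=30):
    """Return the list of rows (dicts) for one DataStore resource.

    `session` is a hypothetical hook: pass a requests.Session to reuse
    connections; when omitted, the module-level requests.get is used.
    """
    http = session if session is not None else requests
    response = http.get(
        BASE_URL,
        params={"resource_id": resource_id},  # requests builds the query string
        timeout=timeout,
    )
    response.raise_for_status()  # raise on HTTP 4xx/5xx
    payload = response.json()
    if not payload.get("success"):
        raise RuntimeError(f"DataStore request failed: {payload.get('error')}")
    return payload["result"]["records"]
```

Calling fetch_records(RESOURCE_ID) sends the request and returns the rows as a list of dictionaries, one per row, limited to the default 100 rows unless a limit is supplied.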

2. urllib.request 

Example using the urllib.request module
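A comparable sketch using only the standard library is shown below. As with the requests version, it assumes the standard CKAN response layout ("success" plus "result" containing "records"). The names fetch_records and opener, and the 30-second timeout, are illustrative assumptions.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint and resource id taken from the example above.
BASE_URL = "https://canwin-datahub.ad.umanitoba.ca/data/api/3/action/datastore_search"
RESOURCE_ID = "c5c16064-e2b3-4618-9b27-0dbf5c1388c2"


def fetch_records(resource_id, opener=urllib.request.urlopen, timeout=30):
    """Return the list of rows (dicts) for one DataStore resource.

    `opener` is a hypothetical hook that defaults to urllib.request.urlopen;
    it exists only so the network call can be swapped out.
    """
    url = f"{BASE_URL}?{urlencode({'resource_id': resource_id})}"
    with opener(url, timeout=timeout) as response:
        payload = json.load(response)  # the response body is JSON
    if not payload.get("success"):
        raise RuntimeError(f"DataStore request failed: {payload.get('error')}")
    return payload["result"]["records"]
```

Calling fetch_records(RESOURCE_ID) performs the request. Unlike requests, urllib.request needs the query string built by hand, which is why urlencode appears here.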

Adding more functionality directly in the datastore_search action call

To add more parameters to the query, put '&' before each parameter pair (key=value) after the resource id. For example, to get only the first two rows of data, add &limit=2 after the resource id. See all parameters here.

Example of filtering the data
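The datastore_search action accepts a filters parameter whose value is a JSON object mapping column names to the values they must match. The sketch below builds such a URL. The column name "station" and the value "G100" are hypothetical placeholders, so substitute a field that actually exists in your resource. The function name build_filter_url is also illustrative.

```python
import json
from urllib.parse import urlencode

# Endpoint and resource id taken from the example above.
BASE_URL = "https://canwin-datahub.ad.umanitoba.ca/data/api/3/action/datastore_search"
RESOURCE_ID = "c5c16064-e2b3-4618-9b27-0dbf5c1388c2"


def build_filter_url(resource_id, filters):
    """Build a datastore_search URL that only matches rows equal to `filters`."""
    params = {
        "resource_id": resource_id,
        # filters must be sent as a JSON string; urlencode percent-escapes it
        "filters": json.dumps(filters),
    }
    return f"{BASE_URL}?{urlencode(params)}"


# "station" / "G100" are made-up examples, not real fields of this resource.
url = build_filter_url(RESOURCE_ID, {"station": "G100"})
print(url)
```

The resulting URL can be passed to either requests.get or urllib.request.urlopen in the same way as the unfiltered endpoint.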

Example of limiting the data returned
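As a small sketch of the &limit=2 case described above, the parameter can be appended with urlencode rather than by hand. The function name build_limited_url is an illustrative assumption.

```python
from urllib.parse import urlencode

# Endpoint and resource id taken from the example above.
BASE_URL = "https://canwin-datahub.ad.umanitoba.ca/data/api/3/action/datastore_search"
RESOURCE_ID = "c5c16064-e2b3-4618-9b27-0dbf5c1388c2"


def build_limited_url(resource_id, limit):
    """Build a datastore_search URL that returns at most `limit` rows."""
    if limit < 0:
        raise ValueError("limit must be zero or a positive integer")
    params = {"resource_id": resource_id, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


# Ask for the first two rows only.
url = build_limited_url(RESOURCE_ID, 2)
print(url)
```

The printed URL ends in &limit=2, matching the hand-written form described above.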