ProPublica has APIs for U.S. congressional information such as candidate profiles and voting records.
We can use the JSON parser in the Migrate Plus module to read the API.
First, we'll need to sign up for an API key and pass it in the X-API-Key header when we make requests.
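To make the header requirement concrete, here is a minimal Python sketch of such a request, outside of Drupal. The function names build_request and fetch_json are hypothetical helpers made up for this illustration; only the endpoint URL and the X-API-Key header name come from this post.

```python
import json
import urllib.request

# Endpoint used by the migration later in this post.
ENDPOINT = "https://api.propublica.org/congress/v1/116/senate/members.json"


def build_request(api_key, url=ENDPOINT):
    """Return a GET request that carries the key in the X-API-Key header."""
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch_json(api_key, url=ENDPOINT):
    """Send the request and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(build_request(api_key, url), timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Build the request only; nothing is sent until fetch_json() is called.
    request = build_request("YOUR_API_KEY")
    print(request.get_method(), request.full_url)
```

Calling fetch_json with a real key should return the same structure as the sample response shown later in this post.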
Take a look at the JSON example module (migrate_json_example) that ships with Migrate Plus to see how it sets up the migration YAML file.
You can copy that file into the migrations directory in your custom module and modify it there. You may need to clear the cache between changes. Alternatively, you can put it in your config/optional directory or paste it into the add config form. You may need to reference other tutorials to understand the various ways to supply migration config.
I named the file migrate_plus.migration.propublica_congress.yml.
Here is the top of the file:
id: propublica_congress
label: JSON feed of congress members.
migration_group: Propublica
migration_tags:
  - propublica congress
source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  headers:
    X-API-Key: YOUR_API_KEY
  urls:
    - 'https://api.propublica.org/congress/v1/116/senate/members.json'
  item_selector: /results/0/members
Note that Migrate Plus has fetcher and parser plugins for the url source plugin. Using the http fetcher and json parser, we just need to put in the API key, endpoint URL, and selector. Replace YOUR_API_KEY with your API key.
The item selector uses an XPath-style syntax to point at the list of items inside the results object.
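To show what the selector /results/0/members points at, here is a small Python sketch that walks a slash-separated path through decoded JSON. It is only an illustration of the idea, not how the Migrate Plus parser is implemented; select_items is a hypothetical helper, and the response is abbreviated from the sample record in this post.

```python
def select_items(document, selector):
    """Walk a slash-separated selector through decoded JSON data."""
    current = document
    # Drop empty segments so a leading or trailing slash is harmless.
    for part in (p for p in selector.split("/") if p):
        if isinstance(current, list):
            current = current[int(part)]  # numeric segment: list index
        else:
            current = current[part]       # any other segment: object key
    return current


# Abbreviated version of the API response.
response = {
    "status": "OK",
    "results": [
        {
            "congress": "116",
            "chamber": "Senate",
            "members": [{"id": "A000360", "last_name": "Alexander"}],
        }
    ],
}

members = select_items(response, "/results/0/members")
print(members[0]["id"])  # A000360
```

The 0 segment is needed because results is a list with a single object in it, and the members list sits inside that object.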
I hit a bug with the 0 in the selector and had to hack the parser plugin.
Create a content type to import into, and set up fields that you want to use.
To test things, I created a Representative content type and added fields for the ProPublica ID, position title, and party affiliation.
You can use an application like Postman to test the API and see what the results look like. Here's a record from the endpoint for Senate members in the 116th Congress.
{
  "status": "OK",
  "copyright": " Copyright (c) 2021 Pro Publica Inc. All Rights Reserved.",
  "results": [
    {
      "congress": "116",
      "chamber": "Senate",
      "num_results": 102,
      "offset": 0,
      "members": [
        {
          "id": "A000360",
          "title": "Senator, 2nd Class",
          "short_title": "Sen.",
          "api_uri": "https://api.propublica.org/congress/v1/members/A000360.json",
          "first_name": "Lamar",
          "middle_name": null,
          "last_name": "Alexander",
          "suffix": null,
          "date_of_birth": "1940-07-03",
          "gender": "M",
          "party": "R",
          "leadership_role": null,
          "twitter_account": "SenAlexander",
          "facebook_account": "senatorlamaralexander",
          "youtube_account": "lamaralexander",
          "govtrack_id": "300002",
          "cspan_id": "5",
          "votesmart_id": "15691",
          "icpsr_id": "40304",
          "crp_id": "N00009888",
          "google_entity_id": "/m/01rbs3",
          "fec_candidate_id": "S2TN00058",
          "url": "https://www.alexander.senate.gov/public",
          "rss_url": "https://www.alexander.senate.gov/public/?a=RSS.Feed",
          "contact_form": "http://www.alexander.senate.gov/public/index.cfm?p=Email",
          "in_office": false,
          "cook_pvi": null,
          "dw_nominate": 0.324,
          "ideal_point": null,
          "seniority": "17",
          "next_election": "2020",
          "total_votes": 717,
          "missed_votes": 133,
          "total_present": 0,
          "last_updated": "2020-12-30 19:01:18 -0500",
          "ocd_id": "ocd-division/country:us/state:tn",
          "office": "455 Dirksen Senate Office Building",
          "phone": "202-224-4944",
          "fax": "202-228-3398",
          "state": "TN",
          "senate_class": "2",
          "state_rank": "senior",
          "lis_id": "S289",
          "missed_votes_pct": 18.55,
          "votes_with_party_pct": 96.55,
          "votes_against_party_pct": 3.45
        },
Back in the migration file, the rest of the source section lists the fields to pull from each record, followed by the process and destination sections:
  fields:
    -
      name: propublica_id
      label: 'Propublica id'
      selector: id
    -
      name: position_title
      label: 'Position title'
      selector: title
    -
      name: first_name
      label: 'First name'
      selector: first_name
    -
      name: middle_name
      label: 'Middle name'
      selector: middle_name
    -
      name: last_name
      label: 'Last name'
      selector: last_name
    -
      name: party
      label: 'Party'
      selector: party
  ids:
    propublica_id:
      type: string
process:
  type:
    plugin: default_value
    default_value: representative
  title:
    plugin: concat
    source:
      - first_name
      - middle_name
      - last_name
    delimiter: ' '
  field_propublica_id: propublica_id
  field_position_title: position_title
  field_party: party
  sticky:
    plugin: default_value
    default_value: 0
  uid:
    plugin: default_value
    default_value: 0
destination:
  plugin: 'entity:node'
migration_dependencies: { }
Here I name the fields from the API that I want to use and, in the process section, map them to the node fields that will store them. The node title is concatenated from the first, middle, and last names.
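As a rough mental model of that mapping, here is a Python sketch that turns one source row into node values the way the process section describes. It is not Drupal code, and build_node_values is a hypothetical name. It also rests on one assumption worth checking against your own import: that the concat plugin joins its parts the way PHP's implode() does, so a null middle name becomes an empty string.

```python
def build_node_values(row):
    """Mimic the process section for a single source row."""
    parts = [row.get("first_name"), row.get("middle_name"), row.get("last_name")]
    # Assumption: a null part is joined as an empty string, as PHP's
    # implode() would do, so a missing middle name leaves two spaces.
    title = " ".join("" if part is None else str(part) for part in parts)
    return {
        "type": "representative",  # default_value
        "title": title,            # concat with delimiter ' '
        "field_propublica_id": row["propublica_id"],
        "field_position_title": row["position_title"],
        "field_party": row["party"],
        "sticky": 0,               # default_value
        "uid": 0,                  # default_value
    }


# Source row built from the sample record, keyed by the field names above.
row = {
    "propublica_id": "A000360",
    "position_title": "Senator, 2nd Class",
    "first_name": "Lamar",
    "middle_name": None,
    "last_name": "Alexander",
    "party": "R",
}

values = build_node_values(row)
print(repr(values["title"]))  # 'Lamar  Alexander'
```

If that assumption holds, members without a middle name would end up with a double space in the node title, so it is worth looking at a few imported titles after the first run.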
Now I can run the migration using Drush with Migrate Tools installed, for example with drush migrate:import propublica_congress.