Learn Python Series (#15) - Handling JSON
What Will I Learn?
- You will learn what the JSON file format is about and why it differs, technically, from a Python dictionary data type,
- how to serialize and deserialize between JSON data and Python data types,
- how to use the native, built-in json module as well as the .json() method from the external requests library,
- how to read from and write to .json files,
- a real-life example on how to combine it all using JSON data from an API (coinmarketcap.com).
Requirements
- A working modern computer running macOS, Windows or Ubuntu
- An installed Python 3(.6) distribution, such as (for example) the Anaconda Distribution
- The ambition to learn Python programming
Difficulty
- Intermediate
Curriculum (of the Learn Python Series):
- Learn Python Series - Intro
- Learn Python Series (#2) - Handling Strings Part 1
- Learn Python Series (#3) - Handling Strings Part 2
- Learn Python Series (#4) - Round-Up #1
- Learn Python Series (#5) - Handling Lists Part 1
- Learn Python Series (#6) - Handling Lists Part 2
- Learn Python Series (#7) - Handling Dictionaries
- Learn Python Series (#8) - Handling Tuples
- Learn Python Series (#9) - Using Import
- Learn Python Series (#10) - Matplotlib Part 1
- Learn Python Series (#11) - NumPy Part 1
- Learn Python Series (#12) - Handling Files
- Learn Python Series (#13) - Mini Project - Developing a Web Crawler Part 1
- Learn Python Series (#14) - Mini Project - Developing a Web Crawler Part 2
Learn Python Series (#15) - Handling JSON
After a short break, just to make sure everybody had the time to catch up on all previous Learn Python Series episodes ;-) , I'm back! In this episode we'll discuss handling JSON files, via some theory and by using a few practical examples.
A brief JSON introduction
JSON, short for JavaScript Object Notation, is a text-based data format used to exchange data between applications and computers. Even though its name has the language "JavaScript" in it, JSON is a textual data format that's language independent, so it can be used with lots of computer languages, including of course Python. JSON is pretty easy to read and write (for humans) and easy to parse as well (for computers). As we will see in this tutorial episode, JSON-formatted data is also often used with APIs.
The JSON syntax
The JSON syntax looks a lot like that of a Python dictionary, which we've discussed before. JSON stores data in name:value pairs, separated by commas, where curly braces { } hold objects and square brackets [ ] hold lists (arrays) of data.
In JSON, names must be strings surrounded by double quotes " ", and string values must be double-quoted as well. JSON values can be strings, numbers, arrays (lists), booleans, null, or an embedded (JSON) object. This is a little bit more restrictive than what we've been discussing regarding Python dictionaries.
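To make the "more restrictive" point concrete, here is a minimal sketch using the json module from Python's standard library. The variable names and sample values are made up for illustration only. It shows that Python-style single quotes are rejected by the JSON parser, and that a dictionary with a tuple key cannot be written out as JSON.

```python
import json

# A Python dictionary accepts keys that JSON does not, such as tuples.
python_only = {1: "one", (2, 3): "pair"}

# Valid JSON: names and string values use double quotes.
valid_json = '{"name": "Scipio", "active": true, "tags": ["python", "json"], "extra": null}'
parsed = json.loads(valid_json)

# Python-style single quotes are not valid JSON, so parsing fails.
invalid_json = "{'name': 'Scipio'}"
try:
    json.loads(invalid_json)
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

# A tuple key has no JSON equivalent, so serializing fails.
try:
    json.dumps(python_only)
    tuple_keys_ok = True
except TypeError:
    tuple_keys_ok = False

print(parsed)
print(single_quotes_ok, tuple_keys_ok)
```

Both flags end up False: the first because of the quoting rule, the second because JSON names must be strings.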
Using JSON in Python
The json module is part of the Python standard library, so it comes bundled with most Python distributions, including Anaconda. Using it therefore doesn't require installing an external Python module; you can simply import it, like so:
import json
The json module is used to parse JSON data from files (or strings inside a running Python script) and, the other way around, to convert a Python dictionary (or list, for example) back into a JSON textual string. It allows you to convert ("serialize" and "deserialize") back and forth between JSON data and Python objects.
Let's look at a few basic examples to get our feet wet using JSON with the methods json.loads() and json.dumps():
json.loads()
json.loads() deserializes a string containing JSON data to a Python object.
json_string = '{"name": "Scipio", "utopian_reputation": "Elite"}'
json_dict = json.loads(json_string)
print(type(json_dict), json_dict)
<class 'dict'> {'name': 'Scipio', 'utopian_reputation': 'Elite'}
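As a small addition, the sketch below shows which Python type each kind of JSON value ends up as after json.loads(). The sample string and its key names are invented for this illustration.

```python
import json

# One value of every JSON type: string, integer, real number, array, boolean, null, object.
json_string = ('{"text": "hi", "count": 3, "ratio": 0.5, "items": [1, 2], '
               '"ok": false, "nothing": null, "nested": {"a": 1}}')
data = json.loads(json_string)

# Map every key to the name of the Python type its value was converted to.
types = {key: type(value).__name__ for key, value in data.items()}
print(types)
```

JSON objects become dictionaries, arrays become lists, true / false become True / False and null becomes None.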
json.dumps()
json.dumps() works the other way around: it serializes a Python object to textual JSON.
my_dict = {'gender': 'male', 'hasJob': True, 'number': 14}
my_json = json.dumps(my_dict)
print(type(my_json), my_json)
<class 'str'> {"gender": "male", "hasJob": true, "number": 14}
As you can see, the Python dictionary object my_dict has now been serialized into a (JSON data) string. Please notice the double quotes " " instead of the single quotes ' ' we began with, as well as how the boolean value was changed from True to true.
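Quotes and booleans are not the only things that change. The following sketch (with made-up sample data) illustrates a few more conversions json.dumps() performs, and why a round trip through JSON does not always give back exactly the object you started with.

```python
import json

# A tuple value, a None value and an integer key: all fine in Python.
original = {"point": (1, 2), "missing": None, 7: "seven"}

as_json = json.dumps(original)
print(as_json)

# Deserializing again gives a list instead of a tuple and a string key instead of 7.
restored = json.loads(as_json)
print(restored)
print(restored == original)
```

The tuple is written as a JSON array, None as null, and the integer key as the string "7", so the restored dictionary is not equal to the original one.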
Pretty printing using the json.dumps(indent=) keyword argument
When working with relatively large JSON datasets, you might prefer a bit of indentation to make the data easier to read. This is often called pretty printing.
If we want to "pretty print" the my_json string from the previous example, we can regenerate it from my_dict like this:
my_json = json.dumps(my_dict, indent=4)
print(my_json)
{
    "gender": "male",
    "hasJob": true,
    "number": 14
}
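Indentation only changes the layout of the text, not the data itself. A quick way to convince yourself, sketched here with the same my_dict as above:

```python
import json

my_dict = {'gender': 'male', 'hasJob': True, 'number': 14}

compact = json.dumps(my_dict)           # everything on a single line
pretty = json.dumps(my_dict, indent=4)  # one name:value pair per line

# Both strings deserialize to exactly the same Python dictionary.
same_data = json.loads(compact) == json.loads(pretty) == my_dict
line_count = len(pretty.splitlines())
print(same_data, line_count)
```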
Reading from and writing to .json files
JSON data can be stored to and read from disk as well, most commonly using a .json file name extension. To do this, you can use the json.load() and json.dump() methods.
json.load()
Suppose you have a file called mydata.json with some JSON data stored inside it. You can open() that file as usual and deserialize it with json.load() as follows:
with open('mydata.json', 'r') as f:
    data = json.load(f)
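In practice the file might be missing or might not contain valid JSON. Below is a minimal sketch of one way to deal with that. The helper name load_json_or_default and the temporary files are hypothetical, made up for this example so it can run on its own.

```python
import json
import os
import tempfile

def load_json_or_default(path, default=None):
    """Return the deserialized content of a JSON file, or `default` on failure."""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except FileNotFoundError:
        # The file does not exist.
        return default
    except json.JSONDecodeError:
        # The file exists but its content is not valid JSON.
        return default

# Create two throwaway files: one with valid JSON, one with broken content.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, 'mydata.json')
bad_path = os.path.join(tmp_dir, 'broken.json')
with open(good_path, 'w', encoding='utf-8') as f:
    f.write('{"name": "Scipio"}')
with open(bad_path, 'w', encoding='utf-8') as f:
    f.write("{'name': 'Scipio'")

good = load_json_or_default(good_path)
bad = load_json_or_default(bad_path, default={})
missing = load_json_or_default(os.path.join(tmp_dir, 'nope.json'), default={})
print(good, bad, missing)
```

Whether returning a default is better than letting the exception propagate depends on your script; this is just one possible choice.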
json.dump()
Writing JSON data to a file is also pretty easy, using the json.dump() method. You need to pass (at least) two positional arguments: json.dump(object, filepointer), where object is the Python object you want to write to file as JSON. Like so:
some_dict = {"name": "Scipio",
             "quote": "Does it matter who's right, or who's left?"}
with open('scipio.json', 'w') as f:
    json.dump(some_dict, f)
As a result, the file scipio.json will be saved in your current working directory, containing the some_dict data converted to JSON.
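To check that what was written can be read back unchanged, you could do a quick round trip. The sketch below writes to a temporary directory (an assumption made here so the example doesn't leave files behind in your working directory) and compares the result with the original dictionary.

```python
import json
import os
import tempfile

some_dict = {"name": "Scipio",
             "quote": "Does it matter who's right, or who's left?"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'scipio.json')

    # Serialize the dictionary to the file ...
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(some_dict, f)

    # ... and deserialize it again from the same file.
    with open(path, 'r', encoding='utf-8') as f:
        read_back = json.load(f)

print(read_back == some_dict)
```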
Using requests to handle JSON data
In the previous tutorials, we've been using the external requests library to fetch data from the internet, in a web crawler context. But we haven't yet discussed that the requests library also has a very convenient JSON decoder built in, which we can use to grab structured JSON data from web-based APIs (after all: why code a web crawler to scrape semi-structured data, if we can simply use an API providing us with perfectly valid structured JSON data?).
Let's show a simple real-life example with which you can fetch the current Steem price data from the coinmarketcap.com Ticker API, using the json() decoder built into requests:
import json
import requests
r = requests.get('https://api.coinmarketcap.com/v1/ticker/steem/')
steem_data = r.json()
print(type(steem_data), steem_data)
<class 'list'> [{'id': 'steem', 'name': 'Steem', 'symbol': 'STEEM', 'rank': '31', 'price_usd': '3.05373', 'price_btc': '0.00034873', '24h_volume_usd': '15685600.0', 'market_cap_usd': '776589645.0', 'available_supply': '254308549.0', 'total_supply': '271282643.0', 'max_supply': None, 'percent_change_1h': '0.49', 'percent_change_24h': '-3.64', 'percent_change_7d': '17.08', 'last_updated': '1524344348'}]
Alternatively, we can just deserialize the requests response using json.loads():
r = requests.get('https://api.coinmarketcap.com/v1/ticker/steem/')
steem_data = json.loads(r.text)
print(type(steem_data), steem_data)
<class 'list'> [{'id': 'steem', 'name': 'Steem', 'symbol': 'STEEM', 'rank': '31', 'price_usd': '3.05373', 'price_btc': '0.00034873', '24h_volume_usd': '15685600.0', 'market_cap_usd': '776589645.0', 'available_supply': '254308549.0', 'total_supply': '271282643.0', 'max_supply': None, 'percent_change_1h': '0.49', 'percent_change_24h': '-3.64', 'percent_change_7d': '17.08', 'last_updated': '1524344348'}]
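Notice that the API returns a list holding one dictionary, and that all numbers arrive as strings. Here is a minimal sketch of how you might work with such a response. To keep it runnable without a network connection, it uses a hard-coded, shortened copy of the response text shown above instead of a live request.

```python
import json

# Shortened copy of the API response text from the example above (not a live call).
sample_text = ('[{"id": "steem", "name": "Steem", "symbol": "STEEM", '
               '"price_usd": "3.05373", "max_supply": null, '
               '"last_updated": "1524344348"}]')

steem_data = json.loads(sample_text)

# The outer object is a list, so take its first (and only) element.
first = steem_data[0]

# Prices are delivered as strings; convert before doing arithmetic.
price = float(first['price_usd'])

print(type(first['price_usd']), price, first['max_supply'])
```

The JSON null for max_supply shows up as None in Python, just like in the printed output above.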
A "real life" example fetching and storing relevant JSON API data
In the examples above, we learned how to deserialize & serialize JSON data, read JSON data from an API and/or a file, and write JSON to file. Again using the CMC Ticker API, we can combine the above into a useful bit of code which...
- creates an empty ticks list,
- defines a base URL for the CoinMarketCap API,
- defines a list coins of your favorite coins,
- iterates over the coins list,
- uses the requests module to get the current data,
- filters the data you're interested in (in this case the coin name, USD price and timestamp (seconds since the epoch)),
- wraps that filtered / relevant data in a temporary dictionary,
- appends that data to the ticks list,
- and finally saves the data to file using json.dump() with a 4 character indentation depth.
Like so:
import json
import requests

ticks = []
cmc_base_url = 'https://api.coinmarketcap.com/v1/ticker/'
coins = ['bitcoin', 'steem', 'steem-dollars']

for coin in coins:
    # The API returns a list holding one dictionary per requested coin
    coin_json = requests.get(cmc_base_url + coin).json()
    # Keep only the fields we're interested in
    coin_dict = {
        'name': coin_json[0]['name'],
        'price_usd': coin_json[0]['price_usd'],
        'last_updated': coin_json[0]['last_updated']
    }
    ticks.append(coin_dict)

# Save all collected ticks to file, pretty printed
with open('cmc.json', 'w') as f:
    json.dump(ticks, f, indent=4)
If we then open our saved file cmc.json located in our current working directory, we'll see the following output:
[
    {
        "name": "Bitcoin",
        "price_usd": "8790.62",
        "last_updated": "1524345872"
    },
    {
        "name": "Steem",
        "price_usd": "3.05278",
        "last_updated": "1524345847"
    },
    {
        "name": "Steem Dollars",
        "price_usd": "3.21964",
        "last_updated": "1524345848"
    }
]
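Both the price and the timestamp are stored as strings. If you want to calculate with them later on, a possible next step is sketched below. It assumes the saved content looks like the output above (pasted in here as a string so the example runs without the file), and the name typed_ticks is made up for this illustration.

```python
import json
from datetime import datetime, timezone

# Same content as the cmc.json output above, inlined so no file is needed.
saved_text = '''[
    {"name": "Bitcoin", "price_usd": "8790.62", "last_updated": "1524345872"},
    {"name": "Steem", "price_usd": "3.05278", "last_updated": "1524345847"},
    {"name": "Steem Dollars", "price_usd": "3.21964", "last_updated": "1524345848"}
]'''

ticks = json.loads(saved_text)

# Convert the price to a float and the epoch seconds to a UTC datetime.
typed_ticks = [
    {
        'name': tick['name'],
        'price_usd': float(tick['price_usd']),
        'last_updated': datetime.fromtimestamp(int(tick['last_updated']),
                                               tz=timezone.utc),
    }
    for tick in ticks
]

for tick in typed_ticks:
    print(tick['name'], tick['price_usd'], tick['last_updated'].isoformat())
```

Keep in mind that datetime objects can't be passed to json.dump() directly, so you would convert them back to a number or string before saving again.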
What did we learn, hopefully?
In this episode, I showed you how to handle JSON data: deserializing with either json.load() or json.loads() and serializing with json.dump() or json.dumps() via the built-in json module, and using the convenient .json() method of a requests response in case of web APIs returning JSON data.
I've deliberately only covered "the basics" of JSON conversion, because in most of the cases, the techniques covered in this tutorial episode are all you need.
Thank you for your time!
Posted on Utopian.io - Rewarding Open Source Contributors