|
1 |
| -# python-parse-json |
| 1 | +# Reading & Parsing JSON Data With Python |
| 2 | + |
| 3 | +[<img src="https://img.shields.io/static/v1?label=&message=Json&color=brightgreen" />](https://github.com/topics/json) [<img src="https://img.shields.io/static/v1?label=&message=Python&color=important" />](https://github.com/topics/python) |
| 4 | + |
| 5 | + |
| 6 | +- [What is JSON?](#what-is-json) |
| 7 | +- [Converting JSON string to Python object](#converting-json-string-to-python-object) |
| 8 | +- [Converting JSON file to Python object](#converting-json-file-to-python-object) |
| 9 | +- [Converting Python object to JSON string](#converting-python-object-to-json-string) |
| 10 | +- [Writing Python object to a JSON file](#writing-python-object-to-a-json-file) |
| 11 | +- [Converting custom Python objects to JSON objects](#converting-custom-python-objects-to-json-objects) |
| 12 | +- [Creating Python class objects from JSON objects](#creating-python-class-objects-from-json-objects) |
| 13 | + |
| 14 | + |
| 15 | +JSON is a common standard used by websites and APIs, and modern databases such as PostgreSQL support it natively. This tutorial shows how to handle JSON data with Python. |
| 16 | + |
| 17 | +For a detailed explanation, see our [blog post](https://oxylabs.io/blog/python-parse-json). |
| 18 | + |
| 19 | +## What is JSON? |
| 20 | + |
| 21 | +JSON, or JavaScript Object Notation, is a format that uses text to store data objects: |
| 22 | + |
| 23 | +```json |
| 24 | +{ |
| 25 | +    "name": "United States", |
| 26 | +    "population": 331002651, |
| 27 | +    "capital": "Washington D.C.", |
| 28 | +    "languages": [ |
| 29 | +        "English", |
| 30 | +        "Spanish" |
| 31 | +    ] |
| 32 | +} |
| 33 | +``` |
| 34 | + |
| 35 | +## Converting JSON string to Python object |
| 36 | + |
| 37 | +Let’s start with a simple example: |
| 38 | + |
| 39 | +```python |
| 40 | +# JSON string |
| 41 | +country = '{"name": "United States", "population": 331002651}' |
| 42 | +print(type(country)) |
| 43 | +``` |
| 44 | + |
| 45 | +The output of this snippet will confirm that this is indeed a string: |
| 46 | + |
| 47 | +```python |
| 48 | +<class 'str'> |
| 49 | +``` |
| 50 | + |
| 51 | +To parse the string, pass it to the `json.loads()` function: |
| 52 | + |
| 53 | +```python |
| 54 | +import json |
| 55 | + |
| 56 | +country = '{"name": "United States", "population": 331002651}' |
| 57 | +country_dict = json.loads(country) |
| 58 | + |
| 59 | +print(type(country)) |
| 60 | +print(type(country_dict)) |
| 61 | +``` |
| 62 | + |
| 63 | +The output of this snippet will confirm that the JSON data, which was a string, is now a Python dictionary. |
| 64 | + |
| 65 | +```python |
| 66 | +<class 'str'> |
| 67 | +<class 'dict'> |
| 68 | +``` |
| 69 | + |
| 70 | +This dictionary can be accessed as usual: |
| 71 | + |
| 72 | +```python |
| 73 | +print(country_dict['name']) |
| 74 | +# OUTPUT: United States |
| 75 | +``` |
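
Not every string is valid JSON. As a minimal sketch, malformed input can be caught through `json.JSONDecodeError`, the exception that `json.loads()` raises for invalid documents. The `parse_or_none` helper name is hypothetical and not part of the `json` module:

```python
import json


def parse_or_none(text):
    """Return the parsed value, or None if `text` is not valid JSON.

    Hypothetical helper, for illustration only.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # err.msg describes the problem; err.pos is the character offset.
        print(f"Invalid JSON at position {err.pos}: {err.msg}")
        return None


valid = parse_or_none('{"name": "United States", "population": 331002651}')
# JSON requires double quotes, so this single-quoted string is rejected.
invalid = parse_or_none("{'name': 'United States'}")

print(valid["name"])
print(invalid)
```

Returning `None` on failure keeps the sketch short, but it cannot be told apart from a valid JSON `null`, so real code may prefer to let the exception propagate.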
| 76 | + |
| 77 | +Note that `json.loads()` does not always return a dictionary. The returned data type depends on the input string. For example, this JSON string produces a list, not a dictionary: |
| 78 | + |
| 79 | +```python |
| 80 | +countries = '["United States", "Canada"]' |
| 81 | +countries_list = json.loads(countries) |
| 82 | + |
| 83 | +print(type(countries_list)) |
| 84 | +# OUTPUT: <class 'list'> |
| 85 | +``` |
| 86 | + |
| 87 | +Similarly, if the JSON string contains `true`, it is converted to the equivalent Python Boolean value, `True`. |
| 88 | + |
| 89 | +``` |
| 90 | +import json |
| 91 | + |
| 92 | +bool_string = 'true' |
| 93 | +bool_type = json.loads(bool_string) |
| 94 | +print(bool_type) |
| 95 | +# OUTPUT: True |
| 96 | +``` |
| 97 | + |
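| 98 | +`json.loads()` converts each JSON type to its Python counterpart: an object becomes a `dict`, an array a `list`, a string a `str`, an integer number an `int`, a real number a `float`, `true` and `false` become `True` and `False`, and `null` becomes `None`. For the full conversion table, see the [Python docs](https://docs.python.org/3/library/json.html#json-to-py-table). |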
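
The conversions can be checked directly. The short sketch below parses one sample per JSON type and prints the resulting Python type; the sample strings are made up for illustration:

```python
import json

# One sample per JSON type; these inputs are illustrative only.
samples = ['{"a": 1}', '[1, 2]', '"text"', '42', '3.14', 'true', 'false', 'null']

for sample in samples:
    value = json.loads(sample)
    print(f"{sample:<10} -> {type(value).__name__}")

# Collect the type names so they can be inspected in one place.
type_names = [type(json.loads(sample)).__name__ for sample in samples]
```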
| 99 | + |
| 100 | +## Converting JSON file to Python object |
| 101 | + |
| 102 | +Save the following JSON data as a new file and name it `united_states.json`: |
| 103 | + |
| 104 | +```json |
| 105 | +{ |
| 106 | +    "name": "United States", |
| 107 | +    "population": 331002651, |
| 108 | +    "capital": "Washington D.C.", |
| 109 | +    "languages": [ |
| 110 | +        "English", |
| 111 | +        "Spanish" |
| 112 | +    ] |
| 113 | +} |
| 114 | +``` |
| 115 | + |
| 116 | +Enter this Python script in a new file: |
| 117 | + |
| 118 | +```python |
| 119 | +import json |
| 120 | + |
| 121 | +with open('united_states.json') as f: |
| 122 | +    data = json.load(f) |
| 123 | + |
| 124 | +print(type(data)) |
| 125 | +``` |
| 126 | + |
| 127 | +Running this Python file prints the following: |
| 128 | + |
| 129 | +```python |
| 130 | +<class 'dict'> |
| 131 | +``` |
| 132 | + |
| 133 | +The dictionary keys can be checked as follows: |
| 134 | + |
| 135 | +```python |
| 136 | +print(data.keys()) |
| 137 | +# OUTPUT: dict_keys(['name', 'population', 'capital', 'languages']) |
| 138 | +``` |
| 139 | + |
| 140 | +Using this information, the value of `name` can be printed as follows: |
| 141 | + |
| 142 | +```python |
| 143 | +print(data['name']) |
| 144 | +# OUTPUT: United States |
| 145 | +``` |
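
Nested values are reached the same way. The sketch below is self-contained: it first writes a small sample file to a temporary directory, which is an assumption made only so the example runs anywhere, then reads it back and walks the `languages` array. The `anthem` key is hypothetical and deliberately missing from the data:

```python
import json
import os
import tempfile

# Write a small sample file first so the sketch is self-contained.
sample = '{"name": "United States", "languages": ["English", "Spanish"]}'
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "united_states.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # JSON text is normally UTF-8, so the encoding is passed explicitly.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

# A JSON array becomes a list, so it can be indexed and iterated.
first_language = data["languages"][0]
for language in data["languages"]:
    print(language)

# dict.get() returns a default instead of raising KeyError for a missing key.
anthem = data.get("anthem", "unknown")
print(anthem)
```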
| 146 | + |
| 147 | +## Converting Python object to JSON string |
| 148 | + |
| 149 | +Save this code in a new file as a Python script: |
| 150 | + |
| 151 | +```python |
| 152 | +import json |
| 153 | + |
| 154 | +languages = ["English","French"] |
| 155 | +country = { |
| 156 | +    "name": "Canada", |
| 157 | +    "population": 37742154, |
| 158 | +    "languages": languages, |
| 159 | +    "president": None, |
| 160 | +} |
| 161 | + |
| 162 | +country_string = json.dumps(country) |
| 163 | +print(country_string) |
| 164 | +``` |
| 165 | + |
| 166 | +When this file is run with Python, the following output is printed: |
| 167 | + |
| 168 | +```python |
| 169 | +{"name": "Canada", "population": 37742154, "languages": ["English", "French"], |
| 170 | + "president": null} |
| 171 | +``` |
| 172 | + |
| 173 | +Lists can be converted to JSON as well. Here is the Python script and its output: |
| 174 | + |
| 175 | +```python |
| 176 | +import json |
| 177 | + |
| 178 | +languages = ["English", "French"] |
| 179 | + |
| 180 | +languages_string = json.dumps(languages) |
| 181 | +print(languages_string) |
| 182 | +# OUTPUT: ["English", "French"] |
| 183 | +``` |
| 184 | + |
| 185 | +Conversion is not limited to dictionaries and lists. `str`, `int`, `float`, `bool`, and even `None` values can be converted to JSON. |
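
As a quick illustration, the sketch below passes one value of each scalar type to `json.dumps()`. The values themselves are arbitrary:

```python
import json

print(json.dumps("Canada"))   # str -> JSON string, quotes included
print(json.dumps(37742154))   # int -> JSON number
print(json.dumps(9.98))       # float -> JSON number
print(json.dumps(True))       # bool -> true
print(json.dumps(None))       # None -> null

# The same conversions collected into a list of JSON strings.
scalar_results = [json.dumps(v) for v in ("Canada", 37742154, 9.98, True, None)]
```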
| 186 | + |
| 187 | +## Writing Python object to a JSON file |
| 188 | + |
| 189 | +The method used to write a JSON file is `dump()`: |
| 190 | + |
| 191 | +```python |
| 192 | +import json |
| 193 | + |
| 194 | +# Tuple is encoded to JSON array. |
| 195 | +languages = ("English", "French") |
| 196 | +# Dictionary is encoded to JSON object. |
| 197 | +country = { |
| 198 | +    "name": "Canada", |
| 199 | +    "population": 37742154, |
| 200 | +    "languages": languages, |
| 201 | +    "president": None, |
| 202 | +} |
| 203 | + |
| 204 | +with open('countries_exported.json', 'w') as f: |
| 205 | +    json.dump(country, f) |
| 206 | +``` |
| 207 | + |
| 208 | +To make the output more readable, pass the `indent` parameter to the `dump()` function as follows: |
| 209 | + |
| 210 | +```python |
| 211 | +json.dump(country, f, indent=4) |
| 212 | +``` |
| 213 | + |
| 214 | +This time, the file is written with an indentation of 4 spaces: |
| 215 | + |
| 216 | +```json |
| 217 | +{ |
| 218 | +    "name": "Canada", |
| 219 | +    "population": 37742154, |
| 220 | +    "languages": [ |
| 221 | +        "English", |
| 222 | +        "French" |
| 223 | +    ], |
| 224 | +    "president": null |
| 225 | +} |
| 226 | +``` |
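
The same formatting parameters work with `json.dumps()`. The sketch below also passes `sort_keys=True`, an addition not used elsewhere in this tutorial, and parses the result back to show that the tuple does not survive the round trip:

```python
import json

country = {
    "name": "Canada",
    "population": 37742154,
    "languages": ("English", "French"),  # a tuple, encoded as a JSON array
    "president": None,
}

# dumps() accepts the same formatting parameters as dump().
pretty = json.dumps(country, indent=4, sort_keys=True)
print(pretty)

restored = json.loads(pretty)
# JSON has no tuple type, so the tuple comes back as a list.
print(type(restored["languages"]))
print(restored["president"] is None)
```

Sorting keys makes the output stable across runs, which helps when exported files are compared or kept under version control.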
| 227 | + |
| 228 | +## Converting custom Python objects to JSON objects |
| 229 | + |
| 230 | +Save the following code as a Python script and run it: |
| 231 | + |
| 232 | +```python |
| 233 | +import json |
| 234 | + |
| 235 | +class Country: |
| 236 | +    def __init__(self, name, population, languages): |
| 237 | +        self.name = name |
| 238 | +        self.population = population |
| 239 | +        self.languages = languages |
| 240 | + |
| 241 | + |
| 242 | +canada = Country("Canada", 37742154, ["English", "French"]) |
| 243 | + |
| 244 | +print(json.dumps(canada)) |
| 245 | +# OUTPUT: TypeError: Object of type Country is not JSON serializable |
| 246 | +``` |
| 247 | + |
| 248 | +To convert such objects to JSON, we need to write a new class that extends `json.JSONEncoder` and overrides its `default()` method: |
| 249 | + |
| 250 | +```python |
| 251 | +import json |
| 252 | + |
| 253 | +class CountryEncoder(json.JSONEncoder): |
| 254 | +    def default(self, o): |
| 255 | +        if isinstance(o, Country): |
| 256 | +            # The JSON object is represented as a dictionary. |
| 257 | +            return { |
| 258 | +                "name": o.name, |
| 259 | +                "population": o.population, |
| 260 | +                "languages": o.languages |
| 261 | +            } |
| 262 | +        else: |
| 263 | +            # Base class will raise the TypeError. |
| 264 | +            return super().default(o) |
| 265 | +``` |
| 266 | + |
| 267 | +This class can now be passed to both `json.dump()` and `json.dumps()` through the `cls` parameter. |
| 268 | + |
| 269 | +```python |
| 270 | +print(json.dumps(canada, cls=CountryEncoder)) |
| 271 | +# OUTPUT: {"name": "Canada", "population": 37742154, "languages": ["English", "French"]} |
| 272 | +``` |
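
A lighter alternative to subclassing is the `default` parameter of `json.dumps()`, which takes a function that is called for any object the encoder cannot serialize. This is a sketch of that swapped-in technique, not the approach used above; the `encode_country` function name is hypothetical:

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def encode_country(o):
    """Hypothetical helper: json.dumps() calls it for unsupported objects."""
    if isinstance(o, Country):
        return {
            "name": o.name,
            "population": o.population,
            "languages": o.languages,
        }
    # Mirror JSONEncoder.default() and reject anything else.
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


canada = Country("Canada", 37742154, ["English", "French"])
country_json = json.dumps(canada, default=encode_country)
print(country_json)
```

A function is enough for a single call site, while an encoder class is easier to reuse and extend.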
| 273 | + |
| 274 | +## Creating Python class objects from JSON objects |
| 275 | + |
| 276 | +Using a custom encoder, we were able to write code like this: |
| 277 | + |
| 278 | +```python |
| 279 | +# Create an object of class Country |
| 280 | +canada = Country("Canada", 37742154, ["English", "French"]) |
| 281 | +# Use json.dump() to create a JSON file in writing mode |
| 282 | +with open('canada.json','w') as f: |
| 283 | +    json.dump(canada, f, cls=CountryEncoder) |
| 284 | +``` |
| 285 | + |
| 286 | +If we try to parse this JSON file using the `json.load()` method, we will get a dictionary: |
| 287 | + |
| 288 | +```python |
| 289 | +with open('canada.json','r') as f: |
| 290 | +    country_object = json.load(f) |
| 291 | +print(type(country_object))  # OUTPUT: <class 'dict'> |
| 292 | +``` |
| 293 | + |
| 294 | +To get an instance of the `Country` class instead of a dictionary, we need to create a custom decoder: |
| 295 | + |
| 296 | +```python |
| 297 | +import json |
| 298 | + |
| 299 | +class CountryDecoder(json.JSONDecoder): |
| 300 | +    def __init__(self, object_hook=None, *args, **kwargs): |
| 301 | +        super().__init__(object_hook=self.object_hook, *args, **kwargs) |
| 302 | + |
| 303 | +    def object_hook(self, o): |
| 304 | +        decoded_country = Country( |
| 305 | +            o.get('name'), |
| 306 | +            o.get('population'), |
| 307 | +            o.get('languages'), |
| 308 | +        ) |
| 309 | +        return decoded_country |
| 310 | +``` |
| 311 | + |
| 312 | +Finally, we can call the `json.load()` method and set the `cls` parameter to the `CountryDecoder` class. |
| 313 | + |
| 314 | +```python |
| 315 | +with open('canada.json','r') as f: |
| 316 | +    country_object = json.load(f, cls=CountryDecoder) |
| 317 | + |
| 318 | +print(type(country_object)) |
| 319 | +# OUTPUT: <class '__main__.Country'> |
| 320 | +``` |
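
The hook runs for every JSON object in the input, including nested ones, so a decoder usually needs to check what it has been given. The sketch below shows one possible variation rather than the decoder class above: it passes a plain function through the `object_hook` parameter of `json.loads()`. The `decode_country` function name and the key check are assumptions made for illustration:

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def decode_country(o):
    """Hypothetical hook: called for every decoded JSON object."""
    # Convert only dictionaries that look like a country; pass others through.
    if o.keys() >= {"name", "population", "languages"}:
        return Country(o["name"], o["population"], o["languages"])
    return o


text = '{"name": "Canada", "population": 37742154, "languages": ["English", "French"]}'
country_object = json.loads(text, object_hook=decode_country)
other = json.loads('{"status": "ok"}', object_hook=decode_country)

print(country_object.name)
print(type(other))
```

With the key check in place, unrelated objects stay ordinary dictionaries instead of becoming `Country` instances filled with `None` values.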
| 321 | + |
| 322 | + |
| 323 | +If you wish to find out more about Reading & Parsing JSON Data With Python, see our [blog post](https://oxylabs.io/blog/python-parse-json). |