August 18, 2023
JSON Introduction For Data Science
Do you work with data from web applications? Interact with REST APIs? Have data that is hierarchical and structured? Then you've probably heard about or used JSON.
So why exactly is JSON so popular? JSON (JavaScript Object Notation) has several advantages.

JSON originated from JavaScript object literals as defined by the ECMAScript Programming Language Standard. The ECMAScript standard facilitated interoperability of web pages across different web browsers. Consequently, JSON quickly became the de facto data interchange format of the web, helping web applications talk to each other.
As we will see below, a large part of JSON's success can be attributed to its simplicity and flexibility.
JSON is a subset of JavaScript but excludes assignment and invocation. JSON has a small set of formatting rules for the portable representation of structured data.
JSON Simple Example
JSON supports four primitive (i.e. basic) types:
- strings - e.g. "Predinfer"
- numbers - e.g. 3.14
- booleans - only the lowercase literal words "true" and "false" are supported
- null - only the lowercase literal word "null" is supported
JSON supports two structured (i.e. complex, user-defined) types:
- object - unordered collection of zero or more name-value pairs
- array - an ordered sequence of zero or more values
What makes the object type very useful is that its name-value pairs adhere to simple rules:
- name - must be a string
- value - must be one of the other supported types i.e. string, number, boolean, null, object or array
This means you can compose your own objects from any of the supported types. The following is a single JSON object whose values cover all the supported types in the order string, number, boolean, null, object and array, i.e. lines 2-7 below.
{
"Company": "Predinfer",
"Year Established": 2022,
"Has Website": true,
"No. of shares": null,
"Address": {"City": "Boston","State": "MA", "Zip": 21140, "Country": "US"},
"Metadata": ["EIN","Annual Report", 314884]
}
You can quickly validate that the above JSON is valid by copy-pasting it into an online validator.
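The same check can also be done locally by parsing the text. The sketch below assumes Python and its standard json module, neither of which is mentioned in the article; the helper name is_valid_json is made up for illustration. json.loads raises json.JSONDecodeError on invalid input, and on success each JSON type is mapped to a Python type.

```python
import json

# The single JSON object from the example above, as text.
text = """
{
"Company": "Predinfer",
"Year Established": 2022,
"Has Website": true,
"No. of shares": null,
"Address": {"City": "Boston","State": "MA", "Zip": 21140, "Country": "US"},
"Metadata": ["EIN","Annual Report", 314884]
}
"""


def is_valid_json(candidate: str) -> bool:
    """Hypothetical helper: True if the text parses as JSON, else False."""
    try:
        json.loads(candidate)
        return True
    except json.JSONDecodeError:
        return False


company = json.loads(text)

# Record which Python type each JSON value was mapped to.
type_names = {name: type(value).__name__ for name, value in company.items()}
for name, type_name in type_names.items():
    print(f"{name!r}: {type_name}")
```

Names in single quotes, for example {'Company': 'Predinfer'}, fail this check because JSON strings must be double quoted.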
JSON Syntax
Since JSON only supports two complex types, i.e. object and array, it needs just six structural characters, { } [ ] : and , to specify an object or an array:
- {} - anything within curly brackets is an object
{
  "Company": "Predinfer",
  "Year Established": 2022
}
- : - a name and its value are separated by a colon
- , - name-value pairs are separated from each other by a comma
- Note - names within an object should be unique, because parsers handle duplicate names inconsistently
{
  "Company": "Predinfer", "Year Established": 2022
}
- [] - anything within square brackets is an array
- , - as with the object above, values are separated from each other by a comma
- Note - unlike some other formats, there is no requirement that the values in an array be of the same type (e.g. Metadata below has both strings and numbers in the array)
{
  "Metadata": ["EIN","Annual Report", 314884]
}
- Insignificant whitespace (which improves readability without impacting meaning) is allowed before or after any of the six structural characters, i.e. { } [ ] : and ,
- A common feature associated with JSON is Pretty Print, which adds insignificant whitespace to improve readability
- So our example from above with Pretty Print enabled would be transformed as follows:
{
  "Company": "Predinfer",
  "Year Established": 2022,
  "Has Website": true,
  "No. of shares": null,
  "Address": {
    "City": "Boston",
    "State": "MA",
    "Zip": 21140,
    "Country": "US"
  },
  "Metadata": ["EIN", "Annual Report", 314884]
}
- Numbers in JSON are written in decimal digits (base 10): an integer part, optionally followed by a fractional and/or exponent part. Many parsers store them as double-precision floating point values
- Exceptions worth highlighting are that leading zeros, Infinity & NaN are NOT permitted
- The following array has valid numbers:
[9999, -100, 3.14, 3e3, 3E-4]
- A string in JSON must be double quoted, and any Unicode characters it contains must be placed within the double quotes
- Control characters (U+0000 through U+001F), the double quote and the backslash cannot appear unescaped inside a string. They are written as escape sequences that start with \ i.e. the backslash character, such as \b (backspace), \t (tab) or \n (newline). If you want the characters \b, \t or \n to appear "literally", i.e. without their special meaning, escape the backslash itself as follows:
["\\b\\t\\n"]
- To achieve maximum interoperability, JSON text must be UTF-8 encoded
- The MIME type for JSON is application/json
- With respect to JSON parsers, in practice you may encounter limitations imposed on:
- Size of text
- Maximum depth of nesting
- Range and precision of numbers
- Length and contents of strings
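Several of the syntax rules above can be checked directly with a parser. The following is a minimal sketch, assuming Python's standard json module (an assumption, since the article names no language); the helper names reject_constant and parses_strictly are made up for illustration. Note that Python's parser is more lenient than the JSON grammar and accepts NaN and Infinity by default, so the sketch uses the documented parse_constant hook to reject them.

```python
import json

compact = '{"Company":"Predinfer","Metadata":["EIN","Annual Report",314884]}'
data = json.loads(compact)

# Pretty Print: indent=2 only adds insignificant whitespace.
pretty = json.dumps(data, indent=2)
print(pretty)
same_data = json.loads(pretty) == data  # the meaning is unchanged


def reject_constant(name: str):
    """Hypothetical hook: refuse NaN, Infinity and -Infinity."""
    raise ValueError(f"{name} is not valid JSON")


def parses_strictly(candidate: str) -> bool:
    """Hypothetical helper: True if the text parses under the rules above."""
    try:
        json.loads(candidate, parse_constant=reject_constant)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# Numbers: the valid array from above, then two invalid ones.
numbers = json.loads("[9999, -100, 3.14, 3e3, 3E-4]")
leading_zero_ok = parses_strictly("[007]")
nan_ok = parses_strictly("[NaN]")

# Strings: a raw tab character inside a string is rejected.
raw_tab_ok = parses_strictly('["a\tb"]')

# Raw Python strings keep the backslashes exactly as written in the JSON text.
control = json.loads(r'["\b\t\n"]')[0]      # three control characters
literal = json.loads(r'["\\b\\t\\n"]')[0]   # six characters: \ b \ t \ n
print(len(control), len(literal))
```

Other parsers may draw these lines differently, which is one reason to test the specific parser you depend on against the limits listed above.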
JSON Complex Example
Now that we've covered all the basics, you can see that JSON is minimal, simple and flexible. This combination makes it powerful, which in turn has made it a very popular format.
In the real world, it is likely that you will encounter complex nested examples of JSON text as seen below. However, keep in mind that any piece of JSON text adheres to the simple rules we've seen above. The complexity comes from the hierarchical structural representation of real-world data, which aims to preserve the relationships and associations between data points.
Below is an example of a "Document Database" which is a type of nonrelational database that stores and queries JSON documents. The bookstore database below has four primary attributes:
- name
- access permission
- location
- inventory
- The attributes "name" and "location" contain simple strings as values.
- The attributes "access permission" and "inventory" are nested further demonstrating the hierarchical nature of JSON.
- The "inventory" attribute consists of an array of three objects, with each object representing a unique book available in the bookstore and having its own schema.
{
  "Database": {
    "name": "Bookstore",
    "access permission": [{
      "Administrator": ["Read", "Insert", "Delete", "Modify"],
      "Manager": "Insert",
      "Sales Representative": ["Read", "Modify"]
    }],
    "location": "New York, USA",
    "inventory": [{
        "title": "The Great Gatsby",
        "author": "F. Scott Fitzgerald",
        "genre": "Fiction",
        "price": 12.99,
        "availability": true,
        "# in stock": 8,
        "Find in Store": {
          "Aisle": 34,
          "Side": "Right",
          "Stack #": 11
        }
      },
      {
        "title": "To Kill a Mockingbird",
        "author": "Harper Lee",
        "genre": "Fiction",
        "price": 10.50,
        "availability": false,
        "expected availability date": "End of 2023",
        "Find in Store": {
          "Aisle": "",
          "Side": "",
          "Stack #": ""
        }
      },
      {
        "title": "1984",
        "author": "George Orwell",
        "genre": "Science Fiction",
        "price": 9.75,
        "availability": true,
        "discount code": "My15Off",
        "Find in Store": {
          "Aisle": 17,
          "Side": "Left",
          "Stack #": 10
        }
      }
    ]
  }
}
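Working with a nested document like this comes down to chaining object lookups and array iteration. The sketch below assumes Python's standard json module (the article itself does not prescribe a language) and uses a trimmed copy of the bookstore document that keeps only the fields it reads. It lists the titles that are available and flattens the inventory into one row per book, a common first step before loading the data into a table.

```python
import json

# Trimmed copy of the bookstore document above (only the fields used below).
text = """
{
  "Database": {
    "name": "Bookstore",
    "location": "New York, USA",
    "inventory": [
      {"title": "The Great Gatsby", "price": 12.99, "availability": true,
       "# in stock": 8},
      {"title": "To Kill a Mockingbird", "price": 10.50, "availability": false,
       "expected availability date": "End of 2023"},
      {"title": "1984", "price": 9.75, "availability": true,
       "discount code": "My15Off"}
    ]
  }
}
"""

database = json.loads(text)["Database"]

# Objects are addressed by name; arrays are iterated (or indexed).
available_titles = [
    book["title"] for book in database["inventory"] if book["availability"]
]

# Each book has its own schema, so optional names are read with .get(),
# which returns None when the name is absent.
rows = [
    {
        "store": database["name"],
        "title": book["title"],
        "price": book["price"],
        "discount code": book.get("discount code"),
    }
    for book in database["inventory"]
]

print(available_titles)
for row in rows:
    print(row)
```

Reading optional names with .get() is a deliberate choice here: indexing with book["discount code"] would raise KeyError for the two books that do not carry that name.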