NOTE: This article presents three scripts: one Node.js and two Python. Scroll to the section “Full Path Json with Python (Better – version 2a)” for the best one, and see the GIST for the downloadable script. The same result is also possible with jq and with an existing tool called gron, both covered at the end.

Sometimes JSON files are nested so deeply that it is hard to find the values you are looking for. Of course, one could use jq, but to be fast with jq you first need to know the structure of the JSON file. Here is a proposed method for looking through JSON files using full path json: a flat, one-layer-deep representation of the JSON object of the form { full-key-path : value }. It makes the whole object easy to grep through and easy to understand.

Note the input requirements: the JSON must be saved as text in a file, and it must be valid JSON that a JSON parser will accept. For example, keys and string values must be surrounded by double quotes.

So this would fail to be processed:

{
  one: 1,
  two: {
      three: 3
  },
  four: { 
      five: 5,
      six: {
          seven: 7
      },
      eight: 8
   },
  nine: 9
}

But this would process:

{
  "one": 1,
  "two": {
      "three": 3
  },
  "four": { 
      "five": 5,
      "six": {
          "seven": 7
      },
      "eight": 8
   },
  "nine": 9
}
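
If you are not sure whether a file meets that requirement, a quick check along these lines can help. This is only a minimal sketch; the helper name is_valid_json and the two sample strings are made up for illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Unquoted keys are not valid JSON; double-quoted keys are.
bad = '{ one: 1, two: { three: 3 } }'
good = '{ "one": 1, "two": { "three": 3 } }'

print(is_valid_json(bad))   # False
print(is_valid_json(good))  # True
```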

The output just needs to be something that is easy to grep for the words you are looking for.

Example:

{
  one: 1,
  'two.three': 3,
  'four.five': 5,
  'four.six.seven': 7,
  'four.eight': 8,
  nine: 9
}

# Or

{'four.eight': 8,
 'four.five': 5,
 'four.six.seven': 7,
 'nine': 9,
 'one': 1,
 'two.three': 3}

# Or 

{'four/eight': 8,
 'four/five': 5,
 'four/six/seven': 7,
 'nine': 9,
 'one': 1,
 'two/three': 3}

As you can see, you can easily grep for the word “six” in those outputs to get everything under the key “six”. However, grepping the original JSON string for “six” would not return anything meaningful: you would only get the line containing the key itself, without its parent keys or the values nested under it. You would need to use jq, and for jq you need to be familiar with the layout of the file. With full path json you see the layout at a glance.
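
The same idea can be sketched in Python once the data is flat. The dictionary below is the flattened example from above, typed in by hand, and grep_keys is a hypothetical helper name, not part of any of the scripts in this article.

```python
# Flattened form of the example object, with "/" as the path separator.
flat = {
    "one": 1,
    "two/three": 3,
    "four/five": 5,
    "four/six/seven": 7,
    "four/eight": 8,
    "nine": 9,
}


def grep_keys(flat_dict: dict, word: str) -> dict:
    """Keep only the entries whose full key path contains word."""
    return {k: v for k, v in flat_dict.items() if word in k}


matches = grep_keys(flat, "six")
print(matches)  # {'four/six/seven': 7}
```

Because every key carries its full path, a plain substring match is enough. No knowledge of the nesting is needed.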

Note: Below are 3 programs: 1 Node.js program and 2 Python programs. The worst one is the Node.js version, as it sometimes returns objects or errors where the others return the output. The best one is the last Python one (fullpathjson2.py). Feel free to use whichever you want though. Scroll to the section “Full Path Json with Python (Better – version 2a)” for the best one.

Full Path Json With Node.JS

Here is an attempt via node.js

Step 1. Install node.js

Examples:

For Ubuntu / Debian Linux:

apt install nodejs;

For macOS:

brew install node;

For CentOS Linux:

yum install nodejs;

Step 2. Create the following node.js program. Just put this content into a file called fullpathjson.js.

cat fullpathjson.js

/* ================================================================================
function: read in json file from first arg and print out full path json format.

consider this input file:

$ cat test.json

{
  "one": 1,
  "two": {
      "three": 3
  },
  "four": { 
      "five": 5,
      "six": {
          "seven": 7
      },
      "eight": 8
   },
  "nine": 9
}

example:

$ node fullpathjson.js test.json

output:

{
  one: 1,
  'two.three': 3,
  'four.five': 5,
  'four.six.seven': 7,
  'four.eight': 8,
  nine: 9
}

================================================================================ */

const fs = require("fs");

const filename = process.argv[2];

// Recursively flatten nested objects and arrays into { "a.b.c": value } form.
const flatObject = (obj, keyPrefix = null) =>
  Object.entries(obj).reduce((acc, [key, val]) => {
    const nextKey = keyPrefix ? `${keyPrefix}.${key}` : key;
    // typeof null is "object", so check for null before recursing;
    // otherwise Object.entries(null) throws.
    if (val === null || typeof val !== "object") {
      return { ...acc, [nextKey]: val };
    }
    return { ...acc, ...flatObject(val, nextKey) };
  }, {});

fs.readFile(filename, "utf8", (err, jsonString) => {
  if (err) {
    console.log("Error reading file from disk:", err);
    return;
  }
  try {
    const obj = JSON.parse(jsonString);
    // console.log(obj); // DEBUG: shows given json object
    console.log(flatObject(obj));
  } catch (err) {
    console.log("Error parsing JSON string:", err);
  }
});

Step 3. Run the program against some json output like so

node fullpathjson.js input.json

The output should be easier to interpret and is very easily greppable.

Here is example output from an fio job that saved its results as JSON. We then ran the resulting JSON file against fullpathjson.js.

...skip...
 'fio version': 'fio-3.7',
  timestamp: 1668015571,
  time: 'Wed Nov  9 09:39:31 2022',
  'global options.iodepth': '1',
  'global options.bs': '32k,100k',
  'global options.direct': '1',
  'global options.runtime': '180',
  'global options.rwmixread': '70',
  'global options.rw': 'randrw',
  'global options.numjobs': '40',
  'client_stats.0.jobname': 'job1',
  'client_stats.0.groupid': 0,
  'client_stats.0.error': 117,
  'client_stats.0.job options.filename': '/mnt/nvme1/testfile.bin:/mnt/nvme2/testfile.bin:/mnt/nvme3/testfile.bin:/mnt/nvme4/testfile.bin:/mnt/nvme5/testfile.bin:/mnt/nv',
  'client_stats.0.read.io_bytes': 4423680,
  'client_stats.0.read.io_kbytes': 4320,
  'client_stats.0.read.bw_bytes': 552960000,
  'client_stats.0.read.bw': 540000,
  'client_stats.0.read.iops': 20250,
  'client_stats.0.read.runtime': 8,
  'client_stats.0.read.total_ios': 162,
  'client_stats.0.read.short_ios': 0,
  'client_stats.0.read.drop_ios': 0,
  'client_stats.0.read.slat_ns.min': 0,
  'client_stats.0.read.slat_ns.max': 0,
  'client_stats.0.read.slat_ns.mean': 0,
  'client_stats.0.read.slat_ns.stddev': 0,
  'client_stats.0.read.clat_ns.min': 324640,
  'client_stats.0.read.clat_ns.max': 1766444,
  'client_stats.0.read.clat_ns.mean': 604275.614815,
  'client_stats.0.read.clat_ns.stddev': 297093.707537,
  'client_stats.0.read.clat_ns.percentile.1.000000': 346112,
  'client_stats.0.read.clat_ns.percentile.5.000000': 362496,
  'client_stats.0.read.clat_ns.percentile.10.000000': 382976,
  'client_stats.0.read.clat_ns.percentile.20.000000': 419840,
...skip...

Full Path Json with Python

Here is an example of full path json using python.

Create the following Python script and call it fullpathjson.py.

import json
import sys
import pprint

def full_path_json(json_string: str) -> dict:
    def transform(obj, parent_key='', separator='.'):
        items = []
        for k, v in obj.items():
            new_key = parent_key + separator + k if parent_key else k
            if isinstance(v, dict):
                items.extend(transform(v, new_key, separator).items())
            else:
                items.append((new_key, v))
        return dict(items)

    # Parse the JSON string and transform it
    obj = json.loads(json_string)
    return transform(obj)


if __name__ == "__main__":
    # get filename from arg
    filename = sys.argv[1]

    # Read the JSON file and transform it
    with open(filename, 'r') as f:
        json_string = f.read()

    # flattened dict
    fdict = full_path_json(json_string)

    # pretty print
    pprint.pprint(fdict)

Here is a small example JSON file which we can use as the input; it's the same one from the fullpathjson.js comment header.

$ cat test.json

{
  "one": 1,
  "two": {
      "three": 3
  },
  "four": { 
      "five": 5,
      "six": {
          "seven": 7
      },
      "eight": 8
   },
  "nine": 9
}

Now apply the full path json.

$ python fullpathjson.py test.json

{'four.eight': 8,
 'four.five': 5,
 'four.six.seven': 7,
 'nine': 9,
 'one': 1,
 'two.three': 3}

The output is slightly different (pprint sorts the keys alphabetically and quotes all of them), however, it is still very grep-able and that is the whole goal.
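
One thing worth knowing about this first version: transform only descends into dictionaries, so a list is kept whole as a value. The sketch below repeats the same transform logic on a made-up sample (the jobs and opts keys are invented for illustration) to show the effect.

```python
def transform(obj, parent_key="", separator="."):
    # Same logic as transform() in fullpathjson.py above.
    items = []
    for k, v in obj.items():
        new_key = parent_key + separator + k if parent_key else k
        if isinstance(v, dict):
            items.extend(transform(v, new_key, separator).items())
        else:
            # Lists land here too, so they are not flattened further.
            items.append((new_key, v))
    return dict(items)


sample = {"jobs": [{"name": "job1"}, {"name": "job2"}], "opts": {"bs": "32k"}}
result = transform(sample)
print(result)
# {'jobs': [{'name': 'job1'}, {'name': 'job2'}], 'opts.bs': '32k'}
```

The version 2 script later in this article also walks into lists and uses the list index as part of the path.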

Full Path Json with Python (Better – version 2a)

Script fullpathjson2.py

Note: Download this script from my github GIST as fullpathjson.py

import json
import sys
# import pprint

def flatten_json(nested_json: dict, exclude: list=[], delim: str = "/") -> dict:
    """Flatten json object with nested keys into a single level.
        Args:
            nested_json {dict}: A nested json object.
            exclude {list}: Keys to exclude from output.
            delim {str}: path delimiter, default is "/"
        Returns:
            {dict}: The flattened json object if successful, None otherwise.
    """
    out = {}

    def flatten(x: dict, name: str='', exclude: list=[]):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f"{name}{delim}{a}", exclude)
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, f"{name}{delim}{i}", exclude)
                i += 1
        else:
            # Strip the leading delimiter (also correct for multi-character delimiters)
            out[name[len(delim):]] = x

    flatten(nested_json, exclude=exclude)
    return out

if __name__ == "__main__":
    filename = sys.argv[1]
    # Use a context manager so the file is closed after reading
    with open(filename, "r") as f:
        file_content = f.read()
    dict_content = json.loads(file_content)
    flattened_dict = flatten_json(dict_content)
    # pprint.pprint(flattened_dict, width=3000)
    for key, value in flattened_dict.items():
        print(f'{key}: {value}')

Example:

# Input

$ cat test.json

{
  "one": 1,
  "two": {
      "three": 3
  },
  "four": { 
      "five": 5,
      "six": {
          "seven": 7
      },
      "eight": 8
   },
  "nine": 9
}

# Output

$ python fullpathjson2.py test.json
one: 1
two/three: 3
four/five: 5
four/six/seven: 7
four/eight: 8
nine: 9
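
The main section of the script never uses the exclude and delim parameters, so here is a rough sketch of how they behave. The function below is a compact restatement of the same flattening idea so that the example runs on its own; it is not the GIST script itself, and the sample data is made up.

```python
def flatten_json(nested, exclude=(), delim="/"):
    """Compact restatement of the flatten_json idea, for a self-contained demo."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key, value in x.items():
                if key not in exclude:
                    flatten(value, f"{name}{delim}{key}")
        elif isinstance(x, list):
            for i, value in enumerate(x):
                flatten(value, f"{name}{delim}{i}")
        else:
            # Drop the leading delimiter from the accumulated path.
            out[name[len(delim):]] = x

    flatten(nested)
    return out


data = {"one": 1, "four": {"five": 5, "six": {"seven": 7}}, "list": [10, {"k": "v"}]}

default_flat = flatten_json(data)
print(default_flat)
# {'one': 1, 'four/five': 5, 'four/six/seven': 7, 'list/0': 10, 'list/1/k': 'v'}

custom_flat = flatten_json(data, exclude=["six"], delim=".")
print(custom_flat)
# {'one': 1, 'four.five': 5, 'list.0': 10, 'list.1.k': 'v'}
```

Note that an excluded key drops its whole subtree, and that empty dictionaries or lists produce no output lines at all.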

Full Path Json with Python (Older Good version – version 2_old)

You can get more JSON-like output by printing with pretty print instead of the for loop. This is how the older version of the better script worked, which is why it's called 2_old. To do that, uncomment the pprint import at the top, uncomment the pprint call in the main section, then comment out the for loop in the main section. You can also download the zOld-fullpathjson.py file from the GIST. The script will look like this.

import json
import sys
import pprint

def flatten_json(nested_json: dict, exclude: list=[], delim: str = "/") -> dict:
    """Flatten json object with nested keys into a single level.
        Args:
            nested_json {dict}: A nested json object.
            exclude {list}: Keys to exclude from output.
            delim {str}: path delimiter, default is "/"
        Returns:
            {dict}: The flattened json object if successful, None otherwise.
    """
    out = {}

    def flatten(x: dict, name: str='', exclude: list=[]):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f"{name}{delim}{a}", exclude)
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, f"{name}{delim}{i}", exclude)
                i += 1
        else:
            # Strip the leading delimiter (also correct for multi-character delimiters)
            out[name[len(delim):]] = x

    flatten(nested_json, exclude=exclude)
    return out

if __name__ == "__main__":
    filename = sys.argv[1]
    # Use a context manager so the file is closed after reading
    with open(filename, "r") as f:
        file_content = f.read()
    dict_content = json.loads(file_content)
    flattened_dict = flatten_json(dict_content)
    pprint.pprint(flattened_dict, width=3000)
    # for key, value in flattened_dict.items():
    #    print(f'{key}: {value}')

Its output with pretty print will look like this. Note the keys are surrounded by quotes and the entirety is surrounded by curly braces.

# Input

$ cat test.json

{
  "one": 1,
  "two": {
      "three": 3
  },
  "four": { 
      "five": 5,
      "six": {
          "seven": 7
      },
      "eight": 8
   },
  "nine": 9
}

# Output

$ python fullpathjson2.py test.json
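{'four/eight': 8, 'four/five': 5, 'four/six/seven': 7, 'nine': 9, 'one': 1, 'two/three': 3}

Because of width=3000, a small dictionary like this one is printed on a single line. Output longer than 3000 characters is wrapped to one key per line.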

Full Path Json with Jq (Works Amazing)

Here is a method to create greppable and full path json with jq.

cat somefile.json | jq -r '. as $i | path(.. |  scalars) as $p | $p | map(tostring) | join("/") + ": " + ($i | getpath($p) | tostring)'

jq -r '. as $i | path(.. |  scalars) as $p | $p | map(tostring) | join("/") + ": " + ($i | getpath($p) | tostring)' < somefile.json

Output Example:

$ jq -r '. as $i | path(.. |  scalars) as $p | $p | map(tostring) | join("/") + ": " + ($i | getpath($p) | tostring)' <<< '{"a": [1,2,3], "b": 123, "c": "abc"}'
a/0: 1
a/1: 2
a/2: 3
b: 123
c: abc
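
For readers who find the jq filter hard to read, here is a rough Python sketch of the same idea: walk to every scalar, remember the path taken, and print "path: value". The function name path_lines is made up, and this only approximates jq's behaviour (for example, non-string scalars are printed in their JSON spelling).

```python
import json
from typing import Any, Iterator


def path_lines(value: Any, path: tuple = ()) -> Iterator[str]:
    """Yield 'path: value' lines for every scalar, depth first."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from path_lines(child, path + (str(key),))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from path_lines(child, path + (str(index),))
    else:
        # Strings are printed raw; other scalars use their JSON spelling
        # (true, false, null), roughly like jq's tostring.
        text = value if isinstance(value, str) else json.dumps(value)
        yield "/".join(path) + ": " + text


doc = json.loads('{"a": [1,2,3], "b": 123, "c": "abc"}')
lines = list(path_lines(doc))
print("\n".join(lines))
```

For this input it prints the same five lines as the jq example above.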

Full Path Json with Gron (a jq script) (Works Amazing)

gron is a jq script. You can download it and set it to be executable.

Here is the implementation: https://gist.github.com/emanuele6/328119cf459a68c80d34818881b3fdff and another implementation https://github.com/tomnomnom/gron

Install emanuele6 version:

GRON_URL=https://gist.githubusercontent.com/emanuele6/328119cf459a68c80d34818881b3fdff/raw/11bec90b9753a3118ac11c563be7bb2a4bb7061d/gron.jq
wget "$GRON_URL" -O gron.jq || curl "$GRON_URL" -o gron.jq 
chmod +x gron.jq
sudo mv gron.jq /bin/

Now run it like this

gron.jq < somefile.json
gron.jq <<< 'json string'

Example output (side note: to pass a JSON string directly you must use <<< instead of <):

[root@l64sv3-b-2 azureuser]# gron.jq <<< '{"a": [1,2,3], "b": 123, "c": "abc"}'
json = {};
json.a = [];
json.a[0] = 1;
json.a[1] = 2;
json.a[2] = 3;
json.b = 123;
json.c = "abc";

Official Gron

The gron above is a simple jq script. Here is a Go implementation that can gron and ungron to get back to JSON format. So you can gron, grep or modify the content, and ungron back to JSON format.

### INSTALLATION ###

# For macOS:

brew install gron

# For Ubuntu:

snap search gron && snap install gron

# For other AMD64. Get latest release URL from https://github.com/tomnomnom/gron/releases

GRON_URL="https://github.com/tomnomnom/gron/releases/download/v0.7.1/gron-linux-amd64-0.7.1.tgz"
# GitHub release URLs redirect, so curl needs -L to follow them
wget "$GRON_URL" -O gron.tgz || curl -L "$GRON_URL" -o gron.tgz
tar xf gron.tgz
sudo mv gron /usr/bin/

### EXAMPLE ###

❯  gron <<< '{"a": [1,2,3], "b": 123, "c": "abc"}'
json = {};
json.a = [];
json.a[0] = 1;
json.a[1] = 2;
json.a[2] = 3;
json.b = 123;
json.c = "abc";

❯  gron <<< '{"a": [1,2,3], "b": 123, "c": "abc"}' | grep b
json.b = 123;
json.c = "abc";

❯  gron <<< '{"a": [1,2,3], "b": 123, "c": "abc"}' | grep b | gron -u
{
  "b": 123,
  "c": "abc"
}
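
The ungron step can be pictured as the reverse of flattening. Below is a minimal Python sketch of that round trip for the "/"-separated keys used by the Python scripts in this article. It is not how gron itself is implemented; unflatten is a hypothetical helper, and as a simplification it rebuilds only dictionaries, so list indexes come back as ordinary string keys.

```python
def unflatten(flat: dict, delim: str = "/") -> dict:
    """Rebuild nested dicts from {'a/b/c': value} style keys.

    Simplification: numeric path parts stay dict keys (no lists are rebuilt),
    and keys that themselves contain the delimiter are not supported.
    """
    nested: dict = {}
    for full_key, value in flat.items():
        parts = full_key.split(delim)
        node = nested
        for part in parts[:-1]:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


flat = {"one": 1, "two/three": 3, "four/five": 5, "four/six/seven": 7}

# "grep" for the entries of interest, then rebuild the nested structure.
kept = {k: v for k, v in flat.items() if "four" in k}
rebuilt = unflatten(kept)
print(rebuilt)  # {'four': {'five': 5, 'six': {'seven': 7}}}
```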

Many Thanks

Thanks to the folks at the jq GitHub page. I submitted a feature request for this, however it turns out that this is already possible via jq and gron. https://github.com/jqlang/jq/issues/2991
