Partially related to an earlier blog post about minifying JSON (which I had just updated), but adressing a different problem - sometimes, rather than trying to make your JSON more compact (and, consequently, less readable), you may want to make it more readable - be that for your own debugging purposes, or not to scare players away from files that they are allowed to edit.
Take Nuclear Throne, for example. The save file is a big nested JSON structure, and, needless to say, you aren't getting around passing it through an external JSON beautifier if you want to be able to make sense of it:
With a bit of string processing, however, you can have it printed out nicely readable:
So this post is about that.
The idea
The general outline of process is as following:
- Trim any unnecessary whitespace (in case it's inconsistent).
- Cut trailing zeroes from numeric values (4.500000 -> 4.5, 1.000000 -> 1).
- Add a space after each colon if there isn't one ("key": "value", "key": [...], ...).
- After each opening bracket not immediately followed by a closing bracket, increase indentation level and add a new line (if followed, use compact format - [], {}).
- Add a new line (with indentation level in mind) after each comma.
- Reduce indentation level and add a new line before each closing bracket (except for those preceded by an opening bracket).
- Make sure not to reformat contents of strings.
The code
The code is largely based on my 2018 version of minifier script.
First, you'll need a helper script called buffer_write_slice:
/// buffer_write_slice(buffer, data_buffer, data_start, data_end) var start = argument2; var next = argument3 - start; if (next <= 0) exit; var buf = argument0; var data = argument1; var size = buffer_get_size(buf); var pos = buffer_tell(buf); var need = pos + next; if (size < need) { do size *= 2 until (size >= need); buffer_resize(buf, size); } buffer_copy(data, start, next, buf, pos); buffer_seek(buf, buffer_seek_relative, next);
This writes a section of one buffer to other buffer, and is used extensively here.
Then you can add the script itself, called json_beautify:
/// json_beautify(json_string) // initialization // in old versions of GMS, you'd have this ran separately instead. // in GMS2 it'd need to be @"..." instead of just "..." gml_pragma("global", " global.g_json_beautify_fb = buffer_create(1024, buffer_fast, 1); global.g_json_beautify_rb = buffer_create(1024, buffer_grow, 1); "); var src = argument0; // copy text to string buffer: var rb = global.g_json_beautify_rb; buffer_seek(rb, buffer_seek_start, 0); buffer_write(rb, buffer_string, src); var size = buffer_tell(rb) - 1; var rbsize = buffer_get_size(rb); // then copy it to "fast" input buffer for peeking: var fb = global.g_json_beautify_fb; if (buffer_get_size(fb) < size) buffer_resize(fb, size); buffer_copy(rb, 0, size, fb, 0); buffer_seek(rb, buffer_seek_start, 0); // var rbpos = 0; // writing position in output buffer var start = 0; // start offset in input buffer var pos = 0; // reading position in input buffer var next; // number of bytes to be copied var need; var nest = 0; while (pos < size) { var c = buffer_peek(fb, pos++, buffer_u8); switch (c) { case 9: case 10: case 13: case 32: // `\t\n\r ` buffer_write_slice(rb, fb, start, pos - 1); // skip over trailing whitespace: while (pos < size) { switch (buffer_peek(fb, pos, buffer_u8)) { case 9: case 10: case 13: case 32: pos += 1; continue; // default -> break } break; } start = pos; break; case 34: // `"` while (pos < size) { switch (buffer_peek(fb, pos++, buffer_u8)) { case 92: pos++; continue; // `\"` case 34: break; // `"` -> break default: continue; // else } break; } break; case ord("["): case ord("{"): buffer_write_slice(rb, fb, start, pos); // skip over trailing whitespace: while (pos < size) { switch (buffer_peek(fb, pos, buffer_u8)) { case 9: case 10: case 13: case 32: pos += 1; continue; // default -> break } break; } // indent or contract `[]`/`{}` c = buffer_peek(fb, pos, buffer_u8); switch (c) { case ord("]"): case ord("}"): // `[]` or `{}` buffer_write(rb, buffer_u8, c); pos += 1; break; default: // `[\r\n\t buffer_write(rb, buffer_u16, 2573); // `\r\n` repeat (++nest) buffer_write(rb, buffer_u8, 9); // `\t` } start = pos; break; case ord("]"): case ord("}"): buffer_write_slice(rb, fb, start, pos - 1); buffer_write(rb, buffer_u16, 2573); // `\r\n` repeat (--nest) buffer_write(rb, buffer_u8, 9); // `\t` buffer_write(rb, buffer_u8, c); start = pos; break; case ord(","): buffer_write_slice(rb, fb, start, pos); buffer_write(rb, buffer_u16, 2573); // `\r\n` repeat (nest) buffer_write(rb, buffer_u8, 9); // `\t` start = pos; break; case ord(":"): if (buffer_peek(fb, pos, buffer_u8) != ord(" ")) { buffer_write_slice(rb, fb, start, pos); buffer_write(rb, buffer_u8, ord(" ")); start = pos; } else pos += 1; break; default: if (c >= ord("0") && c <= ord("9")) { // `0`..`9` var pre = true; // whether reading pre-dot or not var till = pos - 1; // index at which meaningful part of the number ends while (pos < size) { c = buffer_peek(fb, pos, buffer_u8); if (c == ord(".")) { pre = false; // whether reading pre-dot or not pos += 1; // index at which meaningful part of the number ends } else if (c >= ord("0") && c <= ord("9")) { // write all pre-dot, and till the last non-zero after dot: if (pre || c != ord("0")) till = pos; pos += 1; } else break; } if (till < pos) { // flush if number can be shortened buffer_write_slice(rb, fb, start, till + 1); start = pos; } } } } if (start == 0) return src; // source string was unchanged buffer_write_slice(rb, fb, start, pos); buffer_write(rb, buffer_u8, 0); // terminating byte buffer_seek(rb, buffer_seek_start, 0); return buffer_read(rb, buffer_string);
and then use it on JSON strings you get from json_encode or wherever else.
As an additional note, this uses tabs for indentation, but you can have it to use spaces by changing buffer_write(rb, buffer_u8, 9); to buffer_write(rb, buffer_text, " "), for example.
Have fun !
It’s still usefull and works like a charm in game maker 2.3+ in 2022
Thank you !
Hey man, I’m not sure why, but after saving some in-game stats to JSON and outputting it through this script, HTML5 on GM:S 1.4 seems to not be able to parse it. It can read a straight output from json_decode, though.
As you might suspect, I would need an input JSON sample for that.
Having tested it some more… I think it might be fine? There’s /something/ affecting my JSON reading on HTML5 but I think this script is fine. My apologies Vadim!
Thank you man! It’s so helpful.