GameMaker: Escaping URL parameters


Illustrating why you should escape your URL parameters

If you use HTTP requests in your GameMaker projects, you need to encode the parameters you put into URLs so that your requests do not become malformed by accident (or less accidentally, if user input is involved).

With some luck you may have already found the script on the YYG helpdesk, but there's a minor caveat: as of writing this post, it does not support Unicode, so any non-Latin glyphs will be lost in the process. It is also not as fast as it could be, coming from the pre-GM:S days and all.

I had at one point implemented URL encoding for sfgml, and have now made a cleaner-looking, single-script version of the function. This post is about that.

Idea

We are going to use the browser-standard encodeURIComponent as the implementation reference:

  • As per the page, encoding works on a per-UTF-8-byte basis.
    Perhaps the most reliable way to process the UTF-8 bytes of a string in GameMaker is to write the string into a buffer and then pull the bytes out of it, so we shall do that;
  • As per the page and spec (section 2.2), the only characters that should not be escaped are

    A-Z a-z 0-9 - _ . ! ~ * ' ( )

    to avoid having a particularly awful-looking if-statement, we shall make a 256-item lookup array indicating which bytes are allowed to go unescaped;

  • Since there is a fixed number of possible bytes to encode, we can also pre-generate an array of hex char pairs, one per byte (" " becomes "%20", so we need ord("2") | (ord("0") << 8); written as a 16-bit value, low byte first, this puts the two bytes "2" and "0" into the output string);
  • To keep processing time largely linear, we'll use a buffer for building the output string.
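To make the first point concrete, here is a minimal sketch (not part of the final script) of pulling the UTF-8 bytes of a string out of a buffer. The variable names are made up for illustration, and the temporary buffer is created and deleted on the spot rather than reused.

```gml
// Sketch: list the UTF-8 bytes of a string by writing it into a buffer.
var l_str = "hi і"; // the last glyph is Cyrillic "і" (U+0456), two bytes in UTF-8
var l_buf = buffer_create(16, buffer_grow, 1);
buffer_write(l_buf, buffer_text, l_str);  // buffer_text writes no trailing zero byte
var l_len = buffer_tell(l_buf);           // number of bytes, not number of characters
buffer_seek(l_buf, buffer_seek_start, 0); // rewind before reading
var l_out = "";
repeat (l_len) {
    l_out += string(buffer_read(l_buf, buffer_u8)) + " ";
}
buffer_delete(l_buf);
show_debug_message(l_out); // 104 105 32 209 150
```

The string has four characters but five bytes, which is why the encoder has to loop over the byte count from buffer_tell rather than over string_length.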

This boils down to the following:

  • Create the buffers and populate the lookup arrays on first run;
  • For each UTF-8 byte of the input string (written to a buffer), append it to the output buffer either directly (if allowed) or as a "%" followed by two hex digits from the lookup array;
  • Finally, read back and return the string from the output buffer.
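The "%" plus two hex digits step can be tried in isolation. This is a sketch for a single byte only; it picks digits from a string with string_ord_at instead of the if/else used in the full script, and it assumes (as the full script also does) that a buffer_u16 write stores the low byte first.

```gml
// Sketch: pack the two hex digits of one byte into a 16-bit value.
var l_byte = ord(" "); // 32, or $20
var l_digits = "0123456789ABCDEF";
var l_hi = string_ord_at(l_digits, (l_byte >> 4) + 1); // ord("2"), upper nibble
var l_lo = string_ord_at(l_digits, (l_byte & $F) + 1); // ord("0"), lower nibble
var l_pair = l_hi | (l_lo << 8); // first digit in the low byte, second in the high byte
var l_buf = buffer_create(4, buffer_fixed, 1);
buffer_write(l_buf, buffer_u8, ord("%"));
buffer_write(l_buf, buffer_u16, l_pair); // low byte is written first: "2", then "0"
buffer_write(l_buf, buffer_u8, 0);       // terminator for buffer_string
buffer_seek(l_buf, buffer_seek_start, 0);
show_debug_message(buffer_read(l_buf, buffer_string)); // %20
buffer_delete(l_buf);
```

Doing this once per possible byte value and storing the results in an array is what turns the per-byte work in the main loop into a single array read and a single 16-bit write.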

Implementation

The aforementioned concept translates to code pretty well; constructing the lookup arrays is by far the most verbose part of it.

/// url_encode(url_string)
/// @param url_string
gml_pragma("global", "global._url_encode_ready = false;");
var l_inbuf, l_outbuf, l_allowed, l_hex, l_ind;
if (global._url_encode_ready) {
    l_inbuf = global._url_encode_in;
    l_outbuf = global._url_encode_out;
    l_allowed = global._url_encode_allowed;
    l_hex = global._url_encode_hex;
} else { // first-time setup
    global._url_encode_ready = true;
    l_inbuf = buffer_create(1024, buffer_grow, 1);
    global._url_encode_in = l_inbuf;
    l_outbuf = buffer_create(1024, buffer_grow, 1);
    global._url_encode_out = l_outbuf;
    // establish which characters we do NOT need to encode:
    l_allowed = array_create(256);
    for (l_ind = ord("A"); l_ind <= ord("Z"); l_ind++) l_allowed[l_ind] = true;
    for (l_ind = ord("a"); l_ind <= ord("z"); l_ind++) l_allowed[l_ind] = true;
    for (l_ind = ord("0"); l_ind <= ord("9"); l_ind++) l_allowed[l_ind] = true;
    l_allowed[ord("-")] = true;
    l_allowed[ord("_")] = true;
    l_allowed[ord(".")] = true;
    l_allowed[ord("!")] = true;
    l_allowed[ord("~")] = true;
    l_allowed[ord("*")] = true;
    l_allowed[ord("'")] = true;
    l_allowed[ord("(")] = true;
    l_allowed[ord(")")] = true;
    global._url_encode_allowed = l_allowed;
    // pre-generate two-byte hex char sequences:
    l_hex = array_create(256);
    for (l_ind = 0; l_ind < 256; l_ind++) {
        // first char (upper nibble) goes into the low byte, which is written first:
        var l_hv, l_hd = l_ind >> 4;
        if (l_hd >= 10) {
            l_hv = ord("A") + l_hd - 10;
        } else l_hv = ord("0") + l_hd;
        // second char (lower nibble) goes into the high byte:
        l_hd = l_ind & $F;
        if (l_hd >= 10) {
            l_hv |= (ord("A") + l_hd - 10) << 8;
        } else l_hv |= (ord("0") + l_hd) << 8;
        l_hex[l_ind] = l_hv;
    }
    global._url_encode_hex = l_hex;
}
// write down and measure the input string:
buffer_seek(l_inbuf, buffer_seek_start, 0);
buffer_write(l_inbuf, buffer_text, argument0);
var l_len = buffer_tell(l_inbuf);
// read bytes one-by-one, deciding for each:
buffer_seek(l_inbuf, buffer_seek_start, 0);
buffer_seek(l_outbuf, buffer_seek_start, 0);
repeat (l_len) {
    var l_byte = buffer_read(l_inbuf, buffer_u8);
    if (l_allowed[l_byte]) {
        buffer_write(l_outbuf, buffer_u8, l_byte);
    } else { // if it needs to be encoded, write %<two hex digits>
        buffer_write(l_outbuf, buffer_u8, ord("%"));
        buffer_write(l_outbuf, buffer_u16, l_hex[l_byte]);
    }
}
// finally, rewind and read the string:
buffer_write(l_outbuf, buffer_u8, 0);
buffer_seek(l_outbuf, buffer_seek_start, 0);
return buffer_read(l_outbuf, buffer_string);

How to use

Add the shown script as url_encode and pass parameters that you want escaped through it, like so:

var url = "some?val=" + url_encode("hi hello привіт");
show_debug_message(url); // "some?val=hi%20hello%20%D0%BF%D1%80%D0%B8%D0%B2%D1%96%D1%82"

As can be seen, spaces are escaped, and so is non-Latin text, as per the spec.
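Escaping matters most when a value contains characters that mean something inside a URL. The following sketch assumes the url_encode script above has been added to the project; the address and parameter names are placeholders, not a real endpoint.

```gml
// Sketch: build a query string from several values, escaping each one separately.
var l_name = "Tom & Jerry"; // e.g. something the player typed in
var l_url = "https://example.com/search"
    + "?q=" + url_encode(l_name)
    + "&lang=" + url_encode("uk");
show_debug_message(l_url);
// https://example.com/search?q=Tom%20%26%20Jerry&lang=uk
var l_request = http_get(l_url); // the response arrives in the Async - HTTP event
```

Without the escaping, the "&" inside the name would be read as the start of another parameter, and the server would see q as just "Tom ". Note that only the values are passed through url_encode; running the whole URL through it would also escape the "?", "=" and "&" that are supposed to separate the parameters.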

Have fun!
