Saturday, January 16, 2016

Azure Table Key Encoding

I was getting BadRequest errors inserting rows into an Azure Table. I eventually noticed that some of the PartitionKey strings had \ (backslash) inside them. I then found various online arguments about which characters were forbidden and various silly workarounds, like base64 encoding all the bytes in the key string (which works of course, but is a completely ham-fisted approach).

The MSDN documentation clearly states which characters are forbidden in the key strings, but it doesn't mention % (percent) which some people have reported as troublesome. I have added % to the forbidden list.
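As a rough sketch of what that forbidden list looks like in code, the check below flags any key containing one of the documented forbidden characters (forward slash, backslash, #, ? and the control ranges U+0000 to U+001F and U+007F to U+009F), plus the % that I added as a precaution. The KeyCheck class and IsSafeKey method are names made up for this illustration, not part of any Azure library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of any Azure SDK.
public static class KeyCheck
{
    // Documented forbidden characters: / \ # ? and the control characters
    // U+0000-U+001F and U+007F-U+009F. The % is a precautionary addition.
    private static readonly Regex Forbidden =
        new Regex(@"[\x00-\x1f\x7f-\x9f\\/#?%]");

    // True when the key contains none of the forbidden characters.
    public static bool IsSafeKey(string key)
    {
        return !Forbidden.IsMatch(key);
    }
}
```

With this, KeyCheck.IsSafeKey("RCS\\Demo") is false because of the backslash, while a key made only of letters and digits passes.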

My preferred way of encoding and decoding arbitrary strings as row keys is to use Regex.Replace, which produces safe encoded strings in which each forbidden character is replaced by a +xx hex escape sequence. The +xx form has no special meaning; I just made it up because it looks vaguely readable, and you can invent your own preferred escaping system. The resulting encoded strings are still reasonably readable. The + escape prefix character itself must also be treated as forbidden and encoded, otherwise a literal + in the source could not be told apart from an escape when decoding.

To encode:

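// Requires: using System.Text.RegularExpressions;
// Replace control characters, \ / # ? % and the + prefix itself with +XX (uppercase hex).
string encodedKey = Regex.Replace(sourceKey,
  @"[\x00-\x1f\x7f-\x9f\\/#?%+]",
  m => $"+{(int)m.Value[0]:X2}");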

To decode:

// Turn each +XX escape back into the character with that hex code.
string decodedKey = Regex.Replace(encodedKey,
  @"\+[0-9A-F]{2}",
  m => ((char)Convert.ToByte(m.Value.Substring(1), 16)).ToString());

The string that originally caused my BadRequest error encodes like this:

Plain:   RCS\Demo
Encoded: RCS+5CDemo

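Running a worst case string through the code produces the following (the plain string is written as a C# string literal, so \\ is a single backslash and \t, \v and \x95 are escape sequences):

Plain:   Back\\Slash \t\vSlash/Hash#Q?Pct%Dot\x95Plus+ΑΒΓΔ
Encoded: Back+5CSlash +09+0BSlash+2FHash+23Q+3FPct+25Dot+95Plus+2BΑΒΓΔ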
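To check that the two replacements really are inverses of each other, they can be wrapped in a small helper and round-tripped. This is only a sketch: the KeyCodec class and its method names are invented for this post, and it uses the same character class and +XX scheme as the snippets above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the two Regex.Replace calls; names are illustrative.
public static class KeyCodec
{
    public static string Encode(string key)
    {
        // Forbidden characters plus the + prefix become +XX (uppercase hex).
        return Regex.Replace(key, @"[\x00-\x1f\x7f-\x9f\\/#?%+]",
            m => "+" + ((int)m.Value[0]).ToString("X2"));
    }

    public static string Decode(string encoded)
    {
        // Every + in an encoded key starts a two-digit hex escape.
        return Regex.Replace(encoded, @"\+[0-9A-F]{2}",
            m => ((char)Convert.ToInt32(m.Value.Substring(1), 16)).ToString());
    }
}
```

Calling KeyCodec.Encode("RCS\\Demo") gives RCS+5CDemo, and KeyCodec.Decode(KeyCodec.Encode(key)) returns the original key for the worst case string above. Decoding is unambiguous only because Encode escapes every literal +, so any + left in an encoded key must be the start of an escape.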

Some web articles suggest that you use the Uri and HttpUtility classes to perform the encoding, but unfortunately they do not have the correct behaviour. The documentation makes no comment about Unicode characters outside of the explicitly forbidden set, so I ran a quick experiment to roundtrip a row with a PartitionKey containing the Greek characters ΑΒΓΔ and it worked okay.
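As one illustration of the mismatch (this is my reading of why these classes are a poor fit, not something the documentation spells out), Uri.EscapeDataString escapes with %, the very character treated as troublesome above, and it also escapes spaces and non-ASCII letters that did not need escaping in my experiment. The UriEscapeDemo wrapper below is a made-up name used only to show the call.

```csharp
using System;

// Hypothetical wrapper, only here to show what Uri.EscapeDataString produces.
public static class UriEscapeDemo
{
    // Percent-encodes everything outside the URI unreserved set,
    // using the UTF-8 bytes of any non-ASCII character.
    public static string Escape(string key)
    {
        return Uri.EscapeDataString(key);
    }
}
```

For example, UriEscapeDemo.Escape("RCS\\Demo") returns RCS%5CDemo, and the Greek test string ΑΒΓΔ becomes %CE%91%CE%92%CE%93%CE%94, which is both unreadable and full of % characters.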
