Parsing JSON data in MultiValue - BASIC
What is JSON?
JSON is a simple, text-based way to store and transmit structured data. By using a simple syntax, you can easily store anything from a single number through to strings, arrays, and objects using nothing but a string of delimited text. You can also nest arrays and objects, allowing you to create complex data structures.
JSON is much like XML, but is much more compact. While originally designed to interact with web pages and JavaScript, JSON has become the standard way of exchanging information between hybrid systems and web services.
Why would I use JSON?
If you are looking at integrating existing 3 rd party web services into your MultiValue applications, then you will likely be using JSON to exchange the data.
For example, if you need to GEO code your information, or list your information within a mapping service like Google, then JSON will be the format they will require and provide back to you.
Format of JSON
Here are the basic rules:
- JSON string is grouped together with a "Name" and "Value". The value may be another string, number, Boolean, null, or object.
- A JSON Name/Value pair consisted of a Field Name (think MultiValue dictionary) and a Value separated by a colon. I.e.: "name":"value"
This is the same as using the XML of:
<Name>Value</Name> - A JSON Object is a list of Name/Value pairs, separated by a comma, and surrounded by curly brackets. I.e.: {"name1":"value1","name2":"value2"}
This is the same as using the XML of:
<data>
<name1>value1</name1>
<name2>value2</name2>
</data>
This looks like the following in JavaScript:
data.name1 = "value1"
data.name2 = "value2"
- A JSON Array allows you to specific a group of objects, or name/Value pairs in what would be an "Indexed" Array i.e.: [{"id":"1","Name":"Bob"},{"id":"2","Name":"Joe"}]
This would be the same as doing the following in XML:
<data>
<record>
<id>1</id>
<name>Bob</name>
</record>
<record>
<id>2</id>
<name>Joe</name>
</record>
</data>
This looks like the following in JavaScript:
data[0].id = "1"
data[0].name = "Bob"
data[1].id = "2"
data[1].name = "Joe"
- JSON Value consists of a few possibilities:
Number - Integer, or decimal
String - data surrounded by double quotes
Boolean - the value is either "True" or "False"
Null - the value "null" would be the same as not assigned or ""
Object - anything surrounded in {}
Array - Anything surrounded in []
Here is an example in both XML and JSON:
JSON Example:
{ "orderID": 12345,
"shopperName": "John Smith",
"shopperEmail": "johnsmith@example.com",
"contents": [
{
"productID": 34,
"productName": "SuperWidget",
"quantity": 1
},
{
"productID": 56,
"productName": "WonderWidget",
"quantity": 3
}
],
"orderCompleted": true}
XML Example:
<order> <orderID>1234</orderID> <shopperName>John Smith</shopperName> < shopperEmail >John Smith</ shopperEmail > <contents> <product> <productID>34</productID> <productName>SuperWidget</productName> <quantity>1</quantity> </product> <product> <productID>56</productID> <productName>WonderWidget</productName> <quantity>3</quantity> </product> </contents> <orderCompleted>true</orderCompleted> </order>
Parsing JSON
MultiValue BASIC handles data strings really well, and parsing a JSON string is actually easier than you think. Even with the nested components.
The first think to check for though is if the JSON sting is an Object or Array. You do this by checking the 1 st character for an "{" or "[":
ST.POS = 1
BEGIN CASE
CASE JSON.ITEM[ST.POS,1] = "{"
GOSUB PARSE.OBJECT
CASE JSON.ITEM[ST.POS,1] = "["
GOSUB PARSE.ARRAY
DATA.ITEM = AM: VALUE
CASE 1 ; JSON.ERROR = ST.POS
END CASE
Once you know how are you parsing, you need to find the end mark. Since you may have nested objects and/or arrays, this requires a little more intelligence than a FIELD() statement. Let's take a look at how to parse a Object.
Finding the Property names
Since we know that a object will start with an "{" and end with an "}", and between the brackets will be name and value pairs, then we can build a simple dynamic array that contains the Names and the Pairs. Start by find the end of the "Property name", which will be a colon (:).
PARSE.KEY:
POS = INDEX(JSON.ITEM[ST.POS,FN.POS],":",1)
KEY.NAME = JSON.ITEM[ST.POS,POS-1]
*
ST.POS = ST.POS + POS
*
IF KEY.NAME[1,1] = '"' OR KEY.NAME[1,1] = "'" THEN
QUOTE = KEY.NAME[1,1]
LAST.CHAR = KEY.NAME[LEN(KEY.NAME),1]
IF KEY.NAME = LAST.CHAR THEN JSON.ERROR = ST.POS
*
VALUE = KEY.NAME[2,LEN(KEY.NAME)-2]
GOSUB PROCESS.ESCAPE
KEY.NAME = VALUE
END
RETURN
Once we have the Key, or property name, for the value, we need to store the next start position of the JSON string. This helps up speed up the processing of the JSON data.
If you are not quite following what is going with the ST.POS variable, here is another way to look at it:
ST.POS = 1
{ "orderID": 12345,"shopperName": "John Smith","shopperEmail": "johnsmith@example.com"}
^
POS = INDEX(JSON.STRING,":",1)
ST.POS = ST.POS + POS
{ "orderID": 12345,"shopperName": "John Smith","shopperEmail": "johnsmith@example.com"}
-----------^
There are other ways to optimize this process, but I'll leave that up to you to do.
Handle Escaped Strings
If the supplier of the JSON object was falling the specs correctly, then the key should always be quoted. Quoted strings can also include escaped characters. For example, additional quotes, or other non ASCI characters. These are all proceeded by a backsplash (\). Once the quotes are removed, then you must look to see if there are any escape values.
PROCESS.ESCAPE:
*** Process the escape values in a string
*
DELIMITS = OCONV(VALUE,"MC/N" :VM: "MC/A")
IF INDEX(DELIMITS,"\",1) THEN
C.COUNT = 1
LOOP
C.POS = INDEX(VALUE,"\",C.COUNT)
UNTIL C.POS = 0 DO
NEXT.CHAR = VALUE[C.POS+1,1]
BEGIN CASE
CASE NEXT.CHAR = "\" OR NEXT.CHAR = "/" OR NEXT.CHAR = "'" OR NEXT.CHAR = '"'
VALUE = VALUE[1,C.POS] :NEXT.CHAR: VALUE[C.POS+2,LEN(VALUE)]
C.COUNT = C.COUNT + 1
CASE NEXT.CHAR = "f"
VALUE = VALUE[1,C.POS] :CHAR(12): VALUE[C.POS+2,LEN(VALUE)]
CASE NEXT.CHAR = "n"
VALUE = VALUE[1,C.POS] :CHAR(10): VALUE[C.POS+2,LEN(VALUE)]
CASE NEXT.CHAR = "r"
VALUE = VALUE[1,C.POS] :CHAR(13): VALUE[C.POS+2,LEN(VALUE)]
CASE NEXT.CHAR = "t"
VALUE = VALUE[1,C.POS] :CHAR(9): VALUE[C.POS+2,LEN(VALUE)]
CASE NEXT.CHAR = "u"
HEX.VALUE = VALUE[C.POS+1,4]
VALUE = VALUE[1,C.POS] :OCONV(HEX.VALUE,"MY"): VALUE[C.POS+6,LEN(VALUE)]
END CASE
REPEAT
END
RETURN
This code is pretty straight forward, but may look complex. It really isn't that complex. The main power of this routine is the OCONV() statement. You will see that it is using "MC/N" and "MC/A". These conversion codes will remove all alpha letters (a-z), and remove all numbers (0-9), and leave everything else (quotes, spaces, punctuation, etc.).
For Example:
Before: 'Hello, world \"we are quoted \"' After: ', \" \"'
the next step is to see if we have any backslashes in the converted string. If so, use the INDEX() statement to find them in the string, and replace the escaped value with the correct value.
Before: 'Hello, world \"we are quoted\"' After: 'Hello, world "we are quoted"'
There are special escape codes that mean different things. For example, if you have a \u, then the next 4 digits after the "\u" is a hex value that needs to be converted. "\n" and "\r" are Carriage Return and Linefeed values, "\t" is a tab value, and "\f" is a formfeed.
Finding the Value
Once we have the Property Name, then we need to extract the value that is assigned to that property. Since the value is not always a string, we need to test to see what type of value it is to decide how to parse it:
PARSE.VALUE:
NEXT.CHAR = JSON.ITEM[ST.POS,1]
BEGIN CASE
CASE NEXT.CHAR = "{" ;* Parse Object
ST.DELIMIT = "{" ; FN.DELIMIT = "}"
GOSUB PARSE.DELIMIT
VALUE = "{": VALUE :"}"
CASE NEXT.CHAR = "[" ;* Parse Array
GOSUB PARSE.ARRAY
CASE NEXT.CHAR = "T" ;* Value: True
GOSUB PARSE.TRUE
CASE NEXT.CHAR = "F" ;* Value: False
GOSUB PARSE.FALSE
CASE NEXT.CHAR = "N" ;* Value: Null
GOSUB PARSE.NULL
CASE NEXT.CHAR = "'" OR NEXT.CHAR = \"\
GOSUB PARSE.STRING
CASE 1
GOSUB PARSE.NUMBER
END CASE
RETURN
Depending on the value, looking for the end can be tricky. There are a few obvious and easy ones: True/False/Null
PARSE.NULL:
POS = 4
VALUE = "" ;* JSON.ITEM[ST.POS,POS]
ST.POS = ST.POS + POS + 1
RETURN
PARSE.TRUE:
POS = 4
VALUE = JSON.ITEM[ST.POS,POS]
ST.POS = ST.POS + POS + 1
RETURN
PARSE.FALSE:
POS = 5
VALUE = JSON.ITEM[ST.POS,POS]
ST.POS = ST.POS + POS + 1
RETURN
Extract a number is pretty easy as well, but requires a little bit more processing to find the end of the number. Number many only have +, -, ., and 0-9 Any other character terminates the numbers:
PARSE.NUMBER:
D = 1
LOOP
DELIMIT = OCONV(JSON.ITEM[ST.POS,FN.POS],"MC/N":VM:"MC/A")[D,1]
UNTIL INDEX(".,+-",DELIMIT,1) DO
D = D + 1
REPEAT
POS = INDEX(JSON.ITEM[ST.POS,FN.POS],DELIMIT,1)
*
VALUE = JSON.ITEM[ST.POS,POS-1]
ST.POS = ST.POS + POS
RETURN
Much like the escaped string, we are using the MC/N and MC/A conversion codes to get the next non-number, and looking for that character to find the end of the number.
Parse a string value again requires a little bit of processing, mainly because you have to watch for escaped quotes. You start by figuring out the type of quote the provider was using; double or single. Then find the next similar quote that is not escaped.
PARSE.STRING:
VALUE = ""
*
QUOTE = JSON.ITEM[ST.POS,1]
ST.POS = ST.POS + 1
*
*** Find the next quote, but make sure we check for escaped
*** quotes.
*
DELIMIT.STOP = 0
OCCUR = 1
LOOP
QUOTE.POS = INDEX(JSON.ITEM[ST.POS,FN.POS],QUOTE,OCCUR)
BEGIN CASE
CASE QUOTE.POS = 0
QUOTE.POS = FN.POS + 1
CASE JSON.ITEM[ST.POS + QUOTE.POS - 1,1] = "\"
OCCUR = OCCUR + 1
CASE 1
DELIMIT.STOP = 1
END CASE
UNTIL DELIMIT.STOP DO
REPEAT
*
*** Get the value, and translate the escaped information
*
VALUE = JSON.ITEM[ST.POS,QUOTE.POS-1]
GOSUB PROCESS.ESCAPE
ST.POS = ST.POS + QUOTE.POS
RETURN
Now, let's put this all together to create the PARSE.OBJECT subroutine.
PARSE.OBJECT:
ST.POS = ST.POS + 1
ERROR.CHECK.POS = 0
*
LOOP
UNTIL JSON.ITEM[ST.POS,1] = "}" OR ST.POS >= FN.POS OR JSON.ERROR > 0 DO
IF ST.POS = ERROR.CHECK.POS THEN
*** We have not moved any further. Parsing error
*
JSON.ERROR = ST.POS
END
ERROR.CHECK.POS = ST.POS
*
*** Parse the information
*
GOSUB PARSE.KEY
GOSUB PARSE.VALUE
*
*** Add the Data
*
LOCATE KEY.NAME IN DATA.ITEM<1> SETTING KEY.POS THEN
DATA.ITEM<2,-1> = VALUE
END ELSE
DATA.ITEM<1,-1> = KEY.NAME
DATA.ITEM<2,-1> = VALUE
END
*
*** Look for comma, since this is another value
*
TEST.POS = INDEX(JSON.ITEM[ST.POS,FN.POS],"}",1)
IF TEST.POS = 0 THEN TEST.POS = FN.POS
*
NEXT.CHAR = JSON.ITEM[ST.POS,1]
BEGIN CASE
CASE NEXT.CHAR = ","
ST.POS = ST.POS + 1
CASE NEXT.CHAR = "}"
* do nothing
CASE 1
POS = INDEX(JSON.ITEM[ST.POS,TEST.POS],",",1)
IF POS > 0 THEN
*** Place Pointer just after the ','
*
ST.POS = ST.POS + POS
END ELSE
*** Place the pointer on the "}"
*
ST.POS = ST.POS + TEST.POS - 1
END
END CASE
REPEAT
ST.POS = ST.POS + 1
RETURN
There is no real gotcha in this Object parsing code. The end result of the this code and the attached programs will generate a dynamic array contain JSON property Name and value:
DATA.ITEM<1,1> = "orderID" DATA.ITEM<2,1> = "1234" * DATA.ITEM<1,2> = "shopperName" DATA.ITEM<2,2> = "John Smith"

