Thursday, February 16, 2017

Filemaker Xpath

While communicating with Apple GSX, I constantly receive XML from it. Filemaker does not have functions that handle XML held in variables. Normally, I would use the BaseElements plugin function BE_XPath to extract nodes. However, BE_XPath treats nodes differently depending on how many there are. If there is only one node with a given name, it can easily be extracted with "//mynode". If there are multiple nodes with the same name, that becomes an issue: you must use an array-style path such as "//mynode[x]" to get the correct value.

If the XML format is always the same, there is no problem. However, if the XML returned can contain one or more nodes of the same name, it is an issue. I ended up having to check whether there are multiple nodes of the same name by parsing "//mynode[x]" first to see whether it has a value, then branching to the appropriate routine to extract the single or multiple nodes. It is a hassle. Moreover, the script needs the BaseElements plugin, and Filemaker Go cannot use plugins.

I looked for scripts available on the internet. There are some, but their code is a bit too complicated to comprehend. Perhaps for the sake of efficiency, they try to use as few script steps as possible to achieve results. Many resort to the "Evaluate" function. I try to avoid "Evaluate" as far as possible, so I did not want to use them.

Fortunately, two of the scripts use the same functionality provided by Andy Knasinsky (http://www.briandunning.com/cf/1). It occurred to me that I could use the same method to extract nodes. I don't have Filemaker Pro Advanced, so I cannot copy the custom function. Therefore, I started writing my own script.

To start off, I use the same convention for defining the Xpath, which looks like "//mynode[x]": an array-like node name that starts with "//". I then have to detect whether a "[x]" is defined. Filemaker has a function called "Position". The syntax is as follows:

Position ( text ; searchString ; start ; occurrence )

If no "[x]" is defined, I set "occurrence" to one; otherwise I use the value of "x" as the "occurrence". Since "Position" always requires a "start" and an "occurrence", the x value conveniently selects which instance of the node name to find. The positions of both the start tag and the end tag can be found this way.
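To make the role of "start" and "occurrence" concrete, here is a minimal Python sketch of the same idea. It is only an approximation for illustration: `position` and `parse_step` are hypothetical names, `position` imitates only forward searches of Filemaker's Position (1-based result, 0 when nothing is found), and the path step is assumed to be either "mynode" or "mynode[x]".

```python
def position(text, search, start=1, occurrence=1):
    """Rough stand-in for Position: 1-based index of the occurrence-th
    match found at or after `start`, or 0 when there is none.
    Assumption: only forward searches (occurrence >= 1) are handled."""
    index = start - 1  # convert the 1-based start to Python's 0-based index
    for _ in range(occurrence):
        index = text.find(search, index)
        if index == -1:
            return 0
        index += 1  # move past the match start; also yields the 1-based result
    return index


def parse_step(step):
    """Split 'mynode[2]' into ('mynode', 2); a bare 'mynode' means occurrence 1."""
    if step.endswith("]") and "[" in step:
        name, _, index = step[:-1].partition("[")
        return name, int(index)
    return step, 1


name, occurrence = parse_step("mynode[2]")
xml = "<mynode>1</mynode><mynode>2</mynode>"
start_tag_at = position(xml, "<" + name + ">", 1, occurrence)
end_tag_at = position(xml, "</" + name + ">", 1, occurrence)
```

Here `start_tag_at` and `end_tag_at` are the positions of the second start tag and the second end tag, which is all that is needed to cut out the value in between.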

With the "Middle" function, I can then work out the exact position of the node value and extract it.

The following is the code. I can't paste it as text because FMP cannot copy a script to text, so I use a screen capture instead. The script name is "GetNode".
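For readers who prefer text, the logic described above can be sketched in Python. This is only an illustration of the approach, not the GetNode script itself: the names `nth_index` and `get_node` are made up for this example, and it assumes plain tags without attributes.

```python
def nth_index(text, search, occurrence):
    """0-based index of the occurrence-th match of `search`, or -1 if absent."""
    index = -1
    for _ in range(occurrence):
        index = text.find(search, index + 1)
        if index == -1:
            return -1
    return index


def get_node(xml, xpath):
    """Return the text between the n-th <name> and the n-th </name>.
    Assumption: tags carry no attributes; "" is returned when nothing matches."""
    step = xpath.lstrip("/")
    name, occurrence = step, 1
    if step.endswith("]") and "[" in step:
        name, _, index = step[:-1].partition("[")
        occurrence = int(index)

    open_tag = "<" + name + ">"
    close_tag = "</" + name + ">"
    start = nth_index(xml, open_tag, occurrence)
    end = nth_index(xml, close_tag, occurrence)
    if start == -1 or end == -1 or end < start:
        return ""
    # Equivalent of Middle: take everything after the start tag up to the end tag.
    return xml[start + len(open_tag):end]


sample = "<root><mynode>first</mynode><mynode>second</mynode></root>"
single = get_node(sample, "//mynode")
indexed = get_node(sample, "//mynode[2]")
```

Because the n-th start tag is simply paired with the n-th end tag, a sketch like this does not cope with a node nested inside another node of the same name.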

Well, life is not that simple. There will be instances where the same node name appears inside another node. Therefore, it is necessary to define the specific path to the node, hence the term "Xpath".

It turned out that the above script makes this easy. Since it is just text manipulation, it can also retrieve child nodes. The Xpath syntax is like "//node1/node2". It can have more than two levels, and each level can use the array-like syntax.

I just break the path up into nodes and loop through them in sequence. For example, first I take "node1" and retrieve its node value (including child nodes). Then I look for "node2" within the node1 result. If there is a third child node, I take the node2 result and look for that node's value in it. In this way, I can get any value from the XML. Below is the script. I named it "GetPath".
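The loop can be pictured with a short Python sketch. Again, this is an illustration under assumptions rather than the GetPath script: `get_node` and `get_path` are hypothetical names, tags are assumed to have no attributes, and the "Remove Subnode" handling is left out.

```python
def get_node(xml, step):
    """Return the text between the n-th <name> and n-th </name>, or ""."""
    name, occurrence = step, 1
    if step.endswith("]") and "[" in step:
        name, _, index = step[:-1].partition("[")
        occurrence = int(index)
    open_tag = "<" + name + ">"
    close_tag = "</" + name + ">"
    start = end = -1
    for _ in range(occurrence):
        start = xml.find(open_tag, start + 1)
        end = xml.find(close_tag, end + 1)
        if start == -1 or end == -1:
            return ""
    if end < start:
        return ""
    return xml[start + len(open_tag):end]


def get_path(xml, xpath):
    """Walk '//node1/node2[x]/...' one step at a time, narrowing the text each time."""
    value = xml
    for step in xpath.lstrip("/").split("/"):
        value = get_node(value, step)
        if value == "":
            break  # nothing found at this level, so deeper steps cannot match
    return value


order = "<order><id>7</id><item><id>A</id></item><item><id>B</id></item></order>"
second_item_id = get_path(order, "//item[2]/id")
```

Each pass searches only inside the previous result, which is why "//item[2]/id" finds the id of the second item rather than the first id in the whole document.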



You can just use "Perform Script" to call the "GetPath" script. The script parameter is your XML and the Xpath, separated by a Filemaker carriage return symbol (¶). You then read the script result with Get ( ScriptResult ) to get the node value.
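As a small illustration of that parameter convention, here is a Python sketch of packing and unpacking the two values. The names `pack_parameter` and `unpack_parameter` are hypothetical, and the sketch assumes the XML itself contains no carriage return, otherwise the split would land in the wrong place.

```python
CR = "\r"  # the carriage return that Filemaker shows as the paragraph symbol


def pack_parameter(xml, xpath):
    """Join the XML and the path into one parameter string."""
    if CR in xml:
        raise ValueError("XML must not contain a carriage return")
    return xml + CR + xpath


def unpack_parameter(parameter):
    """Split the parameter back into (xml, xpath) at the first carriage return."""
    xml, _, xpath = parameter.partition(CR)
    return xml, xpath


parameter = pack_parameter("<a><b>1</b></a>", "//a/b")
xml, xpath = unpack_parameter(parameter)
```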

The above script calls another script named "Remove Subnode". It exists because some XML has subnodes with the same node name as the root node. If such a subnode appears above the root node of the same name, the subnode value gets extracted instead. It is therefore important to remove these subnodes, as the path does not point to them.


After writing that script, I felt it would still be very troublesome to pull a list of values from the XML, so I wrote a list script just for this. The parameters supplied are the XML, the specific path, and the list of nodes that I want to get. An interesting point is that sometimes one or more entries in the list need to be subnodes. Therefore, I add an extra "[]" to the node name to indicate that it is a subnode, and the script then pulls the subnode XML as the value.
Ignore the Base64 comment; I thought the list might be broken up by the delimiter appearing in the data. It turned out that Base64 added more trouble. Then I realized that I must compact the XML anyway by removing all CR, LF, TAB, and extra spaces. This means the CR delimiter never appears in the data itself, so I abandoned the idea.
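The compacting step and the CR-delimited result can be sketched in Python as follows. This is a simplified illustration, not the list script: `compact_xml`, `first_node` and `get_list` are hypothetical names, the path is reduced to a single parent node name, tags are assumed to have no attributes, and the "[]" subnode marker is not handled.

```python
import re

CR = "\r"  # delimiter between the returned values


def compact_xml(xml):
    """Remove CR, LF and TAB, drop whitespace between tags, collapse repeated spaces."""
    flat = re.sub(r"[\r\n\t]", "", xml)
    flat = re.sub(r">\s+<", "><", flat)
    return re.sub(r" {2,}", " ", flat).strip()


def first_node(xml, name):
    """Text between the first <name> and the first </name>, or "" if absent."""
    open_tag = "<" + name + ">"
    close_tag = "</" + name + ">"
    start = xml.find(open_tag)
    end = xml.find(close_tag)
    if start == -1 or end < start:
        return ""
    return xml[start + len(open_tag):end]


def get_list(xml, parent, names):
    """Return the values of `names` found inside `parent`, joined by CR."""
    scope = first_node(compact_xml(xml), parent)
    return CR.join(first_node(scope, name) for name in names)


raw = "<item>\r\n\t<code>A1</code>\r\n\t<name>Logic  Board</name>\r\n</item>"
values = get_list(raw, "item", ["code", "name"])
```

Since the compacted XML can no longer contain a CR, splitting `values` on CR always gives back exactly one entry per requested node.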



I made no effort to combine the script steps, so that the scripts are easy for any novice to read. In any case, they should not be used for XML that is tens of thousands of lines long. Try to "compress" the scripts if you need to extract a large amount of data from XML. Showing how to do complex calculations in a single script line is outside my scope here.

By the way, the XML has to be well formed and error free.

