"GPath is a path expression language integrated into Groovy which allows parts of nested structured data to be identified. In this sense, it has similar aims and scope as XPath does for XML. The two main places where you use GPath expressions is when dealing with nested POJOs or when dealing with XML"
So it's similar to XPath expressions, and you can use it not only with XML but also with POJO classes. OK, so let's begin.
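Before moving on to XML, here is a minimal sketch of what GPath looks like on plain objects. The `Author` and `Book` classes are hypothetical and exist only for this illustration; the point is that property access on a list is applied to every element.

```groovy
// Hypothetical classes, introduced only for this illustration.
class Author { String name }
class Book { String title; Author author }

def books = [
    new Book(title: 'Catcher in the Rye', author: new Author(name: 'JD Salinger')),
    new Book(title: 'Alice in Wonderland', author: new Author(name: 'Lewis Carroll'))
]

// books.author.name walks the object graph: the property lookup is applied
// to every element of the list, yielding a list of names.
def authorNames = books.author.name
assert authorNames == ['JD Salinger', 'Lewis Carroll']

// Closures filter the same way they do on XML nodes.
def carrollTitle = books.find { it.author.name == 'Lewis Carroll' }.title
assert carrollTitle == 'Alice in Wonderland'
```

The same a.b.c style of navigation is what the XML examples rely on.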
Given the following XML:
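    <response>
        <value>
            <books>
                <book id="1">
                    <title>Don Xijote</title>
                    <author id="1">Manuel De Cervantes</author>
                </book>
                <book id="2">
                    <title>Catcher in the Rye</title>
                    <author id="2">JD Salinger</author>
                </book>
                <book id="3">
                    <title>Alice in Wonderland</title>
                    <author id="3">Lewis Carroll</author>
                </book>
                <book id="4">
                    <title>Don Xijote</title>
                    <author id="4">Manuel De Cervantes</author>
                </book>
            </books>
        </value>
    </response>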
Node's text content
The first thing we are going to do is get a value using POJO notation. Let's get the first book's author's name (code is available on GitHub).
    def "Using POJO notation: Getting a node using POJOs notation a.b.c"() {
        setup: "Parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Trying to get a given node using the a.b.c notation"
            def authorNode = response.value.books.book[0].author
        then: "We can check the author's value"
            authorNode.text() == 'Manuel De Cervantes'
    }
So first we parse the document with XmlSlurper (xmlFile is a variable of type java.io.File). The value returned by parse() represents the root element of the XML document, which in this case is "response".
That's why we start traversing the document from response and then follow value.books.book[0].author. Note that in XPath node positions start at [1], but GPath, being Java-based, indexes from [0].
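To try the navigation outside of Spock, here is a minimal standalone sketch. It assumes Groovy 3 or later (where XmlSlurper lives in `groovy.xml`) and uses an inline document whose shape and `id` attributes are reconstructed from the specs in this post, so treat the XML as an assumption rather than the exact file from the repository.

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed sample document, reconstructed from the examples in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
      <book id="4"><title>Don Xijote</title><author id="4">Manuel De Cervantes</author></book>
    </books>
  </value>
</response>'''

// The object returned by the parser stands for the root element.
def response = new XmlSlurper().parseText(xml)
assert response.name() == 'response'

// a.b.c navigation with zero-based indexes.
def books = response.value.books.book
assert books.size() == 4
def firstAuthor = books[0].author.text()
def secondAuthor = books[1].author.text()
assert firstAuthor == 'Manuel De Cervantes'
assert secondAuthor == 'JD Salinger'
```

Note that `books[0]` is the first book, which XPath would address as `book[1]`.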
GPathResult (XmlSlurper) and Node (XmlParser)
When using GPath on XML parsed with XmlSlurper, the result is a GPathResult object. GPathResult has many convenient methods to convert the text inside a node to another type, such as:
- toInteger()
- toFloat()
- toBigInteger()
- ...
All these methods try to convert the node's text (a String) to the given type.
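A short sketch of those conversions follows. It assumes Groovy 3 or later and the same reconstructed sample document (the XML is an assumption made for illustration).

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed sample document, reconstructed from the examples in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
      <book id="4"><title>Don Xijote</title><author id="4">Manuel De Cervantes</author></book>
    </books>
  </value>
</response>'''

def response = new XmlSlurper().parseText(xml)
def thirdBookId = response.value.books.book[2].@id

// The raw value is text...
assert thirdBookId.text() == '3'

// ...and the conversion methods parse that text into the requested type.
assert thirdBookId.toInteger() == 3
assert thirdBookId.toBigInteger() == 3G
assert thirdBookId.toFloat() == 3.0f

// Once converted, the value takes part in ordinary arithmetic.
def nextId = thirdBookId.toInteger() + 1
assert nextId == 4
```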
If we were using XML parsed with XmlParser, we would be dealing with instances of Node instead. Still, the navigation used in these examples can be applied to a Node as well, because the creators of both parsers took GPath compatibility into account. One difference: on a Node, text() and attribute values are plain Strings, so conversions go through String methods such as toInteger().
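As a rough comparison, the same navigation with XmlParser might look like the sketch below. It assumes Groovy 3 or later (XmlParser in `groovy.xml`) and a shortened, reconstructed sample document.

```groovy
import groovy.xml.XmlParser // Groovy 3+; on Groovy 2.x drop this import

// Assumed, shortened sample document in the same shape as the one in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
    </books>
  </value>
</response>'''

def response = new XmlParser().parseText(xml)
def firstBook = response.value.books.book[0]

// XmlParser builds a tree of groovy.util.Node objects.
assert firstBook instanceof groovy.util.Node

// The a.b.c navigation and text() read the same as with XmlSlurper.
def firstAuthor = firstBook.author.text()
assert firstAuthor == 'Manuel De Cervantes'

// Attributes come back as plain Strings, so String.toInteger() does the conversion.
def firstBookId = firstBook.@id
assert firstBookId instanceof String
assert firstBookId.toInteger() == 1
assert firstBook.attribute('id') == '1'
```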
Attribute's content
The next step is to get a value from a node's attribute. In the following sample we want to get the first book's author's id. We'll use two different approaches. Let's see the code first:
    def "Using POJO notation: Getting an attribute's value using POJOs notation a.b.c"() {
        setup: "Parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Trying to get a given node using the a.b.c notation"
            def firstBook = response.value.books.book[0]
            def firstAuthorIdNode1 = firstBook.author.@id
            def firstAuthorIdNode2 = firstBook.author['@id']
        then: "Getting the id's value"
            firstAuthorIdNode1.toInteger() == 1
            firstAuthorIdNode2.toInteger() == 1
    }
Again we first parse the document, and then using POJO notation we get the first book node. Now take a look at the two expressions:
- firstBook.author.@id
- firstBook.author['@id']
I especially like the former notation because it is more straightforward and meaningful. The latter is more like using an instance of a map (which I guess it should be eventually).
Speeding things up: "breadthFirst()" and "depthFirst()"
If you have ever used XPath, you have probably used expressions like:
- "//" : Look everywhere
- "/following-sibling::othernode" : Look for a following node "othernode" at the same level
More or less we have their counterparts in GPath with the methods breadthFirst() and depthFirst(). There is also a shorter syntax: on a GPathResult, '*' returns the direct children of a node (the same as children()), and '**' is shorthand for depthFirst(). The first example shows a simple use of '*'.
    def "Using '*': Getting a node among the children with '*'"() {
        setup: "Parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Looking for the node having the name 'book'"
        and: "with attribute id equals to 2"
            /* '*' returns the direct children of a node, so you can use it
               to look among a group of nodes at the same level */
            def catcherInTheRye = response.value.books.'*'.find { node ->
                /* node.@id == 2 could be expressed as node['@id'] == 2 */
                node.name() == 'book' && node.@id == '2'
            }
        then: "Getting the title's value"
            catcherInTheRye.title.text() == 'Catcher in the Rye'
    }
This Spock specification iterates over the direct children of the "books" node and keeps the first one that satisfies the expression inside the closure. Note that '*' does not go any deeper into the tree; to search the whole subtree you would call breadthFirst() or depthFirst() explicitly.
That expression says "Look for any node with a tag name equal to 'book' and an id with a value of '2'".
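To make the scope of '*' concrete, here is a small sketch contrasting it with '**'. It assumes Groovy 3 or later and the reconstructed sample document (an assumption made for illustration).

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed sample document, reconstructed from the examples in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
      <book id="4"><title>Don Xijote</title><author id="4">Manuel De Cervantes</author></book>
    </books>
  </value>
</response>'''

def response = new XmlSlurper().parseText(xml)

// '*' on the root only yields its direct children: the single <value> element.
def childNamesOfRoot = response.'*'.collect { it.name() }
assert childNamesOfRoot == ['value']

// So looking for <book> among the root's children finds nothing...
def booksViaStar = response.'*'.findAll { it.name() == 'book' }
assert booksViaStar.size() == 0

// ...while '**' (depthFirst) walks the whole subtree and finds all four.
def booksViaDoubleStar = response.'**'.findAll { it.name() == 'book' }
assert booksViaDoubleStar.size() == 4

// From the <books> node, '*' and children() give the same four nodes.
def books = response.value.books
assert books.'*'.size() == books.children().size()
```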
Today I woke up feeling very lazy, and I'd like to look for a given value without caring where it might be. The only thing I know is that I need the id of the book written by "Lewis Carroll". How do I do that? Using depthFirst():
    def "Using '**': Getting a node using depthFirst operator '**'"() {
        setup: "parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Using the depthFirst operator we can look for something"
        and: "it doesn't matter how deep the node is"
        and: "Let's say we want to look for the book's id of the book written by Lewis Carroll"
            /* Beware of the name I used for the closure's parameter. It may look
               like the ** is too smart, but it isn't. It's just that I'm sure only
               books will match the query. To avoid any confusion I'd rather use 'node' */
            def bookId = response.'**'.find { book ->
                book.author.text() == 'Lewis Carroll'
            }.@id
        then: "The bookId should be 3"
            bookId == "3"
    }
It's definitely shorter than using the POJO notation, isn't it? depthFirst() is the same as looking for something "everywhere in the tree from this point down". In this case we've used the method find(Closure cl) to find just the first occurrence.
What if we want to collect all the books' titles?
    def "Using '**': Collecting all titles"() {
        setup: "parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Looking for all titles within the document"
            def titles = response.'**'.findAll { node -> node.name() == 'title' }*.text()
        then: "There should be only four"
            titles.size() == 4
    }
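Here is a standalone sketch of the same find and findAll calls, again assuming Groovy 3 or later and the reconstructed sample document (the XML is an assumption). It also shows that the results come back in document order, duplicates included.

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed sample document, reconstructed from the examples in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
      <book id="4"><title>Don Xijote</title><author id="4">Manuel De Cervantes</author></book>
    </books>
  </value>
</response>'''

def response = new XmlSlurper().parseText(xml)

// find stops at the first node for which the closure is true.
def carrollBookId = response.'**'.find { it.author.text() == 'Lewis Carroll' }.@id
assert carrollBookId.text() == '3'

// findAll keeps every match; *.text() then extracts the text of each node.
def titles = response.'**'.findAll { it.name() == 'title' }*.text()
assert titles == ['Don Xijote', 'Catcher in the Rye', 'Alice in Wonderland', 'Don Xijote']

// The sample contains the same title twice, so a set has one entry fewer.
def distinctTitles = titles.toSet()
assert distinctTitles.size() == 3
```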
I've mentioned there are some useful methods that convert a node's value to an integer, a float, etc. Those methods are convenient when doing comparisons like this:
    def "Using findAll: Collecting all titles"() {
        setup: "parsing the document"
            def response = new XmlSlurper().parse(xmlFile)
        when: "Looking for the titles of all books with an id greater than 2"
            def titles = response.value.books.book.findAll { book ->
                /* You can use toInteger() over the GPathResult object */
                book.@id.toInteger() > 2
            }*.title
        then: "There should be only two"
            titles.size() == 2
    }
In this case the number 2 has been hardcoded, but imagine that the value could have come from any other source (GORM ids, etc.).
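One way to avoid the hardcoded value is to wrap the query in a closure that takes the threshold as a parameter. The closure name `titlesWithIdAbove` is hypothetical, and the sketch assumes Groovy 3 or later plus the reconstructed sample document.

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed sample document, reconstructed from the examples in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="1"><title>Don Xijote</title><author id="1">Manuel De Cervantes</author></book>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
      <book id="4"><title>Don Xijote</title><author id="4">Manuel De Cervantes</author></book>
    </books>
  </value>
</response>'''

def response = new XmlSlurper().parseText(xml)

// Hypothetical helper: the threshold arrives as an argument instead of a literal.
def titlesWithIdAbove = { root, int minId ->
    root.value.books.book
        .findAll { it.@id.toInteger() > minId }
        .collect { it.title.text() }
}

def afterSecond = titlesWithIdAbove(response, 2)
assert afterSecond == ['Alice in Wonderland', 'Don Xijote']

def allTitles = titlesWithIdAbove(response, 0)
assert allTitles.size() == 4
```

The value passed as `minId` could just as well come from a database id or a request parameter.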
Resources
- Groovy GPath page: http://groovy.codehaus.org/GPath
- GitHub samples: https://github.com/mariogarcia/xmlgroovy
Hi, what if I wanted to put a part of the GPath as a variable?
i.e.:
==============================================================================
def gpathPiece = '**'.find{book-> book.author.text() == 'Lewis Carroll'}.@id
def bookId = response.gpathPiece //This part is not working for me
System.out.println(bookId)
==============================================================================
I am trying to use a variable to represent that piece of the GPath, but it doesn't work. Confirmed that when I put that variable's value directly back into the GPath, it works.
Any suggestions? Thanks
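One possible way to do what this question asks, offered as a sketch rather than a definitive answer: in the snippet above, `'**'.find{...}` is evaluated right away against the String `'**'`, and `response.gpathPiece` then looks for child elements literally named `gpathPiece`. Storing the piece as a closure that receives the root, or as a String evaluated with `Eval.x`, avoids both problems. The sketch assumes Groovy 3 or later and a shortened, reconstructed sample document.

```groovy
import groovy.xml.XmlSlurper // Groovy 3+; on Groovy 2.x drop this import

// Assumed, shortened sample document in the same shape as the one in this post.
def xml = '''\
<response>
  <value>
    <books>
      <book id="2"><title>Catcher in the Rye</title><author id="2">JD Salinger</author></book>
      <book id="3"><title>Alice in Wonderland</title><author id="3">Lewis Carroll</author></book>
    </books>
  </value>
</response>'''

def response = new XmlSlurper().parseText(xml)

// Option 1: keep the GPath piece as a closure and apply it to the root later.
def gpathPiece = { root ->
    root.'**'.find { book -> book.author.text() == 'Lewis Carroll' }.@id
}
def bookId = gpathPiece(response)
assert bookId.text() == '3'

// Option 2: keep the piece as a String and evaluate it with Eval.x,
// which binds its first argument to the variable x inside the expression.
def gpathText = "x.'**'.find { it.author.text() == 'Lewis Carroll' }.@id"
def bookIdFromText = Eval.x(response, gpathText)
assert bookIdFromText.text() == '3'
```

The closure version is checked at compile time and is usually the safer choice; the String version compiles the expression at runtime, so it should only be used with trusted input.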