How does XMLParser work? (RSS example)

:information_source: Attention Topic was automatically imported from the old Question2Answer platform.
:bust_in_silhouette: Asked By dmklsv
:warning: Old Version Published before Godot 3 was released.

I have this code:

var parser = XMLParser.new()
func _ready():
    parser.open("res://test.xml")
    parser.read()
    parser.skip_section()

test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<item>
	<title>Title</title>
	<link>http://test.com</link>
	<description>Some text here</description>
	<pubDate>Wed, 16 Nov 2016 00:00:02 +0300</pubDate>
</item>
</channel>
</rss>

OK, how do I get to the “description” field? Do I need some kind of XML loop? How do I do it?
I read the XMLParser docs, but I did not understand them.

I… don’t understand either O_O that’s a very strange API…

Zylann | 2016-11-18 14:32

I found this description on GitHub in the development branch,
but I still cannot understand how it works =) Maybe someone can make it work.

int get_attribute_count ( ) const

Get the amount of attributes in the current element.

String get_attribute_name ( int idx ) const

Get the name of the attribute specified by the index in idx argument.

String get_attribute_value ( int idx ) const

Get the value of the attribute specified by the index in idx argument.

int get_current_line ( ) const

Get the current line in the parsed file (currently not implemented).

String get_named_attribute_value ( String name ) const

Get the value of a certain attribute of the current element by name. This will raise an error if the element has no such attribute.

String get_named_attribute_value_safe ( String name ) const

Get the value of a certain attribute of the current element by name. This will return an empty [String] if the attribute is not found.

String get_node_data ( ) const

Get the contents of a text node. This will raise an error in any other type of node.

String get_node_name ( ) const

Get the name of the current element node. This will raise an error if the current node type is not NODE_ELEMENT nor NODE_ELEMENT_END

int get_node_offset ( ) const

Get the byte offset of the current node since the beginning of the file or buffer.

int get_node_type ( )

Get the type of the current node. Compare with NODE_* constants.

bool has_attribute ( String name ) const

Check whether or not the current element has a certain attribute.

bool is_empty ( ) const

Check whether the current element is empty (this only works for completely empty tags, e.g. <element />).

int open ( String file )

Open a XML file for parsing. This returns an error code.

int open_buffer ( RawArray buffer )

Open a XML raw buffer for parsing. This returns an error code.

int read ( )

Read the next node of the file. This returns an error code.

int seek ( int pos )

Move the buffer cursor to a certain offset (since the beginning) and read the next node there. This returns an error code.

void skip_section ( )

Skips the current section. If the node contains other elements, they will be ignored and the cursor will go to the closing of the current element.

dmklsv | 2016-11-19 09:14
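
For readers trying the attribute methods listed above, here is a minimal sketch, assuming Godot 3.x (where `String.to_utf8()` returns a `PoolByteArray`; the Godot 2 docs quoted above call that type `RawArray`). The inline `XML_TEXT` constant is a made-up stand-in for the file so the snippet does not depend on anything on disk.

```gdscript
extends Node

# Hypothetical inline document, reduced to the one element that has an attribute.
const XML_TEXT = "<rss version=\"2.0\"><channel></channel></rss>"

func _ready():
	var parser = XMLParser.new()
	# Assumption: Godot 3.x, where String.to_utf8() yields a PoolByteArray.
	var err = parser.open_buffer(XML_TEXT.to_utf8())
	if err != OK:
		printerr("Could not open the XML buffer, error code: ", err)
		return

	while parser.read() == OK:
		# Only opening element nodes are inspected for attributes here.
		if parser.get_node_type() != XMLParser.NODE_ELEMENT:
			continue
		var tag = parser.get_node_name()
		# Lookup by name. The _safe variant returns "" when the attribute is missing.
		if parser.has_attribute("version"):
			print(tag, " version = ", parser.get_named_attribute_value_safe("version"))
		# Lookup by index, useful when the attribute names are not known in advance.
		for i in range(parser.get_attribute_count()):
			print(tag, " [", i, "] ", parser.get_attribute_name(i), " = ", parser.get_attribute_value(i))
```

Checking `has_attribute()` first (or using the `_safe` getter) avoids the error that `get_named_attribute_value()` raises for a missing attribute.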

Looks like this parser works in a linear way. So when the file is parsed, the parser will point to the first node. Using read will move the “cursor” to the first child node (or the next one if there are none), while skip_section is like a “step over”, ignoring children. Needs some tests to be sure though.
XML hierarchy is primarily a tree/list, so that’s why access is not as easy as using a dictionary. Instead it’s like iterating an array, where elements can have sub-arrays of nodes, and nodes can also have attributes indexed by name.

Zylann | 2016-11-19 16:09
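
The "cursor" model described in the comment above can be tried out with a small sketch like the following. It assumes Godot 3.x and uses a trimmed, inline copy of the RSS sample (the `XML_TEXT` constant and the `depth` counter are illustration only). It treats `skip_section()` as the quoted docs describe it, leaving the cursor on the closing tag of the current element, which is worth verifying on your own Godot version.

```gdscript
extends Node

# Hypothetical inline sample, trimmed from the RSS file in the question.
const XML_TEXT = "<item><title>Title</title><link>http://test.com</link><description>Some text here</description></item>"

func _ready():
	var parser = XMLParser.new()
	# Assumption: Godot 3.x, where String.to_utf8() yields a PoolByteArray.
	if parser.open_buffer(XML_TEXT.to_utf8()) != OK:
		return

	var depth = 0  # How many elements are currently open around the cursor.
	while parser.read() == OK:
		var type = parser.get_node_type()
		if type == XMLParser.NODE_ELEMENT:
			print(depth, " open  ", parser.get_node_name())
			depth += 1
			if parser.get_node_name() == "link":
				# "Step over": jump to the closing tag of <link>, ignoring its content.
				parser.skip_section()
				depth -= 1
		elif type == XMLParser.NODE_TEXT:
			print(depth, " text  ", parser.get_node_data())
		elif type == XMLParser.NODE_ELEMENT_END:
			depth -= 1
			print(depth, " close ", parser.get_node_name())
```

Each call to `read()` advances exactly one node (opening tag, text, or closing tag), so any notion of hierarchy such as `depth` has to be tracked by the calling code. The sample has no self-closing tags; those would need an `is_empty()` check before incrementing `depth`.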

:bust_in_silhouette: Reply From: AnJo888

Just had the same doubt (the official documentation is, let’s say, spartan)…

So I did some testing…

from this xml

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<item>
    <title>Title</title>
    <link>http://test.com</link>
    <description>Some text here</description>
    <pubDate>Wed, 16 Nov 2016 00:00:02 +0300</pubDate>
</item>
</channel>
</rss>

and this code

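onready var dicL1 = {}
onready var errorCode = 0
onready var parser = XMLParser.new()

func _ready():
	errorCode = parser.open("res://xml/Test.xml")
	if errorCode != OK:
		exit(errorCode)
		return

	# read() moves to the next node and stops returning OK at the end of the file.
	while parser.read() == OK:
		# Per the docs, get_node_data() is only valid on text nodes and
		# get_node_name() is not valid on them, so pick by node type.
		var isText = parser.get_node_type() == XMLParser.NODE_TEXT
		var nodeName = "" if isText else parser.get_node_name()
		var nodeData = parser.get_node_data() if isText else ""
		print(nodeName, ": ", nodeData)
		# Only opening elements carry attributes.
		if parser.get_node_type() == XMLParser.NODE_ELEMENT:
			for i in range(parser.get_attribute_count()):
				print(parser.get_attribute_name(i), ": ", parser.get_attribute_value(i))
				dicL1[parser.get_attribute_name(i)] = parser.get_attribute_value(i)

	# Print the collected attributes once, after the whole file has been read.
	print(dicL1)
	exit(0)

func exit(error) -> void:
	get_tree().quit()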

I got the following results

?xml version="1.0" encoding="UTF-8"?: 
rss: 
version: 2.0
channel: 
item: 
: 
    
title: 
: Title
title: 
: 
    
link: 
: http://test.com
link: 
: 
    
description: 
: Some text here
description: 
: 
    
pubDate: 
: Wed, 16 Nov 2016 00:00:02 +0300
pubDate: 
item: 
channel: 
rss: 
{version:2.0}

Now it’s just the work of figuring out how to sort/use those values (there are other methods useful to do this).

Hope this helps future seekers…
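
To get at a single field such as “description”, one possible approach is to remember the name of the last opened element and store the text node that follows it. This is only a sketch, assuming Godot 3.x and the `res://test.xml` file from the question; `parse_items`, `current_item` and `current_tag` are hypothetical names, not part of the XMLParser API.

```gdscript
extends Node

# Hypothetical helper: returns one Dictionary per <item>, keyed by child tag name.
func parse_items(path):
	var items = []
	var parser = XMLParser.new()
	if parser.open(path) != OK:
		return items

	var current_item = null  # Dictionary being filled while inside an <item>.
	var current_tag = ""     # Element whose text content is expected next.
	while parser.read() == OK:
		var type = parser.get_node_type()
		if type == XMLParser.NODE_ELEMENT:
			current_tag = parser.get_node_name()
			if current_tag == "item":
				current_item = {}
		elif type == XMLParser.NODE_TEXT:
			# Ignore whitespace between tags: it arrives either directly inside
			# <item> or after a closing tag, when current_tag has been reset.
			if current_item != null and current_tag != "" and current_tag != "item":
				current_item[current_tag] = parser.get_node_data().strip_edges()
		elif type == XMLParser.NODE_ELEMENT_END:
			if parser.get_node_name() == "item" and current_item != null:
				items.append(current_item)
				current_item = null
			current_tag = ""
	return items

func _ready():
	for item in parse_items("res://test.xml"):
		# Dictionary.get() falls back to "" when an item has no description.
		print(item.get("description", ""))
```

This keeps only flat, text-only children of `<item>`; nested elements or CDATA sections would need extra handling.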

There is a video about XMLParser. Unfortunately it’s not in English, but watching it can still help a lot. I also need to parse HTML files using this library, but I’m still trying…

SdSaati | 2022-05-04 04:48