Last updated on May 5th, 2016 at 01:07 am

This is a simple script for parsing an XML file with PHP. There are two files: the XML file and the PHP script that reads it. The example uses a fairly large XML file and extracts its data with PHP's SimpleXML extension.

The first file is the XML file. Save it as parse.xml.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Tutorialz Images</title>
<link>http://www.tutorialz.tk</link>
<language>en-us</language>
<image>
<url>102724_31082007.jpg</url>
<title></title>
<link>http://www.tutorialz.tk</link>
<description/>

</image>

<image>
<url>369877_31082007.jpg</url>
<title></title>
<link>http://www.tutorialz.tk/22562/userid/3</link>
<description/>
</image>

<image>
<url>898438_0312080814461mira5abcjpg.jpg</url>

<title> </title>
<link>http://www.tutorialz.tk/22560/userid/3</link>
<description/>
</image>

<image>
<url>211000_withkids.jpg</url>
<title>rahul and gokul
</title>
<link>http://www.tutorialz.tk/22469/userid/3</link>
<description/>

</image>

<image>
<url>180625_cycle.jpg</url>
<title>
</title>
<link>http://www.tutorialz.tk/22467/userid/3</link>
<description/>
</image>

<image>
<url>368384_mission.jpg</url>

<title>The mission begins
</title>
<link>http://www.tutorialz.tk/22465/userid/3</link>
<description/>
</image>

<image>
<url>370625_04082007.jpg</url>
<title>
</title>
<link>http://www.tutorialz.tk/22350/userid/3</link>
<description/>

</image>

<image>
<url>292629_06082007744.jpg</url>
<title>
</title>
<link>http://www.tutorialz.tk/22330/userid/3</link>
<description/>
</image>

<image>
<url>22314</url>

<title></title>
<link>http://www.tutorialz.tk/22314/userid/3</link>
<description/>
</image>

<image>
<url>22313</url>
<title>comfort inn attham</title>
<link>http://www.tutorialz.tk/22313/userid/3</link>
<description/>

</image>

<image>
<url>22295</url>
<title>dial 100 if you see any of these person</title>
<link>http://www.tutorialz.tk/22295/userid/3</link>
<description/>
</image>

<image>
<url>811298_30072007676.jpg</url>

<title>
</title>
<link>http://www.tutorialz.tk/22200/userid/3</link>
<description/>
</image>

<image>
<url>974623_dog.jpg</url>
<title>
</title>
<link>http://www.tutorialz.tk/22179/userid/3</link>
<description/>

</image>

<image>
<url>270873_15082007.jpg</url>
<title></title>
<link>http://www.tutorialz.tk/22103/userid/3</link>
<description/>
</image>

<image>
<url>22102</url>

<title></title>
<link>http://www.tutorialz.tk/22102/userid/3</link>
<description/>
</image>

<image>
<url>22101</url>
<title>kenney</title>
<link>http://www.tutorialz.tk/22101/userid/3</link>
<description/>

</image>

<image>
<url>22100</url>
<title></title>
<link>http://www.tutorialz.tk/22100/userid/3</link>
<description/>
</image>

<image>
<url>22099</url>

<title></title>
<link>http://www.tutorialz.tk/22099/userid/3</link>
<description/>
</image>

<image>
<url>22098</url>
<title></title>
<link>http://www.tutorialz.tk/22098/userid/3</link>
<description/>

</image>

<image>
<url>22097</url>
<title></title>
<link>http://www.tutorialz.tk/22097/userid/3</link>
<description/>
</image>

</channel>
</rss>

The next file to create is the PHP script that parses the XML data above. Save it in the same folder as parse.xml.

<?php
class RSSParser{
# (string) URL of feed

public $url;

# (string) Raw file contents of RSS feed

public $page;

# (object) Object data of RSS feed

public $xml;

# (object) Channel object containing feed information and images

public $channel;

# (array) RSS images; stays an empty array if the feed has none

public $images = array();

# (array) Feed information ( title, description, link, publish date )

public $feed;

/*

Class Constructor

Arguments:

url: Feed URL, can be a local file or online ( http:// )

- url is required in order to execute the constructor

*/

function __construct ( $url )

{

/*

Do we have PHP5 Installed?

If we do not have it installed,

Kill the script immediately.

*/

if ( intval( phpversion() ) < 5 )

{

die ( 'PHP5 is required to execute this class.' );

}

/*

Does the extension class exist?

Since it is an internal class

compiled into PHP5, we can check

whether it is installed or not.

*/

else if ( !class_exists ( 'SimpleXMLElement' ) )

{

die ( 'Please re-compile PHP5 with the SimpleXML extension.' );

}

// Set the URL of the feed internally.

$this->setRSS ( $url );

// Get the page contents of that feed.

$this->getRSS ();

// Parse RSS information

$this->parseRSS ();

}

/*

Function: setRSS

Arguments:

url: RSS feed URL, which is set internally

- url is required to run this function

*/

function setRSS ( $url )

{

$this->url = $url;

}

/*

Function getRSS

- Get the feed source of the rss feed

*/

function getRSS ()

{

$this->page = file_get_contents ( $this->url )

or die ( 'RSS feed was not found' );

}

/*

Function: parseRSS

- Parses the RSS source

- Places feed images in array: $this->images

- Places feed details in array: $this->feed

*/

function parseRSS ()

{

// Since the extension is loaded, let's create a new

// instance of this class from the raw feed contents.

$this->xml = new SimpleXMLElement ( $this->page );

// The XML Object has another child called channel.

// It holds the RSS details as well as images

$this->channel = $this->xml->channel;

// Let's set the feed details:

// information about the RSS feed itself

$this->feed = array

(

'title' => $this->clean ( $this->channel->title ),

'description' => $this->clean ( $this->channel->description ),

'link' => $this->clean ( $this->channel->link ),

'date' => $this->clean ( $this->channel->pubDate ),

'image' => ( $this->channel->image->url ) ? $this->clean ( $this->channel->image->url ) : false,

);

// Checks if we have any images present.

// Yes, it is possible that a feed is empty =/

if ( is_object ( $this->channel->image ) && count( $this->channel->image ) )

{

// Let's loop through all the <image> objects

foreach ( $this->channel->image as $image )

{

// Add an image to the array

$this->images[] = array

(

'title' => $this->clean ( $image->title ),

'link' => $this->clean ( $image->link ),

'description' => $this->clean ( $image->description ),

'category' => $this->clean ( $image->category ),

'url' => $this->clean ( $image->url),

);

}

}

}

/*

Function: clean

Arguments: the value to clean ( a string or a SimpleXMLElement node )

Casts the node to a plain string and escapes HTML special characters.

*/

function clean ( $i )

{

return htmlspecialchars ( html_entity_decode ( (string) $i ) );

}

}

$RSSParser = new RSSParser ( 'parse.xml' );

echo 'Thank you. The XML was parsed successfully.<br><b>The parsed XML is:</b> <pre>';

print_r( $RSSParser->feed );

print_r( $RSSParser->images );

echo '</pre>';

?>
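
The class above stops with an uncaught exception if parse.xml is not well-formed, because the SimpleXMLElement constructor throws on invalid XML. As a rough sketch of one way to handle that, the snippet below collects libxml's error messages instead. The function name loadXmlOrErrors and the two sample strings are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not part of the original class): parse an XML string
// and return array( SimpleXMLElement|null, list of error messages ).
function loadXmlOrErrors(string $source): array
{
    // Ask libxml to collect errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // Returns false when the document cannot be parsed.
    $xml = simplexml_load_string($source);

    $errors = array();
    foreach (libxml_get_errors() as $error) {
        $errors[] = trim($error->message) . ' (line ' . $error->line . ')';
    }

    // Restore the previous libxml error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array($xml === false ? null : $xml, $errors);
}

// Sample input, assumed for this sketch: one well-formed feed, one broken one.
$good = '<rss version="2.0"><channel><title>Tutorialz Images</title></channel></rss>';
$bad  = '<rss><channel><title>Unclosed</channel></rss>';

list($xml, $errors) = loadXmlOrErrors($good);
echo (string) $xml->channel->title, "\n";

list($xml, $errors) = loadXmlOrErrors($bad);
echo $xml === null ? "Parse failed:\n" . implode("\n", $errors) . "\n" : "Parsed\n";
```

To use the same idea with a file, you could pass the result of file_get_contents ( 'parse.xml' ) to the helper and only build the arrays when no errors come back.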

If you are an experienced OOP programmer, you can easily customize the script. Thanks!
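
As one small example of such a customization, many of the images in parse.xml have an empty title, so you may want to list only the ones that have one. The sketch below works on an array shaped like $RSSParser->images. The function name imagesWithTitles is an assumption for illustration, and the sample entries are copied by hand from parse.xml so the snippet runs on its own.

```php
<?php
// Hypothetical helper: keep only the entries whose title is not blank.
function imagesWithTitles(array $images): array
{
    // array_filter keeps the original keys, so re-index with array_values.
    return array_values(array_filter($images, function (array $image): bool {
        return trim($image['title']) !== '';
    }));
}

// Sample data in the same shape as $RSSParser->images (values from parse.xml).
$images = array(
    array('title' => '', 'url' => '102724_31082007.jpg', 'link' => 'http://www.tutorialz.tk'),
    array('title' => "The mission begins\n", 'url' => '368384_mission.jpg', 'link' => 'http://www.tutorialz.tk/22465/userid/3'),
    array('title' => 'kenney', 'url' => '22101', 'link' => 'http://www.tutorialz.tk/22101/userid/3'),
);

// Titles that span two lines in the XML carry a trailing newline, so trim them.
foreach (imagesWithTitles($images) as $image) {
    echo trim($image['title']), ' => ', $image['url'], "\n";
}
```

With the real script you would call imagesWithTitles ( $RSSParser->images ) instead of using the sample array.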
