I decided to buy a Palm the day I discovered the fabulous AvantGo application. It brings the content of web sites to the device and refreshes it at each synchronization.
AvantGo is basically a proxy that transforms HTML 3.2 into a format suitable for display on small devices. Here is the AvantGo architecture:
+--------+                   +----+               +-------------+               +--------+
|  Palm  | - serial port --> | PC | - Internet -> | AvantGo.com | - Internet -> | target |
+--------+                   +----+               +-------------+               +--------+
For example, I have the schedules of the three cinemas near my home, the daily news from Le Monde, and computer news from ZDNet.
You can also add your own custom channels, provided the target site has a simple layout built with HTML 3.2, without tables or frames. However, many cool news sites are not like that. My target in particular was HNS (Help Net Security), whose layout is not designed for a screen as small as the Palm's.
So it was time for me to build a bot that, when the AvantGo proxy requests it, downloads the original page, extracts the information, and hands it to AvantGo in a basic layout. My architecture just adds one layer between AvantGo.com and HNS: my PHP script.
        +-------------+               +-----------------+               +-----+
... --> | AvantGo.com | - Internet -> | PHP reformatter | - Internet -> | HNS |
        +-------------+               +-----------------+               +-----+
The code is below.
<?php
$f = fopen("http://www.net-security.org:80/", "r");
if (!$f) {
    echo "Error!";
    exit;
}
// Read the whole page into $data
$data = "";
while (!feof($f)) {
    $data .= fgets($f, 1024);
}
fclose($f);
function intelligent_split($s, $pos, $a)
{
    // Walks the markers in $a through $s, starting at offset $pos, and
    // returns the text found between each pair of consecutive markers.
    // $result[0] is the offset of the last marker, or false if a marker is missing.
    // Use the code in comments to debug
    $result = array_fill(0, count($a), false);
    $i = 0;
    //echo "[ pos = $pos ]<BR>\r\n";
    //echo "[ sub = <TT>".htmlspecialchars($a[$i])."</TT> ]<BR>\r\n";
    $pos = strpos($s, $a[$i++], $pos);
    // strpos() returns false, not a string, when the marker is not found
    while ($i < count($a) && $pos !== false) {
        //echo "[ pos = $pos ]<BR>\r\n";
        $pos += strlen($a[$i-1]);
        $pos2 = strpos($s, $a[$i], $pos);
        //echo "[ sub = <TT>".htmlspecialchars($a[$i])."</TT> ]<BR>\r\n";
        if ($pos2 === false) {
            $pos = false; // incomplete record: report failure to the caller
            break;
        }
        $result[$i++] = substr($s, $pos, $pos2-$pos);
        $pos = $pos2;
    }
    //echo "[ pos = $pos ]<BR>\r\n";
    $result[0] = $pos;
    return $result;
}
?><HTML>
<HEAD>
<TITLE>HNS News</TITLE>
<META Name="HandheldFriendly" Content="True">
</HEAD>
<BODY>
<H1 Align="center">H E L P N E T S E C U R I T Y</H1>
<CENTER><B>www.net-security.org</B><BR>
AvantGo version by <A HRef="mailto:dolmen@_nospam_.bigfoot.no2spam.com">Dolmen</A><BR>
Last update: <?php echo gmdate("M d Y, G:i") ?> GMT</CENTER>
<HR>
<?php
$a = array(
'<P><FONT SIZE="-1" FACE="Arial"><B>', // $title
'</B><BR><FONT SIZE="-2" FACE="Arial">by </FONT><FONT'."\n".'SIZE="-2" FACE="Arial"><a'."\n".'href="mailto:', // $email
'"><FONT'."\n".'SIZE="-2" FACE="Arial"><B>', // $author
'</A></B></FONT> <FONT SIZE="-2"'."\n".'FACE="Arial">', // $datestr
'<BR><BR><font size="-1" face="arial, verdana">', // $text
'</FONT><P><FONT SIZE="-1"'."\n".'FACE="Arial"><img'."\n".'src="http://net-security.org/images/news_divider.gif"</FONT></P>'
);
$pos = 0;
do {
    list($pos, $title, $email, $author, $datestr, $text) = intelligent_split($data, $pos, $a);
    if (!$pos)
        break;
    //echo "$pos<BR>\r\n";
    echo "<P>\n<B>$title</B><BR>\n";
    //echo "$email<BR>\r\n";
    //echo "$author<BR>\r\n";
    echo "$text\n";
} while (1);
?><HR>
</BODY>
</HTML>
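To see the marker-based extraction in isolation, here is a minimal sketch that runs on a made-up HTML fragment instead of the live HNS page. The helper name `extract_between`, the sample string, and the three markers are all hypothetical, chosen only for illustration; the sketch also assumes a PHP version where `strpos()` returns `false` on failure and array destructuring in `foreach` is available (PHP 7.1 or later).

```php
<?php
// Hypothetical helper (not the script's own function): returns the text found
// between consecutive markers, searching $s from offset $pos.
// Element 0 is the offset of the last marker, or false if any marker is missing.
function extract_between(string $s, int $pos, array $markers): array
{
    $result = array_fill(0, count($markers), false);
    $pos = strpos($s, $markers[0], $pos);
    if ($pos === false) {
        return $result;
    }
    for ($i = 1; $i < count($markers); $i++) {
        $pos += strlen($markers[$i - 1]);       // skip past the previous marker
        $next = strpos($s, $markers[$i], $pos);
        if ($next === false) {
            $result[0] = false;                 // incomplete record
            return $result;
        }
        $result[$i] = substr($s, $pos, $next - $pos);
        $pos = $next;
    }
    $result[0] = $pos;
    return $result;
}

// Made-up sample data, not the real HNS markup.
$page = '<b>First</b><i>one</i><hr><b>Second</b><i>two</i><hr>';
$markers = ['<b>', '</b><i>', '</i><hr>'];

$items = [];
$pos = 0;
while (true) {
    $r = extract_between($page, $pos, $markers);
    if ($r[0] === false) {
        break;                                  // no further complete record
    }
    $pos = $r[0];
    $items[] = [$r[1], $r[2]];
}

foreach ($items as [$title, $text]) {
    echo "$title: $text\n";
}
```

Each call resumes from the offset returned by the previous one, so the loop collects one title and text pair per record and stops as soon as a marker can no longer be found.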
This was my first useful PHP script (I use it every day); let's hope it won't be the last.
Look at my other essay about Java bots!
Dolmen (dolmen*at*bigfoot*dot*com)
May 1st, 2001