Perl question
Eric Schwartz
schwartz at ll.mit.edu
Thu May 22 12:20:47 EDT 2003
Hello all,
Thank you for your help in this matter. I have decided to move
forward with "HTML scraping". I am using code from a book on
Perl, and I can't seem to get it to work. I tried to modify it to search
specifically for "Estimated Pages Remaining:" and to capture the
group of digits that comes right after it, but I don't seem to be doing
anything right. When I run this code it prints "here it is:" and nothing
else. Maybe it just finds a blank space after the search text; I'm not
sure. Here is a small clip of the HTML I am looking at:
<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>
MY CODE:
my $html = get("ipaddress");
$html =~ m{Estimated Pages Remaining:<td width="90%"> ([\d,]+) </font><br>};
my $blkpgsrem=$1;
#$blkpgsrem =~ tr[,][]d;
print "here it is:$blkpgsrem\n";
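For comparison, here is a minimal sketch of a pattern that does match the clip above. It is an illustration under assumptions, not a tested fix for the real page: the heredoc stands in for whatever get() returns, and the real page may differ from the clip. In the clip, the label is followed by a newline, closing tags, and a new table cell before the digits, so the pattern skips any mix of tags and whitespace instead of expecting <td width="90%"> and a space right after the colon. When a match fails, $1 is not set by it, which would explain the empty output.

```perl
use strict;
use warnings;

# Stand-in for the page that get() would return; copied from the clip above.
my $html = <<'END_HTML';
<td width="90%">
<p align="left"><font face="Arial,Helvetica" size="1" color="#000000">
Estimated Pages Remaining:
</font></p>
</td>
<td width="10%">
<p align="right"><font face="Arial,Helvetica" size="1" color="#000000">
6052
</font></p>
</td>
</tr>
END_HTML

# After the label, skip any run of whitespace and tags, then capture the
# first run of digits and commas.
my $blkpgsrem;
if ($html =~ m{Estimated Pages Remaining:(?:\s|<[^>]*>)*([\d,]+)}) {
    $blkpgsrem = $1;
    $blkpgsrem =~ tr/,//d;    # drop thousands separators: "6,052" -> "6052"
    print "here it is:$blkpgsrem\n";
}
else {
    # Only use $1 after a successful match.
    print "no match; check the pattern against the actual HTML\n";
}
```

Testing the match in an if before touching $1 makes a failed match visible instead of silently printing an empty value.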
PS:
I also found this code in the book I am looking at, but I can't seem
to understand it.
$text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);
($file, $title, $summary) =
    $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>}s;
It looks like it is searching for multiple things and assigning a different
variable to each search parameter; however, I do not know how to make this
apply to my HTML.
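A small sketch of how that book snippet behaves, and how the same idea might carry over to a status row. The $row string, $label, and $pages are made up for illustration and are much simpler than the real page. Each pair of parentheses in the pattern is a capture group, and in list context the match hands the groups back in order. The /s modifier is needed here so that "." can match the newline inside the summary text.

```perl
use strict;
use warnings;

# The book's string: a link, a bold title, then free text up to </p>.
my $text = qq(<a href="file.html"><b>Dog</b></a>Woof\nWoof</p>);

# Three capture groups on the right, three variables on the left.
my ($file, $title, $summary) =
    $text =~ m{<a href="(.*?)"><b>(.*?)</b></a>\s*(.*?)</p>}s;

print "file:    $file\n";       # file.html
print "title:   $title\n";      # Dog
print "summary: $summary\n";    # Woof, a newline, Woof

# Same idea on a simplified, hypothetical row: one group for the label,
# one for the number in the next cell.
my $row = qq(<td>Estimated Pages Remaining:</td>\n<td>6052</td>);
my ($label, $pages) =
    $row =~ m{<td>\s*(.*?):\s*</td>\s*<td>\s*([\d,]+)\s*</td>}s;

print "$label = $pages\n";      # Estimated Pages Remaining = 6052
```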
Again, I appreciate all the help you guys have been giving me. Thank you.
Eric
At 03:15 PM 5/22/2003 +0000, dsr at tao.merseine.nu wrote:
>On Thu, May 22, 2003 at 12:00:58AM -0400, Bill Bogstad wrote:
> >
> >
> > Assuming Eric is continuing the previous line then #2 above is false.
> > In a scalar context, the (implicit) match operator returns true or
> > false according to whether the regexp matched. In a list context (this
> > case), it returns a list consisting of all of the captured strings.
> > Eric is doing the equivalent of assigning $1 to $etapagerem after the
> > regexp matching completes.
> >
> > As for the regexp issue, I agree that his regexp is unlikely to do
> > anything useful. However, I believe that it is well formed. I
> > interpret it as follows:
>
>Thank you, I stand corrected.
>
>-dsr-
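
A short sketch of the context point in the quoted message, showing what a match yields in each context. The sample string and $matched are invented for illustration; $etapagerem is the variable name mentioned in the quote.

```perl
use strict;
use warnings;

my $line = "Estimated Pages Remaining: 6052";

# Scalar context: the result is only a success flag (1 on success).
my $matched = $line =~ /(\d+)/;

# List context: the parentheses around the variable make the match return
# its captures, so $etapagerem receives what $1 would hold.
my ($etapagerem) = $line =~ /(\d+)/;

print "matched:    $matched\n";       # 1
print "etapagerem: $etapagerem\n";    # 6052
```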