[ic] Froogle.google.com anyone using this yet?
> > The problem you have is the HTML in the database. That makes
> > it really hard to reuse. You might want to consider ways of
> > getting HTML out of your raw data.
> >
> A quick test script for you:
>
> ----------------------------------------------------------------------
> use strict;
> use warnings;
> use HTML::TreeBuilder;
> use HTML::FormatText;
>
> my $text =<<'EOB';
> <body>
> <p>
> This is a test blah blah.
> <a href="foobar.html">What's this, a link?</a>.
> </p>
> <p>
> Let's have some text in <font color="#FF0000">red</font>.
> </p>
> <p>
> Some &quot;entities&quot; will make another test case.
> </p>
> </body>
> EOB
>
> my $tree = HTML::TreeBuilder->new;
> $tree->parse($text);
> $tree->eof;    # signal end of input so the tree is finalized
>
> my $formatter = HTML::FormatText->new(
>     leftmargin  => 4,
>     rightmargin => 74,
> );
> $text = $formatter->format($tree);
> print $text;
> ----------------------------------------------------------------------
>
> The output is:
>
> This is a test blah blah. What's this, a link?.
>
> Let's have some text in red.
>
> Some "entities" will make another test case.
Excellent! This is just what I need for Helpem's text file output format
(near the top of the todo).
Thanks, Kevin.
Jonathan
Webmaint.
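
For reusing HTML-laden database fields in a plain-text feed, the test script above could be wrapped in a small helper along these lines. This is only a sketch: the html_to_text name, the sample rows, and the tab-separated layout are assumptions for illustration, not anything taken from Interchange, Helpem, or the Froogle feed specification.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: turn one HTML fragment into a single line of text.
sub html_to_text {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder->new;
    $tree->parse($html);
    $tree->eof;                       # finish parsing before formatting

    my $formatter = HTML::FormatText->new(
        leftmargin  => 0,
        rightmargin => 74,
    );
    my $text = $formatter->format($tree);
    $tree->delete;                    # release the parse tree

    $text =~ s/\s+/ /g;               # collapse line breaks and runs of spaces
    $text =~ s/^ | $//g;              # trim leading and trailing space
    return $text;
}

# Made-up rows standing in for whatever the products table returns.
my @rows = (
    { sku => 'A100', description => '<p>Sturdy <b>steel</b> ladder.</p>' },
    { sku => 'A200', description => '<p>Paint in <font color="#FF0000">red</font>.</p>' },
);

# One tab-separated line per product, with markup stripped from the description.
for my $row (@rows) {
    print join("\t", $row->{sku}, html_to_text($row->{description})), "\n";
}
```

Collapsing whitespace keeps each record on one line, which suits a tab-delimited export; for a human-readable text file you would probably keep the formatter's paragraph breaks and margins instead.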