, the rest of the page doesn't really change. I'd like to store
and index only the contents of this
, to basically avoid re-indexing
over and over the same content (header, footer, menu).
I've checked the W...
Sent 2010-03-12 by Susam Pal <susam.pal@...>
On Fri, Mar 12, 2010 at 2:09 PM, Graziano Aliberti wrote:
> On 11/03/2010 16:20, Susam Pal wrote:
>>
>> On Thu, Mar 11, 2010 at 8:24 PM, Graziano Aliberti wrote:
>>
>>>
>>> Hi everyone,
>>>
>>> I'm trying to use nutch ver. 1.0 on a s...
Sent 2010-03-12 by Graziano Aliberti <graziano.aliberti@...>
On 11/03/2010 16:20, Susam Pal wrote:
> On Thu, Mar 11, 2010 at 8:24 PM, Graziano Aliberti wrote:
>
>> Hi everyone,
>>
>> I'm trying to use Nutch ver. 1.0 on a system behind a Squid proxy. When
>> I try to fetch my website list, in the log file I see ...
Sent 2010-03-12 by Hannu Väisänen <Hannu.Vaisanen@...>
On Wed, Feb 24, 2010 at 03:42:20PM +0200, Sami Siren wrote:
> Hannu,
>
> Do you use the same set of QueryFilters both in the webapp and when
> running from the shell?
>
> Perhaps your filter is not executed when running from the CLI? You can
> verify how your query is transformed by running bin/nutch
> org...
Sent 2010-03-11 by conficio <KajKandler@...>
Andrzej Bialecki wrote:
>
> I was involved in a project to implement this (as a proprietary plugin).
> ...
> So, if you target 10 sites, you can make it work. If you target 10,000
> sites all using slightly different methods, then forget it.
>
>
> --
> Best regards,
> Andrzej Bialecki <...
Sent 2010-03-11 by Andrzej Bialecki <ab@...>
On 2010-03-11 15:53, nikinch wrote:
>
> Hi everyone
>
> I've been using Nutch for a while now and I've hit a snag.
>
> I'm trying to find where new linked pages are added to the segment as a
> specific entry.
> To be clear, I've been through the fetch class and the crawlDBFilter
>...
Sent 2010-03-11 by Susam Pal <susam.pal@...>
On Thu, Mar 11, 2010 at 8:24 PM, Graziano Aliberti wrote:
> Hi everyone,
>
> I'm trying to use Nutch ver. 1.0 on a system behind a Squid proxy. When
> I try to fetch my website list, I see in the log file that
> authentication failed...
>
> I've configure...
Sent 2010-03-11 by Graziano Aliberti <graziano.aliberti@...>
Hi everyone,
I'm trying to use Nutch ver. 1.0 on a system behind a Squid proxy.
When I try to fetch my website list, I see in the log file that
authentication failed...
I've configured my nutch-site.xml file with all the properties needed
for proxy auth, but my error is "http...
Sent 2010-03-11 by nikinch <maillard@...>
Hi everyone
I've been using Nutch for a while now and I've hit a snag.
I'm trying to find where new linked pages are added to the segment as a
specific entry.
To be clear, I've been through the fetch class and the crawlDBFilter
and reducer.
But I'm looking for the initial entry ...
Sent 2010-03-11 by Jesiel Trevisan <jesieltrevisan@...>
Please remove me from this group.
Thanks