Sent 2010-03-03 by John Martyniak <john@...>
Andrzej,
Thanks for the information. I anxiously await the update. The HBase
integration would be a nice-to-have, but I don't think that it should hold up
the release.
-John
On Wed, Mar 3, 2010 at 5:04 PM, Andrzej Bialecki wrote:
> On 2010-03-03 20:12, John Martyniak wrote:
...
Sent 2010-03-03 by Andrzej Bialecki <ab@...>
On 2010-03-03 20:12, John Martyniak wrote:
> Does anybody have an idea of when a new version of Nutch will be
> available, specifically one supporting the latest version of Hadoop, and
> possibly HBase?
>
> Thank you for any information.
We should roll out a 1.1 soon (in a few weeks); the nutch+hbase is ...
Sent 2010-03-03 by John Martyniak <john@...>
Does anybody have an idea of when a new version of Nutch will be
available, specifically one supporting the latest version of Hadoop, and
possibly HBase?
Thank you for any information.
-John
Sent 2010-03-01 by reinhard schwab <reinhard.schwab@...>
QueroVc schrieb:
> But doesn't crawl-urlfilter.txt accept only characters, not strings?
>
> If strings are accepted, how do I write them?
>
> # Skip URLs containing certain characters as probable queries, etc.
> -[?*!@=]
>
> Could it be like this?
>
> # Skip URLs containing certain characters as probable queries, etc.
> -...
Sent 2010-03-01 by QueroVc <yuri.gopfert@...>
But doesn't crawl-urlfilter.txt accept only characters, not strings?
If strings are accepted, how do I write them?
# Skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
Could it be like this?
# Skip URLs containing certain characters as probable queries, etc.
- [ "menu"]
Thanks
QueroVc wrote:...
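For anyone reading this thread later, here is a rough sketch of how rules of this kind are usually evaluated. It is not Nutch's own code: it is a small Python model that assumes each non-comment line is a `+` or `-` sign followed by a regular expression, that the expression may match anywhere in the URL, and that the first matching rule decides. The names `RULES`, `parse_rules` and `accepts`, the `-menu` rule and the example.com URLs are all made up for illustration. Under those assumptions, `[?*!@=]` is just a regex character class (any one of those characters), so a plain string such as `menu` can be written directly after the sign without brackets or quotes.

```python
import re

# Hypothetical rule file modelled on the snippet quoted in this thread.
RULES = """
# Skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
# Skip URLs containing the string "menu" (illustrative rule, not from the thread)
-menu
# Accept anything else
+.
"""


def parse_rules(text):
    """Turn rule text into a list of (allow, compiled_regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose regex is found in the URL.

    Assumption: a URL that matches no rule at all is rejected.
    """
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False


if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in (
        "http://example.com/index.html",
        "http://example.com/page?id=1",
        "http://example.com/menu/item.html",
    ):
        print(url, "->", "accept" if accepts(rules, url) else "skip")
```

Written as `[ "menu"]`, the brackets would form a character class that matches a single character (a space, a quote, or one of m, e, n, u) rather than the word itself, which is probably not what was intended.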
Sent 2010-03-01 by conficio <KajKandler@...>
Recovering after crash: Nutch 1.0
Hi, I crashed one of my fetch jobs when the output terminal died (the ssh
connection broke). After that I got all sorts of error messages about
segments not having the right folders. As I had only test data in it, I
figured I'd just delete the content of the crawl fol...
Sent 2010-03-01 by Ian Evans <ianevans@...>
On Mon, March 1, 2010 1:24 am, Sami Siren wrote:
> Andrzej Bialecki wrote:
>> On 2010-02-28 18:42, Ian M. Evans wrote:
>>> I've been digging around the nutch-user archives a bit and have seen
>>> some people discussing how to ignore menu items or other unnecessary
>>> div areas like common footer...
Sent 2010-03-01 by Ken Krugler <kkrugler_lists@...>
On Feb 28, 2010, at 10:24pm, Sami Siren wrote:
> Andrzej Bialecki wrote:
>> On 2010-02-28 18:42, Ian M. Evans wrote:
>>> Using Nutch as a crawler for solr.
>>>
>>> I've been digging around the nutch-user archives a bit and have seen
>>> some people discussing how to ignore menu items or other
...
Sent 2010-03-01 by Adilson Oliveira Cruz <adilsonocruz@...>
Did you video-record the meeting?
On Thu, Feb 25, 2010 at 5:00 AM, Bradford Stephens <
bradfordstephens@gmail.com> wrote:
> Thanks for coming, everyone! We had around 25 people. A *huge*
> success, for Seattle. And a big thanks to 10gen for sending Richard.
>
> Can't wait to see you all next m...
Sent 2010-03-01 by Sami Siren <ssiren@...>
Andrzej Bialecki wrote:
> On 2010-02-28 18:42, Ian M. Evans wrote:
>> Using Nutch as a crawler for solr.
>>
>> I've been digging around the nutch-user archives a bit and have seen
>> some people discussing how to ignore menu items or other unnecessary div
>> areas like common footers, etc. I stil...