Enterprise Search support for Apache Lucene and Solr by Lucid Imagination

Found 29,424 results in 0.028 seconds. Displaying page 5 of 2,943, sorted by

  1. [nutch-user] RE: Two Nutch parallel crawl with two conf folder.

    Sent 2010-03-09 by Pravin Karne <pravin_karne@...>

    Hi Millebii, Thanks for your valuable inputs. As per our requirements we need to run multiple Nutch instances, with each instance pointing to its own conf dir and crawlDB. crawl-urlfilter.txt is different in the two conf folders. But in our case both Nutch instances are picking the same conf dir instead ...

  2. [nutch-user] Re: Two Nutch parallel crawl with two conf folder.

    Sent 2010-03-09 by MilleBii <millebii@...>

    Yes, it should work. I personally run some test crawls on the same hardware, even in the same Nutch directory, so I share the conf directory. But if you don't want that, I would use two Nutch directories and of course two different crawl directories, because with Hadoop they will end up on the same hdf...

  3. [nutch-user] RE: Two Nutch parallel crawl with two conf folder.

    Sent 2010-03-09 by Pravin Karne <pravin_karne@...>

    Can we share a Hadoop cluster between two Nutch instances? So there will be two Nutch instances and they will point to the same Hadoop cluster. This way I am able to share my hardware bandwidth. I know that Hadoop in distributed mode serializes jobs. But it will not affect my flow. I just want to share m...

  4. [nutch-user] AW: By Indexing I get: OutOfMemoryError: GC overhead limit exceeded ...

    Sent 2010-03-08 by Patricio Galeas <pgaleas@...>

    Hello Ted, I ran the command 'ps -aux' and confirmed that only 1GB was defined. I adjusted NUTCH_HEAPSIZE to 8GB (physical RAM) and ran it again successfully. Do you know which parameters need to be adjusted if not enough physical RAM is available on the server? For example for 2GB RAM. I ran ...

  5. [nutch-user] RE: Content of redirected urls empty

    Sent 2010-03-08 by BELLINI ADAM <mbellil@...>

    I'm sorry... I just checked twice... and in my index I have the original URL, which is the HTTP one with the empty content... but it doesn't index the HTTPS one... and I am using the Solr index. Thx > From: mbellil@msn.com > To: nutch-user@lucene.apache.org > Subject: RE: Content of redirected urls empty...

  6. [nutch-user] RE: Content of redirected urls empty

    Sent 2010-03-08 by BELLINI ADAM <mbellil@...>

    Hi, I've just dumped my segments and found that I have both URLs: the original one (HTTP) with empty content, and the redirected-to or destination URL (HTTPS) with non-empty content! But in my search I found only the HTTPS URL with empty content!! Logically the content of the HTTPS U...

  7. [nutch-user] Re: Content of redirected urls empty

    Sent 2010-03-08 by Andrzej Bialecki <ab@...>

    On 2010-03-08 14:55, BELLINI ADAM wrote: > > > is there any idea guys ?? > > >> From: mbellil@msn.com >> To: nutch-user@lucene.apache.org >> Subject: Content of redirected urls empty >> Date: Fri, 5 Mar 2010 22:01:05 +0000 >> >> >> >> hi, >> the content of my redirected urls is empty...but still ...

  8. [nutch-user] Re: Two Nutch parallel crawl with two conf folder.

    Sent 2010-03-08 by MilleBii <millebii@...>

    How parallel is parallel in your case ? Don't forget Hadoop in distributed mode will serialize your jobs anyhow. For the rest why don't you create two Nutch directories and run things totally independently 2010/3/8, Pravin Karne : > Hi guys any pointer on followi...

  9. [nutch-user] RE: Content of redirected urls empty

    Sent 2010-03-08 by BELLINI ADAM <mbellil@...>

    is there any idea guys ?? > From: mbellil@msn.com > To: nutch-user@lucene.apache.org > Subject: Content of redirected urls empty > Date: Fri, 5 Mar 2010 22:01:05 +0000 > > > > hi, > the content of my redirected urls is empty...but still have the other metadata... > i have an http urls that i...

  10. [nutch-user] RE: Two Nutch parallel crawl with two conf folder.

    Sent 2010-03-08 by Pravin Karne <pravin_karne@...>

    Hi guys, any pointer on the following? Your help will be highly appreciated. Thanks -Pravin -----Original Message----- From: Pravin Karne Sent: Friday, March 05, 2010 12:57 PM To: nutch-user@lucene.apache.org Subject: Two Nutch parallel crawl with two conf folder. Hi, I want to do two Nutch para...

Apache Solr, Apache Lucene, ApacheCon and their logos are trademarks of the Apache Software Foundation.

© 2010 Lucid Imagination. All rights reserved.