Enterprise Search support for Apache Lucene and Solr by Lucid Imagination

Search Results

Found 57,806 results in 0.027 seconds. Displaying page 5 of 5,781, sorted by

  1. [solr-user] Re: Cleaning up dirty OCR

    Sent 2010-03-11 by Robert Muir <rcmuir@...>

    > > I don't deal with a lot of multi-lingual stuff, but my understanding is that this sort of thing gets a lot easier if you can partition your docs by language -- and even if you can't, doing some language detection on the (dirty) OCRed text to get a language guess (and then partition by l...

  2. [solr-user] Re: Cleaning up dirty OCR

    Sent 2010-03-11 by Chris Hostetter <hossman_lucene@...>

    : Interesting. I wonder though if we have 4 million English documents and 250 in Urdu, if the Urdu words would score badly when compared to ngram statistics for the entire corpus.

    Well it doesn't have to be a strict ratio cutoff .. you could look at the average frequency of all character...

  3. [solr-user] Re: How to edit / compile the SOLR source code

    Sent 2010-03-11 by JavaGuy84 <bbarani@...>

    Erik, That was a wonderful explanation, I hope many folks in this forum will benefit from the explanation you have given here. Actually I Googled and found the solution when you had earlier mentioned that I can do a leading wildcard without hacking the code. I found out the patch that h...

  4. [solr-dev] Re: Abstractify FacetComponent and SimpleFacets

    Sent 2010-03-11 by Grant Ingersoll <gsingers@...>

    On Mar 11, 2010, at 6:30 PM, Yonik Seeley wrote:
    > Interesting looking stuff Marcus!
    > Seems sort of related to stat.facet (calc stats on unique facet values)
    > http://wiki.apache.org/solr/StatsComponent

    And https://issues.apache.org/jira/browse/SOLR-1622

    > On Thu, Mar 11, 2010 at 5:49 P...

  5. [solr-user] Re: How to edit / compile the SOLR source code

    Sent 2010-03-11 by Erick Erickson <erickerickson@...>

    Leaving aside some historical reasons, the root of the issue is that any search has to identify all the terms in a field that satisfy it. Let's take a normal non-leading wildcard case first. Finding all the terms like 'some*' will have to deal with many fewer terms than 's*'. Just dealing with t...

  6. [solr-user] Re: Solr Performance Issues

    Sent 2010-03-11 by Mike Malloy <mike@...>

    I don't mean to turn this into a sales pitch, but there is a tool for Java app performance management that you may find helpful. It's called New Relic (www.newrelic.com) and the tool can be installed in 2 minutes. It can give you very deep visibility inside Solr and other Java apps. (Full disclosur...

  7. [solr-user] Re: field length normalization

    Sent 2010-03-11 by Jay Hill <jayallenhill@...>

    The fieldNorm is computed like this:
        fieldNorm = lengthNorm * documentBoost * documentFieldBoosts
    and the lengthNorm is:
        lengthNorm = 1/(numTermsInField)**.5
    [note that the value is encoded as a single byte, so there is some precision loss]
    So the values are not pre-set for the lengthNorm, bu...

  8. [solr-dev] Re: Abstractify FacetComponent and SimpleFacets

    Sent 2010-03-11 by Yonik Seeley <yonik@...>

    Interesting looking stuff Marcus!
    Seems sort of related to stat.facet (calc stats on unique facet values)
    http://wiki.apache.org/solr/StatsComponent

    On Thu, Mar 11, 2010 at 5:49 PM, Marcus Herou wrote:
    > I have now implemented Facet with FunctionQueries it is really...

  9. [solr-user] Re: How to edit / compile the SOLR source code

    Sent 2010-03-11 by JavaGuy84 <bbarani@...>

    Eric, Thanks a lot for your reply. I was able to successfully hack the query parser and enable the leading wild card search. As of today I hacked the code for this reason only; I am not sure how to make the leading wild card search work without hacking the code and this type of search is t...

  10. [solr-user] Re: Cleaning up dirty OCR

    Sent 2010-03-11 by Tom Burton-West <tburtonwest@...>

    We've been thinking about running some kind of a classifier against each book to select books with a high percentage of dirty OCR for some kind of special processing. Haven't quite figured out a multilingual feature set yet other than the punctuation/alphanumeric and character block ideas mentio...
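
The field-norm arithmetic quoted in result 7 can be sketched in a few lines of Python. This is only an illustration of the two formulas as written in that snippet: the function and parameter names are hypothetical, and the single-byte encoding the snippet mentions is deliberately left out, so the values here are the unrounded ones.

```python
import math


def length_norm(num_terms_in_field: int) -> float:
    """lengthNorm = 1 / (numTermsInField) ** 0.5, as quoted in the snippet."""
    if num_terms_in_field <= 0:
        raise ValueError("num_terms_in_field must be positive")
    return 1.0 / math.sqrt(num_terms_in_field)


def field_norm(num_terms_in_field: int,
               document_boost: float = 1.0,
               document_field_boosts=(1.0,)) -> float:
    """fieldNorm = lengthNorm * documentBoost * documentFieldBoosts.

    Assumption: documentFieldBoosts is treated as the product of the
    individual field boosts. The single-byte encoding step is omitted,
    so no precision is lost here.
    """
    boosts = math.prod(document_field_boosts)
    return length_norm(num_terms_in_field) * document_boost * boosts


if __name__ == "__main__":
    # Longer fields get a smaller norm when no boosts are applied.
    for n in (1, 4, 16, 100):
        print(n, field_norm(n))
```

Because the real value is stored in one byte, stored norms would differ slightly from what this sketch prints.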

Apache Solr, Apache Lucene, ApacheCon and their logos are trademarks of the Apache Software Foundation.

© 2010 Lucid Imagination. All rights reserved.