Lucid Imagination

Messages in this thread (From, Date):

  1. "Ian M. Evans", 2010-02-25 01:06
  2. MilleBii, 2010-02-25 03:54
  3. "Andreas P. Koenzen", 2010-02-25 06:57

[nutch-user] regex-urlfilter.txt and paging variables

Subject: Re: regex-urlfilter.txt and paging variables
From: "Andreas P. Koenzen" <akoenzen@...>
Date: 2010-02-25 06:57

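Replace it with this:

-[@!*]

That's it. With ? and = dropped from the character class, URLs that carry a query string, such as ?page=2, are no longer excluded.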

Best regards,

---
Andreas P. Koenzen

On 25/02/2010, at 03:06 a.m., Ian M. Evans wrote:

> I suck at regex and, in keeping with the Olympic spirit, I probably suck at giant slalom too.
>
> In regex-urlfilter.txt there's the suggested "probable queries" exclude:
>
> -[?*!@=]
>
> My only problem is that a couple of areas of the site use, for example, ?page=2 for paging through things like news archives.
>
> So, how in regex would I edit the -[?*!@=] rule to allow ?page=number?
>
> Thanks.
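
For anyone comparing options, here is a rough sketch of how the two rule sets behave. It assumes what the comments in the stock regex-urlfilter.txt describe: rules are tried top to bottom, the first pattern found anywhere in the URL decides (+ accepts, - rejects), and a URL matching nothing is rejected. Nutch itself uses Java regexes; this Python simulation is only an approximation, and the patterns here are simple enough that both engines should treat them alike. The `+\?page=\d+$` rule, the `accepts` helper, and the example.com URLs are made up for illustration and are not from the thread.

```python
import re

# Rule set suggested in the reply: drop ? and = from the exclude class.
SUGGESTED = [
    ("-", r"[@!*]"),
    ("+", r"."),            # accept anything else
]

# Hypothetical alternative: keep the stock exclude, but put a narrow
# allow rule for paging URLs in front of it.
PAGING_ONLY = [
    ("+", r"\?page=\d+$"),  # assumed rule: query string is exactly page=<digits>
    ("-", r"[?*!@=]"),      # stock exclude quoted in the question
    ("+", r"."),            # accept anything else
]

def accepts(url, rules):
    """Return True if the first rule whose pattern occurs in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched: assumed to mean the URL is ignored

if __name__ == "__main__":
    for url in (
        "http://example.com/news?page=2",
        "http://example.com/news?id=5",
        "http://example.com/about.html",
    ):
        print(url, accepts(url, SUGGESTED), accepts(url, PAGING_ONLY))
```

The trade-off: the shorter rule from the reply lets every query-string URL through, while the ordered pair only lets through URLs ending in ?page=number. Which one fits depends on whether the rest of the site's query URLs should be crawled.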
