
Searching Publications on Windows

Lenya 1.2.4's Default Publication includes this, but it may be broken. See Easy Download for details.

This assumes Lenya 1.2.2 was installed to C:\apache-lenya-1.2.2. If your installation is different, adjust the paths.
NOTE: Replace {pub} with your publication name in all instructions. All other braces should be typed as written.

1. Server-wide BUG FIX: edit configuration2xslt.xsl.
(Linux: apache-lenya-1.2.2/build/lenya/webapp/WEB-INF/classes/org/apache/lenya/lucene/index/configuration2xslt.xsl)
Add the following line:
<xsl:template match="namespace"/>

TECHNICAL REASON: The indexer adds namespaces to the data of Fields in the index. The namespaces are not used (and are annoying), so remove them. An alternative is to fix the XML later in the process, but why bother? They do not belong in the data.
Example entry for ConfigurableIndexer:
<luc:field name="htmltitle" type="Text">
<namespace prefix="xhtml"></namespace>

- Without the namespace line, it errors.
- Without the bug fix, this would return: MyTitle
- With the bug fix, this returns:
- Isn't that better?
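
A rough way to picture the bug, sketched in Python rather than taken from Lenya's code: if every text node under a field is passed through by default, the text inside the namespace element ends up glued to the field value. Suppressing the namespace element, which is what the added template does, leaves only the value. The element names, the placeholder URI "urn:example:xhtml", and both function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical field: a namespace declaration followed by the real value.
SAMPLE = '<field><namespace prefix="xhtml">urn:example:xhtml</namespace>MyTitle</field>'


def text_with_default_rules(elem):
    """Concatenate every text node, the way a pass-through default would."""
    return "".join(elem.itertext())


def text_with_namespace_suppressed(elem):
    """Concatenate text nodes but skip the content of <namespace> children."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "namespace":
            parts.append(text_with_namespace_suppressed(child))
        # Text that follows a child still belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    field = ET.fromstring(SAMPLE)
    print(text_with_default_rules(field))
    print(text_with_namespace_suppressed(field))
```

The second function plays the role of the empty template: it matches the namespace element and outputs nothing for it.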

2. Set the configuration by changing lucene-live.xconf (under pubs\{pub}\config\search):

<?xml version="1.0"?>
<update-index type="new"/>
<index-dir src="../../work/search/lucene/index/live/index"/>
<htdocs-dump-dir src="../../content/live"/>
<indexer class="org.apache.lenya.lucene.index.ConfigurableIndexer">
<configuration src="lenyadocs.xconf"/>
<extensions src="xml"/>

3. Create a new file in the same directory to tell Lucene which fields to index. The filename must match the configuration src in lucene-live.xconf (lenyadocs.xconf in step 2).
Add this:
<?xml version="1.0"?>
<luc:document xmlns:luc="">
<luc:field name="title" type="Text">
<namespace prefix="lenya"></namespace>
<namespace prefix="dc"></namespace>
<luc:field name="subject" type="Text">
<namespace prefix="lenya"></namespace>
<namespace prefix="dc"></namespace>
<luc:field name="htmltitle" type="Text">
<namespace prefix="xhtml"></namespace>
<luc:field name="language" type="Text">
<namespace prefix="lenya"></namespace>
<namespace prefix="dc"></namespace>
<luc:field name="description" type="Text">
<namespace prefix="lenya"></namespace>
<namespace prefix="dc"></namespace>
<luc:field name="htmlbody" type="Text">
<namespace prefix="xhtml"></namespace>
<luc:field name="contents" type="UnStored" xpath="/"/>

3. Create a batch file (the First Test runs it as C:\apache-lenya-1.2.2\tools\bin\Index-{pub}.bat):
With this:
SET LENYAPUB={pub}
SET CLASSPATH=.
SET ANT_HOME=C:\apache-lenya-1.2.2\tools
ant -f ../../build/lenya/webapp/lenya/bin/crawl_and_index.xml -Dlucene.xconf=../../build/lenya/webapp/lenya/pubs/%LENYAPUB%/config/search/lucene-live.xconf index

Setting CLASSPATH to only the current directory (the period) removes issues if you have Tomcat or Ant installed elsewhere.
ANT_HOME is the "tools" directory that contains the "bin" directory that contains "ant.bat".
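
For anyone who prefers a script to a batch file, the same call can be assembled in Python. This is only a sketch under the assumptions of this page: the install path is C:\apache-lenya-1.2.2, the working directory is tools\bin, and the function names and the "mypub" publication name are made up for the example. It builds the environment and the argument list but does not start Ant.

```python
import os

# Install location assumed throughout this page; adjust for your system.
LENYA_TOOLS = r"C:\apache-lenya-1.2.2\tools"
CRAWL_BUILD_FILE = "../../build/lenya/webapp/lenya/bin/crawl_and_index.xml"


def build_index_env(base_env):
    """Copy the environment and apply the two settings the batch file relies on."""
    env = dict(base_env)
    env["ANT_HOME"] = LENYA_TOOLS
    # Only the current directory, to sidestep other Tomcat or Ant installs.
    env["CLASSPATH"] = "."
    return env


def build_index_command(pub):
    """Return the ant argument list for indexing one publication."""
    xconf = ("../../build/lenya/webapp/lenya/pubs/" + pub
             + "/config/search/lucene-live.xconf")
    return ["ant", "-f", CRAWL_BUILD_FILE, "-Dlucene.xconf=" + xconf, "index"]


if __name__ == "__main__":
    print(build_index_env(os.environ)["CLASSPATH"])
    print(" ".join(build_index_command("mypub")))
```

To actually run it, the list and environment could be handed to subprocess.run with cwd set to the tools\bin directory. On Windows the executable is ant.bat, so the first element may need adjusting.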

4. To create a logfile (and avoid some ant errors), create a log4j configuration file in the tools\bin directory:
With this:
log4j.rootLogger=INFO, lucene
log4j.appender.lucene = org.apache.log4j.FileAppender
log4j.appender.lucene.File = lucene.log
log4j.appender.lucene.Append = false
log4j.appender.lucene.layout = org.apache.log4j.PatternLayout
log4j.appender.lucene.layout.ConversionPattern = %d{ABSOLUTE} [%t] %-5p %-30.30c{2} %x - %m %n

NOTE (for people not using all of these instructions): the log4j configuration file must be on the CLASSPATH. The ANT batch file in step #3 sets CLASSPATH to the current directory, so having the file in the tools\bin directory works. If you are not resetting the CLASSPATH, the file must be on your existing CLASSPATH.

First Test

1. Quit Lenya (to avoid file-locking issues).
2. Run the batch file: double-click on C:\apache-lenya-1.2.2\tools\bin\Index-{pub}.bat
3. Check C:\apache-lenya-1.2.2\tools\bin\lucene.log

The index created "works", but the results are not formatted properly.
1. "sitemap.xml" may appear.
2. All links are wrong. They have an extra slash and "/index_xx.xml" must be changed to ".html".
3. The excerpt is not available and displays a Java error.
These are fixed in the next section.

Fix the XML results to be usable.


In the new copy of "search.xsl":
1. Add this to the xsl:stylesheet tag at the very top:
2. Add this line after the other params:
<xsl:param name="chosenlanguage"/>
3. Add the usecase and language fields to the form tag:
<form action=""><div style="display:inline;"><input type="hidden" name="lenya.usecase" value="search"/><input type="hidden" name="language" value="{$chosenlanguage}"/><input class="searchfield" type="text" name="query" alt="Search field"/><input class="searchsubmit" i18n:attr="value" type="submit" value="Search" name="find"/></div></form>
Updated on 20051015. The ACTION attribute and the useless DIV tag were added for XHTML 1.0 strict validation.

Download new file:
Updated on 20051012

Based on
- Removed useless information. (I like dynamic lists better than anyone, but Search is a standard function with standard outputs, so why bother? I only left the <fields> tag to separate our output from lucene's.)
- Added language filter.
- Added protected section filter.
- - Hardcoded ProtectedUrls. The default is to require visitors be in an "employee" Group to access "/live/employee". Configure this for your website.
- - Uses Groups rather than Roles. (Roles are useless as long as "world" inherits "visit" for everything.)
- Fixed counters and total. (Total-hits changed from property to element of results.)
- Other bug fixes.
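
The language filter and the protected section filter amount to two checks per hit. A minimal sketch in Python, assuming each hit is a dictionary with "uri" and "language" keys and that the URI includes the area prefix (both assumptions for illustration; the real file is XSLT and its internals are not shown here):

```python
# Hardcoded like the page describes: URL prefix -> Group required to see it.
# Configure this for your website.
PROTECTED_URLS = {"/live/employee": "employee"}


def is_visible(uri, visitor_groups):
    """Return False when the URI is in a protected section the visitor cannot access."""
    for prefix, group in PROTECTED_URLS.items():
        in_section = uri == prefix or uri.startswith(prefix + "/")
        if in_section and group not in visitor_groups:
            return False
    return True


def filter_hits(hits, language, visitor_groups):
    """Keep hits in the chosen language that the visitor is allowed to see."""
    return [
        hit for hit in hits
        if hit.get("language") == language and is_visible(hit["uri"], visitor_groups)
    ]


if __name__ == "__main__":
    sample = [
        {"uri": "/live/employee/news", "language": "en"},
        {"uri": "/live/about", "language": "en"},
        {"uri": "/live/about", "language": "de"},
    ]
    print(filter_hits(sample, "en", set()))
    print(filter_hits(sample, "en", {"employee"}))
```

Filtering before counting is what keeps the counters and total consistent with what the visitor actually sees.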

Download new file:
This file converts our poor output to something usable.
- Add languages to configuration.
- Move total-hits from element to property of results.
- Choose "title" from "htmltitle" or Lucene's "title".
- Choose "excerpt" from "htmlbody", Lenya's "description", or Lucene's "excerpt".
- Transform URI from lucene's "uri" (/about/jobs/index_en.xml) to Lenya link (about/jobs_en.html).
The default is to use htmlbody for the excerpt. This file must be modified when using Custom Doctypes that do not have a /html/body.
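
The title, excerpt, and URI rules above can be sketched in Python as follows. This is an illustration of the logic only, not the XSLT from the download. The function names and the dictionary shape of a hit are assumptions, and the URI rule covers just the documented pattern (a path ending in /index_xx.xml).

```python
import re

# Matches the trailing "/index_<language>.xml" of a Lucene URI.
_INDEX_RE = re.compile(r"/index_([A-Za-z-]+)\.xml$")


def first_non_empty(hit, names):
    """Return the first field in `names` that has non-blank text, else ''."""
    for name in names:
        value = (hit.get(name) or "").strip()
        if value:
            return value
    return ""


def to_lenya_link(uri):
    """Drop the leading slash and turn '/index_en.xml' into '_en.html'."""
    return _INDEX_RE.sub(r"_\1.html", uri.lstrip("/"))


def normalize_hit(hit):
    """Build the usable result entry from the raw fields of one hit."""
    return {
        "title": first_non_empty(hit, ("htmltitle", "title")),
        "excerpt": first_non_empty(hit, ("htmlbody", "description", "excerpt")),
        "link": to_lenya_link(hit.get("uri", "")),
    }


if __name__ == "__main__":
    print(to_lenya_link("/about/jobs/index_en.xml"))
```

For a Custom Doctype without /html/body, the equivalent change in this sketch would be to put a different field first in the excerpt tuple.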

Download new file:
20050917: Bugs fixed.
20050616: Added {page-envelope:context-prefix} to the "root" variable to fix CSS and other URLs in the results page when not using the default (empty) context.

Download new file:
- Patched to add <xhtml:div id="body">. Removed page:* tags.
- Patched to use i18n.
- The title "Search {pubname}" is now an H2 tag. Optionally, adjust the tag or the CSS for your website.
Updated on 20050702

(Optional) The search page defaults to English. Add the following entries to \lenya\resources\i18n\cmsui*.xml files for other languages, and change the translation text. The first entry is used for the submit button.

<message key="Search">Search</message>
<message key="search-pagetitle">Search</message>
<message key="search-fieldlabel">Search</message>
<message key="search-languagefieldlabel">Language(s)</message>
<message key="search-sortfieldlabel">Sort by</message>
<message key="search-sort-scorevalue">Score</message>
<message key="search-sort-titlevalue">Title</message>
<message key="search-results-summarypages">Documents</message>
<message key="search-results-summaryto">-</message>
<message key="search-results-summaryof">of</message>
<message key="search-results-summaryfit">matches</message>
<message key="search-results-columncount">&#160;</message>
<message key="search-results-columnscore">Score</message>
<message key="search-results-columninfo">Document</message>
<message key="search-noresults">No results found.</message>
<message key="search-resultpages">Result Pages</message>
<message key="Next">Next</message>
<message key="Previous">Previous</message>

Second Test

1. Clear your cache. Delete all files in:
(Otherwise cached pages may display the default search Form.)
2. Start Lenya
3. Open your publication.
4. Use the Search.


FILE: {pub}\lenya\xslt\search\search-and-results.xsl
Many websites use Apache httpd to rewrite URLs so that "/pub/live" is never displayed. The default Search adds "/pub/live" to the result links. To avoid adding it, remove the "uri" match.


Contact Solprovider
Paul Ercolino