Constellio search/indexing

From TempusServa wiki


Constellio search in Tempus Serva installations

Activate the search servlet in your installation

The search servlet is deactivated by default.

  1. Edit the <tomcat>/webapps/<Tempus Serva>/WEB-INF/web.xml
  2. Remove comments from the search servlet
  3. Remove comments from the search filter
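Once uncommented, the relevant entries in web.xml look roughly like the sketch below. The names and classes here are illustrative assumptions; keep the ones already present (commented out) in your own web.xml:

```xml
<!-- Illustrative sketch only: use the servlet and filter names that are
     already declared (commented out) in your installation's web.xml -->
<servlet>
    <servlet-name>search</servlet-name>
    <servlet-class>...</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>search</servlet-name>
    <url-pattern>/search</url-pattern>
</servlet-mapping>
<filter-mapping>
    <filter-name>searchFilter</filter-name>
    <url-pattern>/search</url-pattern>
</filter-mapping>
```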

If you are using web container security, please remove it from the search servlet: the servlet filter will handle authentication of crawler robots using a specialized form of basic authentication (normal users will be redirected to the main servlet instead).

Option: Create a user for crawling

You will need at least one user for crawling the content in Tempus Serva, possibly more if different content restrictions apply to different search user groups.

The following applies to crawling users:

  • All groups and policies will be respected during indexing
  • No codeunits will be activated
  • No log entries will be created

You can test what the crawler will see by logging in with an extra parameter:

  /login?SearchIndexing=true
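Assuming the servlet runs under a local Tomcat (the host, port, and credentials below are placeholders, not values from the installation), the crawler's view can be checked from the command line:

```shell
# Hypothetical host and port - adjust to your installation
BASE="http://localhost:8080/TempusServa"

# Logging in with this parameter shows exactly what the crawler will index
LOGIN_URL="${BASE}/login?SearchIndexing=true"
echo "$LOGIN_URL"

# Fetch the page as the crawler user would (fill in real crawler credentials):
# curl -s --user crawler:secret "$LOGIN_URL"
```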

Prepare Constellio

Install Constellio

  1. Download the 1.3 installer
  2. Run the installer by double-clicking the .jar file
  3. Install to a MySQL database
  4. Start Constellio

Setting up a connector

Before setting up a connector, create or choose a valid search scope.

  1. Choose connector type: auth-http-connector
  2. Ensure that Use security is checked
  3. Set start URL to: http://<server name>/TempusServa/search
  4. Include the same URL in include patterns
  5. Enter username for the crawler user (a valid TS user)
  6. Enter password for the crawler user (a valid TS user)
  7. After submitting the new connector, crawling/indexing will start automatically

No further actions are needed:

The search servlet will automatically redirect real users after they click on a search result.

Option: Tweak search results

The search servlet will automatically deliver content in a crude form, without any extra HTML such as wrappers. It will also provide the crawler with information about when the content was last updated, and the document title will be set to the current record's Resume value.

You might consider excluding the command=list pages for better (less redundant) search results.
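Constellio connector include/exclude fields typically take regular expression patterns. A sketch of values matching the setup above (the <server name> placeholder is the same as in the start URL; verify the exact pattern syntax against your Constellio version):

```
Start URL:        http://<server name>/TempusServa/search
Include patterns: http://<server name>/TempusServa/search.*
Exclude patterns: .*command=list.*
```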