Content sources

Content sources are the document repositories where the information resides.
ChemLocator supports many kinds of repositories. It can connect to them and
make their contents searchable both chemically and by free text.

To see the registered content sources, open the Administration web pages,
click the Indexing menu, then click the Content sources menu item. This opens
the main page for content source administration.

If any enabled setting can slow down indexing, a yellow warning bar is shown
to inform the user. Such settings typically include time-consuming property
calculations, optical structure recognition, and optical character
recognition.

Actions

The following actions are available on the Content Sources page:

New content source

This button can be used to register a new content source.

Start full crawl

A full crawl is a pre-processing pass in which every document in the
repository is read and processed again from scratch.

A full crawl is usually required when:

  • A new content source has been created
  • About half of the documents have been modified
  • Settings have changed
  • A new integration has been specified
  • The index has been reset

Start incremental crawl

Incremental crawls are much faster than full crawls because they handle only
the changes in the repository. If a document has been changed in any way
(metadata, content, etc.), the incremental crawl picks it up and processes it.

An incremental crawl is usually the right choice when:

  • Documents have been added, removed, or changed in the repository
  • Permissions have changed for documents or folders (permission handling is
    available in Server Edition only)
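
The idea of handling only the changes can be illustrated with a minimal Python
sketch. It is not ChemLocator's implementation: the DocState record, the hash
fields, and the diff_snapshots function are hypothetical names chosen for
illustration, and the sketch assumes that a crawler keeps a snapshot of each
document from the previous crawl and compares it with the current one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocState:
    """Hypothetical snapshot of one document as seen by a crawler."""
    content_hash: str
    metadata_hash: str


def diff_snapshots(previous, current):
    """Return (added, removed, changed) document paths between two crawls."""
    added = sorted(p for p in current if p not in previous)
    removed = sorted(p for p in previous if p not in current)
    changed = sorted(
        p for p in current
        if p in previous and current[p] != previous[p]
    )
    return added, removed, changed


# Illustrative data only: paths and hashes are made up.
previous = {
    "reports/a.docx": DocState("c1", "m1"),
    "reports/b.pdf": DocState("c2", "m2"),
    "reports/c.sdf": DocState("c3", "m3"),
}
current = {
    "reports/a.docx": DocState("c1", "m1"),     # untouched, skipped
    "reports/b.pdf": DocState("c2", "m2-new"),  # metadata changed
    "reports/d.mol": DocState("c4", "m4"),      # newly added
}

added, removed, changed = diff_snapshots(previous, current)
print(added, removed, changed)
```

Only the documents in the three returned lists would need work in such a
scheme, which is why an incremental pass can be much cheaper than re-reading
the whole repository.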

Stop crawl

Stops an indexing run before it completes.

Note that stopping does not happen immediately after the button is pressed; it
takes some time, depending on the type of the content source and the state of
the indexing. The status "Stopping" indicates that the content source is being
stopped.
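
One common reason for such a delay is cooperative cancellation: the stop
request only sets a flag, and the worker notices it when it finishes its
current unit of work. The Python sketch below illustrates that pattern under
this assumption. The Crawl class, its method names, and every status value
other than "Stopping" are hypothetical and do not describe ChemLocator's
internals.

```python
import threading
import time


class Crawl:
    """Hypothetical crawl worker; names and states are illustrative only."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.processed = []
        self.status = "Idle"
        self._lock = threading.Lock()
        self._stop_requested = threading.Event()

    def run(self):
        with self._lock:
            self.status = "Crawling"
        for doc in self.documents:
            # The flag is checked only between documents, so the document
            # currently in progress always finishes before the crawl stops.
            if self._stop_requested.is_set():
                break
            time.sleep(0.01)  # stand-in for real processing work
            self.processed.append(doc)
        with self._lock:
            stopped = self._stop_requested.is_set()
            self.status = "Stopped" if stopped else "Completed"

    def request_stop(self):
        # Records the request; the worker reacts at its next check.
        with self._lock:
            if self.status == "Crawling":
                self.status = "Stopping"
                self._stop_requested.set()


crawl = Crawl(f"doc-{i}" for i in range(1000))
worker = threading.Thread(target=crawl.run)
worker.start()
while not crawl.processed:  # wait until the crawl is under way
    time.sleep(0.001)
crawl.request_stop()        # status becomes "Stopping" until the worker reacts
worker.join()
print(crawl.status, len(crawl.processed))
```

In this sketch the length of the "Stopping" phase depends on how long one
document takes to process, which mirrors the dependence on content source type
and indexing state described above.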

Delete content source

Deletes an existing content source. All the indexed data will be removed.

Additional information

Crawl schedules

You can set a schedule for each content source so that the program starts
crawling automatically and you do not have to start the process manually.

Overlapping start addresses

Make sure that the start addresses of the content sources do not overlap.
Indexing the same folder through more than one content source is not supported.
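
Before registering a new content source, a list of start addresses can be
checked for overlaps with a small script. The Python sketch below is an
illustration, not a ChemLocator feature. It assumes that start addresses are
folder-style paths, that comparison should be case-insensitive, and that two
addresses overlap when one is equal to, or a parent folder of, the other. The
function names and example addresses are hypothetical.

```python
from itertools import combinations


def _segments(address):
    """Split an address into lower-cased path segments, ignoring slash style."""
    normalized = address.replace("\\", "/")
    return [part.lower() for part in normalized.split("/") if part]


def overlaps(a, b):
    """True if one address equals the other or is a parent folder of it."""
    sa, sb = _segments(a), _segments(b)
    shorter = min(len(sa), len(sb))
    # Compare whole segments, so "reports" does not match "reports-archive".
    return sa[:shorter] == sb[:shorter]


def find_overlaps(start_addresses):
    """Return every pair of start addresses that overlap."""
    return [
        (a, b)
        for a, b in combinations(start_addresses, 2)
        if overlaps(a, b)
    ]


# Made-up example addresses.
addresses = [
    r"\\fileserver\chem\reports",
    r"\\fileserver\chem\reports\2023",
    r"\\fileserver\chem\reports-archive",
    "D:/data/eln",
]

for a, b in find_overlaps(addresses):
    print(f"Overlap: {a} <-> {b}")
```

Comparing whole path segments rather than raw string prefixes avoids flagging
sibling folders whose names merely start with the same characters.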