Content sources are the document repositories where all the information resides. ChemLocator supports many kinds of repositories like:
ChemLocator can connect to the above repositories and make them chemically and free-text searchable.
To see the registered content sources, open the Administration web pages, click the Indexing menu, then click the Content sources menu item:
The main page for the content source administration:
If any enabled settings can slow down indexing, a yellow warning bar is shown to inform the user. Such settings are typically time-consuming property calculations, optical structure recognition, and optical character recognition.
The following actions are available on the Content Sources page:
This button can be used to register a new content source.
A full crawl is the pre-processing mode in which every document in the repository is re-interpreted: all content is read and processed again.
Full crawl is usually required when:
Incremental crawls are much faster than full crawls because they handle only the changes in the repository. If a document has been changed in any way (metadata, content, etc.), the incremental crawl picks it up and processes it.
Incremental crawl is usually useful when:
Stops an indexing run before it is completed.
Note that stopping does not occur immediately after the button is pressed; it takes some time, depending on the type of the content source and the status of the indexing. The status "Stopping" indicates that the content source is being stopped.
Deletes an existing content source. All the indexed data will be removed.
You can set a schedule for each content source (Content sources) if you want the program to start crawling automatically rather than starting the process manually.
Please make sure that the start addresses of the content sources don't overlap. Indexing the same folder via more than one content source is not supported.
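Before registering several content sources, it can help to check their start addresses for overlap. The following Python sketch is an illustration only and is not part of ChemLocator. The function names `normalize` and `find_overlaps`, the example source names, and the example addresses are all hypothetical. It assumes that addresses are folder-style paths compared segment by segment and case-insensitively, which may not match how a given repository type treats its addresses.

```python
def normalize(address: str) -> tuple[str, ...]:
    """Split a start address into lower-cased path segments.

    Assumption: backslashes and forward slashes are equivalent separators
    and comparison is case-insensitive (as on Windows file shares).
    """
    cleaned = address.replace("\\", "/").strip().lower()
    return tuple(part for part in cleaned.split("/") if part)


def find_overlaps(sources: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of content source names whose start addresses overlap.

    Two addresses overlap when they are identical or when one is a
    parent folder of the other.
    """
    items = sorted(sources.items())
    overlaps = []
    for i, (name_a, addr_a) in enumerate(items):
        parts_a = normalize(addr_a)
        for name_b, addr_b in items[i + 1:]:
            parts_b = normalize(addr_b)
            shorter = min(len(parts_a), len(parts_b))
            # One path contains the other if they agree on every
            # segment of the shorter path.
            if parts_a[:shorter] == parts_b[:shorter]:
                overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical content sources and start addresses.
example_sources = {
    "Lab reports": r"\\fileserver\docs\reports",
    "Archive": r"\\fileserver\docs\reports\2019",
    "Patents": r"\\fileserver\patents",
}

for first, second in find_overlaps(example_sources):
    print(f"Overlapping start addresses: {first!r} and {second!r}")
```

In this example, "Archive" points to a subfolder of "Lab reports", so the pair is reported. Because the comparison works on whole path segments, sibling folders such as `reports` and `reports2` are not treated as overlapping.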