ChemCurator is a standalone desktop application from ChemAxon for computer-aided chemical information extraction. To run the application, you need to download and run the ChemCurator installer. Some short video tutorials demonstrating the main functionality are also available.
The main menu contains "File", "View", "Window" and "Help" elements.
Most of the panels and views in ChemCurator can be resized or moved to a different location or screen, depending on your preferences. The default settings can be restored with the Reset Windows function.
The Project explorer panel displays the opened projects and represents each project's structure as a tree-like hierarchy. Every project represents one document, and you can add as many Markush structures and compound lists to it as you want. Every Markush structure automatically has an Exemplified structures list.
Document view is the viewer component for the annotated documents and the related selections. The recognized chemical entities are highlighted in gray. In structure selection mode, users can select recognized chemical structures by clicking on any highlighted component, or select a larger part of the document by pressing the left mouse button and dragging over the targeted part of the document. The selected structures are highlighted in red and displayed under the document in the selection panel. In text selection mode, users can select the document text directly. Document linking turns on automatic scrolling of the document based on the structure selections in the editor views. The document's zoom level can also be changed.
Compounds view is the display component for compound lists. It can handle not only chemical structures but also the related additional information columns. Data can be edited by double-clicking on any of the cells.
Markush Editor View is the display component for Markush structures and the related exemplified structures. It is based on the same component as the Markush Editor desktop application; therefore, the details of editing Markush structures are available in the Markush Editor documentation. Compared with that application, Markush Editor View contains an additional bottom row containing the exemplified structures related to the Markush structure. Exemplified structures are continuously validated against the Markush structure: examples matching the Markush structure are highlighted in green, and non-matching structures are highlighted in red.
The Structure checker panel displays the structure drawing errors and warnings related to the active editor component. In the case of an error, an exclamation mark in a red circle appears; in the case of a warning, an exclamation mark in a yellow triangle appears. By clicking on the checker items, you can choose between the available automatic fixer options. You can fix the issues one by one with the Fix Selected button or all together with the Fix All button.
In ChemCurator, every project represents a document, and the extracted chemical information belongs to this document. ChemCurator offers multiple project creation options based on different source formats. Independently of the original format, every document is converted to an annotated HTML that preserves the structure and layout of the original document. The duration of the annotation process strongly depends on the format, size, and content of the original document. The new project wizard is available from File>New Project... or from the corresponding icon on the main toolbar.
A project can be created from a file stored on your local machine. ChemCurator can process PDF, HTML, XML and TXT documents.
Patent documents can be imported directly from Google Patents by using the publication number of the document. The import wizard automatically tries to find the corresponding document in Google Patents and downloads the HTML version of the patent. For most non-English patents, a machine-translated English version is available in Google Patents. If you want to download the original version, select Original in the language preferences.
If you have IFI Claims access, you can also import documents directly from IFI Claims. The import wizard automatically tries to find the corresponding document in IFI Claims and downloads the HTML version of the patent.
With the create demo project function, an example project can be created that contains the annotated version of the US6756383B2 patent document from Google Patents and some curated data, including a Markush structure and a compound list.
With the annotation configuration, you can fine-tune the annotation parameters according to your needs. The settings panel is available from File>Options... or from the corresponding icon on the main toolbar.
ChemCurator offers multiple functions to help with the recognition and extraction of the relevant chemical information from documents.
ChemCurator supports two types of chemical information: Markush structures and compound lists. Markush structure objects are always created together with a linked special compound list, the Examples.
Any annotated structure can be selected from the document. After selection, it can be moved using drag and drop from the selected structures view to editor components.
The Compounds extraction wizard is available in the Compounds and Markush views. This wizard helps to automatically find and extract a large number of chemical structures from the documents. In the first panel of the wizard, some basic filter criteria are available.
The extraction process can be parametrized with some filter options.
Filter duplicates: Ignore duplicates by extracting only the first occurrence of each compound from the document.
Minimum mass: Set a minimum molecular mass filter criterion.
Maximum mass: Set a maximum molecular mass filter criterion.
Structure filtering options:
None: The structure filter option is ignored.
Substructure: A substructure filter criterion can be set after clicking on the Next button.
Similarity with threshold: A similarity filter criterion can be set after clicking on the Next button. An MCS-based similarity calculation is executed in the background, and structures are filtered by the Tanimoto similarity of the structures.
If Substructure or Similarity with threshold is selected, clicking on the Next button takes you to the second tab of the extraction wizard. In the case of Similarity with threshold, only exact compounds without any variability feature can be used as a filter. In the case of Substructure, atom lists, bond lists, and any query property can be used.
After clicking on the Finish button, the extraction starts. In the case of Similarity with threshold, an additional column containing the similarity value of each compound is added to the extracted compounds.
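The way the filter options above combine can be sketched in Python. This is only an illustration and not ChemCurator's implementation: the `Compound` class, its fields, and the `filter_compounds` function are hypothetical names, and the similarity here is a plain Tanimoto coefficient on abstract feature sets standing in for the MCS-based calculation that ChemCurator runs in the background.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for an extracted compound (not a ChemCurator type)."""
    key: str              # canonical identifier used to detect duplicates
    mass: float           # molecular mass
    features: frozenset   # abstract feature set, a stand-in for real structure data


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def filter_compounds(compounds, min_mass: Optional[float] = None,
                     max_mass: Optional[float] = None,
                     filter_duplicates: bool = True,
                     query: Optional[Compound] = None,
                     threshold: float = 0.0):
    """Return (compound, similarity) pairs that pass all filters, in document order."""
    seen = set()
    result = []
    for compound in compounds:
        # Filter duplicates: keep only the first occurrence of each compound.
        if filter_duplicates:
            if compound.key in seen:
                continue
            seen.add(compound.key)
        # Minimum and maximum mass filters.
        if min_mass is not None and compound.mass < min_mass:
            continue
        if max_mass is not None and compound.mass > max_mass:
            continue
        # Similarity with threshold: the value is kept as an extra column.
        similarity = None
        if query is not None:
            similarity = tanimoto(compound.features, query.features)
            if similarity < threshold:
                continue
        result.append((compound, similarity))
    return result


compounds = [
    Compound("A", 180.2, frozenset({1, 2, 3})),
    Compound("A", 180.2, frozenset({1, 2, 3})),     # duplicate, skipped
    Compound("B", 46.1, frozenset({1})),            # below the minimum mass
    Compound("C", 194.2, frozenset({1, 2, 3, 4})),
]
query = Compound("Q", 180.2, frozenset({1, 2, 3}))
hits = filter_compounds(compounds, min_mass=100, query=query, threshold=0.7)
```

With these made-up values, compounds A and C pass the filters, and each hit carries its similarity value, mirroring the additional column that the wizard adds for Similarity with threshold.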
Compounds view is capable of handling not only the chemical structures but also the related assay data, properties, comments, etc. You can manually add this information to the compound lists using the Create new column function.
A simple dialog opens where the name and type of the new column can be selected. The newly created column can be edited by simply double-clicking on it.
Markush fragments and compounds can be added manually from the context menu of the fragment and compound lists, and with the Add new row menu item of the compounds view.
A manually added compound can be linked to the corresponding part of the document. After right-clicking on any structure, you can select the Add reference to document... function to specify the corresponding part of the document. After reverse linking is started, the document view enters reverse linking mode and any part of the document can be selected. After you select the corresponding part of the document and click on OK, the selected part of the text is marked as a chemical entity and linked to the manually added compound. If the Add to local dictionary check box is selected, the selected text and the linked compound are added to the ChemCurator dictionary and will be recognized during the next annotation.
The Import compounds wizard can add a compounds file with molecule properties to the selected project as a new Compounds List and automatically associate the imported compounds with their first occurrence in the document.
The accuracy of structure recognition is not 100%, so annotated documents always contain some unrecognized or misrecognized structures.
Text- and image-based misrecognized structures can be fixed by selecting the problematic structure in the document view, and then either double-clicking on it in the selection view or right-clicking on it and choosing the edit option.
Unrecognized structures can be annotated with the Fix annotation menu item.
After clicking on this menu item, the document view enters reverse linking mode and any part of the document can be selected.
After clicking on the OK button, an interactive text fixing dialog opens. If the modified text can be recognized, the recognized structure appears under the text input field. The structure is updated immediately after any modification of the text. Potentially problematic parts of the chemical names are underlined. After successful fixing, the recognized chemical structure can be added to the corresponding part of the document by clicking on the OK button.
Any unwanted annotation can be removed by selecting the problematic structure in the document view, right-clicking on it in the selection view, and choosing the Remove option.
ChemCurator offers multiple options for project sharing and export of the annotated data in various formats.
ChemCurator Integration Server is the standard way to share your projects with your colleagues and store them in a central database. For the server installation details, please check the Integration Server Administrator Guide. Additionally, you need to configure the server connection details in the ChemCurator desktop application by following the corresponding section of the Installation Guide.
After successful sharing, a new indicator icon appears next to the project, and you are able to upload your modifications or download the newer version of the project.
The structure export function is available in the Compounds and Markush views. The structures and related information from the view can be exported in various file formats.
A project can be exported to a ZIP file with File>Export Project to ZIP... so that the project can be easily shared by e-mail or any file sharing method.
The zipped project can be imported in a similar way with the File>Import Project from ZIP... function.
All projects are available in project directories. The default location of the projects is the C:\Users\<user name>\Documents\ChemCurator directory. The name of the project directory is the project name. Every project contains a project file (an XML file with some meta-information), a document HTML with the connected resources, and the extracted chemical information in SDF (compound lists) and MRV (Markush structures) formats.
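Because a project is stored as ordinary files, its contents can be inspected outside ChemCurator. The following Python sketch groups the files of a project directory by the roles described above and counts the records in an SDF compound list. The function names `summarize_project` and `count_sdf_records` are hypothetical, the extension-to-role mapping is an assumption based only on the description above, and the project directory is passed in explicitly rather than guessed. The record count relies on the SDF convention that each record ends with a line containing `$$$$`.

```python
from collections import defaultdict
from pathlib import Path

# Assumed mapping from file extension to role, following the layout described above.
ROLES = {
    ".xml": "project file",
    ".html": "annotated document",
    ".sdf": "compound list",
    ".mrv": "Markush structure",
}


def summarize_project(project_dir):
    """Group the file names found under project_dir by their assumed role."""
    summary = defaultdict(list)
    for path in sorted(Path(project_dir).rglob("*")):
        if path.is_file():
            # Anything else is treated as a resource connected to the document.
            role = ROLES.get(path.suffix.lower(), "other resource")
            summary[role].append(path.name)
    return dict(summary)


def count_sdf_records(sdf_text):
    """Count SDF records; each record is terminated by a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")
```

For example, `summarize_project` could be called with the path of one project directory, and `count_sdf_records` with the text read from one of its SDF files, to check how many compounds a list contains before sharing the project.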