SD file import basic visualization and overlap analysis

    IJC tutorial: SD file import, basic visualization and overlap analysis


    This tutorial introduces the basic operations in Instant JChem (IJC). It guides you through importing a file, calculating chemical properties with Chemical Terms fields, and visualizing the distribution of those properties. It also gives step-by-step guidance on identifying structures common to different files by overlap analysis.


    Importing a SD file

    Let's start with the import of a single SD file. Before the actual import, a new project must be created. Use the File -> New Project... menu entry or the corresponding toolbar icon (shortcut: Ctrl+Shift+N). Create a new project and choose IJC Project (with local database).


    We will use the MDL SD file substances1.sdf, packed in this zip file:

    A second file, substances2.sdf, will be used later for the overlap analysis:

    File import to the fresh local database project

    1. A file can be imported through the File -> Import File... menu. Alternatively, right-click the schema in the Projects window and choose Import File...

    <img src="images/download/attachments/1802560/2_1_import_menu.png" alt="images/download/attachments/1802560/2_1_import_menu.png"/>

    1. Choose the file to import.

    1. The Import File dialog pops up; proceed with the Next button.


    1. This leads to the Field details tab, where you can choose which fields in the file should be imported. The default settings should be fine.


    1. After the import has finished, the default grid view containing the molecules and the corresponding fields is shown.

    <img src="images/download/attachments/1802560/2_5_import_wiz3.png" alt="images/download/attachments/1802560/2_5_import_wiz3.png"/>
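
    For readers curious about what the import wizard reads, the layout of an SD file can be sketched in a few lines of Python. This is only an illustration of the file format, not how Instant JChem performs the import: each record holds a molfile block, then data items introduced by a `> <FIELD>` header, and ends with a `$$$$` line. The function name `parse_sd`, the sample records and the field names `ID` and `NAME` are assumptions made for this example.

```python
import re

# A data header looks like "> <FIELD_NAME>"; capture the name in brackets.
FIELD_HEADER = re.compile(r"^>.*<([^>]+)>")


def parse_sd(text):
    """Return a list of (molblock, fields) tuples, one per SD record."""
    records = []
    mol_lines, fields, current = [], {}, None
    in_data = False  # becomes True once the first data header is seen
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record delimiter
            records.append(("\n".join(mol_lines), fields))
            mol_lines, fields, current, in_data = [], {}, None, False
            continue
        header = FIELD_HEADER.match(line)
        if header:
            current = header.group(1)
            fields[current] = ""
            in_data = True
        elif not in_data:
            mol_lines.append(line)  # still inside the molfile block
        elif current is not None and line.strip():
            # Multi-line values are joined with newlines.
            fields[current] = (fields[current] + "\n" + line).lstrip("\n")
        else:
            current = None  # a blank line ends the current data item
    return records


# Hypothetical two-record sample; the molfile blocks are kept as opaque text.
SAMPLE = "\n".join([
    "methane", "  sketch", "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <ID>", "1", "",
    "> <NAME>", "methane", "",
    "$$$$",
    "water", "  sketch", "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <ID>", "2", "",
    "> <NAME>", "water", "",
    "$$$$",
])

for molblock, fields in parse_sd(SAMPLE):
    print(fields)
```

    The data items found this way correspond to the fields offered on the Field details tab; the molfile block is what ends up in the structure column.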

    Calculating properties

    A Chemical Terms field computes a chemical property directly from the chemical structure.

    1. A new Chemical Terms (CT) field can be created with the toolbar button or by choosing New Chemical Terms Field... from the ... button menu

    <img src="images/download/attachments/1802560/3_1_CT_add_field.png" alt="images/download/attachments/1802560/3_1_CT_add_field.png"/><img src="images/download/attachments/1802560/3_2_CT_add_field_button.png" alt="images/download/attachments/1802560/3_2_CT_add_field_button.png"/>

    1. The New Chemical Terms Field window lets you write an expression in the Chemical Terms language. For now, choose logP from the predefined favorite expressions and add it by clicking Finish.


    1. Add another new CT field for H bond donors.

    <img src="images/download/attachments/1802560/3_4_CT_donors.png" alt="images/download/attachments/1802560/3_4_CT_donors.png"/>

    1. The following screenshot shows the fields in the database. We will use them for visualization in the next section. Molecular Weight and Formula are created automatically by default when a file is imported.

    <img src="images/download/attachments/1802560/3_5_grid_view.png" alt="images/download/attachments/1802560/3_5_grid_view.png"/>
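
    To make the relationship between the Formula and Molecular Weight fields concrete, here is a minimal Python sketch that sums atomic weights over a simple formula string. It is an illustration only and does not reproduce how Instant JChem derives these fields. The function name `molecular_weight` and the small table of rounded atomic weights are assumptions made for this example, and formulas with brackets, charges or isotopes are not handled. Properties such as logP or H bond donor count need the full structure and cannot be obtained this way.

```python
import re

# Rounded standard atomic weights for a few common elements (assumed subset).
ATOMIC_WEIGHTS = {
    "H": 1.008, "C": 12.011, "N": 14.007,
    "O": 15.999, "S": 32.06, "Cl": 35.45,
}

# One element symbol followed by an optional count, e.g. "C6" or "Cl".
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def molecular_weight(formula):
    """Sum atomic weights for a plain formula such as 'C6H6' or 'H2O'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in TOKEN.findall(formula):
        # A missing count means one atom; unknown symbols raise KeyError.
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


print(round(molecular_weight("H2O"), 3))
print(round(molecular_weight("C6H6"), 3))
```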

    Visualizing properties

    This section shows how the calculated property fields can be viewed in Instant JChem using the chart widgets.

    Creating a form view

    1. In the Projects window, right-click the data tree containing the products and select "New View"

    1. Set the View type to "Empty Form View" in the dialogue and click Finish

    1. Go to Design mode by clicking on the Design button in the upper left corner of the Form View

    1. Add a table to the form with the corresponding button in the top toolbar, or through the right-click menu (right-click on the form and select Table)

    1. Choose the fields that should be visible in the table; this binds the table to those fields.


    1. Click Bind; resize and reposition the table according to your needs

    1. Right-click on the form and select Histogram from the menu


    1. Select LogP as the Value to Bin and H bond donors as Category field


    1. Click OK; resize and reposition the chart widget

    1. Similarly, add a Scatter plot with Mol Weight on the X axis and LogP on the Y axis


    1. Click on Browse in the top left corner of the view to see the result of the above steps

    <img src="images/download/attachments/1802560/4_6_form_view_final.png" alt="images/download/attachments/1802560/4_6_form_view_final.png"/>
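
    What the Histogram widget displays for the settings above can be approximated outside the application: each LogP value falls into a bin, and the counts are split by the H bond donors category. The following Python sketch shows that idea under stated assumptions. The function name `binned_counts`, the bin width of 1.0 and the sample rows are invented for illustration and are not taken from substances1.sdf or from the widget's actual binning rules.

```python
import math
from collections import Counter


def binned_counts(rows, bin_width=1.0):
    """Count rows per (donor count, lower edge of the logP bin).

    rows is an iterable of (logp, donors) pairs.
    """
    counts = Counter()
    for logp, donors in rows:
        # Lower edge of the bin that contains this logP value.
        bin_start = math.floor(logp / bin_width) * bin_width
        counts[(donors, bin_start)] += 1
    return counts


# Hypothetical (logP, H bond donors) pairs.
rows = [(0.2, 1), (0.9, 1), (1.5, 2), (-0.3, 1)]

for (donors, bin_start), n in sorted(binned_counts(rows).items()):
    print(f"donors={donors} logP bin starting at {bin_start}: {n}")
```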

    Overlap analysis

    Overlap analysis is useful when two molecule datasets need to be compared. The second molecule dataset, substances2.sdf, is available for download as a sample file.

    Performing overlap analysis

    1. Import the file into a new entity named substances2

    1. Go to Chemistry -> Overlap analysis

    <img src="images/download/attachments/1802560/5_1_overlap_menu.png" alt="images/download/attachments/1802560/5_1_overlap_menu.png"/>

    1. Set the substances1 table as the Query table and substances2 as the Target table


    1. Click Next and select Duplicate search mode


    1. Click Next and, after giving a name to the new overlap field, click Finish

    <img src="images/download/attachments/1802560/5_4_overlap_wiz3.png" alt="images/download/attachments/1802560/5_4_overlap_wiz3.png"/>
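
    Conceptually, a duplicate-mode overlap analysis records, for every query structure, how many identical structures exist in the target. The Python sketch below illustrates that bookkeeping under a strong assumption: every structure has already been reduced to a canonical key string (for example a canonical SMILES produced by one and the same tool), so that equal structures have equal keys. Instant JChem compares the structures themselves; the function name `overlap_counts` and the sample keys are hypothetical.

```python
from collections import Counter


def overlap_counts(query_keys, target_keys):
    """For each query key, return how many times it occurs in the target."""
    target = Counter(target_keys)
    # Counter returns 0 for keys it has never seen.
    return [target[key] for key in query_keys]


# Hypothetical canonical keys standing in for the two tables.
substances1 = ["CCO", "c1ccccc1", "CC(=O)O"]
substances2 = ["CCO", "CCO", "CCN"]

counts = overlap_counts(substances1, substances2)

# Structures with an overlap count of 0 occur only in the query table.
unique_to_query = [k for k, n in zip(substances1, counts) if n == 0]
print(counts)
print(unique_to_query)
```

    Selecting the entries with a count of 0 mirrors a query for Overlap count = 0 on the query table.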

    Exporting query structures not present in the Target table

    1. Go to Grid view for substances1

    1. Include the overlap analysis results in the grid view by selecting Customize Widget Settings in the ... button menu. Then press the Modify Fields button and add both result fields to the table.

    <img src="images/download/attachments/1802560/5_6_widget_modify.png" alt="images/download/attachments/1802560/5_6_widget_modify.png"/><img src="images/download/attachments/1802560/5_7_widget_add.png" alt="images/download/attachments/1802560/5_7_widget_add.png"/>

    1. Switch to Query mode in the top left corner and set Overlap count = 0


    1. Run query; unique structures are shown in the Browse mode


    1. You can export the unique structures by using the File -> Export to file... menu entry


    Congratulations! You have just compared the content of two structure files and exported the structures unique to the first file. In this tutorial you have learned:

    • How to create a new project

    • How to import a file

    • How to estimate chemical properties from a chemical structure by using Chemical Terms fields

    • How to visualize chemical properties

    • How to run overlap analysis