If your experimental data could improve on the predictions of the default logP calculator, you can use the supervised logP learning method that is built into the calculator.
If you create a local logP model, the scope of the logP calculator will be limited: it will give reasonable predictions only for the types of structures that were present in the training set. For example, if the training set contains only certain types of hydrocarbons and no other functional groups, the predicted logP of an amine-like molecule will not be accurate.
A robust, general logP model therefore requires a large, diverse training set with thousands of structures. You can generate a logP training library with the cxtrain command line tool.
As the first step of the training, create a training set from your experimental data. The training set must be stored in a format that supports saving molecular properties (SDF or MRV). This can easily be done in the graphical user interface of Instant JChem. This training set must contain the following items:
Fig. 1 Example file used for training
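If you prefer to assemble the training file with a script instead of Instant JChem, the following minimal Python sketch shows one way to attach an experimental value to each structure as an SDF data item. The helper names, the methane molblock and the value 1.0 are illustrative assumptions, not part of the product; the tag name LOGP is chosen only because it matches the -t LOGP option used in the command in this document.

```python
# Hypothetical helpers (not part of any ChemAxon tool): build SDF records
# that carry an experimental logP value in a data item.

def sdf_record(molblock, logp, tag="LOGP"):
    """Return one SDF record: molblock, data item, record terminator."""
    body = molblock.rstrip("\n")
    if not body.endswith("M  END"):
        raise ValueError("molblock must end with 'M  END'")
    # SDF layout: molblock, ">  <TAG>" header, value line, blank line, "$$$$"
    return f"{body}\n>  <{tag}>\n{logp}\n\n$$$$\n"


def write_training_set(path, entries, tag="LOGP"):
    """Write (molblock, experimental_logp) pairs to an SDF file."""
    with open(path, "w", encoding="utf-8") as handle:
        for molblock, logp in entries:
            handle.write(sdf_record(molblock, logp, tag))


# Illustrative V2000 molblock for a single carbon atom (methane).
METHANE = "\n".join([
    "methane",
    "  illustrative",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
])

# 1.0 is a placeholder, not a measured value; use your own experimental data.
record = sdf_record(METHANE, 1.0)
```

Calling write_training_set("training_set.sdf", [(METHANE, 1.0)]) would write a one-record file; a real training set needs many diverse structures, as noted above.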
Then run the training algorithm, which creates a logP training library from your pre-compiled set. Execute the following command from the command line:
cxtrain logp -t LOGP -i [library name] -a [training file]
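When the training step is part of a scripted workflow, the command above can be assembled programmatically. The sketch below only builds the argument list shown in this document; the function name and the file name training_set.sdf are hypothetical, and it assumes that cxtrain is available on your PATH if you decide to run the result.

```python
import shlex
from typing import List


def cxtrain_logp_command(library_name: str, training_file: str,
                         tag: str = "LOGP") -> List[str]:
    """Build the argument list for the cxtrain call shown in this document."""
    return ["cxtrain", "logp", "-t", tag, "-i", library_name, "-a", training_file]


# "training_set.sdf" is a hypothetical file name used for illustration.
cmd = cxtrain_logp_command("myplogp", "training_set.sdf")
print(shlex.join(cmd))

# To actually run it (assumes cxtrain is installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems when the library name or file path contains spaces.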
Fig. 2 The logP options window showing how to apply the training library
To apply your logP training library, use the --method and --trainingid parameters:
--method user --trainingid [library name] [input file/string]
Without training the result is:
Chemical Terms are available from the Chemical Terms Evaluator or from Instant JChem. The method and trainingid parameters can be used in the Chemical Terms Evaluator as well:
evaluate -e "logp('method:user trainingid:[library name]')" "[input file/string]"
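The Chemical Terms expression is a plain string, so it can be generated for any library name. The following sketch reproduces the expression and the evaluate call shown above; the function names and the input file name input.sdf are hypothetical, and the library name myplogp is used only as an example.

```python
from typing import List


def logp_expression(library_name: str) -> str:
    """Chemical Terms expression selecting a user-trained logP library."""
    return f"logp('method:user trainingid:{library_name}')"


def evaluate_command(library_name: str, input_spec: str) -> List[str]:
    """Argument list for the evaluate call shown in this document."""
    return ["evaluate", "-e", logp_expression(library_name), input_spec]


expression = logp_expression("myplogp")
print(expression)

# "input.sdf" is a hypothetical input file name.
print(evaluate_command("myplogp", "input.sdf"))
```

The same expression string can be pasted into the Chemical Terms field in Instant JChem.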
You can also apply your logP training library via Chemical Terms in Instant JChem.
The following figure shows how logP training is used in the 'New Chemical terms' window. The expression logp('method:user trainingid:myplogp') specifies that the plugin uses the user-defined logP training library myplogp.
Fig. 3 Using Chemical Terms function for training in Instant JChem
Part of the results of this calculation is presented below. You can see the difference between the untrained (column LogP) and trained (column trained LogP) values.
Fig. 4 JChem table showing the untrained and trained logP values