Post subject: Controlling the number of substructure matches!!
Posted: Wed Apr 23, 2008 8:52 am
Hi - the enclosed Java code should, when given the "-uniquess" flag (unique SubStructure), return from the database in question ONLY compounds containing a SINGLE occurrence of the given query (passed as "-q query-smarts-or-whatever-file"). It does not, although I played around extensively with (what I thought to be) both the deprecated and non-deprecated syntaxes for setting and getting MatchCountOptions. Whatever I do, I get the same hit list, in which molecules with multiple occurrences happily show up. However, only ONE of their instances of the matched substructure is colored green in the substructure-sensitive .mrv output format. Could one get them ALL colored?
(login, select the query tool and then create a "ChemAxon" subquery, which will give you the option to do default or Unique substructure matching.)
I need the unique stuff, as you may well guess, for chemical filtering, to avoid picking compounds with multiple reactive groups...
By the way, none of my attempts to see "what's in its head" by trying getMatchCountOptions.toString() or .toList() enlightened me at all. Why can't we just keep it simple: rather than inventing more and more weird data types one does not know how to deal with, an Option String is a String is a String (and may serve on Brazilian beaches as well).
I took a quick look at your code and it had several syntax problems.
Maybe it did not compile and you were running an older class version, which would explain why you saw no change in behavior?
Please find the syntactically fixed file attached.
(have not checked semantically yet)
This was a bug in the 5.0.x series.
A parameter was not passed on, so the setting (made either the old or the new way) had no effect at all.
We have fixed it, it will work fine from the next release.
Some comments about the following line in your code:
Code:
searchOptions.setMatchCountOptions(" < ",2);
- Please use "<" instead; no leading or trailing spaces are accepted.
(otherwise it will throw an exception in the fixed version)
- Please be aware that ("<",2) also includes hits with 0 occurrences of the query substructure: only structures with 2 or more hits are excluded.
I suggest ("=",1) to achieve the original goal.
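To make the difference between the two settings concrete, here is a minimal, self-contained sketch of how a (relation, count) filter of this kind behaves. It is only an illustration: the class MatchCountDemo and the method accepts are hypothetical names, and this is not ChemAxon's implementation. The strict rejection of " < " merely mirrors the behaviour described above for the fixed version.

```java
// Hypothetical illustration only: NOT ChemAxon's implementation.
// Shows which match counts pass a (relation, threshold) filter.
final class MatchCountDemo {

    /**
     * Returns true if a structure in which the query occurs matchCount
     * times passes the filter given by relation and threshold.
     * The relation string is compared exactly, so " < " (with spaces)
     * is rejected, mirroring the strictness described in the post.
     */
    static boolean accepts(String relation, int threshold, int matchCount) {
        switch (relation) {
            case "<":
                return matchCount < threshold;
            case "=":
                return matchCount == threshold;
            default:
                throw new IllegalArgumentException(
                        "Unsupported relation: '" + relation + "'");
        }
    }

    public static void main(String[] args) {
        // Compare the two settings for structures with 0..3 occurrences.
        for (int count = 0; count <= 3; count++) {
            System.out.println(count + " occurrence(s): (\"<\",2) -> "
                    + accepts("<", 2, count)
                    + ", (\"=\",1) -> " + accepts("=", 1, count));
        }
    }
}
```

With ("<",2) a structure with zero occurrences passes the filter, whereas ("=",1) keeps only structures with exactly one occurrence, which is what a "single reactive group" filter needs.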
Thanks a lot for solving this quickly! Actually, I have no syntax troubles when compiling with my blend of javac (maybe it's due to l'exception culturelle française, by Nico BlingBling Sarko).
As far as the "=",1 is concerned - that's what I tried in the first place, it was only due to despair that I finished up trying funny stuff such as "<",2... but I never would have guessed that "<" should be right (DOCUMENTATION!!!)
See you tomorrow; Cheers!
Szilard
ChemAxon personnel
Quote:
but I never would have guessed that "<" should be right (DOCUMENTATION!!!)
Please note that these characters are not displayed correctly in the notification e-mail, only when viewing the HTML in the forum. It was a "less than" sign.