Treelet kernel incorporating cyclic, stereo and inter pattern information in Chemoinformatics

Benoit Gauzere, Pierre-Anthony Grenier, Luc Brun and Didier Villemin

Chemoinformatics is a research field concerned with the study of the physical or biological properties of molecules through computer science methods such as machine learning and graph theory. From this point of view, graph kernels provide a convenient framework that naturally combines machine learning and graph theory techniques. Graph kernels based on bags of patterns have proven effective on several problems, in terms of both accuracy and computational time. The treelet kernel is a graph kernel based on a bag of small subtrees. In this paper we propose several extensions of this kernel devoted to chemoinformatics problems. These extensions aim to weight each pattern according to its influence, to include the comparison of non-isomorphic patterns, to include stereo information and, finally, to explicitly encode cyclic information in the kernel computation.
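
To make the notion of a bag-of-patterns kernel with per-pattern weights concrete, the following is a minimal sketch and not the kernel defined in this paper. It makes several simplifying assumptions: patterns are restricted to labeled paths of one, two or three atoms (a small subset of the treelets), pattern counts are compared with a plain product rather than a dedicated sub-kernel, and the function names `path_patterns` and `bag_kernel`, as well as the `weights` argument, are hypothetical names introduced only for illustration.

```python
from collections import Counter
from itertools import combinations


def path_patterns(labels, edges):
    """Count labeled paths of 1, 2 and 3 nodes in an undirected graph.

    labels: dict mapping node -> atom label
    edges:  iterable of (u, v) pairs, without self-loops or duplicates
    Simplification (assumption): real treelets also include larger
    subtrees such as stars; only short paths are enumerated here.
    """
    adjacency = {node: set() for node in labels}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    bag = Counter()
    # Single atoms.
    for node in labels:
        bag[(labels[node],)] += 1
    # Bonds: sort the two labels so both reading directions coincide.
    for u, v in edges:
        bag[tuple(sorted((labels[u], labels[v])))] += 1
    # Three-atom paths: pick a centre and an unordered pair of neighbours.
    for centre, neighbours in adjacency.items():
        for a, b in combinations(neighbours, 2):
            first, last = sorted((labels[a], labels[b]))
            bag[(first, labels[centre], last)] += 1
    return bag


def bag_kernel(bag1, bag2, weights=None):
    """Weighted sum, over shared patterns, of the product of their counts.

    weights: optional dict pattern -> weight (hypothetical parameter);
    patterns missing from it default to a weight of 1.0.
    """
    weights = weights or {}
    shared = bag1.keys() & bag2.keys()
    return sum(weights.get(p, 1.0) * bag1[p] * bag2[p] for p in shared)


# Heavy-atom skeletons of ethanol (C-C-O) and dimethyl ether (C-O-C).
ethanol = path_patterns({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
ether = path_patterns({0: "C", 1: "O", 2: "C"}, [(0, 1), (1, 2)])
similarity = bag_kernel(ethanol, ether)
```

Passing a `weights` dictionary lets some patterns contribute more than others, which is the general idea behind weighting each pattern according to its influence; how such weights would be chosen is not covered by this sketch.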