Exploration on Deep Drug Discovery: Representation and Learning

Liu, Shengchao

File(s)

tech report (2.640Mb)

Date

2018-09-20

Abstract

Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the dataset and evaluation strategy. We start by considering a wide range of ligand-based machine learning and docking-based approaches for virtual screening, and present a strategy for choosing the best algorithm for prospective compound prioritization. During this process, we find that the input information can affect model performance. We therefore examine the impact of different levels of molecular representation and introduce N-gram graph, a novel representation for molecular graphs. Paired with traditional machine learning models, N-gram graph reaches state-of-the-art performance. We also observe that multi-task learning can hurt performance on some individual tasks. To address this, we propose a reinforced multi-task learning (RMTL) framework; preliminary results show that RMTL can address the issue in the two-task case.

Subject

deep learning

graph representation

reinforcement learning

multi-task learning

negative transfer

Permanent Link

http://digital.library.wisc.edu/1793/78768

Citation

TR1854

Part of

CS Technical Reports