The dynamic treatment regime (DTR) is an emerging paradigm in medical research that establishes data-driven treatment rules for future patients based on individuals' clinical histories. Because patients differ in their genetic and clinical characteristics, this approach is more attractive than the standard practice of allocating the same treatment to all patients regardless of individual status. Many methods have been proposed for this problem, but most assume complete data without censoring. When censoring is present, as in many longitudinal studies, these standard methods are not directly applicable. In this paper, we propose accountable survival contrast-learning algorithms for estimating optimal dynamic treatment regimes. To account for censoring, we adopt the pseudo-value approach, which replaces survival quantities with pseudo-observations for time-to-event outcomes. Our estimation procedure builds on a weighted classification scheme and is doubly robust to model misspecification through a more flexible contrast weight function. We further use SCAD penalization to select informative clinical variables in a sparse model setting, and we explore an extension to multiple treatment allocation by searching upper and lower bounds of the objective function. We demonstrate the utility of our proposal through extensive simulations and an application to AIDS data.
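
As a rough illustration of the pseudo-value idea, the sketch below computes leave-one-out (jackknife) pseudo-observations for the survival probability at a fixed time point, using a plain Kaplan-Meier estimator. It assumes that the target survival quantity is S(t) at a single time t; the abstract does not state which quantity the proposed method uses, and the function names `km_survival` and `pseudo_values` are hypothetical, chosen only for this example.

```python
import numpy as np

def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.

    time  : observed follow-up times
    event : 1/True if the event was observed, 0/False if censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in np.unique(time[event]):
        if u > t:
            break
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_values(time, event, t):
    """Jackknife pseudo-observations for S(t): n*S_hat - (n-1)*S_hat without subject i."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t)
    keep = ~np.eye(n, dtype=bool)  # row i is a mask that drops subject i
    loo = np.array([km_survival(time[keep[i]], event[keep[i]], t) for i in range(n)])
    return n * full - (n - 1) * loo

# Illustrative toy data (not from the paper): the third subject is censored.
toy_time = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_event = [1, 1, 0, 1, 1]
toy_pseudo = pseudo_values(toy_time, toy_event, t=3.5)
```

Each subject, censored or not, receives one pseudo-observation, so the resulting vector can stand in for the response in methods designed for complete data. When no observation is censored, these pseudo-observations reduce to the indicators of surviving beyond t.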