Several applications of quantum machine learning (QML) rely on a quantum measurement followed by training algorithms on the measurement outcomes. However, recently developed QML models, such as variational quantum circuits (VQCs), can be implemented directly on the state of the quantum system (quantum data). Here, we propose using a qubit as a probe to estimate the degree of non-Markovianity of the environment. Using VQCs, we find an optimal sequence of qubit-environment interactions that yields accurate estimates of the degree of non-Markovianity for the amplitude damping model, the phase damping model, and a combination of the two. This work contributes to practical quantum applications of VQCs and delivers a feasible experimental procedure for estimating the degree of non-Markovianity.