By leveraging knowledge from separately trained single-task models, we propose stochastic hyperparameter averaging (SHA), a simple and principled algorithm for multitask Gaussian processes (GPs) that improves generalization. Specifically, we focus on multivariate time series learning and aim to improve generalization in both extrapolation and interpolation. The knowledge of a single task is extracted by a GP trained separately on one task-specific dimension of the multivariate time series. Each single-task GP (STGP) uses the same kernel as the latent functions of the multitask GP. By averaging the hyperparameters of the separate STGPs to initialize the latent functions of the multitask GP, SHA identifies solutions that are significantly better than those found by popular training methods, while requiring only a few training steps for the STGPs. SHA is kernel-agnostic and remarkably straightforward to implement. Across a diverse set of multivariate time series tasks, SHA attains a significant boost in test accuracy on interpolation and extrapolation, and it remains robust to varying model complexity and insensitive to different hyperparameter initializations.
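The averaging step at the core of SHA can be illustrated with a minimal sketch. It assumes that each STGP exposes its kernel hyperparameters as a dictionary of scalars, that the average is a plain arithmetic mean, and that every latent function of the multitask GP starts from the same averaged values. The function name `average_hyperparameters`, the hyperparameter names, and the numeric values are all hypothetical and serve only as an illustration; the sketch omits STGP training and the multitask GP itself.

```python
from statistics import fmean


def average_hyperparameters(stgp_hypers):
    """Average each named kernel hyperparameter across single-task GPs.

    Assumption: every STGP uses the same kernel, so all dictionaries
    share the same keys, and an arithmetic mean is used for averaging.
    """
    if not stgp_hypers:
        raise ValueError("need at least one set of STGP hyperparameters")
    names = stgp_hypers[0].keys()
    if any(h.keys() != names for h in stgp_hypers):
        raise ValueError("all STGPs must share the same kernel hyperparameters")
    return {name: fmean(h[name] for h in stgp_hypers) for name in names}


# Hypothetical hyperparameters of three STGPs, each briefly trained on one
# dimension of a multivariate time series (values are made up).
stgp_hypers = [
    {"lengthscale": 0.5, "outputscale": 1.0},
    {"lengthscale": 1.0, "outputscale": 2.0},
    {"lengthscale": 1.5, "outputscale": 3.0},
]

# Averaged values used to initialize the multitask GP.
init = average_hyperparameters(stgp_hypers)

# Assumption: each latent function receives its own copy of the averaged values.
num_latents = 2
latent_init = [dict(init) for _ in range(num_latents)]
```

In practice, the averaged values would be copied into the kernel of each latent function before the multitask GP is trained as usual. Whether to average in the raw or in a transformed (for example, logarithmic) parameter space is a design choice this sketch leaves open.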