In machine learning (ML), particularly in fields such as natural language processing and computer vision, developing state-of-the-art models is hampered by the high computational power required for training. These models usually require vast datasets and long training periods, resulting in substantial costs and environmental impact. Even though extremely large-scale models achieve promising few-shot and zero-shot performance without further fine-tuning, they still lag behind fine-tuned alternatives by substantial margins. This research explores training ML models with smaller yet highly representative subsets of datasets, selected via submodular data selection. We propose Submodular Subset Selection with Importance Sampling (SuBIS), a two-stage method that combines clustering-based importance sampling with submodular function maximization. This approach is designed to enhance dataset diversity while reducing computational demands. Our experiments indicate that training models on as little as \(10\%\) of the original dataset, carefully selected, can achieve performance within three standard deviations of that attained with the full training dataset. Moreover, SuBIS demonstrates its efficacy in scaling submodular functions to extremely large datasets: it reduces their runtime by nearly a factor of \(10\) without any deterioration in downstream classification performance.
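As a rough sketch of the two-stage idea described above, the example below clusters the data, importance-samples a candidate pool from the clusters, and then greedily maximizes a submodular facility-location function over that pool. The specific choices here (k-means clustering, sampling weights proportional to distance from the cluster centroid, an RBF similarity, and the candidate-pool and budget sizes) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a two-stage "cluster -> importance-sample -> submodular select"
# pipeline in the spirit of SuBIS. All concrete choices are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans


def importance_sample(X, n_clusters, n_candidates, rng):
    """Stage 1: cluster the data and importance-sample a candidate pool."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    # Assumed importance weights: points far from their centroid are sampled
    # with higher probability to preserve diversity in the candidate pool.
    dists = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
    probs = dists / dists.sum()
    return rng.choice(len(X), size=n_candidates, replace=False, p=probs)


def facility_location_greedy(X, candidate_idx, budget):
    """Stage 2: greedily maximize a facility-location function over the candidates."""
    C = X[candidate_idx]
    # Similarity between every point and every candidate (RBF kernel, assumed).
    sq = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-sq / X.shape[1])
    selected, best_cover = [], np.zeros(len(X))
    for _ in range(budget):
        # Marginal gain of adding each remaining candidate to the selection.
        gains = np.maximum(sim, best_cover[:, None]).sum(0) - best_cover.sum()
        gains[selected] = -np.inf
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, sim[:, j])
    return candidate_idx[selected]


rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32))                         # toy feature matrix
pool = importance_sample(X, n_clusters=20, n_candidates=400, rng=rng)
subset = facility_location_greedy(X, pool, budget=200)  # ~10% of the data
print(subset.shape)
```

Restricting the quadratic-cost facility-location step to the sampled candidate pool, rather than the full dataset, is one plausible way the submodular stage stays tractable on very large datasets, in line with the runtime savings the abstract reports.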