Recent progress in large transformer-based foundation models has demonstrated impressive capabilities in mastering complex chemical language representations. These models show promise in learning task-agnostic chemical language representations through a two-step process: pre-training on extensive unlabeled corpora and fine-tuning on specific downstream tasks. By leveraging self-supervised learning, foundation models have significantly reduced the reliance on labeled data and task-specific features, streamlining data acquisition and pushing the boundaries of chemical language representation. However, their practical deployment in further downstream tasks is still in its early stages and largely limited to sequence-based problems. The proposed multimodal approach, built on MoLFormer, a chemical large language model, aims to demonstrate the capability of transformer-based models in non-sequential applications such as capturing the design space of liquid formulations. Multimodal MoLFormer exploits the extensive chemical information learned during pre-training on unlabeled corpora to predict the performance of battery electrolytes and demonstrates superior performance compared to state-of-the-art algorithms. The potential of foundation models for designing mixed material systems such as liquid formulations presents a groundbreaking opportunity to accelerate the discovery and optimization of new materials and formulations across various industries.