Machine learning and artificial intelligence (AI) are becoming ubiquitous technologies of the looming Industry 4.0 era. However, the adoption of intelligent automation is limited by hardware constraints on throughput, power consumption, and latency. At a conceptual level, electronics is approaching the end of its scaling laws, and alternative accelerators are being sought. Optical co-processors offer a high degree of algorithmic homomorphism for implementing general matrix-matrix multiplication: multiplication is performed on the fly by electro-optic components, and accumulation is performed by photodetectors. However, recently emerging photonic AI engines are cumbersome to program, follow overhead-heavy scaling laws, or rely on discretely packaged photonic components, all of which reduce performance. Here we introduce a photonic tensor core processor featuring a complete chip-integrated multiply-accumulate engine on the photonic circuits, signal parallelism via wavelength division multiplexing, and tunable, electronically programmable 5-bit weights, all integrated into a small-form-factor silicon photonics platform. This approach yields an AI engine with high throughput efficiency (0.21 TOPS/W). Using this stand-alone hybrid photonic-electronic machine learning accelerator, we demonstrate versatile applications including low-error image filtering and edge detection, augmented reality, and high-accuracy machine learning image classification. This hybrid electronic-photonic tensor processor offers a versatile, low-overhead, and compact solution for extreme-edge AI applications, and can also be used for in-cloud machine learning training tasks.
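As a purely numerical illustration of the multiply-accumulate principle described above, the following minimal sketch models per-channel weighting followed by summation, with weights restricted to 5-bit resolution. It is not the authors' hardware model or software: the function names (`quantize_weight`, `photonic_mac`, `matvec`), the signed uniform weight mapping over [-1, 1], and the treatment of each list index as one wavelength channel are all assumptions made for illustration.

```python
# Illustrative numerical model only; names and the weight mapping are assumptions.

BITS = 5            # weight resolution stated in the abstract
LEVELS = 2 ** BITS  # 32 programmable weight levels


def quantize_weight(w, w_max=1.0):
    """Snap w to one of 32 evenly spaced levels in [-w_max, w_max].

    Assumed mapping: signed and uniform. With an even number of levels,
    zero is not exactly representable in this scheme.
    """
    w = max(-w_max, min(w_max, w))       # clip to the programmable range
    step = 2 * w_max / (LEVELS - 1)      # spacing between adjacent levels
    return round((w + w_max) / step) * step - w_max


def photonic_mac(inputs, weights):
    """One multiply-accumulate: weight each channel, then sum.

    Each list index stands in for one wavelength channel; the sum stands in
    for a photodetector accumulating the weighted signals.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * quantize_weight(w) for x, w in zip(inputs, weights))


def matvec(weight_matrix, inputs):
    """Matrix-vector product built from one MAC per weight row."""
    return [photonic_mac(inputs, row) for row in weight_matrix]


if __name__ == "__main__":
    # Hypothetical example: a 1-D difference (edge-like) kernel applied to
    # one three-sample window of a signal.
    kernel = [[-1.0, 0.0, 1.0]]
    window = [0.2, 0.5, 0.9]
    print(matvec(kernel, window))
```

The quantization step is what distinguishes this from an ideal matrix-vector product: each weight can deviate from its target by up to half a level spacing, which is one source of the filtering error that a finite-resolution weight bank introduces.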