Keypoint detection is a fundamental task in precision livestock farming, enabling automated analysis of animal posture and behaviour through the localisation of anatomical keypoints in RGB images. Despite recent progress in deep learning, existing livestock pose estimation methods often lack systematic evaluation on diverse datasets, and open-source releases remain relatively rare, restricting practitioners to a limited set of available methods. This paper presents a comprehensive benchmark evaluation of three representative state-of-the-art frameworks (SimCC, SAR, and YOLOX-Pose) combined with multiple backbone networks, including ResNet-50, HRNet-W48, LiteHRNet, Swin, and HRFormer. Moreover, publicly available livestock datasets remain few and narrow in scope, typically limited to a single species or environment; to address this gap, we introduce a novel high-resolution multi-species dataset (cattle, horses, and sheep) featuring diverse environments and manual annotations of 18 keypoints with occlusion states. We conduct extensive experiments across species and body regions to evaluate detection accuracy, robustness, and computational efficiency. SimCC-HRNet-W48 achieves the strongest overall accuracy, while SimCC-ResNet-50 provides the best speed–accuracy balance. YOLOX-Pose-s/m are the fastest and most compact models, making them well suited to strict real-time or memory-constrained deployment. For accuracy-first applications, SimCC with HRNet-W48 or Swin is preferred, and SAR-HRNet-W48 remains comparatively robust under complex poses. We further evaluate the models’ zero-shot transferability, the impact of occlusion, and the effects of fine-tuning, and compare deployability on RTX 4090, Jetson AGX Orin, and CPU platforms, providing a comprehensive characterisation and practical recommendations. The proposed dataset and benchmark framework aim to facilitate fair comparison and further development of livestock pose estimation methods.