Fog computing brings computation and services to the edge of the network, enabling real-time applications. To provide a satisfactory quality of experience, the latency of fog networks needs to be minimized. In this paper, we consider a peer computation offloading problem for a fog network with unknown dynamics. Peer competition occurs when different fog nodes (FNs) offload tasks to the same peer FN. We model the computation offloading problem as a sequential FN selection problem with delayed feedback, and construct an online learning policy based on the adversarial multi-armed bandit framework to deal with peer competition and delayed feedback. Simulation results validate the effectiveness of the proposed policy. © 2020 IEEE.