Determining whether four points seen in a single 2D image are coplanar in the scene is difficult because the image carries no 3D information. The discussion centers on the need for a homography matrix to classify pixels as ground or non-ground, a step required for building a navigation map. Without depth information or multiple views taken from different camera positions, coplanarity cannot be established reliably. Suggestions include capturing two images from different angles or exploiting known object sizes and positions. Ultimately, a single-camera setup with no additional constraints cannot determine coplanarity effectively.
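The two-image suggestion can be illustrated with a minimal sketch, assuming a ground-plane homography between the two views is already available (for example, estimated from points known to lie on the floor). A matched point is labelled ground if mapping it through the homography lands close to its match in the second image. The names `H_ground`, `classify_ground`, and `tol_px`, the matrix values, and the pixel tolerance are all hypothetical and chosen only for illustration; this is not the method settled on in the discussion.

```python
import math


def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under H")
    return (u / w, v / w)


def transfer_error(H, pt_a, pt_b):
    """Pixel distance between H applied to pt_a and its observed match pt_b."""
    px, py = apply_homography(H, pt_a)
    return math.hypot(px - pt_b[0], py - pt_b[1])


def classify_ground(H, matches, tol_px=2.0):
    """Return True for each match consistent with the plane that induces H."""
    return [transfer_error(H, a, b) <= tol_px for a, b in matches]


# Hypothetical ground-plane homography from image A to image B (made-up values).
H_ground = [
    [1.0, 0.1, 5.0],
    [0.0, 1.2, -3.0],
    [0.0, 0.001, 1.0],
]

# Synthetic matches: the first obeys H_ground exactly, the second is shifted
# by 15 px, as parallax would do for a point that lies off the ground plane.
on_plane_a = (100.0, 200.0)
on_plane_b = apply_homography(H_ground, on_plane_a)

off_plane_a = (300.0, 400.0)
mapped = apply_homography(H_ground, off_plane_a)
off_plane_b = (mapped[0] + 15.0, mapped[1])

matches = [(on_plane_a, on_plane_b), (off_plane_a, off_plane_b)]
labels = classify_ground(H_ground, matches)
print(labels)
```

Note that four point correspondences in general position always fit some homography exactly, so four matches alone cannot confirm coplanarity. The check only becomes informative when the homography comes from other points known to be on the plane, or when additional correspondences are tested against it.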