Image registration is a fundamental task in computer vision, with wide applications in image stitching, stereo vision, motion estimation, etc. Most current methods achieve image registration by estimating a global homography matrix between candidate images, using either point-feature-based matching or direct prediction. However, because real-world 3D scenes have point-variant photographing distances (depth), a single homography matrix is not sufficient to describe the pixel-wise relations between two images. Some researchers try to alleviate this problem by predicting multiple homography matrices for different patches or segmentation areas in images; in this letter, we tackle this problem with further refinement, i.e., matching images with pixel-wise, depth-aware homography estimation. First, we construct an efficient convolutional network, the DPH-Net, to predict the essential parameters causing image deviation, the rotation (