I describe parallel implementations of a total variation (TV) image-denoising algorithm. Specifically, the primal-dual formulation of the TV-$L^1$ model is implemented in both the shared memory and distributed memory paradigms, using the Open Multi-Processing (OpenMP) API and the Message Passing Interface (MPI), respectively. Experiments on the Stampede supercomputer show good weak and strong scaling performance. The weak scaling results suggest that the distributed memory implementation may be particularly suited to very large problems, where the overhead of sending and receiving messages is much smaller than the local work. A short discussion of applications follows.