An Efficient Implementation for Linear Convolution with Reduced Latency in FPGA