Factors that affect the performance of parallel algorithms in CUDA