\newcommand{\para}[1]{\left(#1\right)}

\section{Experimental Setup}

An optimal choice of nonlinear medium and operating regime enables BS frequency translation in the quantum regime. Two major design decisions affect the choice of medium. First, one must fulfill the phasematching condition for a given set of pumps and signals; given the selectivity of the process, a very specific dispersion profile must be used for any given configuration. The second design decision is the balance between nonlinearity and dispersion properties: an optimal ratio between the pump power required for full conversion and the acceptance bandwidth, as well as the minimum frequency separation $\Delta \omega$ for which cascaded BS becomes prominent. In order to maximize the former with respect to the latter, it is convenient to operate at large dispersion values $|\beta^{(2)}(\omega_{1,2,s,i})| \simeq |\beta^{(3)} \Delta\Omega|$. As a reference, the demonstrations reported above showed translation over $\Delta \Omega > 10$ nm.

As mentioned, one obstacle for most quantum implementations of BS is technical noise, in the form of spurious nonlinear and Raman processes \verb|\cite{Lefrancois_2015}|. Modulation instability (MI) (purple in figure \verb|\ref{fig:tech_noise}|) is a competing FWM process whose gain profile is given by $G_{MI} = 1 + (\gamma P / g_{MI})^2 \sinh^2(g_{MI} L)$, where $g_{MI} = \sqrt{(\gamma P)^2 - (\kappa_{MI} + \gamma P)^2}$ and $\kappa_{MI} = \left[\beta(\omega_p + \delta\omega) + \beta(\omega_p - \delta\omega) - 2 \beta(\omega_p)\right]/2$ are respectively the MI wavevector and the linear phasematching term for a frequency detuning $\delta \omega$ from a strong pump set at $\omega_p$. Modulation instability affects BS in multiple ways, either by depleting the pump and generating spurious sidebands, or by amplifying the vacuum fluctuations and generating pairs of energy-correlated photons.
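
As a brief illustration of how the sign of the dispersion enters the MI gain, the following sketch assumes a Taylor expansion of $\beta(\omega)$ around $\omega_p$ truncated at fourth order and a positive $\gamma$ (neither assumption is stated explicitly above). The linear phasematching term then reduces to
\[
\kappa_{MI} \simeq \frac{1}{2}\beta^{(2)}(\omega_p)\,\delta\omega^2 + \frac{1}{24}\beta^{(4)}(\omega_p)\,\delta\omega^4 .
\]
For $\kappa_{MI} = -\gamma P$ the gain coefficient reaches its maximum, $g_{MI} = \gamma P$, and
\[
G_{MI} = 1 + \sinh^2(\gamma P L) = \cosh^2(\gamma P L),
\]
whereas for $\kappa_{MI} > 0$ (or $\kappa_{MI} < -2\gamma P$) the coefficient is imaginary, $g_{MI} = i\,|g_{MI}|$, and the gain is bounded and oscillatory:
\[
G_{MI} = 1 + \left(\frac{\gamma P}{|g_{MI}|}\right)^2 \sin^2\left(|g_{MI}| L\right).
\]
To leading order in $\delta\omega$, a pump with $\beta^{(2)}(\omega_p) > 0$ therefore experiences no exponential MI gain; the $\beta^{(4)}$ term can modify this conclusion only at large detuning.
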
MI can be managed by placing the pump in the normal dispersion regime, where parametric gain is forbidden and the bandwidth for pair generation is minimized, as well as by detuning the signal far from the pump frequencies.

Spontaneous and stimulated Raman scattering are processes that couple light with the thermal phonon bath of the medium. The scattering strength depends on the density of occupied states as well as on the Raman spectrum $g_r(\delta\omega)$. The probability of spontaneous Stokes scattering (i.e. the photon losing energy) is given by $p_S = g_r(\delta\omega) \frac{1}{1-\exp(-\hbar\delta\omega/ k_b T)}$, while for anti-Stokes scattering it is $p_{AS} = g_r(\delta\omega) \frac{\exp(-\hbar\delta\omega/ k_b T)}{1-\exp(-\hbar\delta\omega/ k_b T)}$. The spectrum depends on the material: amorphous materials have a broad spectrum (e.g. for glass it extends to about $40$ THz \verb|\cite{Stolen_1973}|), while crystalline materials have strong, sharp features. Both processes become less probable for very large detuning, and the anti-Stokes probability depends exponentially on the temperature when $\hbar\delta\omega \gg k_b T$ ($6$ THz at room temperature), so that cooling the fiber can further reduce the noise by several orders of magnitude (red: $300$ K, blue: $90$ K in figure \verb|\ref{fig:tech_noise}|) \verb|\cite{Li_2004, Takesue_2008}|.

In both cases, the noise is reduced the farther we place the signal from the pumps: because of the BS phasematching flexibility, there are no limitations on the amount of detuning $\Delta\Omega$ \verb|\cite{M_chin_2006}| between pumps and signal, the only fundamental parameter being $\omega_{ZDW}$ (rather than $\beta^{(3)}$), which is easily tunable via dispersion engineering. We operate using a dispersion-shifted fiber (Vistacor, Corning): although the fiber is not optimized for nonlinear interactions ($\gamma \simeq 3$ W$^{-1}$km$^{-1}$), a sufficiently long spool makes up for the reduced nonlinear parameter.
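
The temperature dependence of the Stokes and anti-Stokes probabilities given above can be made explicit with a short worked example. This is only a sketch: the detuning $\delta\omega/2\pi = 13$ THz is an assumed illustrative value, chosen near the Raman gain peak of silica, and is not taken from the measurements. Denoting the anti-Stokes probability by $p_{AS}$ and writing $x = \hbar\delta\omega/k_b T$, the two expressions give
\[
\frac{p_{AS}}{p_S} = e^{-x}, \qquad p_{AS} = \frac{g_r(\delta\omega)}{e^{x}-1}.
\]
Since $k_b T/h \approx 6.25$ THz at $300$ K and $\approx 1.88$ THz at $90$ K, the assumed detuning corresponds to $x \approx 2.08$ and $x \approx 6.93$ respectively, so that
\[
\frac{p_{AS}(300\,\mathrm{K})}{p_{AS}(90\,\mathrm{K})} = \frac{e^{6.93}-1}{e^{2.08}-1} \approx \frac{1.0\times 10^{3}}{7.0} \approx 1.5\times 10^{2}.
\]
For larger detunings the same ratio grows approximately as $\exp\left[x(90\,\mathrm{K}) - x(300\,\mathrm{K})\right]$, which is why cooling is most effective far from the pumps.
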
The measured dispersion (shown in figure \verb|\ref{fig:dispersion}|) yields $\beta^{(3)} = 9 \times 10^{-2}$ ps$^3$/km and $\beta^{(4)} = -2.2 \times 10^{-4}$ ps$^4$/km with $\lambda_{ZDW} = 1420$ nm, corresponding to $\omega_{ZDW} = 1330$ THz, with the signal/idler and the pumps placed respectively in the O-band ($1260$--$1320$ nm) and the C-band ($1530$--$1565$ nm) for a $\Delta \Omega \sim 120$ THz. This is an attractive configuration because it enables the large detuning needed for low-noise operation, while still operating at wavelengths where off-the-shelf equipment is available. For $\delta\omega = 5$ THz ($\delta\lambda_p = 6.5$ nm at $1550$ nm and $\delta\lambda_s = 4.3$ nm at $1280$ nm) the calculated acceptance bandwidth is $\delta\omega_{BS} = 0.15$ THz.

The experimental setup is depicted in figure \verb|\ref{fig:setup}|: to generate the pump fields, we use temperature-stabilized laser diodes that are current modulated via a pulse generator, producing pulses of duration $\tau = 1$--$10$ ns and peak power $5$ mW. The pumps are amplified with cascaded C-band erbium-doped fiber amplifiers (EDFAs). The last EDFA (Keopsys) is optimized for high-power pulsed amplification at low duty cycle. The two pulses are temporally separated when traversing the EDFA to avoid mutual nonlinear effects in the gain medium, and are synchronized afterward using an unbalanced combination of $1551.7$ nm fiber wavelength division multiplexers (WDMs). Signal and pumps are coupled together using an O-band/C-band WDM, temporally synchronized and injected into the nonlinear fiber. The polarization of each field is independently controlled to ensure parallel polarization in the nonlinear fiber. The $\sim 100$ m of nonlinear fiber is spooled and placed in a cryogenic container. Signal losses through the setup are as low as $2.6$ dB, due to splices, connectors and WDMs.
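
As a consistency check on the frequencies and detunings quoted above for the fiber, the following sketch assumes that they are angular frequencies (i.e. that ``THz'' stands for $10^{12}$ rad/s) and uses the small-detuning relation between frequency and wavelength spacing:
\[
\omega_{ZDW} = \frac{2\pi c}{\lambda_{ZDW}} = \frac{2\pi \times 2.998\times 10^{8}\ \mathrm{m/s}}{1420\ \mathrm{nm}} \approx 1.33\times 10^{15}\ \mathrm{rad/s},
\]
\[
\delta\lambda \simeq \frac{\lambda^2}{2\pi c}\,\delta\omega
\quad\Rightarrow\quad
\delta\lambda_p \approx 6.4\ \mathrm{nm}\ \ (\lambda = 1550\ \mathrm{nm}),
\qquad
\delta\lambda_s \approx 4.3\ \mathrm{nm}\ \ (\lambda = 1280\ \mathrm{nm}),
\]
for $\delta\omega = 5\times 10^{12}$ rad/s. Under this assumption the results agree with the quoted values to within rounding.
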
At the end of the interaction, a second WDM removes most of the pump power ($>30$ dB), and the signal is sent to the detection stage. We therefore consider most of the Raman noise to be generated between the two WDMs, and we take care to place as much of the fiber as possible in the cryostat.

Before detection, the signal has to be carefully filtered of all residual pump photons and thermal noise: we used a combination of a fiber-based $1300$ nm/$1550$ nm pass/reject filter and a free-space filtering setup ($3$ dB losses) composed of a sharp-edge lowpass filter at $1310$ nm (Semrock) and a narrow-band grating filter ($0.60(1)$ nm, $2.6$ dB). Alternatively, we used a Dispersion Compensating Module (DCM, $D = -1200$ ps/nm at $1300$ nm, $7.6$ dB losses) for spectral characterization of the entire $1260$--$1300$ nm band, using time-of-arrival information to recover the wavelength. For detection, we used a superconducting nanowire single-photon detector (SNSPD), with nominal quantum efficiency $\sim 70\%$ at $1300$ nm and $\sim 200$ dark counts per second in free-running operation. Temporal measurements are performed using a Time Tagging Module (TTM, Roithner Lasertechnik) with an internal resolution of $82.3$ ps. The time delay between triggering and detection is about $1200$ ns.

An external cavity laser (OSICS), tunable between $1260$ nm and $1340$ nm, is used both for testing purposes and, in combination with a tunable attenuator and an electro-optic modulator, to generate weak coherent signals. To generate single photons we use a source based on spontaneous parametric downconversion in a PPLN crystal. In the $10$ mm long LiNbO$_3$ crystal, a CW pump at $543$ nm generates photon pairs (phasematching is achieved via temperature tuning): $940$-nm photons are detected by a Si APD (Perkin-Elmer) to herald the presence of $1283$-nm photons and to generate a synchronization signal that is used to trigger the pulse generator.
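
The wavelengths of the photon-pair source can be cross-checked with energy conservation. In this short sketch the symbols $\lambda_{\mathrm{pump}}$, $\lambda_{\mathrm{sig}}$ and $\lambda_{\mathrm{her}}$ (downconversion pump, signal and heralding wavelengths) are introduced only for illustration:
\[
\frac{1}{\lambda_{\mathrm{pump}}} = \frac{1}{\lambda_{\mathrm{sig}}} + \frac{1}{\lambda_{\mathrm{her}}}
\quad\Rightarrow\quad
\lambda_{\mathrm{her}} = \left(\frac{1}{543\ \mathrm{nm}} - \frac{1}{1283\ \mathrm{nm}}\right)^{-1} \approx 941\ \mathrm{nm},
\]
which is consistent with the nominal $940$ nm heralding wavelength.
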
The marginal bandwidth of the signal photons is larger than $10$ nm; to match it to the acceptance bandwidth of BS-FWM, both heralding and signal photons are spectrally filtered, the former with a holographic $0.3$ nm filter (OptiGrate), the latter with the free-space tunable grating setup of $0.60$ nm FWHM.

\textit{Superseded.} We monitor the noise at one of the outputs while the temperature of the fiber is changing. If we collect the full signal band, with contributions from the entire $12$ nm bandwidth of the WDM, we can observe the temperature dependence (fig. \verb|\ref{fig:raman}|). The reduction of noise is limited to 2 orders of magnitude, which is expected given the amount of fiber not placed in the cryostat (about $1$ m, compared with $100$ m of fiber in the cooler). Taking losses into account, the calculated probability of generating a noise photon is already extremely low compared with other BS demonstrations (about 1 noise photon per gate in the highest-efficiency demonstration reported \verb|\cite{Clark_2013}|), and the noise can be further reduced by filtering both temporally and spectrally to match the acceptance bandwidth.
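
The two-orders-of-magnitude limit on the noise reduction can be reproduced with a simple estimate. This is a sketch under stated assumptions: the Raman noise is taken to be generated uniformly along the fiber with constant pump power, and the symbols $L_w$, $L_c$, $n(T)$, $N(T)$ and $T_0$ are introduced here only for illustration. With $L_w \simeq 1$ m of fiber left at room temperature $T_0$, $L_c \simeq 100$ m of fiber cooled to a temperature $T$, and $n(T)$ the noise generated per unit length, the total collected noise $N$ satisfies
\[
\frac{N(T)}{N(T_0)} \simeq \frac{L_w\, n(T_0) + L_c\, n(T)}{(L_w + L_c)\, n(T_0)} \geq \frac{L_w}{L_w + L_c} \approx 10^{-2}.
\]
The bound is approached when the contribution of the cooled section becomes negligible, so under these assumptions the uncooled fiber alone caps the achievable reduction at about two orders of magnitude.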