As expected, the bottleneck at the floating-point conversion has disappeared. At least, I can say that I had no trouble pulling 8 million samples per second (Msps), which with 32-bit floating point (and 2 floating-point numbers per sample) is 64 MB/sec. (I am short on time this evening, so I haven't had a chance yet to try 20 Msps.) As before, I fed the data into a GNU Radio FM demodulator and got clean FM radio station audio out of it.
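To make the data-rate arithmetic explicit, here is a tiny sketch in C. The function name is hypothetical (it is not part of hackrf-shell); it just multiplies the sample rate by two floats per sample (I and Q) and the size of a 32-bit float.

```c
#include <stdint.h>

/* Hypothetical helper, not from hackrf-shell: bytes per second of the
 * float output stream, given complex samples per second. Each sample is
 * two floats (I and Q). */
uint64_t stream_bytes_per_sec(uint64_t samples_per_sec)
{
    const uint64_t floats_per_sample = 2;
    return samples_per_sec * floats_per_sample * (uint64_t)sizeof(float);
}
```

With 4-byte floats, 8 million samples per second gives 64,000,000 bytes/sec, and 20 million samples per second would give 160,000,000 bytes/sec.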
GCC 6.3 with -O3 appears to do some SSE optimization on the byte-buffer-to-float-buffer conversion:
christopher@nightshade:~/Repos/hackrf-shell$ objdump hackrf-shell -x -D | less
<snip>
172c: 66 0f 6f c1 movdqa %xmm1,%xmm0
1730: 66 0f 68 cc punpckhbw %xmm4,%xmm1
1734: 66 0f 60 c4 punpcklbw %xmm4,%xmm0
1738: 66 0f 6f f1 movdqa %xmm1,%xmm6
173c: 66 0f 65 e8 pcmpgtw %xmm0,%xmm5
1740: 66 0f 6f f8 movdqa %xmm0,%xmm7
1744: 66 0f 61 fd punpcklwd %xmm5,%xmm7
1748: 66 0f 69 c5 punpckhwd %xmm5,%xmm0
174c: 66 0f 6f eb movdqa %xmm3,%xmm5
1750: 0f 5b ff cvtdq2ps %xmm7,%xmm7
1753: 66 0f 65 e9 pcmpgtw %xmm1,%xmm5
1757: 0f 17 3c 24 movhps %xmm7,(%rsp)
<snip>
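For context, this is the kind of loop that a compiler can auto-vectorize into unpack and cvtdq2ps instructions like those above. It is only a minimal sketch, not the actual hackrf-shell source: the function name is hypothetical, and I am assuming the input is a buffer of signed 8-bit samples and that the input and output buffers do not overlap (hence restrict).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the actual hackrf-shell code: widen each
 * signed 8-bit sample to a 32-bit float. The restrict qualifiers promise
 * the compiler that the buffers do not overlap, which makes it easier
 * for it to vectorize the loop at -O3. */
void bytes_to_floats(const int8_t *restrict in, float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (float)in[i];
    }
}
```

Whether a given compiler version actually vectorizes a loop like this depends on the flags and target, so the disassembly is still the place to check.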
The next thing, perhaps, should be to create a little demo program that captures the data at certain times of day. Or I could work on the next stages of an FM receiver, i.e., frequency multiplication, a low-pass filter, and the FM demodulator.