Not RLHF, but my understanding was that they use that data heavily and that it was a big part of their moat. It's part of why competitors wanted to clone their results: they couldn't get quality that good from the web alone (Microsoft used the Bing toolbar to clone them in the 2010s).