Using the request module to load a webpage, I notice that for he UK pound symbol £ I sometimes get back the unicode replacement character \uFFFD.
An example URL that I'm parsing is this Amazon UK page: http://ift.tt/1NqpRtC
I'm also using the iconv-lite module to decode using the charset returned in the response header:
request(urlEntry.url, function(err, response, html) {
const contType = response.headers['content-type'];
const charset = contType.substring(contType.indexOf('charset=') + 8, contType.length);
const encBody = iconv.decode(html, charset);
...
But this doesn't seem to be helping. I've also tried decoding the response HTML as UTF-8.
How can I avoid this Unicode replacement char?
Aucun commentaire:
Enregistrer un commentaire