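1. Writing data to a file
Writing to a TXT file fails with: 'gbk' codec can't encode character '\xXX' in position XX
This happens when open() falls back to the system default encoding (GBK on Chinese-locale Windows) and the text contains a character GBK cannot represent. Pass an explicit encoding when opening the file: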
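import os

def wirteFile(content):
    # Write content.txt into the current working directory.
    path = os.getcwd()
    a_path = os.path.join(path, "content.txt")
    # An explicit encoding avoids the platform default (e.g. GBK).
    with open(a_path, "w", encoding="utf-8") as f:
        f.write(content)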
In short: specify the encoding in the open() call.
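A minimal sketch of the failure and the fix, independent of the operating system's default encoding. The sample string, the temporary directory, and the variable names are assumptions chosen for illustration; `\xa0` (a non-breaking space, common in scraped pages) is used because GBK has no mapping for it.

```python
import os
import tempfile

sample = "中文\xa0text"  # hypothetical sample; \xa0 cannot be encoded as GBK

# Reproduce the error directly, without relying on the platform default.
try:
    sample.encode("gbk")
    gbk_error = None
except UnicodeEncodeError as exc:
    gbk_error = str(exc)

# Writing with an explicit UTF-8 encoding works regardless of the locale.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "content.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write(sample)

# Read the file back with the same encoding to confirm nothing was lost.
with open(demo_path, encoding="utf-8") as f:
    round_trip = f.read()

print(gbk_error)
print(round_trip == sample)
```

Reading the file back with the same encoding is the other half of the fix: a file written as UTF-8 but opened with the default GBK codec may fail or show garbled text.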
2. Chinese text is garbled when printing scraped content with lxml etree
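import requests
from lxml import etree

headers = {"User-Agent": "Mozilla/5.0"}  # assumed placeholder; the original headers are not shown
session = requests.Session()
res = session.get('https://seat.lib.whu.edu.cn/login?targetUri=%2F', headers=headers)
html = etree.HTML(res.text)  # parse into an Element that supports XPath queries
result = etree.tostring(html, encoding='utf-8')  # returns bytes; the encoding must be given here
print(result.decode("UTF-8"))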
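To see why the encoding argument matters, here is a small sketch that serializes an inline HTML string instead of a live page. The markup and variable names are made up for illustration, and it assumes lxml is installed. Without an encoding, etree.tostring() emits ASCII and replaces Chinese characters with numeric character references, which is what looks like garbled output.

```python
from lxml import etree

# Hypothetical sample markup standing in for res.text.
html = etree.HTML("<html><body><p>中文</p></body></html>")

# Default serialization is ASCII, so non-ASCII characters are escaped as &#...; references.
default_bytes = etree.tostring(html)

# With an explicit encoding the characters are kept, and the result is still bytes.
utf8_bytes = etree.tostring(html, encoding="utf-8")

# encoding="unicode" returns a str directly, so no decode step is needed.
as_text = etree.tostring(html, encoding="unicode")

print(default_bytes.decode("ascii"))
print(utf8_bytes.decode("utf-8"))
print(as_text)
```

If the text is already wrong before serialization, the cause may instead be the encoding requests guessed for res.text; in that case checking res.encoding is a reasonable next step.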