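How to parse HTTP request content with a Python socket
Author: aefuimn  Date: 2022-05-06 20:09:23
Parsing HTTP request content with a socket
Approach
1. Parse the HTTP request headers
The header section of an HTTP request ends with a line that contains only "\r\n". Read the request line by line; when a line equals "\r\n", the headers are complete.
2. The headers contain a Content-Length field
If the request carries a Content-Length header, the size of the body is known in advance: read the value of Content-Length, then read exactly that many bytes.
3. The headers contain Transfer-Encoding: chunked
If the request carries Transfer-Encoding: chunked, the size of the body is not known in advance. The body is sent as a series of chunks, each preceded by a line giving its size in hexadecimal, and it ends with a zero-size chunk, which appears on the wire as "0\r\n\r\n" when there are no trailer fields. The reader can therefore stop once it has read "0\r\n" followed by "\r\n".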
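To make the two framing schemes concrete, here is a minimal sketch that builds raw request bytes in both styles. The helper names build_content_length_request and build_chunked_request, the path /upload, and the host example.com are illustrative assumptions and do not come from the original code.

```python
def build_content_length_request(body: bytes) -> bytes:
    """Frame the body with a Content-Length header (hypothetical helper)."""
    head = (
        "POST /upload HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(len(body))
    return head.encode("ascii") + body


def build_chunked_request(chunks: list) -> bytes:
    """Frame the body with chunked transfer encoding (hypothetical helper)."""
    head = (
        "POST /upload HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
    )
    out = head.encode("ascii")
    for chunk in chunks:
        if not chunk:
            # A zero-size chunk would end the body early, so skip empty pieces.
            continue
        # Each chunk: size in hexadecimal, CRLF, the data, CRLF.
        out += "{:x}\r\n".format(len(chunk)).encode("ascii") + chunk + b"\r\n"
    # The zero-size chunk plus a blank line marks the end of the body.
    out += b"0\r\n\r\n"
    return out


if __name__ == "__main__":
    print(build_content_length_request(b"hello"))
    print(build_chunked_request([b"hello", b" world"]))
```

Printing both results side by side shows why a parser needs two strategies: the first body is a fixed five bytes after the blank line, while the second has to be followed chunk by chunk until the zero-size chunk appears.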
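Implementation
In the code, self._file is the file object returned by socket.makefile(), opened in binary mode, for example socket.makefile("rb").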
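def get_http_content(self):
    """Read the headers from self._file, then store the body in self._content.

    self._content is set to False when the request carries no body.
    """
    content_length = 0
    chunked = False
    while True:
        req_line = str(self._file.readline(), "utf-8")
        # An empty string means the peer closed the connection early.
        if req_line == "":
            self._content = False
            return None
        # A bare CRLF terminates the header section; the body comes next.
        if req_line == "\r\n":
            break
        # Header names are case-insensitive, so compare them in lower case.
        name, _, value = req_line.partition(":")
        name = name.strip().lower()
        if name == "content-length":
            content_length = int(value.strip())
        elif name == "transfer-encoding" and "chunked" in value.lower():
            chunked = True

    if chunked:
        # Each chunk is "<hex size>\r\n<data>\r\n"; a size of 0 ends the body.
        content = b""
        while True:
            size_line = self._file.readline()
            if size_line == b"":
                break  # connection closed before the final chunk
            size = int(size_line.split(b";")[0].strip(), 16)
            if size == 0:
                # Skip optional trailer fields up to the final CRLF.
                while self._file.readline() not in (b"\r\n", b""):
                    pass
                break
            content += self._file.read(size)
            self._file.readline()  # consume the CRLF that follows the chunk data
        self._content = str(content, "utf-8")
    elif content_length != 0:
        # The body size is known: read exactly that many bytes.
        self._content = str(self._file.read(content_length), "utf-8")
    else:
        # Neither Content-Length nor chunked encoding: there is no body to read.
        self._content = False
    return None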
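The method above assumes that self._file already exists. As a rough sketch of how such a file object might be obtained and exercised without a real server, the example below uses socket.socketpair() to create two connected sockets in one process. The names read_request_body and demo are hypothetical, and only the Content-Length case is handled to keep the example short.

```python
import socket


def read_request_body(sock):
    """Read headers line by line, then a Content-Length body (sketch only)."""
    # makefile("rb") wraps the socket in a buffered binary file object,
    # which provides readline() and read(n).
    fp = sock.makefile("rb")
    try:
        content_length = 0
        while True:
            line = fp.readline()
            # b"\r\n" ends the headers; b"" means the peer closed the socket.
            if line in (b"\r\n", b""):
                break
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                content_length = int(value.strip())
        return fp.read(content_length) if content_length else b""
    finally:
        fp.close()


def demo():
    """Send a small request through a local socket pair and parse it."""
    client, server = socket.socketpair()
    try:
        client.sendall(
            b"POST / HTTP/1.1\r\n"
            b"Host: localhost\r\n"
            b"Content-Length: 5\r\n"
            b"\r\n"
            b"hello"
        )
        return read_request_body(server)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```

In a real server the socket would come from accept() rather than socketpair(); the file object is used the same way in both cases.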
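Simulating an HTTP request with a socket
# coding: utf-8
import socket
from urllib.parse import urlparse


def get_url(url):
    url = urlparse(url)
    host = url.hostname
    port = url.port or 80
    path = url.path
    if path == "":
        path = "/"
    # Open a TCP connection to the server
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect((host, port))
    # "Connection: close" asks the server to close the socket after the
    # response, so recv() returning b"" marks the end of the data
    request = "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n".format(path, url.netloc)
    client.sendall(request.encode("utf-8"))
    data = b""
    while True:
        d = client.recv(1024)
        if d:
            data += d
        else:
            break
    client.close()
    # Split once at the first blank line: headers before it, body after it
    data = data.decode("utf-8", errors="replace")
    html_data = data.split("\r\n\r\n", 1)[1]
    print(html_data)


if __name__ == '__main__':
    get_url("http://www.baidu.com")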
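Splitting the decoded text on "\r\n\r\n" is enough for a quick look at the page. A slightly more careful sketch works on the raw bytes, splits only at the first blank line, and also extracts the status code and headers. The function name split_response and the sample response are illustrative assumptions, and the sketch does not decode chunked bodies.

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status_code, headers, body)."""
    # partition() splits at the first blank line only, so a body that
    # itself contains "\r\n\r\n" stays intact.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete response: no blank line after headers")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line looks like "HTTP/1.1 200 OK"; the reason phrase is optional.
    version, status, *reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return int(status), headers, body


if __name__ == "__main__":
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: 13\r\n"
        b"\r\n"
        b"<p>a\r\n\r\nb</p>"
    )
    status, headers, body = split_response(sample)
    print(status, headers, body)
```

Keeping the body as bytes leaves the choice of text encoding to the caller, who can pick it from the Content-Type header if one is present.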
Source: https://blog.csdn.net/m0_37954775/article/details/100114334
Tags: Python, socket, HTTP, request