Should a TCP client be able to pause the server when the TCP server reads a non-blocking socket?
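I have a simple question about the code below. Hopefully I haven't made a mistake in the code.
I'm a network engineer, and I need to test certain Linux behaviors of our business application's keepalives during network outages (I'll insert some iptables rules later to interfere with the connection; first, I want to make sure I have the client and server right).
As part of the network failure test I'm conducting, I wrote a non-blocking Python TCP client and server that are supposed to blindly send messages to each other in a loop. To understand what is happening, I'm using loop counters.
The server's loop should be relatively simple. I don't expect it to pause while it loops, but for some reason the server code pauses intermittently (more details below).
Initially, I didn't sleep in the client's loop. Without the sleep on the client side, the server and client seem to be as efficient as I want. However, when I put a time.sleep(1) statement after the client does an fd.send() to the server, the TCP server code pauses while the client is sleeping.
My questions: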
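- Should I be able to write a single-threaded Python TCP server that doesn't pause when the client hits time.sleep() in the client's fd.send() loop? If so, what am I doing wrong? <- ANSWERED
- If I wrote this test code correctly and the server shouldn't pause, why is the TCP server pausing while it polls the client's connection for data?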
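Reproducing the scenario
I'm running this on two RHEL6 Linux machines. To reproduce the problem...
- Open two different terminals.
- Save the client and server scripts in different files.
- Change the shebang path to your local python (I'm using python 2.7.15).
- Change SERVER_HOSTNAME and SERVER_DOMAIN in the client code to the hostname and domain of the machine the server runs on.
- Start the server first, then start the client.
After the client connects, you'll see messages like those shown in Exhibit 1 scrolling quickly in the server's terminal. When the client hits time.sleep(), the scrolling pauses. I don't expect to see those pauses, but maybe I've misunderstood something.
Exhibit 1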

    ---
    LOOP_COUNT 0
    ---
    LOOP_COUNT 1
    ---
    LOOP_COUNT 2
    ---
    LOOP_COUNT 3
    CLIENTMSG: 'client->server 0'
    ---
    LOOP_COUNT 4
    ---
    LOOP_COUNT 5
    ---
    LOOP_COUNT 6
    ---
    LOOP_COUNT 7
    ---
    LOOP_COUNT 8
    ---
    LOOP_COUNT 9
    ---
    LOOP_COUNT 10
    ---
    LOOP_COUNT 11
    ---

Final non-blocking code (incorporating the suggestions from the answers):
tcp_server.py

    #!/usr/bin/python -u

    from socket import AF_INET, SOCK_STREAM, SO_REUSEADDR, SOL_SOCKET
    from socket import MSG_DONTWAIT
    #from socket import MSG_OOB  <--- for send()
    from socket import socket
    import socket as socket_module
    import select
    import errno
    import fcntl
    import time
    import sys
    import os

    def get_errno_info(e, op='', debugmsg=False):
        """Return verbose information from errno errors, such as errors
        returned by python socket()"""
        VALID_OP = set(['accept', 'connect', 'send', 'recv', 'read', 'write'])
        assert op.lower() in VALID_OP, "op must be: {0}".format(
            ','.join(sorted(VALID_OP)))
        ## ref: man 3 errno (in linux)... other systems may be man 2 intro
        ## also see https://docs.python.org/2/library/errno.html
        try:
            retval_int = int(e.args[0])            # Example: 32
            retval_str = os.strerror(e.args[0])    # Example: 'Broken pipe'
            retval_code = errno.errorcode.get(retval_int, 'MODULEFAIL')  # Ex: EPIPE
        except:
            ## I don't expect to get here unless something broke in python errno...
            retval_int = -1
            retval_str = '__somethingswrong__'
            retval_code = 'BADFAIL'

        if debugmsg:
            print "DEBUG: Can't {0}() on socket (errno:{1}, code:{2} / {3})".format(
                op, retval_int, retval_code, retval_str)
        return retval_int, retval_str, retval_code

    host = ''
    port = 6667   # IRC service
    DEBUG = True

    serv_sock = socket(AF_INET, SOCK_STREAM)
    # Allow the server to be restarted quickly on the same port
    serv_sock.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    serv_sock.bind((host, port))
    serv_sock.listen(5)

    #fcntl.fcntl(serv_sock, fcntl.F_SETFL, os.O_NONBLOCK)   # Make the socket non-blocking
    serv_sock.setblocking(False)

    sock_list = [serv_sock]

    from_client_str = '__DEFAULT__'

    to_client_idx = 0
    loop_count = 0
    need_send_select = False
    while True:
        if need_send_select:
            # Only do this after send() EAGAIN or EWOULDBLOCK...
            send_sock_list = sock_list
        else:
            send_sock_list = []

        #print "---"
        #print "LOOP_COUNT", loop_count
        recv_ready_list, send_ready_list, exception_ready = select.select(
            sock_list, send_sock_list, [], 0.0)  # Last float is the select() timeout...

        ## Read all sockets which are output-ready... might be client or server...
        for sock_fd in recv_ready_list:
            # accept() if we're reading on the server socket...
            if sock_fd is serv_sock:
                try:
                    clientsock, clientaddr = sock_fd.accept()
                except socket_module.error, e:
                    errint, errstr, errcode = get_errno_info(e, op='accept',
                        debugmsg=DEBUG)
                    continue   # nothing was accepted, so there is no clientsock to add
                assert sock_fd.gettimeout()==0.0, "client socket should be in non-blocking mode"
                sock_list.append(clientsock)

            # read input from the client socket...
            else:
                try:
                    from_client_str = sock_fd.recv(1024, MSG_DONTWAIT)
                    if from_client_str=='':
                        # Client closed the socket...
                        print "CLIENT CLOSED SOCKET"
                        sock_list.remove(sock_fd)
                except socket_module.error, e:
                    errint, errstr, errcode = get_errno_info(e, op='recv',
                        debugmsg=DEBUG)
                    if errcode=='EAGAIN' or errcode=='EWOULDBLOCK':
                        # socket unavailable to read()
                        continue
                    elif errcode=='ECONNRESET' or errcode=='EPIPE':
                        # Client closed the socket...
                        sock_list.remove(sock_fd)
                    else:
                        print "UNHANDLED SOCKET ERROR", errcode, errint, errstr
                        sys.exit(1)

                print "from_client_str: '{0}'".format(from_client_str)

        ## Adding dynamic_list, per input from EJP, below...
        if need_send_select is False:
            dynamic_list = sock_list
        else:
            dynamic_list = send_ready_list

        ## NOTE: socket code shouldn't walk this list unless a write is pending...
        ##     broadcast the same message to all clients...
        for sock_fd in dynamic_list:
            ## Ignore server's listening socket...
            if sock_fd is serv_sock:
                ## Only send() to accept()ed sockets...
                continue

            try:
                to_client_str = "server->client: {0} ".format(to_client_idx)
                send_retval = sock_fd.send(to_client_str, MSG_DONTWAIT)
                ## send() returns the number of bytes written, on success
                ## disabling assert check on sent bytes while using MSG_DONTWAIT
                #assert send_retval==len(to_client_str)
                to_client_idx += 1
                need_send_select = False
            except socket_module.error, e:
                errint, errstr, errcode = get_errno_info(e, op='send',
                    debugmsg=DEBUG)
                if errcode=='EAGAIN' or errcode=='EWOULDBLOCK':
                    need_send_select = True
                    continue
                elif errcode=='ECONNRESET' or errcode=='EPIPE':
                    # Client closed the socket...
                    sock_list.remove(sock_fd)
                else:
                    print "FATAL UNHANDLED SOCKET ERROR", errcode, errint, errstr
                    sys.exit(1)
        loop_count += 1

tcp_client.py

    #!/usr/bin/python -u

    from socket import AF_INET, SOCK_STREAM
    from socket import MSG_DONTWAIT   # non-blocking send/recv; see man 2 recv
    from socket import gethostname, socket
    import socket as socket_module
    import select
    import fcntl
    import errno
    import time
    import sys
    import os

    ## NOTE: Using this script to simulate a scheduler
    SERVER_HOSTNAME = 'myServerHostname'
    SERVER_DOMAIN = 'mydomain.local'
    PORT = 6667
    DEBUG = True

    def get_errno_info(e, op='', debugmsg=False):
        """Return verbose information from errno errors, such as errors
        returned by python socket()"""
        VALID_OP = set(['accept', 'connect', 'send', 'recv', 'read', 'write'])
        assert op.lower() in VALID_OP, "op must be: {0}".format(
            ','.join(sorted(VALID_OP)))
        ## ref: man 3 errno (in linux)... other systems may be man 2 intro
        ## also see https://docs.python.org/2/library/errno.html
        try:
            retval_int = int(e.args[0])            # Example: 32
            retval_str = os.strerror(e.args[0])    # Example: 'Broken pipe'
            retval_code = errno.errorcode.get(retval_int, 'MODULEFAIL')  # Ex: EPIPE
        except:
            ## I don't expect to get here unless something broke in python errno...
            retval_int = -1
            retval_str = '__somethingswrong__'
            retval_code = 'BADFAIL'

        if debugmsg:
            print "DEBUG: Can't {0}() on socket (errno:{1}, code:{2} / {3})".format(
                op, retval_int, retval_code, retval_str)
        return retval_int, retval_str, retval_code

    connect_finished = False
    while not connect_finished:
        try:
            c2s = socket(AF_INET, SOCK_STREAM) # Client to server socket...
            # Set socket non-blocking
            #fcntl.fcntl(c2s, fcntl.F_SETFL, os.O_NONBLOCK)
            c2s.connect(('.'.join((SERVER_HOSTNAME, SERVER_DOMAIN,)), PORT))
            c2s.setblocking(False)
            assert c2s.gettimeout()==0.0, "c2s socket should be in non-blocking mode"
            connect_finished = True
        except socket_module.error, e:
            errint, errstr, errcode = get_errno_info(e, op='connect',
                debugmsg=DEBUG)
            if errcode=='EINPROGRESS':
                pass

    to_srv_idx = 0
    need_send_select = False
    while True:
        socket_list = [c2s]

        # Get the list sockets which can: take input, output, etc...
        if need_send_select:
            # Only do this after send() EAGAIN or EWOULDBLOCK...
            send_sock_list = socket_list
        else:
            send_sock_list = []
        recv_ready_list, send_ready_list, exception_ready = select.select(
            socket_list, send_sock_list, [])

        for sock_fd in recv_ready_list:
            assert sock_fd is c2s, "Strange socket failure here"
            #incoming message from remote server
            try:
                from_srv_str = sock_fd.recv(1024, MSG_DONTWAIT)
            except socket_module.error, e:
                ## https://stackoverflow.com/a/16745561/667301
                errint, errstr, errcode = get_errno_info(e, op='recv',
                    debugmsg=DEBUG)
                if errcode=='EAGAIN' or errcode=='EWOULDBLOCK':
                    # Busy, try again later...
                    print "recv() BLOCKED"
                    continue
                elif errcode=='ECONNRESET' or errcode=='EPIPE':
                    # Server ended normally...
                    sys.exit(0)

            ## NOTE: if we get this far, we successfully received from_srv_str.
            ##    Anything caught above, is some kind of fail...
            print "from_srv_str: {0}".format(from_srv_str)

        ## Adding dynamic_list, per input from EJP, below...
        if need_send_select is False:
            dynamic_list = socket_list
        else:
            dynamic_list = send_ready_list

        for sock_fd in dynamic_list:
            # outgoing message to remote server
            if sock_fd is c2s:
                try:
                    to_srv_str = 'client->server {0}'.format(to_srv_idx)
                    sock_fd.send(to_srv_str, MSG_DONTWAIT)
                    ##
                    time.sleep(1)   ## Client blocks the server here... Why????
                    ##
                    to_srv_idx += 1
                    need_send_select = False
                except socket_module.error, e:
                    errint, errstr, errcode = get_errno_info(e, op='send',
                        debugmsg=DEBUG)
                    if errcode=='EAGAIN' or errcode=='EWOULDBLOCK':
                        ## Try to send() later...
                        print "send() BLOCKED"
                        need_send_select = True
                        continue
                    elif errcode=='ECONNRESET' or errcode=='EPIPE':
                        # Server ended normally...
                        sys.exit(0)

Original question code:
tcp_server.py

    #!/usr/bin/python -u

    from socket import AF_INET, SOCK_STREAM, SO_REUSEADDR, SOL_SOCKET
    #from socket import MSG_OOB  <--- for send()
    from socket import socket
    import socket as socket_module
    import select
    import fcntl
    import os

    host = ''
    port = 9997

    serv_sock = socket(AF_INET, SOCK_STREAM)
    serv_sock.setsockopt(SOL_SOCKET, SOCK_STREAM, 1)
    serv_sock.bind((host, port))
    serv_sock.listen(5)

    fcntl.fcntl(serv_sock, fcntl.F_SETFL, os.O_NONBLOCK)   # Make the socket non-blocking

    sock_list = [serv_sock]

    from_client_str = '__DEFAULT__'

    to_client_idx = 0
    loop_count = 0
    while True:
        recv_ready_list, send_ready_list, exception_ready = select.select(
            sock_list, sock_list, [], 5)

        print "---"
        print "LOOP_COUNT", loop_count

        ## Read all sockets which are input-ready... might be client or server...
        for sock_fd in recv_ready_list:
            # accept() if we're reading on the server socket...
            if sock_fd is serv_sock:
                clientsock, clientaddr = sock_fd.accept()
                sock_list.append(clientsock)

            # read input from the client socket...
            else:
                try:
                    from_client_str = sock_fd.recv(4096)
                    if from_client_str=='':
                        # Client closed the socket...
                        print "CLIENT CLOSED SOCKET"
                        sock_list.remove(sock_fd)
                except socket_module.error, e:
                    print "WARNING RECV FAIL"

                print "from_client_str: '{0}'".format(from_client_str)

        for sock_fd in send_ready_list:
            if sock_fd is not serv_sock:
                try:
                    to_client_str = "server->client: {0} ".format(to_client_idx)
                    sock_fd.send(to_client_str)
                    to_client_idx += 1
                except socket_module.error, e:
                    print "TO CLIENT SEND ERROR", e

        loop_count += 1

tcp_client.py

    #!/usr/bin/python -u

    from socket import AF_INET, SOCK_STREAM
    from socket import gethostname, socket
    import socket as socket_module
    import select
    import fcntl
    import errno
    import time
    import sys
    import os

    ## NOTE: Using this script to simulate a scheduler
    SERVER_HOSTNAME = 'myHostname'
    SERVER_DOMAIN = 'mydomain.local'
    PORT = 9997

    def handle_socket_error_continue(e):
        ## non-blocking socket info from:
        ## https://stackoverflow.com/a/16745561/667301
        print "HANDLE_SOCKET_ERROR_CONTINUE"
        err = e.args[0]
        if (err==errno.EAGAIN) or (err==errno.EWOULDBLOCK):
            print 'CLIENT DEBUG: No data input from server'
            return True
        else:
            print 'FROM SERVER RECV ERROR: {0}'.format(e)
            sys.exit(1)

    c2s = socket(AF_INET, SOCK_STREAM) # Client to server socket...
    c2s.connect(('.'.join((SERVER_HOSTNAME, SERVER_DOMAIN,)), PORT))
    # Set socket non-blocking...
    fcntl.fcntl(c2s, fcntl.F_SETFL, os.O_NONBLOCK)

    to_srv_idx = 0
    while True:
        socket_list = [c2s]

        # Get the list sockets which can: take input, output, etc...
        recv_ready_list, send_ready_list, exception_ready = select.select(
            socket_list, socket_list, [])

        for sock_fd in recv_ready_list:
            assert sock_fd is c2s, "Strange socket failure here"
            #incoming message from remote server
            try:
                from_srv_str = sock_fd.recv(4096)
            except socket_module.error, e:
                ## https://stackoverflow.com/a/16745561/667301
                err_continue = handle_socket_error_continue(e)
                if err_continue is True:
                    continue
            else:
                if len(from_srv_str)==0:
                    print "SERVER CLOSED NORMALLY"
                    sys.exit(0)

            ## NOTE: if we get this far, we successfully received from_srv_str.
            ##    Anything caught above, is some kind of fail...
            print "from_srv_str: {0}".format(from_srv_str)

        for sock_fd in send_ready_list:
            #incoming message from remote server
            if sock_fd is c2s:
                #to_srv_str = raw_input('Send to server: ')
                try:
                    to_srv_str = 'client->server {0}'.format(to_srv_idx)
                    sock_fd.send(to_srv_str)
                    ##
                    time.sleep(1)   ## Client blocks the server here... Why????
                    ##
                    to_srv_idx += 1
                except socket_module.error, e:
                    print "TO SERVER SEND ERROR", e

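TCP sockets are almost always ready for writing, unless their socket send buffer is full.
It is therefore incorrect to always select for writability on a socket. Only do so after a send has failed with EAGAIN / EWOULDBLOCK. Otherwise your server will spin mindlessly processing writable sockets, which will usually be all of them.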
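The claim about writability can be checked with a small experiment. What follows is only a sketch under stated assumptions: it is written for Python 3 (the listings in the question are Python 2.7), it uses a local `socket.socketpair()` as a stand-in for an accepted TCP connection, and the helper name `writable_now` is made up for illustration.

```python
import select
import socket

def writable_now(sock):
    """Poll (timeout 0.0) whether select() reports sock as ready for writing."""
    _, writable, _ = select.select([], [sock], [], 0.0)
    return sock in writable

# Assumption: a local socketpair stands in for an accepted TCP connection.
local, peer = socket.socketpair()
local.setblocking(False)
peer.setblocking(False)

# 1. Nothing has been sent yet: the send buffer is empty, so the socket is writable.
idle_writable = writable_now(local)

# 2. The peer is not reading; keep sending until the kernel refuses more data.
sent_total = 0
hit_would_block = False
try:
    while True:
        sent_total += local.send(b"x" * 4096)
except BlockingIOError:  # Python 3 exception for EAGAIN / EWOULDBLOCK
    hit_would_block = True

# 3. Only now does asking select() about writability carry information.
full_writable = writable_now(local)

# 4. Once the peer drains its side, the socket becomes writable again.
drained = 0
try:
    while True:
        drained += len(peer.recv(65536))
except BlockingIOError:
    pass
drained_writable = writable_now(local)

print(idle_writable, hit_would_block, full_writable, drained_writable)
local.close()
peer.close()
```

If an idle socket is placed in the write list on every pass, select() returns immediately each time, which is the spinning described above.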
However, when I put a time.sleep(1) statement after the client does an
fd.send() to the server, the TCP server code intermittently pauses
while the client is sleeping.
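AFAICT from running the provided code (nice self-contained example, btw), the server is behaving as intended.
In particular,
So in this case, your server program has told
I see (via print-debugging) that when the server program blocks, it is
Why is that? Well, let's look further.
So, is this behavior actually a problem for the server? Actually not, because the server will still be responsive to any other clients that connect to it. In particular, as long as
A hacked or slow client may be a problem for its user, but there is not much the server can do about it (other than perhaps forcibly disconnecting the client's TCP connection, or maybe printing a log message asking someone to debug the connected client program, I suppose :)).
I agree with EJP, btw: selecting on ready-for-write should only be done on sockets that you actually want to write some data to. If you don't actually want to write to the socket as soon as possible, then instructing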
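One way to act on that advice is to keep a queue of pending output per socket and ask select() about writability only for sockets whose queue is non-empty. The sketch below is an illustration, not the answerer's code: it targets Python 3, uses a local `socket.socketpair()` in place of a real client connection, and the names `select_lists`, `flush`, and `pending` are hypothetical.

```python
import select
import socket

def select_lists(socks, pending):
    """Read interest in every socket; write interest only where output is queued."""
    want_write = [s for s in socks if pending.get(s)]
    return list(socks), want_write

def flush(sock, pending):
    """Send as much queued output as the kernel accepts and keep the remainder."""
    data = pending.get(sock, b"")
    try:
        sent = sock.send(data)
    except BlockingIOError:  # send buffer full: try again on a later pass
        sent = 0
    pending[sock] = data[sent:]

# Assumption: a local socketpair stands in for one accepted client connection.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
client_side.setblocking(False)
pending = {server_side: b""}

# Nothing queued and nothing to read: select() reports no work at all.
rlist, wlist = select_lists([server_side], pending)
idle_r, idle_w, _ = select.select(rlist, wlist, [], 0.0)

# Queue a message; only now is the socket placed in the write list.
pending[server_side] += b"server->client: 0 "
rlist, wlist = select_lists([server_side], pending)
ready_r, ready_w, _ = select.select(rlist, wlist, [], 1.0)
for sock in ready_w:
    flush(sock, pending)

received = client_side.recv(1024)
print(idle_r, idle_w, received, pending[server_side])
server_side.close()
client_side.close()
```

With this arrangement an idle connection never wakes the loop for writing, so a blocking select() call sleeps until there is real work.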
If I wrote this test code correctly and the server shouldn't pause, why is the TCP server intermittently pausing while it polls the client's connection for data?
Answering my own question. My blocking problem was caused by using a non-zero
When I
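The two server listings differ in the last argument to select.select(): the original passes a timeout of 5, while the final version passes 0.0. The effect of that argument can be measured with a sketch like the following. It assumes Python 3, uses an idle `socket.socketpair()` to stand in for a connected client that is sleeping, and uses a 0.2 second timeout instead of 5 to keep the run short; `timed_select` is a hypothetical helper.

```python
import select
import socket
import time

def timed_select(sock, timeout):
    """Call select() for readability only; return (ready_list, seconds_elapsed)."""
    start = time.monotonic()
    readable, _, _ = select.select([sock], [], [], timeout)
    return readable, time.monotonic() - start

# Assumption: an idle socketpair stands in for a connected client that is sleeping.
server_side, client_side = socket.socketpair()

# Timeout 0.0 is a pure poll: select() returns at once even when nothing is ready.
poll_ready, poll_elapsed = timed_select(server_side, 0.0)

# A non-zero timeout lets select() sleep for up to that long while nothing is ready.
wait_ready, wait_elapsed = timed_select(server_side, 0.2)

# Once data is pending, select() returns immediately regardless of the timeout.
client_side.send(b"client->server 0")
data_ready, data_elapsed = timed_select(server_side, 0.2)

print("poll: %d ready after %.3f s" % (len(poll_ready), poll_elapsed))
print("wait: %d ready after %.3f s" % (len(wait_ready), wait_elapsed))
print("data: %d ready after %.3f s" % (len(data_ready), data_elapsed))
server_side.close()
client_side.close()
```

A zero timeout keeps the loop counter moving, but it also turns the loop into a busy poll that can occupy a full CPU core while no client has anything to say.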