HTTPS request results in reset connection in Windows with Python 3

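When I use the following function with the Python 3.2.3 package in cygwin, it hangs on any request to any https host. After 60 seconds it throws this error: [Errno 104] Connection reset by peer.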

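UPDATE: I thought this was limited to cygwin, but it also happens on Windows 7 64-bit with Python 3.3. I'll try 3.2 now. The error when using the Windows command shell is:
urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host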

UPDATE2 (Electric-Bugaloo): This is limited to the two sites I'm trying to use. I tested against Google and other major sites with no issue. It appears to be related to this bug:

http://bugs.python.org/issue16361

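Specifically, the server hangs after the client hello. This is caused by the version of openssl that ships with the compiled builds of Python 3.2 and 3.3, which misidentifies the server's SSL version. So now I need code that automatically downgrades my SSL version to sslv3 when opening a connection to the affected sites, as described in this post: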

How to use urllib2 to get a webpage using SSLv3 encryption

But I can't get it to work.

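import logging
import socket
import sys
import time
import urllib.request
from datetime import datetime
from gzip import GzipFile
from io import BytesIO
from urllib.error import URLError

logger = logging.getLogger(__name__)


def worker(url, body=None, bt=None):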
    '''This function does all the requests to wherever for data

    takes in a url, optional body utf-8 encoded please, and optional body type'''

    hdrs = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                   'Accept-Language': 'en-us,en;q=0.5',
                   'Accept-Encoding': 'gzip,deflate',
                   'User-Agent': "My kewl Python tewl!"}
    if 'myweirdurl' in url:
        hdrs = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
                   'Accept-Language': 'en-us,en;q=0.5',
                   'Accept-Encoding': 'gzip,deflate',
                   'User-Agent': "Netscape 6.0"}

    if bt:
        hdrs['Content-Type'] = bt
    urlopen = urllib.request.urlopen
    Request = urllib.request.Request
    start_req = time.time()
    logger.debug('request start: {}'.format(datetime.now().ctime()))
    if 'password' not in url:
        logger.debug('request url: {}'.format(url))
    req = Request(url, data=body, headers=hdrs)
    try:
        if body:
            logger.debug("body: {}".format(body))
            handle = urlopen(req, data=body, timeout=298)
        else:
            handle = urlopen(req, timeout=298)
    except socket.error as se:
        logger.error(se)
        logger.error(se.errno)
        logger.error(type(se))
        # compare the errno value itself; hasattr() only returns True/False
        if se.errno == 60:
            logger.error("returning: Request Timed Out")
            return 'Request Timed Out'
    except URLError as ue:
        end_time = time.time()
        logger.error(ue)
        logger.error(hasattr(ue, 'code'))
        logger.error(hasattr(ue, 'errno'))
        logger.error(hasattr(ue, 'reason'))
        if hasattr(ue, 'code'):
            logger.warning('The server couldn\'t fulfill the request.')
            logger.error('Error code: {}'.format(ue.code))
            if ue.code == 404:
                return "Resource Not Found (404)"
        elif hasattr(ue, 'reason') :
            logger.warning('We failed to reach a server with {}'.format(url))
            logger.error('Reason: {}'.format(ue.reason))

            logger.error(type(ue.reason))
            # ue.reason may be a plain string or an exception instance
            reason_errno = getattr(ue.reason, 'errno', None)
            logger.error(reason_errno)
            if 'timed out' in str(ue.reason):
                logger.error("Arrggghh, timed out!")
            else:
                logger.error("Why U no match my reason?")
                if reason_errno == 60:
                    return "Operation timed out"
        elif hasattr(ue, 'errno'):
            logger.warning(ue.reason)
            logger.error('Error code: {}'.format(ue.errno))
            if ue.errno == 60:
                return "Operation timed out"
        logger.error("req time: {}".format(end_time - start_req))
        logger.error("returning: Server Error")
        return "Server Error"
    else:
        resp_headers = dict(handle.info())
        logger.debug('Here are the headers of the page : {}'.format(resp_headers))
        logger.debug("The true URL in case of redirects {}".format(handle.geturl()))           
        try:
            ce = resp_headers['Content-Encoding']
        except KeyError as ke:
            ce = None
        else:
            logger.debug('Content-Encoding: {}'.format(ce))
        try:
            ct = resp_headers['Content-Type']
        except KeyError as ke:
            ct = None            
        else:
            logger.debug('Content-Type: {}'.format(ct))
        if ce == "gzip":
            logger.debug("Unzipping payload")
            bi = BytesIO(handle.read())
            gf = GzipFile(fileobj=bi, mode="rb")
            # ct is None when the response has no Content-Type header
            if ct is not None and ("charset=utf-8" in ct.lower() or ct in ('text/html', 'text/plain')):
                payload = gf.read().decode("utf-8")
            else:
                logger.debug("Unknown content type: {}".format(ct))
                sys.exit()
            return payload
        else:
            if ct is not None and ("charset=utf-8" in ct.lower() or ct in ('text/html', 'text/plain')):
                return handle.read().decode("utf-8")
            else:
                logger.debug("Unknown content type: {}".format(ct))
                sys.exit()

Solution:

I figured it out. This is the block of code needed to make this work on Windows:

'''had to add this windows specific block to handle this bug in urllib2:
http://bugs.python.org/issue11220
'''
import ssl
from platform import platform

if "windows" in platform().lower():
    # test each site name separately; a bare non-empty string is always true
    if any(site in url.lower() for site in ('my_wacky_url', 'my_other_wacky_url')):
        https_handler = urllib.request.HTTPSHandler(
            context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
        opener = urllib.request.build_opener(https_handler)
        urllib.request.install_opener(opener)
#end of urllib workaround

I added this block right before the first try: block, and it works like a charm. Thanks for your help, andrean!
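
One thing to note about the workaround above: install_opener() changes the default opener for every later urlopen() call in the process, not only for the affected sites. A possible variant, sketched here under assumptions, builds a separate opener per request and pins the protocol only for the affected hosts. The host names in AFFECTED_HOSTS, and the helpers make_tls1_context and opener_for, are hypothetical names for illustration and are not part of the original code. ssl.PROTOCOL_TLSv1 is the same constant the workaround uses; newer Python versions mark it as deprecated.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit

# Hypothetical host names standing in for the two affected sites.
AFFECTED_HOSTS = {"my_wacky_url.example", "my_other_wacky_url.example"}


def make_tls1_context():
    """Build a context pinned to TLSv1, as in the workaround above."""
    return ssl.SSLContext(ssl.PROTOCOL_TLSv1)


def opener_for(url, affected_hosts=AFFECTED_HOSTS, make_context=make_tls1_context):
    """Return an opener for this URL without touching the global default.

    Affected hosts get an HTTPSHandler with a pinned-protocol context;
    every other URL gets a plain default opener.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host in affected_hosts:
        handler = urllib.request.HTTPSHandler(context=make_context())
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()
```

With something like this, the request in worker() could be sent with opener_for(url).open(req, timeout=298) instead of urlopen(req, timeout=298), so requests to other sites keep the default SSL settings.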
