Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http logger should respect the encoding of its log outputs #12179

Closed
michelcrypt4d4mus opened this issue May 20, 2024 · 5 comments · Fixed by #12198
Closed

http logger should respect the encoding of its log outputs #12179

michelcrypt4d4mus opened this issue May 20, 2024 · 5 comments · Fixed by #12198
Labels

Comments

@michelcrypt4d4mus
Copy link

michelcrypt4d4mus commented May 20, 2024

Describe the incorrect behavior you saw

At this line Twisted is assuming that logFile has an encoding of "utf8" but that isn't always the case... I'm currently on macOS redirecting my logs to stdout through the system logs which use "utf-16-le" (via std-nslog) , and so when Twisted goes to log a basic http request it crashes on some log lines with builtins.UnicodeDecodeError: 'utf-16-le' codec can't decode byte 0x0a in position 160: truncated data.

If I change that line to

logFile.write(line.encode("utf-16-le"))

everything works fine, but I shouldn't have to do that given that the encoding is stored in the logFile variable at the time the call is made. Here's the output of vars(logFile):

{'level': <LogLevel=info>, 'log': <Logger 'twisted.python.log'>, '_encoding': 'utf-16-le', '_buffer': '', '_closed': False}

So twisted.python.log already knows that the logFile is in a different encoding but it goes ahead and forces the wrong encoding anyways. It seems to me - correct me if I'm wrong - that changing the line to

logFile.write(line.encode(logFile._encoding))

might fix this fairly trivially?

It's also possible there's some order of log instantiation that would fix this but I have tried many things as detailed here and I always get the same result, even when i replace the twisted logger with a logger that does nothing.

Describe how to cause this behavior

  1. Run a Twisted webserver using std-nslog to reroute sys.stdout and sys.stderr through macOS system logs
  2. make a web request

Stack Trace

2024-05-19T22:52:53-0400 [_GenericHTTPChannelProtocol,18,127.0.0.1] Unhandled Error
	Traceback (most recent call last):
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/python/log.py", line 96, in callWithLogger
	    return callWithContext({"system": lp}, func, *args, **kw)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/python/log.py", line 80, in callWithContext
	    return context.call({ILogContext: newCtx}, func, *args, **kw)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/python/context.py", line 117, in callWithContext
	    return self.currentContext().callWithContext(ctx, func, *args, **kw)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/python/context.py", line 82, in callWithContext
	    return func(*args, **kw)
	--- <exception caught here> ---
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/internet/selectreactor.py", line 148, in _doReadOrWrite
	    why = getattr(selectable, method)()
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/internet/tcp.py", line 250, in doRead
	    return self._dataReceived(data)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/internet/tcp.py", line 255, in _dataReceived
	    rval = self.protocol.dataReceived(data)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 3184, in dataReceived
	    return self._channel.dataReceived(data)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/protocols/basic.py", line 549, in dataReceived
	    why = self.lineReceived(line)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 2330, in lineReceived
	    self.allContentReceived()
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 2463, in allContentReceived
	    req.requestReceived(command, path, version)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 1019, in requestReceived
	    self.process()
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/server.py", line 228, in process
	    self.processingFailed(failure.Failure())
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/server.py", line 416, in processingFailed
	    self.finish()
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/server.py", line 276, in finish
	    return http.Request.finish(self)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 1166, in finish
	    self.channel.factory.log(self)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/web/http.py", line 3328, in log
	    logFile.write(line.encode("utf8"))
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/app_packages/twisted/logger/_io.py", line 152, in write
	    text = message.decode(self._encoding)
	  File "/Users/uzor/workspace/briefcase_app/myapp/build/myapp/macos/app/Canary Castle.app/Contents/Resources/support/python-stdlib/encodings/utf_16_le.py", line 16, in decode
	    return codecs.utf_16_le_decode(input, errors, True)
	builtins.UnicodeDecodeError: 'utf-16-le' codec can't decode byte 0x0a in position 140: truncated data

Describe the correct behavior you'd like to see

Twisted's log mechanism should respect the encoding of the writeable object its trying to write to.

Testing environment

  • Operating System and Version; paste the output of these commands:
ProductName:		macOS
ProductVersion:		13.6.4
BuildVersion:		22G513
  • Twisted version: 24.3.0
@michelcrypt4d4mus
Copy link
Author

michelcrypt4d4mus commented May 20, 2024

I forked the codebase and put in the small change and everything seems OK for my use case though I'm too concerned about unforeseen side effects and unclear on how to test etc. to open a PR.

@adiroiban
Copy link
Member

Hi,

Thanks for the report.

I would say that the call should be just logFile.write(line)
Then, the logFile should do its magic to make sure the line is stored into the expected encoding.

And just open the log in text mode here

def _openLogFile(self, path):
"""
Override in subclasses, e.g. to use L{twisted.python.logfile}.
"""
f = open(path, "ab", 1)
return f


logFile._encoding is a private API


Don't worry to much about breaking things.

If you send a PR with good test coverate and for changes which pass on the existing tests, the PR should be good to merge.


As a side note, I don't know why the Twisted HTTP is not using the standard logging infrastructure, and instead uses a custom file.

I am using the server in production, but I have just overwritten the factory not to handle the logs in a separate file.

Thanks again!

@glyph
Copy link
Member

glyph commented Jun 4, 2024

Whew. OK, there are a couple of mixed-up issues here.

  • HTTPFactory really needs to delegate some of its logging responsibilities to a dedicated object, which can implement structured logging methods. NSLog is capable of outputting structured information, and it's a bit sad it has to log everything with "@" as its format string. In HTTP/3 you still need to log requests, but the server is not going to be a "Factory" at all.
  • why is the log file being opened in binary mode when we are encoding everything we are writing to it?

@glyph
Copy link
Member

glyph commented Jun 4, 2024

  • why does log silently discard its logs?

@michelcrypt4d4mus
Copy link
Author

thanks - i tested out the changes from with #12198 and my problems are solved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants