By giving access to the X server, he effectively gives access to all keyboard and mouse events, so recording the audio of the keystrokes is not necessary (though it could be another attack vector if Firefox could not be exploited for code execution).
Just to be clear: under normal circumstances, Firefox has all that access anyway, just like any other app running under X11.
Running it under a different user ID means that, if it is compromised, an extra step is required before it can access files, and that step could lead to the compromise being noticed.
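
To make the "extra step" concrete, here is a minimal sketch of the classic Unix owner/group/other read check. It is a simplified model, not the kernel's actual logic: it ignores ACLs, capabilities, and directory traversal permissions, and the UIDs 1000 (the main user) and 1001 (the browser user) are hypothetical values chosen for illustration.

```python
import stat

def can_read(mode, file_uid, file_gid, proc_uid, proc_gids):
    """Simplified Unix read check: owner bits, then group bits, then other bits."""
    if proc_uid == 0:
        return True  # root bypasses the permission bits
    if proc_uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# Hypothetical IDs: 1000 is the main user, 1001 is the dedicated browser user.
MAIN_UID, MAIN_GID = 1000, 1000
BROWSER_UID, BROWSER_GID = 1001, 1001

# A private file (mode 0600) owned by the main user.
same_user = can_read(0o600, MAIN_UID, MAIN_GID, MAIN_UID, {MAIN_GID})
other_user = can_read(0o600, MAIN_UID, MAIN_GID, BROWSER_UID, {BROWSER_GID})

# A world-readable file (mode 0644) is still exposed to the browser user.
world_readable = can_read(0o644, MAIN_UID, MAIN_GID, BROWSER_UID, {BROWSER_GID})

print(same_user, other_user, world_readable)
```

In this model, a browser running as the main user can read the 0600 file directly, while a browser running as the separate user cannot without first escalating privileges. Note that the separation only protects files that are not world-readable.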