Web Sockets tutorial with simple Python server

The Landscape: HTML5

HTML5 is an emerging and in-flux client-side standard for developing web applications. It’s really more of a rich client platform specification than just a markup language, including the following slew of new features:

  • canvas for vector graphics
  • video and audio for multimedia
  • local offline storage
  • drag and drop operations
  • Web Socket API for bidirectional client-server communications
  • GeoLocation API
  • standard WYSIWYG HTML editor component
  • Web Workers API for message-passing processes
  • webcam and microphone access
  • 3D graphics rendering engine
  • and more…

A lot of this effort is about wrapping up and building into browsers native support for various (proprietary) technologies already in widespread use on the Web, such as Flash video/webcam/mic and Google Gears offline/drag-and-drop. Others are about cleaning up things for which there currently exist various hacks, and Web Sockets fall into this category.

Introducing Web Sockets

This Chromium blog post contains a nice introduction to Web Sockets:

The Web Sockets API enables web applications to handle bidirectional communications with server-side process in a straightforward way. Developers have been using XMLHttpRequest (“XHR”) for such purposes, but XHR makes developing web applications that communicate back and forth to the server unnecessarily complex. XHR is basically asynchronous HTTP, and because you need to use a tricky technique like long-hanging GET for sending data from the server to the browser, simple tasks rapidly become complex. As opposed to XMLHttpRequest, Web Sockets provide a real bidirectional communication channel in your browser. Once you get a Web Socket connection, you can send data from browser to server by calling a send() method, and receive data from server to browser by an onmessage event handler. A simple example is included below.

In addition to the new Web Sockets API, there is also a new protocol (the “web socket protocol”) that the browser uses to communicate with servers. The protocol is not raw TCP because it needs to provide the browser’s “same-origin” security model. It’s also not HTTP because web socket traffic differs from HTTP’s request-response model. Web socket communications using the new web socket protocol should use less bandwidth because, unlike a series of XHRs and hanging GETs, no headers are exchanged once the single connection has been established. To use this new API and protocol and take advantage of the simpler programming model and more efficient network traffic, you do need a new server implementation to communicate with, but don’t worry. We also developed pywebsocket, which can be used as an Apache extension module, or can even be run as standalone server.

(The mentioned technique of a long-hanging GET is also known as Comet.)

Chrome is presently the only browser that has Web Sockets, and only in the Dev channel releases. Firefox and Safari/WebKit support are under way, according to the implementation status page.

The Web Sockets Protocol

The protocol has the client and server do an HTTP-style handshake, where all text is in UTF-8 and each line ends with a carriage return followed by a line feed. After this, arbitrary data can be sent back and forth, but delimited in frames, which begin with a 0x00 byte and end with a 0xff byte. Contrast this with the byte stream abstraction presented by raw TCP: having the system hand you whole frames frees the application from having to manually buffer and parse messages out of the stream (which the browser may be able to do more efficiently).
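The framing described above can be sketched in a few lines of Python 3. This is only an illustration, assuming the payload of each frame is UTF-8 text; the helper names encode_frame and split_frames are made up for this example and do not come from any library.

```python
def encode_frame(text):
    """Wrap a text message in the 0x00 ... 0xff delimiters."""
    return b'\x00' + text.encode('utf-8') + b'\xff'


def split_frames(buffer):
    """Pull every complete frame out of a byte buffer.

    Returns (messages, remainder), where remainder holds the bytes of a
    frame that has not fully arrived yet.
    """
    messages = []
    while True:
        # 0xff never occurs inside valid UTF-8, so it is a safe terminator.
        end = buffer.find(b'\xff')
        if end == -1:
            break  # no complete frame left in the buffer
        if buffer[:1] != b'\x00':
            raise ValueError('frame does not start with 0x00')
        messages.append(buffer[1:end].decode('utf-8'))
        buffer = buffer[end + 1:]
    return messages, buffer
```

A receiver would append each recv() result to the leftover bytes and call split_frames again, so a message that arrives in pieces is only delivered once its 0xff terminator shows up.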

As for how this mixes with browser security policies, the basic gist is that the same-origin policy no longer applies. Requiring the Web Socket to communicate only with the same origin (same host and port as where the HTML/JavaScript came from) would be a barrier to deployment because it would require the httpd to additionally speak Web Sockets. (That said, the default port for Web Sockets is in fact port 80.) More generally, a same-origin restriction would prevent all cross-site communication, which is critical to many classes of applications such as mash-ups and widget dashboards.

But the protocol does require the browser to send the origin information to the server, the server to validate this by echoing the origin, and finally the client to validate that the server echoed this. According to the protocol specs, the response must include the exact same origin, location, and protocol as the request, where:

  • the origin is just the (protocol, host, port) triplet (http://foo.com:8080/),
  • the location is the target of the request (ws://bar.com:81/path/to/some/resource), and
  • the protocol is an arbitrary string used to identify the exact application-level protocol expected.

(Note that the origin is different from the Referer header, which includes the full resource path, thus leading to privacy concerns. I hope to write more on the Origin header in a broader context and client-side web security in general soon.)
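To make the echo rule concrete, here is a rough Python 3 sketch of how a server might build its handshake response from the client's request instead of hard-coding it. The function names (parse_request, build_response) and the allowed_origins parameter are hypothetical, and the assumption that the client names its subprotocol in a WebSocket-Protocol request header is mine; check the draft your browser implements before relying on it.

```python
def parse_request(raw):
    """Split a raw handshake request into (path, headers)."""
    head = raw.split(b'\r\n\r\n', 1)[0].decode('utf-8')
    lines = head.split('\r\n')
    method, path, _version = lines[0].split(' ', 2)
    if method != 'GET':
        raise ValueError('handshake must be a GET request')
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return path, headers


def build_response(raw, allowed_origins):
    """Validate the request and echo origin, location, and protocol back."""
    path, headers = parse_request(raw)
    origin = headers.get('origin')
    if origin not in allowed_origins:
        raise ValueError('origin not allowed: %r' % origin)
    if 'host' not in headers:
        raise ValueError('request has no Host header')
    lines = [
        'HTTP/1.1 101 Web Socket Protocol Handshake',
        'Upgrade: WebSocket',
        'Connection: Upgrade',
        'WebSocket-Origin: ' + origin,
        'WebSocket-Location: ws://' + headers['host'] + path,
    ]
    # Assumed request header name; only echoed when the client sent it.
    if 'websocket-protocol' in headers:
        lines.append('WebSocket-Protocol: ' + headers['websocket-protocol'])
    return ('\r\n'.join(lines) + '\r\n\r\n').encode('utf-8')
```

Rejecting unknown origins on the server side is what replaces the browser's same-origin check: the browser reports where the page came from, and the server decides whether to talk to it.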

Example Client and Server

To give you a flavor of how to write a complete end-to-end web application using Web Sockets, the following is a simple client and server application where the server sends two messages down to the client, “hello” and “world.” This example is from my sandbox.

The client-side API for Web Sockets is very simple. The example client just connects to a server on port 9876 and alerts the user of each new message. Just to make this a wholesome HTML5 experience, we’ll write everything in XHTML5 (yes, there exists a stricter, XML flavor of HTML5 for those who preferred XHTML over HTML tag soup):

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Web Socket Example</title>
    <meta charset="UTF-8" />
    <script>
      window.onload = function() {
        var s = new WebSocket("ws://localhost:9876/");
        s.onopen = function(e) { alert("opened"); };
        s.onclose = function(e) { alert("closed"); };
        s.onmessage = function(e) { alert("got: " + e.data); };
      };
    </script>
  </head>
  <body>
    <div id="holder" style="width:600px; height:300px"></div>
  </body>
</html>

Now for the server, which is written in Python. It sends the two messages, pausing for one second before each. Note that the server hard-codes its handshake response and expects a locally connected client; this is for simplicity and clarity. In particular, it requires that the client page is served from localhost:8888. A real server would parse the request, validate it, and generate an appropriate response.

#!/usr/bin/env python

import socket, threading, time

def handle(s):
  # Log the client's handshake request (not parsed in this simple example).
  print(repr(s.recv(4096)))
  # Hard-coded handshake response; each header line ends in \r\n.
  s.send(b'''
HTTP/1.1 101 Web Socket Protocol Handshake\r
Upgrade: WebSocket\r
Connection: Upgrade\r
WebSocket-Origin: http://localhost:8888\r
WebSocket-Location: ws://localhost:9876/\r
WebSocket-Protocol: sample
  '''.strip() + b'\r\n\r\n')
  # Each message is one frame: 0x00, the text, then 0xff.
  time.sleep(1)
  s.send(b'\x00hello\xff')
  time.sleep(1)
  s.send(b'\x00world\xff')
  s.close()

s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 9876))
s.listen(1)
# Serve each connection on its own thread.
while True:
  t, _ = s.accept()
  threading.Thread(target=handle, args=(t,)).start()

To run the above, start the Web Socket server (./server.py) and start a web server on port 8888 serving index.html:

./server.py &
python -m SimpleHTTPServer 8888

Then open http://localhost:8888/index.html in the browser. The page has to be loaded from the web server on port 8888, not from the Web Socket port. (With Python 3, the equivalent web server command is python3 -m http.server 8888.)

Further Exploration

For a more complete application (that’s still reasonably simple), I threw together a real-time X-Y scatter plotting application called Real-Time Plotter. It plots some number of data sources and supports streaming to multiple clients.

The Python server listens for data sources on port 9876. It expects a stream of text, where the first line is the name of the data source and each subsequent line contains a space-separated x-y pair of floating point numbers in the series to be plotted. It also listens on port 9877 for Web Socket clients. A simple data source that issues a random y-value per second can be started from the shell using netcat:

{
  echo 'my data source'
  i=0
  while true ; do
    echo "${i}000 $RANDOM"
    i=$((i + 1))
    sleep 1
  done
} | nc localhost 9876
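
The netcat data source above could also be written in Python 3. The following is a sketch under the wire format just described (name line first, then one "x y" pair per line); the function names format_point, random_points, and send_source are invented for this example and are not part of Real-Time Plotter.

```python
import random
import socket
import time


def format_point(x, y):
    """Render one point as the space-separated line the server expects."""
    return '%s %s\n' % (float(x), float(y))


def random_points(interval=1.0):
    """Yield (millisecond timestamp, random y) forever, one per interval."""
    while True:
        yield time.time() * 1000, random.randint(0, 32767)
        time.sleep(interval)


def send_source(sock, name, points):
    """Write the source name, then every point, to a connected socket."""
    sock.sendall((name + '\n').encode('utf-8'))
    for x, y in points:
        sock.sendall(format_point(x, y).encode('utf-8'))
```

To stream to a locally running plotter server, something like send_source(socket.create_connection(('localhost', 9876)), 'my data source', random_points()) should do; it runs until interrupted.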

The client page uses Web Sockets to connect to the server and fetch historical data, as well as start streaming new data. Plotting is done using Flot, a jQuery library for generating decent-looking plots. For throttling when the server is streaming new points quickly, the client only fetches new data (by sending an empty frame) after a complete redraw; the server responds by sending a batch of all new points since the last fetch. (Note: the server’s pump routine currently treats the x values as millisecond timestamps and only issues a single point per second, but this can be easily tweaked/removed.)
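
The fetch-then-batch throttling can be pictured with a small Python sketch. This is not the plotter's actual code, just one way to get the described behavior: the server keeps an append-only log per data source, and each client remembers how many points it has already received. The class name PointLog and its methods are hypothetical.

```python
import threading


class PointLog:
    """Append-only list of points shared by all clients of one data source."""

    def __init__(self):
        self._points = []
        self._lock = threading.Lock()

    def append(self, x, y):
        """Record a new point from the data source."""
        with self._lock:
            self._points.append((x, y))

    def fetch_since(self, cursor):
        """Return (batch, new_cursor) for a client that has seen `cursor` points."""
        with self._lock:
            batch = self._points[cursor:]
            return batch, cursor + len(batch)
```

Each time a client sends its empty frame, the server would call fetch_since with that client's cursor, send the batch, and store the returned cursor. A slow client simply gets larger batches rather than a backlog of individual messages.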

Web Sockets can also be used over TLS. This is done by using wss: instead of ws: in the URL, and this defaults to the HTTPS port 443.
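
As a small illustration of the default ports, here is a Python 3 helper that works out where a ws: or wss: URL points. The function name endpoint is made up for this sketch; it only uses urlparse from the standard library.

```python
from urllib.parse import urlparse

# Default ports when the URL does not name one explicitly.
DEFAULT_PORTS = {'ws': 80, 'wss': 443}


def endpoint(url):
    """Return (host, port, use_tls) for a ws: or wss: URL."""
    parts = urlparse(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError('not a Web Socket URL: %r' % url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.scheme == 'wss'
```

For example, ws://localhost:9876/ resolves to port 9876 without TLS, while wss://example.com/chat falls back to port 443 with TLS.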

  • Chung

    Wow, this is, like, a real blog post! Very nice.

  • chessweb

    Can’t get it to work. Server gives error message:

    new source GET /websock_3/ HTTP/1.1
    source GET /websock_3/ HTTP/1.1 stopped
    Exception in thread Thread-3:
    Traceback (most recent call last):
      File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.6/threading.py", line 477, in run
        self.__target(*self.__args, **self.__kwargs)
      File "./server.py", line 36, in handle_source
        x, y = map(float, line.split())
    ValueError: invalid literal for float(): Host:

    new source GET / HTTP/1.1
    source GET / HTTP/1.1 stopped
    Exception in thread Thread-4:
    Traceback (most recent call last):
      File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.6/threading.py", line 477, in run
        self.__target(*self.__args, **self.__kwargs)
      File "./server.py", line 36, in handle_source
        x, y = map(float, line.split())
    ValueError: invalid literal for float(): Upgrade:

    Any idea?

  • http://www.mit.edu/~y_z/ yang

    chessweb: For the real-time plotting application, the server is using port 9876 for data sources. It looks like you’re issuing Web Socket requests to that port; these should go to port 9877 instead. (This is different from the “hello world” example, which uses port 9876; sorry for the confusion.) I’ve updated my post with an example of a simple data source started from the shell using netcat. Hope this clears things up.

  • chessweb

    Neither the plotting app nor the hello world example work as expected. Here is what I do and what I get:

    Hello World:
    Start server.py and load index.html via localhost:9876/index.html
    Result: Chrome downloads index.html which contains helloÿ?worldÿ but does not open the page in the browser.

    Plotting app:
    Start server.py and load rtp.html via localhost:9877/rtp.html
    Result: Chrome waits for localhost indefinitely while the server says

    new sink ‘GET /websock_3/rtp.html HTTP/1.1\r\nHost: localhost:9877\r\nConnection: keep-alive\r\nUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.43 Safari/532.5\r\nAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nAccept-Encoding: gzip,deflate\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n’
    sent historical data

    When I close Chrome the server says ‘client closed’

    So something is happening, alright, but Chrome doesn’t show any plots.

  • http://www.mit.edu/~y_z/ yang

    If the text/html is not loading in your browser, then something is probably mis-configured with your web server. Note that you need to serve the page from port 8888 of your web server.

  • Jason Tsao

    Hey,
    I'm having a problem running the code (the first simple example).
    I first run the Websocket server and run the server:

    ./server.py
    python -m SimpleHTTPServer 8888

    then I run the client:
    google-chrome client.html

    then I get the following error:
    $ ./server.py
    'GET / HTTP/1.1\r\nUpgrade: WebSocket\r\nConnection: Upgrade\r\nHost: localhost:9876\r\nOrigin: null\r\n\r\n'
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.6/threading.py", line 477, in run
        self.__target(*self.__args, **self.__kwargs)
      File "./server.py", line 18, in handle
        s.send('\x00world\xff')
    error: [Errno 32] Broken pipe

    Sorry I'm new to this and am not too sure how to fix the problem. Any advice?

  • Jason Tsao

    Nevermind! I figured it out. Chessweb if you still care:
    it should be:
    localhost:8888/index.html
    where index.html is where the websocket html is stored.

  • Jason Tsao

    Also, I just would like to say thank you so much yang. So far, this is the only tutorial on websockets that I have gotten to work. I like the simple approach you take and it helped me a lot when seeing this stuff for the first time. Thanks!

  • yaaang

    Glad this was of help, Jason — and glad that you figured out the problem. :)

  • Jason Tsao

    Hey yang,

    I'm trying to run the Real Time plotter, but I am getting the same error as chessweb (the first error):

    I first run the server.py and the http server:
    ./server.py
    python -m SimpleHTTPServer 8888

    I then add the data source in the terminal:
    {
      echo 'my data source'
      while true ; do
        echo "${i}000 $RANDOM"
        sleep 1
      done
    } | nc localhost 9876

    and the ./server.py responds with the
    new source my data source

    and use google chrome to goto
    http://localhost:9877/rtp.html

    I get the following error from the websocket server (./server.py):

    new sink 'GET /rtp.html HTTP/1.1\r\nHost: localhost:9877\r\nConnection: keep-alive\r\nUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.9 Safari/533.2\r\nAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n'
    sent historical data

    When this happens, the website kind of just hangs. It keeps on trying to load something, but nothing ever happens. Do I need to install anything? Do you know what's wrong? Thank you!

    Also:

    I also tried:
    http://localhost:8888/rtp.html
    The websocket server did not respond (nothing happened), while the http server had the following error:

    localhost - - [07/May/2010 15:46:01] code 404, message File not found
    localhost - - [07/May/2010 15:46:01] "GET /favicon.ico HTTP/1.1" 404 -

    I think the first way was correct, but this is just in case.

  • Edenist

    hi yang, many thanks for the work on the tutorial!

    I have been having some troubles getting this to work, however.

    When I load up the page in chrome [http://localhost:8888/rtp.html], I get the following message on the server, and nothing actually appears on the webpage.

    Serving HTTP on 0.0.0.0 port 8888 …
    new source my data source
    localhost - - [08/Sep/2010 20:33:43] "GET /rtp.html HTTP/1.1" 200 -

    I am using chrome 6, and this does not seem to work on either linux (ubuntu x64) or windows [over my local network].
    The same seems to happen when I try the first example, I simply get a box which says “closed” on it….

  • Eddy

    I am having a similar problem to one posted below (on the simple example)
    I’m running the server as outlined above.
    When I navigate to the url http://localhost:8888 (or http://localhost:8888/index.html)

    The page loads with a 2 second delay and the 'closed' alert.

    stdout looks like
    -------8<--------------------8X416\r\nCookie: user=eddy\r\n\r\n\xda\xc2\xedi;\xbe/W'
    localhost - - [07/Jan/2011 17:58:15] code 404, message File not found
    localhost - - [07/Jan/2011 17:58:15] "GET /favicon.ico HTTP/1.1" 404 -
    ------------8<----------------8<---------
    As far as I can see I'm running the example exactly as described.
    Browser is Chrome 8.0.552.224 on Ubuntu

    Any advice appreciated.

  • Roger Erens

    Hi Eddy,

    Chrome is sending a different handshake (as indicated by the 'Sec-WebSocket-Key1' phrase) from what the server is expecting. The server is using the outdated draft-74 protocol, whereas your version of Chrome is probably using draft-76 (also outdated; currently I see
    http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-04 as the most recent draft, but I don’t know if it is or will be implemented in a browser already). Earlier today I got a draft-76 server and Chrome combo working with the help of
    http://atrueswordsman.wordpress.com/2010/12/01/python-websocket-server-simple-version/

    Mind the 127.0.0.1 vs. localhost issue.

    Cheers,

    Roger

  • James Mills / prologic

    Based on the code found in this blog (and other reading material on the web) circuits.web (1) as of revision fe3965b21fe2 now implements a working WebSockets Server:

    https://bitbucket.org/prologic/circuits/changeset/fe3965b21fe2

    This allows you to have a WebSockets server that sits alongside your application server (no need for a separate server, etc). This will be integrated into the circuits.web namespace before the next release.

    cheers
    James

    1. http://bitbucket.org/prologic/circuits/

  • Anonymous

    Yeap, things have changed. Thanks for replying, Roger. I really should either update this post or link to someone’s up-to-date tutorial…

  • Anonymous

    Interesting, thanks James. Is Circuits comparable to Twisted?

  • James Mills / prologic

    Hi yaaang, circuits has a different implementation and architecture than Twisted, but yes it does share similar features (eg: asynchronous).

  • Chehayeb Makram

    Hi, I have a Python server on my machine that binds a socket to its IP address and to a port that I chose (50001). I would like to build a simple web application that only lets me send a string from a WebSocket to my raw TCP socket. What should I do? I am actually trying to implement a web remote control for my Nao robot. Any help will be much appreciated.

  • Tobias Berthold

    Same here: this is the first websocket example I found that works. Thank you!!!