Category Archives: feature article

Making sense of OpenID, OAuth, OpenSocial, Google Friend Connect, Facebook Connect, and more

Last Thursday I dropped in on the Google SIPB hackathon, where I got a chance to chat with several Googlers in the Cambridge office about the whole ecosystem of decentralized identity and social networking services. I had previously spent a bit of time, strictly out of curiosity, searching for a high-level map laying out how these various services relate to each other, but never found anything that was succinct, clear, and free of BS. There also seems to be a lot of contradictory information and general confusion. With the recent news that Twitter is expected to roll out a similar service stack, it seems timely to share what I’ve been learning.

The executive summary:

  • OpenID: authentication; use one login across many sites
  • OpenID Attribute Exchange: a key-value store protocol for OpenID
  • OAuth: authorization and authentication; grant sites specific access to your account
  • OAuth WRAP: a simpler OAuth that leverages PKI instead of its own signature scheme
  • OpenSocial: a standard API into social networks
  • Google Friend Connect: an OpenSocial router, plus a bunch of other less-important stuff
  • Facebook Platform: all the above (and a bit more), for the Facebook stack
  • Facebook Connect: establish links between Facebook and third-party user account systems
  • Portable Contacts: just the slice of OpenSocial that deals with contacts


Web Sockets tutorial with simple Python server

The Landscape: HTML5

HTML5 is an emerging and in-flux client-side standard for developing web applications. It’s really more of a rich client platform specification than just a markup language, including the following slew of new features:

  • canvas for vector graphics
  • video and audio for multimedia
  • local offline storage
  • drag and drop operations
  • Web Socket API for bidirectional client-server communications
  • GeoLocation API
  • standard WYSIWYG HTML editor component
  • Web Workers API for message-passing processes
  • webcam and microphone access
  • 3D graphics rendering engine
  • and more…

A lot of this effort is about building into browsers native support for various (proprietary) technologies already in widespread use on the Web, such as Flash video/webcam/mic and Google Gears offline storage/drag-and-drop. Other parts are about cleaning up things for which various hacks currently exist, and Web Sockets fall into this category.

Introducing Web Sockets

This Chromium blog post contains a nice introduction to Web Sockets:

The Web Sockets API enables web applications to handle bidirectional communications with server-side process in a straightforward way. Developers have been using XMLHttpRequest (“XHR”) for such purposes, but XHR makes developing web applications that communicate back and forth to the server unnecessarily complex. XHR is basically asynchronous HTTP, and because you need to use a tricky technique like long-hanging GET for sending data from the server to the browser, simple tasks rapidly become complex. As opposed to XMLHttpRequest, Web Sockets provide a real bidirectional communication channel in your browser. Once you get a Web Socket connection, you can send data from browser to server by calling a send() method, and receive data from server to browser by an onmessage event handler. A simple example is included below.

In addition to the new Web Sockets API, there is also a new protocol (the “web socket protocol”) that the browser uses to communicate with servers. The protocol is not raw TCP because it needs to provide the browser’s “same-origin” security model. It’s also not HTTP because web socket traffic differs from HTTP’s request-response model. Web socket communications using the new web socket protocol should use less bandwidth because, unlike a series of XHRs and hanging GETs, no headers are exchanged once the single connection has been established. To use this new API and protocol and take advantage of the simpler programming model and more efficient network traffic, you do need a new server implementation to communicate with — but don’t worry. We also developed pywebsocket, which can be used as an Apache extension module, or can even be run as standalone server.

(The mentioned technique of a long-hanging GET is also known as Comet.)

Chrome is presently the only browser that has Web Sockets, and only in the Dev channel releases. Firefox and Safari/WebKit support are under way, according to the implementation status page.

The Web Sockets Protocol

The protocol has the client and server perform an HTTP-style handshake, in which all text is UTF-8 and lines end with a carriage return and line feed. After that, arbitrary data can be sent back and forth, delimited into frames that begin with a 0x00 byte and end with a 0xff byte. Contrast this with the byte-stream abstraction presented by raw TCP: having the system hand you whole frames frees the application from manually buffering and parsing messages out of the stream (which the browser may be able to do more efficiently).
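
To make the framing concrete, here’s a minimal Python 2 sketch (matching the example server later in this post) of how a receiver might split a TCP byte stream into frames; recv_frames is a made-up helper, not part of any API:

def recv_frames(sock):
  # Accumulate bytes until a complete 0x00 ... 0xff frame is available,
  # then yield its payload. This is the draft framing described above.
  buf = ''
  while True:
    data = sock.recv(4096)
    if not data:
      return
    buf += data
    while '\xff' in buf:
      frame, buf = buf.split('\xff', 1)
      assert frame.startswith('\x00')
      yield frame[1:]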

As for how this mixes with browser security policies, the basic gist is that the same-origin policy no longer applies. Requiring the Web Socket to communicate only with the same origin (same host and port as where the HTML/Javascript came from) would be a barrier to deployment because it would require the httpd to additionally speak Web Sockets. (That said, the default port for Web Sockets is in fact port 80.) More generally, this prevents all cross-site communication, which is critical to many classes of applications such as mash-ups and widget dashboards.

But the protocol does require the browser to send the origin information to the server, the server to validate this by echoing the origin, and finally the client to validate that the server echoed this. According to the protocol specs, the response must include the exact same origin, location, and protocol as the request, where:

  • the origin is just the (protocol, host, port) triplet (http://foo.com:8080/),
  • the location is the target of the request (ws://bar.com:81/path/to/some/resource), and
  • the protocol is an arbitrary string used to identify the exact application-level protocol expected.

(Note that the origin is different from the Referrer, which includes the full resource path, thus leading to privacy concerns. I hope to write more on the Origin header in a broader context and client-side web security in general soon.)
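
To make the handshake concrete, here’s a minimal Python 2 sketch that builds a response by echoing these values back, using the draft header names that appear in the example server below. handshake_response and its arguments are made-up names, and a real server would also validate the origin (e.g., against a whitelist) before answering:

def handshake_response(origin, location, protocol='sample'):
  # Echo back the client's origin and the requested location; the client
  # rejects the connection if these don't match what it sent.
  return ('HTTP/1.1 101 Web Socket Protocol Handshake\r\n'
          'Upgrade: WebSocket\r\n'
          'Connection: Upgrade\r\n'
          'WebSocket-Origin: %s\r\n'
          'WebSocket-Location: %s\r\n'
          'WebSocket-Protocol: %s\r\n'
          '\r\n' % (origin, location, protocol))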

Example Client and Server

To give you a flavor of how to write a complete end-to-end web application using Web Sockets, the following is a simple client and server application where the server sends two messages down to the client, “hello” and “world.” This example is from my sandbox.

The client-side API for Web Sockets is very simple. The example client just connects to a server on port 9876 and alerts the user of each new message. Just to make this a wholesome HTML5 experience, we’ll write everything in XHTML5 (yes, there exists a stricter, XML flavor of HTML5 for those who preferred XHTML over HTML tag soup):

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Web Socket Example</title>
    <meta charset="UTF-8" />
    <script>
      window.onload = function() {
        var s = new WebSocket("ws://localhost:9876/");
        s.onopen = function(e) { alert("opened"); }
        s.onclose = function(e) { alert("closed"); }
        s.onmessage = function(e) { alert("got: " + e.data); }
      };
    </script>
  </head>
  <body>
    <div id="holder" style="width:600px; height:300px"></div>
  </body>
</html>

Now for the server, which is written in Python. It sends the two messages after a one-second delay before each. Note that the server is hard-coding the response to expect a locally connected client; this is for simplicity and clarity. In particular, it requires that the client is being served from localhost:8888. A real server would parse the request, validate it, and generate an appropriate response.

#!/usr/bin/env python
# Minimal server for the draft Web Socket protocol (Python 2).

import socket, threading, time

def handle(s):
  # Print the client's handshake, then send back a hard-coded handshake
  # response (a real server would parse and validate the request).
  print repr(s.recv(4096))
  s.send('''
HTTP/1.1 101 Web Socket Protocol Handshake\r
Upgrade: WebSocket\r
Connection: Upgrade\r
WebSocket-Origin: http://localhost:8888\r
WebSocket-Location: ws://localhost:9876/\r
WebSocket-Protocol: sample
  '''.strip() + '\r\n\r\n')
  # Send two framed messages (0x00 ... 0xff), one second apart, then close.
  time.sleep(1)
  s.send('\x00hello\xff')
  time.sleep(1)
  s.send('\x00world\xff')
  s.close()

s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(('', 9876))
s.listen(1)
while 1:
  t, _ = s.accept()
  threading.Thread(target=handle, args=(t,)).start()

To run the above, start the Web Socket server (./server.py) and start a web server on port 8888 serving index.html:

./server.py &
python -m SimpleHTTPServer 8888

Further Exploration

For a more complete application (that’s still reasonably simple), I threw together a real-time X-Y scatter plotting application called Real-Time Plotter. It plots some number of data sources and supports streaming to multiple clients.

The Python server listens for data sources on port 9876. It expects a stream of text, where the first line is the name of the data source and each subsequent line contains a space-separated x-y pair of floating point numbers in the series to be plotted. It listens also on port 9877 for Web Socket clients. A simple data source that issues a random y-value per second can be started from the shell using netcat:

{
  echo 'my data source'
  i=0
  while true ; do
    echo "${i}000 $RANDOM"
    i=$((i+1))
    sleep 1
  done
} | nc localhost 9876
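
For reference, here’s a minimal sketch of how a server might read one of these data-source connections; handle_source and points are made-up names, not the actual Real-Time Plotter code:

def handle_source(conn, points):
  # First line: the data source's name; each subsequent line: "x y".
  f = conn.makefile()
  name = f.readline().strip()
  for line in f:
    x, y = map(float, line.split())
    points.setdefault(name, []).append((x, y))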

The client page uses Web Sockets to connect to the server and fetch historical data, as well as start streaming new data. Plotting is done using Flot, a jQuery library for generating decent-looking plots. For throttling when the server is streaming new points quickly, the client only fetches new data (by sending an empty frame) after a complete redraw; the server responds by sending a batch of all new points since the last fetch. (Note: the server’s pump routine currently treats the x values as millisecond timestamps and only issues a single point per second, but this can be easily tweaked/removed.)
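
This isn’t the actual Real-Time Plotter code, but the batching idea can be sketched as follows (Python 2; send_frame, recv_frame, serve_client, and the shared points list are made-up names):

import json

def send_frame(sock, payload):
  # Draft Web Socket framing: 0x00 <utf-8 payload> 0xff.
  sock.send('\x00' + payload + '\xff')

def recv_frame(sock):
  # Read one frame; here we only care that the client sent something.
  buf = ''
  while not buf.endswith('\xff'):
    data = sock.recv(4096)
    if not data:
      raise EOFError
    buf += data
  return buf[1:-1]

def serve_client(sock, points):
  # Each plotting client remembers how many points it has already seen;
  # on every (empty) fetch frame, send the batch of new points as JSON.
  sent = 0
  while True:
    recv_frame(sock)
    batch = points[sent:]
    sent = len(points)
    send_frame(sock, json.dumps(batch))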

Web Sockets can also be used over TLS. This is done by using wss: instead of ws: in the URL, and this defaults to the HTTPS port 443.

HOWTO: Crack Yahoo’s Intranet

Ben Reed gave me a run-down of how he managed to win Yahoo’s internal Crack Day by getting into the corporate network and making commits to their code repositories, all using off-the-shelf tools and without writing a line of code. These flaws have been fixed by now.

The first step is to associate with the wireless network. The network is secured using WEP, which is straightforward to crack using WEPCrack. This can be done from anywhere near the campus premises, and it took Ben at most 30 minutes.

Yahoo uses Cisco VPN and Aruba VPN. Cisco VPN turns out to be IPsec with extensions, and Yahoo’s is configured to use pre-shared keys (as opposed to certificates/PKI). It’s also configured to use aggressive mode for a faster three-packet initial key exchange, as opposed to main mode, which uses six packets but is more secure. From the excellent NIST Guide to IPsec VPNs:

…unlike main mode, aggressive mode can be used with pre-shared secret key authentication for hosts without fixed IP addresses. However, with the increased speed of aggressive mode comes decreased security. Since the Diffie-Hellman key exchange begins in the first packet, the two parties do not have an opportunity to negotiate the Diffie-Hellman parameters. Also, the identity information is not always hidden in aggressive mode, so an observer could determine which parties were performing the negotiation. (Aggressive mode can conceal identity information in some cases when public keys have already been exchanged.) Aggressive mode negotiations are also susceptible to pre-shared key cracking, which can allow user impersonation and man-in-the-middle attacks. Another potential issue is that while all IPsec devices must support main mode, aggressive mode support is optional. Unless there are performance issues, it is generally recommended to use main mode for the phase one exchange.

The flaws were pointed out back in 1999:

Aggressive mode does not usually provide identity protection, as this option is not required to be implemented. The identities can be exchanged in the clear, before a common shared-secret has been established. This is considered a feature for mobile users. Yet it is mobile users who are most likely to be affected by eavesdropping on wireless links. Such revealed identities are long-term liabilities. Compromised identities continue to be useful to an adversary until all participants have revoked the associated permissions. Identity attacks are extremely easy and may be mounted from anywhere on the Internet. Moreover, the revealed identities might be encrypted in other exchanges. This provides a ripe opportunity for cryptanalysis of those exchanges. This fundamental design flaw is inherent in the specification, and remediation will require removal of the aggressive mode feature.

Windows laptops are required to use the Aruba VPN client, of which we know nothing. Only Macs use the Cisco VPN client, so we need to find some Macs; this can be done with passive OS fingerprinting tools like p0f by Michal Zalewski.

Now we can use ARP spoofing to fool the Macs into thinking we’re the gateway and sending all their packets through us. Ben used ARPoison. Make sure you’ve activated IP forwarding on your system so that you actually route packets.

Once we’re the man in the middle, we can use FakeIKEd to nab the credentials:

FakeIKEd, or fiked for short, is a fake IKE daemon supporting just enough of the standards and Cisco extensions to attack commonly found insecure Cisco PSK+XAUTH VPN setups in what could be described as a semi MitM attack. Fiked can impersonate a VPN gateway’s IKE responder in order to capture XAUTH login credentials; it doesn’t currently do the client part of full MitM.

One problem is that you need to tell fiked the shared key. You can get this by grabbing a copy of Yahoo’s VPN client from Yahoo Frontyard, which is Yahoo’s external-facing site for employees. However, the site requires a valid login and is served over HTTPS. The trick is that logging in installs an authenticator cookie which is also sent along with non-HTTPS requests to other yahoo.com sites, so it can be sniffed from an employee’s unencrypted traffic and replayed to log in to Frontyard as that employee.

Note that the VPN system has since been reconfigured, and now uses certificates/PKI to avoid MITM attacks. Furthermore, the VPN authentication has been augmented with RSA SecurID, which provides a rolling token for two-factor authentication. This complicates the attack, though SecurID is still vulnerable to MITM attacks executed within the appropriate timeframe.

Once you’re in the corporate network, you can mount user home directories which are exported over NFS. Permissions are not enforced over NFS (which is designed to be used by a trusted set of hosts, but apparently the NFS servers here don’t use any such list of hosts), so you can assume any user ID and touch anyone’s files—including their SSH authorized_keys file. Simply drop your public key in ~filo/.ssh/authorized_keys, and you can now log in as David Filo and (among other things) commit code.

Yahoo has since disabled the ability for sshd to use users’ authorized_keys files and instead has a separate mechanism for adding public keys.

Summary

  • preliminary: get the VPN shared key
    • find a Yahoo employee at a cafe and sniff his cookies to yahoo.com sites
    • this traffic includes the cookies to Yahoo Frontyard, Yahoo’s external-facing site for employees
    • log in to Yahoo Frontyard as this employee and grab the VPN shared key
  • crack the wireless network WEP on the Yahoo campus
  • find a Mac using OS fingerprinting; Macs use the weak Cisco VPN
  • become the gateway router for the Mac by spoofing ARPs, thus causing all their IP packets to go through you
  • (using the VPN shared keys stolen earlier) interpose as the fake VPN server, thus stealing the user’s credentials and gaining access to the corporate network
  • mount an NFS-exported user directory, and inject your key into someone’s .ssh/authorized_keys

Cooperative threads for C/C++

Event-based programming is a PITA. In C/C++, there are a number of libraries that provide cooperative threads (a.k.a. fibers or lightweight threads or…), which can make single-kernel-threaded asynchronous programming less painful.

I’ve been using the solid, fast, and simple State Threads library. I haven’t used Pth, nor do I have a good sense of the performance difference, but that appears to be the only other option worth considering. There exist other libraries, and you can find links to most of them from the Pth links page, but they aren’t as useful out of the box. To mention a few:

  • libcoro, libcoroutine, and libpcl (which supersedes coro) don’t provide IO, only basic coroutine primitives, and don’t appear to be widely used or supported.
  • libtask provides IO, but the implementation uses the slower ucontext.h primitives, the IO functions aren’t very rich (they don’t appear to offer timeouts), and again the library doesn’t appear to be widely used or supported.

Pitfalls: tooling, composition, memory

Using something like this in your next project is always a bit of a risk. Your code will almost certainly be more natural to express and easier to maintain, but you may encounter incompatibilities with other systems/libraries/tools. For instance, one major issue is that ST is (or at least has been at some point in the past) incompatible with system threads. I haven’t stress-tested this, but mixing the two hasn’t produced any problems for me yet. (The problem, if you’re curious, appears to be due to both pthreads and ST attempting to use the same location to store thread-local data.) Other tools that I’ve found to not play nicely with ST include Valgrind and various CPU profilers (gprof, Google perftools, etc.). gdb works fine, though.

A downside to using cooperative threads instead of events is that you can’t compose together multiple asynchronous operations without using multiple stacks. E.g., if I wanted to listen for the first of n sockets to yield something to read, and I wanted to do this using only the provided primitive IO functions, I’d need to perform a read from each of n threads (and have the winner send interrupts to the losers). To do this more efficiently, I’d need to write my own IO functions, ones that cannot be expressed using the threaded abstraction, but must instead fall back to raw asynchronous operations.

In practice, I haven’t had to do much arbitrary composition. Composing IO operations with sleeps (in other words, adding timeouts to IO operations) is important, but that tends to be built in to the IO functions already provided by the threading library. Composing together inter-thread signals and IO operations is also important (e.g. for interrupts), but interrupts also tend to be already provided by the library.

Another downside is that events may be more scalable/flexible in terms of memory consumption; this is one of the main reasons why (system) threads are generally regarded as less scalable than events. A stack requires contiguous memory, whereas event state objects are dynamically heap-allocated, so with events you needn’t worry about providing enough room in advance for each “stack.” Conservative over-estimates of stack size occupy more space than necessary, so when juggling hundreds of thousands of connections, you may exhaust main memory and incur excessive swapping (thrashing).

Variations: Tamer, Capriccio

Tamer is a preprocessor that lets you annotate functions that are expected to perform asynchronous operations as “tamed”. You also annotate which “stack” variables are used across these calls. Tamer transforms these tamed functions into objects with the appropriate callbacks, and it changes the “stack” variables into fields so that they persist across callbacks.

The Tamer approach lets you compose events without the stack expense, but every time you enter/exit a tamed function, a state object is dynamically allocated/freed. For another approach, Capriccio is a system from Berkeley that introduces a simple idea: use linked stacks, or dynamically growable stacks. I hope we eventually see this concept used in a maintained system that actually builds and is usable on modern platforms.

Also, Tamer’s annotation requirement is both a blessing and a curse: annotations clearly identify where in your program you may be preempted, but of course this reduces flexibility, since you can’t (say) embed an asynchronous operation into certain places, such as inside a streambuf::underflow() (the method that pulls more data from the streambuf’s underlying stream source).

In an ideal world, you wouldn’t need to annotate functions; instead, whether a function is asynchronous would be inferred by tools (starting from a root set of primitive asynchronous operations), so that your IDE could highlight which functions are asynchronous. This would give you the freedom to change what’s asynchronous and what isn’t without propagating annotation changes throughout your code, while staying cognizant of where you might be preempted and where you definitely won’t. This only works given enough static assumptions, though: hiding asynchronous IO operations behind any form of dynamic dispatch/polymorphism would preclude an analysis that is both accurate and useful.

Lazy C++

Ever been fed up with having to separate declaration from definition in C++? Annoyed by the need to forward-declare your types/variables/functions, or at least write them in dependency-dictated order? Sick of keeping header files in sync with the source? With Lazy C++, you can leave your frustrations behind! Simply plop both your declarations and your definitions into a single file, and Lazy C++ will automagically tease the code apart into a standard header file and source file. The website and documentation provide self-explanatory examples:

For example, given the following code:

// A.lzz
class A
{
public:
  inline void f (int i) { ... }
  void g (int j = 0) { ... }
};
bool operator == (A const & a1, A const & a2) { ... }

Lzz will generate a header file:

// A.h
#ifndef LZZ_A_h
#define LZZ_A_h

class A
{
public:
  void f (int i);
  void g (int j = 0);
};
inline void A::f (int i) { ... }
bool operator == (A const & a1, A const & a2);
#endif

And a source file:

// A.cpp
#include "A.h"

void A::g (int j) { ... }
bool operator == (A const & a1, A const & a2) { ... }

Lazy C++ is a simple idea, but its implementation is at best tedious, requiring the parsing of a non-trivial subset of C++. It gets things right, though. It knows, for instance, to place inlines and templates into headers, to move method and function bodies out of the header when possible, and to restrict static declarations to the source. It also has escape hatches to let you directly place code into the header or source (this is particularly useful when using macros like unit test definitions). It tries to leave function bodies unparsed, which provides flexibility in allowing for C++ language extensions (such as clamp’s lambda expressions).

D—whose primary implementation recently had its source released (albeit not fully free nor open-source)—takes things a step further:

Features To Drop [from C/C++]

  • Forward declarations. C compilers semantically only know about what has lexically preceded the current state. C++ extends this a little, in that class members can rely on forward referenced class members. D takes this to its logical conclusion, forward declarations are no longer necessary at the module level. Functions can be defined in a natural order rather than the typical inside-out order commonly used in C programs to avoid writing forward declarations.
  • Include files. A major cause of slow compiles as each compilation unit must reparse enormous quantities of header files. Include files should be done as importing a symbol table.

While I’m on it: D is a pretty interesting language to follow, for a number of reasons. It has the feel of being a sensible evolution and clean-up of C++, and it’s the only thing out there besides C/C++ that one can pick up and start using right away as a systems programming language with (accessible) manual memory management. D is actually being used/gathering libraries within a growing community. Also, the language has some pretty good people working on its design, including C++ experts like Andrei Alexandrescu.

The C++ Lambda Preprocessor

Tired of waiting around for anonymous closures to arrive in C++0x (or at least in GCC)? Boost.Lambda and other hacks leaving you pining for something with free variable capture? Get them today with the amazing C++ Lambda Preprocessor (“clamp”)!

Here’s a little demo illustrating the use of lambdas with threads—something which may be more interesting to systems programmers than the simple accumulator examples on the clamp homepage.

#include <boost/thread.hpp>
#include <iostream>

using namespace boost;
using namespace std;
#include "lambda_impl.clamp_h"

enum { n = 3 };
int main() {
  const string msgs[n] = {"first", "second", "third"};
  mutex m;
  for (int i = 0; i < n; ++i) {
    boost::thread(lambda() {
      mutex::scoped_lock l(__ref(m));
      cout << "message from thread " << __ctx(i)
           << ": " << __ref(msgs[i]) << endl;
    });
  }
  return 0;
}

// clamp < clamp2.cc.clamp | sed '1d' > clamp2.cc
// g++     clamp2.cc  -lboost_thread-gcc43-mt -o clamp2
// ./clamp2

There are a few annoyances with clamp:

  • All free variables must be explicitly captured with __ref() or __ctx(). This is because clamp doesn’t try to actually parse your C++ (understandably). (Not parsing the C++ also makes it more flexible, accommodating various language extensions.) Contrast this with C++0x, where you can use short-hand notations like [&], which captures all free variables by reference (perhaps the most common case).
  • You will most likely need to muck with the output. clamp generates a number of header files that form the first #include in your source file. However, if you ever refer to global types/variables/functions, you will need to reorder things, add forward declarations, separate the generated header contents into declarations and definitions, etc. This is made nearly trivial, however, if you also use Lazy C++, which I’ll cover next time.

Despite these minor quibbles, clamp is simple and effective. The generated code is straightforward—pretty much exactly what you would expect to write yourself by hand.

unique_ptr and nullptr for C++03

(I’ve been using C++ a bunch lately, so the next several posts will probably be pretty C++-centric.)

unique_ptr is the future. And the future is now! Start using unique_ptr today with this little gem, a C++03-compatible implementation. Of course, you get fancy new C++0x move semantics, pioneered by auto_ptr, but now safer, saner, more widely usable (e.g. in containers), etc. unique_ptr also supports arrays without a separate unique_array class: unique_ptr<T[]> is template-specialized. (I won’t spend this post describing unique_ptr; the provided links already do a much better job at that.)

Some more opinion from the C++03-compatible unique_ptr page:

A const unique_ptr may be considered a better scoped_ptr than even boost::scoped_ptr. The only way to transfer ownership away from a const unique_ptr is by using const_cast. Unlike scoped_ptr, you can’t even swap const unique_ptr’s. The overhead is the same (if using the default_delete or an empty custom deleter – sizeof(unique_ptr<T>) == sizeof(scoped_ptr<T>)). And custom deleters are not even a possibility with scoped_ptr.

(Note BTW that boost::interprocess::unique_ptr is not the same as the C++0x unique_ptr.)

In general, you can already achieve move semantics without rvalue-references, just by using const_cast (or mutable). This is indeed how this unique_ptr implementation works. The core idea illustrated:

template<typename T>
class moveable_ptr {
  T *p;

  // Disable copying from lvalues.
  moveable_ptr(moveable_ptr<T> &o) : p(nullptr) {}

public:
  moveable_ptr(T *p) : p(p) {}

  // In C++0x:
  // moveable_ptr(moveable_ptr<T> &&o) : p(o.p) {
  //   o.p = nullptr;
  // }

  // In C++03, settle for:
  moveable_ptr(const moveable_ptr<T> &o) : p(o.p) {
    // Cast away the source's constness so we can null out its pointer.
    const_cast<T *&>(o.p) = nullptr;
  }

  ~moveable_ptr() { if (p) delete p; }
  T *get() const { return p; }
  ...
};

Notice above that I use nullptr instead of 0 or NULL—that’s another C++0x feature you can begin using right away, and without any hackery at all.

These goodies are all collected in my growing C++ Commons library.

C++ copy elimination

What does the following C++ program print?

#include <iostream>
using namespace std;

class A {
  public:
    A()           { cout << "A()" << endl; }
    A(const A &a) { cout << "A(A)" << endl; }
};
A g() { A a; return a; }
int main() { A a(g()); }

The answer: it depends on what compiler you’re using. You might expect:

A()
A(A)

With most full-featured C++ compilers, however (including GCC), the copy is elided and the constructor is called only once:

A()

This turns out to be an “optimization” that is explicitly mentioned in the language standard: it is one of the cases where C++ implementations are allowed to forgo the straightforward copying semantics. From section 12.8.14 of the C++98 standard:

Whenever a temporary class object is copied using a copy constructor, and this object and the copy have the same cv-unqualified type, an implementation is permitted to treat the original and the copy as two different ways of referring to the same object and not perform a copy at all, even if the class copy constructor or destructor have side effects. For a function with a class return type, if the expression in the return statement is the name of a local object, and the cv-unqualified type of the local object is the same as the function return type, an implementation is permitted to omit creating the temporary object to hold the function return value, even if the class copy constructor or destructor has side effects. In these cases, the object is destroyed at the later of times when the original and the copy would have been destroyed without the optimization.

This is described more elsewhere in the spec, with examples (e.g. 12.2.2). The MSDN Library also has an article on named return value optimization.

In the end, it’s nice to be able to write more expressive/higher-level-looking code without worrying about unnecessary copies being made. However, you may have to pay attention to certain cases where your ctors/dtors have side effects you’re relying on, and it may be subtle to figure out when copies are or aren’t actually being eliminated.

More experimentation with this feature can be found in this source file, part of my ever-growing sandbox of toy code.

Thanks to SIPB and ##c++ for the discussion.

Practical media downloading

A random assortment of tips for getting stuff with some mix of expedience, safety, and broad coverage.

  • For single tracks, start with Skreemr. It doesn’t manage to find that much, unfortunately, but it’s a quick check. Other similar services include BeeMP3 and SeeqPod, which have (IMO) clumsier UIs. Once you’ve checked with Skreemr, just try Googling for the artist, title, and “mp3” – this usually works, esp. for popular tracks, but is less slick than Skreemr since you’ll probably land at a site where you need to satisfy some captcha, run a gauntlet of ads or gaudy web designs, wait some number of seconds, find a broken link, and retry with the next few hits.
  • Alternatively, download full albums. Even if you’re after just a single track, if you can’t find it, you may be able to find its album. You can do this by Googling for the artist and album name alongside the names of popular “private” file-sharing services (“rapidshare | megaupload | …”). These file hosting services are actually good for finding large media files in general, including movies and TV shows. I have a little script for making these kinds of searches.
  • You can always resort to BitTorrent or a file-sharing network if you’re feeling a bit more promiscuous. If you’re on Windows, one of the better file-sharing clients is Shareaza, which supports Gnutella, ed2k, and “Gnutella2.” For BitTorrent, Shareaza’s support wasn’t that great when I tried. I’ve heard good things about uTorrent, which is probably the most popular client on Windows, but I’ve only ever used the cross-platform Vuze (formerly Azureus). Vuze sports an awesome search interface that lets you breezily grab torrents from some of the popular sites out there including btjunkie and mininova.
  • When engaging in promiscuous file-sharing, try to operate from a WLAN for which logs are not kept (e.g., StataCenter). For a general public WLAN, you can fiddle with your MAC address and minimize concurrent traffic from the same host (esp. anything that could identify you).
  • You can use an IP filter (“firewall”) to prevent communication with certain hosts. BlueTack maintains popular blocklists. A paper from UCR called “P2P: Is Big Brother Watching You?” (Ars Technica article) concludes that users will exchange data with blocklisted users 100% of the time, and that blocking just 5 IPs reduces this to 1%.
  • Blocklist managers are optimized to filter large numbers/ranges of IPs. PeerGuardian 2 and moblock are popular blocklist managers for Windows and Linux, respectively.
  • Should you be simply unable to find the song anywhere but a streaming source, such as Songza, Last.fm, MySpace, the artist’s website, etc., then just capture your system audio output while playing it from the streaming source. (Streaming audio quality tends to be lower, though.)
  • Find iTunes shares and public file shares (CIFS/SMB, FTP, etc.) on your LAN. Might work well if you’re in something like a dorm or frat setting, depending on the network configuration. I don’t know if there’s some working continuation of myTunes or if ourTunes still works, but those are places to start looking for iTunes pulling.
  • For TV shows, another possibility is to just watch them streamed from the web. You can find a lot on YouTube (though certain videos might just be around for short windows of time). Sidereel is a community that organizes links to these videos into shows and episodes.

Updated 7/21/2009: added note on Googling for single tracks.

Web form i18n

By default, if you submit a web form containing certain non-ASCII (Unicode) characters, the browser sends ambiguous encodings of these characters, such that completely different text appears indistinguishable to the server.

An Example

For instance, if you enter the Chinese character for “one” into some text box in a web form:

一

the raw characters that the server receives off the socket are:

%26%2319968%3B

which is the URL encoding of the NCR of that Unicode character:

&#19968;

The NCR, or numeric character reference, is just an HTML entity that refers to the numeric ID of a Unicode character; this character’s numeric ID is 19968.

Now if you instead wrote that NCR literally into the same text box:

&#19968;

the raw characters that the server receives off the socket are again:

%26%2319968%3B

Notice that it is impossible for the server to determine whether the user had typed in a Unicode character or the NCR thereof. Ideally, we want the server to see the direct URL encoding of the Unicode character (not its NCR):

%E4%B8%80

Note: we always need URL encoding because that’s how HTML form data is transmitted. HTTP is a (mostly) ASCII protocol, and form data also needs to be delimited (key=value&key=value&... means that at least = and & characters must be URL encoded).
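
You can reproduce both encodings from a Python 2 prompt (the character’s code point is 19968, i.e. U+4E00):

import urllib

print urllib.quote(u'\u4e00'.encode('utf-8'))  # %E4%B8%80 (URL-encoded UTF-8 bytes)
print urllib.quote('&#19968;')                 # %26%2319968%3B (URL-encoded NCR)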

Unsuitable Charsets

Why is the browser sending the URL encoding of the NCR instead of the URL encoding of the original character? Because the browser is using an unsuitable character set. By default, the browser uses an ASCII charset, meaning it assumes the server only handles ASCII characters. When the user enters a non-ASCII character, the browser can’t represent it in that charset, but instead of failing, it silently makes a best effort to capture the character in ASCII by sending its NCR.

This implicit and ambiguous encoding behavior is AFAICT undefined, but all browsers I’ve tried seem to do this. More relevant information can be found in the HTML 4 specs on forms.

Specifying Charsets

One way to specify a suitable charset (e.g., UTF-8) that the browser should use for submitting data is to add

accept-charset="UTF-8" enctype="multipart/form-data"

to your form element. According to the specs:

The content type “multipart/form-data” should be used for submitting forms that contain files, non-ASCII data, and binary data.

This means that your form data doesn’t need to get URL encoded. Leaving the enctype out, so that the URL encoding of the UTF-8 encoding is sent over the wire, is also fine.

Another way is to leave the form alone and to specify the charset either in the HTTP response header:

Content-type: text/html; charset=utf-8

or in the HTML page head:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

These specify the page encoding, and the form submission encoding should default to the page encoding:

The default value for this attribute is the reserved string “UNKNOWN”. User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element.

Specifying a Unicode encoding for the entire page also allows you to display Unicode characters, as opposed to just accepting them in form input. More details on specifying the page encoding are in the HTML specs on charsets.

Notice that the charset specifications are actually specifying a particular encoding for that charset (e.g., UTF-8 for the Unicode character set). In Python 2, the raw bytes can then be decoded with str.decode('utf-8'), which returns a unicode object (the Python 3 equivalent is bytes.decode('utf-8'), which returns a str).
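
For example, here’s a minimal Python 2 sketch of recovering the text from a submitted value (the variable names are arbitrary):

import urllib

raw = '%E4%B8%80'                  # what a UTF-8 form submission carries for this character
utf8_bytes = urllib.unquote(raw)   # '\xe4\xb8\x80'
text = utf8_bytes.decode('utf-8')  # u'\u4e00'
print repr(text)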

The moral of the story: to make sure your application works with Unicode properly, you will always want to specify a Unicode encoding such as UTF-8, either in your form’s accept-charset or in your page’s Content-Type.

I was able to find a good discussion of this topic in this Bugzilla ticket.