Friday, July 4, 2014

Thrift vs. Protobuf

If You Choose Protobuf:

  • Languages Supported: Primarily Java, C++, and Python.
  • Experimental Use: Third-party implementations exist for other languages, but they are not recommended for production.
  • RPC Implementation: Protobuf covers serialization only, so you must implement the RPC layer yourself.
  • Data Size: Serialized Protobuf messages are typically smaller than their Thrift equivalents.
  • Documentation: Protobuf has rich documentation.
  • Compatibility: Works well with frameworks like Netty and Infinispan.
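
One reason Protobuf payloads tend to be small is that integers are written as base-128 varints, so small values take a single byte instead of a fixed four or eight. (Thrift's TCompactProtocol uses a similar scheme, while TBinaryProtocol writes fixed-width integers.) The following is a minimal Python sketch of that encoding for non-negative integers only; the function names are illustrative and are not part of either library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative helper)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of `data` (illustrative helper)."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: this was the last byte
            return result
        shift += 7
    raise ValueError("truncated varint")
```

With this scheme the value 1 occupies one byte and 300 occupies two, whereas a fixed-width 32-bit field always occupies four.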

If You Choose Thrift:

  • Languages Supported: In addition to Java, C++, and Python, Thrift supports many other languages, including:
    • PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Delphi, and others.
  • Data Structures: Supports Map and Set data structures.
  • RPC Implementation: No custom RPC implementation is needed; Thrift provides the RPC layer for both the server and the client.
  • Documentation: Documentation and examples are somewhat lacking.
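
To give a sense of what a "custom RPC implementation" involves when the serialization library does not provide one, here is a minimal Python sketch of the very first piece such a layer usually needs: framing, so the receiver knows where one serialized message ends and the next begins. The 4-byte big-endian length prefix and the names write_frame and read_frame are assumptions made for illustration, not part of Protobuf or Thrift.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one message as a 4-byte big-endian length followed by the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream) -> bytes:
    """Read one length-prefixed message back from a buffered binary stream."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended before a full length header")
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("stream ended before the full payload arrived")
    return payload


def roundtrip_demo() -> list:
    """Send two already-serialized messages through an in-memory stream."""
    buf = io.BytesIO()
    write_frame(buf, b"first message")
    write_frame(buf, b"second")
    buf.seek(0)
    return [read_frame(buf), read_frame(buf)]
```

A real RPC layer would still need method dispatch, error reporting, and connection handling on top of this, which is the work Thrift takes off your hands.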

Reference Usage:

  • Protobuf is used by:
    • Google
    • ActiveMQ for message storage
    • Netty
  • Thrift is used by:
    • Facebook
    • Cassandra project
    • Hadoop for HDFS API
    • HBase for cross-language API
    • Hypertable
    • LastFM
    • DoAT
    • ThriftDB
    • Scribe
    • Evernote for public API
    • Junkdepot
  • Avro: Not explored in detail here. Avro includes schema information with the serialized data, but it has a reputation for bugs and inadequate documentation.

Size Comparison:

  • Thrift (TCompactProtocol): 278 bytes
  • Thrift (TBinaryProtocol): 460 bytes
  • Protobuf: 250 bytes (winner)
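
To put the figures above in relative terms, the short Python sketch below computes how much larger each encoding is than the smallest one. It uses only the three byte counts listed; the function name is illustrative.

```python
# Byte counts taken from the size comparison above.
SIZES = {
    "Thrift (TCompactProtocol)": 278,
    "Thrift (TBinaryProtocol)": 460,
    "Protobuf": 250,
}


def overhead_vs_smallest(sizes: dict) -> dict:
    """Return each encoding's size overhead, in percent, relative to the smallest."""
    smallest = min(sizes.values())
    return {
        name: round(100 * (size - smallest) / smallest, 1)
        for name, size in sizes.items()
    }
```

By this measure TCompactProtocol is about 11% larger than Protobuf, while TBinaryProtocol is 84% larger.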

Runtime Performance:

  • Thrift generally has the advantage in CPU usage and latency.

Reference Link: