Proto2 vs Proto3

Last Updated: 2022-02-12

Proto1 is deprecated.

Proto3 is a simplification of Proto2. Both Proto2 and Proto3 are active.


Proto2 and proto3 are wire compatible: the same construct in proto2 and proto3 will have the same binary representation. As a result, files of either version can reference symbols across versions and generate code that works well together.



Field presence

  • Proto2: supports optional natively. The wrapper types were once recommended but should be avoided in new applications.
  • Proto3: originally did not support presence tracking for primitive fields. As of 2020, proto3 supports both optional fields, which have has_foo() methods, and "singular" fields, which do not. Be sure to use optional if your protocol requires knowledge of field presence.
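
The difference between optional and "singular" fields shows up on the wire. The following is a minimal sketch in plain Python, not the protobuf library: encode_int32_field is a hypothetical helper, and None is used here as an assumption to stand for "never set". It illustrates why a receiver cannot tell an unset singular field from one explicitly set to zero.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32_field(field_number: int, value, explicit_presence: bool) -> bytes:
    """Hypothetical helper: serialize one non-negative int32 field.

    value=None stands for "never set" (an assumption of this sketch).
    """
    if value is None:
        return b""
    if not explicit_presence and value == 0:
        # Singular field: zero is the default and is left off the wire.
        return b""
    tag = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)


# Singular field: "set to 0" and "unset" produce the same bytes.
singular_zero = encode_int32_field(1, 0, explicit_presence=False)
singular_unset = encode_int32_field(1, None, explicit_presence=False)

# Optional field: a field set to 0 is still written.
optional_zero = encode_int32_field(1, 0, explicit_presence=True)
optional_unset = encode_int32_field(1, None, explicit_presence=True)
```

In this sketch only the optional variant gives the two cases distinct encodings, which is what a has_foo() method relies on.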

Default values

Proto3 does not permit custom default values. All fields in proto3 have consistent zero defaults.

Required fields

Proto3 removes support for required fields.

Enum defaults

  • Proto3: enums require an entry with the value 0 to act as the default value.
  • Proto2: enums use the first syntactic entry in the enum declaration as the default value where it is otherwise unspecified.

Unrecognized enum values

In languages with closed enums (e.g. Java):

  • All proto3 enums generate an UNRECOGNIZED entry to accommodate unknown enum values. Proto3 setters prohibit UNRECOGNIZED values, so a simple copy of an enum field from one proto to another will fail at runtime if the enum field value is UNRECOGNIZED.
  • Proto2 enums never represent unknown enum values, but instead place them in the unknown field set. As a result, a proto2 enum can behave confusingly (e.g. repeated fields report incorrect counts and are reordered on reserialization when an unknown value is encountered).
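
The proto2 behavior for repeated enum fields can be pictured with a small simulation. This is a sketch in plain Python rather than generated protobuf code: KNOWN_VALUES, parse_closed, reserialize_closed and parse_open are all hypothetical names, and the model assumes that unknown fields are written after known ones on reserialization.

```python
KNOWN_VALUES = {0, 1, 2}  # hypothetical enum numbers known to this reader


def parse_closed(wire_values):
    """Closed-enum model: unknown numbers are diverted to an unknown set."""
    known = [v for v in wire_values if v in KNOWN_VALUES]
    unknown = [v for v in wire_values if v not in KNOWN_VALUES]
    return known, unknown


def reserialize_closed(known, unknown):
    """Assumed ordering: known elements first, unknown fields appended."""
    return known + unknown


def parse_open(wire_values):
    """Open-enum model: every number stays in place in the repeated field."""
    return list(wire_values)


wire = [1, 7, 2]  # 7 was added by a newer version of the schema

known, unknown = parse_closed(wire)
closed_count = len(known)                      # the element 7 is not counted
closed_roundtrip = reserialize_closed(known, unknown)

open_values = parse_open(wire)                 # count and order are preserved
```

Under these assumptions the closed model reports two elements instead of three and moves the unknown value to the end after a round trip, while the open model keeps the list intact.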

Enum cross-references

  • A proto2 message can reference a proto3 enum or message.
  • A proto3 message cannot reference a proto2 enum due to differences in semantics.

Extensions / Any

Proto3 removes support for extensions; instead, use Any fields to represent untyped fields. The extensions mechanism is wire compatible with a normal field declaration whereas Any is not, so an existing field cannot be changed to an Any as the schema evolves, although in proto2 it could be changed to an extension.

Any is significantly more verbose on the wire as it uses a string based type_url as a key while extensions use a varint encoded field number.
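
A rough size comparison makes the overhead concrete. The sketch below hand-encodes both forms in plain Python; the extension field number 1000, the regular field number 5, the payload bytes and the type name example.Foo are all made up for illustration. The only protobuf facts it relies on are the length-delimited wire format and that Any carries type_url in field 1 and value in field 2.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


# Hypothetical inner message: field 1 = 42, already serialized.
payload = b"\x08\x2a"

# Extension: the payload sits directly under the extension's field number.
as_extension = length_delimited(1000, payload)

# Any: type_url (field 1) plus value (field 2), nested under a regular field.
type_url = b"type.googleapis.com/example.Foo"
any_body = length_delimited(1, type_url) + length_delimited(2, payload)
as_any = length_delimited(5, any_body)

overhead = len(as_any) - len(as_extension)
```

For a payload this small the type_url dominates the encoded size; the gap grows with longer package and message names and stays constant as the payload grows.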

Parsed eagerly or lazily:

  • Extensions (other than MessageSet) are parsed eagerly (and sometimes selectively if you provide a custom ExtensionRegistry).
  • Any is always parsed lazily. This delta in performance profile may be important for some applications (e.g. an Android app may prefer to parse messages off the UI thread).

String field validation

Protocol Buffer string fields have always been documented to be UTF-8 encoded.

  • Proto2 does not validate that inbound / outbound bytes are indeed UTF-8 encoded.
  • Proto3 validates that all string fields are appropriately UTF-8 encoded during parsing and in byte-oriented setters.

This validation means that parsing string fields in proto3 is more CPU intensive, and parsing can fail when a string field contains bytes that are not valid UTF-8. The flip side is that eager validation ensures that the problem can be identified quickly and resolved at the source.
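
The trade-off can be sketched with Python's own strict UTF-8 decoder. This is an illustration only, not how any protobuf runtime is implemented: parse_string_field and its validate_utf8 flag are hypothetical.

```python
def parse_string_field(raw: bytes, validate_utf8: bool):
    """Hypothetical parse step for a single string field.

    validate_utf8=True models proto3-style eager validation;
    validate_utf8=False models proto2-style acceptance of any bytes.
    """
    if validate_utf8:
        # Strict decoding raises UnicodeDecodeError on malformed input,
        # so the bad field is rejected at parse time.
        return raw.decode("utf-8")
    # No check here: the bytes are kept as-is and any problem surfaces
    # only when some later consumer tries to decode them.
    return raw


bad = b"caf\xe9"  # Latin-1 encoded text, not valid UTF-8

lenient_result = parse_string_field(bad, validate_utf8=False)

try:
    parse_string_field(bad, validate_utf8=True)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

With validation enabled the failure is reported where the bad message enters the system; without it the invalid bytes travel onward and fail somewhere downstream.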

String field parsing

In Java, proto3 parses String fields as UTF-8 eagerly whereas proto2 parses them lazily.

JSON support

Proto3 defines a canonical JSON specification for all features whereas there is no specification for various proto2 features like extensions. The behavior of proto2 features is thus implementation-dependent.


  • Proto3 adds int min/max sentinels to C++ enums, preventing use of -Werror,-Wswitch.
  • In proto3, optional fields cannot be changed to repeated because that will cause old messages to be declared invalid.
  • It is unsafe to rename or change the proto package of any proto used in an Any field, because the type_url embeds the fully qualified message name. Extension resolution is numeric, like field numbers. Any resolution is string-based, like Stubby methods.