One of the first things you’ll notice in Go is that two different types are commonly used for representing text: string and []byte. A quick example is the regexp package, which has functions for both string and []byte.
What is the difference?
string is immutable and []byte is mutable. Both can contain arbitrary bytes.
The name “string” implies Unicode text, but this is not enforced. Operating on a string is like operating on a []byte: you are working with bytes, not characters.
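A minimal sketch of these points, assuming nothing beyond the standard library. The helper names byteLen and setFirst are made up for illustration:

```go
package main

import "fmt"

// byteLen reports how many bytes s occupies.
// len on a string counts bytes, not characters.
func byteLen(s string) int {
	return len(s)
}

// setFirst overwrites the first byte of b in place. This is legal because
// a []byte is mutable; the equivalent s[0] = c on a string does not compile.
func setFirst(b []byte, c byte) {
	if len(b) > 0 {
		b[0] = c
	}
}

func main() {
	s := "h\u00e9llo" // the second character takes two bytes in UTF-8
	fmt.Println(byteLen(s)) // 6

	b := []byte("hello")
	setFirst(b, 'j')
	fmt.Println(string(b)) // jello

	// A string may hold bytes that are not valid UTF-8.
	raw := "\xff\xfe"
	fmt.Println(byteLen(raw)) // 2
}
```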
Q: If strings are just arbitrary bytes, then how do you work with characters?
A: What you are thinking of as a character, Go calls a rune. One way to iterate the characters in a string is to use the for...range loop. Range will parse the string as UTF-8 and iterate the runes. Read the for loop section of Effective Go for more information.
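As a rough illustration of ranging over a string, the sketch below collects the runes that for...range yields. The helper runesOf is hypothetical; utf8.RuneCountInString comes from the standard unicode/utf8 package:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runesOf collects the runes that for...range yields for s.
func runesOf(s string) []rune {
	var out []rune
	for _, r := range s { // the ignored index is the byte offset of each rune
		out = append(out, r)
	}
	return out
}

func main() {
	s := "h\u00e9llo"
	for i, r := range s {
		fmt.Printf("byte offset %d: %c\n", i, r)
	}
	// Byte length and rune count differ for non-ASCII text.
	fmt.Println(len(s), utf8.RuneCountInString(s)) // 6 5
	fmt.Println(len(runesOf(s)))                   // 5
}
```

Note that the index produced by range is a byte offset, so it skips a value wherever a rune occupies more than one byte.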
When to use string?
Ask not when to use string but rather when to use []byte. Always start with string and switch to []byte when justified.
When to use []byte?
Use []byte when you need to make many changes to a string. Since string is immutable, any change will allocate a new string. You can get better performance by using []byte and avoiding the allocations.
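To make the allocation difference concrete, here is a small sketch that masks ASCII digits two ways. Both function names are invented for this example, and the string version is deliberately naive to show where new strings get built:

```go
package main

import "fmt"

// maskDigits overwrites every ASCII digit in b with '#'.
// It modifies b in place, so no new buffer is allocated.
func maskDigits(b []byte) {
	for i, c := range b {
		if c >= '0' && c <= '9' {
			b[i] = '#'
		}
	}
}

// maskDigitsString does the same job on a string. Because a string cannot
// be modified, every digit found forces a new string to be built.
func maskDigitsString(s string) string {
	for i := 0; i < len(s); i++ {
		if s[i] >= '0' && s[i] <= '9' {
			s = s[:i] + "#" + s[i+1:] // concatenation builds a new string
		}
	}
	return s
}

func main() {
	b := []byte("card 1234")
	maskDigits(b)
	fmt.Println(string(b)) // card ####

	fmt.Println(maskDigitsString("card 1234")) // card ####
}
```

Whether the difference matters depends on how large the input is and how often the code runs, so it is worth measuring before switching types.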
Even if your code isn’t directly manipulating the string, you may want to use []byte if you are using packages which require it, so you can avoid the conversion.
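For instance, the regexp package offers Match for []byte and MatchString for string, so data that already arrives as []byte can be used as is. The sketch below assumes a made-up pattern and made-up wrapper names:

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// digits is an example pattern chosen for this sketch.
var digits = regexp.MustCompile(`[0-9]+`)

// hasDigits matches directly against a []byte, so no conversion is needed.
func hasDigits(data []byte) bool {
	return digits.Match(data)
}

// hasDigitsString is the string counterpart of hasDigits.
func hasDigitsString(s string) bool {
	return digits.MatchString(s)
}

func main() {
	// Data read from a file or network connection typically arrives as []byte.
	data := []byte("order 42 shipped")
	fmt.Println(hasDigits(data))                         // true
	fmt.Println(bytes.Contains(data, []byte("shipped"))) // true
	fmt.Println(hasDigitsString("no numbers here"))      // false
}
```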
Converting to and from []byte
Converting to and from []byte is easy. Just remember that each conversion creates a copy of the value.
s := "some string"
b := []byte(s)  // convert string -> []byte
s2 := string(b) // convert []byte -> string
Converting between string and []byte copies the entire value. Using lots of type conversions in your code is typically a warning sign that you need to reevaluate the types you are using. You want to minimize conversions, both for performance and for clean code.
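One way to see that a conversion yields an independent copy is to modify the converted []byte and check that the original string is untouched. The function roundTrip below is a hypothetical helper written for this sketch:

```go
package main

import "fmt"

// roundTrip converts s to []byte, changes the first byte of that copy,
// and returns the original string alongside the modified version.
func roundTrip(s string) (original, modified string) {
	b := []byte(s) // b holds its own copy of the bytes in s
	if len(b) > 0 {
		b[0] = 'X'
	}
	return s, string(b) // string(b) makes another copy
}

func main() {
	original, modified := roundTrip("hello")
	fmt.Println(original) // hello
	fmt.Println(modified) // Xello
}
```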
More about strings
The Go blog has posted in detail about strings, bytes, runes, and characters in Go. You should definitely read that post to fully understand the topic.
Update: Thanks to @mholt6 for reviewing the post and helping improve it!