Go 101: String or Byte Slice?
• http://joshua.poehls.me/2014/04/go-101-string-or-byte-slice/
One of the first things you’ll notice in Go is that two different types are commonly used for representing text: string and []byte. A quick example is the regexp package, which has functions for both string and []byte.
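As a small sketch of that pairing, the regexp package offers MatchString for string input and Match for []byte input. The pattern and the helper names matchesString and matchesBytes below are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionRe is a hypothetical pattern used only for this illustration.
var versionRe = regexp.MustCompile(`^go[0-9]+$`)

// matchesString checks text held in a string.
func matchesString(s string) bool {
	return versionRe.MatchString(s)
}

// matchesBytes checks the same pattern against text held in a []byte.
func matchesBytes(b []byte) bool {
	return versionRe.Match(b)
}

func main() {
	fmt.Println(matchesString("go101"))        // true
	fmt.Println(matchesBytes([]byte("go101"))) // true
}
```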
What is the difference?
string is immutable and []byte is mutable. Both can contain arbitrary bytes.
The name “string” implies Unicode text, but this is not enforced. Operating on string is like operating on []byte. You are working with bytes, not characters.
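A minimal sketch of what "bytes, not characters" means in practice: len and indexing both work on bytes, and a string may hold bytes that are not valid UTF-8. The helper names byteLen, firstByte, and isValidUTF8 are hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteLen returns the number of bytes in s, which is what len reports.
func byteLen(s string) int {
	return len(s)
}

// firstByte returns the first byte of s. Indexing a string yields a byte,
// not a character. It assumes s is non-empty.
func firstByte(s string) byte {
	return s[0]
}

// isValidUTF8 reports whether s consists entirely of valid UTF-8.
func isValidUTF8(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	s := "\u00e9" // "é": one character, two bytes in UTF-8

	fmt.Println(byteLen(s))          // 2
	fmt.Println(firstByte(s))        // 195 (0xC3)
	fmt.Println(isValidUTF8("\xff")) // false, yet "\xff" is a legal string
}
```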
They are nearly identical and differ only in mutability. The strings and bytes packages are nearly identical apart from the type that they use.
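To illustrate how closely the two packages mirror each other, here is a short sketch calling ToUpper and Contains from each. The wrappers upperString and upperBytes are hypothetical names added only for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// upperString upper-cases text held in a string.
func upperString(s string) string {
	return strings.ToUpper(s)
}

// upperBytes does the same for text held in a []byte and returns a new slice.
func upperBytes(b []byte) []byte {
	return bytes.ToUpper(b)
}

func main() {
	fmt.Println(upperString("gopher"))                // GOPHER
	fmt.Println(string(upperBytes([]byte("gopher")))) // GOPHER

	// The same function name exists in both packages, differing only in type.
	fmt.Println(strings.Contains("gopher", "ph"))               // true
	fmt.Println(bytes.Contains([]byte("gopher"), []byte("ph"))) // true
}
```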
Q: If strings are just arbitrary bytes, then how do you work with characters?
A: What you are thinking of as a character, Go calls a rune (a Unicode code point). One way to iterate the characters in a string is to use the for...range loop. Range will parse the string as UTF-8 and iterate the runes. Read the for loop section of Effective Go for more information.
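A brief sketch of ranging over a string: the loop yields each rune together with the byte index where it starts, so the indexes skip ahead after a multi-byte character. The helper name runes is hypothetical.

```go
package main

import "fmt"

// runes collects the runes that for...range yields when it decodes s as UTF-8.
func runes(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	s := "h\u00e9llo" // "héllo"

	fmt.Println(len(s))        // 6 (bytes)
	fmt.Println(len(runes(s))) // 5 (runes)

	// i is the byte offset of each rune, not its position in a character count.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println() // prints "0:h 1:é 3:l 4:l 5:o "
}
```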
When to use string?
Ask not when to use string but rather, when to use []byte. Always start with string and switch to []byte when justified.
When to use []byte?
Use []byte when you need to make many changes to a string. Since string is immutable, any change will allocate a new string. You can get better performance by using []byte and avoiding the allocations.
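As a rough sketch of the kind of change that benefits from []byte, the function below edits a slice in place, so the loop itself allocates nothing. The name upperASCIIInPlace is hypothetical, and the function deliberately handles only ASCII letters.

```go
package main

import "fmt"

// upperASCIIInPlace upper-cases ASCII letters in b by overwriting them.
// No new slice or string is created inside the loop.
func upperASCIIInPlace(b []byte) {
	for i, c := range b {
		if c >= 'a' && c <= 'z' {
			b[i] = c - ('a' - 'A')
		}
	}
}

func main() {
	b := []byte("some string") // one copy up front
	upperASCIIInPlace(b)       // every later change reuses the same memory
	fmt.Println(string(b))     // SOME STRING
}
```

Doing the same with a string would mean building a new string for each change, because s[i] = 'X' does not compile for a string s.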
C# perspective: []byte is to System.StringBuilder as string is to System.String when it comes to performance.
Even if your code isn’t directly manipulating the string, you may want to use []byte if you are using packages that require it, so you can avoid the conversion.
Converting to and from []byte is easy. Just remember that each conversion creates a copy of the value.
s := "some string"
b := []byte(s) // convert string -> []byte
s2 := string(b) // convert []byte -> string
Converting between string and []byte copies the entire value. Using lots of type conversions in your code is typically a warning sign that you need to reevaluate the types you are using. You want to minimize conversions both for performance and clean code.
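One way to see that a conversion copies is to change the []byte afterwards and check that the original string is untouched. This is only a sketch, and the helper name roundTrip is hypothetical.

```go
package main

import "fmt"

// roundTrip converts s to a []byte, changes the first byte of that copy,
// and returns the original string alongside the modified one.
func roundTrip(s string) (original, modified string) {
	if len(s) == 0 {
		return s, s
	}
	b := []byte(s) // copy #1: string -> []byte
	b[0] = 'S'     // only the copy changes
	return s, string(b) // copy #2: []byte -> string
}

func main() {
	original, modified := roundTrip("some string")
	fmt.Println(original) // some string
	fmt.Println(modified) // Some string
}
```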
More about strings
The Go blog has posted in detail about strings, bytes, runes, and characters in Go. You should definitely read that post to fully understand the topic.
Update: Thanks to @mholt6 for reviewing the post and helping improve it!