StringBuilder implementation in Python

Intro

A string is a sequence of characters. In Java, the StringBuilder class is used to create a mutable string. Python has no built-in StringBuilder, and its strings are immutable: every operation that modifies a string has to allocate memory for a new string object.
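
As a small illustration of what immutability means in practice, the sketch below shows that item assignment on a string fails and that `+=` produces a new string instead of changing the old one. The variable names are made up for this example.

```python
s = "abc"

# Strings do not support item assignment.
try:
    s[0] = "x"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# A second reference to the same string is unaffected by "+=",
# because "+=" binds s to a new string object.
alias = s
s += "d"

print(assignment_failed)  # True
print(alias)              # abc
print(s)                  # abcd
```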

In Python, there are the following ways to build a string incrementally, as a StringBuilder would:

  1. string concatenation (append)
  2. join() function
  3. StringIO module

String concatenation (append)

def method1():
    string = ""
    for i in range(count):
        string += str(i)
    return string

join() function

def method2():
    strings = []
    for i in range(count):
        strings.append(str(i))
    return "".join(strings)

StringIO module

from io import StringIO

def method3():
    si = StringIO()
    for i in range(count):
        si.write(str(i))
    return si.getvalue()
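
The StringIO approach can also be wrapped in a small class that mimics Java's StringBuilder. The following is a minimal sketch using the same `add` and `toString` method names as the wrapper class in the full listing; the sample words are only for illustration.

```python
from io import StringIO


class StringBuilder:
    def __init__(self):
        # StringIO acts as the growable in-memory text buffer
        self.sb = StringIO()

    def add(self, text):
        self.sb.write(text)

    def toString(self):
        return self.sb.getvalue()


builder = StringBuilder()
for word in ("Hello", ", ", "world"):
    builder.add(word)

result = builder.toString()
print(result)  # Hello, world
```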

Efficiency measurement

I’m interested in seeing which method is the most efficient, so we build a long string out of one million numbers and measure the run time of each method.

from io import StringIO
import time
from array import array


# using string concatenation(append)
def method1():
    string = ""
    for i in range(count):
        string += str(i)
    return string


# using join()
def method2():
    strings = []
    for i in range(count):
        strings.append(str(i))
    return "".join(strings)


# using StringIO module
def method3():
    si = StringIO()
    for i in range(count):
        si.write(str(i))

    return si.getvalue()


# implement a StringBuilder in Python
class StringBuilder:
    def __init__(self):
        self.sb = StringIO()

    def add(self, text):
        # the parameter is named "text" to avoid shadowing the built-in str
        self.sb.write(text)

    def toString(self):
        return self.sb.getvalue()


# using array (only stores the integers; no string is built here)
def method4():
    char_array = array("L")
    for i in range(count):
        char_array.append(i)
    return char_array


# test function: look up "method<N>" by name and time a single call
def run_test(method):
    func = globals()["method" + str(method)]
    start_time = time.perf_counter()
    func()
    end_time = time.perf_counter()
    print("elapsed time {}".format(round(end_time - start_time, 2)))

We run the tests for the four methods twice, the second time in reverse order, to measure the efficiency.

# tests
count = 1000000
run_test(1)
run_test(2)
run_test(3)
run_test(4)


run_test(4)
run_test(3)
run_test(2)
run_test(1)

# Output:
# elapsed time 0.29
# elapsed time 0.16
# elapsed time 0.19
# elapsed time 0.09

# elapsed time 0.08
# elapsed time 0.19
# elapsed time 0.15
# elapsed time 0.31

Conclusion

The string concatenation method is the slowest. The join() function is relatively efficient. Using an array seems efficient as well; however, the array test only stores the integers, so the time needed to convert the array to a string is not counted.
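
To get a feel for the conversion cost that the array test leaves out, the conversion can be timed on its own. This is a rough sketch and not part of the measurements above: it assumes the integers are turned into text with `"".join(map(str, ...))`, which is only one possible way to do the conversion.

```python
import time
from array import array

count = 1000000  # same size as in the tests

# Build the array of integers first; this part is what method4 measures.
numbers = array("L", range(count))

# Time only the conversion of the stored integers into one string.
start = time.perf_counter()
result = "".join(map(str, numbers))
elapsed = time.perf_counter() - start

print("array to string: {} s".format(round(elapsed, 2)))
```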

In general, the join() function seems fast enough for long string construction.
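
A single timed run can be noisy, so it may be worth repeating each measurement. The sketch below is one way to do that with the standard `timeit` module, taking the best of several runs. The function names and the smaller input size are choices made for this example, and the numbers will differ from machine to machine.

```python
import timeit
from io import StringIO


def build_concat(n):
    s = ""
    for i in range(n):
        s += str(i)
    return s


def build_join(n):
    parts = []
    for i in range(n):
        parts.append(str(i))
    return "".join(parts)


def build_stringio(n):
    buf = StringIO()
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()


n = 100000
for func in (build_concat, build_join, build_stringio):
    # five independent runs; the minimum is the least disturbed by other processes
    best = min(timeit.repeat(lambda: func(n), number=1, repeat=5))
    print("{}: {:.4f} s".format(func.__name__, best))
```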