# cache test stuff from random import randrange, paretovariate import time # access patterns def single(i, size): # single user return 0 def linear(i, size): # round robin return i % size def random(i, size): # fully random return randrange(0, size) def pareto(i, size): # silly 80/20 hack (improvements are welcome) while 1: index = int((paretovariate(1.16) - 1) * size / 12) if index < size: return index # simple cache hit statistics def simple_run(cachemaker, keymaker, size): cache = cachemaker() hits = 0 for i in range(10000): key = keymaker(i, size) try: cache.get(key) hits += 1 except KeyError: cache.set(key, None) print keymaker.__name__ + ":", print "%5d" % size, "-", print "%.1f%%" % (100.0 * hits / (i+1)), print "(cache=%d)" % cache.size() def simple(cachemaker): for function in single, linear, random, pareto: for size in 10, 50, 100, 200, 500, 1000, 2000, 5000, 10000: simple_run(cachemaker, function, size) print benchmark(cachemaker) # simple benchmark def benchmark(cachemaker): cache = cachemaker() keys = [pareto(i, 500) for i in range(100000)] penalty = 0 maxsize = 0 t0 = time.clock() for key in keys: try: cache.get(key) except KeyError: penalty += 0.01 # assume it takes 10 ms to create an object cache.set(key, None) maxsize = max(maxsize, cache.size()) print "benchmark:", t1 = time.clock() - t0 print "speed=%.2fs (%.2f+%.2f)" % (t1 + penalty, t1, penalty), print "memory=%dk" % maxsize # and that each object needs 1k if __name__ == "__main__": # generate histogram size = 20 histogram = [0]*size for i in range(100000): histogram[pareto(i, size)] += 1 maxkey = max(histogram) acc = 0 for i, v in enumerate(histogram): acc += v print "%2d" % i, (68*v/maxkey)*"=" or ".", "%d%%" % (100*acc/100000)