Published 2026-03-23
As part of migrating my blog from Python to Go, I wanted to gather anonymized statistics about which countries my visitors were coming from. I quickly selected IPinfo as my GeoIP database of choice. They approximate the location of IP addresses by pinging them from servers spread around the world, triangulating locations from response times.
They also provide an ipinfo.csv.gz file for download that is updated daily. This was great for me as I wanted a local in-memory database; I didn't want to rely on sending GeoIP lookups to an external API service to get the visitor's country. This matters because the country is determined while the requested page is being served; waiting on an external API call would make every page load slower.
This "small" feature ended up requiring a bit of optimization due to the number of entries in the CSV file. The process of optimizing turned into quite an adventure for me and I would like to share it with you. Not to spoil anything, but I make some pretty impressive optimizations, if I may say so myself.
Starting with some numbers: the ipinfo.csv.gz available for download is about 30MB. The unpacked .csv file is ~445MB and contains around 5.7 million lines, where each line holds one subnet and its corresponding country. The format of the CSV looks something like this:
```
network,country,country_code,continent,continent_code,asn,as_name,as_domain
1.0.0.0/24,Australia,AU,Oceania,OC,AS13335,"Cloudflare, Inc.",cloudflare.com
1.0.1.0/24,China,CN,Asia,AS,,,
1.0.2.0/23,China,CN,Asia,AS,,,
1.0.4.0/22,Australia,AU,Oceania,OC,AS38803,Gtelecom Pty Ltd,gtelecom.com.au
1.0.8.0/21,China,CN,Asia,AS,,,
1.0.16.0/24,Japan,JP,Asia,AS,AS2519,ARTERIA Networks Corporation,arteria-net.com
1.0.17.0/24,Japan,JP,Asia,AS,,,
1.0.18.0/23,Japan,JP,Asia,AS,,,
1.0.20.0/22,Japan,JP,Asia,AS,,,
```
I only care about the network, country name (and possibly country code) columns. Also, my Internet provider only provides IPv4, so I only have to keep the ~1.3 million IPv4 entries. How do I get this data into my Go program, and how do I make the search fast? Let's tackle these problems one at a time.
This was my initial implementation. All data lives inside the CountryIPData struct, which is loaded on program startup. The struct has two methods: parseIPInfoCSV() runs on struct initialization and puts all IPv4 subnets in a map[netip.Prefix]string, where the string is the country name, and AddrCountry() performs the IP-to-country lookup.
These are the relevant parts of the code:
```go
type CountryIPData struct {
	subnetCountries map[netip.Prefix]string
}

func (c *CountryIPData) parseIPInfoCSV(ipInfoCSV *os.File) error {
	subnetCountries := make(map[netip.Prefix]string, 1_500_000)
	scanner := bufio.NewScanner(ipInfoCSV)
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), ",")
		prefix := fields[0]
		subnet, err := netip.ParsePrefix(prefix)
		if err != nil {
			if fields[0] == "network" {
				continue // first line contains CSV headers, first header is "network"
			}
			return fmt.Errorf("line %s: %v", scanner.Text(), err)
		} else if subnet.Addr().Is6() {
			break // We only care about IPv4 subnets, they are listed first
		}
		country := fields[1]
		subnetCountries[subnet] = country
	}
	c.subnetCountries = subnetCountries
	return nil
}

func (c *CountryIPData) AddrCountry(ip netip.Addr) (country string) {
	for subnet, country := range c.subnetCountries {
		if subnet.Contains(ip) {
			return country
		}
	}
	return ""
}
```
A relatively straightforward implementation. Turning each subnet into a netip.Prefix gives me access to the convenient Contains() method, making the code easy to follow.
I measured the memory usage and average lookup time using this benchmark function:
```go
func BenchmarkIPLookup(b *testing.B) {
	countryIP, err := NewCountryIPData()
	if err != nil {
		fmt.Printf("new CountryIPData: %v\n", err)
		return
	}
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	alloc := m.Alloc
	b.Logf(
		"countryIP: %d MB (%d bytes per entry)\n",
		alloc/1024/1024, alloc/uint64(countryIP.Length()),
	)
	octet := 0
	for b.Loop() {
		// 0.0.0.0, 1.1.1.1, 2.2.2.2, 3.3.3.3, etc
		addr := netip.MustParseAddr(fmt.Sprintf("%d.%d.%d.%d", octet, octet, octet, octet))
		countryIP.AddrCountry(addr)
		octet++
		if octet > 239 {
			octet = 0
		}
	}
	b.Log("avg time per lookup:", b.Elapsed()/time.Duration(b.N))
}
```
Checking the map memory usage, I found that the 1.3M+ entries use ~223MB of memory. That is 174 bytes per entry, on average. Considering that the .csv file itself is ~445MB we have saved some space, but I expected the memory usage to be lower considering we are only processing the IPv4 entries, not the full file.
Since CountryIPData is the only variable we load in the benchmark, we can assume it accounts for most of the program's memory usage. It's not a perfect method, but it gives a good baseline. It also helps to manually run the garbage collector (runtime.GC()) so that unused data is freed before we read the memory stats.
Benchmarking AddrCountry() gives an average result of ~10 milliseconds per lookup, which is fine but not great. I would probably get a similar result by querying the IPinfo API. To get a fair result, I evenly spread the IP-address lookups across the full IPv4 unicast range. Because version 1 uses a map, the iteration order is randomized on every lookup, so this spread helps keep the benchmark results consistent.
So, we have a way to test our different implementations. Version 1 works, but we can definitely do better. Where to start?
I chose to first optimize the 10ms lookup time. Since I'm doing the lookup as part of serving each page, it makes the page load that much slower. The slowness comes from the fact that each lookup potentially has to go through all 1.3M+ entries before a match is found.
After some thinking, I figured I could split the map[netip.Prefix]string into a nested map where the outer map groups prefixes by the first byte of the IP address: map[byte]map[netip.Prefix]string. This separates the data into ~240 nested maps, the benefit being that each nested map contains far fewer prefixes.
This is what the v2 code looks like:
```go
type CountryIPData struct {
	subnetCountriesByFirstByte map[byte]map[netip.Prefix]string
}

func (c *CountryIPData) parseIPInfoCSV(ipInfoCSV *os.File) error {
	c.subnetCountriesByFirstByte = make(map[byte]map[netip.Prefix]string)
	scanner := bufio.NewScanner(ipInfoCSV)
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), ",")
		prefix := fields[0]
		subnet, err := netip.ParsePrefix(prefix)
		if err != nil {
			if fields[0] == "network" {
				continue // first line contains CSV headers, first header is "network"
			}
			return fmt.Errorf("line %s: %v", scanner.Text(), err)
		} else if subnet.Addr().Is6() {
			break // We only care about IPv4 subnets
		}
		// Create nested map if it does not exist yet
		firstByte := subnet.Addr().As4()[0]
		if _, exist := c.subnetCountriesByFirstByte[firstByte]; !exist {
			c.subnetCountriesByFirstByte[firstByte] = make(map[netip.Prefix]string)
		}
		// Add prefix/country to nested map
		country := fields[1]
		c.subnetCountriesByFirstByte[firstByte][subnet] = country
	}
	return nil
}
```
The CSV parser now places each subnet into a nested map based on the first byte. This extra map layer increases the memory usage from ~223MB to ~227MB. This is not optimal, but will do for now.
This is how the new IP lookup does its job:
```go
func (c *CountryIPData) AddrCountry(ip netip.Addr) (country string) {
	firstByte := ip.As4()[0]
	for firstByte > 0 {
		subnetCountries, exist := c.subnetCountriesByFirstByte[firstByte]
		if !exist {
			firstByte -= 1
			continue
		}
		for subnet, country := range subnetCountries {
			if subnet.Contains(ip) {
				return country
			}
		}
		return ""
	}
	return ""
}
```
The AddrCountry() lookup got more complicated because the first byte of the user's IP address might not perfectly match an entry. Funnily enough, the only instance of this is the 28.0.0.0/7 IP range assigned to the United States Department of Defense. The last IP address in that range is 29.255.255.255.
If a user visits with that IP address, the lookup searches for key 29 in the outer map, but no such key exists. This is why the lookup occurs inside a for loop; if the key does not exist, firstByte is decremented until a match is found. The second iteration of the loop will therefore try key 28 and find a matching key. Its value is a nested map[netip.Prefix]string, so the lookup can now proceed like it did in version 1, returning the subnet's country name.
The lookup time now averages 46 microseconds. Compared to the 10 milliseconds achieved by version 1, that's a massive 217x gain in performance! This is thanks to each nested map containing somewhere between 1k and 50k entries instead of 1.3M+. Performing a lookup on page serve is now so fast that it's practically unnoticeable.
But I'm not yet satisfied. Is the 227MB memory usage a lot, or is this what I have to expect from keeping so many entries in memory?
I wanted to lower the memory usage, so I focused on the value stored under each key: the country name. For every subnet we are storing the country as a string. With 248 countries in the file and 1.3M+ entries, each country's name is duplicated over 5,000 times on average. I tested the impact of this by leaving all strings empty. The memory usage dropped from 227MB to 116MB, so there is real potential to save on memory usage.
I finally settled on storing the country name strings in a separate map[countryCode]countryName where the countryCode key is a [2]byte byte-array (CN for China, for example). I also had to alter the old subnetCountries map to instead store the 2-byte country code as its value. A separate map lookup on countriesByCC is now required to get the full country name.
This is the new code. Some error handling had to be cut for brevity:
```go
type CountryIPData struct {
	subnetCountryCodes map[byte]map[netip.Prefix][2]byte
	countriesByCC      map[[2]byte]string // map[countryCode]countryName
}

func (c *CountryIPData) parseIPInfoCSV(ipInfoCSV *os.File) error {
	var countryCode [2]byte
	scanner := bufio.NewScanner(ipInfoCSV)
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), ",")
		subnet, _ := netip.ParsePrefix(fields[0]) // error handling skipped for brevity

		// Populate country name in the separate map
		copy(countryCode[:], fields[2])
		if _, exist := c.countriesByCC[countryCode]; !exist {
			country := fields[1]
			c.countriesByCC[countryCode] = country
		}

		// Populate subnet entries with the country code as value
		firstByte := subnet.Addr().As4()[0]
		if _, exist := c.subnetCountryCodes[firstByte]; !exist {
			c.subnetCountryCodes[firstByte] = make(map[netip.Prefix][2]byte)
		}
		c.subnetCountryCodes[firstByte][subnet] = countryCode
	}
	return nil
}

func (c *CountryIPData) AddrCountry(ip netip.Addr) (country string) {
	firstByte := ip.As4()[0]
	for firstByte > 0 {
		subnetCountries, exist := c.subnetCountryCodes[firstByte]
		if !exist {
			firstByte -= 1
			continue
		}
		for subnet, countryCode := range subnetCountries {
			if subnet.Contains(ip) {
				return c.countriesByCC[countryCode] // extra map lookup
			}
		}
		return ""
	}
	return ""
}
```
So, fewer strings but an extra key lookup. What's the verdict? Well, memory usage dropped from ~227MB to ~100MB. That is an average of 78 bytes per entry, compared to 177 bytes in the previous version. This is even better than my earlier test using empty strings; I'm guessing each string adds some overhead that a byte array does not. The lookup time now averages 47 microseconds. So this is yet another great improvement! We halved the memory usage and the average lookup only took one microsecond longer. But can we do better? Yes!
I have to admit that I am very proud of this version, making some fantastic gains and improvements. Based on the string optimizations in v3, we can deduce that most of the 100MB of memory still in use is taken up by the netip.Prefix keys. Each entry takes 78 bytes of memory on average. An IPv4 address is 4 bytes and the country code is another 2 bytes, so where does the rest of the memory go? (A netip.Prefix is internally much larger than 4 bytes, and each map entry carries bucket overhead on top of that.)
Unaware of more sciency options, I figured that instead of using a netip.Prefix and its convenient Contains() method, I could just convert the first and last IP address of each subnet into separate uint32s. I can then compare the user IP to the first and last address of each subnet to check for a match; the user IP must be greater than or equal to the first address of the subnet while also being less than or equal to the last address. If true, it's a match.
By my count, each entry would store 10 bytes of data. My Go LSP tells me it's 12 bytes, so there is some alignment padding going on. But still, 12 bytes per entry is much less than our current 78. That is promising.
This is how I implemented v4:
```go
type CountryIPData struct {
	subnetCountries []IPv4SubnetCountry // now a slice
	countriesByCC   map[[2]byte]string
}

type IPv4SubnetCountry struct {
	netAddr, lastAddr uint32  // 8 bytes
	countryCode       [2]byte // 2 bytes (+2 bytes padding)
}

func (c *CountryIPData) parseIPInfoCSV(ipInfoCSV *os.File) error {
	var (
		subnetCountries = make([]IPv4SubnetCountry, 0, 1_500_000)
		scanner         = bufio.NewScanner(ipInfoCSV)
		countryCode     [2]byte
	)
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), ",")
		subnet, _ := netip.ParsePrefix(fields[0]) // error handling skipped for brevity

		var netAddrBytes [4]byte = subnet.Addr().As4()
		netAddr := binary.BigEndian.Uint32(netAddrBytes[:])

		copy(countryCode[:], fields[2])
		countryName := fields[1]
		c.countriesByCC[countryCode] = countryName

		subnetCountries = append(subnetCountries, IPv4SubnetCountry{
			netAddr:     netAddr,
			lastAddr:    netAddr + 1<<(32-subnet.Bits()) - 1,
			countryCode: countryCode,
		})
	}
	// Create a new slice containing the exact number of entries
	subnets := make([]IPv4SubnetCountry, len(subnetCountries))
	copy(subnets, subnetCountries)

	// Sort entries in ascending order by the netAddr field
	slices.SortFunc(subnets, func(a, b IPv4SubnetCountry) int {
		return cmp.Compare(a.netAddr, b.netAddr)
	})
	c.subnetCountries = subnets
	return nil
}
```
We can't convert a netip.Prefix directly into a uint32; it first has to be converted into a [4]byte. In this example the subnet is 1.0.1.0/24. The network address 1.0.1.0 is converted to the uint32 16777472.
To figure out the last address of the 1.0.1.0/24 subnet in the above code, we first have to figure out the size of the subnet. We do that by subtracting subnet.Bits() from 32: 32 - 24 = 8. The host portion of our subnet is 8 bits.
To convert those 8 bits into a decimal number, we bit-shift the integer 1 to the left eight times. Each bit-shift performs a "times 2" operation, giving this result: 1 << 8 = 256. This is what the bit-shifts look like in binary format:
```
000000001 = initial value (integer 1)
000000010 = 1st iteration (integer 2)
000000100 = 2nd iteration (integer 4)
000001000 = 3rd iteration (integer 8)
000010000 = 4th iteration (integer 16)
000100000 = 5th iteration (integer 32)
001000000 = 6th iteration (integer 64)
010000000 = 7th iteration (integer 128)
100000000 = 8th iteration (integer 256)
```
We have figured out that the subnet contains 256 unique IP addresses. With the network address being one of those addresses, we have to subtract 1 from 256. The last IP address of the subnet is therefore 1.0.1.0 + 255 = 1.0.1.255.
Compressing all of this into a single line turns it into this:
```go
lastAddr := netAddr + 1<<(32-subnet.Bits()) - 1
```
Now that both the network address and the last address of the subnet have been calculated, we can put them in a struct that also contains the country code. All structs are sorted and stored in a slice, ready for IP lookups performed by the AddrCountry() method.
Bit-shifting is among the fastest operations a CPU can perform, because it's one of the simplest instructions to execute. Integer division, by comparison, is one of the more expensive arithmetic operations (multiplication sits somewhere in between). To halve a number, a classic trick is to bit-shift the variable one step to the right: 30 >> 1 = 15. The same goes for doubling, just shift one bit to the left: 15 << 1 = 30. (Compilers typically apply this "strength reduction" automatically when you multiply or divide by a power of two.)
Now that the subnetCountries map has been changed to a slice, I should probably improve the IP lookup process too. My immediate thought was to loop through all subnet entries in the slice until a match is found, but with more than a million entries there must be a more efficient way.
This is when I remembered reading about the binary search algorithm. The name sounded scary, but I figured I would check it out anyway. To my surprise, it looked relatively intuitive!
To explain the algo, imagine you are holding a book; you want to open page 60 in the book, but you don't know where that page is. The algorithm starts by opening the book right in the middle. In a 100-page book, this opens page 50. We now know that page 60 is in the upper half of the book, so we can ignore the first 50 pages. At this first step we have already ruled out 50% of the pages in the book.
We run the algorithm again, but this time opening the page that is halfway between pages 50 and 100. We get page 75. As page 60 is lower than 75, we can now ignore higher page numbers. Running the algo again, we open the page halfway between 50 and 75. We get page 63. We run it again for 50 and 63 and we get page 57. Run it again for 57 and 63 and we get page 60!
With each iteration halving the size of the search area, we can efficiently search through any long list of entries. The only caveat is that the entries must be sorted. In my case, I have to make sure the entries are sorted by the netAddr value.
The code below is my implementation of this binary search:
```go
func (c *CountryIPData) AddrCountry(ip netip.Addr) (country string) {
	var (
		addrBytes = ip.As4()
		addrInt   = binary.BigEndian.Uint32(addrBytes[:])
		low       = 0
		needle    = len(c.subnetCountries) >> 1 // halfway point
		high      = len(c.subnetCountries)
	)
	for range 30 {
		subnet := c.subnetCountries[needle]
		if high-low == 1 && low > 0 {
			return "" // no matching country, e.g. RFC1918
		} else if addrInt < subnet.netAddr {
			high = needle
			needle = (low + high) >> 1
			continue
		} else if addrInt > subnet.lastAddr {
			low = needle
			needle = (low + high) >> 1
			continue
		}
		return c.countriesByCC[subnet.countryCode]
	}
	return ""
}
```
At the start of the function, we set the low, needle and high variables. low is the first index in the slice, high is one past the last index, and needle is the index right in the middle of the slice.
Each iteration starts by fetching the IPv4SubnetCountry struct at the needle index. If our IP address is lower than the subnet's network address, we know we need to search the lower part of the slice. We then set high = needle and pick a new needle index based on the new low/high values: needle = (low+high) >> 1.
Likewise, if our IP-address is higher than the subnets last address we know we need to search the upper parts of the slice, so we make low = needle, then calculate needle to be the new halfway point.
This process continues until a matching subnet is found, at which point we do a countryCode-to-countryName lookup and return the result. If no matching subnet exists, the algorithm keeps iterating until the low and high values are only one step apart. This happens if the traffic belongs to a private IP address (RFC1918) or to an IP range that has not been assigned a country.
With 1.3M entries in the slice, the algorithm needs about 20-21 iterations to find a match. I manually capped the for loop at 30 iterations, just to avoid it running forever in case of some unforeseen bug.
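As a side note, Go's standard library can express the same search: sort.Search (or, since Go 1.21, slices.BinarySearchFunc) finds the first index for which a predicate holds and does the low/high/needle bookkeeping internally. A sketch over the same kind of sorted entries, with hypothetical data:

```go
package main

import (
	"fmt"
	"sort"
)

type IPv4SubnetCountry struct {
	netAddr, lastAddr uint32
	countryCode       [2]byte
}

// lookup finds the country code for ip in a slice sorted by netAddr.
func lookup(entries []IPv4SubnetCountry, ip uint32) ([2]byte, bool) {
	// First entry whose lastAddr is >= ip; sort.Search runs the
	// same halving loop as the hand-rolled version.
	i := sort.Search(len(entries), func(i int) bool {
		return entries[i].lastAddr >= ip
	})
	if i < len(entries) && entries[i].netAddr <= ip {
		return entries[i].countryCode, true
	}
	return [2]byte{}, false // RFC1918 or unassigned range
}

func main() {
	entries := []IPv4SubnetCountry{ // hypothetical, already sorted by netAddr
		{netAddr: 16777216, lastAddr: 16777471, countryCode: [2]byte{'A', 'U'}}, // 1.0.0.0/24
		{netAddr: 16777472, lastAddr: 16777727, countryCode: [2]byte{'C', 'N'}}, // 1.0.1.0/24
	}
	cc, ok := lookup(entries, 16777500) // 1.0.1.28
	fmt.Println(string(cc[:]), ok)      // CN true
}
```

The hand-rolled loop and sort.Search do the same O(log n) work; the stdlib version mostly saves you from off-by-one bugs in the bookkeeping.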
What's the performance verdict? The previous version achieved a memory usage of 100MB and a lookup time of ~47 microseconds. Version 4 achieved a memory usage of 15MB and a lookup time of ~290 nanoseconds (0.29 microseconds)! This is another 162x speedup and I'm honestly quite proud of myself for implementing this.
I was perfectly satisfied with the v4 implementation and had almost finished writing this article when I realized that there exists one more optimization. If you looked carefully at the example output from the ipinfo.csv file, you could see that multiple adjacent subnets were assigned to the same country. This is a common pattern throughout the entire file.
Due to how subnetting works, these had to be specified as separate subnets in the CSV file. As I am now using integers to represent first/last IP-addresses in each range, I can easily extend one range to also include the next as long as they are adjacent. And the next one after that. So let's make one final optimization at the end of the parseIPInfoCSV() function:
```go
func (c *CountryIPData) parseIPInfoCSV(ipInfoCSV *os.File) error {
	[...]
	slices.SortFunc(subnetCountries, func(a, b IPv4SubnetCountry) int {
		return cmp.Compare(a.netAddr, b.netAddr)
	})
	// Countries tend to use IP-ranges right after each other.
	// Combining adjacent ranges further reduces the number of entries.
	combinedSubnets := make([]IPv4SubnetCountry, 0, len(subnetCountries))
	combIdx := -1
	for i := range subnetCountries {
		if i == 0 {
			combinedSubnets = append(combinedSubnets, subnetCountries[i])
			combIdx++
			continue
		}
		this := subnetCountries[i]
		prev := subnetCountries[i-1]
		if prev.countryCode != this.countryCode || prev.lastAddr+1 != this.netAddr {
			// Different countries, or ranges not adjacent
			combinedSubnets = append(combinedSubnets, this)
			combIdx++
			continue
		}
		// Extend the existing IP-range instead of adding another entry
		combinedSubnets[combIdx].lastAddr = this.lastAddr
	}
	copyCombined := make([]IPv4SubnetCountry, len(combinedSubnets))
	copy(copyCombined, combinedSubnets)
	c.subnetCountries = copyCombined
	return nil
}
```
This optimization more than halved the number of entries, from 1.34M to 607K. The memory usage also halved, from ~15MB to ~7MB.
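To make the merge condition concrete, here is a tiny standalone sketch with hypothetical entries: two adjacent same-country ranges collapse into one, while a country change or a gap starts a new entry. (This version extends the last merged entry directly, which gives the same result as the index-based code above.)

```go
package main

import "fmt"

type entry struct {
	netAddr, lastAddr uint32
	countryCode       [2]byte
}

// mergeAdjacent collapses runs of adjacent same-country ranges.
// The input must already be sorted by netAddr.
func mergeAdjacent(in []entry) []entry {
	out := make([]entry, 0, len(in))
	for i, e := range in {
		if i > 0 {
			last := &out[len(out)-1]
			if last.countryCode == e.countryCode && last.lastAddr+1 == e.netAddr {
				last.lastAddr = e.lastAddr // extend instead of appending
				continue
			}
		}
		out = append(out, e)
	}
	return out
}

func main() {
	jp := [2]byte{'J', 'P'}
	au := [2]byte{'A', 'U'}
	in := []entry{
		{100, 199, jp},
		{200, 299, jp}, // adjacent to previous, same country: merged
		{300, 399, au}, // adjacent but different country: kept
		{500, 599, au}, // same country but a gap before it: kept
	}
	fmt.Println(len(mergeAdjacent(in))) // 3
}
```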
By iterating through various solutions for optimizing the memory usage and lookup-time of my in-memory GeoIP database, we got the memory usage down from ~223MB in v1 to ~7MB in v5. The lookup time plummeted from ~10 milliseconds in v1 to ~290 nanoseconds in v5.
Honestly, this was a super fun exercise for me. I usually don't have to think about these sorts of optimizations in my daily work, so it was a great learning experience. Now that I have gotten the memory usage way down, I am comfortable adding (in the future) the 4M+ IPv6 entries that are currently skipped. This will happen the day I get an Internet provider that supports IPv6.
If you want to look at the different versions in full, feel free to check them out here: https://github.com/emieli/go-country-ip.
Finally, I hope you enjoyed reading this. If you did, I thank you for your time and hope to see you again soon.