- Resolving A Name Is Complex
- gethostbyname(3) and getaddrinfo(3)
- How To Debug
- Big Picture
Can also be found in presentation format here
Resolving A Name Is Complex
Resolving a domain name is complex. It’s not limited to the DNS, the
Domain Name System, a decentralized and hierarchical system that associates
names and other information with IP addresses.
It’s not something we, as users, usually pay attention to. We notice it only when we’re facing an issue. It normally works out of the box but really nobody gets the crux.
You search online for clarifications but they barely help and add more confusion.
Here are some schemas trying to decipher the mystery that domain name resolution came to be.
One, two, and three, I think you get me, it is not easy. It’s never as
simple as taking a hostname as a string, getting the DNS address in the
/etc/resolv.conf config, then sending a request to the DNS on port 53
to be greeted back with the IP.
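To make that naive picture concrete, here is a minimal sketch of what "sending a request to the DNS" means on the wire: a hand-packed query for an A record, following the RFC 1035 message layout. The function name `build_dns_query` and the fixed query id are my own choices for illustration, and the sketch only builds the packet, it doesn't send it or parse a reply.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (only RD, "recursion desired", is set),
    # then 1 question and 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: every label is length-prefixed, a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the Internet class).
    return header + qname + struct.pack("!HH", 1, 1)

packet = build_dns_query("example.com")
print(len(packet), packet.hex())
```

Actually sending it would be a UDP `sendto` towards the nameserver on port 53, and that is precisely the part that hides all the questions the rest of this talk is about: which nameserver, which name, and whether DNS is even the right place to ask.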
Behind the scenes there are a ton of files and libraries involved, all of this to get a domain name solved.
So in this talk we’ll try to create some order to try to understand things
as an end-user. Let’s make sense and reason behind this mess even if I
have to say, I don’t get it much myself.
I can’t assert that I haven’t made mistakes but if I did, please correct me, that would be great!
Let’s start with the misfits, the ones that don’t follow the rules,
the not-invented-here syndrome found within our tools.
When it comes to DNS resolution, there’s no one-size-fits-all solution. Obviously, many of us don’t want to deal with all the complexity, so we say, “let’s pack these bytes ourselves, and forget the hassle”.
That’s pure heresy though. We’d prefer everything to work the same way, so that it’s easier to follow. It would be preferable that they all use the same lib, to all have the same behavior. That is, in our case to rely on the C standard lib, or the POSIX API our savior.
In all cases, let’s note some software that doesn’t rely on it, as we said, all the misfits.
- The ISC/BSD BIND tools: from host, to dig, to drill, to nslookup, and more, used for debugging chores.
- Firefox/Chrome/Chromium: There are the browsers, because they are one of a kind, bypassing the libc and POSIX mechanism, implementing their own DNS API for performance reasons and perfectionism.
- Any application needing advanced DNS features, other than simple name to IP.
- Languages that don’t wrap around libc: The Go programming language comes to mind. It implements its own resolver API.
Fortunately, I can ease your mind by letting you know that all
of these will at least respect
configurations. Files that we’ll see in the next sections.
I’ve taken a look at over a dozen different technologies and I think the
best way to understand them is through their archaeologies. There’s a
lot that can be explained about DNS resolution simply based on all the
The main thing you need to understand is that there’s not a single clean library call to resolve a hostname. Standards and new specs have piled up over the years, with some software that hasn’t followed, at the risk of disappearing.
Overall, libc and POSIX provide multiple resolution APIs:
- There’s the historic, low level one provided by ISC/BSD BIND resolver
implementation within libc. Accessed through
gethostbyname(3) and related functions, implementing an obsolete POSIX C specification.
- getaddrinfo(3), which is the modern POSIX C API for name resolution.
All these combinations, ladies and gentlemen, are the standard ways
to resolve a name.
Newer applications will use
getaddrinfo while older ones will use
gethostbyname. Both of these two will often rely on something called
NSS and another part to manage
Now let’s dive into each of these and you’ll get them like a breeze.
The resolver layer is the oldest and most stable in our quest. It originates from 1983, today almost 37 years ago, at Berkeley university.
It comes from a project called BIND, Berkeley Internet Name Domain, which
was sponsored by a DARPA grant. And like the Berkeley sockets that gave
rise to the internet, it has now turned into much, much pain.
It was the very first implementation of the DNS specifications. It got released in BSD4.3 and today the BIND project is maintained by the Internet Systems Consortium, aka ISC.
It not only offers servers and clients, and the debug tools which we
mentioned earlier, but also offers a library called “libbind”. This
library is the de facto implementation, the standard resolver, the one
of a kind. It is initially based on all the original RFC discussions,
namely RFC 881, 882, and 883.
The BSD people wrote technical papers assessing its feasibility, and went on recommending and implementing it within BSD.
At that point BIND wasn’t a standard yet, it was an optionally-compiled
code for those who wanted to get their feet wet, those who wanted to
Then it became part of the C standard library interface through
resolv.h, and closed the case
If you take a look at most Unix-like systems today, from MacOS, to
OpenBSD, to Linux, and company, you’ll see clearly in
copyright going back to 1983, to that very date. But obviously, it depends
on the choice of the implementer, a case by case
So then the code diverged: there’s the libresolv provided by the C standardization and the libbind provided by the BIND implementation. However, most Unices only add small changes specific to their needs. For example, the resolver in glibc is baselined off libbind from BIND version 8.2.3.
This layer is normally used for low level DNS interactions because it’s missing the goodies we’ll see later in this presentation.
Now let’s talk about environments and configurations.
The resolver configuration file
The resolver configuration files were mentioned in BIND’s first release, in section 184.108.40.206 of “The Design and Implementation of ‘Domain Name Resolver’” by Mark Painter, based on RFC 883, part of the DNS RFC series.
This particular file being
/etc/resolv.conf, you’ll see it hardcoded in
resolv.h and if that file is missing, it’ll fall back to the localhost
as the DNS, just to be safe.
/etc/host.conf, according to the manpage also
“the resolver configuration file”, it’s so appropriately named. It’s a
conf that dictates the working of
/etc/hosts, the “static table lookup
So what’s in these files?
resolv.conf takes care of how to resolve names and which
to use for that, while
hosts simply has a list of known host aliases,
ip + name, as simple as that.
In resolv.conf you can also have a
search list for domains.
That is, if a name you’re searching for doesn’t have the minimum number
of dots in it, then it’ll append one of these domains to it
and keep searching until it finds something that fits.
This can also be manipulated in an environment variable
There can also be a sortlist IP netmask, for when there are many results to match but you don’t want to give priority to the cloud VPS that lives only for cash.
Finally, there’s the
option field, also overridden on the command
line by the
RES_OPTIONS environment variable. It manipulates the minimum
number of dots we mentioned and, if you want, can also set debug as enabled.
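To tie the resolv.conf directives together, here is a small sketch that parses the keywords discussed above and then applies the search list with the dots threshold. The sample content, the addresses, and the names `parse_resolv_conf` and `candidate_names` are all made up for illustration, and comment handling is simplified. The ordering rule (enough dots means try the name as-is first, otherwise try the search domains first) is my reading of the resolv.conf man page, so treat it as an approximation of what a real resolver does.

```python
SAMPLE = """
# hypothetical resolv.conf content, for illustration only
nameserver 192.0.2.53
search corp.example.com example.com
options ndots:2 debug
"""

def parse_resolv_conf(text: str) -> dict:
    """Parse the nameserver, search, sortlist and options directives."""
    conf = {"nameserver": [], "search": [], "sortlist": [], "options": {}}
    for raw in text.splitlines():
        # Simplification: treat anything after '#' or ';' as a comment.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            conf["nameserver"].append(values[0])
        elif keyword in ("search", "sortlist"):
            conf[keyword] = values  # in this sketch the last occurrence wins
        elif keyword == "options":
            for opt in values:
                name, _, value = opt.partition(":")
                conf["options"][name] = int(value) if value.isdigit() else True
    return conf

def candidate_names(name: str, search: list, ndots: int = 1) -> list:
    """List the names to try, in order, for a given search list and ndots."""
    if name.endswith("."):
        return [name]  # a trailing dot means the name is already absolute
    with_search = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + with_search  # enough dots: try it as-is first
    return with_search + [name]      # too few dots: search domains first

conf = parse_resolv_conf(SAMPLE)
print(conf)
print(candidate_names("db", conf["search"], conf["options"]["ndots"]))
```

With this sample, a short name like `db` is first tried under each search domain and only then on its own, which is why a typo in a hostname can turn into several DNS queries.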
The hosts file is but a key-value db, simply made of domain
names and IPs.
Its config also lets you change the order of results and for the rest
host.conf to consult.
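Since the hosts file really is just a key-value db, a lookup table can be built from it in a few lines. This is only a sketch: the sample entries are invented (192.0.2.x is a documentation range), `parse_hosts` is a name I made up, and it ignores everything host.conf can tweak.

```python
SAMPLE_HOSTS = """
127.0.0.1   localhost
::1         localhost ip6-localhost
192.0.2.10  build.example.com build   # hypothetical entry
"""

def parse_hosts(text: str) -> dict:
    """Map every name and alias to the list of addresses declared for it."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Each line is: address, canonical name, then optional aliases.
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table

hosts = parse_hosts(SAMPLE_HOSTS)
print(hosts["localhost"])
print(hosts["build"])
```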
So remember, that all of these are mostly used everywhere because it’s the lowest layer. So it’s used by libbind and libresolv but also the custom NIH syndrome
Alright, so far that’s all classic clean stuff. Let’s move on to the next sections, you’ll scratch your head until there’s no dandruff.
gethostbyname(3) and getaddrinfo(3)
The C library POSIX specs create a superset over the C standard
library. They add a few simpler calls to resolve hostnames and make it
easy. These focus on returning A and AAAA records only, IPv4 and IPv6
There’s gethostbyname(3), which is deprecated, and there’s the newer
getaddrinfo(3) defined in IEEE Std 1003.1g-2000, which mainly adds
RFC 3493 support, meaning IPv6 is now supported. So applications are recommended to
use this updated version unless they want to divert from mainland.
There are functions to resolve IP addresses to host names, but let’s focus only on name to IP for today, I know it’s lame.
Apart from IPv6 support being added, some internal structures have been updated as they weren’t so safe between subsequent calls and thus could be your demise and your fall.
Obviously they both return different structures.
A hostent struct is returned to the
gethostbyname function caller.
getaddrinfo returns an
Both being defined in the
Some libc implementations will get fancy and add their own modified
gethostbyname. For instance in glibc they add support for
IPv6 in their modified
gethostbyname2 for backward compatibility.
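To see the two return shapes side by side without writing C, Python’s `socket` module wraps both calls: `socket.getaddrinfo` and `socket.gethostbyname_ex`. The sketch below is not how you’d write it against libc, the helper names and dictionary keys are mine (the keys just mimic the hostent field names), and I pass a numeric address so the example runs without touching any nameserver. A hostname such as "localhost" goes through the exact same calls and then triggers the real lookups.

```python
import socket

def lookup_modern(host, port=None):
    """getaddrinfo(3) style: a list of results, IPv4 and IPv6 alike."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, socktype, proto, canonname, sockaddr).
    return [(family, sockaddr[0]) for family, _type, _proto, _canon, sockaddr in infos]

def lookup_legacy(host):
    """gethostbyname(3) style: a single hostent-like record, IPv4 only."""
    name, aliases, addresses = socket.gethostbyname_ex(host)
    return {"h_name": name, "h_aliases": aliases, "h_addr_list": addresses}

print(lookup_modern("127.0.0.1"))
print(lookup_legacy("127.0.0.1"))
```

The difference in shape is the point: the modern call hands back one entry per usable address, tagged with its family, while the legacy one squeezes everything into a single IPv4-only record.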
Regarding configuration files,
getaddrinfo will consult
which takes care of the precedence of the addresses returned in the
results. And now, you’re going to brandish your torch yelling at me “but
resolver(3) already does that by default”. But I’ll let you know that
resolver(3) is interested in DNS calls only, while these two POSIX
functions in their egocentrism are more interested in all the ways,
files, and mechanisms that a name can be converted to an IP.
That is, they often rely on something called NSS which is what we’ll see in our next analysis.
getaddrinfo(3) will most likely rely on
the NSS service, but what is NSS, aka Name Service Switch?
First of all it is not to be confused with “Network Security Services”, which has the same acronym but has a lib called
-libnss. In our case it’s
-lnetdb, with the
netdb.h header, so keep this in mind for later.
To understand what NSS is, we, again, have to go back in time, back
when the tech was still in its prime.
There has always been the idea of sharing configurations between machines, however back in the day it was all hardcoded, with the exception of Ultrix.
Hardcoded in files like
aliases for emails,
/etc/hosts for local
domains, the finger database and all that it entails. This idea dates
back for so long that
netdb.h header was almost always there, but was
looking in these files we mentioned earlier
There are also a bunch of POSIX functions to get these values: getservbyname, gethostent, gethostbyname, getservbyport, etc. I think you can continue.
From that point on we needed something more flexible, and so Solaris OS
said let’s not have it hardcoded, that’s not acceptable. Let’s create
something called the Yellow Pages, a sort of phone book for configurations
brokerage. But the name Yellow Pages had legal issues so let’s go with NIS,
for the Network Information Service.
Other Unices liked what they were doing in their business so they reproduced it in something called NSS. Though NSS, the Name Service Switch, is much simpler than NIS.
Let’s have a side note about OpenBSD OS which doesn’t implement NSS
but has a pseudo-N.I.S., something called the
ypserv(8), the Yellow
Pages written by Theo de Raadt from scratch, but he doesn’t care about
the legal name wrath.
On OpenBSD you can also find the
The name-service switch dispatcher, something similar to NSS
But I’m not sure, I’ll recheck my citations.
So let’s summarize, NSS is a client-server directory service protocol that has as role to distribute system config between different computers, to keep them harmonized. It is more flexible than the fixed files in libc and POSIX, and is arguably like LDAP, or zookeeper, if you know it. Or actually, like any modern way to share configs between containers and microservices.
“But what does it have to do with domain names”, you may ask, well, a map of name with IP is a config like any other, so it’s the same task. That also includes things from hosts, password, port, aliases, and groups. Yep, it’s quite the big soup.
Apart from the functions in POSIX there is a command line utility that
goes by the name of
getent that lets you access NSS facilities to do
simple queries for its entries.
So for example you can get a service port based on the name of that service. Yes, simply the name suffices.
This particular module will read the
NSS is quite versatile.
We can obviously query for a hostname which is our main game.
And note that you can disable the IDN encoding too. Remember all that domain name we did on the forums, all that voodoo.
So how is NSS actually working, how does it also do the resolving?
The NSS library consults the
files and depending on the entries it will sequentially attempt until
it’s satisfied, until it finds what it wants, until it has met the demand.
You’ll find the “hosts” entry in this file, along with a list of strings on its right.
These strings are the modules which will dynamically be loaded and
sequentially executed, the format even allowing to have appended
Like here I’m skipping resolve plugin if it’s not available on my machine.
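The "list of modules, tried in order, with optional bracketed actions" format is easy to picture once parsed. Here is a hedged sketch: the sample line is a hypothetical one in the style commonly seen on systemd machines, not taken from the original slides, and `parse_nsswitch_hosts` is a name I invented. It only extracts the chain, it doesn’t load or run any module.

```python
import re

SAMPLE_NSSWITCH = """
passwd: files
# hypothetical hosts line, for illustration only
hosts:  files resolve [!UNAVAIL=return] dns myhostname
"""

def parse_nsswitch_hosts(text: str) -> list:
    """Return the ordered (module, actions) chain of the 'hosts' database."""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line.startswith("hosts:"):
            continue
        chain = []
        # Tokens are either module names or a bracketed list of actions.
        for token in re.findall(r"\[[^\]]*\]|\S+", line[len("hosts:"):]):
            if token.startswith("["):
                if chain:  # actions apply to the module right before them
                    chain[-1][1].extend(token[1:-1].split())
            else:
                chain.append((token, []))
        return chain
    return []

for module, actions in parse_nsswitch_hosts(SAMPLE_NSSWITCH):
    print(module, actions)
```

In this sample, the `[!UNAVAIL=return]` action says: unless the resolve module is unavailable, stop at its answer. So the chain only continues to dns when resolve isn’t there.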
To get a list of all modules, you can look in your lib directory mess
for anything that starts with
The most common modules are the following: files, dns, nis, myhostname, and resolve (for systemd-resolved).
- files: reads a local file, in our case /etc/hosts, no polling or anything.
- dns: will try to resolve the name remotely, in this case yes, it’s pulling it.
- nis: to use Solaris YP/NIS.
- myhostname: which reads local files such as /etc/hosts and /etc/hostname, similar to the files plugin in case you missed.
- resolve: the resolve plugin is systemd-resolved, yes, don’t put me on a crucifix.
And there’s a bunch of others, in case you’re in a mood to be a crusader.
Let’s open a parenthesis on the
resolve plugin, before you throw it
quickly in the dustbin. It’s quite advanced, having multiple features from
caching, to DNSSEC validation, to resolvconf, as well as being an NSS
plugin. And when used as an NSS plugin, you communicate with systemd-resolved
via dbus sockets, otherwise it always listens on port 53 for fallback
in case you didn’t use NSS.
You can consult its
part of the
org.freedesktop.resolve1.Manager dbus object.
Now let’s move to something else, something you haven’t thought of yet.
As we said, resolv.conf is used by all these components, but not only them, also all network agents. They are also in charge of setting or changing the DNS address, each of them: DHCP client, PPP daemon, VPN manager, network manager, they all want access. And what about having two network connections concurrently, each requiring their own separate DNS, obviously?
So everyone wants to use the resolv.conf file, thus we need a manager
to handle it. We want to avoid an inconsistent state, it’s vital not
to let everyone mess with it, and that is what
resolvconf(8)’s role is.
Anyone wanting to change the resolv.conf should instead pass through resolvconf to avoid the hassle. It does that by using its resolvconf command line executable. Similarly to the resolv.conf configuration, you can pass anything to it like domain, search, and options.
Now resolv.conf is rarely a plain normal file itself because the manager finds it easier to create a symbolic link and avoid the abusiveness. The default implementation has it in
Accordingly, like any other tooling, resolvconf has configuration
/etc/resolvconf.conf, and a directory with hooks in
/etc/resolvconf/. Within these files you can mention if you want the
symlink to be at another location.
I’m saying default implementation because like anything else on a system you can replace it with your own concoction. Two popular alternative solutions to this problem: openresolv and systemd-resolved, which we mentioned earlier.
So resolv.conf is rarely a file, it’s more of a symlink. Check all of these for example, you’ll be surprised I think.
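If you want to check this on your own machine, a few lines are enough to tell whether resolv.conf is a real file or a symlink planted by some manager. This is only a sketch: `describe_resolv_conf` is a helper name I made up, and where the link ends up pointing depends entirely on which manager your system runs.

```python
import os

def describe_resolv_conf(path="/etc/resolv.conf"):
    """Say whether the path is missing, a symlink (and to what), or a plain file."""
    if not os.path.lexists(path):
        return "missing"
    if os.path.islink(path):
        # realpath follows the whole chain of links down to the final target.
        return f"symlink -> {os.path.realpath(path)}"
    return "not a symlink"

print(describe_resolv_conf())
```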
In computers you can make anything faster with another level of
indirection. That’s what all cache mechanisms try to offer and domain
name resolving is no exception.
There are two places where caching is available, either through a local dns proxy or through something called nscd. Just remember that this last one isn’t very stable.
Let’s start with nscd which is an NSS proxy, so it not only caches the DNS queries but also anything related to getting an NSS entry.
The other caching method is to run your own local dns server, be it bind9,
djbdns, dnscache, lwresd, dnscrypt-proxy or any other resolver.
These can either be full featured, bells and whistles or only provide lightweight cache proxy if you’re not feeling like you want the details.
Another reason to run such service would be to block ads and all their malice.
Also, just beware of flushing the cache, otherwise you’ll get surprises that will make you crash.
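To illustrate why a stale cache can surprise you, here is a toy caching layer in front of an arbitrary lookup function. It has nothing to do with how nscd or a local DNS proxy are actually implemented: the class name `CachingResolver`, the fixed time-to-live, and the injectable clock are all assumptions made to keep the sketch small and testable.

```python
import time

class CachingResolver:
    """Remember answers from a lookup function for a fixed time-to-live."""

    def __init__(self, resolve, ttl=60.0, clock=time.monotonic):
        self._resolve = resolve  # the real (slow) lookup function
        self._ttl = ttl          # seconds an answer stays valid
        self._clock = clock      # injectable so the expiry can be tested
        self._cache = {}         # name -> (expiry time, answer)

    def lookup(self, name):
        now = self._clock()
        hit = self._cache.get(name)
        if hit and hit[0] > now:
            return hit[1]  # still fresh: serve it without asking again
        answer = self._resolve(name)
        self._cache[name] = (now + self._ttl, answer)
        return answer

    def flush(self):
        """Forget everything, e.g. after the DNS settings changed."""
        self._cache.clear()

calls = []

def fake_resolve(name):
    # Stand-in for a real lookup, returns a documentation-range address.
    calls.append(name)
    return "192.0.2.1"

resolver = CachingResolver(fake_resolve, ttl=10, clock=lambda: 0.0)
resolver.lookup("example.com")
resolver.lookup("example.com")
print(calls)  # the second lookup never reached fake_resolve
```

Until the entry expires or `flush()` is called, the cached answer keeps being served even if the real record has changed, which is exactly the kind of surprise mentioned above.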
EDIT: OpenBSD uses
unwind.8 as a local DNS server caching along with
resolvd.8 daemon to manage
How To Debug
So now you sort of know that it depends on what everything uses. Once you’ve got that you can now start an analysis.
You can use a BIND tool to debug if DNS is the fool, or simply do a Wireshark trace if you don’t want to bother or these are not under your grace.
You can also check which NSS plugins are loaded, and make sure they’re not aborted.
Remember that each tool can have its own configurations, so it adds complexity to the equation.
Let’s conclude here.
You should now be comfortable with anything in the domain name resolution sphere. It’s all about shared config management, like zookeeper, ldap, and these other arrangements.
I hope you’ve learned a thing or two and that domain name resolution is
less of a taboo.
Thanks for listening and have a nice evening.