This research paper details recent concerns regarding DNS services and consumer privacy. It summarizes the concepts of DNS, describes how DNS is used on the Internet, and explains how DNS services are provided to consumers and what types of entities provide the service for daily use. The paper continues with a discussion of how DNS has been, and currently is, used as a mechanism to collect and profile the behavior of users on the Internet, and how these mechanisms can be abused. In closing, it presents the alternatives available to consumers for DNS and suggests methods for finding a balance between privacy and the utility of Internet service.
DNS is an acronym for Domain Name System. It is one of the most fundamental and important services provided throughout the Internet. Nearly every networked client that uses a symbolic name to access a web server, email server or any other service depends on DNS. The domain name system translates symbolic names like www.ibm.com or mail.google.com into numeric Internet Protocol (IP) addresses, which are 32 bits long in IPv4. DNS also translates IP addresses back into domain names. The translation process from a name to an address is called forward lookup. The translation process from an address back into a symbolic name is called reverse lookup. Forward lookup is used more often than reverse lookup. The DNS concept dates back to 1987. RFC 1034 and RFC 1035 define the concepts, specification and implementation of the domain name system and protocol we use today on the Internet. According to RFC 1034 (RFC1034, 2009), the DNS has three major components:
- the domain name space and resource records, which are specifications for a tree-structured name space and data associated with the names,
- name servers, which are server programs that hold information about the domain tree’s structure and set information, and
- resolvers, which are programs that extract information from name servers in response to client requests.
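The forward and reverse lookups described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the RFCs: the function names are invented here, and the two network functions simply hand the question to whatever resolver the operating system is configured to use. The first function needs no network at all; it shows the special name (under in-addr.arpa for IPv4) that a reverse lookup actually queries.

```python
import ipaddress
import socket


def reverse_lookup_name(ip: str) -> str:
    """Return the domain name that a reverse lookup for this address queries."""
    # The address octets are reversed and placed under in-addr.arpa (IPv4).
    return ipaddress.ip_address(ip).reverse_pointer


def forward_lookup(name: str) -> list[str]:
    """Forward lookup: ask the system resolver for IPv4 addresses (needs network)."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry ends with a (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})


def reverse_lookup(ip: str) -> str:
    """Reverse lookup: ask the system resolver for the name of an address (needs network)."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname


# 192.0.2.10 is an address reserved for documentation, used only as an example.
print(reverse_lookup_name("192.0.2.10"))
```

The results of forward_lookup and reverse_lookup depend on the network and on the resolver in use, so no output is shown for them.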
In the simplest form, the servers providing resolution of domain names and addresses are organized into a hierarchy. Resolving a name to an IP address may take many queries across several domain name servers located in different places on the Internet. Resolving a domain name to an IP address happens from right to left. For a name such as www.gap.com, the root servers are queried first for the servers that handle the .com top-level domain. Each set of servers is then queried for the servers of the next component to the left. The .com servers are queried for the gap name and return one or more servers that handle the gap.com domain. The gap.com servers are then queried for the address of www within the domain. By querying servers in turn from the root domain down to the specific sub-domain, the IP address of www.gap.com is found. Some details have been left out in this example, but this is in essence what happens. Performing this full sequence of queries each time a client asks for the IP address of www.gap.com would place too much burden on the communications infrastructure of the Internet, so DNS information is cached as well. Each answer includes a time to live, ranging from a few seconds to days, that states how long the information may be treated as current. Clients and servers can retain this resolution data in memory until it expires, and then query for it again from the source servers. Caching allows repeated queries for the same domain name to resolve almost instantaneously. Caching of DNS information can happen at several levels of scale, from the workstation to the local network and up to the Internet service provider.
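The right-to-left walk through the hierarchy can be illustrated with a toy model. The sketch below makes no network queries: the ZONES table, the server labels, and the address are all invented for illustration, and real servers exchange DNS messages rather than dictionary lookups.

```python
# Toy delegation data: each "server" maps a name to a referral or a final address.
# All entries are hypothetical; 192.0.2.80 is a documentation-only address.
ZONES = {
    "root": {"com.": ("referral", "com-servers")},
    "com-servers": {"gap.com.": ("referral", "gap-servers")},
    "gap-servers": {"www.gap.com.": ("address", "192.0.2.80")},
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Resolve a name by following referrals from the root, right to left."""
    labels = name.rstrip(".").split(".")       # e.g. ["www", "gap", "com"]
    server = "root"
    path = [server]                            # servers asked, in order
    # Ask about successively longer suffixes: "com.", "gap.com.", "www.gap.com."
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:]) + "."
        kind, value = ZONES[server][suffix]
        if kind == "address":
            return value, path
        server = value                         # follow the referral downward
        path.append(server)
    raise LookupError(f"no address found for {name}")


address, path = resolve("www.gap.com")
print(address, path)
# prints: 192.0.2.80 ['root', 'com-servers', 'gap-servers']
```

Each step asks the current server only about the next component to the left, which is why three separate sets of servers are involved in answering a single question.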
As mentioned above, there are nameservers and resolvers. Nameservers answer queries by translating a name to an address or an address to a name. Resolvers are built into our workstations and other Internet-capable devices. A resolver implements the client side of the DNS protocol and asks a nameserver to perform a translation. A caching nameserver is a hybrid that both provides service to resolvers (DNS clients) and acts as a resolver itself, querying servers upstream from it to perform forward or reverse resolution. Caching nameservers can be found in the consumer firewall devices we use in our homes. They are also very often run by large organizations, including Internet service providers, as a convenience to their subscribers. The main purpose of a caching nameserver is to provide a resolution service closer to the client and reduce the number of queries traveling across the Internet. Caching nameservers are a performance optimization.
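The behavior of a caching nameserver can be sketched as a small class that answers from memory while an entry is still current and asks upstream only when it is not. This is a simplified model under stated assumptions: the class name, the upstream callable, and the injected clock are hypothetical, and a real caching nameserver handles many record types and error cases that are omitted here.

```python
import time
from typing import Callable


class CachingResolver:
    """Answer from a local cache until an entry expires, then ask upstream again."""

    def __init__(self,
                 upstream: Callable[[str], tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic):
        # upstream(name) is assumed to return (address, seconds the answer stays current).
        self._upstream = upstream
        self._clock = clock
        self._cache: dict[str, tuple[str, float]] = {}   # name -> (address, expiry time)
        self.upstream_queries = 0

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]                   # still current: no upstream traffic
        address, ttl = self._upstream(name)    # missing or expired: query upstream
        self.upstream_queries += 1
        self._cache[name] = (address, now + ttl)
        return address


# Usage with a stand-in upstream that always answers with a 60-second lifetime.
resolver = CachingResolver(lambda name: ("192.0.2.80", 60.0))
resolver.lookup("www.gap.com")
resolver.lookup("www.gap.com")
print(resolver.upstream_queries)
# prints: 1
```

The second lookup is served from memory, which is the reduction in Internet-wide queries that the paragraph above describes.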
Internet service providers are the most common providers of the caching DNS services that consumers use to resolve domain names to IP addresses. Your employer, if it has a large enough IT department, may elect to run its own caching DNS system for performance reasons. Your workstation or notebook at the office may be using a DNS server that runs on the local area network. That server queries other servers on the Internet as needed to perform forward and reverse resolution. Recently, several alternative, value-added DNS providers have increased their presence. One of the more popular services is called OpenDNS. In addition to providing name and address resolution services for free, OpenDNS maintains a system that prevents name resolution of sites known to distribute malware and viruses. It also allows a customer to tailor which categories of sites on the Internet it will resolve. For example, a parent of a family with young children can elect to prevent OpenDNS from resolving sites with violent or sexually explicit content. Instead of receiving an address for the objectionable site, the user’s browser is redirected to a page within OpenDNS’ network explaining why they have arrived there. What is important to note here is that a consumer must elect to use OpenDNS, and it is implied that they understand how the service will behave. Not all consumers are informed about, or understand, how their provider’s DNS service will behave for them.
Most consumer DSL and cable routers will pull their configuration from the service provider. That configuration will include one or more addresses of DNS servers. DSL and cable routers will also act as Dynamic Host Configuration Protocol (DHCP) servers for internal networks. The router will provide IP addresses to each client. The router will also do one of two things: pass along to each client the DNS server addresses it received from the provider, or act as a caching nameserver and give each client its own address as the DNS server. Unless you have taken action to use a different DNS server, there is a good chance you are using the DNS servers supplied by your Internet service provider.
The privacy issues for DNS differ depending on whose services are used. Let us assume a consumer is at home and the default configuration for their Internet connection uses the DNS servers provided by their ISP. The ISP may also be the telephone company and television company of this user. The ISP issues the IP address to the consumer’s cable or DSL router. When queries are made to the ISP’s DNS servers, the source IP address will be that of the customer’s router. Using relational database technology, the sites queried from the home router can be stored and analyzed to form a behavioral profile of this customer’s interests. That information can be used to market new telecommunications products to them, or it can be sold to other businesses or potentially provided to government entities to help understand this family’s patterns of Internet usage. This is possible because of the ability to relate key elements of information - DNS queries, router address, and existing personal data on file - back to a customer and others in the customer’s home. Recently Internet service providers have tried a new approach in using DNS to help generate revenue streams. “Several consumer ISPs such as Cablevision’s Optimum Online, Comcast, Time Warner, Rogers, and Bell Sympatico have also started the practice of DNS hijacking on non-existent domain names, for the purpose of making money by displaying advertisements. This practice violates the RFC standard for DNS (NXDOMAIN) responses, and can potentially open users to cross-site scripting attacks.” (HIJACK, 2009). This technique redirects a user’s browser from an error page to a search page or advertisement page when a non-existent domain name is requested through DNS. There have been documented cases of redirecting legitimate addresses to an alternate web site as well. Most of these approaches require the manipulation of established Internet protocols such as DNS. Not surprisingly, they are met with consumer hostility.
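The profiling scenario described in this section, relating DNS queries and router addresses to customer data on file, amounts to a simple join. The sketch below uses entirely hypothetical data: the account labels, the query log, and the function name are invented, and the addresses come from ranges reserved for documentation. It is meant only to show how little information is needed, not how any particular provider operates.

```python
from collections import Counter, defaultdict

# Hypothetical subscriber records, keyed by the IP address issued to each router.
SUBSCRIBERS = {"198.51.100.7": "Account 1001", "198.51.100.9": "Account 1002"}

# Hypothetical DNS query log: (source IP of the query, name that was looked up).
QUERY_LOG = [
    ("198.51.100.7", "www.gap.com"),
    ("198.51.100.7", "www.gap.com"),
    ("198.51.100.7", "mail.google.com"),
    ("198.51.100.9", "www.ibm.com"),
]


def build_profiles(subscribers, query_log):
    """Relate each query to an account and count how often each name was requested."""
    profiles = defaultdict(Counter)
    for source_ip, name in query_log:
        account = subscribers.get(source_ip)
        if account is not None:                # the query came from a known customer
            profiles[account][name] += 1
    # Most frequently requested names first, per account.
    return {account: counts.most_common() for account, counts in profiles.items()}


for account, interests in build_profiles(SUBSCRIBERS, QUERY_LOG).items():
    print(account, interests)
# prints:
# Account 1001 [('www.gap.com', 2), ('mail.google.com', 1)]
# Account 1002 [('www.ibm.com', 1)]
```

Because the router address is the only key required, the resulting profile describes everyone in the household who shares that connection.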
According to Kirk (2009), “ISPs are trying to find revenue streams other than simply providing Internet access to subscribers for a monthly fee. Some have investigated behavioral advertising systems, which monitor a person’s Web surfing in order to deliver targeted ads. Those systems have largely failed to take hold due to privacy concerns.” Because deploying these DNS and web-based redirection systems requires the manipulation of Internet protocols on several levels, some have been found to be vulnerable to manipulation that exposes clients to exploits and attacks. “Kaminsky demonstrated [a] vulnerability by finding a way to insert a YouTube video from 80s pop star Rick Astley into Facebook and PayPal domains. But a black hat hacker could instead embed a password-stealing Trojan. The attack might also allow hackers to pretend to be a logged-in user, or to send e-mails and add friends to a Facebook account.” (Singel, 2008).
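A consumer who wants to know whether their own resolver rewrites answers for non-existent names could run a rough check along the following lines. This is a heuristic sketch, not a definitive test: the function names are invented here, the probe names are random labels under .com that are assumed (not guaranteed) to be unregistered, and an unexpected address can have other causes, such as a captive portal on a public network.

```python
import socket
import uuid
from typing import Callable, Optional


def system_resolve(name: str) -> Optional[str]:
    """Return an IPv4 address from the system resolver, or None if resolution fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:                            # covers socket.gaierror and socket.herror
        return None


def resolver_rewrites_nxdomain(
        resolve: Callable[[str], Optional[str]] = system_resolve,
        probes: int = 3) -> bool:
    """Return True if any randomly generated, presumably non-existent name resolves."""
    for _ in range(probes):
        # Assumption: a random 32-character label under .com is not registered.
        name = f"{uuid.uuid4().hex}.com"
        if resolve(name) is not None:
            return True                        # an address came back for a made-up name
    return False


if __name__ == "__main__":
    # Uses the DNS servers this machine is configured with (needs network).
    print("Resolver appears to rewrite NXDOMAIN:", resolver_rewrites_nxdomain())
```

The resolve parameter exists so the check can be pointed at a different lookup function; by default it uses whatever DNS servers the machine received from its router or provider.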
This paper detailed some recent concerns regarding DNS and privacy. In addition to discussing the concepts of DNS, it detailed how DNS services are provided to consumers and who provides them. A discussion of how DNS can be leveraged as a mechanism to collect and profile consumer behavior followed, along with the alternatives available to consumers to limit the collection of their behavioral data. Internet service providers are under pressure to discover new avenues of income and increase existing ones. Consumers are likewise under constant pressure to maintain their guard against subtle privacy violations. For now, consumers retain the ability to limit the effects of this manipulation of Internet standards and so avoid unknowingly leaking personal and behavioral information to a wider audience. As discussed in this paper, methods are available to reduce the risk that consumers’ privacy is invaded without their full knowledge.
HIJACK. (2009). DNS hijacking. Retrieved August 9, 2009 from http://en.wikipedia.org/wiki/DNS_hijacking.
Kirk, J. (2009). Comcast Redirects Bad URLs to Pages With Advertising. PC World, Business Center. Retrieved August 8, 2009 from http://www.pcworld.com/businesscenter/article/169723/comcast_redirects_bad_urls_to_pages_with_advertising.html
RFC1034. (2009). Request for Comments: 1034, DOMAIN NAMES - CONCEPTS AND FACILITIES. Retrieved August 8, 2009 from http://www.ietf.org/rfc/rfc1034.txt.
RFC1035. (2009). Request for Comments: 1035, DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION. Retrieved August 8, 2009 from http://www.ietf.org/rfc/rfc1035.txt.
Singel, R. (2008). ISPs’ Error Page Ads Let Hackers Hijack Entire Web, Researcher Discloses. Privacy, Crime and Security Online. Wired. April 19, 2008. Retrieved August 7, 2009 from http://www.wired.com/threatlevel/2008/04/isps-error-page/.