I recently interviewed at Google for a system administrator position. A few friends have expressed interest in the questions. Here’s the quick review in case you’re curious…
First session, one hour, two interviewers:
I think I did pretty well on file systems, MySQL, and networking.
File systems: What is a journalled filesystem? What are its advantages? Can you think of a situation where journalling might cause problems?
MySQL: Say you have a newspaper company (or companies) and companies own one or more newspapers. Newspapers have issues, issues contain stories, and stories have attribution to reporters (say one reporter per story). Draw a schema on the board representing the tables you might use and their relationships.
Networking: Forgot what the question was here… I’ll check my notes and possibly update this entry if I find anything.
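For reference, the newspaper schema question above could be sketched roughly as follows. This uses Python's built-in sqlite3 module rather than MySQL, purely so the example is self-contained. Every table and column name here is a hypothetical choice for illustration, not what was actually drawn on the board.

```python
import sqlite3

# Hypothetical schema: one row per company, newspaper, issue, reporter, story.
# Each "child" table points at its parent with a foreign key.
SCHEMA = """
CREATE TABLE company   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE newspaper (id INTEGER PRIMARY KEY,
                        company_id INTEGER NOT NULL REFERENCES company(id),
                        name TEXT NOT NULL);
CREATE TABLE issue     (id INTEGER PRIMARY KEY,
                        newspaper_id INTEGER NOT NULL REFERENCES newspaper(id),
                        published TEXT NOT NULL);
CREATE TABLE reporter  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE story     (id INTEGER PRIMARY KEY,
                        issue_id INTEGER NOT NULL REFERENCES issue(id),
                        reporter_id INTEGER NOT NULL REFERENCES reporter(id),
                        headline TEXT NOT NULL);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data, one row per table.
conn.execute("INSERT INTO company (id, name) VALUES (1, 'Example Media')")
conn.execute("INSERT INTO newspaper (id, company_id, name) VALUES (1, 1, 'Daily Example')")
conn.execute("INSERT INTO issue (id, newspaper_id, published) VALUES (1, 1, '2000-01-01')")
conn.execute("INSERT INTO reporter (id, name) VALUES (1, 'R. Porter')")
conn.execute("INSERT INTO story (id, issue_id, reporter_id, headline) "
             "VALUES (1, 1, 1, 'Sample headline')")

# Walk the relationships from story back up to the owning company.
rows = conn.execute("""
    SELECT c.name, n.name, s.headline, r.name
    FROM story s
    JOIN issue i     ON i.id = s.issue_id
    JOIN newspaper n ON n.id = i.newspaper_id
    JOIN company c   ON c.id = n.company_id
    JOIN reporter r  ON r.id = s.reporter_id
""").fetchall()
print(rows)
```

Putting reporter_id directly on story relies on the "one reporter per story" simplification from the question; multiple reporters per story would need a separate join table.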
Did reasonably well on creating a sample script starting with
L1="words words words"
L2="words words words"
and ending with L3 containing words in L1 *not* also in L2. I got the basic concept but probably lost points on the “it has to run” part. My answer drew heavily from having solved this “subtract” type of problem before in a script, but that was for files. Drawing on that, I proposed two for loops that would echo words from L1 to a temp file, each line prefixed by “9”, and words from L2 to the same file prefixed by “1”. Then the file is munged by sort, uniq, and grep to keep only one of each duplicate word (the copy prefixed by “1”) and then grep out the remaining stuff from L1 which survives in the file because it was unique and is prefixed by “9”. Kind of an “ugly hack” but I readily admitted that I had cribbed from a previous script that I wrote designed to subtract files of arbitrary length, not to subtract sets of words given on one line.
Second session, one hour, two more interviewers:
Did OK on DNS/Resolver. The question started with “If I type ping www.google.com into the shell, describe what happens in terms of resolving and DNS.” My answer involved talking through types of queries and root nameservers, though I missed nsswitch.conf and described a resolver that went straight to resolv.conf. I also gave one wrong answer, and then corrected myself, when describing exactly what was in each query and response packet.
Did OK on the “write a script to identify users whose home dir is not /home/$user and move their directories”… I went for rewriting /etc/passwd and forgot about NIS/LDAP users, but mostly I showed good understanding of Perl and regex.
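The "identify" half of that script question might look something like the sketch below. It is in Python rather than the Perl used in the interview, and it only parses passwd-format text handed to it. The function name, the `base` parameter, and the sample entries are all made up for illustration.

```python
def nonstandard_homes(passwd_text, base="/home"):
    """Return (user, home) pairs whose home directory is not base/user."""
    results = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # malformed entry; skip it rather than guess
        user, home = fields[0], fields[5]
        if home != f"{base}/{user}":
            results.append((user, home))
    return results


# Hypothetical passwd entries for demonstration only.
sample = """\
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/export/bob:/bin/bash
carol:x:1002:1002:Carol:/home/staff/carol:/bin/sh
"""
print(nonstandard_homes(sample))
```

On a real system this would also flag system accounts such as root, so some filter (by UID, for instance) would probably be needed. Reading the flat file also misses NIS/LDAP users. Feeding it the output of `getent passwd`, or using Python's `pwd.getpwall()`, should cover accounts that come through the name service switch.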
I believe I did well on describing what happens when you’re tailing a logfile and the file is moved, and what to do.
I correctly described the difference between hard links and soft links. I messed up on “what’s in an inode vs. what’s in a dir entry” but corrected myself with a little prompting.
Lunch break. Google cafe rocks. I had braised ox tail and dry-braised string beans.
Third session, one hour, hiring manager only:
Mgr asked me “why do you like being a sysadmin” (A: I enjoy problem solving, among other things). Mgr also asked me to come up with a process for upgrading the kernel on 10,000 machines; I believe I did well on that. I started off describing a complete system upgrade via kickstart, but then with a little prompting also described installing kernel RPMs while the system is still running and then doing controlled reboots.
He then showed a network diagram and asked me to walk through troubleshooting steps for a user complaining that access to his mailserver was slow. I asked mostly the right troubleshooting questions, though I assumed, perhaps incorrectly, that the cause wasn’t congestion on the link, because only high latency was observed and no packet loss. My reasoning was that if you max out a link, even if the router can buffer a second or two of traffic, eventually the buffer fills up and something has to be dropped. Eventually I walked through enough “virtual troubleshooting” to determine that, according to MRTG, the link had been running close to capacity since 2am, and that reporting on the flows showed most of the traffic on TCP port 3389. I didn’t immediately see the significance of that port, but I suggested tracking down the two machines involved in the heavy conversation and running netstat -ap to see what was listening on it.
I asked Mgr how big the team is, how is it structured, and how many levels of management between sysadmins and the CEO.
I kept some notes from when I had a phone interview (about a month before). This was one person, one hour.
I think I did well there too. He asked me about the difference in quoting styles in shell and/or Perl, and to describe how I would write a script to parse /etc/passwd to get a list of users. He asked for a simple command to transform comma-separated files into tab-separated; I said “sed -e 's/,/<TAB>/'” where “<TAB>” is a literal ^I or ^V^I depending on the shell.
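As a hedged aside on the comma-to-tab question: the sed expression as quoted would replace only the first comma on each line unless a trailing g flag is added. A rough Python equivalent is sketched below; both function names are made up for illustration. The second function is an assumption about what "comma-separated" might mean in practice, namely files with quoted fields that themselves contain commas.

```python
import csv
import io


def commas_to_tabs(text):
    """Naive version: every comma becomes a tab (like sed with the g flag)."""
    return text.replace(",", "\t")


def csv_to_tsv(text):
    """Quote-aware version: commas inside quoted fields are left alone."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(row)
    return out.getvalue()


print(repr(commas_to_tabs("a,b,c")))
print(repr(csv_to_tsv('a,"b,c",d\n')))
```

The naive version is probably all the interviewer wanted; the csv-based one only matters if the input can contain quoted commas.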
He asked how I would delete a file named “-f” and I gave a bunch of alternatives, eventually coming to the one he was looking for. He asked for a description of how to compile a Linux kernel (I think I did OK on that one), and also for a detailed description of what LILO actually does.
Finally we talked about CIDR, netmask, and how to figure out netmask for a /22 net in some detail, which took me longer to describe over the phone than to draw and point at, but eventually got through it. I also described my “netmask shortcut” which is to figure out how many addresses are in the network based on how many bits, then subtract that number from 256 to get the last byte (example, /28 is 4 bits smaller than /24, 2^4 is 16, 256-16=240, 255.255.255.240)… or if the network is larger than /24, count how many bits removed from /24 it is and use that to figure how many /24s are contained in it, and apply the same logic to the third octet instead (/19 is 5 bits from /24, so it is the size of 32 class C’s, so 256-32 is 224, and you get 255.255.224.0, not that you would ever build a single network with 8190 nodes, but if you did that would be your netmask).
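The shortcut above can be written out as a small Python sketch and cross-checked against the standard library's ipaddress module. The function name is hypothetical, and limiting it to /16 through /32 is a simplification of mine, since the shortcut as described only covers the third and fourth octets.

```python
import ipaddress


def netmask_shortcut(prefix):
    """Apply the 256-minus-block-size shortcut for prefixes /16 through /32."""
    if not 16 <= prefix <= 32:
        raise ValueError("this sketch only handles /16 through /32")
    if prefix >= 24:
        # Number of addresses in the network, subtracted from 256,
        # gives the last octet of the mask.
        size = 2 ** (32 - prefix)
        return f"255.255.255.{256 - size}"
    # Larger than /24: count how many /24s fit and apply the
    # same subtraction to the third octet.
    blocks = 2 ** (24 - prefix)
    return f"255.255.{256 - blocks}.0"


print(netmask_shortcut(28))  # 255.255.255.240
print(netmask_shortcut(22))  # 255.255.252.0
print(netmask_shortcut(19))  # 255.255.224.0

# Cross-check every supported prefix against the standard library.
for p in range(16, 33):
    expected = str(ipaddress.ip_network(f"0.0.0.0/{p}").netmask)
    assert netmask_shortcut(p) == expected
```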
Port 3389 is Microsoft Remote Desktop/Terminal Services, my friend :).
Groovy. Though why it should eat up a link I have no idea. Fun screensaver on the remote machine?
Or file transfers or quality of display settings. Probably file copying (you can share hard drives over the link).
Also, you might know the answer to the other part. If a link is maxed out, would you expect to see packet loss, high latency, or both? What would explain high latency but no packet loss? (Besides QoS placing ICMP first so pings always get priority… assuming no QoS)
If a link is saturated you will have latency, ICMP loss, and a lot of TCP retransmit/retry errors. What would cause high latency but no loss is an overworked router (a very large route table causing high CPU utilization; I saw that all the time) or inefficient routes, like a 256 Kbps circuit in the loop.
Latency is a function of distance, bandwidth, media type, saturation and hops (each router adds 2-10 ms).
Did reasonably well on creating a sample script starting with
L1="words words words"
L2="words words words"
and ending with L3 containing words in L1 *not* also in L2. I got the basic concept but probably lost points on the “it has to run” part.
Did it have to be a shell script? I might have suggested that it be written in Python, because it has Set classes (and later versions have set function calls). However, I don’t remember either syntax; I would have to look at a manual.
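For what it's worth, a minimal Python sketch of the set idea might look like this. The function name and sample words are made up; the variable names L1, L2, and L3 follow the original question.

```python
def subtract_words(l1, l2):
    """Return the words of l1 that do not appear in l2, in original order."""
    exclude = set(l2.split())
    return " ".join(word for word in l1.split() if word not in exclude)


L1 = "apple banana cherry date"
L2 = "banana date fig"
L3 = subtract_words(L1, L2)
print(L3)  # apple cherry
```

A plain `set(L1.split()) - set(L2.split())` would also work, but it loses the word order and collapses duplicates, which may or may not matter for the question as asked.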
He asked how I would delete a file named “-f” and I gave a bunch of alternatives, eventually coming to the one he was looking for.
I didn’t know the answer to this offhand. I can think of ways to do this that involve writing C code (thus bypassing the option processing of rm and unlink), but that’s probably not the desired answer. I thought about it a little and decided to look at the man page for getopt(3) to see if there was any mention of escaping or otherwise disabling option processing. Turns out (at least in FreeBSD) that if you give “--” on the command line, it signals the end of option processing, so anything you give on the command line afterwards won’t be treated as an option.
So in both cases I didn’t know the answer but was able to figure it out. If I were interviewing and didn’t get the job for these reasons, I’d like to know in what ways this would make me different from a candidate who got the job. Can this be quantified in terms of the performance of the company? Or does this indicate some level of engagement that is common to employees? I have heard that some places (like Yahoo!) like to ask these types of questions because they come up in “geek trivia” circles. I guess I don’t attribute any particular value to such information (above and beyond getting a job); not that the information isn’t useful, but it’s just something to be filed away in case it comes up. In other words, I don’t consider it to have some special significance.
If you get hired, I guess you won’t be able to talk about what happens at work much, but I’m very curious about what the differences are between people who were hired by Google and people who weren’t (who seem competent enough to work there).
I would agree that either getting or not getting the answer to something like that probably isn’t a good indicator of future performance.
Also, I had been told by a friend before going into the interview that they would be looking for how I thought something through, so I should make sure to TALK my way through it and (as I believe you said) state my assumptions.
I immediately said that I believe rm has a “--” option on Linux to make it treat the rest of the arguments as non-options. When prompted for more, I also suggested my preferred way of getting rid of files whose names I couldn’t type: instead of rm “-f” I could do “rm ?f” (hopefully directly preceded by “ls ?f” to make sure). I pointed out that this was what I normally do with files that have invisible or control characters like ” ” or xxx^I.
Apparently several perfectly good and provably correct answers didn’t dissuade him from prompting me for more. Eventually, after a couple of hints (I don’t actually remember the hints), I stumbled on the answer HE wanted, which was either “rm ./-f” or “rm `pwd`/-f”. He then admitted that this wasn’t really “better” than the other answers, and wouldn’t work for invisible characters whereas “rm -i *” would. (I still wonder whether I would have been prompted for more if I had said “rm ./-f” first.)
I was also told by the same friend before the interview that they sometimes ask questions that have no answers, just to see how someone thinks about (well, discusses thinking about) the topic.
Ah well. I hope it really is all about the journey and not really the destination.