Keeping track of every computer connected to a network can be a hassle without desktop management software. Even with it, you can’t necessarily tell which physical computer is connected to a given switch port. Eventually you will end up in a switch closet, but that doesn’t necessarily help, especially when you encounter a stack of Cisco 3750s spewing yellow ropes of technological vomit in all directions. So the best things you (we) can determine are a wall socket (if it is labelled) and maybe a MAC address too, if you log in.
But this is all a heap of work that shouldn’t have to be done at all, and it further complicates whatever problem you’re actually trying to troubleshoot. So I decided to come up with a plan. This is quite a long post, but it gives the background to what motivated me to do this in the first place!
Standard problems
In lots of situations, you’ll know what the cause of a network problem is. In some cases, we notice connection issues across the board and it is likely that the IT services department already knows about the problem. Problems with specific PCs often come down to one of the following:
- DHCP scope has been exhausted, so no more IP addresses can be given out
- Port is disconnected (no signal)
- Port is shut down (electrical connection is present but no traffic goes over the wire)
- VLAN mismatch (specific PCs are assigned to a different subnet from the one designated for the rest of a given room)
(This assumes that the PC has been physically checked: that the cable is in the right network interface (each interface has a different MAC address, which can affect FOG host registration settings and port security settings on a switch) and that it is connected through the wall port assigned to that PC, or to a port that we know is, or should be, assigned to a specific network.)
In the case of a DHCP scope issue, the problem can be partly diagnosed from the PXE boot messages; for example, the client may report that it is receiving proxyDHCP offers but no DHCP offers. A disconnected port can be identified with a network cable tester. A shut-down port can be identified in a similar fashion (electrical signal present but no traffic received). In all these cases, though, the only useful information we can give IT services is a patch panel port number and a system MAC address. These won’t necessarily help in locating the right switch in a building that may have 100 or more of them.
A PC may even work fine and connect to the internet with no problem. But if a room of 30 PCs is set to receive multicast traffic and, say, two or three of them are on a different subnet, multicast issues will present themselves and PCs may stop imaging before others have finished. In fact, it is only if someone knows the symptoms and has had experience of this last issue, VLAN mismatches, that they can really identify it as the cause.
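Once you have a list of the addresses in a room, spotting the odd ones out is easy to automate. The following is only a rough sketch: the host names, the addresses and the /24 prefix length are all made up for illustration, and the function name is hypothetical.

```python
import ipaddress
from collections import Counter


def find_subnet_outliers(hosts, prefix_len=24):
    """Return (majority_subnet, outliers) for the hosts in one room.

    hosts maps a host name to its IPv4 address as a string. The /24
    default is an assumption; use whatever prefix your subnets really have.
    """
    if not hosts:
        raise ValueError("no hosts given")
    # Work out which subnet each host's address falls into.
    subnets = {
        name: ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for name, addr in hosts.items()
    }
    # The subnet most hosts share is taken to be the room's intended one.
    majority, _count = Counter(subnets.values()).most_common(1)[0]
    outliers = {name: net for name, net in subnets.items() if net != majority}
    return majority, outliers


# Hypothetical room: two PCs on the expected subnet, one on another.
room = {
    "LAB1-01": "10.1.20.11",
    "LAB1-02": "10.1.20.12",
    "LAB1-03": "10.1.30.13",
}
majority, outliers = find_subnet_outliers(room)
print(majority, outliers)
```

This only tells you that a PC is on the wrong subnet, not which switch port needs fixing, which is the gap the rest of this post is about.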
Working with others
Now, in this environment, all PCs are allocated an IP address in a specific subnet that corresponds to a VLAN, which in turn (usually) corresponds to a room. VLAN configuration is implemented on the switches. If we want to check some of this information to see what has been, perhaps incorrectly, configured, then we need to be able to access the switch configuration. As these switches are (almost all) operated by the IT services people, our team cannot see this configuration. Even if we were given the configuration, it would eventually become outdated, especially as things can be disconnected and reconnected by other people.
So we could be stuck with only being able to provide a patch panel number (the wall socket that a PC is connected to), the room number that the PC is in, the PC’s MAC address and its IP address. With the MAC address, IT services could possibly find out where a PC is connected, but this information is not readily available unless we provide it, and this assumes it will be accurate and never change. This presents another issue: if the configurations are stored on switches, how can any of the information we provide be correlated with what is stored on a switch? The answer is that it can’t, unless someone traces a computer back to a switch port, or someone produces a mapping of this information and stores it elsewhere.
What has been tried
In our networking environment, all computers are connected to switches. Most importantly, this means that, regardless of all the cables and panels, there is a direct link between a switch port and a PC. These switch ports are where VLANs are assigned and where any port security is set. Therefore, being able to identify what is connected to every switch port is highly valuable, arguably to both IT services and our own technical team.
When there is an issue with a port or a PC, somebody has to trace that port back. If a port is broken, someone has to go to where the switch is physically located and trace back the cable from the switch-end of a patch panel to the switchport. As mentioned at the beginning, this is a nightmare approach, but it still has to be done.
However, this method is just too laborious for my liking. So the next thing to try is something called a Fluke tester. These devices, which I think cost way too much money for what we would be using them for (even second hand they appear to be going for around £1000 to £1500 at the time of writing), can be plugged into a wall socket and tell you all sorts of information about what is on the other side. Crucially for us, they tell us the switchport number and the switch IP address. This is actually brilliant and, whilst this method also requires manual work, it is far more accurate. In conjunction with collating a list of wall ports, it can be used to accurately map switch ports to patch panel ports.
This actually worked very effectively and, by late 2013, I had finished making a chart of many of our labs and detailed this on our internal wiki site.
The idea was for part of it to be updated and maintained by our IT services and part of it by our department; we would make sure that PCs were in the correct position according to our own records for lab checks and that they were plugged into a corresponding wall socket. But this requires extensive user input and, predictably, will be prone to user errors. It was also pointed out to me at the time, when I borrowed the Fluke tester, that this collation of patch panel/switch port IDs had already been done by some interns and was now almost certainly already outdated. Nevertheless, I finished the mapping and maintained our side of things as changes were made.
And then changes were made that completely invalidated the entire chart.
Over the summer period in 2014, we had a network upgrade that saw the replacement of over 100 switches around the campus. Whilst nothing was ever said to this effect, it seems as though the switch configurations that were applied were probably from over a year earlier, before any of our requested changes had been made. With absolutely no communication about the upgrade or what might have happened, this presented a huge issue when upgrading our imaging system, as it was slowly realised that there were a lot of small issues with the network configuration, each with minimal overall impact.
A new solution
It is clear by now that something more than just checking each PC individually needs to be done. That method is still viable for single systems that might have the odd issue here or there, but to ensure 100% accuracy across all our systems there needs to be a different approach.
Wireshark provides a really cool piece of functionality that could help. In fact, I have been using it for about a year, ever since I noticed that the Fluke tester picks up its information from LLDP and CDP packets and realised that Wireshark can filter for these two types of packet (on our network, LLDP packets are sent every 30 seconds and CDP packets every minute). The framing of the two is quite different (CDP is carried in an IEEE 802.3 frame with an LLC/SNAP header, whereas LLDP uses an Ethernet II frame with EtherType 0x88CC), but both contain VLAN information, switch information (for example IP, platform, version) and, crucially, the switchport ID. This is accurate, it can be run on all PCs, and we get more information (the VLAN ID, for example) than the Fluke tester gave us (I think it must be able to pull VLAN information out, but I didn’t work out how in the couple of weeks that I had one).
It was progress, but it was still labour-intensive to do it this way, even with 4 or 5 people helping to go around a room. So can this be scripted?
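To give an idea of what is inside one of these packets, here is a minimal sketch of pulling the port ID, switch name and port VLAN ID out of a raw LLDP frame. It is an illustration rather than the tool described later in this post: it assumes an untagged Ethernet II frame, it skips the management address (switch IP) TLV, and the demo frame is hand-built with made-up values (`Gi1/0/12`, `sw-lab1`, VLAN 20). The helper names are hypothetical.

```python
import struct

LLDP_ETHERTYPE = 0x88CC
IEEE_8021_OUI = b"\x00\x80\xc2"   # OUI used by IEEE 802.1 organisational TLVs
TEXT_PORT_SUBTYPES = {1, 5, 7}    # interface alias, interface name, locally assigned


def parse_lldp(frame: bytes) -> dict:
    """Extract a few fields from a raw, untagged Ethernet II LLDP frame."""
    if len(frame) < 14 or frame[12:14] != struct.pack("!H", LLDP_ETHERTYPE):
        raise ValueError("not an untagged LLDP frame")
    info = {}
    pos = 14  # skip destination MAC, source MAC and EtherType
    while pos + 2 <= len(frame):
        # Each TLV starts with a 16-bit header: 7 bits of type, 9 bits of length.
        (header,) = struct.unpack_from("!H", frame, pos)
        tlv_type, length = header >> 9, header & 0x01FF
        value = frame[pos + 2 : pos + 2 + length]
        pos += 2 + length
        if tlv_type == 0:  # End of LLDPDU
            break
        if tlv_type == 2 and value:  # Port ID: one subtype byte, then the ID
            subtype, port = value[0], value[1:]
            if subtype in TEXT_PORT_SUBTYPES:
                info["port_id"] = port.decode("ascii", "replace")
            else:
                info["port_id"] = port.hex()  # e.g. a MAC address
        elif tlv_type == 5:  # System name
            info["system_name"] = value.decode("utf-8", "replace")
        elif (tlv_type == 127 and len(value) >= 6
              and value[:3] == IEEE_8021_OUI and value[3] == 1):
            # IEEE 802.1 Port VLAN ID TLV: OUI, subtype 1, 16-bit PVID
            info["vlan_id"] = struct.unpack_from("!H", value, 4)[0]
    return info


def _tlv(tlv_type: int, value: bytes) -> bytes:
    """Build one TLV; only used to assemble the synthetic demo frame."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


# Synthetic frame with made-up values, purely for demonstration.
demo_frame = (
    bytes.fromhex("0180c200000e")                        # LLDP multicast destination
    + bytes.fromhex("001122334455")                      # made-up source MAC
    + struct.pack("!H", LLDP_ETHERTYPE)
    + _tlv(1, b"\x04" + bytes.fromhex("001122334455"))   # chassis ID (MAC address)
    + _tlv(2, b"\x05" + b"Gi1/0/12")                     # port ID (interface name)
    + _tlv(3, struct.pack("!H", 120))                    # time to live
    + _tlv(5, b"sw-lab1")                                # system name
    + _tlv(127, IEEE_8021_OUI + b"\x01" + struct.pack("!H", 20))
    + _tlv(0, b"")                                       # end of LLDPDU
)
print(parse_lldp(demo_frame))
```

CDP carries similar fields but in its own TLV layout behind the LLC/SNAP header, so it would need a separate parser along the same lines.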
If you strip out the GUI and just look at what Wireshark does, it’s functionally very similar to the Linux tool tcpdump, which has a Windows port, WinDump (in fact, both are built on the same underlying capture library: WinPcap, which is itself a port of libpcap). So in late 2014 I set about seeing what I could do to script WinDump, and found through some quick googling that I could capture a single LLDP or CDP packet, write the result to a text file and then stop the capture. This is great, although the formatting of the file isn’t particularly useful and would require some additional post-processing.
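As a rough idea of what such a script boils down to, the sketch below builds a tcpdump-style command line that stops after one matching packet and prints it as hex. This is not the exact command I used; the function names are hypothetical, the interface name is a placeholder (WinDump lists its interfaces with `-D`), and the 90 second timeout is simply an assumption that comfortably covers one CDP interval.

```python
import shlex
import subprocess

# Capture filters: LLDP is matched by EtherType; CDP by Cisco's multicast
# address plus the SNAP protocol ID 0x2000 found at byte offset 20.
FILTERS = {
    "lldp": "ether proto 0x88cc",
    "cdp": "ether dst 01:00:0c:cc:cc:cc and ether[20:2] == 0x2000",
}


def build_capture_command(interface, protocol="either", tool="tcpdump"):
    """Return an argument list that captures one LLDP and/or CDP frame."""
    if protocol == "either":
        expression = "({lldp}) or ({cdp})".format(**FILTERS)
    else:
        expression = FILTERS[protocol]
    return [
        tool,
        "-i", interface,   # interface to listen on
        "-n",              # don't resolve addresses to names
        "-c", "1",         # exit after one matching packet
        "-s", "1514",      # capture the whole frame, not just the start
        "-xx",             # print the frame as hex, link-level header included
        expression,
    ]


def capture_to_file(interface, out_path, timeout=90, **kwargs):
    """Run the capture and write its text output to out_path (needs privileges)."""
    cmd = build_capture_command(interface, **kwargs)
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True, timeout=timeout)


print(shlex.join(build_capture_command("eth0")))
```

On Windows you would pass `tool="windump"` (assuming it is on the PATH) and the interface number or name that `-D` reports.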
WinDump curiously hasn’t implemented nice formatting for LLDP packets, and I would like to be able to use either CDP or LLDP output in a nice way, or really any network protocol. The only consistent way to output the data is as hexadecimal values. Both CDP and LLDP packets should come out in the same format if the hex option is specified, but then what? There is still the issue of collecting the data efficiently and extracting the meaningful information; you still have to have somebody actually collect these results. The logical next step is perhaps to extend such a script to upload the output to a central server, but then I would have to process all of these text files and pick the useful information out.
So why not just do that processing on the PC before sending it? This is possible with a script, which could perhaps dissect an output text file. But with the complexity growing, I decided that it would be a good idea to instead make a bespoke application that would grab all of this data and upload only the useful parts that I want. Furthermore, rather than uploading a file, it would make sense to upload a record to a database that can store all of these results, forgoing any human interaction, without needing WinDump at all, and capturing only the relevant data in the first place.
So my work on this project, when time permits, has been to write a C (or C++) command-line application that uses libpcap/WinPcap, just as WinDump, tcpdump and Wireshark do, to capture either a CDP or an LLDP packet and upload a string of text from it containing the switchport, VLAN, hostname and IP address to a central database. So far it works quite well, and I hope to release it for people to test out before making the source code freely available. I hope people can contribute to it, help improve it and make it a genuinely useful tool for sysadmins across multiple platforms.
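For what it’s worth, turning that hex output back into raw bytes is the easy part. The sketch below assumes offset-prefixed hex lines in the style that tcpdump’s `-xx` option prints (no ASCII column); the sample text, including its summary line, is synthetic and the function name is hypothetical.

```python
import re

# Matches lines such as "\t0x0010:  0400 1122 3344" and captures the hex groups.
HEX_LINE = re.compile(r"^\s*0x[0-9a-fA-F]{4}:\s+((?:[0-9a-fA-F]{2,4}\s*)+)$")


def hexdump_to_bytes(text: str) -> bytes:
    """Rebuild the raw frame from offset-prefixed hex dump lines.

    Lines that don't look like hex dump rows (for example the summary
    line above the dump) are ignored.
    """
    chunks = []
    for line in text.splitlines():
        match = HEX_LINE.match(line)
        if match:
            # Drop the spaces between groups so the digits form one string.
            chunks.append("".join(match.group(1).split()))
    return bytes.fromhex("".join(chunks))


# Synthetic sample: the first 18 bytes of a made-up LLDP frame.
sample = (
    "12:00:00.000000 LLDP, length 18\n"
    "\t0x0000:  0180 c200 000e 0011 2233 4455 88cc 0207\n"
    "\t0x0010:  0400\n"
)
frame = hexdump_to_bytes(sample)
print(len(frame), frame[12:14].hex())
```

The harder part is everything after this: decoding the bytes into fields and getting the result somewhere useful.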
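The database side can be pictured as one row per host that is overwritten each time the host reports in. The sketch below is in Python rather than C, and uses an in-memory SQLite database as a stand-in for the central server; the table name, column names and sample values are all hypothetical and not taken from the actual application.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS port_map (
    hostname    TEXT PRIMARY KEY,
    ip_address  TEXT NOT NULL,
    switch_port TEXT NOT NULL,
    vlan_id     INTEGER,
    seen_at     TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def record_sighting(conn, hostname, ip_address, switch_port, vlan_id):
    """Insert or overwrite the row for this host with its latest details."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT OR REPLACE INTO port_map "
        "(hostname, ip_address, switch_port, vlan_id) VALUES (?, ?, ?, ?)",
        (hostname, ip_address, switch_port, vlan_id),
    )
    conn.commit()


# Made-up example: the same PC reports twice after being re-patched.
conn = sqlite3.connect(":memory:")
record_sighting(conn, "LAB1-01", "10.1.20.11", "Gi1/0/12", 20)
record_sighting(conn, "LAB1-01", "10.1.20.11", "Gi1/0/14", 20)
rows = conn.execute(
    "SELECT hostname, switch_port, vlan_id FROM port_map"
).fetchall()
print(rows)
```

Keying on the hostname means the table always reflects where each PC was last seen, so a chart like the one I made by hand in 2013 would keep itself up to date.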
Edit: It could be suggested that I find out about connected hosts from the switches themselves or by using SNMP. The thing with the various “switchport mappers” out there, however, is that they mostly seem to rely on SNMP alone. If you query a device that runs SNMP, you don’t necessarily get any more information than you would if you were to just look at that device yourself (for example, you can look through a Cisco switch’s interface details to see host MAC addresses; this is all you would receive back if you interrogated it with SNMP anyway).
What this program does, however, is a bit different. Whereas SNMP will give you a network map from the switch-side, this application will generate more detailed host information from the client-side. Switches won’t know a host’s IP address in all likelihood and certainly not hostname or any other information that you might want. Although you can find most of this out with other applications, this program will combine its own information with that given to it by a switch.
Of course, having the SNMP-derived information about a switch is still useful; this way, you can see every device that is connected to a switch, whereas what I have developed will only show you information about hosts that you have run this application on (or that you have remote access to, or have imaged with it already installed and scheduled). It is quite possible in an environment such as the one that I work in that there are other devices that you don’t have access to (AV equipment and Wi-Fi access points, for example), and so you can’t make assumptions about free space on switches. So for the future, perhaps some SNMP integration would take this application a step beyond what most switchport mappers normally provide (without being overly complicated). I guess this could help direct what the name should be, a bit more...