Networks today have evolved quickly to include business critical applications and services, relied on heavily by users in the organization. In this environment, network technicians are required to do more than simply add new machines to the network. Often they are called on to troubleshoot more complex issues, thus keeping the network up and running at top speed. This white paper discusses ten common problems encountered by technicians today and their symptoms, causes, and resolutions.
Problem 1 – Dead drop
Symptom: A PC, phone, access point or printer is connected to the wall jack and the connection is not activated. The switch port does not show a link light, nor does the network adapter.
Cause: Dead drops are commonly simple problems that occur when a connection is not patched through to the wall jack. In many organizations, only those connections that are actively being used are patched. When offices or meeting rooms are moved around, sometimes the network jacks are not tested for the new users, or drops that are intentionally left disconnected may not be properly documented. Additionally, the switch port may be administratively disabled.
Resolution: Check to be sure the switch port is activated and connection has been properly patched. When any devices have been physically moved in the office, be sure to test the new connections to ensure they are working properly. In the case of an IP Phone, it may also be that sufficient power is not being supplied to the phone.
Problem 2 – Can’t get an IP address
Symptom: The network appears down or inoperable. The operating system may alert that an address was not received from the DHCP server. After checking the network adapter status, no address may be configured.
Cause: No address has been received from the DHCP server. The DHCP server may be out of addresses, the server service may be down, the end device may be configured to use a static address instead of a DHCP address, or the DHCP request from the end device never made it to the server in the first place. This may be the case especially if a new device is configured for a VLAN that is not set up to forward DHCP requests to the DHCP server.
Resolution: Key question – is this problem restricted to one user or are many users affected? If only one user is affected, check the NIC settings to be sure it is configured to use DHCP. Next, check the switch to see which VLAN the port is configured as a member. Check that other devices on this VLAN can get addresses. If they cannot, the problem may be that the router is not forwarding DHCP requests to the DHCP server. If many devices across several subnets are having this problem, the problem is likely the server itself. The server may not have the DHCP service running, or the address pool may be empty. If there are multiple DHCP servers one or both may be configured incorrectly to share offers or one may even be a bogus DHCP coming from a rogue home wireless router.
Problem 3 – Can’t connect to the application server
Symptom: The application the user is trying to open may alert that it cannot connect to the application server. This can be the case when using e-mail applications or CRM Applications. The common complaint into the help desk is that the network is down, even though this is not the true problem.
Cause: Several things can cause this event. The key question to ask the user is whether this problem happens constantly or only sporadically. If the user has a proper IP address for the connection they are on, there may be a routing issue on the network between the user and server. This can be verified with a simple ping. If connectivity is lost sporadically, this can be caused by a busy server that is not responsive to client requests.
Resolution: In the case that routing is not the issue (ping test), check the server load and resources. Is the server busy running another task such as a backup? If this is not the case, check the network load between client and server, focusing on WAN connections if there are any. Often, periods of high network utilization between client and server can cause connectivity problems for the client. The best way to do this is by using an SNMP tool that will monitor utilization over time on these links. Additionally, look for Ethernet errors on all switches and routers that cause packet loss between client and server.
Problem 4 – Incorrect VLAN assignment
Symptom: When installing new services on the network such as wireless or VoIP, VLANs are typically used to isolate this traffic from other users. This requires each switch port supporting these services to be configured for the proper VLAN. If this is not done properly, the service may not work. The IP Phone may not register with the call manager, the PC connected to the phone may not be able to connect to key servers, or the wireless users may not get proper addressing for the wireless environment.
Cause: The switches responsible for connecting these services have not been properly configured. Perhaps it was not communicated within the organization to reconfigure certain ports to support new services.
Resolution: Test the port to verify which VLANs are supported. If possible, use a VLAN Tag to generate VLAN specific traffic to check which VLANs are configured on the port. Check the IP address provided by the DHCP server to determine which VLAN is being provided to the port if it is untagged. Alternatively, check the switch configuration to validate the VLAN configuration.
Problem 5 – Duplex mismatch
Symptom: With a duplex mismatch, the connection will work, just poorly. The link lights will become active on both the switch and network adapter. Network performance will suffer greatly, with throughput dropping to 100kbps or lower.
Cause: One side of the connection is operating in full duplex (transmit and receive at the same time) and the other device is operating in half duplex (transmit OR receive at one time). The full duplex side does not have to wait to transmit, it is configured to transmit whether it is receiving or not. The half-duplex side must wait until it is not receiving in order to transmit. This means that the full duplex side has the potential to interrupt the half duplex side, causing the half duplex side to abort transmission. If transmission is aborted, the frame will need to be retransmitted. This will dramatically reduce the bandwidth the half duplex side is able to make use of.
Resolution: In nearly all cases, a duplex mismatch is the result of forcing one side of the connection (usually the switch) to full duplex, while leaving the other side, the PC, to auto-negotiate the link. The myth is that auto-negotiation will determine the forced full setting and match this configuration. This however is not the case. The side forced to full duplex is no longer sending the appropriate signalling that auto-negotiation relies on in order to determine speed and duplex. The auto-negotiating side of the connection will be left to guess at the duplex of the link. When in doubt, auto-negotiation will always default to half duplex. This is how most duplex mismatch issues happen on the network. To resolve this, set all connections on the network to auto-negotiate – unless you have a very good reason not to. In these isolated cases, such as inter-switch connections, be sure to statically set both sides to full duplex.
Problem 6 – Slow application performance
Symptom: The application appears sluggish. It may freeze on a certain screen or halt while accessing data. Often, the network is blamed for these issues.
Cause: Exonerating the network in application performance problems can assist server maintenance personnel to take out the guesswork and isolate the issue to the right place. Many issues can cause an application to slow down. Among the most common causes are server backups occurring during production hours, slow response from database servers, and packet loss on the network. From a network technician’s point of view, the most important thing to determine is whether the problem is caused by the server or by the network. To determine this, a capture of application traffic can be collected from a client machine. Look for any retransmissions between the client and server. If retransmissions exist, there is packet loss on the network, which severely affects application performance. If there are no retransmissions and connectivity between client and server is established, the problem is likely in the server and can be monitored from that perspective.
Resolution: Although packet analyzers can be very difficult to use when tracking down a problem, they often are equipped with simple counters displaying TCP retransmissions. Use this counter to assist in determining if there is packet loss on the network between client and server. Look for Ethernet errors (FCS Errors, Alignment Errors, or Late Collisions) on any switches and routers between client and server that could be causing this packet loss. If there are no errors, watch for packet loss on the WAN because of excessive utilization across the link. Alternatively, application performance can be tested with a professional network analyser to provide the same information as derived from capturing the packets and looking at response times for DNS lookup, TCP connect (network time), data start (server response) and data transfer (download speed) help to determine what part of the process is making the application slow.
Problem 7 – Printing problems
Symptom: Printing does not consistently work on the network. A printer may appear available, but print jobs that are sent to it are not completed.
Cause: Determine if only one user is experiencing this problem or if several people have the same issue. If only one user is having the issue, it may be that the PC is not mapped correctly to the print server. If this is not the cause, the network between client and printer may be to blame. Packet loss can cause printing problems, as well as network connectivity problems on the printer itself.
Resolution: Check the printer configuration to make sure it has a good IP address and can access the print server if it is external to the printer. At times, updating the printer driver has resolved printing issues. Overall, be sure that traffic is getting to and from the printer on the network and that all printer drivers are up to date.
Problem 8 – Poor or bad cable
Symptom: If the client PC is able to link to the network, performance may be poor. The PC may not be able to connect at all.
Cause: In networks today, Gigabit to the desktop is common. Gigabit requires four pairs of cable, so anything lower than Category 5e will not work for Gig. In older buildings this must be taken into consideration. In addition, any amount of untwisting of the cable (often near the RJ-45 termination or patch panel) can cause signal loss. This will result in FCS Errors on the switch ports or network adapters.
Resolution: In most cases with cable problems, the cable simply needs to be replaced. If the problem is because the cable has become untwisted, re-terminating the cable may resolve the errors. When supporting new technologies such as Gigabit or Power over Ethernet, the cable must be Cat 5e or better.
Problem 9 – DNS problems
Symptom: The user cannot access the internet or key applications. The network appears to be down.
Cause: Domain Name Services may be to blame. The client PC cannot resolve the name of the server with the IP address of that server, so it will not send a connection request. This is often caused by having the wrong DNS server configured on the client, sending DNS requests that the server does not have in its database, or packet loss on the network. DNS is a UDP-based protocol, so packets that are lost will not be retransmitted, causing DNS to fail.
Resolution: Check the client configuration to see which DNS server it is set up to use. If this is the wrong server, adjust this setting in the client or in the DHCP server which provided it. Test the DNS server from the client connection repeatedly to determine if there is delay in response due to packet loss. If packets are lost, look for Ethernet errors between the client and server. Capture failing DNS requests to determine if there is any response at all from the server. Ideally, setup a tool that will constantly test the DNS server and alert when a problem occurs.
Problem 10 – Wireless client can’t connect
Symptom: The client can detect the wireless access point, but it cannot connect to the wireless network.
Cause: Security credentials, wireless channel interference, and dead spots can cause this problem. Since wireless is invisible, it can be very difficult to track these problems down without a proper wireless tool.
Resolution: Using a wireless monitoring tool measure the signal strength from the affected area, and if possible, conduct a site survey in the area to look for rogue or unknown APs. These may be configured for overlapping wireless channels and can affect known-good users. Check for noise in the signal from surrounding access points or from noise inducing devices such as microwaves and cordless phones. Monitor the client as it attempts to connect to the access point and watch for which step fails – association, authentication, or authorization.
We just reviewed the ten top problems encountered on today’s networks. In many cases, the problem can be narrowed down to one or two things and then resolved. Be sure to document the resolution of common problems on the network so other engineers or technicians can quickly resolve issues as well. It’s time to take control, with the right tools; problems can be isolated and resolved quickly, returning the network to top-notch performance in no time.
The evolving mantra for network engineers and technicians is to focus on the end-user. To get the facts about this trend and why it matters, please visit www.criticaledgetech.com or call 870-336-0707