» Troubleshooting guide
In this guide we provide a few tips and tricks to overcome typical problems encountered while setting up or running the Server.
- Unable to reach the server
- Flash crossdomain policy issues
- Server startup problems
- Dropped messages
- Ghost Users
» Unable to reach the server
One common runtime problem is the inability to connect to the test or production server after the first installation.
Once your SFS2X instance is running you should make sure that no firewall (software or hardware) is blocking the TCP ports in use. Specifically you should open the TCP ports 9933 and 8080 to "the world". If this is something you can't do you should ask your system administrator to read this document.
If your server is not connected directly to the internet you should also make sure that the NAT service (port-forwarding) is configured correctly.
The NAT service lets you expose one or more services running on your computer at the public internet address. For example, a well-configured router will forward all traffic arriving on port 9933 to the machine running SmartFoxServer 2X.
To determine whether something is blocking your server, you can try to telnet to its public address from outside. Telnet is a popular command-line utility, available on most operating systems, that lets you establish a TCP connection to a remote host. Follow these steps to test the connection to your server.
- Open the system console:
  - Under Windows use Windows Key + R and then type cmd in the dialogue box that appears.
  - Under MacOS X the terminal is found in /Applications/Utilities/Terminal.
  - Under Linux... well, every Linux user should know how to launch a terminal! :)
- Type the following:
    telnet <ip-address> 9933
  where <ip-address> is the IP address to test.
- You should get something like this:
    Lapo$ telnet localhost 9933
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
  The server has started a new connection so it's reachable.
If the attempt fails you should double check that no firewall is blocking the communication and that port-forwarding is properly configured.
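Where telnet is not installed, the same reachability test can be scripted. The following is a minimal sketch, assuming Python 3 is available on the machine you are testing from; the function name can_connect is only illustrative and is not part of SmartFoxServer.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake,
        # which is the same check the telnet test performs.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, can_connect("<ip-address>", 9933) and can_connect("<ip-address>", 8080) test the two ports mentioned above. A False result only says that the connection could not be opened; it does not tell you whether a firewall or the port-forwarding setup is responsible.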
» Flash crossdomain policy issues
If the server is reachable at its public address but you are still having connection problems, you should make sure that you are not running into any Flash Player sandbox restrictions. For security reasons the Flash Player does not allow connections to external domains unless explicit permissions are set via a crossdomain policy file.
You can learn more about the Flash Player security settings in this white-paper (it is located in the SFS 1 documentation, but the same concepts apply to SFS2X).
NOTE
SFS2X can automatically serve the policy file via socket. The file is located at SFS2X/config/crossdomain.xml and can be edited to meet your requirements.
» Server Startup problems
If the server doesn't complete the startup phase we recommend checking the log files.
SFS2X runs a two-phase startup sequence: first the low-level network engine is booted, then the SmartFox services are started in sequence until the server is ready to work.
The first low-level boot phase produces a detailed report in the {sfs2x-install-dir}/SFS2X/logs/boot/ folder. The SmartFoxServer operative logs are located in the {sfs2x-install-dir}/SFS2X/logs/ folder.
Log files are rotated on a daily basis and are easy to identify by the date appended to the filename.
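Because the logs are rotated daily, the file to read after a failed startup is normally the newest one in each folder. The sketch below picks it by modification time rather than by name, since the exact filename pattern is not described here; the helper name newest_file is hypothetical.

```python
from pathlib import Path
from typing import Optional


def newest_file(folder: str) -> Optional[Path]:
    """Return the most recently modified regular file in folder, or None if empty."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    # max() with default=None avoids an exception on an empty folder.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

You would call it once for the boot log folder and once for the main logs folder listed above, replacing {sfs2x-install-dir} with your actual installation path.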
» Dropped messages
Many of the questions asked on our support board revolve around the topic of dropped messages. At times you might notice a very high count in the AdminTool's Dashboard module, especially for the outgoing dropped messages.
The Incoming Dropped Messages are very simple to explain: SFS2X drops any message that is malformed or does not conform to its protocol. Additionally it will discard messages whose size is greater than the value set in the config/core.xml file for the maxIncomingRequestSize parameter.
The Outgoing Dropped Messages value keeps a count of the server messages that were not sent to their respective recipients. Before discussing the possible reasons for dropping a message, let's take a look at the following diagram:
The server associates an outgoing message queue with each connected session in order to store data that cannot be written immediately. The server attempts to deliver each packet as quickly as possible, but there are times when the user connection is congested and little or no data can be sent.
In this case the server is forced to store the remaining bits of data in the queue and wait for the network pipe to become available again for a new transmission.
According to the user configuration, SmartFoxServer will keep data in each queue until its capacity is exhausted: at that point messages are dropped. This mechanism allows the server to protect itself from indefinite memory allocation which might eventually end up in a crash of the Java VM.
You can fine tune the server's tolerance for dropped messages from the Server Configurator module in the AdminTool.
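To make the mechanism concrete, here is a small model of a bounded per-session queue. It is only a sketch and not SmartFoxServer's actual implementation: the class name, the capacity measured in messages, and the choice to drop the newest message when the queue is full are all assumptions made for illustration.

```python
from collections import deque


class SessionQueue:
    """Hypothetical model of a per-session outgoing queue with a fixed capacity."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pending = deque()
        self.dropped = 0  # plays the role of the "outgoing dropped messages" counter

    def enqueue(self, message: bytes) -> bool:
        """Store a message; if the queue is full, drop it and count the drop."""
        if len(self.pending) >= self.capacity:
            self.dropped += 1
            return False
        self.pending.append(message)
        return True

    def flush(self, max_messages: int) -> int:
        """Simulate the socket becoming writable: send up to max_messages."""
        sent = 0
        while self.pending and sent < max_messages:
            self.pending.popleft()
            sent += 1
        return sent
```

With a capacity of 3, enqueueing five messages without any flush leaves three messages pending and two dropped. A larger capacity tolerates longer congestion at the cost of more memory per session, which is the trade-off the configuration setting controls.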
» Causes of dropped outgoing messages
- Bad or slow client connection: the user has a slow response time and not enough bandwidth to keep up with the server responses. This causes the socket to become busy very soon and forces the server to keep the data in the queue.
- Too much data being sent too frequently: this is a slight variation on the previous theme. This time the client connection is not necessarily to blame; the more likely cause is server logic that sends packets that are too large or updates that are too frequent. It is important to remember that over the internet every connection experiences a certain amount of latency, which usually varies between 50 and 200 milliseconds but can also reach several seconds.
  It is important to keep these limitations in mind and work around them with specific client and server algorithms to reduce the lag. In general it is not recommended to send more than 10-20 updates/sec via TCP, while this value could be significantly higher for UDP packets.
  If you are interested in learning more about network lag we suggest reading this article.
- Lack of bandwidth on the server side: as you reach the capacity of your hosting bandwidth you will experience a general slowdown in network performance. Your players will probably start to complain about the game or application getting slower and you should see a rising number of dropped messages.
  You can easily detect this issue by monitoring your bandwidth usage on a daily basis.
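One simple way to stay within the 10-20 updates/sec guideline is to throttle updates in the game logic before they are sent. The sketch below is language-neutral logic written in Python; the class name UpdateThrottle is hypothetical and the caller supplies the current time in seconds, so the same idea can be ported to a server Extension or to client code.

```python
from typing import Optional


class UpdateThrottle:
    """Allow at most max_per_second updates; the caller skips or merges the rest."""

    def __init__(self, max_per_second: float):
        self.min_interval = 1.0 / max_per_second
        self.last_sent: Optional[float] = None

    def should_send(self, now: float) -> bool:
        """Return True if enough time has passed since the last update that was sent."""
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.last_sent = now
            return True
        return False
```

Updates that are refused do not need to be lost: a common design is to keep only the latest state and send it the next time should_send returns True.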
» Ghost Users
Ghost Users can be generated when an incomplete TCP disconnection occurs. The TCP transport protocol uses a four-way exchange between client and server to shut down a previously opened connection. If the communication is interrupted abruptly during this exchange, the disconnection will not complete, leaving the client in a state of partial disconnection.
The low-level TCP stack uses several flags to maintain the status of each connection, which can be inspected using the netstat utility from the command line. It is beyond the scope of this section to go into the technical details of the TCP inner workings, but if you want to learn more you can check the following resources:
» Common causes of Ghost Users
Ghost Users are an extreme rarity when working in a LAN/WAN environment because the connections are direct, and there is very little congestion. On the other hand over the internet the situation is very different because the client goes through a number of network hops, which represent the routers/firewalls/gateways that the packets go through from one end of the connection (e.g. the client) to the other (e.g. the server).
In this scenario there is a considerable number of variables that can lead to a loss of connection, including network congestion, timeouts, resets, unrecoverable packet loss etc... When this happens the user socket can remain in a "lingering state" for a long time, as each of the two parties waits for the final bits of the communication.
Until this process is completed a disconnection event can't be notified. Therefore the server is unaware of what is happening and so is the developer's custom Extension code, waiting for a USER_DISCONNECT event.
Depending on the operating system, each TCP implementation has slightly different rules and timeout settings to deal with lingering connections: eventually all these sockets will be closed forcefully, causing the disconnection event to bubble up to SmartFoxServer and its Extensions.
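Lingering sockets of this kind can be spotted by counting the connection states that netstat reports for the server port. The following is a rough sketch that parses text you have already captured (for example from netstat -an); the exact column layout differs between operating systems, so the parser only looks for a field ending in the port number and a field that is a known state name. The function name and the list of state spellings are assumptions for illustration.

```python
from collections import Counter

# State names as printed by common netstat implementations (spelling varies by OS).
TCP_STATES = {
    "ESTABLISHED", "LISTEN", "LISTENING", "SYN_SENT", "SYN_RECV", "SYN_RECEIVED",
    "FIN_WAIT1", "FIN_WAIT2", "FIN_WAIT_1", "FIN_WAIT_2",
    "TIME_WAIT", "CLOSE_WAIT", "LAST_ACK", "CLOSING", "CLOSED",
}


def count_tcp_states(netstat_output: str, port: int) -> Counter:
    """Count TCP states for lines whose local or remote address uses the given port."""
    suffixes = (f":{port}", f".{port}")  # "addr:port" or the BSD/macOS "addr.port" form
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].lower().startswith("tcp"):
            continue  # skip headers, UDP and unrelated lines
        if not any(f.endswith(suffixes) for f in fields[1:]):
            continue
        state = next((f for f in fields if f in TCP_STATES), None)
        if state is not None:
            counts[state] += 1
    return counts
```

A large and persistent number of half-closed states such as FIN_WAIT2 or CLOSE_WAIT on port 9933, compared with ESTABLISHED, suggests that many disconnections are not completing.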
» How to deal with Ghosts
SmartFoxServer 2X, since version 2.4, attempts to reduce the impact of pending connections by running a scheduled task (called Ghost Hunter) that checks the integrity of each connection and removes those that are stale.
The server's configuration is also important to reduce this problem, in particular there are two settings that can be fine tuned via the AdminTool:
- ServerConfigurator > Session Max Idle Time: regulates the maximum idle time for a single session. A session is only used by clients to log in to the Server; once logged in, the client is turned into a User. For this reason it is recommended to keep this value in a range of 10-40 seconds, so that any client that opens a connection to the Server without logging in is removed quickly.
- ZoneConfigurator > User Max Idle Time: regulates the maximum idle time for a User. This value can change a lot depending on how the application works and how long a User is allowed to remain idle without being disconnected. In general we suggest avoiding values greater than 30 minutes.
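The effect of both idle-time settings can be pictured as a periodic sweep over the last activity time of each connection. The sketch below is purely illustrative and does not reproduce the Ghost Hunter or any SmartFoxServer internals; the function name and the dictionary of timestamps are hypothetical.

```python
from typing import Dict, List


def find_idle(last_activity: Dict[str, float], now: float, max_idle_seconds: float) -> List[str]:
    """Return the ids whose last activity is older than max_idle_seconds."""
    return sorted(
        conn_id
        for conn_id, seen_at in last_activity.items()
        if now - seen_at > max_idle_seconds
    )
```

With a 30 second limit and the clock at 100 seconds, a connection last seen at 95 seconds is kept while connections last seen at 0 and 50 seconds are reported as idle. A lower limit removes stale connections sooner but also disconnects users who are legitimately inactive, which is why the two settings above are tuned separately.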
» Detecting network issues
It is important to note that Ghost Users should occur rarely in a production environment and should represent no more than 1-2% of the total amount of CCU. Should the phenomenon occur with a higher incidence, it will be necessary to perform a complete check of the production environment, including:
- Server configuration
- Bandwidth resources (monitoring usage and peaks)
- Network configuration (kernel settings, routers, firewalls)
- Security (DDoS attacks)
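As a quick way to apply the 1-2% guideline, the ratio of Ghost Users to CCU can be tracked over time. This is a minimal sketch: how you count Ghost Users in your own application is left open, and the function names and the default 2% threshold are assumptions taken from the upper bound stated above.

```python
def ghost_ratio(ghost_users: int, ccu: int) -> float:
    """Return Ghost Users as a fraction of concurrent users (0.0 when there are no users)."""
    if ccu <= 0:
        return 0.0
    return ghost_users / ccu


def needs_investigation(ghost_users: int, ccu: int, threshold: float = 0.02) -> bool:
    """Return True when the ghost ratio exceeds the threshold (2% by default)."""
    return ghost_ratio(ghost_users, ccu) > threshold
```

For instance, 15 Ghost Users out of 1000 CCU (1.5%) is within the expected range, while 50 out of 1000 (5%) would call for the checks listed above.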